Springer Complexity

Springer Complexity is a publication program, cutting across all traditional disciplines of sciences as well as engineering, economics, medicine, psychology and computer sciences, which is aimed at researchers, students and practitioners working in the field of complex systems. Complex systems are systems that comprise many interacting parts with the ability to generate a new quality of macroscopic collective behavior through self-organization, e.g., the spontaneous formation of temporal, spatial or functional structures. This recognition, that the collective behavior of the whole system cannot be simply inferred from the understanding of the behavior of the individual components, has led to various new concepts and sophisticated tools of complexity. The main concepts and tools - with sometimes overlapping contents and methodologies - are the theories of self-organization, complex systems, synergetics, dynamical systems, turbulence, catastrophes, instabilities, nonlinearity, stochastic processes, chaos, neural networks, cellular automata, adaptive systems, and genetic algorithms.

The topics treated within Springer Complexity are as diverse as lasers or fluids in physics, machine cutting phenomena of workpieces or electric circuits with feedback in engineering, growth of crystals or pattern formation in chemistry, morphogenesis in biology, brain function in neurology, behavior of stock exchange rates in economics, or the formation of public opinion in sociology. All these seemingly quite different kinds of structure formation have a number of important features and underlying structures in common. These deep structural similarities can be exploited to transfer analytical methods and understanding from one field to another. The Springer Complexity program therefore seeks to foster cross-fertilization between the disciplines and a dialogue between theoreticians and experimentalists for a deeper understanding of the general structure and behavior of complex systems.

The program consists of individual books, book series such as "Springer Series in Synergetics", "Institute of Nonlinear Science", "Physics of Neural Networks", and "Understanding Complex Systems", as well as various journals.
New England Complex Systems Institute (NECSI)

President: Yaneer Bar-Yam
New England Complex Systems Institute
24 Mt. Auburn St.
Cambridge, MA 02138, USA
For over 10 years, the New England Complex Systems Institute (NECSI) has been instrumental in the development of complex systems science and its applications. NECSI conducts research, education, knowledge dissemination, and community development around the world for the promotion of the study of complex systems and its application for the betterment of society.

NECSI was founded by faculty of New England area academic institutions in 1996 to further international research and understanding of complex systems. Complex systems is a growing field of science that aims to understand how the parts of a system give rise to the system's collective behaviors, and how the system interacts with its environment. These questions can be studied in general, and they are also relevant to all traditional fields of science. Social systems formed (in part) out of people, the brain formed out of neurons, molecules formed out of atoms, and the weather formed from air flows are all examples of complex systems. The field of complex systems intersects all traditional disciplines of physical, biological and social sciences, as well as engineering, management, and medicine. Advanced education in complex systems attracts professionals, as complex systems science provides practical approaches to health care, social networks, ethnic violence, marketing, military conflict, education, systems engineering, international development and terrorism.

The study of complex systems is about understanding indirect effects. Problems we find difficult to solve have causes and effects that are not obviously related. Pushing on a complex system "here" often has effects "over there" because the parts are interdependent. This has become more and more apparent in our efforts to solve societal problems or avoid ecological disasters caused by our own actions.

The field of complex systems provides a number of sophisticated tools: some of them conceptual, helping us think about these systems; some of them analytical, for studying these systems in greater depth; and some of them computer based, for describing, modeling or simulating them. NECSI research develops basic concepts and formal approaches as well as their applications to real world problems. Contributions of NECSI researchers include studies of networks, agent-based modeling, multiscale analysis and complexity, chaos and predictability, evolution, ecology, biodiversity, altruism, systems biology, cellular response, health care, systems engineering, negotiation, military conflict, ethnic violence, and international development.

NECSI uses many modes of education to further the investigation of complex systems. Throughout the year, classes, seminars, conferences and other programs assist students and professionals alike in their understanding of complex systems. Courses have been taught all over the world: Australia, Canada, China, Colombia, France, Italy, Japan, Korea, Portugal, Russia and many states of the U.S. NECSI also sponsors postdoctoral fellows, provides research resources, and hosts the International Conference on Complex Systems, discussion groups and web resources.
New England Complex Systems Institute Book Series

Series Editor: Dan Braha
New England Complex Systems Institute
24 Mt. Auburn St.
Cambridge, MA 02138, USA
New England Complex Systems Institute Book Series

The world around us is full of the wonderful interplay of relationships and emergent behaviors. The beautiful and mysterious way that atoms form biological and social systems inspires us to new efforts in science. As our society becomes more concerned with how people are connected to each other than with how they work independently, so science has become interested in the nature of relationships and relatedness. Through relationships, elements act together to become systems, and systems achieve function and purpose. The study of complex systems is remarkable in the closeness of basic ideas and practical implications. Advances in our understanding of complex systems give new opportunities for insight in science and improvement of society. This is manifest in the relevance to engineering, medicine, management and education. We devote this book series to the communication of recent advances and reviews of revolutionary ideas and their application to practical concerns.
Unifying Themes in Complex Systems VI
Proceedings of the Sixth International Conference on Complex Systems
Edited by Ali Minai, Dan Braha and Yaneer Bar-Yam
Ali A. Minai
University of Cincinnati
Department of Electrical and Computer Engineering, and Computer Science
P.O. Box 210030, Rhodes Hall 814
Cincinnati, OH 45221-0030, USA
Email: [email protected]

Dan Braha
New England Complex Systems Institute
24 Mt. Auburn St.
Cambridge, MA 02138-3068, USA
Email: [email protected]

Yaneer Bar-Yam
New England Complex Systems Institute
24 Mt. Auburn St.
Cambridge, MA 02138-3068, USA
Email: [email protected]
This volume is part of the New England Complex Systems Institute Series on Complexity

Library of Congress Control Number: 2008931598
ISBN 978-3-540-85080-9 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version. Violations are liable for prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com

NECSI Cambridge, Massachusetts 2008
Printed in the USA

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
CONTENTS - VOLUME VI

Introduction  vii
Organization and Program  viii
NECSI Publications  xxiv
PART I: Methods

Daniel Polani
Emergence, Intrinsic Structure of Information, and Agenthood  3

Susan Sgorbati & Bruce Weber
How Deep and Broad are the Laws of Emergence?  11

Victor Korotkikh & Galina Korotkikh
On an Irreducible Theory of Complex Systems  19

Jacek Marczyk & Balachandra Deshpande
Measuring and Tracking Complexity in Science  27

Val K. Bykovsky
Data-Driven Modeling of Complex Systems  34

Tibor Bosse, Alexei Sharpanskykh & Jan Treur
Modelling Complex Systems by Integration of Agent-Based and Dynamical Systems Models  42

Yuriy Gulak
On Elementary and Algebraic Cellular Automata  50

David G. Green, Tania G. Leishman & Suzanne Sadedin
Dual Phase Evolution - A Mechanism for Self-Organization in Complex Systems  58

Jun Wu, Yue-Jin Tan, Hong-Zhong Deng & Da-Zhi Zhu
A New Measure of Heterogeneity of Complex Networks Based on Degree Sequence  66

Daniel E. Whitney & David Alderson
Are Technological and Social Networks Really Different?  74

Takeshi Ozeki
Evolutional Family Networks Generated by Group-Entry Growth Mechanism with Preferential Attachment and their Features  82

Gabor Csardi, Katherine Strandburg, Laszlo Zalanyi, Jan Tobochnik & Peter Erdi
Estimating the Dynamics of Kernel-Based Evolving Networks  90

Pedram Hovareshti & John S. Baras
Consensus Problems on Small World Graphs: A Structural Study  98

Thomas F. Brantle & M. Hosein Fallah
Complex Knowledge Networks and Invention Collaboration  106

Philip Vos Fellman & Jonathan Vos Post
Complexity, Competitive Intelligence and the "First Mover" Advantage  114

Jiang He & M. Hosein Fallah
Mobility of Innovators and Prosperity of Geographical Technology Clusters  122

Vito Albino, Nunzia Carbonara & Ilaria Giannoccaro
Adaptive Capacity of Geographical Clusters: Complexity Science and Network Theory Approach  130

Philip Vos Fellman
Corporate Strategy: An Evolutionary Review  138

Diane M. McDonald & Nigel Kay
Towards an Evaluation Framework for Complex Social Systems  146

Kevin Brandt
Operational Synchronization  154

Philip Vos Fellman
The Complexity of Terrorist Networks  162

Czeslaw Mesjasz
Complexity Studies and Security in the Complex World: An Epistemological Framework of Analysis  170

Giuseppe Narzisi, Venkatesh Mysore, Jeewoong Byeon & Bud Mishra
Complexities, Catastrophes and Cities: Emergency Dynamics in Varying Scenarios and Urban Topologies  178

Samantha Kleinberg, Marco Antoniotti, Satish Tadepalli, Naren Ramakrishnan & Bud Mishra
Systems Biology via Redescription and Ontologies (II): A Tool for Discovery in Complex Systems
PART II: Models

Rene Doursat
The Growing Canvas of Biological Development: Multiscale Pattern Generation on an Expanding Lattice of Gene Regulatory Nets  203

Franziska Matthaus, Carlos Salazar & Oliver Ebenhoh
Compound Clustering and Consensus Scopes of Metabolic Networks  211

Robert Melamede
Endocannabinoids: Multi-scaled, Global Homeostatic Regulators of Cells and Society  219

Walter Riofrio & Luis Angel Aguilar
Different Neurons Population Distribution Correlates with Topologic-Temporal Dynamic Acoustic Information Flow  227

Mark Hoogendoorn, Martijn C. Schut & Jan Treur
Modeling the Dynamics of Task Allocation and Specialization in Honeybee Societies  235

Garrett M. Dancik, Douglas E. Jones & Karin S. Dorman
An Agent-Based Model for Leishmania major Infection  243

Holger Lange, Bjorn Økland & Paal Krokene
To Be or Twice To Be? The Life Cycle Development of the Spruce Bark Beetle Under Climate Change  251

Tibor Bosse, Alexei Sharpanskykh & Jan Treur
A Formal Analysis of Complexity Monotonicity  259

Claudio Tebaldi & Deborah Lacitignola
Complex Features in Lotka-Volterra Systems with Behavioral Adaptation  267

Gerald H. Thomas & Keelan Kane
A Dynamic Theory of Strategic Decision Making Applied to the Prisoners Dilemma  275

Mike Mesterton-Gibbons & Tom N. Sherratt
Animal Network Phenomena: Insights from Triadic Games  283

Simon Angus
Endogenous Cooperation Network Formation  291

Khan Md. Mahbubush Salam & Kazuyuki Ikko Takahashi
Mathematical Model of Conflict and Cooperation with Non-Annihilating Multi-Opponent  299

Margaret Lyell, Rob Flo & Mateo Mejia-Tellez
Simulation of Pedestrian Agent Crowds, with Crisis  307

Michael T. Gastner
Traffic Flow in a Spatial Network Model  315

Gergana Bounova & Olivier de Weck
Augmented Network Model for Engineering System Design  323

Daniel E. Whitney
Network Models of Mechanical Assemblies  331

Jun Yu, Laura K. Gross & Christopher M. Danforth
Complex Dynamic Behavior on Transition in a Solid Combustion Model  339

Ian F. Wilkinson, James B. Wiley & Aizhong Lin
Modeling the Structural Dynamics of Industrial Networks  347

Leonard Wojcik, Krishna Boppana, Sam Chow, Olivier de Weck, Christian LaFon, Spyridon D. Lekkakos, James Lyneis, Matthew Rinaldi, Zhiyong Wang, Paul Wheeler & Marat Zborovskiy
Can Models Capture the Complexity of the Systems Engineering Process?  366

Clement McGowan, Fred Cecere, Robert Darneille & Nate Laverdure
Biological Event Modeling for Response Planning  374

Dmitry Chistilin
Principles of Self-Organization and Sustainable Development of the World Economy are the Basis of Global Security  382

Walid Nasrallah
Evolutionary Paths to Corrupt Societies of Artificial Agents  390

Roxana Wright, Philip Vos Fellman & Jonathan Vos Post
Path Dependence, Transformation and Convergence: A Mathematical Model of Transition to Market  398

Kumar Venkat & Wayne Wakeland
Emergence of Networks in Distance-Constrained Trade  406

Ian F. Wilkinson, Robert E. Marks & Louise Young
Toward Agent-Based Models of the Development and Evolution of Business Relations and Networks  414

Sharon A. Mertz, Adam Groothuis & Philip Vos Fellman
Dynamic Modeling of New Technology Succession: Projecting the Impact of Macro Events and Micro Behaviors On Software Market Cycles  422

Manuel Dias & Tanya Araujo
Hypercompetitive Environments: An Agent-Based Model Approach  430

V. Halpern
Precursors of a Phase Transition in a Simple Model System  438

C. M. Lapilli, C. Wexler & P. Pfeifer
Universality Away from Critical Points in a Thermostatistical Model  446

Philip Vos Fellman & Jonathan Vos Post
Quantum Nash Equilibria and Quantum Computing  454
PART III: Applications

Hiroki Sayama
Teaching Emergence and Evolution Simultaneously Through Simulated Breeding of Artificial Swarm Behaviors  463

Ashok Kay Kanagarajah, Peter Lindsay, Anne Miller & David Parker
An Exploration into the Uses of Agent-Based Modeling to Improve Quality of Healthcare  471

Neena A. George, Ali Minai & Simona Doboli
Self-Organized Inference of Spatial Structure in Randomly Deployed Sensor Networks  479

Abhinay Venuturumilli & Ali Minai
Obtaining Robust Wireless Sensor Networks through Self-Organization of Heterogeneous Connectivity  487

Orrett Gayle & Daniel Coore
Self-Organizing Text in an Amorphous Environment  495

Adel Sadek & Nagi Basha
Self-Learning Intelligent Agents for Dynamic Traffic Routing on Transportation Networks  503

Sarjoun Doumit & Ali Minai
Distributed Resource Exploitation for Autonomous Mobile Sensor Agents in Dynamic Environments  511

Javier Alcazar & Ephrahim Garcia
Interconnecting Robotic Subsystems in a Network  519

Chad Foster
Estimating Complex System Robustness from Dual System Architectures  527

Dean J. Bonney
Inquiry and Enterprise Transformation  535

Mike Webb
Capability-Based Engineering Analysis (CBEA)  540

Keith McCaughin & Joseph DeRosa
Stakeholder Analysis To Shape the Enterprise  548

George Rebovich Jr.
Systems Thinking for the Enterprise: A Thought Piece  556

Matt Motyka, Jonathan R.A. Maier & Georges M. Fadel
Representing the Complexity of Engineering Systems: A Multidisciplinary Perceptual Approach  564

Dighton Fiddner
Policy Scale-free Organizational Network: Artifact or Phenomenon?  572

Hans-Peter Brunner
Application of Complex Systems Research to Efforts of International Development  580

Alex Ryan
About the Bears and the Bees: Adaptive Responses to Asymmetric Warfare  588

Donald Heathfield
Improving Decision Making in the Area of National and International Security - The Future Map Methodology  596

Andrei Irimia, Michael R. Gallucci & John P. Wikswo Jr.
Comparison of Chaotic Biomagnetic Field Patterns Recorded from the Arrhythmic Heart and Stomach  604

F. Canan Pembe & Haluk Bingol
Complex Networks in Different Languages: A Study of an Emergent Multilingual Encyclopedia  612

Gökhan Sahin, Murat Erentürk & Avadis Hacinliyan
Possible Chaotic Structures in the Turkish Language with Time Series Analysis  618

Index of authors  626
INTRODUCTION

The science of complex systems has made impressive strides in recent years. Relative to the opportunities, however, the field is still in its infancy. Our science can provide a unified foundation and framework for the study of all complex systems. In order for complex systems science to fulfill its potential, it is important to establish conventions that will facilitate interdisciplinary communication. This is part of the vision behind the International Conference on Complex Systems (ICCS). For over a decade, ICCS has fostered much-needed cross-disciplinary communication. Moreover, it provides a forum for scientists to better understand universally applicable concepts such as complexity, emergence, evolution, adaptation and self-organization.

The Sixth ICCS proved that the broad range of scientific inquiry continues to reveal its common roots. More and more scientists realize the importance of the unifying principles that govern complex systems. The Sixth ICCS attracted a diverse group of participants reflecting wide-ranging and overlapping interests. Topics ranged from economics to ecology, particle physics to psychology, and business to biology. Through plenary, pedagogical, breakout and poster sessions, conference attendees shared discoveries that were significant both to their particular fields and to the overarching study of complex systems. This volume contains the proceedings from that conference.

Recent work in the field of complex systems has produced a variety of new analytic and simulation techniques that have proven invaluable in the study of physical, biological and social systems. New methods of statistical analysis have led to a better understanding of patterns and networks. The application of simulation techniques such as agent-based models, cellular automata, and Monte Carlo simulations has increased our ability to understand or even predict the behavior of systems.
The concepts and tools of complex systems are of interest not only to scientists, but also to corporate managers, physicians and policy makers. The rules that govern key dynamical behaviors of biochemical or neural networks apply to social or corporate networks, and professionals have started to realize how valuable these concepts are to their individual fields. The ICCS conferences have provided the opportunity for professionals to learn the basics of complex systems and share their real-world experience in applying these concepts.
Sixth International Conference on Complex Systems: Organization and Program

Host: New England Complex Systems Institute

Partial Financial Support: National Science Foundation

Additional Support: Birkhäuser, Edward Elgar Publishing, Springer
Chairman: Yaneer Bar-Yam - NECSI *

Executive Committee:
Ali Minai - University of Cincinnati †
Dan Braha - University of Massachusetts, Dartmouth †

* NECSI Co-faculty
† NECSI Affiliate
Program Committee:
Russ Abbott - CSU Los Angeles
Yaneer Bar-Yam - NECSI *
Philippe Binder - University of Hawaii
Dan Braha - MIT
Jeff Cares - NECSI Director, Military Programs †
Irene Conrad - Texas A&M University Kingsville
Fred Discenzo - Rockwell Automation
Carlos Gershenson - NECSI and Vrije Universiteit Brussel
James Glazier - Indiana University
Charles Hadlock - Bentley College
Nancy Hayden - Sandia National Laboratories
Helen Harte - NECSI Organization Science Program †
Guy Hoelzer - University of Nevada, Reno
Sui Huang - Harvard University †
Mark Klein - MIT †
Tom Knight - MIT
Michael Kuras - MITRE
May Lim - NECSI, Brandeis University and University of the Philippines, Diliman
Dwight Meglan - SimQuest
Ali Minai - University of Cincinnati †
Lael Parrott - University of Montreal
Gary Nelson - Homeland Security Institute
Doug Norman - MITRE
Hiroki Sayama - Binghamton University, SUNY †
Marlene Williamson
Sanith Wijesinghe - Millenium IT
Jonathan Vos Post - Computer Futures, Inc.
Martin Zwick - Portland State University
Founding Organizing Committee of the ICCS Conferences:
Philip W. Anderson - Princeton University
Kenneth J. Arrow - Stanford University
Michel Baranger - MIT *
Per Bak - Niels Bohr Institute
Charles H. Bennett - IBM
William A. Brock - University of Wisconsin
Charles R. Cantor - Boston University *
Noam A. Chomsky - MIT
Leon Cooper - Brown University
Daniel Dennett - Tufts University
Irving Epstein - Brandeis University *
Michael S. Gazzaniga - Dartmouth College
William Gelbart - Harvard University *
Murray Gell-Mann - CalTech/Santa Fe Institute
Pierre-Gilles de Gennes - ESPCI
Stephen Grossberg - Boston University
Michael Hammer - Hammer & Co
John Holland - University of Michigan
John Hopfield - Princeton University
Jerome Kagan - Harvard University *
Stuart A. Kauffman - Santa Fe Institute
Chris Langton - Santa Fe Institute
Roger Lewin - Harvard University
Richard C. Lewontin - Harvard University
Albert J. Libchaber - Rockefeller University
Seth Lloyd - MIT *
Andrew W. Lo - MIT
Daniel W. McShea - Duke University
Marvin Minsky - MIT
Harold J. Morowitz - George Mason University
Alan Perelson - Los Alamos National Lab
Claudio Rebbi - Boston University
Herbert A. Simon - Carnegie-Mellon University
Temple F. Smith - Boston University *
H. Eugene Stanley - Boston University
John Sterman - MIT *
James H. Stock - Harvard University *
Gerald J. Sussman - MIT
Edward O. Wilson - Harvard University
Shuguang Zhang - MIT
Session Chairs:
Iqbal Adjali - Unilever
Yaneer Bar-Yam - NECSI *
Dan Braha - University of Massachusetts, Dartmouth †
Hans-Peter Brunner - Asian Development Bank
Jeff Cares - NECSI Director, Military Programs †
Irene Conrad - Texas A&M University Kingsville
Joe DeRosa - MITRE
Ronald DeGray - Saint Joseph College
Fred Discenzo - Rockwell Automation
Adrian Gheorghe - Old Dominion University
Helen Harte - NECSI Organization Science Program †
Guy Hoelzer - University of Nevada, Reno
Plamen Ivanov - Boston University
Mark Klein - MIT †
Holger Lange - Norwegian Forest and Landscape Institute
May Lim - NECSI, Brandeis University and University of the Philippines, Diliman
Thea Luba - The MediaMetro Society
Joel MacAuslan - S.T.A.R. Corp
Gottfried Mayer-Kress - Penn State University
Dwight Meglan - SimQuest
David Miguez - Brandeis University
Ali Minai - University of Cincinnati †
John Nash - Princeton University
Doug Norman - MITRE
Daniel Polani - University of Hertfordshire
Jonathan Vos Post - Computer Futures, Inc.
Hiroki Sayama - Binghamton University, SUNY †
Jeff Schank - University of California, Davis
Hava Siegelmann - University of Massachusetts, Amherst
John Sterman - MIT Sloan *
William Sulis - McMaster University
Jerry Sussman - MIT
Stephenson Tucker - Sandia National Laboratory
Sanith Wijesinghe - Millenium IT
Martin Zwick - Portland State University

Logistics and Coordination:
Sageet Braha, Eric Downes, Nicole Durante, Luke Evans, Debra Gorfine, Konstantin Kouptsov, Aaron Littman, Greg Wolfe
Subject areas: Unifying themes in complex systems

The themes are:

EMERGENCE, STRUCTURE AND FUNCTION: substructure, the relationship of component to collective behavior, the relationship of internal structure to external influence, multiscale structure and dynamics.
INFORMATICS: structuring, storing, accessing, and distributing information describing complex systems.
COMPLEXITY: characterizing the amount of information necessary to describe complex systems, and the dynamics of this information.
DYNAMICS: time series analysis and prediction, chaos, temporal correlations, the time scale of dynamic processes.
SELF-ORGANIZATION: pattern formation, evolution, development and adaptation.

The system categories are:

FUNDAMENTALS, PHYSICAL & CHEMICAL SYSTEMS: spatio-temporal patterns and chaos, fractals, dynamic scaling, nonequilibrium processes, hydrodynamics, glasses, non-linear chemical dynamics, complex fluids, molecular self-organization, information and computation in physical systems.
BIO-MOLECULAR & CELLULAR SYSTEMS: protein and DNA folding, bio-molecular informatics, membranes, cellular response and communication, genetic regulation, gene-cytoplasm interactions, development, cellular differentiation, primitive multicellular organisms, the immune system.
PHYSIOLOGICAL SYSTEMS: nervous system, neuro-muscular control, neural network models of brain, cognition, psychofunction, pattern recognition, man-machine interactions.
ORGANISMS AND POPULATIONS: population biology, ecosystems, ecology.
HUMAN SOCIAL AND ECONOMIC SYSTEMS: corporate and social structures, markets, the global economy, the Internet.
ENGINEERED SYSTEMS: product and product manufacturing, nano-technology, modified and hybrid biological organisms, computer based interactive systems, agents, artificial life, artificial intelligence, and robots.
Program: Sunday, June 25, 2006

PEDAGOGICAL SESSIONS - Mark Klein and Dan Braha - Session Chairs
Albert-Laszlo Barabasi - The Architecture of Complexity: Networks, biology, and dynamics
Alfred Hubler - Understanding Complex Systems - From Paradigms to Applications
Ali Minai - Biomorphic Systems
Lev Levitin - Zipf Law Revisited: A Model of Emergence and Manifestation
Felice Frankel - The Visual Expression of Complex Systems: An Approach to Understanding What Questions We Should Ask
Katy Borner - Mapping Science
Susan Sgorbati - The Emergent Improvisation Project

OPENING RECEPTION
Kenneth Wilson - Peter Drucker's Revolutionary Ideas about Education
Monday, June 26, 2006

EMERGENCE - Jonathan Vos Post - Session Chair
Judah Folkman - Cancer
Irving Epstein - Emergent Patterns in Reaction-Diffusion Systems
Eric Bonabeau - Simulating Systems

SCIFI TO SCIENCE - Jonathan Vos Post - Session Chair
David Brin - Prediction as Faith, Prediction as a Tool: Peering Into Tomorrow's World

FROM MODELING TO REALITY - Hiroki Sayama - Session Chair
Ed Fredkin - Finite Nature
Eric Klopfer - StarLogo The Next Generation
Ron Weiss - Programming Biology

NETWORKS - Ali Minai - Session Chair
John Bragin, Nicholas Gessler - Design and Implementation of an Undergraduate Degree Program in Social and Organizational Complexity
Jun Wu, Yue-jin Tan, Hong-zhong Deng, Da-zhi Zhu - A new measure of heterogeneity of complex networks based on degree sequence
David Green, Tania Leishman, Suzanne Sadedin - The emergence of social consensus in Boolean networks
Daniel Whitney - Network Models of Mechanical Assemblies
Daniel Whitney, David Alderson - Are technological and social networks really different?
Nathan LaBelle, Eugene Wallingford - Inter-package dependency networks in open-source software
Gabor Csardi, Tamas Nepusz - The igraph software package for complex network research
Gergana Bounova, Olivier de Weck - Augmented network model for engineering system design
Tania Leishman, David Green, Suzanne Sadedin - Dual phase evolution: a mechanism for self-organization in complex systems
Kurt Richardson - Complexity, Information and Robustness: The Role of Information 'Barriers' in Boolean Networks
Andrew Krueger - Inferring Network Connectivity with Combined Perturbations
Takeshi Ozeki - Evolutional family networks generated by group entry growth mechanism with preferential attachment and their features
Valeria Prokhotskaya, Valentina Ipatova, Aida Dmitrieva - Intrapopulation changes of algae under toxic exposure
John Voiklis, Manu Kapur - Two Walks through Problem Space
John Voiklis - An Active Walk through Semantic Space (Poster)
David Stokes, V. Anne Smith - A complex systems approach to history: correspondence networks among 17th century astronomers
Martin Greiner - Selforganizing wireless multihop ad hoc communication networks
Lin Leung - Intelligent International Medical Network: Design and Performance Analysis
Durga Katie, Dennis E. Brown, Charles E. Johnson, Raphael P. Hermann, Fernande Grandjean, Gary J. Long, Susan M. Kauzlarich - An Antimony-121 Moessbauer Spectral Study of Yb14MnSb11

HOMELAND SECURITY - Stephenson Tucker - Session Chair
Naim Kapucu - Interorganizational coordination in the National Response Plan (NRP): the evolution of complex systems
Philip Fellman - The complexity of terrorist networks
W. David Stephenson - Networked Homeland Security strategies
Gustavo Santana Torrellas - A Knowledge Management Framework for Security Assessment in a Multi Agent PKI-based Networking Environment
Dighton Fiddner - Scale-free policy organizations' network: artifact or phenomenon?
Stephen Ho, Paul Gonsalves, Marc Richards - Complex adaptive systems-based toolkit for dynamic plan assessment
Donald Heathfield - Improving decision making in the area of national and international security: the future map methodology
Steven McGee - Method to Enable a Homeland Security Heartbeat - Heartbeat e9-1-1
Maggie Elestwani - Disaster response and complexity model: the southeast Texas Katrina-Rita response
Margaret Lyell, Rob Flo, Mateo Mejia-Tellez - Agent-based simulation framework for studies of cognitive pedestrian agent behavior in an urban environment with crisis

MATHEMATICAL METHODS - Daniel Polani - Session Chair
G. P. Kapoor - Stability of simulation of highly uneven curves and surfaces using fractal interpolation
Val Bykovsky - Data-driven technique to model complex systems
Fu Zhang, Benito Fernández-Rodriguez - Feedback Linearization Control Of Systems With Singularities
Nikola Petrov, Arturo Olvera - Regularity of critical objects in dynamical systems
James Kar, Martin Zwick, Beverly Fuller - Short-term financial market prediction using an information-theoretic approach
Jean-Claude Torrel, Claude Lattaud, Jean-Claude Heudin - Complex Stellar Dynamics and Galactic Patterns Formation
Tibor Bosse, Alexei Sharpanskykh, Jan Treur - Modelling complex systems by integration of agent-based and dynamical systems methods
Kostyantyn Kovalchuk - Simulation uncertainty of complex economic system behavior
Xiaowen Zhang, Ke Tang, Li Shu - A Chaotic Cipher Mmohocc and Its Randomness Evaluation
J Marczyk, B Deshpande - Measuring and Tracking Complexity in Science
Cliff Joslyn - Reconstructibility Analysis as an Order Theoretical Knowledge Discovery Technique
Alec Resnick, Jesse Louis-Rosenberg - Dynamic State Networks
Matthew Francisco, Mark Goldberg, Malik Magdon-Ismail, William Wallace - Using Agent-Based Modeling to Traverse Frameworks in Theories of the Social
xiv ENGINEERING - Fred Discenzo - Session Chair Bogdan Danila, Andrew Williams, Kenric Nelson, Yong Yu, Samuel Earl, John Marsh, Zoltan Toroczkai, Kevin Bassler - Optimally Efficient Congestion Aware Transport On Complex Networks Orrett Gayle, Daniel Coore - Self-organising text in an amorphous computing environment Javier Alcazar, Ephrahim Garcia - Interconnecting Robotic Subsystems in a Network Chad Foster, Daniel Frey - Estimating complex system robustness from dual system architectures Mark Hoogendoorn, Martijn Schut, Jan Treur - Modeling decentralized organizational change in honeybee societies Jonathan R. A. Maier, Timothy Troy, Jud Johnston, Vedik Bobba, Joshua D. Summers - A Case Study Documenting Specific Configurations and Information Exchanges in Designer-Artifact-User Complex Systems Mo-Han Hsieh, Christopher Magee - Standards as interdependent artifacts: application to prediction of promotion for internet standards Chi-Kuo Mao, Cherng G. Ding, Hsiu-Yu Lee - Comparison of post-SARS arrival recovery patterns Zafar Mahmood, Dr. Nadeem Lehrasab, Muhammad Iqbal, Dr. Nazir Shah Khattak, Dr. S. Fararooy Fararooy - Eigen Analysis of Model Based Residual Spectra for Fault Diagnostics Techniques Russ Abbott - Emergence explained INNOVATION - Helen Harte - Session Chair Adam Groothuis, Sharon Mertz, Philip Vos Fellman - Multi-agent based simulation of technology succession dynamics Svetlana Ikonnikova - Games the parties of Eurasian gas supply network play: analysis of strategic investment, hold-up and multinational bargaining Koen Hindriks, Catholijn Jonker, Dmytro Tykhonov - Reducing complexity of an agent's utility space for negotiating interdependent issues Md. 
Mahbubush Salam Khan, Kazuyuki Ikko Takahashi - Mathematical model of conflict and cooperation with non-annihilating multi-opponent
Philip Fellman, Matthew Dadmun, Neil Lanteigne - Corporate strategy: from core competence to complexity, an evolutionary review
Adam Groothuis, Philip Vos Fellman - Agent based simulation of punctuated technology succession: a real-world example using medical devices in the field of interventional cardiology
Sharon Mertz, Adam Groothuis, Philip Fellman - Dynamic Modeling of New Technology Succession: Projecting the Impact of Macro Events and Micro Behaviors on Software Market Cycles
Manuel Dias, Tanya Araujo - Hypercompetitive environments: an agent based model approach
Diane McDonald, George Weir - A methodology for exploring emergence in learning communities
Diane McDonald, Nigel Kay - Towards an evaluation framework for complex social systems
Debra Hevenstone - Employer-Employee Matching with Intermediary Parties: a trade off Between Match Quality and Costs?
BIOLOGY - Jeff Schank - Session Chair
Margaret J. Eppstein, Joshua L. Payne, Bill C. White, Jason H. Moore - A 'random chemistry' algorithm for detecting epistatic genetic interactions
Jason Bates - Nonlinear network theory of complex diseases
C. Anthony Hunt - Understanding Emergent Biological Behaviors: Agent Based Simulations of In vitro Epithelial Morphogenesis in Multiple Environments
Franziska Matthaeus, Oliver Ebenhoeh - Large-scale analysis of metabolic networks: clustering metabolites by their synthesizing capacities
Martin Greiner - Impact of observational incompleteness on the structural properties of protein interaction networks
Jorge Mazzeo, Melina Rapacioli, Vladimir Flores - Characterizing cell proliferation process in the developing central nervous system
Garrett Dancik, Karin Dorman, Doug Jones - An agent-based model for Leishmania infection
Markus Schwehm, Manuel Poppe - Modelling cytoskeleton morphogenesis with SBTools
Ray Greek, Niall Shanks - Implications of complex systems in biomedical research using animals as models of humans
Robert Melamede - Endocannabinoids: Multi-scaled, Global Homeostatic Regulators of Cells and Society
Holger Sültmann - Reverse Phase Protein Arrays for protein quantification in biological samples
James Glazier - Cell Oriented Modeling Biological Development using the Cellular Potts Model
POSTERS
Pooya Kabiri - Rankine cycle Power Plant Exergy and Energy Optimization With Graph Model
Mitra Shojania Feizabadi - Tracing the behavior of tumor cells during a course of chemotherapy
Tibor Bosse, Martijn Schut, Jan Treur, David Wendt - Emergence of altruism as a result of cognitive capabilities involving trust and intertemporal decision-making
Dritan Osmani, Richard Tol - The case of two self-enforcing international agreements for environmental protection
Rene Doursat, Elie Bienenstock - How Activity Regulates Connectivity: A Self-Organizing Complex Neural Network
Maciej Swat, James Glazier - Cell Level Modeling Using CompuCell3D
Jure Dobnikar, Marko Jagodic, Milan Brumen - Computer simulation of bacterial chemotaxis
Tuesday, June 27, 2006
SOCIAL SYSTEMS - John Sterman - Session Chair
John Morgan - Global Security
Steven Hassan - Destructive mind control issues: cults, deprogramming and SIA post 9-11
Jay Forrester - System Dynamics
Michael Hammer - Reengineering the Corporation
EDUCATION AND HEALTHCARE - Irene Conrad and Helen Harte - Session Chair
Nastaran Keshavarz, Don Nutbeam, Louise Rowling, Fereidoon Khavarpour - Can complexity theory shed light on our understanding about school health promotion?
Ashok Kay (Kanagarajah), Peter Lindsay, Anne Miller, David Parker - An exploration into the uses of agent-based modeling to improve quality of health care
Alice Davidson, Marilyn Ray - Complexity for human-environment well-being
Thea Luba - Creating, Connecting and Collaborating in SonicMetro: Our on-line complex systems model for Arts Education
Hiroki Sayama - Teaching emergence and evolution simultaneously through simulated breeding of artificial swarm behaviors
Paulo Blikstein, Uri Wilensky - Learning About Learning: Using Multi-Agent Computer Simulation to Investigate Human Cognition
COMPLEX SYSTEMS EDUCATION - Thea Luba - Session Chair
Charles Hadlock - Guns, germs, and steel on the sugarscape: introducing undergraduates to agent based simulation
Nicholas Gessler - Through the looking-glass with ALiCE: artificial life, culture and evolution
PHYSIOLOGY - Plamen Ivanov - Session Chair
Bela Suki - Fluctuations and noise in respiratory and cell physiology
Gottfried Mayer-Kress, Yeou-Teh Liu, Karl Newell - Complexity of Human Movement Learning
Attila Priplata - Noise-Enhanced Human Balance Control
Cecilia Diniz Behn, Emery Brown, Thomas Scammell, Nancy Kopell - Dynamics of behavioral state control in the mouse sleep-wake network
Indranill Basu Ray - Interpreting complex body signals to predict sudden cardiac death, the largest killer in the Western Hemisphere
Bruce West - Fractional Calculus In Physiologic Networks
MEDICAL SIMULATION - Dwight Meglan - Session Chair
Dwight Meglan - Simulation-based Surgery Training
Bryan Bergeron - Interdependent Man-Machine Problem Solving in Serious Games
James Rabinov - New Approaches to Computer-based Interventional Neuroradiology Training
Joseph Teran - Scientific Computing Applications in Biomedical Simulation of Soft Tissues
NETWORKS - Dan Braha - Session Chair
V Anne Smith - Bayesian network inference algorithms for recovering networks on multiple biological scales: molecular, neural, and ecological
Peter Dodds - Social contagion on networks: groups and chaos
Jun Ohkubo, Kazuyuki Tanaka - Fat-tailed degree distributions generated by quenched disorder
Joseph E.
Johnson - Generalized entropy as network metrics
Hugues Berry, Benoit Siri, Bruno Cessac, Bruno Delord, Mathias Quoy - Topological and dynamical structures induced by Hebbian learning in random neural networks
Gabor Csardi, Katherine Strandburg, Laszlo Zalanyi, Jan Tobochnik, Peter Erdi - Estimating the dynamics of kernel-based evolving networks
Michael Gastner - Traffic flow in a spatial network model
Pedram Hovareshti, John Baras - Consensus problems on small world graphs: a structural study
Dalia Terhesiu, Luis da Costa - On the relationship between complex dynamics and complex geometrical structure
HOMELAND SECURITY - Jeff Cares - Session Chair
Giuseppe Narzisi, Venkatesh Mysore, Lewis Nelson, Dianne Rekow, Marc Triola, Liza Halcomb, Ian Portelli, Bud Mishra - Complexities, Catastrophes and Cities: Unraveling Emergency Dynamics
Nancy Hayden, Richard Colbaugh - The complexity of terrorism: considering surprise and deceit
Markus Schwehm, Chris Leary, Hans-Peter Duerr, Martin Eichner - InterSim: A network-based outbreak investigation and intervention planning tool
Gary Nelson - Structure and Dynamics of Multiple Agents in Homeland Security Risk Management
Czeslaw Mesjasz - Complexity Studies and Security in the Complex World: An Epistemological Framework of Analysis
Corey Lofdahl, Darrall Henderson - Coordinating National Power using Complex Systems Simulation
Kevin Brandt - Operational Synchronization
Alex Ryan - About the Bears and the Bees: Adaptive Responses to Asymmetric Warfare
Adrian Gheorghe - Vulnerability Assessment of Complex Critical Infrastructures
EVOLUTION AND ECOLOGY - Guy Hoelzer - Session Chair
Elise Filotas, Lael Parrott, Martin Grant - Effect of space on a multi-species community model with individual-based dynamics
Pascal Cote, Lael Parrott - Application of a genetic algorithm to generate non-random assembly sequences of a community assembly model
Javier Burgos, Julia Andrea Perez - A Methodological guide to environmental prioritizing using hyperbolic laws and strategical ecosystems
Holger Lange, Bjørn Økland, Paal Krokene - Thresholds in the life cycle of the spruce bark beetle under climate change
Lora Harris - Ramet rules: merging mechanistic growth models with an individual-based approach to simulate clonal plants
Tibor Bosse, Alexei Sharpanskykh, Jan Treur - On the complexity monotonicity thesis for environment, behaviour and cognition
Mauricio Rincón-Romero, Mark Mulligan - Hydrological sensitivity analysis to LUCC in Tropical Mountainous Environment
ALIFE AND EVOLUTION - Hiroki Sayama - Session Chair
William Sulis - Emergence in the Game of Life
Predrag Tosic - Computational Complexity of Counting in Sparsely Networked Discrete Dynamical Systems
Rene Doursat - The growing canvas of biological development: multiscale pattern generation on an expanding lattice of gene regulatory networks
Kovas Boguta - Informational fracture points in cellular automata
Christof Teuscher - Live and Let Die: Will there be Life after Biologically Inspired Computation?
Hideaki Suzuki - A molecular network rewiring rule that represents spatial constraint
PHYSICAL SYSTEMS - Jonathon Post - Session Chair
Karoline Wiesner, James Crutchfield - Computation in Finitary Quantum Processes
Wm. C.
McHarris - Complexity via Correlated Statistics in Quantum Mechanics
Xiangdong Li, Andis ChiTung Kwan, Michael Anshel, Christina Zamfirescu, Lin Wang Leung - To Quantum Walk or Not
Philip Fellman, Jonathon Post - Nash equilibrium and quantum computational complexity
Ravi Venkatesan - Information Encryption using a Fisher-Schroedinger Model
INNOVATION - Iqbal Adjali - Session Chair
Jeroen Struben - Identifying challenges for sustained adoption of alternative fuel vehicles and infrastructure
Jiang He, M. Hosein Fallah - Mobility of innovators and prosperity of geographical technology clusters
Ian Wilkinson, Robert Marks, Louise Young - Toward Agent Based Models of the Development And Evolution of Business Relations and Networks
Nunzia Carbonara, Ilaria Giannoccaro, Vito Albino - The competitive advantage of geographical clusters as complex adaptive systems: an exploratory study based on case studies and network analysis
Kazuyuki Takahashi, Md. Mahbubush Salam Khan - Complexity on Politics: How we construct perpetual peace
Philip Fellman, Jonathon Post - Complexity, competitive intelligence and the 'first mover' advantage
Thomas Brantle, M. Hosein Fallah - Complex knowledge networks and invention collaborations
ENGINEERING - Fred Discenzo - Session Chair
Sarjoun Doumit, Ali Minai - Distributed resource exploitation for autonomous mobile sensor agents in dynamic environments
Abhinay Venuturumilli, Ali Minai - Obtaining Robust Wireless Sensor Networks through Self-Organization of Heterogenous Connectivity
Justin Werfel, Yaneer Bar-Yam, Radhika Nagpal - Automating construction with distributed robotic systems
Fred Discenzo, Francisco Maturana, Raymond Staron - Distributed diagnostics and dynamic reconfiguration using autonomous agents
Adel Sadek, Nagi Basha - Self-Learning Intelligent Agents for Dynamic Traffic Routing on Transportation Networks
Fred M. Discenzo, Dukki Chung - Power scavenging enables maintenance-free wireless sensor nodes
Jacob Beal - What the assassin's guild taught me about distributed computing
Predrag Tosic - Distributed Coalition Formation for Sparsely Networked Large-Scale Multi-Agent Systems
SYSTEMS ENGINEERING - Doug Norman - Session Chair
Sarah Sheard - Bridging Systems Engineering and Complex Systems Sciences
Michael Kuras, Joseph DeRosa - What is a system?
Anne-Marie Grisogono - Success and failure in adaptation
Jonathan R. A. Maier, Matt Motyka, Georges M. Fadel - Representing the Complexity of Engineering Systems: A Multidisciplinary Perceptual Approach
Leonard Wojcik, Sam Chow, Olivier de Weck, Christian LaFon, Spyridon Lekkakos, James Lyneis, Matthew Rinaldi, Zhiyong Wang, Paul Wheeler, Marat Zborovskiy - Can Models Capture the Complexity of the Systems Engineering Process?
Richard Schmidt - Synthesis of Systems of Systems [SoS] is in fact the Management of System Design Complexity
LEADERSHIP
Nanette Blandin - Re-conceptualizing leadership for complex social systems
Russ Marion - Multi-Agent Based Simulation of a Model of Complexity Leadership
Chi-Kuo Mao - Principles of Organization Change - A Complex System Perspective
Earl Valencia - Architecting the Next Generation of Technical Leaders
Cory Costanzo, Ian Littlejohn - Early Detection Capabilities: Applying Complex Adaptive Systems Principles to Business Environments
Wednesday, June 28, 2006
EVOLUTION - Jerry Sussman - Session Chair
Edward O. Wilson - New Understanding in Social Evolution
David Sloan Wilson - Rethinking the Theoretical Foundations of Sociobiology
Andreas Wagner - Robustness and Evolution
Charles Goodnight - Complexity and Evolution in Structured Populations
PHYSIOLOGY - Aaron Littman - Session Chair
Madalena Costa, Ary Goldberger, C.-K. Peng - Broken asymmetry of the human heartbeat: loss of time irreversibility in aging and disease
David Garelick - Body sway technology
Helene Langevin - Connective tissue: a body-wide complex system network?
Edward Marcus - Manifestations of Cellular Contraction Patterns on the Cardiac Flow Output
Andrei Irimia, Michael Gallucci, John Wikswo - Comparison of chaotic biomagnetic field patterns recorded from the arrhythmic cardiac and GI systems
Muneichi Shibata - A whole-body metabolism simulation model
LANGUAGE - Adrian Gheorghe - Session Chair
F. Canan Pembe, Haluk Bingol - Complex Networks in Different Languages: A Study of an Emergent Multilingual Encyclopedia
Gokhan Sahin, Murat Erenturk, Avadis Hacinliyan - Search for chaotic structures in Turkish and English texts by detrended fluctuation and time series analyses
PSYCHOLOGY
Jeff Schank, Chris May, Sanjay Joshi - Can robots help us understand the development of behavior?
Irina Trofimova - Ensembles with Variable Structure (EVS) in the modeling of psychological phenomena
Michael Roberts, Robert Goldstone - Human-environment interactions in group foraging behavior
MATHEMATICAL METHODS - Joel MacAuslan - Session Chair
Yuriy Gulak, Haym Benaroya - Nonassociative algebraic structures and complex dynamical systems
Martin Zwick, Alan Mishchenko - Binary Decision Diagrams and Crisp Possibilistic Reconstructability Analysis
Thomas Ray - Self Organization in Real and Complex Analysis
ALIFE AND EVOLUTION - William Sulis - Session Chair
Hiroki Sayama - On self-replication and the halting problem
Gerald H Thomas, Hector Sabelli, Lazar Kovacevic, Louis Kauffman - Biotic patterns in the Schroedinger's equation and the early universe
Zann Gill - Designing Challenges to Harness C-IQ [collaborative intelligence]
DYNAMICS OF MATERIALS
Sergio Andres Galindo Torres - Computational simulation of the hydraulic fracturing process using a discrete element method
Jun Yu, Laura Gross, Christopher Danforth - Complex dynamic behavior on transition in a solid combustion model
GAME THEORY - John Nash and Adrian Gheorghe - Session Chairs
Paul Scerri, Katia Sycara - Evolutionary Games and Social Networks in Adversary Reasoning
Mike Mesterton-Gibbons, Tom Sherratt - Animal network phenomena: insights from triadic games
Simon Angus - Cooperation networks: endogeneity and complexity
Gerald H Thomas, Keelan Kane - Prisoner's Dilemma in a Dynamic Game Theory
Andis ChiTung Kwan - On Nash-connectivity and choice set problem
ACADEMIA - Ronald Degray - Session Chair
Hank Allen - Complexity, social physics, and emergent dynamics of the U.S. academic system
Sean Park, Del Harnish - Rethinking Accountability and Quality in Higher Education
ENGINEERING
Neena George, Ali Minai, Simona Doboli - Self-organized inference of spatial structure by randomly deployed sensor networks
May Lim, Dan Braha, Sanith Wijesinghe, Stephenson Tucker, Yaneer Bar-Yam - Preferential Detachment: Improving Connectivity and Cost Trade-offs in Signaling Networks
Martin Greiner - Proactive robustness control of heterogeneously loaded networks
DISEASE
John Holmes - Communicable disease outbreak detection and emergence of etiologic phenomena in an evolutionary computation system
Clement McGowan - Biological Event Modeling for Response Planning
DYNAMICAL METHODS - David Miguez - Session Chair
Christopher Danforth, James Yorke - Making forecasts for chaotic physical processes
Michael Hauhs, Holger Lange - Organisms, rivers, and coalgebras
Burton Voorhees - Emergence of Metastable Mixed Choice in Probabilistic Induction
Cristian Suteanu - Anisotropy and spatial scaling aspects in the dynamics of evolving dissipative systems with discrete appearance
Hector Sabelli, Lazar Kovacevic - Biotic Population Dynamics and the Theory of Evolution
HERBERT SIMON AWARDS - Yaneer Bar-Yam - Session Chair
Kenneth Wilson - Herbert Simon Award presented to Kenneth Wilson for his development of the Renormalization Group
John F. Nash Jr. - Herbert Simon Award presented to John F. Nash Jr. for his analysis of Game Theory
John F. Nash Jr. - Multiplayer Game Theory
Marvin Minsky - Reminiscence of John F. Nash Jr.
Thursday, June 29, 2006
SYSTEMS BIOLOGY - Hava Siegelmann - Session Chair
Naama Barkai - From Dynamic Mechanisms to Networks
Jose Venegas - Anatomy of an Asthma Attack
Luis Amaral - Biological and Social Networks
SYSTEMS ENGINEERING - Doug Norman, Joe DeRosa - Session Chairs
Lou Metzger - Systems Engineering
Tony De Simone - The Global Information Grid
Kenneth Hoffman, Lindsley Boiney, Renee Stevens, Leonard Wojcik - Complex systems landscapes: the enterprise in a socio-economic context
Brian White - On the Pursuit of Enterprise Systems Engineering Ideas
Dean Bonney - Inquiry and Enterprise Transformation
Keith McCaughin, Joseph DeRosa - Stakeholder Analysis To Shape the Enterprise
Carlos Troche - Documenting Complex Systems in the Enterprise
Michael Webb - Capability-Based Engineering Analysis (CBEA)
EVOLUTION AND ECOLOGY - Guy Hoelzer - Session Chair
Claudio Tebaldi, Deborah Lacitignola - Complex Features in Lotka-Volterra Systems with Behavioral Adaptation
Jeffrey Fletcher, Martin Zwick - Unifying the theories of inclusive fitness and reciprocal altruism
Georgy Karev - On mathematical theory of selection: discrete-time models
Suzanne Sadedin - Adaptation and self-organization in spatial models of speciation
Chris Wright - Lotka-Volterra community organization increases with added trophic complexity
SCIENCE FICTION - Jonathon Vos Post - Session Chair
Geoffrey Landis - Science, Science Fiction, & Life in the Universe
Stanley Schmidt - The Symbiosis of Science and Science Fiction
CONCEPTS - Helen Harte - Session Chair
Susan Sgorbati and Bruce Weber - How Deep and Broad are the Laws of Emergence?
Pierpaolo Andriani, Jack Cohen - Innovation in biology and technology: exaptation precedes adaptation
Burton Voorhees - Paradigms of Order
Rush D. Robinett, III, David G. Wilson, Alfred W. Reed - Exergy Sustainability for Complex Systems
Diane McDonald, George Weir - Developing a conceptual model for exploring emergence
Victor Korotkikh, Galina Korotkikh - On an irreducible theory of complex systems
Michael Hülsmann, Bernd Scholz-Reiter, Michael Freitag, Christine Wycisk, Christoph de Beer - Autonomous Cooperation as a Method to cope with Complexity and Dynamics: A Simulation based Analysis and Measurement Concept Approach
Jonathon Post, Philip Fellman - Complexity in the Paradox of Simplicity
Daniel Polani - Emergence, Intrinsic Structure of Information, and Agenthood
Irina Ezhkova - Self-Organizing Architecture of Complex Systems
BIOLOGICAL NETWORKS - Ali Minai - Session Chair
Guy Haskin Fernald, Jorge Oksenberg, Sergio Baranzini - Mutual information networks unveil global properties of IFNβ immediate transcriptional effects in humans
Holger Sueltmann - Delineating breast cancer gene expression networks by RNA interference and global microarray analysis in human tumor cells
Samantha Kleinberg, Marco Antoniotti, Satish Tadepalli, Naren Ramakrishnan, Bud Mishra - Remembrance of experiments past: a redescription approach for knowledge discovery in complex systems
Shai Shen-Orr, Yitzhak Pilpel, Craig Hunter - Embryonic and Maternal Genes have Different 5' and 3' Regulation Complexity
Tinri Aegerter-Wilmsen, Christof Aegerter, Konrad Basler, Ernst Hafen - Coupling biological pattern formation and size control via mechanical forces
Blake Stacey - On Motif Statistics in Symmetric Networks
Benjamin de Bivort, Sui Huang, Yaneer Bar-Yam - Dynamics of cellular level function and regulation derived from murine expression array data
Thierry Emonet, Philippe Cluzel - From molecules to behavior in bacterial chemotaxis
NEURAL AND PHYSIOLOGICAL DYNAMICS - Gottfried Mayer-Kress - Session Chair
Tetsuji Emura - A spatiotemporal coupled Lorenz model drives emergent cognitive process
Steve Massaquoi - Hierarchical and parallel organization, scheduled scaling of error-type
signals, and synergistic actuation appear to greatly simplify and robustify human motor control
Michael Holroyd - Synchronizability and connectivity of discrete complex systems
Walter Riofrio, Luis Angel Aguilar - Different Neurons Population Distribution correlates with Topologic-Temporal Dynamic Acoustic Information Flow
Dr. Joydeep Bhattacharya - An index of signal mode complexity based on orthogonal transformation
Konstantin L Kouptsov, Irina Topchiy, David Rector - Brain synchronization during sleep
David G. Miguez - Experimental steady pattern formation in reaction-diffusion-advection systems
Harikrishnan Parameswaran, Arnab Majumdar, Bela Suki - Relating Microscopic and Macroscopic indices of alveolar destruction in emphysema
Arnab Majumdar, Adriano M. Alencar, Sergey V. Buldyrev, Zoltán Hantos, H. Eugene Stanley, Bela Suki - Branching asymmetry in the lung airway tree
GLOBAL SYSTEMS - Sanith Wijesinghe - Session Chair
Sanith Wijesinghe - Thursday Afternoon Breakout Session on Global Systems
Hans-Peter Brunner - Application of complex systems research to efforts of international development
Iqbal Adjali - An agent-based spatial model of consumer behavior
Kumar Venkat, Wayne Wakeland - Emergence of Networks in Distance-Constrained Trade
Roxana Wright, Philip Fellman, Jonathon Post - Path Dependence, Transformation and Convergence: A Mathematical Model of Transition to Market
Craig Williams - Transaction Costs, Agency Theory and the Complexity of Electric Power Distribution Governance
Mauricio Rincón-Romero - Hydrological catchment as a Complex System for their Environmental Planning
PHYSICAL SYSTEMS - May Lim - Session Chair
David Garelick - Particles traveling faster than the speed of light
Cintia Lapilli, Peter Pfeifer, Carlos Wexler - Universality away from critical points in a thermostatistical model
Sean Shaheen - Molecular Self-Assembly Processes in Organic Photovoltaic Devices
Leila Shokri, Boriana Marintcheva, Charles C. Richardson, Mark C. Williams - Salt Dependent Binding of T7 Gene 2.5 Protein to DNA from Single Molecule Force Spectroscopy
Vivian Halpern - Precursors of a phase transition in a simple model system
Jonathon Post, Christine Carmichael, Philip Fellman - Emergent phenomena in higher-order electrodynamics
Cynthia Whitney - On Seeing the Superluminals
Aziz Raouak - Diffusion And Topological Properties Of Phase Space In The Standard Map
SYSTEMS ENGINEERING II - Doug Norman and Joe DeRosa - Session Chairs
Joyce Williams - Systems Thinking: The 'Softer Side' of Complex Systems Engineering
Thomas Speller, Daniel Whitney, Edward Crawley - System Architecture Generation based on Le Pont du Gard
Michael McFarren, Fatma Dandashi, Huei-Wan Ang - Service Oriented Architectures Using DoDAF
Jeff Sutherland, Anton Victorov, Jack Blount - Adaptive Engineering of Large Software Projects with Distributed/Outsourced Teams
Robert Wiebe, Dan Compton, Dave Garvey - A System Dynamics Treatment of the Essential Tension Between C2 and Self-Synchronization
George Rebovich - Enterprise Systems Engineering: New and Emerging Perspectives
John J.
Roberts - Enterprise Analysis and Assessment of Complex Military Command and Control Environments
EVOLUTION AND ECOLOGY II - Holger Lange - Session Chair
Justin Scace, Adam Dobberfuhl, Elizabeth Higgins, Caroly Shumway - Complexity and the evolution of the social brain
Yosef Maruvka, Nadav Shnerb - The Surviving Creatures: The stable state on the species Network
Lauren O'Malley - Fisher Waves and Front Roughening in a Two-Species Invasion Model with Preemptive Competition
Marcelo Ferreira da Costa Gomes, Sebastián Gonçalves - The SIR model with delay
Guy Hoelzer, Rich Drewes, Rene Doursat - Temporal waves of genetic diversity in a spatially explicit model of evolution: heaving toward speciation
GLOBAL SYSTEMS II - Hans-Peter Brunner - Session Chair
Dmitry Chistilin - Global security
Mehmet Tezcan - The EU Foreign Policy Governance As A Complex Adaptive System
Doug Smith - Bifurcation in Social Movements
J. (Janet) Terry Rolfe - A New Era of Causality and Responsibility: Assessing the Evolving Superpower Role of the United States
Walid Nasrallah - Evolutionary paths to a corrupt society of artificial agents
Carlos Puente - From Complexity to Peace
Claudio Tebaldi, Giorgio Colacchio - Chaotic Behavior in a Modified Goodwin's Growth Cycle Model
Bennett Stark - The Global Political System: A Dynamical System within the Chaotic Phase, A Case Study of Stuart Kauffman's Complex Adaptive System Theory
RECONSTRUCTABILITY ANALYSIS - Martin Zwick - Session Chair
Martin Zwick - A Short Tutorial on Reconstructability Analysis
Roger Cavallo - Whither Reconstructability Analysis
Gary Shaffer - The K-systems Niche
Michael S. Johnson, Martin Zwick - State-based Reconstructability Analysis
Mark Wierman, Mary Dobransky - An Empirical Study of Search Algorithms Applied to Reconstructability Analysis
Berkan Eskikaya - An extension of reconstructability analysis with implications to systems science and evolutionary computation
Friday, June 30, 2006
GLOBAL SYSTEMS
Hans-Peter Brunner - Application of complex systems research to efforts of international development
Patrick M. Hughes - Complexity, Convergence, and Confluence
Peter Brecke - Modeling Global Systems
Ricardo Hausmann - International Development
Publications:
Proceedings: Conference proceedings (this volume). On the web at http://necsi.edu/events/iccs6/proceedings.html. Video proceedings are available to order through the New England Complex Systems Institute.
Journal articles: Individual conference articles were published online at http://interjournal.org
Web pages:
The New England Complex Systems Institute http://necsi.org
The First International Conference on Complex Systems (ICCS1997) http://www.necsi.org/html/ICCS_Program.html
The Second International Conference on Complex Systems (ICCS1998) http://www.necsi.org/events/iccs/iccs2program.html
The Third International Conference on Complex Systems (ICCS2000) http://www.necsi.org/events/iccs/iccs3program.html
The Fourth International Conference on Complex Systems (ICCS2002) http://www.necsi.org/events/iccs/iccs4program.html
The Fifth International Conference on Complex Systems (ICCS2004) http://www.necsi.org/events/iccs/openconf/author/iccsprogram.php
The Sixth International Conference on Complex Systems (ICCS2006) http://www.necsi.org/events/iccs6/index.php
The Seventh International Conference on Complex Systems (ICCS2007) http://www.necsi.org/events/iccs7/index.php
NECSI Wiki http://necsi.org/community/wiki/index.php/Main_Page
InterJournal - The journal of the New England Complex Systems Institute http://interjournal.org
Part I: Methods
Chapter 1
Emergence, Intrinsic Structure of Information, and Agenthood Daniel Polani Adaptive Systems Research Group School of Computer Science, University of Hertfordshire, UK [email protected]
Emergence is a central organizing concept for the understanding of complex systems. Among the manifold mathematical notions that have been introduced to characterize emergence, the information-theoretic ones are of particular interest, since they provide a quantitative and transparent approach and generalize beyond the immediate scope at hand. We discuss approaches to characterize emergence using information theory via the intrinsic temporal or compositional structure of the information dynamics of a system. This approach is devoid of any external constraints and is purely a property of the information dynamics itself. We then briefly discuss how emergence naturally connects to the concept of agenthood, which has been recently defined using information flows.
1 Introduction
The concept of emergence is of central importance to understanding complex systems. Although there seems to be quite an intuitive agreement in the community about which phenomena in complex systems are to be viewed as "emergent", similarly to the concept of complexity it seems difficult to construct a universally accepted precise mathematical notion of emergence. Unsurprisingly, one is thus faced with a broad spectrum of different approaches to defining emergence. The present paper will briefly discuss a number of notions of emergence and then focus on the information-theoretic variants. Due to its universality, information theory has spawned a rich body of concepts based on its language. It provides power of quantification, characterization and prediction. The paper will discuss how existing information-theoretic notions of emergence can be connected to issues of intrinsic structure of information and the concept of "agenthood", and thus provide new deep insights into the ramifications, and perhaps the reason why emergence plays such an important role.
2 Some Notions of Emergence
Of the broad spectrum of notions for emergence we will give an overview of a small selection representing a few particularly representative approaches, before concentrating on the information-theoretic notions which form the backbone of the philosophy of the present paper. A category-theoretic notion of emergence has been brought forward in [11]. While having the advantage of mathematical purity, category theory does not lend itself easily to practical use in concrete systems. One of the difficulties is that the issue of identifying the emergent levels of description is exogenous to the formalism: these have to be formulated externally and are then verified by the formalism. As is, the approach provides no (even implicitly) constructive way of finding the emergent levels of description. The difficulty of identifying the right levels of description for emergence in a system has raised the suspicion that emergence would have to be considered only "in the eye of the beholder" [6]. In view of the fact that human observers typically agree on the presence of emergence in a system, it is often felt that it would rather be desirable to have a notion of emergence that does not depend on the observer, but is a property that arises naturally from the dynamics of the system. In an attempt to derive emergent properties of a system, a pioneering effort to describe organizing principles in complex systems is the approach taken by synergetics [4]. The model attempts to decompose nonlinear dynamic systems in a natural fashion. In the vicinity of fixed points, dynamical systems decompose naturally into stable, central and unstable manifolds. Basically, this decomposes a system into fast and slow moving degrees of freedom (fast foliations and slow manifolds). Since the lifetime of the slow degrees of freedom exceeds that of the fast ones, Haken termed the former master modes, as compared to the latter, which he termed slave modes.
The main tenet of synergetics is that these master modes dominate the dynamics of the system. In the language of synergetics, the master modes correspond to emergent degrees of freedom. An information-theoretic approach to the concepts of synergetics is presented in [5]. The synergetics view necessarily couples the concept of emergence to the existence of significantly different timescales. In addition, the applicability of the above decomposition is limited to the neighbourhood of a fixed point. Under certain conditions, however, it is possible to achieve a canonical decomposition of chaotic dynamical systems into weakly coupled or decoupled subsystems even without separate timescales [13]. In addition to the above, a significant number of other approaches exist, of which we will briefly mention a few in §3.4 in relation to the information-theoretic approaches to be discussed in §3.2 and §3.3.
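The master/slave separation can be illustrated numerically. The following is a minimal sketch (not from the paper; the system and rate constants are hypothetical) of a two-variable linear system in which a fast mode relaxes onto, and is thereafter "enslaved" by, the slow master mode:

```python
# Hypothetical toy system illustrating Haken's slaving principle:
#   slow (master) mode:  dx/dt = -eps * x          (eps small)
#   fast (slave) mode:   dy/dt = -gamma * (y - x)  (gamma large)
# After a short transient, y is enslaved: y(t) tracks x(t), so the
# long-term dynamics is governed by the slow mode alone.

def simulate(eps=0.1, gamma=10.0, x0=1.0, y0=-1.0, dt=1e-3, t_end=5.0):
    x, y = x0, y0
    for _ in range(int(t_end / dt)):
        x += dt * (-eps * x)          # slow decay of the master mode
        y += dt * (-gamma * (y - x))  # fast relaxation onto the master mode
    return x, y

x, y = simulate()
# The slave mode has locked onto the master mode up to O(eps/gamma).
print(f"x = {x:.4f}, y = {y:.4f}, |y - x| = {abs(y - x):.2e}")
```

Despite starting far apart, the two variables agree after the fast transient, so the long-term behaviour is described by the single slow degree of freedom, exactly the reduction synergetics exploits.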
3 Concepts of Emergence
Among the possible formalizations of emergence, the information-theoretic ones are particularly attractive due to the universality of information theory and the power of expression, description and prediction it provides, as well as the potential to provide paths for explicitly constructing the structures of relevance (see e.g. §4).
3.1 Notation
We introduce some notation. For random variables we use capital letters such as X, Y, Z; for the values they assume, letters such as x, y, z; and for the sets they take values in, letters such as 𝒳, 𝒴, 𝒵. For simplicity of notation, we will assume that such a set 𝒳 is finite. A random variable X is determined by the probabilities Pr(X = x) for all x ∈ 𝒳. Similarly, joint variables (X, Y) are determined via Pr(X = x, Y = y), and conditional variables via Pr(Y = y | X = x). If there is no danger of confusion, we will prefer writing the probabilities in the shorthand forms p(x), p(x, y) and p(y|x) instead of the more cumbersome explicit forms above.

Define the entropy of a random variable X by H(X) := −Σ_{x∈𝒳} p(x) log p(x) and the conditional entropy of Y given X as H(Y|X) := Σ_{x∈𝒳} p(x) H(Y|X = x), where H(Y|X = x) := −Σ_{y∈𝒴} p(y|x) log p(y|x) for x ∈ 𝒳. The joint entropy H(X, Y) is the entropy of the random variable (X, Y) with jointly distributed X and Y. The mutual information of random variables X and Y is defined as I(X;Y) := H(Y) − H(Y|X) = H(X) + H(Y) − H(X, Y).

In analogy to (regular) maps between sets, we define a probabilistic map X → Y via a conditional p(y|x). If the probabilistic map is deterministic, we call it a projection. A given probability distribution p(x) on 𝒳 induces a probability distribution p(y) on 𝒴 via the probabilistic map X → Y in the natural way.
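These definitions translate directly into code. The following minimal Python sketch (an illustration of mine, not code from the paper) computes H(X) and I(X;Y) from a joint distribution given as a dictionary:

```python
import math

def H(p):
    """Entropy (in bits) of a distribution given as {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marginal(joint, axis):
    """Marginalize a joint distribution {(x, y): p} onto axis 0 (X) or 1 (Y)."""
    m = {}
    for xy, q in joint.items():
        m[xy[axis]] = m.get(xy[axis], 0.0) + q
    return m

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return H(marginal(joint, 0)) + H(marginal(joint, 1)) - H(joint)

# Example: Y is a noiseless copy of a fair bit X, so I(X;Y) = H(X) = 1 bit.
joint = {(0, 0): 0.5, (1, 1): 0.5}
print(H(marginal(joint, 0)))      # 1.0
print(mutual_information(joint))  # 1.0
```

For independent X and Y the same function returns 0, matching I(X;Y) = 0 as the criterion of independence used below.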
3.2 Emergence as Improved Predictive Efficiency
Based on the epsilon-machine concept, a notion of emergence in time series has been developed in [1, 12]: a process emerges from another one if it has a greater predictive efficiency than the second, i.e. the ratio between prediction information (excess entropy) and the complexity of the predicting epsilon-machine is higher for the emerging process than for the original process. This gives a natural and precise meaning to the perspective that emergence should represent a simpler coarse-grained view of a more intricate fine-grained system dynamics. In the following, we review the technical aspects of this idea in more detail.
For this, we generally follow the line of [12]. Consider a random variable X together with some projection X → X̂. Then define the statistical complexity of the induced variable X̂ as C_μ(X̂) := H(X̂). Let random variables X, Y be given, where we wish to predict Y from X. Define an equivalence relation ∼ on 𝒳 of equivalent predictiveness with respect to Y via

∀x, x′ ∈ 𝒳: x ∼ x′ iff ∀y ∈ 𝒴: p(y|x) = p(y|x′).   (1.1)

The equivalence relation ∼ induces a partition 𝒳̂ of 𝒳 into equivalence classes¹. Each x ∈ 𝒳 is naturally a member of one of the classes x̂ ∈ 𝒳̂, and thus there is a natural projection of X onto X̂. This induces a probability distribution on 𝒳̂ and makes X̂ a random variable, which is called a causal state.

Consider now an infinite sequence …, S₋₁, S₀, S₁, … of random variables. Furthermore, introduce the notation S_[t,t′] for the subsequence S_t, S_{t+1}, …, S_{t′−1}, S_{t′}, where for t = −∞ one has a left-infinite subsequence and for t′ = ∞ a right-infinite subsequence. We consider only stationary processes, i.e. processes where p(s_[t,∞)) = p(s_[t′,∞)) for any t, t′. Then, without loss of generality, one can write ←S := S_(−∞,t) for the past of the process and →S := S_[t,∞) for the future of the process, as well as ↔S for the whole sequence.

Following (1.1), introduce an equivalence between different pasts ←s, ←s′ in predictiveness with respect to the future →S (for a detailed technical treatment of the semi-infinite sequences, see [12]). This induces a causal state Ŝ. In [12] it is then shown that, if a realization ŝ of the causal state induced by the past S_(−∞,t) is followed by a realization s of S_{t+1}, the subsequent causal state induced by S_(−∞,t+1) is uniquely determined. This induces an automaton on the set of causal states, called the ε-machine. The statistical complexity C_μ(Ŝ) (the entropy of the ε-machine) then measures how much memory the process stores about its past.

As opposed to that, one can consider the excess entropy of the process, defined by E := I(←S; →S). The excess entropy effectively measures how much information the past of a process contains about the future. It can easily be shown that E ≤ C_μ(Ŝ). In other words, the amount the past of the process "knows" at a point about its future cannot exceed the size C_μ(Ŝ) of the internal memory of the process.

Armed with these notions, the following definition of emergence is suggested in [12]: define E/C_μ(Ŝ) ∈ [0, 1] as a measure of predictive efficiency, that is, how much of the internal process memory is actually used to predict what is going to happen. If we consider a derived process induced by a projection applied to each member of the sequence ↔S, this derived process is then called emergent if it has a higher predictive efficiency than the process it derives from. In particular, emergence is an intrinsic property of the process and does not depend on a subjective observer.

¹ Note that the partition consists of subsets of 𝒳. However, we will use the partition itself later as a new state space; therefore the individual equivalence classes x̂ are both subsets of 𝒳 and states of the new state space 𝒳̂.
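For finite alphabets, the equivalence relation (1.1) and the statistical complexity are straightforward to compute. The following Python sketch (my own illustration, not code from [12]; the example distribution is invented) partitions 𝒳 into causal-state classes and measures C_μ:

```python
import math
from collections import defaultdict

def causal_states(p_y_given_x, p_x):
    """Partition the x's into classes with identical predictive
    distributions p(y|x), following equivalence (1.1). Returns the
    classes and the statistical complexity C_mu = H(induced variable)."""
    classes = defaultdict(list)
    for x, cond in p_y_given_x.items():
        # Round to guard against floating-point noise in the comparison.
        key = tuple(sorted((y, round(q, 12)) for y, q in cond.items()))
        classes[key].append(x)
    # Probability of each causal state = total probability of its members.
    weights = [sum(p_x[x] for x in members) for members in classes.values()]
    c_mu = -sum(w * math.log2(w) for w in weights if w > 0)
    return list(classes.values()), c_mu

# Toy example: x in {0,1,2,3}; x=0 and x=2 predict y identically,
# as do x=1 and x=3 -> two causal states, C_mu = 1 bit.
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 2: {0: 0.9, 1: 0.1},
               1: {0: 0.2, 1: 0.8}, 3: {0: 0.2, 1: 0.8}}
p_x = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
states, c_mu = causal_states(p_y_given_x, p_x)
print(len(states), c_mu)  # 2 1.0
```

The four predictor values collapse to two causal states, so the minimal predictive memory is 1 bit rather than the 2 bits of the raw variable.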
3.3 Emergent Descriptions
The emergent description model developed by the author in [10] takes an approach that, while related to predictive efficiency, differs from it in some important aspects. In the emergent descriptions model, consider again a sequence of random state variables S. Assume that there exists a collection of k probabilistic mappings, each inducing a sequence of random variables S^(i), with i = 1…k, forming a decomposition of the original system S. Then [10] defines (S^(1), …, S^(k)) to be an emergent description for S if the decomposition S_t^(i) fulfils three properties for all i = 1…k:

1. the decomposition represents the system fully: I(S_t; S_t^(1), …, S_t^(k)) = H(S_t);
2. the individual substates are independent of each other: I(S_t^(i); S_t^(j)) = 0 for i ≠ j;
3. they are individually information-conserving through time: I(S_t^(i); S_{t+1}^(i)) = H(S_{t+1}^(i)).

Similarly to the predictive efficiency from §3.2, the emergent description formalism considers the predictivity of a time series, which is measured by mutual information. However, the emergent description model only deals with a system without a past, unlike the predictive efficiency model, which uses ε-machines and thus includes full causal histories. A much more important difference is that the emergent description model explicitly considers a decomposition of the total dynamical system into individual, independent informational components. Rather than considering the system as an unstructured "bulk", this view perceives it as having an inner informational dynamics and a natural informational substructure. Similarly to the emergence notion from §3.2, this substructure is not externally imposed, but is rather an intrinsic property of the system. It is, however, not necessarily unique. Fig. 1(a) shows schematically the decomposition into emergent descriptions.
Figure 1: (a) Schematic structure of the emergent description decomposition into independent modes. (b) Automaton found in the multiobjective GA search. The two groups of states belong to the two components of the emergent description, and the arrows indicate the stochastic transitions, where darker arrows indicate higher transition probabilities.

The emergent description model has the advantage that it can be explicitly constructed due to its quantitative characterisation (§4). This is a considerable advantage over more conceptual abstract models (such as, e.g., the category-theoretic approach mentioned in §2).
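For small systems the three criteria can be checked directly. Below is a minimal Python sketch (my own illustration, not from [10]) testing criteria 1 and 2 for a candidate decomposition of a uniformly distributed 16-valued state into two hand-picked 4-valued "digit" components:

```python
import math

def H(p):
    """Entropy (in bits) of a distribution {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def dist(outcomes):
    """Empirical (uniform-weight) distribution over a list of outcomes."""
    p, w = {}, 1.0 / len(outcomes)
    for o in outcomes:
        p[o] = p.get(o, 0.0) + w
    return p

# State s uniform on {0,...,15}; candidate decomposition into two
# 4-valued components via the (hypothetical) projections below.
states = list(range(16))
hi = lambda s: s // 4   # high "digit"
lo = lambda s: s % 4    # low "digit"

# Criterion 1: the pair (S1,S2) determines S fully; for a deterministic
# decomposition this reduces to H((S1,S2)) = H(S).
assert abs(H(dist([(hi(s), lo(s)) for s in states])) - H(dist(states))) < 1e-9

# Criterion 2: I(S1;S2) = 0, i.e. H(S1) + H(S2) = H((S1,S2)).
i12 = (H(dist([hi(s) for s in states])) + H(dist([lo(s) for s in states]))
       - H(dist([(hi(s), lo(s)) for s in states])))
assert abs(i12) < 1e-9
print("criteria 1 and 2 hold for the hi/lo decomposition")
```

Criterion 3, which involves the dynamics, is taken up in §4, where exactly this kind of decomposition is examined for the counter system.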
3.4 Other Related Approaches
Two further related approaches should be mentioned. The authors in [9] suggest emergence as higher-level prediction models for partial aspects of a system, based on entropy measures. This model can be viewed as a simplified version both of the predictive efficiency and of the emergent description model. Compared with Crutchfield/Shalizi's predictive efficiency, it does not consider causal states, and compared with our emergent description model, it does not consider a full decomposition into independent information modes. As opposed to that, [7] constructs a decomposition into dynamical hierarchies based on smooth dynamical systems. This model is close in philosophy to the emergent descriptions approach, except for the fact that it is not based on information theory.
4 Constructing Emergent Descriptions
The quantitative character of the emergent description model provides an approach to construct (at least in principle) an emergent description (or at least an approximation) for a given system. Consider a system with 16 states, starting in an equally distributed random state and with the deterministic evolution rule s_{t+1} := s_t + 1 mod 16, i.e. it acts as a counter modulo 16. We attempt to find an emergent description of the system into 2 subsystems of size 4 (these values have been chosen manually), applying a multiobjective Genetic Algorithm (GA) [2] to find projections that maximize system representation (criterion 1) and individual system prediction (criterion 3). With the given parameters, the optimization implicitly also optimizes criterion 2. The multiobjective optimization fully achieves criterion 1 and comes close to maximizing criterion 3². The search is far from fully optimal and can easily be improved upon. However, it provides a proof of principle and demonstrates several issues of relevance that are discussed below.

The dynamics of one of the emergent descriptions found is shown in Fig. 1(b). The left automaton, had the GA been fully successful, would have shown a perfect 4-cycle, i.e. a counter modulo 4, with only deterministic transitions; the failure of the GA to find this solution is due to the deceptiveness of the problem. However, the right counter, the lower-level counter, can never be fully deterministic according to the model from §3.3. As in a decadic counter, perfectly predicting the transition to the next state in the right counter would ideally depend on the carry from the left counter; but the independence criterion does not allow the right counter to "peek" into the left, thus always forcing a residual stochasticity. It turns out that this observation has a highly relevant relation to the algebraic Krohn-Rhodes semigroup decomposition [3].
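The residual stochasticity can be quantified directly for the counter. The following sketch is my own illustration, using hand-picked "digit" projections rather than the GA-found ones: the digit that evolves on its own is perfectly self-predicting, while the digit that would need the carry cannot satisfy criterion 3 exactly:

```python
import math
from collections import defaultdict

def H(p):
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def I(joint):
    """Mutual information of a joint distribution {(a, b): p}."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), q in joint.items():
        pa[a] += q
        pb[b] += q
    return H(pa) + H(pb) - H(joint)

# Counter modulo 16: s_{t+1} = s_t + 1 mod 16, with s_t uniform.
# Hand-picked (hypothetical) component projections:
lo = lambda s: s % 4    # low "digit": increments every step
hi = lambda s: s // 4   # high "digit": would need the carry from lo

def component_joint(f):
    """Joint distribution of (f(s_t), f(s_{t+1})) under uniform s_t."""
    joint = defaultdict(float)
    for s in range(16):
        joint[(f(s), f((s + 1) % 16))] += 1 / 16
    return joint

# lo is self-predicting: I = H = 2 bits, so criterion 3 holds exactly.
# hi stays put with prob 3/4 and increments with prob 1/4, hence
# I(hi_t; hi_{t+1}) = 2 - H(3/4, 1/4) ~ 1.19 bits < 2 bits.
print(I(component_joint(lo)))  # 2.0
print(I(component_joint(hi)))  # ~1.189
```

The shortfall of roughly 0.81 bits per step for the carry-dependent digit is exactly the residual stochasticity that the independence criterion forces.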
Here, it turns out that the most general decomposition of a semigroup has a particular hierarchical structure: it comprises a high-level substructure (a group or a flip-flop) which does not depend on anything else, and then a successive hierarchy of substructures, each of which may depend on all the structures above it; the simplest example is illustrated by a counter such as the above³. To incorporate this insight into the emergent description model, one could modify the conditions from §3.3 to respect the possible Krohn-Rhodes structure of the system. Schematically, this would correspond to a decomposition of the kind shown in Fig. 2(a)⁴.

² In fact, the GA fails to find the best solution since the problem is GA-deceptive.
Figure 2: (a) Emergent description with hierarchical dependence of states, similar to the Krohn-Rhodes decomposition. (b) Emergent description with state histories.

In the present model system there is, however, a way to recover the independence of modes and maintain an optimal predictiveness. Note that in the emergent description model we completely banished state history. If we readopt it, in a spirit similar to component-wise ε-machines, then the components can count individually whether a carry is required or not. The idea is schematically represented in Fig. 2(b).
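A small calculation (my own sketch, in the spirit of Fig. 2(b), again using the hand-picked digit projections) shows the effect of re-admitting per-component history: once the carry-dependent digit is augmented by the "age" since its last change — information recoverable from that component's own past — its next value becomes deterministic again:

```python
import math
from collections import defaultdict

def H(p):
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def I(joint):
    """Mutual information of a joint distribution {(a, b): p}."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), q in joint.items():
        pa[a] += q
        pb[b] += q
    return H(pa) + H(pb) - H(joint)

hi = lambda s: s // 4  # the digit that needs the carry

# Memoryless component: joint of (hi_t, hi_{t+1}) under uniform s_t.
memoryless = defaultdict(float)
# With history: augment hi_t by the age since its last change, i.e. the
# phase s_t % 4, which the component can reconstruct from its own past.
with_history = defaultdict(float)
for s in range(16):
    s_next = (s + 1) % 16
    memoryless[(hi(s), hi(s_next))] += 1 / 16
    with_history[((hi(s), s % 4), hi(s_next))] += 1 / 16

print(I(memoryless))    # ~1.189 bits: residual stochasticity
print(I(with_history))  # 2.0 bits = H(hi_{t+1}): fully predictive again
```

With its own history available, the component needs no peek into its sibling: the prediction gap closes while the single-time-slice independence is untouched.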
4.1 Discussion
We have contrasted the ε-machine approach to characterizing emergence with that of the emergent descriptions. The approaches are in many respects orthogonal: the ε-machine creates a relation between the two full half-axes of the temporal coordinate without any decomposition of the states themselves, while the emergent description approach limits itself to a single time slice, but suitably decomposes the state into independent modes. This approach has, however, been shown to lose some predictivity even in the very simple counter scenario. As a remedy one can introduce either a hierarchical form of emergent descriptions, inspired by the Krohn-Rhodes decomposition, or else aim for an ε-machine-like history memory for the individual modes, which is a kind of marriage of the emergent description and ε-machine models.

In particular, this observation suggests the hypothesis that it might be possible to formulate a trade-off: on the one hand, the memory requirements that a "serial" computation model such as the ε-machine needs to compute the future from the past; on the other hand, the information processing resources required by a "parallel" computation model such as the hierarchical emergent descriptions, which involves combining information from different components of the decomposition to compute a component's future. It is quite possible that universal trade-offs may exist here, offering the possibility for resource optimization and also for studying generalized forms of ε-machines where computational resources can be shifted more-or-less freely between temporal and compositional degrees of freedom.

³ It also bears some algebraic relation to the Jacobian decomposition discussed in [7].
⁴ It is evident how to formalize this diagram in the spirit of §3.3.
5 Agenthood
As a final comment, it should be mentioned that in [8] it has been shown that the perception-action loop of an agent acting in an environment can be modeled in the language of information. This is particularly interesting for the above considerations, as the agent/environment system is a generalization of a time series (a time series can be considered an agent without the ability to select an action, i.e. without the capacity for "free will"). Using infomax principles, the above agent/environment system can be shown to structure the information flows into partly decomposable information flows, a process that can be interpreted as a form of concept formation. This gives a new interpretation of the importance of emergence as the archetypical mechanism that allows the formation of concepts in intelligent agents, and may thus provide a key driver of the creation of complexity in living systems.
Bibliography

[1] J. P. Crutchfield. The calculi of emergence: Computation, dynamics, and induction. Physica D, 75:11-54, 1994.
[2] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182-197, 2002.
[3] A. Egri-Nagy and C. L. Nehaniv. Making sense of the sensory data: coordinate systems by hierarchical decomposition. In Proc. KES 2006, 2006.
[4] H. Haken. Advanced Synergetics. Springer-Verlag, Berlin, 1983.
[5] H. Haken. Information and Self-Organization. Springer Series in Synergetics. Springer, 2000.
[6] I. Harvey. The 3 Es of artificial life: Emergence, embodiment and evolution. Invited talk at Artificial Life VII, 1-6 August 2000, Portland.
[7] M. N. Jacobi. Hierarchical organization in smooth dynamical systems. Artificial Life, 11(4):493-512, 2005.
[8] A. S. Klyubin, D. Polani, and C. L. Nehaniv. Organization of the information flow in the perception-action loop of evolved agents. In Proceedings of 2004 NASA/DoD Conference on Evolvable Hardware, pages 177-180. IEEE Computer Society, 2004.
[9] S. McGregor and C. Fernando. Levels of description: A novel approach to dynamical hierarchies. Artificial Life, 11(4):459-472, 2005.
[10] D. Polani. Defining emergent descriptions by information preservation. In Proc. of the International Conference on Complex Systems. NECSI, 2004. Long abstract; full paper under review in InterJournal.
[11] S. Rasmussen, N. Baas, B. Mayer, M. Nilsson, and M. W. Olesen. Ansatz for dynamical hierarchies. Artificial Life, 7:329-353, 2001.
[12] C. R. Shalizi. Causal Architecture, Complexity and Self-Organization in Time Series and Cellular Automata. PhD thesis, University of Wisconsin-Madison, 2001.
[13] S. Winter. Zerlegung von gekoppelten Dynamischen Systemen (Decomposition of Coupled Dynamical Systems). Diploma thesis, Johannes Gutenberg-Universität Mainz, 1996. (In German).
Chapter 2
How Deep and Broad are the Laws of Emergence? Susan Sgorbati and Bruce Weber, Bennington College
Abstract Bruce Weber, evolutionary biologist, and Susan Sgorbati, choreographer, have been in a dialogue for the last several years asking the question of whether there are deep structuring principles that cross disciplines. While both professors at Bennington College, they developed a series of courses that explored these structuring principles in complex systems. Ideas such as self-organization, emergence, improvisation, and complexity were investigated through the lens of different disciplines and modes of perception. The inquiry was both intellectually and experientially driven. Students were asked to research and write papers, as well as move in the dance studio. Experiments in the studio led Susan Sgorbati to develop research that subsequently resulted in a national tour with professional dancers and musicians who are participating in a performance as part of this conference.
In this paper we will define concepts we have been using in our work and teaching, focusing on resonances between the different modalities. How to discern when organizing principles have relationships in common, and when they are specific to their systems, seems an important distinction and line of inquiry that could have important implications for analyzing complex systems in a wide range of different environments, from science to art to public policy.
Introduction: Providing a Historical Context

Starting in 1999 Bruce Weber, a biochemist interested in how emergent and self-organizing phenomena in complex chemical and biological systems affect our ideas of the origin and evolution of life, entered into a collaboration with Susan Sgorbati, a dancer interested in emergent improvisation, a form she developed as a set of structuring principles for dance and music. They were able to collaborate in both teaching and research/creative work over a period of years at Bennington College, in an environment that fostered such interaction.

We began with teaching a course in the emergence of embodied mind that was based upon reading the scientific writings on consciousness and embodiment of Gerald Edelman, Nobel Laureate and Director of The Neurosciences Institute in La Jolla, who has visited the Bennington campus. Exploring the biological basis of consciousness brought us not only to utilize the conceptual resources of complex systems dynamics (theories of self-organization, emergence and the application of various computational models) but also to devise experiential work for the students involving perception, movement, improvisation, and the contrast of objective and subjective awareness. In addition to continuing this class over several years, we also taught classes in more general aspects of emergent complexity, where we drew heavily on the work of Stuart Kauffman of the Santa Fe Institute and the University of Calgary, who also spent time at the campus. We looked for similar patterns that arose in different types of systems across a wide range of phenomena: physical, chemical, biological, cultural, and aesthetic. We ranged widely over such different subjects in order to ascertain if there might be a more general paradigm of emergent complexity beginning to affect our culture, as suggested by Mark Taylor in his recent The Moment of Complexity (Taylor 2001). In the experientials that Susan developed for students in our classes, we studied the role of limited, close-range interactions vs. longer-range, global interactions, and also the correlation of constraints and selective factors with the likelihood of observable, global, aesthetic structures emerging. It was interesting to have the students report their subjective experiences during the process of emergence, something about which molecules are mute.
For Susan, the language and concepts of complex systems dynamics in general, and the specific ideas of Edelman and Kauffman in particular, provided a context for discussing emergent improvisational movement forms. Her creative exploration of this science/dance interface has intrigued colleagues at The Neurosciences Institute, where she has been in residence for several weeks in the last four years. The Jerome Robbins Foundation, The Bumper Foundation, The Flynn Center for the Performing Arts and The National Performance Network Creation Fund (The Creation Fund is sponsored by the Doris Duke Charitable Foundation, Ford Foundation, Altria, and the National Endowment for the Arts, a federal agency) have all supported her research.
Defining Key Concepts

We are interested in higher-order structures in complex systems that reveal themselves in scientific and aesthetic observations. The scheme that we explored was based upon the following type of pattern unfolding over time:

individuals → self-organization → ensemble → emergence → complex system

We explored such a sequence in particular systems, such as the BZ reaction, Bénard cells, self-organization in slime molds, and Kauffman's various NK models, where N represents the number of constituents in a system and K the number of ways such constituents are related to each other. In the physical, chemical, and biological systems studied we saw that self-organization (SO, or perhaps more perspicuously system-organization) and self-structuring can occur spontaneously when a system is held far from equilibrium by flows of matter/energy gradients and the system has mechanisms for tapping such gradients (Peacocke 1983; Wicken 1987; Casti 1994; Schneider and Sagan 2005). The structures resulting from such SO processes involve an interplay of selective and self-organizing principles from which higher-order structures can emerge that can constrain activities at the lower levels and allow the completion of at least one thermodynamic work cycle (Kauffman 1993, 1995, 2000; Depew and Weber 1995; Weber and Depew 1996; Weber and Deacon 2000; Deacon 2003; Clayton 2004). Such emergent systems can, under special circumstances, display agency in that they select activities and/or behaviors that allow them to find gradients and extract work from them (Kauffman 2000). Sufficiently complex chemical systems with agency, boundaries and some form of "molecular memory" show many of the traits of living systems and give clues to the possible emergence of life (Kauffman 2000; Weber 1998, 2000, 2007). Further, Edelman's theory of neuronal group selection similarly invokes an interplay of selective and self-organizational principles giving rise to the emergence of consciousness (Edelman 1987; Edelman and Tononi 2000; Weber 2003). In Edelman's model of how consciousness emerges there is a central role both for complexity and for a process of reentry that can give rise to coherent neuronal activity in a "dynamic core" (Tononi and Edelman 1998). While exploring these concepts Susan developed experientials to help students understand the issues through an alternative modality to the experimental and mathematical. This alternative modality is based on the aesthetic idea that important concepts such as agency, movement, embeddedness, memory, topology, and complexity arise in dancers and musicians in an improvisational system. Trying out a series of experiments with students and then with professional dancers and musicians based on simple rules and constraints, certain key concepts were formulated as a result of observations.
They are:

1) agency: Individual dancers and musicians exhibit agency, or in this context the choice to move or to create sound. An essential aspect of this agency is the sensation of being "embodied".

2) movement: In this context, movement is the energy force driving the self-organizing system, creating the individual actions, the local interactions, and the global ensemble patterns. Movement is key, as the system would be static without it. Movement is an essential component in any kind of structuring process.

3) embeddedness: The elements of this particular system contain constraints and boundaries in a particular environment. The global behavior is integral to the environment and will alter with any changes in the constraints. Time and space are essential components and will dictate the nature of structuring.

4) memory: Structuring is an act of learning by the elements that are building the shape and patterns. Learning involves memory, reconstructing past experience into present thinking and action. This learning is essentially selectional, choosing certain patterns over others. Edelman speaks of "degeneracy", or many different ways, not necessarily structurally identical, by which a particular output occurs (Edelman and Tononi 2000, 86). The ability to recreate patterns to refine structuring processes increasingly depends on degenerate pathways to find more adaptable solutions to build onto forms.

5) topology: In this way of structuring, a 'metatopology' occurs where the system has the ability to operate on all levels at once (Sgorbati 2006, 209). Scale and amplification are important. According to Terence Deacon, a topology is "a constitutive fact about the spatial-temporal relationships among component elements and interactions with intrinsic causal consequences" (Deacon 2003, 282). Three levels of interaction exist at once: the local neighbor interaction, the small group ensemble locally, and the global collective behavior.

6) complexity: Dynamic compositional structures among dancers and musicians arise when simple rules are followed through improvisation based on certain constraints in the environment. This leads us to speculate that there are three interactive levels of analysis to these complex structures: the systems approach (evolutionary biology), the developmental approach (morphology), and the psychological approach (meaning), as a way of observing structuring principles (Susan Borden, personal communication with Sgorbati, 2006).

Complex systems dynamics gives us a language with which to consider and discuss our experiences and the emergence of new aesthetic forms.
Research in Emergent Improvisation: An Aesthetic Idea

The Emergent Improvisation Project is a research project into the nature of improvisation in dance and music. In this context improvisation is understood to mean the spontaneous creation of integrated sound and movement by performers who are adapting to internal and external stimuli, impulses and interactions. Ordinarily, we think of order and form as externally imposed, composed or directed. In this case, however, new kinds of order emerge, not because they are preconceived or designed, but because they are the products of dynamic, self-organizing systems operating in open-ended environments. This phenomenon, the creation of order from a rich array of self-organizing interactions, is found not only in dance and music, but also, as it turns out, in a wide variety of natural settings when a range of initial conditions gives rise to collective behavior that is both different from and more than the sum of its parts. Like certain art forms, evolution, for example, is decidedly improvisational and emergent, as is the brain function that lies at the heart of what it is to be human. Emergent forms appear in complex, interconnected systems, where there is enough order and interaction to create recognizable pattern but where the form is open-ended enough to continuously bring in new differentiations and integrations that influence and modify the form. It is by way of these interactions that particular pathways for the development of new material are selected.
In linking the creative work of art-making to the emergent processes evident in nature, there is a basis for a rich and textured inquiry into how systems come together, transform and reassemble to create powerful instruments of communication, meaning and exchange. This project explores the ways in which natural processes underlie artistic expression, along with the possibility that art can help illuminate natural processes. Conversations with scientists, particularly Bruce Weber at Bennington College, Gerald Edelman, Anil Seth, and John Iverson of The Neurosciences Institute, and Stuart Kauffman of The University of Calgary, have introduced Susan to the idea that, in living systems, self-organization produces complex structures that emerge dynamically. This idea resonated with her own work in improvisation and led us to speculate that there are deep, structuring principles that underlie a vast range of phenomena, producing similar evolving patterns in different environments: dancers collecting, birds flocking, visual representations of neuronal networks.
New Forms in Emergent Improvisation

Movement appears to be a fundamental component of all living processes, and we, as dancers, are moving and experiencing our own emergent sense of organization in this process (Sheets-Johnstone 1999). Working in this way with our students led Susan to observe and develop structuring principles for two emergent forms: complex unison and memory.

The Complex Unison Form is based on the observation of natural systems, which exhibit self-organizing structuring principles. In this form, open-ended processes are constantly adapting to new information, integrating new structures that emerge and dissolve over time. Complex Unison reveals the progression from closely following groups of individuals in space, to the unified sharing of similar material, and finally to the interplay of that material, which has both a degree of integration and variation, often displaying endlessly adaptive and complex behavior.

In the Memory Form, the dancers and musicians create an event that is remembered by the ensemble, and then reconstructed over time, revealing memory as a complex structuring process. This process by the dancers and musicians investigates multiple interpretations that draw on signals that organize and carry meaning. In this way, memory of the initial event is a fluid, open-ended process in which the performers are continuously relating past information to present thinking and action. This reintegration of past into present draws on repetition, nonlinear sequencing, and emergence to construct new adaptations. The Memory Form was inspired by Gerald Edelman's concept of "the remembered present".
Notes Toward the Definition of a General Theory of Emergence
Entering into this discussion of a general theory of emergence feels like walking through a minefield. The dangers of generalities, of vague assumptions, of philosophizing about abstractions are everywhere. Artists and scientists have their own languages that describe the concept of emergence. Do the movement patterns of flocks of birds, schools of fish, neuronal networks, and ensembles of dancers and musicians have anything in common? Does our dialogue have something to contribute to our own communities as well as the culture at large? Yaneer Bar-Yam, in his book Dynamics of Complex Systems, states, "Thus, all scientific endeavor is based, to a greater or lesser degree, on the existence of universality, which manifests itself in diverse ways" (Bar-Yam 1997, 1). This suggests that there might be universal principles contained in the concept of emergence that could shed light on structuring principles for many disciplines. Let us make perfectly clear that we are not interested in comparing apples to oranges. Dancers are not molecules. However, unlike molecules, dancers and musicians can relate their subjective experience during the process of emergent complexity. They are aware of what signals are effective in self-organizing structuring processes, and can reflect on the multi-level attention spans that participate in these topological structuring processes. From our dialogues in the last several years as well as our
work with students, we believe conversations between artists and scientists about emergence are important, and that a general theory may be possible. It is not simple to define emergence from a scientific or an aesthetic point of view, and clearly harder to encompass both perspectives. One definition is from Terrence Deacon, who in his essay "The Hierarchic Logic of Emergence" states that "Complex dynamical ensembles can spontaneously assume ordered patterns of behavior that are not prefigured in the properties of their component elements or in their interaction patterns" (Deacon 2003, 274). Artists experience their own sense of emergence. Gerald Edelman describes some of the basis for this experience. In his essay "The Wordless Metaphor: Visual Art and the Brain" he states, based upon current theoretical models and experiments, "Because it has no instructional program, but works by selection upon variation, the brain of a conscious animal must relate perception to feeling and value, whether inherited or acquired. These are the constraints -feeling and value- that give direction to selection within the body and brain" (Edelman 1995, 40). Edelman then describes how this complex process of continual recategorization of experience and movement of the body has links to the motor features of artistic expression, which we relate to as 'memory': "The notion of bodily-based metaphor as a source of symbolic expression fits selectionist notions of brain function to a T. As Gombrich has put it, the artist must make in order to match" (Edelman 1995, 41). He concludes the essay by stating, "I hope that artists will be pleased to hear that the process of selection from vast and diverse neural repertoires, giving each of their brains a unique shape, may be a key to what they have already discovered and expressed in their creative work.
The promise of this idea is its ability to account for the individuality of our responses, for the coexistence of logic and ambiguity as expressed in metaphor, and for the actual origins of the silent bodily-based metaphors that underlie artistic expression. When scientific verifications and extensions of these notions occur, we will have a deeper understanding of how artistic expression, in an enduring silence of wordless metaphors, often historically precedes explicit linguistically expressed ideas and propositions. Art will then have a sounder and more expansive link to scientific ideas of our place in nature" (Edelman 1995, 43-47). Whether one is looking at flocks of birds, ensembles of dancers or neuronal networks, certain questions, appropriately framed for the particular instance, seem pertinent. Questions of structure are of extreme importance across disciplines. While humans will always interact from a psychological framework unlike other living systems, all systems appear to need structuring in order to survive. Complex structuring is particularly challenging because of new ways of looking at nonlinear sequencing, communication across distances with spatio-temporal and kinesthetic signaling, analysis of particular constraints within a context, and new investigations into morphological concepts. In this general theory of emergence, movement and structuring principles are key elements. Robert Laughlin (1998 Nobel Prize in Physics) has written, "Nature is regulated not only by a microscopic rule base but by powerful and general principles of organization. Some of these principles are known, but the vast majority are not" (Laughlin 2005, xiv). If the vast majority of the principles of organization are not known, it is possible that they remain to be discovered on all levels, scientific as well as artistic.
These structuring principles might be organized in levels of interactive analysis, such that, as in complex systems, we need to see the whole picture at once as well as the individual levels. These levels include, first, the systems approach, where much research is occurring. Second is the developmental or morphological approach, where much research has occurred in relation to the development of organisms, but not much related to structuring principles. Third is the psychological approach, where the structuring of meaning and metaphor is integral to emergence and complexity, and can be directly related to social systems and artistic expression (Borden, personal communication with Sgorbati 2006). Thus, in conclusion, we observe some common themes across scientific and artistic disciplines on emergence: It is a property that arises out of self-organizing ensembles. Movement is an essential component of the self-organization. Constraints are necessary, as are boundaries of time and space. Structuring principles dictate the type and nature of the emergence. They are found in a unique ordering that is a relationship between integration and differentiation. In our case, scientists and artists have begun a real conversation about a particular resonance to emergent structures across these disciplines. This theory suggests that living complex dynamical systems may share some unified experiences, while making rigorous distinctions remains critical. (For example, molecular interactions are not sentient the way interactions among dancers are.) As Edelman suggests in connecting pattern recognition, selection and creativity, it may be that all living systems move toward creative ways to structure themselves in their environment based on a higher degree of adaptability. What may seem destructive to one group may seem perfectly ordered and coherent to another. For the sake of this discussion, rather than put a judgment on order or disorder, it might behoove us to observe and describe the structuring principles we see around us in order to best understand them, to recognize them, and then to determine their efficacy or destructive power.
We might then be able to determine which structures work best within certain constraints, the length of their life spans, and how much learned information is necessary for agents to participate in building them, and gain a deeper appreciation for the beauty in the patterns around us. We conclude with a quote from Stuart Kauffman's At Home in the Universe: "The emerging sciences of complexity begin to suggest that the order is not all accidental, that vast veins of spontaneous order lie at hand. Laws of complexity spontaneously generate much of the order of the natural world. It is only then that selection comes into play, further molding and refining. ... How does selection work on systems that already generate spontaneous order? ... Life and its evolution have always depended on the mutual embrace of spontaneous order and selection's crafting of that order. We need to paint a new picture" (Kauffman 1995, 8-9). We look forward to continuing our exploration into these matters and to encouraging artists and scientists to engage in this fruitful dialogue.
Bibliography
Bar-Yam, Y. (1997), Dynamics of Complex Systems, Reading, MA: Addison-Wesley.
Casti, J.L. (1994), Complexification: Explaining a Paradoxical World Through the Science of Surprise, New York: HarperCollins.
Clayton, P. (2004), Mind & Emergence: From Quantum to Consciousness, Oxford: Oxford University Press.
Deacon, T.W. (2003), The hierarchic logic of emergence: Untangling the interdependence of evolution and self-organization, in Evolution and Learning: The Baldwin Effect Reconsidered, Cambridge, MA: MIT Press, pp. 273-308.
Depew, D.J. and B.H. Weber (1995), Darwinism Evolving: Systems Dynamics and the Genealogy of Natural Selection, Cambridge, MA: MIT Press.
Edelman, G.M. (1987), Neural Darwinism: The Theory of Neuronal Group Selection, New York: Basic Books.
Edelman, G.M. (1995), The wordless metaphor: Visual art and the brain, in 1995 Biennial Exhibition Catalogue of the Whitney Museum of American Art, New York: Abrams.
Edelman, G.M. and G. Tononi (2000), A Universe of Consciousness: How Matter Becomes Imagination, New York: Basic Books.
Kauffman, S.A. (1993), The Origins of Order: Self-Organization and Selection in Evolution, New York: Oxford University Press.
Kauffman, S.A. (1995), At Home in the Universe: The Search for the Laws of Self-Organization and Complexity, New York: Oxford University Press.
Kauffman, S.A. (2000), Investigations, New York: Oxford University Press.
Laughlin, R.B. (2005), A Different Universe: Reinventing Physics from the Bottom Down, New York: Basic Books.
Peacocke, A.R. (1983), An Introduction to the Physical Chemistry of Biological Organization, Oxford: Oxford University Press.
Schneider, E.D. and D. Sagan (2005), Into the Cool: Energy Flow, Thermodynamics and Life, Chicago: University of Chicago Press.
Sgorbati, S. (2006), Scientifiquement Danse: Quand la Danse Puise aux Sciences et Réciproquement, Bruxelles: Contredanse.
Sheets-Johnstone, M. (1999), The Primacy of Movement, Amsterdam: Benjamins.
Taylor, M.C. (2001), The Moment of Complexity: Emerging Network Culture, Chicago: University of Chicago Press.
Tononi, G. and G.M. Edelman (1998), Consciousness and complexity, Science 282: 1846-1851.
Weber, B.H. (1998), Emergence of life and biological selection from the perspective of complex systems dynamics, in Evolutionary Systems, G. van de Vijver, S. Salthe, and M. Delpos (eds), Dordrecht: Kluwer.
Weber, B.H. (2000), Closure in the emergence of life, in Closure: Emergent Organizations and Their Dynamics, J.L.R. Chandler and G. van de Vijver (eds), Annals of the New York Academy of Sciences, 901: 132-138.
Weber, B.H. (2003), Emergence of mind and the Baldwin effect, in Evolution and Learning: The Baldwin Effect Reconsidered, Cambridge, MA: MIT Press, pp. 309-326.
Weber, B.H. (2007), Emergence of life, Zygon 42: 837-856.
Weber, B.H. and T.W. Deacon (2000), Thermodynamic cycles, developmental systems, and emergence, Cybernetics and Human Knowing 7: 21-43.
Weber, B.H. and D.J. Depew (1996), Natural selection and self-organization: Dynamical models as clues to a new evolutionary synthesis, Biology and Philosophy 11: 33-65.
Wicken, J.S. (1987), Evolution, Information and Thermodynamics: Extending the Darwinian Program, New York: Oxford University Press.
Chapter 3
On an irreducible theory of complex systems
Victor Korotkikh and Galina Korotkikh
Faculty of Business and Informatics, Central Queensland University, Mackay, Queensland 4740, Australia
[email protected], [email protected]
1 Introduction
Complex systems profoundly change human activities of the day and may be of strategic interest. As a result, it becomes increasingly important to have confidence in the theory of complex systems. Ultimately, this calls for clear explanations of why the foundations of the theory are valid in the first place. The ideal situation would be to have an irreducible theory of complex systems, not requiring a deeper explanatory base in principle. But the question arises: where could such a theory come from, when even the concept of spacetime is questioned as a fundamental entity? As a possible answer, it is suggested that the concept of integers may take responsibility in the search for an irreducible theory of complex systems [1]. It is shown that self-organization processes of prime integer relations can describe complex systems through the unity of two equivalent forms, i.e., arithmetical and geometrical [1], [2]. Significantly, based on the integers and controlled by arithmetic only, such processes can describe complex systems by information not requiring further simplification. This raises the possibility to develop an irreducible theory of complex systems. In this paper we present results to progress in this direction.
2 Nonlocal Correlations and Statistical Information about Parts of a Complex System
As we consider the correlations between the parts preserving certain quantities of the system, self-organization processes of prime integer relations can be revealed [1], [2].
Let I be an integer alphabet and I_N = {x = x_1...x_N, x_i ∈ I, i = 1,...,N} be the set of sequences of length N ≥ 2. We consider N elementary parts {P_i, i = 1,...,N}, with the state of P_i in its local reference frame given by a space coordinate x_i ∈ I, i = 1,...,N, and the state of the elementary parts by a sequence x = x_1...x_N ∈ I_N. A local reference frame of an elementary part P_i is specified by two parameters ε_i > 0 and δ_i > 0, i = 1,...,N. The parameters are required to be the same, ε = ε_i, δ = δ_i, i = 1,...,N, i.e., constants, and cannot be changed except simultaneously. It is proved [1] that C(x,x') ≥ 1 of the quantities of a complex system remain invariant if and only if the correlations between the parts can be defined by a system of C(x,x') Diophantine equations

(m+N)^(C(x,x')-1) Δx_1 + (m+N-1)^(C(x,x')-1) Δx_2 + ... + (m+1)^(C(x,x')-1) Δx_N = 0
...
(m+N)^1 Δx_1 + (m+N-1)^1 Δx_2 + ... + (m+1)^1 Δx_N = 0
(m+N)^0 Δx_1 + (m+N-1)^0 Δx_2 + ... + (m+1)^0 Δx_N = 0    (1)

and an inequality

(m+N)^C(x,x') Δx_1 + (m+N-1)^C(x,x') Δx_2 + ... + (m+1)^C(x,x') Δx_N ≠ 0,

where {Δx_i = x'_i − x_i, x'_i, x_i ∈ I, i = 1,...,N} are the changes of the elementary parts {P_i, i = 1,...,N} between the states x' = x'_1...x'_N and x = x_1...x_N, and m is an integer. The coefficients of the system become the entries of a Vandermonde matrix when the number of equations is N. This fact is important in order to prove that C(x,x') < N [1]. The equations (1) present a special type of correlations that have no reference to the distances between the parts, local times and physical signals. Thus, according to the description, parts of a complex system may be far apart in space and time and yet remain interconnected, with instantaneous effect on each other but no signaling. The space and non-signaling aspects of the correlations are familiar properties of quantum correlations [3]. The time aspect of the nonlocal correlations suggests an interesting perspective. For the observable Δx_i of an elementary part P_i, i = 1,...,N, the solutions to the equations (1) may define a set of different possible values. Since there is no mechanism specifying which of them is going to take place, an intrinsic uncertainty about the elementary part exists. At the same time, the solutions can be used to evaluate the probability of the observable Δx_i, i = 1,...,N, taking each of the measurement outcomes. Thus, the description provides statistical information about a complex system.
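The power-sum structure of the system (1) is easy to check numerically. The following sketch is our own illustration, not code from [1]; the function name and setup are ours. It computes the largest C such that the first C power-sum equations hold for a given change Δx:

```python
# Illustrative sketch (ours, not from [1]): count the invariant quantities
# C(x, x') for a change dx = x' - x of N elementary parts, following Eq. (1).
def correlation_order(dx, m=0):
    """Largest C with sum_i (m + N - i)^k * dx[i] == 0 for all k < C.

    dx[0] carries the coefficient (m + N) and dx[N-1] carries (m + 1),
    matching the ordering of the Diophantine system (1)."""
    N = len(dx)
    C = 0
    while C < N:
        if sum((m + N - i) ** C * dx[i] for i in range(N)) != 0:
            break
        C += 1
    return C

# A change given by the first 8 terms of the Prouhet-Thue-Morse sequence
# satisfies the equations for k = 0, 1, 2 but violates the cube-sum
# inequality, so C(x, x') = 3, consistent with the bound C(x, x') < N.
print(correlation_order([+1, -1, -1, +1, -1, +1, +1, -1]))  # 3
```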
3 Self-Organization Processes of Prime Integer Relations and their Geometrization
Through the Diophantine equations (1), integer relations can be found. Their analysis reveals hierarchical structures, which can be interpreted as a result of self-organization processes of prime integer relations [1], [2].
Figure 1: The left side shows one of the hierarchical structures of prime integer relations, when a complex system has N = 8 elementary parts {P_i, i = 1,...,8}, x = 00000000, x' = +1−1−1+1−1+1+1−1, m = 0 and C(x,x') = 3. The hierarchical structure is built by a self-organization process and determines a correlation structure of the complex system. The process is fully controlled by arithmetic. It cannot progress to level 4, because arithmetic determines that +8^3 − 7^3 − 6^3 + 5^3 − 4^3 + 3^3 + 2^3 − 1^3 ≠ 0. The right side presents an isomorphic hierarchical structure of geometrical patterns determining the dynamics of the system. On scale level 0 eight rectangles specify the dynamics of the elementary parts {P_i, i = 1,...,8}. Under the integration of the function f^[k] the geometrical patterns of the parts at level k form the geometrical patterns of the parts at the higher level k+1, where f^[0] = f and k = 0, 1, 2. Through the integrations arithmetic defines how the geometrical patterns must be curved to determine the spacetime dynamics of the parts. All geometrical patterns are symmetrical and their symmetries are all interconnected. The symmetry of a geometrical pattern is global and belongs to a corresponding part as a whole.
Starting with integers as the elementary building blocks and following a single principle, such a self-organization process makes up from the prime integer relations of a level of a hierarchical structure the prime integer relations of the higher level (Figure 1). Notably, a prime integer relation is made as an inseparable object: it ceases to exist upon removal of any of its formation components. By using the integer code series [4] the prime integer relations can be equivalently geometrized as two-dimensional patterns, and the self-organization processes can be isomorphically expressed by certain transformations of the patterns [1], [2]. As it becomes possible to measure a prime integer relation by a corresponding geometrical pattern, quantities of a system the prime integer relation describes can be defined by quantities of the geometrical pattern, such as the area and the length of its boundary curve (Figure 1). In general, the quantitative description of a complex system can be reduced to equations characterizing properties and quantities of corresponding two-dimensional patterns. Due to the isomorphism, in our description the structure and the dynamics of a complex system are united [1], [2]. The dynamics of the parts are determined to produce precisely the geometrical patterns of the system, so that the corresponding prime integer relations can be in place to provide the correlation structure of the whole system. If the dynamics of the parts are not fine-tuned, then some of the relationships are not in place and the system falls apart.
4 Optimality Condition of Complex Systems and Optimal Quantum Algorithms
Despite their different origins, complex systems have much in common and are investigated as satisfying universal laws. Our description points out that the universal laws may originate not from forces in spacetime, but through arithmetic. There are many notions of complexity introduced in the search to communicate the universal laws into theory and practice. The concept of structural complexity is defined to measure the complexity of a system in terms of self-organization processes of prime integer relations [1]. In particular, as self-organization processes of prime integer relations progress from a level to the higher level, the system becomes more complex, because its parts at the level are combined to make up more complex parts at the higher level. Therefore, the higher the level the self-organization processes progress to, the greater is the structural complexity of a corresponding complex system. Existing concepts of complexity do not in general explain how the performance of a complex system may depend on its complexity. To address the situation we conducted computational experiments to investigate whether the concept of structural complexity could make a difference [5]. A special optimization algorithm, as a complex system, was developed to minimize the average distance in the travelling salesman problem. Remarkably, for each problem the performance of the algorithm was concave. As a result, the algorithm and a problem were characterized by a single performance optimum. The analysis of the performance optima for all problems tested revealed a relationship between the structural complexity of the algorithm and the structural complexity of the problem, approximated well enough by a linear function [5]. The results of the computational experiments raise the possibility of an optimality condition of complex systems: A complex system demonstrates the optimal performance for a problem when the structural complexity of the system is in a certain relationship with the structural complexity of the problem.
Remarkably, the optimality condition presents the structural complexity of a system as a key to its optimization. According to the optimality condition, the optimal result can be obtained as long as the structural complexity of the system is properly related to the structural complexity of the problem. From this perspective, the optimization of a system should be primarily concerned with the control of the structural complexity of the system to match the structural complexity of the problem. The computational results also indicate that the performance of a complex system may behave as a concave function of the structural complexity. Once the structural complexity could be controlled as a single entity, the optimization of a complex system would potentially be reduced to a one-dimensional concave optimization, irrespective of the number of variables involved in its description. In the search to identify a mathematical structure underlying optimal quantum algorithms, the majorization principle emerges as a necessary condition for efficiency in quantum computational processes [6]. We find a connection between the optimality condition and the majorization principle in quantum algorithms. The majorization principle provides a local direction for an optimal quantum algorithm: the probability distribution associated with the quantum state has to be step-by-step majorized until it is maximally ordered. This means that an optimal quantum algorithm has to work in such a way that the probability distribution p_{k+1} at step k+1 majorizes (p_k ≺ p_{k+1}) the probability distribution p_k at step k [6]. Our algorithm also has a direction. It is given by the order of the self-organization processes of prime integer relations and expressed through the structural complexity.
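The majorization relation p_k ≺ p_{k+1} can be stated operationally: after sorting both distributions in decreasing order, every partial sum of the majorizing distribution must dominate the corresponding partial sum of the other. A small sketch of such a check (our own illustration; the helper name is ours, not from [6]):

```python
def majorizes(q, p, tol=1e-12):
    """Return True if distribution q majorizes p (written p -< q):
    the partial sums of q, sorted in decreasing order, dominate the
    corresponding partial sums of p."""
    sp, sq = 0.0, 0.0
    for a, b in zip(sorted(p, reverse=True), sorted(q, reverse=True)):
        sp += a
        sq += b
        if sq < sp - tol:
            return False
    return True

# A maximally ordered distribution majorizes every other distribution
# on the same outcomes, e.g. the uniform one:
print(majorizes([1.0, 0.0, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]))  # True
```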
Importantly, the performance of the algorithm becomes concave, as it tries to work in such a way that the structural complexity C_{k+1} of the algorithm at step k+1 majorizes (C_k ≺ C_{k+1}) its structural complexity C_k at step k. The concavity of the algorithm's performance suggests efficient means to find optimal solutions [5].
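Concavity of performance in a single scalar would indeed make the search cheap: a one-dimensional concave function can be maximized by ternary search, evaluating the function only a logarithmic number of times. A generic sketch of this idea (our own illustration, not the algorithm of [5]; the example curve is invented):

```python
def argmax_concave(f, lo, hi, tol=1e-6):
    """Ternary search for the maximizer of a concave function f on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1          # the maximum lies in [m1, hi]
        else:
            hi = m2          # the maximum lies in [lo, m2]
    return 0.5 * (lo + hi)

# A made-up concave "performance vs. structural complexity" curve
# peaking at complexity 2.0:
best = argmax_concave(lambda c: 5.0 - (c - 2.0) ** 2, 0.0, 10.0)
print(round(best, 4))  # 2.0
```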
5 Global Symmetry of Complex Systems and Gauge Forces
Our description reveals a global symmetry of complex systems, as the geometrical patterns of the prime integer relations appear symmetrical and the symmetries are all interconnected through their transformations. The global symmetry belongs to the complex system as a whole, but does not necessarily apply to its parts. Usually, when a global symmetry is transformed into a local one, a gauge force is required to be added. Because in the description arithmetic fully determines the breaking of the global symmetry, it is clear why the resulting gauge forces exist the way they do and not even slightly differently. Let us illustrate the results by a special self-organization process of prime
integer relations [1], [2]. The left side of Figure 1 shows a hierarchical structure of prime integer relations built by the process. It determines a correlation structure of a complex system with the states of N = 8 elementary parts {P_i, i = 1,...,8} given by the sequences x = 00000000, x' = +1−1−1+1−1+1+1−1, and m = 0. The sequence x' is the initial segment of length 8 of the Prouhet-Thue-Morse (PTM) sequence starting with +1. The self-organization process we consider is only one from an ensemble of self-organization processes forming the correlation structure of the whole system. The right side of Figure 1 presents an isomorphic hierarchical structure of geometrical patterns. The geometrical pattern of a prime integer relation determines the dynamics of a corresponding part of the complex system. Quantities of a geometrical pattern, such as the area and the length of the boundary curve, define quantities of a corresponding part. Quantities of the parts are interconnected through the transformations of the geometrical patterns.
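The PTM sequence itself is easy to generate: its i-th term is +1 exactly when the binary expansion of i contains an even number of ones. A minimal sketch (our own illustration):

```python
def ptm(n):
    """First n terms of the Prouhet-Thue-Morse sequence over {+1, -1},
    starting with +1: the i-th term is +1 iff i has an even number of
    ones in binary."""
    return [1 if bin(i).count("1") % 2 == 0 else -1 for i in range(n)]

print(ptm(8))  # [1, -1, -1, 1, -1, 1, 1, -1], i.e. the sequence x' above
```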
Figure 2: The geometrical pattern of the part (P1 ↔ P2) ↔ (P3 ↔ P4). From above the geometrical pattern is limited by the boundary curve, i.e., the graph of the second integral f^[2](t), t_0 ≤ t ≤ t_4, of the function f defined on scale level 0 (Figure 1), where t_i = iε, i = 1,...,4, ε = 1, and it is restricted by the t axis from below. The geometrical pattern is equivalent to the prime integer relation +8^1 − 7^1 − 6^1 + 5^1 = 0 and determines the dynamics. If the part deviates from this dynamics even slightly, then some of the correlation links provided by the prime integer relation disappear and the part decays. The boundary curve has a special property ensuring that the area of the geometrical pattern is given as the area of a triangle: S = HD/2, where H and D are the height and the width of the geometrical pattern. In the figure H = 1 and D = 4, thus S = 2. The property is illustrated in yin-yang motifs.
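The triangle-area property claimed in the caption can be checked numerically. The sketch below is our own construction (not from [1], [2]): with ε = 1 and the level-0 signs +1, −1, −1, +1 of this part, it integrates f twice by a simple forward rule and accumulates the area under the boundary curve:

```python
# Numerical check (ours) of S = H * D / 2 for the pattern in Figure 2.
n = 400_000                  # grid cells over [0, 4]; divisible by 4
dt = 4.0 / n
signs = [+1, -1, -1, +1]     # the step function f on the four unit intervals
f1 = f2 = area = height = 0.0
for j in range(n):
    t = (j + 0.5) * dt       # midpoint of the j-th cell
    f1 += signs[int(t)] * dt # f^[1], the first integral of f
    f2 += f1 * dt            # f^[2], the boundary curve
    area += f2 * dt          # S, area under the boundary curve
    height = max(height, f2) # H, height of the pattern
print(round(height, 3), round(area, 3))  # 1.0 2.0, i.e. S = H * D / 2 with D = 4
```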
Starting with the elementary parts at scale level 0, the parts of the correlation structure are built level by level, and thus a part of the complex system becomes a complex system itself (Figure 1). All geometrical patterns characterizing the parts are symmetrical, and their symmetries are interconnected through the integrations of the function f. Specifically, we consider whether the description of the elementary parts of a scale level is invariant. At scale level 2 the second integral f^[2](t), t_0 ≤ t ≤ t_4, t_i = iε, i = 1,...,4, ε = 1, characterizes the dynamics of the part (P1 ↔ P2) ↔ (P3 ↔ P4). This composite part is made up of the elementary parts
P1, P2, P3, P4 and the parts P1 ↔ P2, P3 ↔ P4, changed under the transformations to be at scale level 2 (Figures 1 and 2). The description of the dynamics of the elementary parts P1, P2, P3, P4 and the parts P1 ↔ P2, P3 ↔ P4 within the part (P1 ↔ P2) ↔ (P3 ↔ P4) is invariant relative to their reference frames. In particular, the dynamics of the elementary parts P1 and P2 in a reference frame of the elementary part P1 is specified by

f^[2]_P1(t_P1) = t_P1^2 / 2.    (2)

The transition t_P2 = −t_P1 − 2, f^[2]_P2 = −f^[2]_P1 − 1 from the coordinate system of the elementary part P1 to a coordinate system of the elementary part P2 shows that the characterization

f^[2]_P2(t_P2) = t_P2^2 / 2    (3)
of the dynamics of the elementary part P2 is the same, if we compare (2) and (3). Similarly, the description is invariant when we consider the dynamics of the elementary parts P3 and P4. Furthermore, it can be shown that the description of the dynamics of the parts P1 ↔ P2 and P3 ↔ P4 relative to their coordinate systems is invariant. However, at scale level 3 the description of the dynamics is not invariant. In particular, we consider the dynamics of the elementary parts P1 and P2 changed under the transformations to be at scale level 3 within the part ((P1 ↔ P2) ↔ (P3 ↔ P4)) ↔ ((P5 ↔ P6) ↔ (P7 ↔ P8)). Relative to a coordinate system of the elementary part P1 the dynamics can be specified by (Figure 1)

f^[3]_P1(t_P1) = t_P1^3 / 3!,  t_{0,P1} ≤ t_P1 ≤ t_{1,P1},

f^[3]_P1(t_P1) = −t_P1^3 / 3! + t_P1^2 − t_P1 + 1/3,  t_{1,P1} ≤ t_P1 ≤ t_{2,P1}.    (4)
The transitions from the coordinate systems of the elementary part P1 to the coordinate systems of the elementary part P2 do not preserve the form (4). For example, if by t_P2 = t_P1 + 2, f^[3]_P2 = −f^[3]_P1 + 1 the perspective is changed from the coordinate system of the elementary part P1 to a coordinate system of the elementary part P2, then it turns out that the description of the dynamics (4) is not invariant,

f^[3]_P2(t_P2) = t_P2^3 / 3! − t_P2,

due to the additional term −t_P2.
Therefore, at scale level 3 arithmetic determines different dynamics of the elementary parts P1 and P2. Information about the difference could be obtained from observers positioned at the coordinate system of the elementary part P1 and the coordinate system of the elementary part P2, respectively. As one observer would report about the dynamics of P1 and the other about the dynamics of P2, the difference could be interpreted through the existence of a gauge force F acting on the elementary part P2 in its coordinate system, to the effect of the term χ(F) = −t_P2.
In summary, the results can be schematically expressed as follows:

Arithmetic → Prime integer relations in control of correlation structures of complex systems
↔ Global symmetry: geometrical patterns in control of dynamics of complex systems
→ Not invariant descriptions of parts of complex systems
↔ Gauge forces to restore local symmetries
Bibliography
[1] Victor KOROTKIKH, A Mathematical Structure for Emergent Computation, Kluwer Academic Publishers (1999).
[2] Victor KOROTKIKH and Galina KOROTKIKH, "Description of Complex Systems in terms of Self-Organization Processes of Prime Integer Relations", Complexus Mundi: Emergent Patterns in Nature (Miroslav NOVAK ed.), World Scientific (2006), 63-72, arXiv:nlin.AO/0509008.
[3] Nicolas GISIN, Can Relativity be Considered Complete? From Newtonian Nonlocality to Quantum Nonlocality and Beyond, arXiv:quant-ph/0512168.
[4] Victor KOROTKIKH, Integer Code Series with Some Applications in Dynamical Systems and Complexity, Computing Centre of the Russian Academy of Sciences, Moscow (1993).
[5] Victor KOROTKIKH, Galina KOROTKIKH and Darryl BOND, On Optimality Condition of Complex Systems: Computational Evidence, arXiv:cs.CC/0504092.
[6] Roman ORUS, Jose LATORRE and Miguel MARTIN-DELGADO, Systematic Analysis of Majorization in Quantum Algorithms, arXiv:quant-ph/0212094.
Chapter 4
Measuring and Tracking Complexity in Science
Jacek Marczyk Ph.D., Balachandra Deshpande Ph.D.
Ontonix Srl, Ontonix LLC
[email protected]
1. Introduction
Recent years have seen the development of a new approach to the study of diverse problems in natural, social and technological fields: the science of complexity [Gell-Mann 1994]. The objective of complex systems science is to comprehend how groups of agents, e.g. people, cells, animals, organizations, the economy, function collectively. The underlying concept of complexity science is that any system is an ensemble of interacting agents. As a result, the system exhibits characteristics different from those of each agent, leading to collective behavior [Gell-Mann 1994]. This property is known as emergence [Morowitz 2002]. Moreover, complex systems can adapt to changing environments and are able to spontaneously self-organize [Sornette 2000]. The dynamics of complex systems tends to converge to time patterns known as attractors [Sornette 2000] and is strongly influenced by the agent inter-relationships, which can be represented as networks [Barabasi 2002]. The topological properties of such networks are crucial for determining the collective behavior of the systems, with particular reference to their robustness to external perturbations or to agent failure [Barabasi, Albert 2000], [Dorogovtsev 2003]. Although the theoretical exploration of highly complex systems is usually very difficult, the creation of plausible computer models has become possible in the past 10-15 years. These models yield new insights into how these systems function. Traditionally, such models were studied within the areas of cellular automata [Chopard 1998], neural networks [Haykin 1999], chaos theory [Sornette 2000], control theory [Aguirre 2000], non-linear dynamics [Sornette 2000] and evolutionary programming [Zhou 2003]. The practical applications of these studies cover a wide spectrum, ranging from studies of DNA and proteins [Jeong 2001] to computational biology [Dezso 2002], from economics and finance [Mantegna
2000] to ecology [Lynam 1999] and many others. When addressing complexity and complex systems, many researchers illustrate the ways in which complexity manifests itself and suggest mathematical methods for the classification of complex behavior. Subjects such as cellular automata, stochastic processes, statistical mechanics and thermodynamics, dynamical systems, ergodic and probability theory, chaos, fractals, information theory and algorithmic complexity, and theoretical biology are consistently covered, but with very few concrete attempts to practically quantify complexity and to track its evolution over time. However, even though complexity is becoming an increasingly important issue in modern science and technology, there are no established and practical means of measuring it. Clearly, measurement constitutes the basis of any rigorous scientific activity. The ability to quantify something is a sine-qua-non condition towards being able to manage it. There also does not exist a widely accepted definition of complexity. Many of the popular definitions refer to complexity as a "twilight zone between chaos and order". It is often maintained that in this twilight zone Nature is most prolific and that only this region can produce and sustain life. Clearly, for a definition to lend itself to practical use, it needs to provide a means for measurement. In order to increase our understanding of complexity and of the behaviour of complex systems, it is paramount to establish rigorous definitions and metrics of complexity. Complexity is frequently confused with emergence. Emergence of new structures and forms is the result of re-combination and spontaneous self-organization of simpler systems to form higher-order hierarchies, i.e. a result of complexity. Amino acids combine to form proteins, companies join to develop markets, people form societies, etc. One can therefore define complexity as amount of functionality, capacity, potential or fitness.
The evolution of living organisms, societies or economies constantly tends to states of higher complexity precisely because an increase in functionality (fitness) allows these systems to "achieve more", to better face the uncertainties of their respective environments, to be more robust and fit; in other words, to survive better. To track or measure complexity, it is necessary to view it not as a phenomenon (such as emergence), but as a physical quantity such as mass, energy or frequency. There do exist numerous complexity measures, such as the (deterministic) Kolmogorov-Chaitin complexity, which is the smallest length in bits of a computer program that runs on a Universal Turing Machine and produces a certain object x. There are also other measures such as Computational Complexity, Stochastic Complexity, Entropy Rate, Mutual Information, Cyclomatic Complexity, Logical Depth, Thermodynamic Depth, etc. Some of the above definitions are not easily computable. Some are specific to either computer programs, strings of bits, or mechanical or thermodynamic systems. In general, the above definitions cannot be used to treat generic multi-dimensional systems from the standpoint of structure, entropy and coarse-graining. We propose a comprehensive complexity metric and establish a conceptual platform for practical and effective complexity management. The metrics established take into account all the ingredients necessary for a sound and comprehensive complexity measure, namely structure, entropy and data granularity, or coarse-graining. The metric
allows one to relate complexity to fragility and to show how critical threshold complexity levels may be established for a given system. The methodology is incorporated into Ontospace™, a first-of-its-kind complexity management software developed by Ontonix.
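Of the ingredients named above, the entropy and coarse-graining components are the easiest to illustrate. The sketch below is an illustration only, not the Ontonix metric (whose exact form is not given in the text): it estimates the Shannon entropy of a data set after coarse-graining it into bins, where the bin count plays the role of the data granularity.

```python
import numpy as np

def coarse_grained_entropy(samples, bins=10):
    """Shannon entropy (in bits) of a 1-D data set after coarse-graining
    it into a fixed number of equal-width bins (the 'granularity')."""
    counts, _ = np.histogram(samples, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
# A uniform sample spreads over all bins: entropy near log2(8) = 3 bits.
print(coarse_grained_entropy(rng.uniform(size=10_000), bins=8))
# A constant sample occupies a single bin: zero entropy.
print(coarse_grained_entropy(np.ones(100), bins=8))
```

Coarser graining (fewer bins) lowers the attainable entropy, which is why the granularity choice must enter any entropy-based complexity measure.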
2. Fitness Landscapes and Attractors
The concept of the Fitness Landscape is central to the determination of the complexity of a given system. We define a fitness landscape as a multi-dimensional data set, in which the number of dimensions is determined by the number of system variables or agents (these may be divided into inputs and outputs, but this is not necessary).
Figure 1. Example of a Fitness Landscape. The four views refer to the same data set and represent different combinations of axes. As one moves within the landscape, different local properties, such as density for example, will be observed in the vicinity of each point. The fitness at every point of the landscape is equated to complexity.
The number of data points in the landscape is equal to the number of measurements or samples that make up the data set. Once the fitness landscape is established (either via measurement or a Monte Carlo simulation, for example), we proceed to identify regions in which the system locally possesses certain properties which may be represented via maps or graphs, which we call modes. It is not uncommon to find tens or even hundreds of modes in landscapes of low dimension (say a few tens). Once all the modes have been obtained we proceed to compute the complexity of each mode as a function of the topology of the corresponding graph, the entropy of each link in the graph and the data granularity. We define data granularity in fuzzy terms and, evidently, this impacts the computation of entropy for each mode. We define fitness at a given point of the landscape to be equal to the complexity of the mode at that point. Since the same modal topology may be found at many points of the landscape, there clearly can exist regions
of equal fitness. We may also define the total fitness landscape complexity as the sum of all the modal complexities.
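These definitions can be made concrete with a minimal illustration. The actual functional form of the modal complexity is not disclosed in the text, so a simple entropy-weighted sum over graph links stands in for it here:

```python
# Each mode is a graph; here an edge is (node_i, node_j, link_entropy).
# Illustrative stand-in for the modal complexity: the total entropy
# carried by the links of the mode. The landscape complexity is then
# the sum of all modal complexities, as defined in the text.

def mode_complexity(edges):
    return sum(h for _, _, h in edges)

def landscape_complexity(modes):
    return sum(mode_complexity(m) for m in modes)

mode_a = [(0, 1, 0.7), (1, 2, 0.4), (0, 2, 0.9)]  # a denser mode
mode_b = [(0, 1, 0.2)]                            # a sparse mode
print(round(mode_complexity(mode_a), 6))                 # 2.0
print(round(landscape_complexity([mode_a, mode_b]), 6))  # 2.2
```

Any real metric would also weight the graph topology (hubs, density) and the data granularity, which this toy score ignores.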
Figure 2. Examples of modes. The variables (agents) are arranged along the diagonal and significant relationships are determined based on exchanged information and entropy. Red nodes represent hubs. The mode on the left has a complexity of 91.4, while the one on the right has a value of 32.1. Both modes originate from the same landscape.
Examples of modes are shown in Figure 2, where one may also identify hubs, indicated in a darker shade of red, the number of which may be related to numerous properties such as robustness, fragility, redundancy, etc. As one moves across the landscape, the modal topology will change, and so will the hubs.
Figure 3. Example of modal complexity spectrum.
The complexities of all the extracted modes in a given landscape may be plotted in ascending order, forming a complexity spectrum. Flat spectra point to homogeneous landscapes, while in the opposite case they clearly point to cluster-dominated situations. There is a sufficient body of knowledge to sustain the belief that whenever dynamical systems, such as those listed above, undergo a crisis or a spasm, the event is accompanied by a sudden jump in complexity. This is also intuitive. A spasm or collapse implies loss of functionality, or organization. The big question then is: to what maximum levels of complexity can the above systems evolve in a sustainable fashion? In order to answer this question, it is necessary to observe the evolution of complexity in the vicinity of points of crisis or collapse. We have studied empirically the evolution of complexity of numerous systems and have observed that:
- High-dimension systems can reach higher levels of complexity (fitness).
- The higher the graph density, the higher the complexity that can be reached.
- The higher the graph density, the less traumatic the breakdown of structure.
- For dense systems, the life-death curve looks like y(t) = A*t*exp(-k*t^4).
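The reported life-death curve can be checked numerically. In this short sketch, A and k are illustrative values, not constants fitted in the study; setting the derivative to zero shows the peak sits analytically at t = (1/(4k))^(1/4):

```python
import numpy as np

# The "life-death" curve reported for dense systems: y(t) = A*t*exp(-k*t^4).
# A and k are illustrative values, not fitted constants from the study.
def life_death(t, A=1.0, k=1.0):
    return A * t * np.exp(-k * t**4)

t = np.linspace(0.0, 2.0, 100_001)
t_peak = t[np.argmax(life_death(t))]

# dy/dt = A*exp(-k*t^4)*(1 - 4*k*t^4) = 0 gives t = (1/(4k))**0.25 ~ 0.707.
print(t_peak)
```

The shape matches the qualitative description: a rise (growth of complexity), a peak (the critical threshold), and a steep t^4-driven collapse.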
The plots in Figure 4 illustrate examples of closed systems (i.e. systems in which the Second Law of Thermodynamics holds) in which we measured how complexity changes versus time. We can initially observe how the increase of entropy actually increases complexity - entropy is not necessarily adverse, as it can help to increase fitness - but at a certain point, complexity reaches a peak beyond which even a small increase of entropy inexorably causes the breakdown of structure. The fact that initially entropy actually helps increase complexity (fitness) confirms that uncertainty is necessary to create novelty. Without uncertainty there is no evolution.
Figure 4. Examples of the evolution of complexity versus time for two closed systems. The plot on the left represents a 20-dimensional system, while the one on the right a 50-dimensional one. In the case of the smaller system, the maximum value of complexity that may be reached is approximately 3, while in the case of the larger system the threshold is approximately 12. The corresponding graphs have densities of 0.1 and 0.2, respectively.
In our metric, before the critical complexity threshold is reached, an increase in entropy does generally lead to an increase in complexity, although minor local fluctuations of complexity have been observed in numerous experiments. After structure breakdown commences, an increase in entropy nearly always leads to loss of complexity (fitness), but at times the system may recover structure locally. However, beyond the critical point, death is inevitable, regardless of the dimensionality or density of the system.
3. A Practical Application of Complexity Measurement: the James Webb Space Telescope
In the past decade numerous Monte Carlo-based software tools for performing stochastic simulation have emerged. These tools were conceived as uncertainty management tools and their goal was to evaluate, for example, the effects of tolerances on scatter and quality of performance, most likely behavior, dominant design variables, etc. An important focus of the users of such tools has been on robust design. However,
simply attempting to counter the effects of tolerances or environmental scatter is not the best way to achieve robust designs. A more efficient way to robustness is via managing the complexity of a given design rather than battling with the uncertainty of the environment in which it operates. After all, the environment (sea, atmosphere, earthquakes, etc.) is not controllable. At the same time, it is risky to try to sustain a very complex situation or scenario or design in an uncertain environment. Robustness requires a balance between the uncertainty of a given environment and the complexity of the actions we intend to take in that environment. Ontonix has collaborated with EADS CASA Espacio on the design and analysis of the James Webb Space Telescope adapter. The component in question is an adapter between a launcher and its payload (satellite) and the objective was to achieve a robust design using complexity principles. Given the criticality of the component and the restrictive and stringent requirements in terms
Figure 5. Two candidate designs are evaluated. The one with the lower complexity metric is chosen because lower complexity in an uncertain environment is more robust.
of mass, stiffness, interface fluxes and strength, a stochastic study was performed. Furthermore, the problem was rendered more complex by certain assembly-specific considerations. Given the unique nature of the component in question - no commercially available adapters could have been used - it was necessary to evaluate a broad spectrum of candidate design topologies. Two different design options with the corresponding maps which relate input (design) variables to outputs (performance) are shown in Figure 5. While both solutions offer the same performance, the design on the left has a complexity of 20.1, while the one on the right has 16.8. The design on the right will therefore be less fragile and less vulnerable to performance degradation in an uncertain environment.
4. Conclusions
We propose a comprehensive complexity metric which incorporates structure, entropy and data coarse-graining. Structure, represented by graphs, is determined locally in a given fitness landscape via a perturbation-based technique. The entropy of each mode (graph) is computed based on the data granularity. Finally, fitness at each point of the landscape is defined as complexity. The metric has been applied to a wide variety of problems, ranging from accident analysis of nuclear power plants to gene expression data, from financial problems to analysis of socio-economic systems. The metric shows how a closed system will reach a certain maximum complexity threshold, after which even a small increase in entropy will commence to destroy structure.
References
M. Gell-Mann. The quark and the jaguar: adventures in the simple and the complex. New York: W.H. Freeman and Co.; 1994.
H.J. Morowitz. The emergence of everything: how the world became complex. New York: Oxford University Press; 2002.
D. Sornette. Critical phenomena in natural sciences: chaos, fractals, self-organization, and disorder: concepts and tools. Berlin; New York: Springer; 2000.
A.-L. Barabasi. Linked: the new science of networks. Cambridge, Mass.: Perseus Pub.; 2002.
A.-L. Barabasi, R. Albert. Statistical mechanics of complex networks. Reviews of Modern Physics. 2002;74(1):47-97.
S.N. Dorogovtsev, J.F.F. Mendes. Evolution of networks: from biological nets to the Internet and WWW. Oxford; New York: Oxford University Press; 2003.
B. Chopard, M. Droz. Cellular automata modeling of physical systems. Cambridge, U.K.; New York: Cambridge University Press; 1998.
L.A. Aguirre, L.A.B. Torres. Control of nonlinear dynamics: where do the models fit in? International Journal of Bifurcation and Chaos. 2000;10(3):667-681.
C. Zhou, W. Xiao, T.M. Tirpak, P.C. Nelson. Evolving accurate and compact classification rules with gene expression programming. IEEE Transactions on Evolutionary Computation. 2003;7(6):519-531.
S.S. Haykin. Neural networks: a comprehensive foundation. 2nd ed. Upper Saddle River, N.J.: Prentice Hall; 1999.
H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai. Lethality and centrality in protein networks. Nature. 2001;411(6833):41-42.
Z. Dezso, A.-L. Barabasi. Halting viruses in scale-free networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2002;65(5 Pt 2):055103.
R.N. Mantegna, H.E. Stanley. An introduction to econophysics: correlations and complexity in finance. Cambridge, UK; New York: Cambridge University Press; 2000.
T. Lynam. Adaptive analysis of locally complex systems in a globally complex world. Conservation Ecology. 1999;3(2):13.
Chapter 5
Data-Driven Modeling of Complex Systems
Val K. Bykovsky
Utah State University
[email protected]
The observation of the dependencies between data and the conditions of observation always was, and still is, a primary source of knowledge about complex dynamics. We discuss direct program-driven analysis of these data dependencies with the goal of building a model directly in the computer and thus predicting the dynamics of the object based on measured data. The direct generalization of data dependencies is a critical step in building data-driven models.
"Theory, described in its most homely terms, is the cataloging of correlations .. ." Nick Metropolis [H83]
1 Introduction
There are two main sources of data dependencies: (1) direct physical experiments that immediately generate data dependencies of interest, and (2) indirect, computer or in-silico experiments, a way of using a computer as an experimental setup to get the "measurement data". Such a setup mimics real experiments when they are impossible or very costly. That is an alternative to using a computer as a number cruncher. "Good computing requires a firm foundation in the principles of natural behavior", wrote Metropolis [H83]. The experimentation approach was proposed by S. Ulam and N. Metropolis [M87, B00] at Los Alamos when designing weapon systems, where direct experiments were basically impossible.
With "experimental approach", computer experiments is an integral part of understanding complex phenomena. Same idea was a focus of proposed then Monte Carlo (MC) method [M87], a generation of random experimental configurations and their evolution in time. However, it was not just events generation and data collection. N. Metropolis wrote in his historical account of development of the MC method [M87]: "It is interesting to look back over two-score years and note the emergence, rather early on, of experimental mathematics, a natural consequence of the electronic computer. The role of the Monte Carlo method in reinforcing such mathematics seems self-evident." Stressing the hands-on aspect of experimental mathematics, he wrote then: "When shared-time operations became realistic, experimental mathematics came to age. At long last, mathematics achieved a certain parity - the twofold aspect of experiment and theory - that all other sciences enjoy." The idea of bridging a gap between a data source and the model based on the data was also introduced by Metropolis [A86, BOO], and he demonstrated it by two-way coupling a computer to the Navy cyclotron. The importance of cataloging correlations, or dependencies, as a basis for any theory was also stressed by Metropolis [H83]. The same idea of dynamic integration of data and modeling to steer the new measurements was recently (50 years after pioneering work by Metropolis) resurrected in the NSF-sponsored DDDAS program [NSF], "a paradigm whereby application (or simulations) and related measurements become an integrated feedback-driven control system". We are making a step in the same direction with the focus on dynamic data-driven model building, testing and update with an option of new data measured on-demand. The controlled search in the physical configuration space is another focal point. 
Yet another focal point is the persistent framework to be structured by data obtained in the experimentation process, an infrastructure that bridges the gap between the physical object to be explored and the observation system ("explorer") which handles experiments and generates data. The framework also generalizes data dependencies representing the properties of the physical object. Accordingly, it may have built-in logic to support in-situ data analysis. This way, data/properties turn out to be integrated with the processing logic. Recently, the same problem has been analyzed in the general context
of making databases more dynamic and integrated with the logic of programming languages [G04]. Conceptually, the proposed approach mimics the traditional "from-data-to-model" process that includes human-driven generalization of the dependencies. The proposed method handles data dependencies programmatically and builds an online model (mapping) by their programmatic generalization. Thus, the human generalization of dependencies gets replaced by high-performance programmatic analysis. The data source becomes an integral part of the model building process and is available for online model testing and update. When a traditional model is built, its connection with the data source gets lost, and the model (symbolic equation) lives its own life, occasionally interacting with the real world through parameters and the initial/boundary conditions. As the proposed approach relies on computer power and flexibility, it is complexity-neutral, a distinctive difference from the traditional approach, which is sensitive to complexity.
2 Measurements and Data Dependencies
Generally, data is a result of the measurement of a property of a physical object, and is best treated as a combination (a pair) of the data and its measurement context, which makes the data unique. With the context attached, a field position in a measurement record need not be its identification any more; data can be located, accessed and interpreted based on its physical attributes, rather than on its position in a record. In particular, the access can be associative, so that all the data with the same tag (property) can be easily accessed, collected, moved, processed, and placed in a database based on its properties. In its turn, processing the data may lead to updating its attributes.
Data Dependencies and State Vector Dynamics. A data record taken as a measurement of an object state at a specific time is actually a state vector, and its change in time describes the object dynamics, which is traditionally described by the equations of motion. So, analysis of record dynamics - revealing hidden patterns and regularities - can be likened to the analysis of the object dynamics by using the equations of motion.
The proposed programmatic approach enhances the concept of data, bridging the gap between an experiment and its data. A regular data record gets elevated to the level of a data object with built-in logic designed to validate and test the data, so that the data becomes physical
properties (vs. just numbers). That makes a record a live or active record, a new dimension in data-driven model-building that makes building, testing and updating a model easy. A simple example of linking two data sets into a data pair is given in Fig. 1. The context is a set t of N numbers in the range [0, pi]; the data is the function x = sin(t) in the same range. The linking is done using a simple "linker", the outer product out(t, x(t)) of the two vectors. The reconstruction of x is done using the dot function x = dot(t, out). A few data dependencies p = (t, x) can be used to build a mapping between the contexts and the data. This can be done by computing the outer products for each pair p and generalizing them by spatial superimposition. The 4 data sets (trajectories) case is shown in Fig. 2, where the bottom graph is the reconstructed trajectory x.
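The linking step can be reproduced in a few lines. Note that recovering x by a plain dot product requires normalizing by t.t, a detail the text leaves implicit, since t is not a unit vector:

```python
import numpy as np

# Context t: N samples in [0, pi]; data: x = sin(t). The "linker" is the
# outer product out(t, x); the dot product recovers x, provided we
# normalize by t.t (an assumption the text leaves implicit).
N = 200
t = np.linspace(0.0, np.pi, N)
x = np.sin(t)

out = np.outer(t, x)           # rank-1 2D pattern storing the pair (t, x)
x_rec = t @ out / (t @ t)      # reconstruction of x from the pattern

print(np.allclose(x, x_rec))   # True
```

A single outer product is a rank-1 "memory" holding one dependency; superimposing several such patterns, as in Fig. 2, is what turns individual pairs into a generalized mapping.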
Fig. 1. Linking and trajectory reconstruction model. (a) The dependence p = (t, x) between t and the function x(t) is represented as a 2D pattern (b), an outer product out(t, x(t)), a mapping x = M(t) which allows for reconstruction (c) of the trajectory x for a new slope t.
Fig. 2. Building a trajectory reconstruction model. (a) The dependence p = (t, x) has been represented as a 2D pattern, an outer product out(t, x(t)). (b) The dependencies p[i], i = 1, 2, 3, 4 for 4 trajectories were computed as 2D patterns and generalized by spatial overlapping; this created a mapping x = M(t). (c) Reconstruction of a new trajectory x for a new slope t.
A Dynamic Data Source: A Gas-of-Particles Method. Data can also be generated by a dynamic data source, for example, by a gas of atoms. To monitor data dependencies, the proper patterns need to be chosen, and then generalized to see if there are any common features.
Fig. 3. (a) Building a data-driven model for a dynamic data source, a gas of hard-ball atoms at a certain temperature. (b) An accumulative histogram of the velocity distribution; the bins show the number of atoms within a specified range of velocities at a time t. The continuous curve is the theoretical Maxwell distribution.
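A histogram of the kind shown in Fig. 3 can be emulated in a few lines. Arbitrary units with kT/m = 1 are an assumption of this sketch: draw thermal velocity components and bin the resulting speeds, whose distribution should peak near the Maxwell mode sqrt(2kT/m):

```python
import numpy as np

# Draw thermal (Gaussian) velocity components for a gas of atoms in
# arbitrary units with kT/m = 1 (an assumption of this sketch), and bin
# the speeds as in the accumulated histogram of Fig. 3.
rng = np.random.default_rng(42)
v = rng.normal(size=(100_000, 3))       # 3 velocity components per atom
speed = np.linalg.norm(v, axis=1)

counts, edges = np.histogram(speed, bins=30, range=(0.0, 5.0))
mode_bin = edges[np.argmax(counts)]
# The Maxwell speed distribution with kT/m = 1 peaks at sqrt(2) ~ 1.41,
# so the fullest bin should sit near that speed.
print(mode_bin)
```

Here the sampler stands in for the molecular-dynamics data source; in the gas-of-particles method the bin counts would instead be accumulated from the simulated atom trajectories.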
With the model built this way, the gas dynamics can be estimated for new experimental conditions - box size, temperature, inter-atom potential, etc. An interesting aspect of the data-driven analysis is a possible interpretation of a record (a measurement) as a particle which has properties, and of the inter-record dependencies as the interactions between the particles. The observation dataset then can be thought of as a system of interacting particles, and various characteristics such as energy, entropy, and one-, two- and many-particle correlation functions can be computed with the records-particles. The particle-based statistical paradigm can be useful in revealing programmatically the hidden patterns and regularities in massive datasets such as hyperspectral imagery and micro-array data. Indeed, an interaction between particles p1 and p2 can be approximated via the average <p1*p2>, where <...> is a proper average; if <p1*p2> = <p1>*<p2>, the particles p1 and p2 do not interact. Then, combinations of variables (fields) can be sought that minimize interaction measures such as d(<p1*p2>, <p1>*<p2>), derived as an independent-particle approximation to the quantum description [N27].
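The interaction criterion is easy to demonstrate numerically: for two record fields, compare <p1*p2> with <p1>*<p2> (their difference is the covariance), which vanishes for independent fields. The field names and the coupling below are assumptions of this illustration:

```python
import numpy as np

# Two "particle" fields: p2_dep is strongly coupled to p1, p2_indep is not.
rng = np.random.default_rng(1)
p1 = rng.normal(size=50_000)
p2_indep = rng.normal(size=50_000)
p2_dep = p1 + 0.1 * rng.normal(size=50_000)

def interaction(a, b):
    """Distance between <a*b> and <a>*<b>; zero means no interaction."""
    return abs(np.mean(a * b) - np.mean(a) * np.mean(b))

print(interaction(p1, p2_indep) < 0.05)  # True: effectively independent
print(interaction(p1, p2_dep) > 0.5)     # True: strongly interacting
```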
3 Programmatic Building of Data-Driven Models
Data Exploration Tools. Inter-field and inter-record correlations can be indicative when looking for long-term and multi-scale correlations. A key point is the difference metrics for various fields and records. With a data-driven approach, these can be flexible and configurable to match the application requirements. The programmatic approach introduces a dynamic vision of data. Data may "live" within a smart container, a wrapper designed to support properties of the physical object, including the logic to correct or signal discrepancies, thus performing monitoring and quality control functions. The logic may also check whether the fields and records satisfy certain "conservation", "continuity", or other rules. Say, the total energy should be constant in mechanical systems without friction, or the sum of spectral line intensities should be equal to the total radiance/energy at-a-sensor, or the (electronic) density should be equal to the total number of particles (electrons). In turn, programmatic tools such as object-relational mappers (ORM) allow for structured access to data vs. their standard handling as fields in a table. Examples of field, record and even segment-of-records containers with built-in logic are: date-time, random data, periodic data, Gaussian data, a data field with a built-in check for conservation F(f[1], ..., f[N]) = const, etc.
The Framework. The proposed framework is designed to support the incremental data-driven model building process. This is an infrastructure that makes model building a consistent, controlled and automated activity. The framework holds input data, data dependencies, partial models, prediction results, and other components related to the model-building activities. In the process of generalization, the data pairs structure the framework and incrementally refine the model.
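A container with the built-in conservation check F(f[1], ..., f[N]) = const mentioned above can be sketched as follows. The class name, field names and tolerance are assumptions of this illustration, not an API from the text:

```python
# Illustrative "live record": a data record carrying built-in logic that
# checks a conservation rule over its fields (here: the fields must sum
# to a conserved total, as for energy in a frictionless mechanical system).

class LiveRecord:
    def __init__(self, fields, conserved_total, tol=1e-9):
        self.fields = dict(fields)
        self.conserved_total = conserved_total
        self.tol = tol

    def is_valid(self):
        """Built-in check: F(f[1], ..., f[N]) = const within tolerance."""
        return abs(sum(self.fields.values()) - self.conserved_total) <= self.tol

good = LiveRecord({"kinetic": 3.0, "potential": 7.0}, conserved_total=10.0)
bad = LiveRecord({"kinetic": 3.0, "potential": 6.5}, conserved_total=10.0)
print(good.is_valid(), bad.is_valid())  # True False
```

Because the check lives with the data, a violation can be signalled in-situ, at the location of the data, rather than in a separate validation program.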
The framework can be thought of as a new kind of database - application-structured and designed to generalize the data pairs. Thus, it is able to estimate the output for a new input, a bridge between a classical database and a classical predictive model. For example, hyperspectral imagery data would structure
the framework so that all the spectrally similar pixels (road, grass, etc.) get into the proper nodes, each collecting pixels with the same spectral signatures. For a new pixel, the framework triggers a search to find the proper application context (say, a roof) for that pixel.
4 Examples of Data Containers
With live data - data integrated with the relevant logic - checking and monitoring data properties in-situ, at the location of the data, becomes possible, bridging the gap between data (usually in a database) and the logic to handle the data (usually in a program). A generic container can be built, and the necessary properties can be added at run time depending on current testing results, an option hardly possible for analytical models. A few examples of containers for live data follow.
A container for hyperspectral data. Hyperspectral imagery is a set of pixels or records rec = (t, xy, R(f)), where R(f) is a reflectance spectrum for the x,y-point at time t, and f is the frequency. The built-in logic may monitor a specific spectral line, or a set of spectral lines, a spectral signature. It may also test the presence of a specific line in a pixel and in its xy-neighborhood, identifying the pixel as a roof, road, etc. Such context-specific pixels can then be automatically placed into the proper node of the structured framework to mimic the spatial structure of the real object.
A container for tabular data. A monolithic table with an [id, row(fields)] format is often used for measurement data. The live container would allow for flexible access to any component of the data and for dynamic reorganization of the data based on any property or any mix of those; e.g., data structures can be introduced to describe the structure of the application. It also allows for automatic recalculation of the content triggered by a change.
A pipeline container is designed for explorative experiments. This is a pipeline with configurable stages such as validation, processing, visualization, etc. The stages can be added, removed, modified, and reassembled programmatically. The pipeline has a control unit which allows for turning various stages on and off conditionally.
The database may be integrated as a stage to keep the data and the built data-driven models for future use in other experiments. "Action" stages can be added to implement the whole sensors-to-actions (s2a) pipeline.
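A minimal sketch of such a pipeline container follows; the stage names and the control interface are illustrative assumptions, not an API from the text:

```python
# Pipeline container: stages are plain callables that can be added,
# removed or toggled at run time by a simple control unit.

class Pipeline:
    def __init__(self):
        self.stages = []          # list of [name, func, enabled]

    def add(self, name, func, enabled=True):
        self.stages.append([name, func, enabled])

    def toggle(self, name, enabled):
        """Control unit: turn a stage on or off conditionally."""
        for stage in self.stages:
            if stage[0] == name:
                stage[2] = enabled

    def run(self, data):
        for name, func, enabled in self.stages:
            if enabled:
                data = func(data)
        return data

p = Pipeline()
p.add("validate", lambda xs: [x for x in xs if x is not None])
p.add("process", lambda xs: [2 * x for x in xs])
p.add("visualize", lambda xs: xs, enabled=False)   # stage turned off
print(p.run([1, None, 3]))  # [2, 6]
```

A database or an "action" stage would be added the same way, making the whole sensors-to-actions chain a sequence of configurable callables.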
Conclusions
The proposed approach aims to build programmatically an online model of a phenomenon using dependencies in the observation data; an analytical model is not necessary. The two key stages are: (1) obtaining the data pairs, or dependencies, and (2) their generalization into an input-output mapping, a data-based model. A structured framework is proposed to build the model. The estimate for a new input is obtained by search (traversal) of the framework.
Acknowledgements. The author thanks Prof. Simon Berkovich of George Washington University, Dr. Aleks Jakulin of Columbia University, and Prof. Yaneer Bar-Yam of the New England Complex Systems Institute for inspiring discussions on data-driven model building and the role of data dependencies.
References
[A86] H. Anderson, "Metropolis, Monte Carlo and the MANIAC", Los Alamos Science, 1986.
[B00] N. L. Balazs, J. C. Browne, J. D. Louck, D. S. Strottman, "Nicholas Constantine Metropolis", Physics Today, v.53, n.10, p.100 (2000). http://www.aip.org/pt/vol-53/iss-10/p100.html
[G04] J. Gray, "The Revolution in Database Architecture", Tech. Rep., Microsoft Research, March 2004. http://research.microsoft.com/~Gray
[H83] F. Harlow, N. Metropolis, "Computing and Computers: Weapons Simulation Leads to the Computer Era", Los Alamos Science, n.7, p.132, 1983. http://lib-www.lanl.gov/la-pubs/00285876.pdf
[N27] J. von Neumann and E. Wigner, Phys. Z. 30, 467 (1927).
[M87] N. Metropolis, "The Beginning of the Monte Carlo Method", Los Alamos Science, Special Issue 1987, p.125-131. http://lib-www.lanl.gov/la-pubs/00326866.pdf
[NSF] www.nsf.gov/cise/cns/dddas and www.dddas.org
Chapter 6
Modelling Complex Systems by Integration of Agent-Based and Dynamical Systems Models
Tibor Bosse, Alexei Sharpanskykh, and Jan Treur
Vrije Universiteit Amsterdam, Department of Artificial Intelligence
{tbosse, sharp, treur}@cs.vu.nl
1 Introduction
Existing models for complex systems are often based on quantitative, numerical methods such as Dynamical Systems Theory (DST) [Port and Gelder 1995]. Such approaches often use numerical variables to describe global aspects and specify how they affect each other over time. An advantage of such approaches is that numerical approximation methods and software are available for simulation. Agent-based modelling approaches take into account the local perspective of a possibly large number of agents and their behaviours. They are usually based on qualitative, logical languages. An advantage of such approaches is that they allow logical analysis of relationships between different parts of a model, for example relationships between global and local properties of a (multi-agent) system. Moreover, declarative models can be specified using logic-based languages close to natural language. Such models can be analyzed at a high level of abstraction. Furthermore, automated support is available for the manipulation and design of models. Complex systems often involve both qualitative and quantitative aspects, which can be modelled by agent-based (logical) and DST-based approaches, respectively. It is not easy to integrate both types of approaches in one modelling method. On the one hand, it is difficult to incorporate logical aspects in differential equations. On the other hand, logical, agent-based modelling languages are often not able to handle real numbers and calculations. This paper shows an integrative approach to simulate and analyse complex systems, integrating quantitative, numerical and qualitative, logical aspects within one temporal specification language. In Section 2, this language (called LEADSTO) is described in detail and is illustrated for a system of differential equations (a Predator-Prey model), applying methods from numerical analysis. Section 3 shows how quantitative and qualitative aspects can be combined within the same model.
Section 4 demonstrates how relationships can be established between the dynamics of basic mechanisms (described in LEADSTO) and the global dynamics of a process (described in a super-language of LEADSTO). Finally, Section 5 is a discussion. An extended version of this paper with more details appeared as [Bosse, Sharpanskykh and Treur, 2008].
2 Modelling dynamics in LEADSTO

Dynamics can be modelled in different forms. Based on the area within mathematics called calculus, Dynamical Systems Theory [Port and Gelder 1995] advocates modelling dynamics by continuous state variables and changes of their values over time, where time is also assumed to be continuous. In particular, systems of differential or difference equations are used. However, not all applications allow dynamics to be modelled in the quantitative manner required for DST. Sometimes qualitative changes form an essential aspect of the dynamics of a process. For example, to model the dynamics of reasoning processes a quantitative approach will usually not work: in such processes states are characterised by qualitative state properties, and changes by transitions between such states. For such applications qualitative, discrete modelling approaches are often advocated, such as variants of modal temporal logic, e.g. [Meyer and Treur 2002]. With such methods, however, the more precise timing relations are lost. For the LEADSTO language described in this paper, the choice has been made to consider the timeline as continuous, described by real values, while for state properties both quantitative and qualitative variants can be used. The approach subsumes approaches based on simulation of differential or difference equations, as well as discrete qualitative modelling approaches. In addition, it makes it possible to combine both types of modelling within one model. Moreover, the relationships between states over time are described by logical or mathematical means, or a combination thereof. This is explained in more detail in Section 2.1. As an illustration, Section 2.2 shows how a system of ordinary differential equations representing the classical predator-prey model can be modelled and simulated in LEADSTO.
2.1 The LEADSTO language

Dynamics is considered as evolution of states over time. The notion of state as used here is characterised on the basis of an ontology defining a set of properties that do or do not hold at a certain point in time. For a given (order-sorted predicate logic) ontology Ont, the propositional language signature consisting of all state ground atoms (or atomic state properties) based on Ont is denoted by APROP(Ont). The state properties based on a certain ontology Ont are formalised by the propositions that can be made (using conjunction, negation, disjunction, implication) from the ground atoms. A state S is an indication of which atomic state properties are true and which are false, i.e., a mapping S: APROP(Ont) → {true, false}. To specify simulation models, a temporal language has been developed. This language (the LEADSTO language [Bosse et al. 2007]) enables modelling direct temporal dependencies between two state properties in successive states, also called dynamic properties. A specification of dynamic properties in LEADSTO format has the advantages that it is executable and that it can often easily be depicted graphically. The format is defined as follows. Let α and β be state properties of the form 'conjunction of atoms or negations of atoms', and e, f, g, h non-negative real numbers. In the LEADSTO language the notation α →→e, f, g, h β (see also Figure 1) means:
If state property α holds for a certain time interval with duration g, then after some delay (between e and f) state property β will hold for a certain time interval of length h.
An example dynamic property expresses the fact that, if agent A observes that food is present during 1 time unit, then after a delay between 2 and 3 time units, agent A will believe that food is present during 1.5 time units (Figure 1 depicts the timing relationships between e, f, g and h in LEADSTO). In addition, the LEADSTO language allows the use of sorts, sorted variables, real numbers, and mathematical operations, such as in "has_value(x, v) →→e, f, g, h has_value(x, v*0.25)". A trace or trajectory γ over a state ontology Ont is a time-indexed sequence of states over Ont (with the real numbers as time frame). To specify that a certain event (i.e., a state property) holds at every state within a certain time interval, the predicate holds_during_interval(event, t1, t2) is used. Here event is some state property, t1 is the start of the interval and t2 is the end of the interval.
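To make the semantics concrete, the rule format can be prototyped in a few lines. The following is a deliberately simplified sketch (all names are ours, not part of the actual LEADSTO environment): it discretizes the timeline and fixes the delay to e, i.e. it takes e = f, whereas real LEADSTO uses a continuous time frame and a delay chosen from the interval [e, f].

```python
# A simplified executable reading of "alpha ->(e, f, g, h) beta" on a
# discretized timeline, with the delay fixed to e (so e = f here).
# Real LEADSTO uses a continuous time frame and a delay interval [e, f].
def apply_rule(trace, alpha, beta, e, g, h, horizon):
    """If alpha held throughout an interval of length g ending at t,
    make beta hold throughout [t + e, t + e + h)."""
    for t in range(g, horizon):
        if all(alpha in trace.get(u, set()) for u in range(t - g, t)):
            for u in range(t + e, min(t + e + h, horizon)):
                trace.setdefault(u, set()).add(beta)
    return trace

# Agent A observes food during 1 time unit; after a delay of 2 units
# it believes that food is present during 2 units.
trace = {0: {"observes(A, food)"}}
trace = apply_rule(trace, "observes(A, food)", "belief(A, food)",
                   e=2, g=1, h=2, horizon=10)
print(sorted(t for t, s in trace.items() if "belief(A, food)" in s))  # [3, 4]
```

With integer durations this reproduces the intended reading: the observation at time 0 produces a belief that holds at times 3 and 4.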
2.2 Differential Equations in LEADSTO

Behavioural models in Dynamical Systems Theory are often specified by systems of differential equations with given initial conditions for continuous variables and functions. One approach to finding solutions of such a system with given initial values is based on discretization, i.e., replacing the continuous model by a discrete one whose solution is known to approximate that of the continuous one. For this, methods of numerical analysis are usually used [Pearson 1986]. The simplest approach to approximating solutions of ordinary differential equations is Euler's method. To solve a differential equation of the form dy/dt = f(y) with the initial condition y(t0) = y0, this method uses a difference equation derived from the Taylor series truncated after the first-order term: y_{i+1} = y_i + h·f(y_i), where i ≥ 0 is the step number and h > 0 is the integration step size. This equation can be modelled in the LEADSTO language in the following way:
• States specify the respective values of y at different time points.
• The difference equation is modelled by a transition rule from the current to the successive state.
• The duration of an interval between state changes is defined by the step size h.
For the considered general case, the LEADSTO model comprises the following rule:
  has_value(y, v1) →→0, 0, h, h has_value(y, v1 + h·f(v1))
The initial value of y is specified by the following LEADSTO rule:
  holds_during_interval(has_value(y, y0), 0, h)
By performing a simulation of the obtained model in the LEADSTO environment, an approximate solution to the differential equation is found. Although the first-order Euler method offers a stable solution, it is still rather rough and imprecise, since the accumulated error grows quickly with the integration step size; therefore small step sizes are
needed. To obtain more precise solutions for a given step size, higher-order numerical methods are used. To illustrate higher-order numerical approaches, the fourth-order Runge-Kutta method is considered. This method is derived from a Taylor series up to the fourth order. It is known to be very accurate (the accumulated error is O(h^4)) and stable for a wide range of problems. The Runge-Kutta method for solving a differential equation of the form dx/dt = f(t, x) is described by the following formulae:
  x_{i+1} = x_i + (h/6)·(k1 + 2·k2 + 2·k3 + k4),
where i ≥ 0 is the step number, h > 0 is the integration step size, and
  k1 = f(t_i, x_i)
  k2 = f(t_i + h/2, x_i + (h/2)·k1)
  k3 = f(t_i + h/2, x_i + (h/2)·k2)
  k4 = f(t_i + h, x_i + h·k3).
To illustrate the proposed approach for simulations based on numerical methods, the system of ordinary differential equations representing the classical Lotka-Volterra model (a predator-prey model) [Morin 1999] is used. This model describes interactions between two species in an ecosystem, a predator and a prey. If x(t) and y(t) represent the numbers of prey and predators, respectively, that are alive in the system at time t, then the Lotka-Volterra model is defined by:
  dx/dt = a·x − b·x·y
  dy/dt = c·b·x·y − e·y
where a is the per capita birth rate of the prey; b is the per capita attack rate; c is the conversion efficiency of consumed prey into new predators; and e is the rate at which predators die in the absence of prey. Now, using the Runge-Kutta method, the classical Lotka-Volterra model is described in LEADSTO format as follows:
  has_value(x, v1) ∧ has_value(y, v2) →→0, 0, h, h has_value(x, v1 + (h/6)·(k11 + 2·k12 + 2·k13 + k14))
  has_value(x, v1) ∧ has_value(y, v2) →→0, 0, h, h has_value(y, v2 + (h/6)·(k21 + 2·k22 + 2·k23 + k24)),
where
  k11 = a·v1 − b·v1·v2,
  k21 = c·b·v1·v2 − e·v2,
  k12 = a·(v1 + (h/2)·k11) − b·(v1 + (h/2)·k11)·(v2 + (h/2)·k21),
  k22 = c·b·(v1 + (h/2)·k11)·(v2 + (h/2)·k21) − e·(v2 + (h/2)·k21),
  k13 = a·(v1 + (h/2)·k12) − b·(v1 + (h/2)·k12)·(v2 + (h/2)·k22),
  k23 = c·b·(v1 + (h/2)·k12)·(v2 + (h/2)·k22) − e·(v2 + (h/2)·k22),
  k14 = a·(v1 + h·k13) − b·(v1 + h·k13)·(v2 + h·k23),
  k24 = c·b·(v1 + h·k13)·(v2 + h·k23) − e·(v2 + h·k23).
The result of simulating this model with the initial values x0 = 25 and y0 = 8 and the step size h = 0.1 is given in [Bosse, Sharpanskykh and Treur, 2008]. It is identical to the result produced by Euler's method with a much smaller step size (h = 0.01) for the same example. Although for most cases the Runge-Kutta method with a small step size provides accurate approximations, it can still be computationally expensive and, in some cases, inaccurate. To achieve higher accuracy with minimal computational effort, methods that allow dynamic (adaptive) regulation of the integration step size are used. Generally, these approaches rely on the algorithm signalling information about its own truncation error. The most commonly used technique for this is step doubling and step halving, see, e.g., [Gear 1971]. Since the LEADSTO format allows the modeller to include qualitative aspects, it is not difficult to incorporate step doubling and step halving into LEADSTO; see [Bosse, Sharpanskykh and Treur, 2008] for an illustration of how this can be done.
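The schemes above translate directly into ordinary code, which is a convenient way to check the k-formulas. Below is a minimal Python sketch (the function names are ours; the parameter values a, b, c, e are taken from the example in Section 3, since none are stated for this run):

```python
# Euler and fourth-order Runge-Kutta steps for the Lotka-Volterra model
#   dx/dt = a*x - b*x*y,  dy/dt = c*b*x*y - e*y
# The system is autonomous, so the t argument of f can be dropped.
a, b, c, e = 4.0, 0.2, 0.1, 8.0   # parameter values borrowed from Section 3

def f(x, y):                       # prey rate of change
    return a * x - b * x * y

def g(x, y):                       # predator rate of change
    return c * b * x * y - e * y

def euler_step(x, y, h):
    return x + h * f(x, y), y + h * g(x, y)

def rk4_step(x, y, h):
    # k11..k24 follow the formulas in the text
    k11, k21 = f(x, y), g(x, y)
    k12, k22 = f(x + h/2*k11, y + h/2*k21), g(x + h/2*k11, y + h/2*k21)
    k13, k23 = f(x + h/2*k12, y + h/2*k22), g(x + h/2*k12, y + h/2*k22)
    k14, k24 = f(x + h*k13, y + h*k23), g(x + h*k13, y + h*k23)
    return (x + h/6*(k11 + 2*k12 + 2*k13 + k14),
            y + h/6*(k21 + 2*k22 + 2*k23 + k24))

def simulate(step, x0, y0, h, t_end):
    x, y = x0, y0
    for _ in range(round(t_end / h)):
        x, y = step(x, y, h)
    return x, y

print(simulate(euler_step, 25.0, 8.0, 0.001, 1.0))
print(simulate(rk4_step, 25.0, 8.0, 0.1, 1.0))  # should roughly agree with Euler above
```

This mirrors the observation in the text: the fourth-order method at a coarse step is comparable to the first-order method at a much finer one.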
3 The Predator-Prey Model with Qualitative Aspects

In this section an extension of the standard predator-prey model with some qualitative aspects of behaviour is considered. Assume that the population sizes of both predators and prey within a certain ecosystem are externally monitored and controlled by humans, and that both species are also consumed by humans. A control policy comprises a number of intervention rules that ensure the viability of both species. Among such rules could be the following: in order to keep the prey species from extinction, the number of predators should be controlled to stay within a certain range (defined by pred_min and pred_max); if the number of prey falls below a fixed minimum (prey_min), the number of predators should also be enforced to the prescribed minimum (pred_min); if the size of the prey population is greater than a certain prescribed bound (prey_max), then the prey population can be reduced by a certain number prey_quota (cf. a quota for a fish catch). These qualitative rules can be encoded into the LEADSTO simulation model for the standard predator-prey case by adding new dynamic properties and changing the existing ones in the following way:
  has_value(x, v1) ∧ has_value(y, v2) ∧ v1 < prey_max →→0, 0, h, h has_value(x, v1 + h·(a·v1 − b·v1·v2))
  has_value(x, v1) ∧ has_value(y, v2) ∧ v1 ≥ prey_max →→0, 0, h, h has_value(x, v1 + h·(a·v1 − b·v1·v2) − prey_quota)
  has_value(x, v1) ∧ has_value(y, v2) ∧ v1 ≥ prey_min ∧ v2 < pred_max →→0, 0, h, h has_value(y, v2 + h·(c·b·v1·v2 − e·v2))
  has_value(x, v1) ∧ has_value(y, v2) ∧ v2 ≥ pred_max →→0, 0, h, h has_value(y, pred_min)
  has_value(x, v1) ∧ has_value(y, v2) ∧ v1 < prey_min →→0, 0, h, h has_value(y, pred_min)
The result of simulating this model using Euler's method with the parameter settings a = 4, b = 0.2, c = 0.1, e = 8, pred_min = 10, pred_max = 30, prey_min = 40, prey_max = 100, prey_quota = 20, x0 = 90, y0 = 10 is given in Figure 2.
Figure 2. Simulation results for the Lotka-Volterra model combined with qualitative aspects.
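The intervention rules can be mirrored by an ordinary Euler-style update loop. A minimal sketch (the variable names are ours; it reads the quota rule literally as firing once per step of length h, which is one possible interpretation of the LEADSTO rule above):

```python
# Euler-style simulation of the predator-prey model extended with the
# qualitative intervention rules of Section 3 (a sketch, not the LEADSTO tool).
a, b, c, e, h = 4.0, 0.2, 0.1, 8.0, 0.01
pred_min, pred_max = 10.0, 30.0
prey_min, prey_max = 40.0, 100.0
prey_quota = 20.0

def step(x, y):
    # prey: normal growth; the quota is subtracted while x >= prey_max
    x_new = x + h * (a * x - b * x * y)
    if x >= prey_max:
        x_new -= prey_quota
    # predators: reset to pred_min when out of bounds, otherwise normal growth
    if y >= pred_max or x < prey_min:
        y_new = pred_min
    else:
        y_new = y + h * (c * b * x * y - e * y)
    return x_new, y_new

x, y = 90.0, 10.0
xs, ys = [x], [y]
for _ in range(2000):
    x, y = step(x, y)
    xs.append(x)
    ys.append(y)

print(min(xs), max(xs))   # the quota keeps the prey size near prey_max
```

Even this crude sketch shows the qualitative rules at work: the prey population is clamped in the neighbourhood of prey_max by the quota interventions.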
4 Analysis in Terms of Local-Global Relations

Within the area of agent-based modelling, one of the means to address complexity is to model processes at different levels, from the global level of the process as a whole to the local level of basic elements and their mechanisms. At each of these
levels dynamic properties can be specified, and by interlevel relations they can be logically related to each other; e.g., [Sharpanskykh and Treur 2006]. These relationships can provide an explanation of properties of a process as a whole in terms of properties of its local elements and mechanisms. Such analyses can be done by hand or automatically. To specify dynamic properties at different levels and their relations, a more expressive language is needed than simulation languages based on causal relationships, such as LEADSTO. To this end, the formal language TTL has been introduced as a super-language of LEADSTO; cf. [Bosse et al. 2006]. It is based on order-sorted predicate logic and allows the inclusion of numbers and arithmetical functions. Therefore most methods used in calculus are expressible in this language, including methods based on derivatives and differential equations. In this section it is shown how to incorporate differential equations in the predicate-logical language TTL that is used for analysis. Further, a number of global and local dynamic properties are identified, and it is shown how they can be expressed in TTL and logically related to each other.
Differential Equations in TTL

A differential equation of the form dy/dt = f(y) with the initial condition y(t0) = y0 can be expressed in TTL on the basis of a discrete time frame (e.g., the natural numbers) in a straightforward manner:
  ∀t ∀v [ state(γ, t) |= has_value(y, v) ⇒ state(γ, t+1) |= has_value(y, v + h·f(v)) ]
The traces γ satisfying this dynamic property are the solutions of the difference equation. However, it is also possible to use the dense time frame of the real numbers and to express the differential equation directly. Thus, x = dy/dt can be expressed as:
  ∀t, w ∀ε>0 ∃δ>0 ∀t', v, v' [ 0 < dist(t', t) < δ & state(γ, t) |= has_value(x, w) & state(γ, t) |= has_value(y, v) & state(γ, t') |= has_value(y, v') ⇒ dist((v' − v)/(t' − t), w) < ε ]
where dist(u, v) is defined as the absolute value of the difference. The traces γ for which this statement is true are (or include) solutions of the differential equation. Models consisting of combinations of difference or differential equations can be expressed in a similar manner. This shows how modelling constructs often used in DST can be expressed in TTL.
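The discrete-time TTL characterization can be checked mechanically on a finite trace. A small illustrative sketch (the function name and the example right-hand side are ours):

```python
# Check whether a finite trace y(0), y(1), ... satisfies the discrete TTL
# property: for all t, if state(t) |= has_value(y, v)
# then state(t+1) |= has_value(y, v + h*f(v)).
def satisfies_difference_property(trace, f, h, tol=1e-9):
    return all(abs(trace[t + 1] - (trace[t] + h * f(trace[t]))) <= tol
               for t in range(len(trace) - 1))

f = lambda v: -2.0 * v    # example right-hand side: dy/dt = -2*y
h = 0.1

euler_trace = [1.0]
for _ in range(20):
    euler_trace.append(euler_trace[-1] + h * f(euler_trace[-1]))

print(satisfies_difference_property(euler_trace, f, h))       # True
print(satisfies_difference_property([1.0, 0.5, 0.25], f, h))  # False
```

A trace generated by the Euler scheme satisfies the property by construction, while an arbitrary sequence of values does not; this is exactly the sense in which the TTL formula singles out the solutions of the difference equation.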
Global and Local Dynamic Properties

Within Dynamical Systems Theory, more specific analysis methods are known for global properties of a process. Examples of such analysis methods include mathematical methods to determine equilibrium points, the behaviour around equilibrium points, and the existence of limit cycles. Suppose a set of differential equations is given, for example a predator-prey model: dx/dt = f(x, y) and dy/dt = g(x, y). Here, f(x, y) and g(x, y) are arithmetical expressions in x and y. Within TTL the following abbreviation is introduced as a definable predicate:
  point(γ, t, x, v, y, w) ≡ state(γ, t) |= has_value(x, v) ∧ has_value(y, w)
Equilibrium points

These are points in the (x, y) plane for which, once they are reached by a solution, the state stays at this point in the plane for all future time points. This can be expressed as a global dynamic property in TTL as follows:
  has_equilibrium(γ, x, v, y, w) ≡ ∀t1 [ point(γ, t1, x, v, y, w) ⇒ ∀t2 ≥ t1 point(γ, t2, x, v, y, w) ]
  occurring_equilibrium(γ, x, v, y, w) ≡ ∃t point(γ, t, x, v, y, w) & has_equilibrium(γ, x, v, y, w)
Here, dist(v1, w1, v2, w2) denotes the distance between the points (v1, w1) and (v2, w2) in the (x, y) plane. The global dynamic properties described above can also be addressed from a local perspective.
Local equilibrium property

From the local perspective of the underlying mechanism, equilibrium points are those points for which dx/dt = dy/dt = 0, i.e., in terms of f and g for this case f(x, y) = g(x, y) = 0:
  equilibrium_state(v, w) ⇔ f(v, w) = 0 & g(v, w) = 0
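For the Lotka-Volterra model of Section 2.2, this local condition can be solved in closed form. A short sketch (parameter values taken from Section 3):

```python
# Nontrivial equilibrium of the Lotka-Volterra model:
#   f(x, y) = a*x - b*x*y   = 0  with x > 0  =>  y* = a/b
#   g(x, y) = c*b*x*y - e*y = 0  with y > 0  =>  x* = e/(c*b)
a, b, c, e = 4.0, 0.2, 0.1, 8.0

f = lambda x, y: a * x - b * x * y
g = lambda x, y: c * b * x * y - e * y

x_star = e / (c * b)
y_star = a / b

# Both rates vanish (up to rounding) at (x*, y*), so a trajectory that reaches
# this point stays there: the local property yields the global one.
print(x_star, y_star)                       # approximately 400 and 20
print(f(x_star, y_star), g(x_star, y_star))
```

With these parameters the equilibrium lies at roughly (400, 20), which also explains the behaviour of the controlled model in Section 3: with the prey population held near 100, the predator subsystem is far from its equilibrium level.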
In terms of f and g, the behaviour around an equilibrium can be expressed by relationships for the eigenvalues of the matrix of derivatives of f and g. The properties at the local and the global level can be logically related by interlevel relations, for example the following ones:
  ∀t [ state(γ, t) |= equilibrium_state(v, w) ] ⇔ has_equilibrium(γ, x, v, y, w)
  ∃d>0, δ>0 attracting(γ, x, v, y, w, δ, ε, d) ⇔ attracting(γ, x, v, y, w, ε)
5 Discussion

The LEADSTO approach proposed in this paper provides means to simulate models of dynamic systems that combine quantitative and qualitative aspects. Such systems are sometimes called hybrid systems [Davoren and Nerode 2000]. In the control engineering area, hybrid systems are often considered as switching systems that represent continuous-time systems with isolated and often simplified discrete switching events [Liberzon and Morse 1999]. In computer science, by contrast, the main interest in hybrid systems lies in investigating aspects of the discrete behaviour, while the continuous dynamics is often kept simple [Manna and Pnueli 1993]. Our LEADSTO approach gives as much room to modelling the continuous constituent of a system as to modelling the discrete one. In contrast to many studies on hybrid systems in computer science (e.g., [Rajeev et al. 1997]), in which a state of a system is described by an assignment of values to variables, in the proposed approach a state of a system is defined by (composite) objects using a rich ontological basis (i.e., typed constants, variables, functions and predicates). Furthermore, using TTL, the super-language of LEADSTO, dynamical systems can be analysed by formalizing and applying standard techniques from mathematical calculus. The accuracy and efficiency of simulation results for hybrid systems provided by the proposed approach depend to a great extent on the choice of a numerical approximation method. Although the proposed approach does not prescribe the use of
any specific approximation method (even the most powerful of them can be modelled in LEADSTO), for most cases the fourth-order Runge-Kutta method can be recommended, especially when the highest level of precision is not required. For simulating system models for which high precision is demanded, higher-order numerical methods with an adaptive step size can be applied.
Bibliography

[1] Bosse, T., Jonker, C.M., Meij, L. van der, Sharpanskykh, A. & Treur, J., 2006. Specification and Verification of Dynamics in Cognitive Agent Models. In: Proceedings of the Sixth Int. Conf. on Intelligent Agent Technology, IAT'06. IEEE Computer Society Press, 247-255.
[2] Bosse, T., Jonker, C.M., Meij, L. van der, and Treur, J., 2007. A Language and Environment for Analysis of Dynamics by Simulation. International Journal of Artificial Intelligence Tools, 16, 435-464.
[3] Bosse, T., Sharpanskykh, A. & Treur, J., 2008. Integrating Agent Models and Dynamical Systems. In: Baldoni, M., Son, T.C., Riemsdijk, M.B. van, and Winikoff, M. (eds.), Proc. of the 5th Int. Workshop on Declarative Agent Languages and Technologies, DALT'07, LNAI 4897, Springer Verlag, 50-68.
[4] Davoren, J.M. & Nerode, A., 2000. Logics for Hybrid Systems. Proc. of the IEEE, 88, 7, 985-1010.
[5] Gear, C.W., 1971. Numerical Initial Value Problems in Ordinary Differential Equations. Englewood Cliffs, NJ: Prentice-Hall.
[6] Liberzon, D. & Morse, A.S., 1999. Basic problems in stability and design of switched systems. IEEE Control Systems Magazine, 19, 5, 59-70.
[7] Manna, Z. & Pnueli, A., 1993. Verifying Hybrid Systems. In: Hybrid Systems, Lecture Notes in Computer Science 736, Springer-Verlag, 4-35.
[8] Meyer, J.J.Ch. & Treur, J. (eds.), 2002. Agent-based Defeasible Control in Dynamic Environments. Series in Defeasible Reasoning and Uncertainty Management Systems, 7. Kluwer Academic Publishers.
[9] Morin, P.J., 1999. Community Ecology. Blackwell Publishing, USA.
[10] Pearson, C.E., 1986. Numerical Methods in Engineering and Science. CRC Press.
[11] Port, R.F., and Gelder, T. van (eds.), 1995. Mind as Motion: Explorations in the Dynamics of Cognition. MIT Press, Cambridge, Mass.
[12] Rajeev, A., Henzinger, T.A., & Wong-Toi, H., 1997. Symbolic analysis of hybrid systems. Proc. of 36th Annual Conf. on Decision and Control, IEEE Press.
[13] Sharpanskykh, A. & Treur, J., 2006. Verifying Interlevel Relations within Multi-Agent Systems. Proc. of 17th European Conf. on AI, IOS Press, 290-294.
Chapter 7
On Elementary and Algebraic Cellular Automata Yuriy Gulak Center for Structures in Extreme Environments, Mechanical and Aerospace Engineering, Rutgers University, New Jersey [email protected]
In this paper we study elementary cellular automata from an algebraic viewpoint. The goal is to relate the emergent complex behavior observed in such systems with the properties of corresponding algebraic structures. We introduce algebraic cellular automata as a natural generalization of elementary ones and discuss their applications as generic models of complex systems.
1 Introduction

There is a great diversity of complex systems in physics, biology, engineering and other fields that share the ability to exhibit complicated, difficult-to-predict spatio-temporal behavior. Models of such systems, which reflect their most crucial properties, are based on traditional mathematical approaches as well as other techniques such as networks and automata. In physics and engineering, models can be derived from conservation laws and formulated in terms of differential or difference equations. Due to nonlinearities, the solutions of these equations may demonstrate irregular, "chaotic" behavior, which is often attributed to the sensitivity of the system's evolution to initial data, also known as the "butterfly
effect". For a biologist, however , models based on classical mathematical techniques such as differential equations, might look too restrictive. Indeed, a typical system consi sts of a large number of elements (cells) , where each cell experiences local nonlinear interactions with its neighbors. It is often argued that networks and automata can provide better alternative to the traditional approaches being discreet by nature , and defined by
specifying the interaction rules between cells. Interestingly, the first automata studies were initiated by the mathematicians J. von Neumann and S. Ulam in the early 1950's, inspired by the analogies between the operations of computers and the human brain. At that time the chaotic behavior of differential and difference equations was still awaiting its discovery, and known mathematical tools were not expressive enough to simulate systems of a high level of complexity.¹ Meanwhile, von Neumann and Ulam also expected that their automata project would help them come up with new mathematics adequate to describe the patterns, processes, and self-organization of living matter. Von Neumann's first automaton was quite complicated, but 30 years later S. Wolfram discovered remarkably simple rules that may lead to arbitrarily complicated global behavior and visual patterns, as demonstrated in his computer simulations of elementary cellular automata (ECA) models [10, 11]. Nowadays, cellular automata are a practical tool for dealing with complex systems in many disciplines, competitive even in fields where traditional techniques are well established (for example, the Lattice-Gas method for solving fluid dynamics equations). In some mathematical circles, however, these approaches are often considered an ad hoc toolkit rather than a step towards a satisfactory mathematical theory, thus avoiding the complexity as such. In this paper we would like to emphasize the complementary role of automata models and abstract mathematical ideas in the study of complex systems, as originally viewed by Ulam and von Neumann. We discuss the interrelation between the behavior of patterns and the algebraic properties of ECA and their generalizations based on groupoids.
2 ECA as groupoids

By definition [11], ECA are discrete dynamical systems that describe the evolution of black and white cells, denoted as 1 and 0 respectively, arranged in horizontal lines. Starting from some initial line, for example one where all cells are white and a single cell in the center is black (...0001000...), the color of each cell on the next line is determined by the colors of its three neighbors immediately above it. In order to define a particular ECA rule, it is sufficient to assign "0" or "1" values to the 8 possible triplets, for instance:

  111 → 0   110 → 1   101 → 1   100 → 0
  011 → 1   010 → 1   001 → 1   000 → 0     (1.1)
It is not difficult to see that there are 256 such rules, which are numbered from 0 to 255 by the binary numbers formed from the assigned digits and then converted to decimals. Thus, definition (1.1) corresponds to rule 110, since 01101110₂ = 110. It appears that the conventional ECA rules can easily be replaced by equivalent algebraic objects. Specifically, instead of using 3-cell rules such as

  110 → 1
  100 → 0,
¹ Although Ulam's computer-assisted studies of polynomial iterated maps in the late 1950's demonstrated fascinating limit sets that today would be classified as strange attractors.
one can define "products" of 2-cell blocks, 11 ∘ 00 = 10, etc., consistent with the corresponding rule. Denoting e1 = 11, e2 = 10, e3 = 01, e4 = 00, and evaluating similarly all products ei ∘ ej, i, j = 1..4, a particular case of ECA can be given by a corresponding multiplication table. Rule 110 (1.1), for instance, has the following table:

  ∘  | e1  e2  e3  e4
  e1 | e4  e3  e1  e2
  e2 | e1  e1  e3  e4
  e3 | e2  e1  e1  e2
  e4 | e1  e1  e3  e4
This multiplication table defines a four-element groupoid: a set G of 4 elements {ei}, i = 1..4, together with a closed binary operation ∘ : G × G → G, which in the general case is not necessarily associative and/or commutative. Clearly, each of the ECA rules generates a unique multiplication table, and hence a groupoid. Thus, the ECA evolution can be computed by groupoid multiplications of neighboring elements. Starting from an initial line of 2-block cells x1 x2 … xn, where xk ∈ G, k = 1..n, and evaluating the next n − 1 lines results in a single block, denoted by B12...n. In general, n(n − 1)/2 products are required to calculate B12...n. A key question is the following: can we "shortcut" such computations, using a smaller number of operations? It turns out that if the groupoid has special properties, the prediction of B12...n may be much more efficient [6, 7]. It can be shown, for example, that the groupoid of rule 90 is a commutative group, which requires only O(n) products to calculate the block B12...n. Other ECA based on known and well-studied algebraic structures such as semigroups, quasigroups, loops, and groups can also be efficiently predicted. However, the majority of groupoids related to ECA do not belong to any known algebraic structure. For instance, there are only 128 non-equivalent semigroups (associative groupoids) among the 4^16 ≈ 4.3·10^9 possible 4-element groupoids, first constructed by G. E. Forsythe in 1955 [1]. But what can be said about such general groupoids, whose algebraic structure is not known or has not been studied before? Do they satisfy any identities that would lead to efficient prediction of the corresponding ECA evolution? A combinatorial computer-assisted search for identities in these groupoids was previously described in [2], where a list of low-order identities for some ECA that show interesting behavior is presented.
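The groupoid construction is easy to reproduce mechanically: derive the products of 2-cell blocks from the rule, then compute successive lines of the automaton by groupoid multiplication. A sketch (the function names are ours; blocks are represented as bit pairs rather than the symbols e1..e4):

```python
# The groupoid of an ECA rule: the product of two 2-cell blocks
# (a1, a2) o (b1, b2) is (rule(a1, a2, b1), rule(a2, b1, b2)).
RULE110 = {(1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
           (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0}

def product(a, b, rule=RULE110):
    return (rule[(a[0], a[1], b[0])], rule[(a[1], b[0], b[1])])

# e1 = 11, e2 = 10, e3 = 01, e4 = 00; e.g. e1 o e1 = e4 and e2 o e3 = e3:
print(product((1, 1), (1, 1)))   # (0, 0), i.e. e4
print(product((1, 0), (0, 1)))   # (0, 1), i.e. e3

# One step of the ECA evolution computed by groupoid multiplications;
# a line of n blocks yields the next line of n - 1 blocks.
def next_line(blocks):
    return [product(blocks[i], blocks[i + 1]) for i in range(len(blocks) - 1)]

print(next_line([(0, 0), (0, 1), (1, 0), (0, 0)]))
```

Iterating next_line down to a single block computes B12...n with the n(n − 1)/2 products mentioned in the text; the "shortcut" question is whether identities of the groupoid allow this count to be reduced.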
It appears that groupoids of ECA that demonstrate simple behavior and belong to class 1 or 2 in Wolfram's classification exhibit many trivial identities with B-blocks, which provide computational "shortcuts" for efficient prediction. However, much fewer such identities were found in ECA groupoids of class 3, which demonstrate randomness, such as rule 30. Remarkably, identities with B-blocks were found in groupoids of class 4 ECA for some particular initial conditions; for instance, identities of this kind hold for the groupoid of rule 110.
Such identities, however, seem to appear quite randomly, so one may inquire whether there exists an effective description, or a basis, of the groupoid identities from which all other identities can be derived. It is well known, however, that groupoids in general are not necessarily finitely based (i.e., they need not possess a finite basis of identities). Only 2-element groupoids are always finitely based [4]; the first example of a three-element nonfinitely based groupoid,

  ∘  | e1  e2  e3
  e1 | e1  e1  e1
  e2 | e1  e1  e3
  e3 | e1  e2  e3     (1.2)
was constructed by V. L. Murskii [8]. In particular, he proved that for any n ≥ 3 there is a formula that
is satisfied identically but cannot be derived from any set of lower-degree identities of groupoid (1.2). Concerning ECA, their equivalence to 4-element groupoids implies that there might exist some set of "shortcut" identities that are impossible to derive from other, lower-degree identities. We can prove such identities by brute-force exhaustive substitution, since groupoids are finite structures, but never by the traditional axiomatic approach. The situation resembles Gödel's incompleteness result, but in a "light form": brute force is useless only in the case of infinite structures. Clearly, the lack of a finite basis of groupoid identities imposes serious complications on their algorithmic study and explains why one should not be surprised that such simple systems as ECA might be irreducible and show complicated behavior. In fact, there is not even a general algorithm to determine whether an arbitrary finite groupoid is finitely based, as recently proved by R. McKenzie [5]. Thus, each groupoid/ECA must be studied on an individual basis. We would especially like to emphasize the important role of nonassociativity as a necessary condition for nonfiniteness of the groupoid basis and the generation of complex patterns in the corresponding automaton. Indeed, a semigroup needs at least 6 elements to be nonfinitely based [9]. Remarkably, Wolfram's experimental
studies of cellular automata based on semigroups suggested that a semigroup of at least 6 elements is required to obtain patterns more complicated than nested, regardless of the initial condition [11, p. 887].
3 Algebraic cellular automata

So far we have focused on studying ECA as groupoids, but they represent only a subset of 256 out of the 4^16 ≈ 4.3·10^9 possible 4-element groupoids. It may be surprising that members of such a small subset demonstrate quite a rich behavioral spectrum. Examining the rest of the four billion cases would be time consuming but not impossible for a modern computer. In light of the above discussion, however, we can introduce cellular automata based on 3- and 2-element groupoids, in analogy with the ECA, and call them algebraic cellular automata (ACA). For convenient enumeration we will use the digits 0, 1, ... instead of the ei notation, because groupoids allow a scalar representation. The following 3-element groupoid, for example,

  ∘ | 0  1  2
  0 | 0  1  2
  1 | 1  2  0
  2 | 1  0  2     (1.3)
will be numbered as the decimal 4061, since 012120102₃ = 4061₁₀; the base-3 number is formed from the rows of the multiplication table, starting from the top one. First, as the simplest case, we examine 2-element groupoids. There are only 16 such groupoids, and we know that they are all finitely based. We would not expect to observe any complicated behavior in the corresponding ACA, but it is interesting to see generic cases of "simple" patterns. To make pictures we use two colors: the element 0 corresponds to a block of 2 black square cells, and 1 to a white one. The most "interesting" patterns obtained from an initial row containing a single 0 element, with the rest 1's, are shown in Fig. 1. The other observed patterns are trivial: all white, all black, and alternating black and white lines. We have also done some preliminary studies of 3-element ACA using elements of three colors: 0 is red, 1 is blue, and 2 is white. There are 3^9 = 19683 different 3-element ACA, and some of them are not finitely based. As an initial condition we used a single red element; the rest were white. Our experiments showed a wide behavioral spectrum of patterns, from simple nested to random with localized, turbulent structures. One such pattern is shown in Fig. 2. Clearly, more experiments, especially with different initial conditions, are required to filter out the most interesting ACA. This study, however, should be accompanied by an investigation of the algebraic structure and identities of the corresponding groupoids, which might result in a classification of isomorphic families of groupoids. It would also be interesting to compare such families for 3- and 4-element groupoids.
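The enumeration scheme can be checked with a few lines (the helper name is ours):

```python
# Number a 3-element groupoid by reading its multiplication table row by row
# (top row first) as a base-3 numeral.
def groupoid_number(table):
    digits = ''.join(str(table[i][j]) for i in range(3) for j in range(3))
    return int(digits, 3)

table_4061 = [[0, 1, 2],
              [1, 2, 0],
              [1, 0, 2]]
print(groupoid_number(table_4061))   # 012120102 in base 3, i.e. 4061
```

Enumerating all 3^9 tables this way and feeding each to an ACA simulator is exactly the kind of exhaustive experiment described in the text.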
Figure 1: Patterns produced by 2-element ACA of number 3 and 9 in the first column, and 12 and 14 in the second one. The first 100 steps of evolution are shown, starting from the single 0 element.
Figure 2: The first 100 steps of evolution of the 3-element ACA number 4061, starting from a single red 2-cell block, the rest white.
4 Discussion

In this paper, we tried to indicate that traditional mathematical methods and structures can be useful in the studies of complex systems, but require some technical and methodological adjustments in order to describe real world phenomena. Nonassociative structures, namely groupoids, that allowed an equivalent representation of the ECA, are fascinating but unfortunately not very interesting for mathematicians. In particular, Jacobson's classical algebra textbook states [3]:
    One way of trying to create new mathematics from an existing mathematical theory, especially one presented in an axiomatic form, is to generalize the theory by dropping or weakening some of its hypotheses. If we play this axiomatic game with the concept of an associative algebra, we are likely to be led to the concept of a non-associative algebra, which is obtained simply by dropping the associative law of multiplication. If this stage is reached in isolation from other mathematical realities, it is quite certain that one would soon abandon the project, since there is very little of interest that can be said about non-associative algebras in general.

We have seen, however, that axiomatic games are very restrictive, especially when dealing with nonassociative structures. Dropping associativity breaks many symmetries, so general groupoids lack some nice structural properties. But such a lack of symmetries appears to be an essential property of ECA and many other complex systems. We have not touched in this paper on applications of other nonassociative structures, such as nonassociative algebras, which are briefly discussed in [2]. Finally, we would like to stress the increasing importance of modern computers and emerging experimental mathematical techniques, on which we heavily relied in the course of the present study.
Acknowledgments

I wish to thank Haym Benaroya for valuable discussions and encouragement. Use of the facilities and support of the Rutgers Center for Structures in Extreme Environments are gratefully acknowledged.
Bibliography

[1] FORSYTHE, George E., "SWAC computes 126 distinct semigroups of order 4", Proc. Amer. Math. Soc. 6 (1955), 443-447.
[2] GULAK, Y., "Algebraic properties of some quadratic dynamical systems", Advances in Applied Mathematics 35 (2005), 407-432.
[3] JACOBSON, Nathan, Basic Algebra I, W. H. Freeman and Co., San Francisco, Calif. (1974).
[4] LYNDON, R., "Identities in the two-valued calculi", Transactions of the American Mathematical Society 71 (1951), 457-465.
[5] McKENZIE, Ralph, "Tarski's finite basis problem is undecidable", International Journal of Algebra and Computation 6, 1 (1996), 49-104.
[6] MOORE, C., "Quasi-linear cellular automata", Physica D 103 (1997), 100-132.
[7] MOORE, C., "Predicting non-linear cellular automata quickly by decomposing them into linear ones", Physica D 111 (1998), 27-41.
[8] MURSKII, V.L., "The existence in three-valued logic of a closed class with finite basis, not having a finite complete system of identities", Soviet Mathematics Doklady 4 (1965), 1020-1024.
[9] TRAHTMAN, A. N., "The finite basis question for semigroups of order less than six", Semigroup Forum 27, 1-4 (1983), 387-389.
[10] WOLFRAM, S., Cellular Automata and Complexity: Collected Papers, Addison-Wesley (1994).
[11] WOLFRAM, S., A New Kind of Science, Wolfram Media Inc., Champaign (2002).
Chapter 8
Dual phase evolution - a mechanism for self-organization in complex systems

David G. Green, Tania G. Leishman¹ and Suzanne Sadedin
Clayton School of Information Technology, Monash University, Clayton 3800, Australia
[email protected]
1.1. Introduction

A key challenge in complexity theory is to understand self-organization: how order emerges out of the interactions between elements within a system. Prigogine [1980] pointed out that in dissipative systems (open systems that exchange energy with their environment), order can increase. Rather than being suppressed, positive feedback allows local irregularities to grow into global features. Haken [1978] introduced the idea of an order parameter and pointed out that critical behaviour (e.g. the firing of a laser) always occurs at some predictable value of the parameter. Nevertheless, many questions remain, especially about the ways in which different processes act in concert with one another. In particular, the relationships between self-organization, natural selection and the evolution of complexity remain unclear. The results from several of our recent studies [e.g. Green et al. 2000, Green & Sadedin 2005] imply that processes governing evolution in landscapes are similar to a wide range of phenomena that occur in many different contexts. Here we distil these observations into a single theory, which we term dual phase evolution (DPE), and suggest that DPE may underlie self-organization in many different systems.
¹ Formerly Tania Bransden
1.2. Landscape phase changes and evolutionary learning

1.2.1. The role of catastrophes on different time scales

Our first indication that evolution in a landscape may represent a larger class of processes came when we detected similarities between the patterns and processes of biological change in landscapes on two completely different scales. On geological time scales, evolution occurs in fits and starts. There are long periods (tens or hundreds of millions of years) during which the flora and fauna of a region remain largely constant, forming the main geological periods. The transition from one period to the next is usually very abrupt. In 1980 Alvarez et al. provided evidence that the Cretaceous-Tertiary boundary was associated with the impact of a large comet [Alvarez et al. 1980]. Subsequent research found evidence for asteroid impacts, volcanic activity and climate change associated with other geological boundaries. Examination of patterns of species turnover showed that these boundaries were also associated with mass extinction events. This led Eldredge and Gould [1972] to propose their punctuated equilibrium hypothesis. They argued that instead of proceeding at a steady pace, evolution occurred mainly during brief bursts of diversification. These bursts were preceded by mass extinctions, and followed by long periods of stasis. Eldredge and Gould suggested that mass extinctions release evolutionary constraints by providing empty ecological niches and decreased competition. As new species evolve and populations grow and spread, competition increases. Consequently, the opportunity for evolutionary novelty declines until a steady state is reached, which is ultimately broken by a large external perturbation such as an asteroid strike. The causes of punctuated equilibrium in the fossil record are still disputed. However, a key observation has been largely neglected in this debate: similar patterns of change also occur on much shorter time scales.
An excellent analogue for punctuated equilibrium is found in vegetation history during the last 10,000 years.
1.2.2. Holocene forest history

An early triumph of Quaternary palynology (the study of preserved pollen) was to show that vegetation changes during postglacial times followed consistent patterns over vast regions of Europe and North America. Pollen histories also show that forest changes occur abruptly, in fits and starts, just like evolution in the geological record. Palynologists divide vegetation histories into "pollen zones", which are periods of relatively constant pollen composition. Transitions between pollen zones mark rapid, major shifts in species composition. Just like geological periods, zones are punctuated by sudden phase shifts that are triggered by environmental disturbances, usually forest fires [Green 1990]. The parallels between vegetation change and evolution are striking: pollen zones correspond to geological eras, sudden changes in community composition correspond to mass extinctions, and major fires correspond to cometary impacts. This correspondence suggests that a common process underlies both evolution and forest change [Green & Kirley 2000]. Simulation studies (see below) suggest that biotic interactions within landscapes are responsible. In the case of forest change, seed dispersal acts as a conservative process [Green 1989]. Because they possess an
overwhelming majority of seed sources, established species are able to exclude invaders. By clearing large regions, major fires enable invaders to compete with established species on equal terms. Conversely, seed dispersal also enables rare species to form clumped distributions that allow them to survive in the face of superior competitors. This mechanism appears to be important for the maintenance of diversity in tropical rainforests [Green 1989].
Figure 1. Simulated genetic drift in a landscape. Connected areas at left (examples shown in black) are small when the density of patches is sub-critical (top), but occupy the entire region when the density is super-critical. The graph at right shows the course of random genetic drift under these two cases. In a connected landscape, breeding suppresses genetic divergence, but in a fragmented landscape, random drift quickly leads to evolutionary divergence.
1.2.3. Landscapes, cataclysms and connectivity

Phase changes in the connectivity of a landscape can potentially explain both punctuated equilibria and pollen zones. Sites in a landscape are connected by processes, such as dispersal, that involve movement from place to place. In evolution and ecology, this means that individuals can migrate between sites, and consequently processes that are occurring in one site can spread to others. This property gives rise to the potential for a phase change in the connectivity of the landscape as a whole when the density of connected sites crosses a critical threshold. If density exceeds the threshold, the landscape is overwhelmingly connected and feedback processes that occur in one region can rapidly percolate throughout the environment. If density is subcritical, the landscape fragments into isolated, independently evolving patches (Fig. 1).
Cataclysms such as fires, cometary impacts or volcanic activity can drastically alter the connectivity of the landscape, flipping it from a connected to a disconnected state or vice versa. After a cataclysm, the landscape is largely empty. Surviving populations occupy isolated refugia, and are consequently fragmented. At geological timescales, ecological depletion and spatial fragmentation create ideal conditions for adaptive radiation. At finer timescales, they allow for the explosive spread of previously suppressed populations. As the landscape fills, it passes the critical threshold once again and spatial suppression inhibits innovation. In this way, phase changes in landscapes may explain both punctuated equilibrium and pollen zones [Green et al. 2000]. The evolutionary impact of landscape phase changes can be seen in cellular automata models (Fig. 1). When the density of habitat patches exceeds the critical level, they merge into a single connected region. This means that there is a single breeding population, and random genetic drift is suppressed (Fig. 1). However, when the density of patches is sub-critical, the landscape becomes fragmented into separate patches and a population breaks up into isolated sub-populations. Under these conditions, genetic drift is unconstrained and speciation becomes likely (Fig. 1).
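The connectivity phase change described here can be demonstrated with a few lines of code. The sketch below is our own toy model, not the authors' simulation: it occupies sites of an n × n grid with density p and measures the largest 4-connected cluster. Well below the percolation threshold (about 0.59 for this lattice) the largest patch stays tiny, while above it a single cluster spans most of the landscape.

```python
import random
from collections import deque

def largest_cluster(n, p, seed=0):
    """Largest 4-connected cluster of occupied sites on an n x n grid
    with occupation density p (a toy landscape-connectivity model)."""
    rng = random.Random(seed)
    occ = [[rng.random() < p for _ in range(n)] for _ in range(n)]
    seen = [[False] * n for _ in range(n)]
    best = 0
    for i in range(n):
        for j in range(n):
            if occ[i][j] and not seen[i][j]:
                # Flood-fill one cluster with a breadth-first search.
                size, queue = 0, deque([(i, j)])
                seen[i][j] = True
                while queue:
                    x, y = queue.popleft()
                    size += 1
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        u, v = x + dx, y + dy
                        if 0 <= u < n and 0 <= v < n and occ[u][v] and not seen[u][v]:
                            seen[u][v] = True
                            queue.append((u, v))
                best = max(best, size)
    return best
```

On a 50 × 50 grid, a sub-critical density such as p = 0.3 yields only small isolated patches, while p = 0.8 yields one cluster covering most of the grid - the connected/fragmented dichotomy of Fig. 1.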
1.2.4. Natural processes

We can also see the above process, or elements of it, at work in a variety of different natural phenomena. For instance, off the coast of California the kelp beds have a stable mix of plant species. However, every few years a major storm rips through the ecosystem, which sometimes reforms with a completely different mixture of species [Dayton et al. 1984]. There are also many ecosystems where external forces impose landscape phase changes. In central Australia, for instance, rainfall mediates phase changes in the distribution of water birds [Roshier et al. 2001]. During wet years, the landscape is essentially connected. Birds can fly virtually anywhere by moving from one water body to another. However, in drought years, most water bodies dry up, and the landscape becomes fragmented, confining the birds to small isolated areas. In the above examples, the biological implications of the phase changes are not clear. However, dual environmental phase changes of the kind we discussed above have been implicated in the evolution of at least one group of species: the cichlid fishes found in the lakes of Africa [Sturmbauer & Meyer 1992]. Here the phases are mediated by lake levels. Adaptive radiation and speciation predominate when water levels are high, whereas the environment is fragmented during phases of low water levels, so competition, selection and extinction predominate [Kornfield & Smith 2000]. Implicit in the mechanism proposed above are two processes - variation and refinement - that occur in many contexts. In learning and development, for example, Jean Piaget proposed two processes - accommodation and assimilation - that correspond to the above phases [Block 1982]. Accommodation occurs when a child encounters a novel situation and needs to find a new pattern of behavior, a new "schema", to deal with it.
Assimilation occurs when a child encounters a variation on a known situation and assimilates the experience by adapting an existing schema to deal with it. The issue of learning leads naturally to questions of brain function. Several studies raise the possibility of phase changes being involved in brain function. Most notably, Walter Freeman [1975, 1992] showed that living neural systems are prone to respond chaotically.
He found that even slight differences in stimuli could evoke widely different patterns of neural response.
1.2.5. Optimization algorithms

Many adaptive algorithms used in optimization apply phase changes implicitly to mediate between global search (exploration) and local search (exploitation). Fitness landscapes provide a convenient basis for understanding why this is so. In a fitness landscape, we imagine all the potential solutions to a problem laid out on a pseudo-landscape, with the values of key parameters fixing location and the objective function ("fitness") defining the elevation. In the Great Deluge Algorithm, for instance, a random walker can initially wander anywhere within the fitness landscape, even areas of low elevation (i.e. poor solutions). In other words, global search operates. However, rising "flood" waters make the areas of low elevation inaccessible. At first, this is not a problem for the walker, who can skirt around the pools of water, and all elevated areas remain connected. However, when the water level reaches a critical point, connectivity in the landscape breaks down and the walker becomes trapped on a single hill. From that point on, the walker is confined to local search (i.e. hill-climbing). Other optimization algorithms incorporate these phase changes in different ways. In simulated annealing, for instance, the cooling schedule plays the role of the rising flood water by imposing increasing restrictions on variations to parameter values.
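The Great Deluge walk is simple to sketch. The following toy implementation is ours, with illustrative parameter values: while the water is low the walker roams freely (global search), and as the level rises it is confined to a single hill (local search).

```python
import random

def great_deluge(f, x0, step=0.5, rain=0.01, iters=2000, seed=1):
    """Great Deluge search over a 1-D fitness landscape f:
    accept any move whose fitness stays above a steadily rising
    'water level'.  Parameter values here are illustrative."""
    rng = random.Random(seed)
    x = best = x0
    level = f(x0)                      # water starts at the walker's feet
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        if f(cand) > level:            # candidate site is still above water
            x = cand
            if f(x) > f(best):
                best = x
        level += rain                  # the flood keeps rising
    return best
```

Run on the single hill f(x) = -(x - 3)², the walker wanders at first but, as the water rises, is herded onto the peak near x = 3; the shrinking dry region enacts the connectivity breakdown described above.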
1.3. Dual phase evolution

1.3.1. The theory

Based on the above observations, we argue that evolution within landscapes exemplifies a family of mechanisms that differ from other widely known phenomena, such as self-organized criticality. In essence, our research suggests that underlying self-organization and emergence in many complex systems is a mechanism (Dual Phase Evolution) that incorporates the following (Fig. 2):

1. State spaces possess dual phases, with variation (exploration) dominant in one phase and selection (exploitation) dominant in the other.
2. Complexity accumulates as a result of repeated phase changes.
3. Phase changes are mediated by perturbations.
4. After perturbation, low connectivity decouples the dynamics of many local patches, allowing chaos to act as a source of novelty (exploration phase).
5. The system becomes increasingly connected over time.
6. When connectivity rises above the threshold, unstable interactions and poorly adapted designs are selected out, allowing increasingly complex, stable and orderly structures to crystallize (exploitation phase).
1.3.2. Relationship to other forms of critical behavior

To begin, we argue that DPE explains how punctuated equilibrium arises. Eldredge & Gould [1972] effectively argue for two phases; the link that we make here to connectivity and fragmentation provides a mechanism. Moreover, the mechanisms
involved in DPE indicate that the proliferation of new species noted by Eldredge and Gould can arise in several ways, especially the expansion of existing but previously rare species, and the unleashing of genetic variation by the removal of selective pressure. Finally, by linking the mechanism to a wide variety of other processes, such as forest ecology, we have shown that punctuated equilibrium is a special case of DPE. Dual phase evolution (DPE) differs in important ways from other processes that involve critical behavior. In particular, the theory of self-organized criticality [Bak et al. 1987] deals with processes that drive a system to approach a critical state and remain there. In contrast, DPE deals with systems that normally lie well away from a critical state, but external stimuli occasionally drive them across a critical threshold. The two theories describe different aspects of critical behavior and are complementary.
[Figure 2 depicts a cycle: in the exploitation phase, a connected network undergoes selection that reduces diversity, leading to stasis; a perturbation fragments the network; in the exploration phase, variation increases diversity and new forms emerge; links are then restored and the connected phase resumes.]
Figure 2. Generic representation of the processes involved in dual phase evolution.
Several authors have suggested that self-organized criticality may explain punctuated equilibrium if evolution drives ecosystems towards a critical state where a tiny perturbation can initiate an avalanche of extinctions. However, most mass extinctions seem to be associated with a large external perturbation rather than being self-organized, and therefore seem closer to the DPE model than to self-organized criticality. In addition, models of self-organized criticality suggest that it requires rather finely-tuned parameter values that are unlikely to arise by chance [but see Halley et al. 2004]. DPE is not incompatible with self-organized criticality, but it offers a potentially more robust mechanism for punctuated equilibrium. The present theory also differs from the "edge of chaos" model, which arose from studies into the behaviour of automata, with the relevant critical region lying within an
automaton's state space. There is a phase change, from simple to chaotic, which occurs in automata with increasing richness of behavior. Automata that lie close to this "edge of chaos" often display the most interesting behavior, including universal computation [Langton 1990]. Rather than settling in the critical region, the phenomena we describe here exhibit jumps through the critical region. They do not settle in a critical state and remain there. Numerous authors [e.g. Freeman 1975; Langton 1990] have suggested that chaos provides a source of novelty in nature. In the cases we describe, one phase is essentially chaotic. So such systems acquire novelty while in the "exploration" phase.
1.4. An application

We have successfully exploited DPE to improve the performance of genetic algorithms. In the cellular genetic algorithm (cGA), for instance, we mapped the agent population onto a pseudo-landscape (not a fitness landscape), allowing breeding only with neighbors [Kirley & Green 2000]. We introduced intermittent phase changes in landscape connectivity by including cataclysms that cleared patches. These steps made it possible for the algorithm to maintain a diversity of solutions and avoid premature convergence, a common problem with genetic algorithms. The intermittent "disasters" introduced phase changes between connected and fragmented landscapes. The effect of disturbance was to allow "fitter" solutions to expand. In other words, the disturbances mediated regular swaps between local and global search.
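A minimal version of this scheme can be sketched as follows. This is our toy reconstruction, not the cGA of Kirley & Green: the OneMax objective and all parameter values are invented for illustration. Agents on a grid breed only with a neighbor, and periodic "disasters" clear a patch and refill it with random genomes, restoring exploration.

```python
import random

def cellular_ga(n=16, bits=20, gens=60, disaster_every=20, seed=0):
    """Toy cellular GA in the spirit of DPE: local breeding plus
    intermittent disasters that clear and reseed a patch."""
    rng = random.Random(seed)
    fitness = lambda g: sum(g)                     # OneMax objective
    pop = [[[rng.randint(0, 1) for _ in range(bits)]
            for _ in range(n)] for _ in range(n)]
    for gen in range(1, gens + 1):
        for i in range(n):
            for j in range(n):
                # Breed only with a randomly chosen grid neighbor.
                di, dj = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
                mate = pop[(i + di) % n][(j + dj) % n]
                cut = rng.randrange(bits)          # one-point crossover
                child = pop[i][j][:cut] + mate[cut:]
                child[rng.randrange(bits)] ^= 1    # point mutation
                if fitness(child) >= fitness(pop[i][j]):
                    pop[i][j] = child              # local selection
        if gen % disaster_every == 0:              # disaster: clear a patch
            ci, cj = rng.randrange(n), rng.randrange(n)
            for di in range(3):
                for dj in range(3):
                    pop[(ci + di) % n][(cj + dj) % n] = \
                        [rng.randint(0, 1) for _ in range(bits)]
    return max(fitness(pop[i][j]) for i in range(n) for j in range(n))
```

Local breeding preserves clumps of distinct solutions while the disasters re-open fragmented, exploratory patches, so the population keeps improving without collapsing onto a single genotype.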
1.5. Conclusions

The theory of dual phase evolution that we have outlined here proposes that many systems develop and change by a mechanism involving phase changes. Left to themselves, such systems will rapidly evolve to a stable state where selection refines existing adaptations, but creativity is limited. However, external events may disturb the system, flipping it into a different phase in which variation, rather than selection, dominates. The phase transition is an essentially chaotic phenomenon that perturbs the system in unpredictable ways, and thereby acts as a source of novelty. Following the phase change, the system gradually drifts back into its original phase, but settles into a completely new, and often more complex, steady state. The theory of dual phase evolution raises many new questions to be answered. What role does it play in structural change within networks? What kinds of emergent features arise via DPE, especially encapsulation and module formation [e.g. Holland 1995]? We have also raised the possibility that DPE plays a role in many different contexts, such as brain function. These ideas have yet to be explored more carefully. Finally, there still remains a need to identify clear criteria for identifying conditions where DPE is active.
References

Alvarez, L.W., Alvarez, W., Asaro, F. and Michel, H.V., 1980, Extraterrestrial cause for the Cretaceous-Tertiary extinction, Science 208, 1095-1108.
Bak, P., Tang, C. & Wiesenfeld, K., 1987, Self-organized criticality: an explanation of 1/f noise, Phys. Rev. Lett. 59, 381-384.
Block, J., 1982, Assimilation, accommodation and the dynamics of personality development, Child Development 53, 281-295.
Dayton, P.K., Currie, V., Gerrodette, T., Keller, B.D., Rosenthal, R. & Ven Tresca, D., 1984, Patch dynamics and stability of some California kelp communities, Ecological Monographs 54(3), 253-289.
Eldredge, N. and Gould, S.J., 1972, Punctuated equilibria: an alternative to phyletic gradualism, in Models in Paleobiology, edited by T. Schopf, Freeman, Cooper, San Francisco, pp. 82-115.
Freeman, W.J., 1975, Mass Action in the Nervous System, Academic Press (New York).
Freeman, W.J., 1992, Tutorial on neurobiology: from single neurons to brain chaos, International Journal of Bifurcation and Chaos 2(3), 451-482.
Green, D.G., 2004, The Serendipity Machine, Allen and Unwin (Sydney).
Green, D.G., 2003, Self-organization in networks, in Proceedings of 2003 Asia Pacific Symposium on Intelligent and Evolutionary Systems: Technology and Applications, edited by M. Gen, A. Namatame, O. Katai, R. McKay, H.S. Hwang, and B. Liu, Waseda University (Tokyo) (ISBN 0731705033), pp. 10-14.
Green, D.G., 2001, Hierarchy, complexity and agent based models, in Our Fragile World: Challenges and Opportunities for Sustainable Development, UNESCO (Paris), 1273-1292.
Green, D.G. and Kirley, M.G., 2000, Adaptation, diversity and spatial patterns, International Journal of Knowledge-Based Intelligent Engineering Systems 4(3), 184-190.
Green, D.G., 2000, Self-organization in complex systems, in Complex Systems, edited by T.J. Bossomaier and D.G. Green, Cambridge University Press, pp. 7-41.
Green, D.G., Newth, D. and Kirley, M., 2000, Connectivity and catastrophe - towards a general theory of evolution, in Artificial Life VII: Proceedings of the Seventh International Conference, edited by M.A. Bedau et al., pp. 153-161, MIT Press (Boston).
Green, D.G. and Sadedin, S., 2005, Interactions matter - complexity in landscapes and ecosystems, Ecological Complexity 2, 117-130.
Haken, H., 1978, Synergetics, Springer-Verlag, Berlin.
Halley, J.D., Warden, A.C., Sadedin, S. and Li, W., 2004, Rapid self-organized criticality: fractal evolution in extreme environments, Physical Review E, 036118.
Heng, T.N. and Green, D.G., 2006, The ALife Virtual Laboratory, www.complexity.org.au/vlab/.
Holland, J., 1995, Hidden Order: How Adaptation Builds Complexity, Addison-Wesley, New York.
Kirley, M. and Green, D.G., 2000, An empirical investigation of optimisation in dynamic environments using the cellular genetic algorithm, in The Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2000), edited by D. Whitley et al., pp. 11-18, Morgan Kaufmann.
Kornfield, I. & Smith, P.F., 2000, African cichlid fishes: model systems for evolutionary biology, Annual Review of Ecology and Systematics 31, 163-196.
Langton, C.G., 1990, Computation at the edge of chaos: phase transitions and emergent computation, Physica D 42(1-3), 12-37.
Prigogine, I., 1980, From Being to Becoming, W.H. Freeman and Co., San Francisco.
Roshier, D.A., Robertson, A.I., Kingsford, R.T. and Green, D.G., 2001, Continental-scale interactions with temporary resources may explain the paradox of large populations of desert waterbirds in Australia, Landscape Ecology 16, 547-556.
Sturmbauer, C. & Meyer, A., 1992, Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fishes, Nature 358, 578-581.
Chapter 9
A new measure of heterogeneity of complex networks based on degree sequence

Jun Wu, Yue-Jin Tan, Hong-Zhong Deng and Da-Zhi Zhu
School of Information System and Management, National University of Defense Technology, Changsha, China
[email protected]
Many unique properties of complex networks are due to their heterogeneity. The measure and analysis of heterogeneity is important and desirable for research into the behaviours and functions of complex networks. In this paper, the entropy of the degree sequence (EDS) is proposed as a new measure of the heterogeneity of complex networks, and the normalized entropy of the degree sequence (NEDS) is defined. Compared with conventional measures, EDS agrees with the normal meaning of heterogeneity within the context of complex networks. The heterogeneity of scale-free networks is studied using EDS. An analytical expression for the EDS of scale-free networks is presented by introducing the degree-rank function. It is demonstrated that scale-free networks become more heterogeneous as the scaling exponent decreases. It is also demonstrated that the NEDS of scale-free networks is independent of the size of the networks, which indicates that NEDS is a suitable and effective measure of heterogeneity.
1 Introduction
We are surrounded by networks. Networks with complex topology describe a wide range of systems in nature and society [?, ?]. In the late 1950s, Erdős and Rényi made a breakthrough in classical mathematical graph theory. They described a network with complex topology by a
random graph (ER model) [?]. In the past few years, the discovery of small-world [?] and scale-free properties [?] has stimulated a great deal of interest in studying the underlying organizing principles of various complex networks. There are major topological differences between random graphs and scale-free networks. For random networks, each vertex has approximately the same degree k ≈ ⟨k⟩. In contrast, the scale-free network with power-law degree distribution implies that vertices with only a few edges are numerous, but a few nodes have a very large number of edges. The presence of a few highly connected nodes (i.e. 'hubs') is the most prominent feature of scale-free networks and indicates that scale-free networks are heterogeneous. The heterogeneity leads to many unique properties of scale-free networks. For example, Albert et al. [?] demonstrated that scale-free networks possess the robust-yet-fragile property, in the sense that they are robust against random failures of nodes but fragile to intentional attacks. Moreover, it is found that homogeneous networks are more synchronizable than heterogeneous ones, even though the average network distance is larger [?]. Consequently, the measure and analysis of heterogeneity is important and desirable for research into the behaviours and functions of complex networks. Several measures of heterogeneity have been proposed. Nishikawa et al. [?] quantified the heterogeneity of complex networks using the standard deviation of the degree, σ. Solé et al. [?] proposed the entropy of the remaining degree distribution q(k) to measure the heterogeneity. Wang et al. [?] measured the heterogeneity of complex networks using the entropy of the degree distribution P(k). With the measures above, the most heterogeneous network is the network obtained for P(1) = P(2) = ... = P(N - 1), and the most homogeneous network is the network obtained for P(k₀) = 1 and P(k) = 0 (k ≠ k₀), i.e. a regular network.
However, these conventional measures are not in agreement with the normal meaning of heterogeneity within the context of complex networks. For example, we are generally inclined to believe that a random network is quite homogeneous, but this is not the case with the measures above. In addition, a star network is generally considered to be very heterogeneous because of the presence of the single hub node, but the star network is quite homogeneous according to the conventional measures. In this paper, we first present a new measure of heterogeneity called the entropy of the degree sequence (EDS) and compare it with conventional measures. Then we investigate the heterogeneity of scale-free networks using EDS.
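The star-network objection can be checked numerically. The sketch below (our illustration, not code from the paper) computes the entropy of the degree distribution P(k) for a star network of 1000 vertices: almost all vertices have degree 1, so the distribution entropy is close to 0, the "most homogeneous" end of that scale, even though the star is intuitively the most heterogeneous case.

```python
import math
from collections import Counter

def edd(degrees):
    """Entropy of the degree distribution: -sum_k P(k) ln P(k),
    with P(k) estimated as the fraction of vertices of degree k."""
    n = len(degrees)
    return -sum((c / n) * math.log(c / n) for c in Counter(degrees).values())

N = 1000
star = [N - 1] + [1] * (N - 1)   # one hub of degree N-1, N-1 leaves
```

Under this distribution entropy the star scores below 0.01 nats, nearly indistinguishable from a regular network (whose entropy is exactly 0) - the mismatch that motivates the degree-sequence measure introduced next.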
2 Entropy of degree sequence
A complex network can be represented by a graph G with N vertices and M edges. Assume that G is an undirected, simple, connected graph. Let d_i be the degree of vertex v_i. As shown in Figure 1, we sort all vertices in decreasing order of degree and get a degree sequence D(G) = (D_1, D_2, ..., D_N), where D_1 ≥ D_2 ≥ ... ≥ D_N. Noting that different vertices may have the same degree, we group all vertices into N - 1 groups according to their degree. That is, the
degree of the vertices in the s-th group is N - s. Let f_s be the number of vertices in the s-th group, namely the frequency of vertices with degree N - s. Let s_i be the index of the group in which vertex v_i is located, and r_i be the global rank of v_i among all vertices in decreasing order of degree. Let P(k) be the degree distribution, i.e. the probability that a randomly chosen vertex has degree k. Let d = f(r) be the degree-rank function, which gives the relationship between the degree and the rank in the degree sequence D(G); it is non-stochastic, in the sense that no underlying probability distribution for the sequence need be assumed.
Figure 1: Sorting all vertices in decreasing order of degree and grouping all vertices into groups according to degree.

To measure the heterogeneity of complex networks, we define the entropy of the degree sequence (EDS) as

    EDS = - ∑_{i=1}^{N} I_i ln I_i                                  (1.1)

where

    I_i = D_i / ∑_{j=1}^{N} D_j                                     (1.2)

Substituting equation (1.2) into equation (1.1), we have

    EDS = ln( ∑_{j=1}^{N} D_j ) - ( ∑_{i=1}^{N} D_i ln D_i ) / ( ∑_{j=1}^{N} D_j )    (1.3)
Obviously, the maximum value of EDS is EDS_max = ln(N), obtained for I_i = 1/N, i.e. D_1 = D_2 = ... = D_N. Note that D_i > 0 and D_i is an integer, so the minimum value EDS_min = ln(4(N - 1))/2 occurs when D(G) = (N - 1, 1, ..., 1). The maximum value of EDS corresponds to the most homogeneous network, i.e. a regular network, and the minimum value of EDS corresponds to the most heterogeneous network, i.e. a star network. The normalized entropy of the degree sequence (NEDS) can be defined as
ERD max - ERD ERD max - ERDmin
(1.4)
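Equations (1.1)-(1.4) can be computed directly from a degree sequence. A minimal sketch (Python, stdlib only; function names are ours) that also checks the stated extremes EDS_max = ln N and EDS_min = ln(4(N − 1))/2:

```python
import math

def eds(degrees):
    """Entropy of the degree sequence, Eqs. (1.1)-(1.2):
    EDS = -sum_i f_i ln f_i with f_i = D_i / sum_j D_j."""
    total = sum(degrees)
    return -sum((d / total) * math.log(d / total) for d in degrees)

def neds(degrees):
    """Normalized EDS, Eq. (1.4), with EDS_max = ln N (regular network)
    and EDS_min = ln(4(N-1))/2 (star network)."""
    N = len(degrees)
    eds_max = math.log(N)
    eds_min = math.log(4 * (N - 1)) / 2
    return (eds_max - eds(degrees)) / (eds_max - eds_min)

N = 1000
regular = [3] * N                   # any k-regular degree sequence attains EDS_max
star = [N - 1] + [1] * (N - 1)      # D(G) = (N-1, 1, ..., 1) attains EDS_min
```

By construction, NEDS is 0 for the regular sequence and 1 for the star, matching the normalization in Eq. (1.4).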
For comparison, we present the definition of the entropy of the remaining degree distribution (ERD) in [?]:

ERD = − Σ_{k=1}^{N} q(k) ln q(k),   (1.5)

where q(k) = (k + 1)P(k + 1) / Σ_j jP(j) is the remaining degree distribution, and the entropy of the degree distribution (EDD) in [?]:

EDD = − Σ_{k=1}^{N} P(k) ln P(k).   (1.6)

For ERD or EDD, the maximum value is ln(N − 1), obtained for P(k) = 1/(N − 1) (k = 1, 2, ..., N − 1), which corresponds to the most heterogeneous network, and the minimum value is 0, obtained for P(k_0) = 1 and P(k) = 0 (k ≠ k_0), which corresponds to the most homogeneous network. To be consistent with NEDS, we define the normalized entropy of the remaining degree distribution (NERD) and the normalized entropy of the degree distribution (NEDD) as
NERD = (ERD − ERD_min) / (ERD_max − ERD_min)   (1.7)

and

NEDD = (EDD − EDD_min) / (EDD_max − EDD_min).   (1.8)

A network then becomes more heterogeneous as NEDS/NERD/NEDD increases. We use NEDS, NERD, NEDD, and the standard deviation of degree σ to measure: (a) a regular network with N = 1000; (b) a random network (ER model) with N = 1000 and connection probability p = 0.3; (c) a star network with N = 1000; (d) a scale-free network (Barabási-Albert model) with N = 1000 and m_0 = m = 3. The results are shown in Table 1. With NERD or NEDD, the order of heterogeneity is: random network ≻ scale-free network ≻ star network ≻ regular network. With the standard deviation of degree σ, the order of heterogeneity is: star network ≻ random network ≻ scale-free network ≻ regular network. With NEDS, the order of heterogeneity is: star network ≻ scale-free network ≻ random network ≻ regular network. We are generally inclined to believe that a scale-free network is more heterogeneous than a random network, and that a star network is very heterogeneous because of the presence of its single hub node. So, compared with the conventional measures, our measure agrees with the normal meaning of heterogeneity within the context of complex networks.
3 Heterogeneity of scale-free networks
To obtain the EDS of scale-free networks, we first present a theorem on the relationship between the degree distribution P(k) and the degree-rank function f(r).
Table 1: Results of comparisons between the conventional measures and ours.

                     NEDS     NERD     NEDD     σ
Regular network      0        0        0        0
Random network       0.0004   0.6451   0.6480   26.2462
Scale-free network   0.1178   0.4304   0.2960   6.8947
Star network         1        0.0502   0.0011   31.5595
Theorem 1. If the degree distribution of a network is P(k), then the degree-rank function of the network is d = f(r) = N − T*, where T* satisfies

∫_1^{T*} P(k = N − s) ds = r/N.
PROOF. Since v_i is located in the s_i-th group, we obtain

Σ_{s=1}^{s_i} f_s ≥ r_i and Σ_{s=1}^{s_i − 1} f_s < r_i.   (1.9)

Namely, s_i is the minimum T that satisfies Σ_{s=1}^{T} f_s ≥ r_i, i.e., s_i = T_min(Σ_{s=1}^{T} f_s ≥ r_i). Noting that f_s = N·P(k = N − s), we obtain

s_i = T_min(Σ_{s=1}^{T} P(k = N − s) ≥ r_i/N).   (1.10)

Then

d_i = N − s_i = N − T_min(Σ_{s=1}^{T} P(k = N − s) ≥ r_i/N).   (1.11)

Assuming that P(k) is integrable, with a continuous approximation for the degree distribution, equation (1.11) can be written as

d_i = N − T_min(∫_1^{T} P(k = N − s) ds ≥ r_i/N).   (1.12)

Note that P(k = N − s) ≥ 0, hence ∫_1^{T} P(k = N − s) ds is an increasing function with respect to T, leading to

∫_1^{T_min} P(k = N − s) ds = r_i/N.   (1.13)

Using equation (1.13), equation (1.12) can be expressed as

d_i = N − T*,   (1.14)

where T* satisfies

∫_1^{T*} P(k = N − s) ds = r_i/N.   (1.15)
This proves the theorem. ∎

For scale-free networks with power-law degree distributions P(k) = Ck^{−λ}, where λ is the scaling exponent, substituting P(k) = Ck^{−λ} into equation (1.15), we have

∫_1^{T*} C(N − s)^{−λ} ds = r/N.   (1.16)

Solving equation (1.16) for T*, we have

T* = N − [((λ − 1)/(NC))·r + (N − 1)^{−λ+1}]^{−1/(λ−1)}.   (1.17)

Substituting equation (1.17) into equation (1.14), we obtain the degree-rank function of scale-free networks as follows:

d = f(r) = [((λ − 1)/(NC))·r + (N − 1)^{−λ+1}]^{−1/(λ−1)}.   (1.18)
Note that the scaling exponent of most real-world scale-free networks ranges between 2 and 3. We have (N − 1)^{−λ+1} → 0 as N → ∞ when λ > 2. Then equation (1.18) simplifies to

d = f(r) = C_1 r^{−α},   (1.19)

where C_1 = (NC/(λ − 1))^{1/(λ−1)} and α = 1/(λ − 1). We call α the degree-rank exponent of scale-free networks. Substituting equation (1.19) into equation (1.3), we obtain the EDS of scale-free networks as follows (the constant C_1 cancels):

EDS = ln(Σ_{r=1}^{N} r^{−α}) + α (Σ_{r=1}^{N} r^{−α} ln r) / (Σ_{r=1}^{N} r^{−α}).   (1.20)
With a continuous approximation for the degree distribution, we obtain the EDS of scale-free networks as a function of the degree-rank exponent α:

EDS = α N^{1−α} ln N^{1−α} / ((1 − α)(N^{1−α} − 1)) + ln((N^{1−α} − 1)/(1 − α)) − α/(1 − α).   (1.21)
Substituting α = 1/(λ − 1) into equation (1.21), we obtain the EDS of scale-free networks as a function of the scaling exponent λ:

EDS = N^{(λ−2)/(λ−1)} ln N^{(λ−2)/(λ−1)} / ((λ − 2)(N^{(λ−2)/(λ−1)} − 1)) + ln((λ − 1)(N^{(λ−2)/(λ−1)} − 1)/(λ − 2)) − 1/(λ − 2).   (1.22)
Substituting Eq. (1.22) into Eq. (1.4), we can obtain the NEDS of scale-free networks. In Figure 2 we show plots of NEDS versus λ for different N, with λ ∈ (2, 3). We find that a scale-free network becomes more heterogeneous as λ decreases. Moreover, the plots of NEDS for different N overlap with each other, which indicates that NEDS is independent of N; NEDS is therefore a suitable measure of heterogeneity for scale-free networks of different sizes.
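Both claims can be checked by evaluating Eqs. (1.21) and (1.4) directly. A minimal sketch (Python, stdlib only; the λ grid and tolerance are illustrative assumptions):

```python
import math

def eds_sf(lam, N):
    """EDS of a scale-free network, Eq. (1.21), with alpha = 1/(lam - 1)."""
    a = 1 / (lam - 1)
    Na = N ** (1 - a)
    return (a * Na * math.log(N) / (Na - 1)
            + math.log((Na - 1) / (1 - a))
            - a / (1 - a))

def neds_sf(lam, N):
    """Normalized EDS of a scale-free network via Eq. (1.4)."""
    eds_max = math.log(N)
    eds_min = math.log(4 * (N - 1)) / 2
    return (eds_max - eds_sf(lam, N)) / (eds_max - eds_min)

# NEDS falls as lam rises, and barely moves when N changes by two decades.
vals = [neds_sf(lam, 10**4) for lam in (2.2, 2.4, 2.6, 2.8)]
```

This reproduces the behaviour shown in Figure 2: NEDS is decreasing in λ and nearly unchanged between N = 10³ and N = 10⁵.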
Figure 2: Plots of NEDS versus λ ∈ (2, 3) for different N (N = 10³, 10⁴, 10⁵). NEDS decreases as λ increases, and NEDS is independent of N.
4 Conclusion

Many unique properties of complex networks are due to their heterogeneity. Measuring and analysing heterogeneity is important for studying the behaviour and function of complex networks. In this paper, we have proposed the entropy of the degree sequence (EDS) and the normalized entropy of the degree sequence (NEDS) to measure the heterogeneity of complex networks. The maximum value of EDS corresponds to the most homogeneous network, i.e. a regular network, and the minimum value of EDS corresponds to the most heterogeneous network, i.e. a star network. We measured different networks using conventional measures and ours. The results of the comparison show that EDS agrees with the normal meaning of heterogeneity within the context of complex networks. We have studied the heterogeneity of scale-free networks using EDS and derived an analytical expression for the EDS of scale-free networks. We have demonstrated that scale-free networks become more heterogeneous as the scaling exponent decreases. We have also demonstrated that the NEDS of scale-free networks is independent of network size, which indicates that NEDS is a suitable and effective measure of heterogeneity.
Bibliography

[1] ALBERT, R., BARABASI, A.-L., "Statistical mechanics of complex networks", Rev. Mod. Phys. 74 (2002), 47-51.
[2] STROGATZ, S.H., "Exploring complex networks", Nature 410 (2001), 268-276.
[3] ERDOS, P. and RENYI, A., "On random graphs", Publ. Math. 6 (1959), 290-297.
[4] WATTS, D.J. and STROGATZ, S.H., "Collective dynamics of 'small-world' networks", Nature 393 (1998), 440-442.
[5] BARABASI, A.-L. and ALBERT, R., "Emergence of scaling in random networks", Science 286 (1999), 509-512.
[6] ALBERT, R., JEONG, H., and BARABASI, A.-L., "Error and attack tolerance of complex networks", Nature 406 (2000), 378-382.
[7] NISHIKAWA, T., MOTTER, A.E., LAI, Y.C., and HOPPENSTEADT, F.C., "Heterogeneity in Oscillator Networks: Are Smaller Worlds Easier to Synchronize?", Phys. Rev. Lett. 91 (2003), 014101.
[8] SOLE, R.V., VALVERDE, S.V., "Information Theory of Complex Networks", Lect. Notes Phys. 650 (2004), 189.
[9] WANG, B., TANG, H.W., GUO, C.H., and XIU, Z.L., "Entropy Optimization of Scale-Free Networks Robustness to Random Failures", Physica A 363 (2005), 591.
Chapter 10
Are technological and social networks really different?

Daniel E. Whitney
Engineering Systems Division, Massachusetts Institute of Technology, Cambridge, MA 02139
[email protected]

David Alderson
Operations Research Department, Naval Postgraduate School, Monterey, CA 93943
[email protected]
The use of the Pearson coefficient (denoted r) to characterize graph assortativity has been applied to networks from a variety of domains. Often, the graphs being compared are vastly different, as measured by their size (i.e., number of nodes and arcs) as well as their aggregate connectivity (i.e., degree sequence D). Although the hypothetical range for the Pearson coefficient is [-1, +1], we show by systematically rewiring 38 example networks while preserving simplicity and connectedness that the actual lower limit may be far from -1, and also that, when restricting attention to graphs that are connected and simple, the upper limit is often very far from +1. As a result, when interpreting the r-values of two different graphs it is important to consider not just their direct comparison but their values relative to the possible ranges for each respectively. Furthermore, network domain ("social" or "technological") is not a reliable predictor of the sign of r. Finally, we show that networks with observed r < 0 are constrained by their D to have a range of possible r which is mostly < 0, whereas networks with observed r > 0 suffer no such constraint. Combined, these findings say that the most minimal network structural constraint, D, can explain observed r < 0, but that network circumstances and context are necessary to explain observed r > 0.
1 Introduction
Newman [1] observed that the Pearson degree correlation coefficient r for some kinds of networks is consistently positive while for other kinds it is negative. Several explanations have been offered [2, 3]. In this paper we offer a different explanation based on embedding each subject network in the set of all networks sharing the subject network's degree sequence (denoted here as D). Our primary contribution is to show with 38 example networks from many domains that the degree sequence for simple and connected graphs dictates in large part the values of r that are possible. More precisely, we show that, although D does not necessarily determine the observed value of r, it conclusively determines the maximum and minimum values of r that each subject network could possibly have, found by rewiring it while preserving its D, its connectedness, and its simplicity. Approaching the problem this way reveals interesting properties of D that affect the range of possible values of r. In particular, networks with observed r < 0 have a smaller range that is all or mostly < 0. But for networks with observed r > 0 the range covers most of [-1, +1]. After studying these properties and their underlying mathematics, we ask if the alternate wirings are semantically feasible, in an effort to see how the domain of each network might additionally constrain r.¹
2 Observed data and mathematical analysis
Table 1 lists the networks studied and their properties of interest. The values of r_max and r_min were obtained by systematically rewiring each subject network while preserving connectivity and degree sequence. This type of rewiring procedure was used previously by Maslov et al. [4], who argued that graph properties such as assortativity only make sense when the graph of interest is compared to its "randomized" counterpart. The message of this paper is similar in spirit, but focuses on empirical evidence across a variety of domains. The networks in Table 1 are listed in ascending order of r. It should be clear from this table that one finds networks of various types, such as "social," "biological," or "technological," having positive or negative values of r. This indicates that networks do not "naturally" have negative r, and that no special explanation is needed for why social networks have positive r. All empirical conclusions drawn from observations are subject to change as more observations are obtained, but this is the conclusion we draw based on our data.

In Table 1, the kinds of networks, briefly, are as follows: social networks are coauthor affiliations or clubs; mechanical assemblies comprise parts as nodes and joints between parts as edges; rail lines comprise terminals, transfer stations or rail junctions as nodes and tracks as edges; food webs comprise species as nodes and predator-prey relationships as arcs; software call graphs comprise subroutines as nodes and call-dependence relationships as arcs; Design Structure Matrices (DSMs) [9] comprise engineering tasks or decisions as nodes and dependence relationships as arcs; voice/data-com systems comprise switches, routers and central offices as nodes and physical connections (e.g., wire or fiber) as arcs; electric circuits comprise circuit elements as nodes and wires as arcs; and air routes comprise airports or navigational aids as nodes and flight routes as arcs.

¹ No causality is implied. The domain may well provide the constraints that shape D. The present paper does not attempt to assign a causal hierarchy to the constraints.

In Table 1, we introduce the notion of elasticity, defined here as e = |r_max − r_min|/2, which reflects the possible range of r relative to the maximum range [-1, 1] obtained over all networks having the same degree sequence. We call a degree sequence with large e elastic, while a degree sequence with small e is called rigid. The vastly different observed ranges for possible values of r can be explained by a closer look at the respective degree sequences for each network and the way in which they constrain graph features as a whole. In the remainder of this paper, when referring to the degree sequence D for a graph, we mean a sequence {d_1, d_2, ..., d_n}, always assumed to be ordered d_1 ≥ d_2 ≥ ... ≥ d_n without loss of generality. The average degree of the network is simply ⟨d⟩ = n^{-1} Σ_{i=1}^{n} d_i. For the purposes of this paper, we define the Pearson coefficient (known more generally as the correlation coefficient [5]) as
r = [ m^{-1} Σ_{(i,j)} d_i d_j − ( m^{-1} Σ_{(i,j)} (d_i + d_j)/2 )² ] / [ m^{-1} Σ_{(i,j)} (d_i² + d_j²)/2 − ( m^{-1} Σ_{(i,j)} (d_i + d_j)/2 )² ],   (1.1)

where m is the total number of links in the network and the sums run over links (i,j). We define d̄ = (2m)^{-1} Σ_{(i,j)} (d_i + d_j) = (2m)^{-1} Σ_k (d_k)². Here, d̄ is the average degree of a node seen at the end of a randomly selected edge. It is easy to see that the degrees at either end of a link have the same average when averaging over links (i,j), so in what follows we will simply refer to d̄. Observe that d̄ ≠ ⟨d⟩. In fact, d̄ = (n^{-1} Σ_i d_i²)/(n^{-1} Σ_j d_j) = ⟨d²⟩/⟨d⟩, so d̄ is a measure of the amount of variation in D.

Figure 1 shows that the most rigid D are characterized by a few dominant nodes of relatively high degree, with the remaining vast majority of nodes having relatively low degree, equivalent to a small supply of available edges and implying a small value of ⟨d⟩. This gives D a rather "peaked" appearance. By comparison, the more elastic D have a more gradually declining degree profile. The importance of d̄ in determining r can be easily seen from Equation 1.1. Positive r is driven by having many nodes with d_i > d̄ that can connect to one another. However, for networks with large d̄, there are typically fewer such nodes, and thus many more connections in which d_i > d̄ but d_j < d̄. The implication is that for highly variable D in which there are only a few dominant high-degree nodes larger than d̄, most connections in the network will be of this latter type, and r will likely be negative. This line of reasoning is suggestive but not conclusive, yet as a heuristic it succeeds in distinguishing the rigid D from the elastic D studied here. The observed values of r, r_max, and r_min from Table 1 are plotted in Figure 2. The range [r_min, r_max] provides the background against which the observed r should be compared, not [-1, +1]. When the observed r < 0, the whole range is
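Equation (1.1) can be evaluated directly from an edge list. A minimal sketch (Python, stdlib only; the function names are ours, and the star and path examples are textbook cases with known values r = −1 and r = −0.5):

```python
def degrees(edges):
    """Degree of every node in an undirected edge list."""
    d = {}
    for u, v in edges:
        d[u] = d.get(u, 0) + 1
        d[v] = d.get(v, 0) + 1
    return d

def pearson_r(edges):
    """Eq. (1.1): Pearson correlation of the degrees at either end of an
    edge, counting each undirected edge in both orientations."""
    d = degrees(edges)
    pairs = ([(d[u], d[v]) for u, v in edges]
             + [(d[v], d[u]) for u, v in edges])
    n = len(pairs)
    mean = sum(x for x, _ in pairs) / n
    cov = sum(x * y for x, y in pairs) / n - mean * mean
    var = sum(x * x for x, _ in pairs) / n - mean * mean
    return cov / var

def dbar(edges):
    """Average degree at the end of a randomly chosen edge: <d^2>/<d>."""
    ds = list(degrees(edges).values())
    return sum(x * x for x in ds) / sum(ds)

star = [(0, i) for i in range(1, 6)]   # K_{1,5}: every edge joins hub to leaf
path = [(0, 1), (1, 2), (2, 3)]        # the path P_4
```

For the star, d̄ = ⟨d²⟩/⟨d⟩ = 3.0 while ⟨d⟩ ≈ 1.67, illustrating that d̄ exceeds ⟨d⟩ whenever D varies.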
Table 1: Networks Studied and Some of Their Properties, Ordered by Increasing Pearson Degree Correlation r. Each network is simple, connected, and undirected unless marked *. In the case of the physics coauthors, company directors, and software, only the largest connected component is analyzed. Table omissions correspond to cases where only summary statistics (and not the entire network) were available or where the network was directed (complicating the calculation and interpretation of d̄). Social networks were obtained from published articles and data available directly from researchers. Their definitions of node and edge were used. The Santa Fe researchers data were taken from Figure 6 of [7]. Air route and navigational aid data were taken from FAA databases. Mechanical assemblies were analyzed using drawings or exploded views of products. DSM data were obtained by interviewing participants in design of the respective products. Rail and subway lines were analyzed based on published network maps available in travel guides and web sites. Food web data represent condensation to trophic species. Software call list data were analyzed using standard software analysis tools. The traffic light control circuit is a standard benchmark ISCAS89 circuit. "Nanobell" is a modern competitive local exchange carrier operating in one state with a fiber optic loop network architecture. Its positive value for r reflects this architecture. The regional bell operating company (RBOC) that operates in the same state has a legacy copper wire network that reflects the tree-like architecture of the original AT&T monopoly, and in this state its network's r is -0.6458. This statistic is based on ignoring all links between central offices. Adding 10% more links at random between known central offices brings r up to zero. The RBOC would not divulge information on these links for competitive reasons.

Networks studied (in ascending order of r): Karate Club; "Erdos Network" (Tirole); "Erdos Network" (Stiglitz); Scheduled Air Routes, US; Littlerock Lake* food web; Grand Piano Action 1 key; Santa Fe coauthors; V8 engine; Grand Piano Action 3 keys; Abilene-inspired toynet (Internet); Bike; Six speed transmission; "HOT"-inspired toynet (Internet); Car Door* DSM; Jet Engine* DSM; TV Circuit*; Tokyo Regional Rail; FAA Nav Aids, Unscheduled; Mozilla, 19980331* software; Canton food web*; Mozilla, all components*; Munich Schnellbahn Rail; FAA Nav Aids, Scheduled; St. Marks* food web; Western Power Grid; Unscheduled Air Routes, US; Apache software call list*; Physics coauthors; Tokyo Regional Rail + Subways; Traffic Light controller* (circuit); Berlin U- & S-Bahn Rail; London Underground; Regional Power Grid; Moscow Subways.
wholly or mostly < 0. When the observed r > 0, the whole range approximates [-1, +1]. Networks of all types may be seen across the whole range of r in this figure.
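The search for r_max and r_min described above can be sketched as a hill-climb over degree-preserving double-edge swaps that keep the graph simple and connected. The toy graph, step count, and seed below are illustrative assumptions, not networks from Table 1:

```python
import random

def pearson_r(edges):
    """Degree assortativity over an undirected edge list (Newman's r)."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    pairs = ([(deg[u], deg[v]) for u, v in edges]
             + [(deg[v], deg[u]) for u, v in edges])
    n = len(pairs)
    mean = sum(x for x, _ in pairs) / n
    cov = sum(x * y for x, y in pairs) / n - mean * mean
    var = sum(x * x for x, _ in pairs) / n - mean * mean
    return cov / var

def connected(edges):
    """Depth-first check that the graph induced by the edges is connected."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, stack = set(), [next(iter(adj))]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(adj[u] - seen)
    return len(seen) == len(adj)

def extreme_r(edges, maximize, steps=2000, seed=0):
    """Hill-climb on double-edge swaps that preserve the degree sequence,
    simplicity, and connectedness; return the best r found (a lower bound
    on r_max or upper bound on r_min -- a sketch, not an exhaustive search)."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    eset = set(frozenset(e) for e in edges)
    best = pearson_r(edges)
    for _ in range(steps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue
        new1, new2 = (a, c), (b, d)     # one of the two possible rewirings
        if frozenset(new1) in eset or frozenset(new2) in eset:
            continue                    # would create a multi-edge
        trial = list(edges)
        trial[i], trial[j] = new1, new2
        if not connected(trial):
            continue
        r = pearson_r(trial)
        if (r > best) == maximize and r != best:
            edges, eset, best = trial, set(frozenset(e) for e in trial), r
    return best

# Illustrative toy graph: a hub with leaves plus a short tail.
toy = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (4, 5), (5, 6), (6, 7)]
```

By construction the observed r always lies inside the [r_min, r_max] interval the search returns.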
Figure 1: Degree Profiles of Two Networks in Table 1. A greater fraction of nodes have d_i > d̄ in the physics coauthors (right) than in the V8 engine (left), consistent with increasing elasticity.
Figure 2: Relationship Between r and its Range.
3 Domain analysis
The preceding data and indicators lead us to a striking conclusion: in some cases, whether a network has r < 0 or r > 0 may be simply a function of the network's degree sequence D itself. For example, if the entire range of allowable r is negative, then no domain-specific "explanation" is required to justify why the network has r < 0. Networks with rigid D are obviously more constrained than those with elastic D, and why a particular network having an elastic D gives rise to a particular r-value when a large range is mathematically possible remains an important question. It then makes sense to ask if the mathematical range of possible values of r is in fact plausible for a functioning system. Stated another way: do the domain-specific features of the system necessarily constrain the network to the observed (or nearby) r-values?
For the mechanical assemblies, the answer is that not all values of r within the possible range correspond to functioning systems. The rewired bikes are not different bikes, but meaningless snarls of spokes, pedals, wheels, brake cables, and so on. These networks are not only constrained by rigid D, they are functionally intolerant of the slightest rewiring. But the rewired coauthor networks, even at their extremes of positive and negative r, represent plausible coauthorship scenarios. A negative-r scenario could arise in classic German university research institutes, where each institute is headed by a professor whose name is on every paper that the institute publishes. Some of the coauthors go on to head their own institutes and ultimately have many coauthors themselves, while the majority of the others go into industry and publish few papers after graduating. The result is a network with relatively few high-degree nodes connected to many low-degree nodes and only one, if any, connections to other high-degree nodes, leading to negative r. The fact that coauthorship and other social networks have been found with negative r shows that such plausible scenarios might actually exist. The opposite scenario could be observed at a large research institute devoted to biomedical research, where huge efforts by many investigators are needed to create each publication, and there are often 25 or 30 coauthors on each paper.² If such groups produce a series of papers, the result will be a coauthor network with positive r. The same may be said of the Western Power Grid, where the observed connections are no more necessary than many other similar hookups. In [8] it was shown that a communication network with a power-law degree sequence could be rewired to have very different r-values and structure, and that the different structures could display very different total bandwidth capacity.
While all these networks are feasible (i.e., they could be built from existing technology), engineering and economic criteria preclude some as "unrealistic" (i.e., prohibitively high cost and/or poor performance). Interestingly, the observed structure strongly resembles the planned form of the AT&T long distance network as of 1930 [6]. Large mechanical assemblies like the V8 and the walker have a few high-degree nodes that support the large forces and torques that occur in these devices. The six speed transmission and the bike similarly support large forces and torques but have a larger number of load-bearing parts and consequently fewer edges impinging on those parts. Assemblies with rigid parts are severely restricted in the allowed magnitude of ⟨d⟩ by their need to avoid over-constraint in the kinematic sense [10]. Elastic parts do not impose the severe mechanical constraint on their neighbors that rigid ones do, so the limit on ⟨d⟩ is not as severe. The entries in Table 1 bear this out. In the bike, the parts that create elasticity are the spokes, while in the transmission they are thin clutch plates. Both kinds of parts appear in large numbers and connect to parts like wheel rims, hubs, and main foundation castings without imposing undue mechanical constraint. For these reasons the bike and the transmission have less peaked D, larger d̄, and offer more options for rewiring. Nonetheless, all of these rewirings are implausible and are not observed in practice.

² For all 55 reports published in Science in the summer and fall of 2005, the average number of authors is 6.9 with a standard deviation of 6.

Transportation networks may be tree-like or mesh-like, depending on the constraints and objectives under which they were designed or evolved, as the case may be. It is easy to show that regular trees have negative r while meshes have positive r. Planned urban rail and subway systems increasingly include circle lines superposed on a closely knit mesh, tending to push r toward positive values. If a simple grid is rewired to have respectively minimum and maximum r, we can easily imagine geographic constraints that make the rewired versions plausible, as shown in Figure 3.
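The tree-versus-mesh claim is easy to check numerically. A sketch with illustrative sizes (a depth-4 complete binary tree and a 5×5 square grid):

```python
def pearson_r(edges):
    """Degree assortativity over an undirected edge list (Newman's r)."""
    d = {}
    for u, v in edges:
        d[u] = d.get(u, 0) + 1
        d[v] = d.get(v, 0) + 1
    pairs = ([(d[u], d[v]) for u, v in edges]
             + [(d[v], d[u]) for u, v in edges])
    n = len(pairs)
    mean = sum(x for x, _ in pairs) / n
    cov = sum(x * y for x, y in pairs) / n - mean * mean
    var = sum(x * x for x, _ in pairs) / n - mean * mean
    return cov / var

def complete_binary_tree(depth):
    """Edges of a complete binary tree: node i has children 2i+1 and 2i+2."""
    n = 2 ** (depth + 1) - 1
    return [((i - 1) // 2, i) for i in range(1, n)]

def grid(rows, cols):
    """Edges of a rows x cols square-lattice mesh."""
    e = []
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:
                e.append(((i, j), (i, j + 1)))
            if i + 1 < rows:
                e.append(((i, j), (i + 1, j)))
    return e
```

The tree's many internal-to-leaf edges drive r negative, while the grid's block of degree-4 interior nodes connecting to one another drives r positive.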
Figure 3 panels: Rural Roads, r = 0.2941; Region Divided by a River, r = -0.7647; Old City on Hill, New Suburbs Outside, r = 0.7351.
Figure 3: Three Road Systems. Left: a simple grid, typical of roads in Iowa or Nebraska. Center: the grid rewired to have minimum degree correlation, reflecting roads in a region divided by a large river or mountain range. Right: the grid rewired to have maximum degree correlation, reflecting an old European city as a citadel on high ground surrounded by suburbs with a geographically constrained road system.
4 Conclusions
This paper studied simple connected networks of various types and investigated the extent to which their degree sequences determined the observed value of r or the range of mathematically feasible values of r that they could exhibit. We found that certain characteristics of D, mainly a few dominant high-degree nodes, small ⟨d⟩, and large d̄ relative to ⟨d⟩, give rise to observed r < 0 and constrain r to a narrow range comprising mostly negative values. It is then of domain interest to understand why a particular network has a degree sequence D with these characteristics. For the rigid assembly networks, this can be traced to the fact that they must support large forces and torques and that they have a few high-degree parts that perform this function while supporting the rest of the parts. For rigid social networks like the Karate Club and the Tirole and Stiglitz coauthorship networks, it can be traced to the presence of one or a few dominant individuals who control the relationships represented in the network. For networks whose D does not have these restrictive characteristics, the observed value of r, while usually > 0, may not mean anything from either a mathematical point of view (because a wide range of r of both signs is mathematically feasible) or from a domain point of view (because other rewirings with very different r exist or are plausible). Thus, our findings contradict the claim made in [3], namely that "Left to their own devices, we conjecture, networks normally have negative values of r. In order to show a positive value of r, a network must have some specific additional structure that favors assortative mixing." The examples in this paper disaffirm such generalizations and suggest instead that the observed r for any network should not be compared to [-1, +1] but rather to the allowed range of r for that network, based on its D. Similar arguments based on graph-theoretic properties are made in [11].
5 Acknowledgments
The authors thank J. Park, J. Davis, and J. Dunne for sharing and explaining physics coauthor, company director, and food web data, respectively, and S. Maslov, G. Bounova and M.-H. Hsieh for sharing and explaining valuable Matlab routines. The authors also thank J. Noor, K. Steel, and K. Tapia-Ahumada for data on the regional power grid, P. Bonnefoy and R. Weibel for data on the US air traffic system, C. Vaishnav, J. Lin and D. Livengood for several telephone networks, N. Sudarsanam for the traffic light controller, C. Baldwin and J. Rusnak for Apache and Mozilla call list data, and C. Rowles and Eric McGill for DSM data. The authors thank C. Magee and L. Li for valuable discussions.
Bibliography

[1] Newman, M., 2003, The structure and function of complex networks, SIAM Review 45, 167.
[2] Maslov, S. and Sneppen, K., 2002, Specificity and Stability in Topology of Protein Networks, Science 296, 910-913.
[3] Newman, M. E. J., and Park, J., 2003, Why Social Networks are Different from Other Types of Networks, Physical Review E 68, 036122.
[4] Maslov, S., Sneppen, K., and Zaliznyak, A., 2004, Detection of topological patterns in complex networks: correlation profile of the internet, Physica A 333, 529-540.
[5] Eric W. Weisstein, Correlation Coefficient, from MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/CorrelationCoefficient.html
[6] Fagen, Ed., 1975, A History of Engineering and Science in The Bell System, The Early Years (1875-1925), Bell Telephone Laboratories.
[7] Girvan, M., and Newman, M. E. J., 2002, Community Structure in Social and Biological Networks, PNAS 99, 12, 7821-7826.
[8] Li, L., Alderson, D., Doyle, J. C., and Willinger, W., 2006, Towards a Theory of Scale-Free Graphs: Definition, Properties, and Implications, Internet Mathematics 2, 4, 431-523.
[9] Steward, D. V., 1981, Systems Analysis and Management: Structure, Strategy, and Design, PBI (New York).
[10] Whitney, D. E., 2004, Mechanical Assemblies and their Role in Product Development, Oxford University Press (New York).
[11] Alderson, D., and Li, L., 2007, Diversity of graphs with highly variable connectivity, Phys. Rev. E 75, 046102.
Chapter 11

Evolutional family networks generated by a group-entry growth mechanism with preferential attachment and their features

Takeshi Ozeki
Department of Electrical and Electronics Engineering, Sophia University, 7-1 Kioicho, Chiyodaku, Tokyo, 102-8554, Japan
[email protected]

The group-entry growth mechanism with preferential attachment generates a new class of scale-free networks, of which the exact asymptotic connectivity distribution and generating function are derived. They evolve from aristocratic networks to egalitarian networks with asymptotic power-law exponent γ = 2 + M, depending on the size M and topology of the constituent groups. The asymptotic connectivity distribution fits very well with numerical simulation even in the region of smaller degrees. It is then demonstrated that small networks can be analysed to find their growth-mechanism parameters using asymptotic connectivity distribution templates in the region of smaller degrees, where it is easy to satisfy a statistical level of significance. This approach is believed to open a new search for scale-free networks in the real world. As an example of an evolutional family network in the real world, the Tokyo Metropolitan Railway Network is analysed.
1 Introduction

The scale-free network science [Barabasi 2002, Buchanan 2002, Newman 2006] is expected to provide potential methods to analyse various network characteristics of complex organizations in real-world systems, such as epidemics on the Internet [Newman] and the dependability of social infrastructure networks [Mitchell 2003]. Evolutional network modelling is desirable for analysing such characteristics as they depend on network topology. The Watts-Strogatz small world [Watts 1998] evolves from regular lattice networks to Erdos-Renyi random networks [Erdos 1960] by randomly rewiring links as the rewiring probability is adjusted. The Watts-Strogatz small world, having a fixed number of nodes, is discussed as a static network. On the other hand, the scale-free network of Barabasi-Albert (the BA model) introduces the concept of growing networks with preferential attachment [Barabasi 1999]. One characterization of networks is given by the connectivity distribution P(k), the probability that a node has k degrees (i.e., number of links). In scale-free networks based on the BA model, the connectivity distribution follows a power law, in which P(k) is approximated by k^{−γ} with exponent γ = 3. Real-world complex networks have been analysed to find various scale-free networks with various exponents, which are covered in the references [Barabasi 2002, Buchanan 2002, Newman 2006]. For example, it is well known that social infrastructure networks, such as power grids, as egalitarian networks, follow the power law with exponent 4 [Barabasi 1999]. Many trials have been reported to generate models with larger exponents for fitting these real-world networks [10-17]: Dorogovtsev et al. [Dorogovtsev 2000] modified the preferential attachment probability as Π(k) ∝ am + k and derived the exact asymptotic solution of the connectivity distribution, showing the wide range of exponents
γ = a + 2, where a is the attractiveness and m is the degree of a new node. One similar model modifying the probability of preferential attachment has been successfully applied to analyse the power-law relation of the betweenness as a measure of network load distribution, for instance [Goh 2001]. The BA model modified in the preferential attachment probability is practical for fitting the exponents of real-world networks, but much work is needed to identify the physical causes of such a preferential attachment probability. In particular, such models make it difficult to find parametric relations with network topologies. In this report, "evolutional family networks" generated by a "group-entry growth mechanism" with preferential attachment are proposed. This is a modification of the BA model in "the growth mechanism"; that is, the basic BA model assumed "one node" joining the network at each time step. This is the first model to generate a new kind of scale-free network with exponents depending on the topological parameters of the constituent groups. One can imagine this new growth mechanism by considering the famous Granovetter social world [Granovetter 1973], i.e., the social world grows with the entry of a family or a group. The family members are initially connected strongly with each other.

[Figure 1: Evolutional family networks]
The evolutional family network with M = 1 coincides with the aristocratic network of the BA model. On the contrary, the evolutional family network with larger M is an egalitarian network with larger γ, and is a new class of scale-free networks with higher regularity and clustering coefficient but a smaller diameter as M increases. These features are quite different from those of the Watts-Strogatz small world. The evolutional family network can evolve, by changing M, through a new class of scale-free networks of various topologies starting from the BA model network.
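The group-entry growth mechanism can be sketched with a minimal simulation. This is our own illustrative reading, assuming each new full-mesh family of M nodes also attaches one preferential external link per member (so every entering node has degree M); the function and variable names are ours:

```python
import random
from collections import Counter

def grow_family_network(M, steps, seed=0):
    """Group-entry growth (assumes M >= 2): at each step a full-mesh family
    of M nodes joins, and each member adds one external link to an old node
    chosen with probability proportional to its degree."""
    rng = random.Random(seed)
    deg = [M - 1] * M                      # the first full-mesh family
    # multiset of node ids, one copy per unit of degree (preferential sampling)
    pool = [i for i in range(M) for _ in range(M - 1)]
    for _ in range(steps - 1):
        base = len(deg)
        old_pool = list(pool)              # external links attach to old nodes only
        for i in range(base, base + M):    # new family: internal full mesh
            deg.append(M - 1)
            for j in range(i + 1, base + M):
                pool += [i, j]
        for i in range(base, base + M):    # one preferential external link each
            t = rng.choice(old_pool)
            deg[t] += 1
            deg[i] += 1
            pool += [i, t]
    return deg

deg = grow_family_network(M=4, steps=500)
hist = Counter(deg)                        # empirical connectivity distribution
```

Plotting `hist` on log-log axes for growing `steps` reproduces the steep power-law tails discussed below.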
2 Asymptotic connectivity distributions of full-mesh family networks
The asymptotic connectivity distributions of the full-mesh family networks evolving horizontally in Fig. 1 are derived by the method reported in [Dorogovtsev 2000]. At the initial time t = 1, the number of nodes in the network is M and the number of links (edges) is M(M-1)/2. At time t, the number of nodes in the network is
M·t, and the total number of links is M{(M+1)t - 2}/2 (each entering family carries M(M-1)/2 internal links and M external links), so that the sum of degrees is M{(M+1)t - 2}.
p(k,s,t) denotes the connectivity distribution probability of node s having degree k at time t; it is given by the following master equation. For 0 ≤ s ≤ M·t - 1,

p(k,s,t+1) = p(k,s,t)·[1 - Mk/(M{(M+1)t - 2})] + p(k-1,s,t)·M(k-1)/(M{(M+1)t - 2})   (1)

since each of the M entering external links attaches to a node of degree k with probability k divided by the degree sum M{(M+1)t - 2}. For the newly entering nodes, M·t ≤ s ≤ M(t+1) - 1,

p(M,s,t+1) = 1.

The average connectivity probability of the network is defined as the connectivity distribution by
p(k,t) = (1/(M·t)) Σ_{s=0}^{M·t-1} p(k,s,t)   (2)

Then p(k,t+1) is calculated as follows:
(t+1)·(p(k,t+1) - p(k,t)) = -(1 + k/(M+1))·p(k,t) + ((k-1)/(M+1))·p(k-1,t) + δ_{k,M} + O(1/t)   (3)
Assuming the existence of the asymptotic connectivity distribution lim_{t→∞} p(k,t) = p(k) and the convergence of the left-hand side of Eq. (3) to zero,

lim_{t→∞} (t+1)·(p(k,t+1) - p(k,t)) = 0,
the asymptotic connectivity distribution p(k) is obtained as follows:

p(k) = ((k-1)/(k+M+1))·p(k-1) ; k ≥ M+1, with p(M) = (M+1)/(2M+1)   (4)

which is coincident with the expression derived in the reference. From Eq. (4), the evolutional family network with family size M approximately follows the power law with exponent γ = M + 2 for sufficiently large k. Here the exponent is directly related to the network topology parameter M.
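The power-law claim can be checked numerically. In this quick sanity sketch (ours, not the paper's), we assume the recursion p(k) = (k-1)·p(k-1)/(k+M+1) with the boundary value p(M) = (M+1)/(2M+1), which reduces to the exact BA distribution 4/(k(k+1)(k+2)) at M = 1:

```python
import math

M = 3
p = {M: (M + 1) / (2 * M + 1)}                 # assumed boundary value p(M)
for k in range(M + 1, 20001):
    p[k] = p[k - 1] * (k - 1) / (k + M + 1)    # the Eq. (4) recursion

total = sum(p.values())                         # should be ~1 (truncated tail is tiny)

k = 10000                                       # local log-log slope at large k
slope = (math.log(p[k]) - math.log(p[k // 2])) / (math.log(k) - math.log(k // 2))
# slope approaches -(M + 2), the claimed power-law exponent
```

The normalization to unity and the slope -(M+2) both emerge without any fitting, which is the appeal of the exact asymptotic form.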
The asymptotic connectivity distributions of the evolutional family networks with M = 8, and with M = 1 corresponding to the BA-model network, are shown in Fig. 2. The exponents, measured at the degree of 8, are 7.5 and 3, respectively. [Fig. 2: Connectivity distributions of full-mesh family networks with M = 8 (a) and M = 1 (BA) (b); crosses denote the asymptotic distribution and white circles denote numerical simulations.] One point of interest in evolutional family networks is that, even in a relatively small network, the numerically simulated connectivity distribution fits very well with the asymptotic connectivity distribution, as indicated in Fig. 2(a). The crosses and circles plot the asymptotic connectivity distribution and the numerical simulation, respectively. The network size is N0 = 1000 for all cases. Fig. 2(b) illustrates the fitting of the evolutional family network with M = 1, which also shows excellent fitting at small degrees of less than 10; however, it clearly shows a larger divergence of the numerically simulated connectivity distribution from the asymptotic solution in the distribution region below 10^-2. This fact suggests the possibility that the asymptotic exact connectivity distribution can be applied to fit real-world networks of relatively small size and within relatively small degrees, where it is easy to satisfy a statistical level of significance.
The generation functions G0(x, M) are given as follows [Newman 2001]:

G0(x, M) = ((2M)!·M(M+1)/M!) · Σ_{k=M}^{∞} x^k / (k(k+1)(k+2)···(k+M+1))   (6)
For M = 1,

G0(x,1) = (2(1-x)^2/x^2)·log(1/(1-x)) + (3x-2)/x = 4·f_2(x) ; |x| ≤ 1   (7)

where f_m(x) is a function found in the mathematical formula tables [Moriguchi 1956], and the forms of G0(x, M) for higher M can be calculated by the following successive formula:

f_{m+1}(x) = 1/((m+1)!·(m+1)) - (1-x)·f_m(x)/((m+1)·x)   (8)
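The successive formula (8) can be verified against a direct series sum, assuming (our reading of Eqs. (6)-(7)) that f_m(x) denotes the Moriguchi-type series Σ_{k≥1} x^k/(k(k+1)···(k+m)):

```python
import math

def f_direct(m, x, terms=2000):
    """f_m(x) = sum_{k>=1} x^k / (k(k+1)...(k+m)), summed term by term."""
    s = 0.0
    for k in range(1, terms + 1):
        prod = 1.0
        for j in range(k, k + m + 1):
            prod *= j
        s += x ** k / prod
    return s

def f_next(m, x, fm):
    """Recursion (8): f_{m+1}(x) = 1/((m+1)!(m+1)) - (1-x) f_m(x) / ((m+1) x)."""
    return 1.0 / (math.factorial(m + 1) * (m + 1)) - (1 - x) * fm / ((m + 1) * x)

x = 0.7
r2 = f_next(1, x, f_direct(1, x))   # f_2 via the recursion
d2 = f_direct(2, x)                 # f_2 summed directly
```

The two evaluations agree to machine precision, which is a convenient check when building the higher-M generation functions from f_2.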
3 Asymptotic connectivity distributions with non-integer exponents
The asymptotic connectivity distribution of the family network combining the lines and loops evolving vertically in Fig. 1 is derived by the same method used above, when the probabilities of the line and loop constituent networks appearing at each entry time step are 1 - ε and ε, respectively, as follows:
The asymptotic exponent is γ = 1 + 2(N+ε)/(1+ε), which is a non-integer exponent including the statistical weight ε. The statistically combined growth mechanism between full-mesh family networks of sizes M and M+1, with statistical probabilities of appearance in each time step ε and 1-ε, respectively, also generates an evolutional family network with a non-integer exponent, having the following asymptotic connectivity distribution; p(M) and p(M+1) are derived by the master equation method used above:

p(k) = p(k-1)·(k-1)/(k + (M+1)(M+2-2ε)/(M+1-ε)) ; k ≥ M+2   (11)

The asymptotic exponent is γ = 1 + (M+1)(M+2-2ε)/(M+1-ε).
For example, Fig. 3 shows the connectivity distribution of the statistically combined family networks for the case of M = 4 and ε = 0.5, with γ = 6.56 from Eq. (11), which is denoted by the solid line lying between the integer asymptotic exponents γ = 6 for M = 4 and γ = 7 for M = 5. The circles denote the
numerical simulation adopting this statistically combined growth mechanism, which shows good coincidence with the asymptotic connectivity distribution of Eq. (11). For another example, in the case of changing m, the number of links (1 ≤ m ≤ M) connecting the new constituent family to the old nodes at each time step, the asymptotic exact connectivity distribution is obtained as

P(k) = P(M)·(Γ(M+3+(M^2-M)/m)/Γ(M))·(Γ(k)/Γ(k+3+(M^2-M)/m)) ; k ≥ M+1   (12)

where P(M-1) and P(M) are obtained by a procedure similar to that of Eq. (4). The exponent is γ = 3 + M(M-1)/m. Thus these evolutional family networks can evolve from γ = 2 + M (at m = M) to γ = 3 + M(M-1) (at m = 1) by changing m. The evolutional family networks can thus evolve through scale-free networks with a wide range of exponents and topologies. It should be noted that the asymptotic connectivity distribution fits well with numerical simulation even in relatively small networks, which suggests that such small networks can be expected to show the scale-free characteristics in broadened fields of the evolutional family networks. [Fig. 3: Statistical combination in the growth mechanism for a non-integer exponent of the evolutional family network; γ = 6.556.]
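The exponent formulas of this section can be cross-checked for mutual consistency with the full-mesh result γ = M + 2 (a simple sketch; the function names are ours):

```python
def gamma_combined(M, eps):
    """Exponent for the M / (M+1) statistically combined growth, Eq. (11)."""
    return 1 + (M + 1) * (M + 2 - 2 * eps) / (M + 1 - eps)

def gamma_links(M, m):
    """Exponent when each family adds m external links, Eq. (12)."""
    return 3 + M * (M - 1) / m

g = gamma_combined(4, 0.5)   # the Fig. 3 case: about 6.56
```

Both formulas collapse to the integer full-mesh exponents at their boundary parameter values, confirming that they interpolate rather than contradict the result of Section 2.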
4 Template for analysing evolutional family networks
So far it has seemed difficult to apply scale-free network science to the relatively small networks observed around us, because the network size should be at least several thousands of nodes to satisfy the statistical level of significance in analysing power-law relations. Indeed, the BA-model network with power-law exponent 3 shows the larger divergence of the numerically simulated connectivity distribution from the asymptotic solution in the distribution region below 10^-2, as shown in Fig. 2. However, it is much easier to satisfy the statistical criteria of significance when the connectivity distribution at smaller degrees in a real-world network is measurable with higher accuracy. [Fig. 4: Growth mechanism templates for evolutional family networks.] To demonstrate the usefulness of the growth mechanism templates, a numerically simulated connectivity distribution with a network size of 1000 is plotted on the template of Fig. 4: the white circles denote that of the full-mesh family network, and they clearly fit well to the growth mechanism template of the full-mesh family network with M = 4. The coincidence of growth mechanisms is the origin of the better fitting. There can be various modifications of the templates corresponding to generalization of both the growth mechanism and the probability of preferential attachment [Watts 1998, Barabasi 2001, Arenas 2001].
5 Tokyo Metropolitan Railway Network
Fig. 5. Connectivity distribution of the central part of the Tokyo Metropolitan Railway Network.
As a real-world network analysis by the template, the Tokyo metropolitan railway system described in reference [21] is analysed by a statistically combined growth mechanism with the line and loop family networks. Fig. 5 depicts the connectivity distribution of a central part of the Tokyo Metropolitan Railway System, of which the numbers of total stations and links are 736 and 1762, respectively. The number of links is counted topologically: that is, we count the number of links between Tokyo and Kanda as 1, even though we
have three double railways between them, as noted in reference [22]. The fitting with N = 3 and ε = 0.75 is excellent, which suggests that the growth mechanism coincides with that of the evolutional family networks: in the construction of a railway system, a number of stations are installed simultaneously as a group with a line or a loop topology. The exponent measured at the degree of 8 is 4, which coincides with that of the North American power grid [Barabasi 1999]. The number of nodes in the constituent line and loop networks is N = 3, which is reasonable for the central part of Tokyo given its complexity. Thus the Tokyo metropolitan railway network is found to be a real-world network suitable for fitting by the modified evolutional family network model with loop topology.
6 Discussions
The asymptotic connectivity distribution function of the evolutional family network satisfies the conditions of a statistical distribution function. The network parameters, such as the average degree ⟨k⟩, the standard deviation σ, the clustering coefficient [Satorras 2004] and the network diameter of the evolutional family networks, are listed in Table 1; they are calculated by numerical simulation with N0 = 1000. The standard deviation σ for M = 1, corresponding to the BA model, is infinite. The regularity of the network increases as M increases. The clustering coefficients increase as M increases, approximated by (1 - 2/M) for larger M. The diameter in Table 1 is the number of hops needed to reach all of the nodes from the largest hub node, which is compared with the generation function method of approximately counting the number of neighbours to estimate the average path length in reference [Newman 2001].

Table 1. Full-mesh family network features
M          | 1 | 2    | 4    | 8
⟨k⟩        | 2 | 3    | 5    | 9
σ/⟨k⟩      | ∞ | 0.82 | 0.37 | 0.18
clustering | 0 | 0.38 | —    | 0.78
diameter   | 9 | 7    | 6    | 5

These characteristic parameters suggest that the evolutional family networks evolve from the Barabasi-Albert scale-free network to a new class of scale-free networks, characterized by a larger clustering coefficient, small diameter and high regularity, with σ/⟨k⟩ decreasing as M increases. The evolutional family network is thus a new class of scale-free networks, different from the Watts-Strogatz small worlds even at the higher regularity of larger constituent family size M.
7 Conclusion
The group-entry growth mechanism with preferential attachment generates a new class of scale-free networks, of which the exact asymptotic connectivity distribution and generation function are derived. They evolve from aristocratic networks to egalitarian networks with asymptotic power-law exponent γ = 2 + M, depending on the size M and topology of the constituent groups. The asymptotic connectivity distribution fits very well with numerical simulation even in the region of smaller degrees. It is then demonstrated that small-size networks can be analysed to find their growth mechanism parameters using asymptotic connectivity distribution templates in the region of smaller degrees, where it is easy to satisfy a statistical level of significance. This approach is expected to open up a new search for scale-free networks in the real world. As an example of an evolutional family network in the real world, the Tokyo Metropolitan Railway Network is analysed.
Bibliography
[1] Barabasi, A.L., "Linked", A Plume Book (2002)
[2] Buchanan, M., "Nexus", Norton & Company Ltd., New York (2002)
[3] Newman, M., Barabasi, A.L. and Watts, D.J., "The Structure and Dynamics of Networks", Princeton Univ. Press (2006)
[4] Chapter 4, pp. 180-181, ibid.
[5] Mitchell, William J., "Me++", MIT Press (2003)
[6] Watts, D.J. and Strogatz, S.H., Collective dynamics of "small-world" networks, Nature 393, 440-442 (1998)
[7] Erdos, P. and Renyi, A., Publ. Math. Inst. Acad. Sci. 5, 17 (1960)
[8] Barabasi, A.L. and Albert, R., Emergence of scaling in random networks, Science 286, 509 (1999)
[9] Barabasi, A.L., Albert, R. and Jeong, H., Mean-field theory of scale-free random networks, Physica A272, 173 (1999)
[10] Barabasi, A.L., Ravasz, E. and Vicsek, T., Deterministic scale-free networks, Physica A299, 559-564 (2001)
[11] Arenas, A., Diaz-Guilera, A. and Guimera, R., Communication in networks with hierarchical branching, Phys. Rev. Lett. 86, 3196 (2001)
[12] Albert, R. and Barabasi, A.L., Topology of evolving networks, Phys. Rev. Lett. 85, 5234-5237 (2000)
[13] Mathias, N. and Gopal, V., Small worlds: how and why, Phys. Rev. E63, 021117 (2001)
[14] Dorogovtsev, S.N., Mendes, J.F.F. and Samukhin, A.N., Structure of growing networks with preferential linking, Phys. Rev. Lett. 85, 4633-4636 (2000)
[15] Krapivsky, P.L., Redner, S. and Leyvraz, F., Connectivity of growing random networks, Phys. Rev. Lett. 85, 4629 (2000)
[16] Goh, K.I., Kahng, B. and Kim, D., Universal behavior of load distribution in scale-free networks, Phys. Rev. Lett. 87, 278701 (2001)
[17] Alava, M.J. and Dorogovtsev, S.N., Complex networks created by aggregation, Phys. Rev. E71, 036107 (2005)
[18] Granovetter, M., The strength of weak ties, American Journal of Sociology 78, 1360-1380 (1973)
[19] Newman, M.E.J., Strogatz, S.H. and Watts, D.J., Scientific collaboration networks, Phys. Rev. E64, 026118 (2001)
[20] Moriguchi, S., Udagawa, K. and Ichimatsu, S., The Mathematical Formula II, Iwanami Pub. Co., Tokyo (1956)
[21] Rail Map of Tokyo (Shoubunsha Publications, 2004), ISBN 4-398-72008-1
[22] The number of stations with k = 1 includes some stations counted as end stations at the zone boundary of the central part of Tokyo.
[23] Satorras, R.P. and Vespignani, A., "Evolution and Structure of the Internet", Cambridge Univ. Press (2004)
Chapter 12
Estimating the dynamics of kernel-based evolving networks
Gabor Csardi
Center for Complex Systems Studies, Kalamazoo, MI, USA and Department of Biophysics, KFKI Research Institute for Particle and Nuclear Physics of the Hungarian Academy of Sciences, Budapest, Hungary
[email protected]
Katherine Strandburg
DePaul University - College of Law, Chicago, IL, USA
Laszlo Zalanyi
Department of Biophysics, KFKI Research Institute for Particle and Nuclear Physics of the Hungarian Academy of Sciences, Budapest, Hungary
Jan Tobochnik
Department of Physics and Center for Complex Systems Studies, Kalamazoo College, Kalamazoo, MI, USA
Peter Erdi
Center for Complex Systems Studies, Kalamazoo College, Kalamazoo, MI, USA
In this paper we present the application of a novel methodology to scientific citation and collaboration networks. This methodology is designed for understanding the governing dynamics of evolving networks and relies on an attachment kernel, a scalar function of node properties, that stochastically drives the addition and deletion of vertices and edges. We illustrate how the kernel function of a given network can be extracted from the history of the network and discuss other possible applications.
1 Introduction
The network representation of complex systems has been very successful. The key to this success is universality in at least two senses. First, the simplicity of representing complex systems as networks makes it possible to apply network theory to very different systems, ranging from the social structure of a group to the interactions of proteins in a cell. Second, these very different networks show universal structural traits such as the small-world property and the scale-free degree distribution [16, 3]. See [1, 13] for reviews of complex network research. Usually it is assumed that the life of most complex systems is defined by some (often hidden and unknown) underlying governing dynamics. These dynamics are the answer to the question 'How does it work?', and a fair share of scientific effort is devoted to uncovering them. In the network representation, the life of a (complex or not) system is modeled as an evolving graph: sometimes new vertices are introduced to the system while others are removed, new edges are formed, others break, and all these events are governed by the underlying dynamics. See [5, 12, 2] for data-driven network evolution studies. This paper is organized as follows. In Section 2 we define a framework for studying the dynamics of two types of evolving networks and show how these dynamics can be measured from the data. In Section 3 we present two applications, and finally in Section 4 we discuss our results and other possible applications.
2 Modeling evolving networks by attachment kernels
In this section we introduce a framework in which the underlying dynamics of evolving networks can be estimated from knowledge of the time dependence of the evolving network. This framework is a discrete time model, where time is measured by the different events happening in the network. An event is a structural change: vertex and/or edge additions and/or deletions. The interpretation of an event depends on the system we're studying; see Section 3 of this paper for two examples. The basic assumption of the model is that edge additions depend on some properties of the vertices of the network. This property can be a structural one, such as the degree of a vertex or its clustering coefficient, but also an intrinsic one, such as the age of a person in a social network or her yearly net income. The model is independent of the meaning of these properties. The vertex properties drive the evolution of the network stochastically through an attachment kernel, a function giving the probabilities for any new edges which might be added to the network. See [9] for another possible application of attachment kernels. In this paper we specify the model framework for two special kinds of networks, citation and non-decaying networks; more general results will be published in forthcoming publications.
2.1 Citation networks
Citation networks are special evolving networks. In a citation network, in each time step (event) a single new node is added to the network together with its edges (citations). Edges between "old" nodes are never introduced, and there are no edge or vertex deletions either. For simplicity let us assume that the attachment kernel A(·) depends on only one property of the potentially cited vertices: their degree. (The formalism can be generalized easily to include other properties as well.) We assume that the probability that at time step t an edge e of a new node will attach to an old node i with degree d_i is given by
Pr[e cites i] = A(d_i(t)) / Σ_{k=1}^{t} A(d_k(t))   (1)
The denominator is simply the sum of the attachment kernel function evaluated for every node of the network in the current time step. With this simple equation the model framework for citation networks is defined: we assume that in each time step a single new node is attached to the network and that it cites other, older nodes with the probability given by (1). For a given citation network we can use this model to estimate the form of the kernel function based on data about the history of the network. In this paper we only give an overview of this estimation process; please see [7] for the details. Based on (1), the probability that an edge e of a new node at time t cites an old node with degree d is given by
Pr[e cites a d-degree node] = P_e(d) = A(d)·N_d(t) / S(t) ,   S(t) = Σ_{k=1}^{t} A(d_k(t))   (2)
N_d(t) is the number of d-degree nodes in the network in time step t. From here we can extract the A(d) kernel function:

A(d) = P_e(d)·S(t) / N_d(t)   (3)
If we know S(t) and N_d(t), then by estimating P_e(d) based on the network data we have an estimate for A(d) via (3), and by doing this for each edge and degree d, in practice we can obtain a reasonable approximation of the A(d) function for most d values. (Of course we cannot estimate A(d) for those degrees which were never present in the network.) It is easy to calculate N_d(t), so the only piece missing for the estimation is S(t); however, this is defined in terms of the measured A(d) function. We can use an iterative approach to make better and better approximations of A(d) and S(t). First we assume that S_0(t) = 1 for each t and measure A_0(d), which can be used to calculate the next approximation of S(t), namely S_1(t), yielding A_1(d) via the measurement, etc. It can be shown that this procedure is convergent, and in practice it converges quickly: after five iterations the difference between the successive estimates A_n(d) and A_{n+1}(d) is very small.
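The iterative estimation can be sketched on a synthetic citation history generated with a known linear kernel A(d) = d + 1. The data layout and function names below are ours, not the authors'; this is a rough illustration of Eqs. (2)-(3), not the paper's implementation:

```python
import random
from collections import defaultdict

def simulate_citations(T, m, seed=1):
    """Grow a toy citation network with the known kernel A(d) = d + 1.
    Each event records the degree histogram before the step and the
    degrees of the nodes cited by the entering node."""
    rng = random.Random(seed)
    deg = [0]                              # node 0 seeds the network
    history = []
    for _ in range(T):
        hist = defaultdict(int)
        for d in deg:
            hist[d] += 1
        weights = [d + 1 for d in deg]     # weights held fixed within the step
        cited = []
        for _ in range(m):
            i = rng.choices(range(len(deg)), weights=weights)[0]
            cited.append(deg[i])
            deg[i] += 1
        history.append((dict(hist), cited))
        deg.append(m)                      # the new node enters with degree m
    return history

def estimate_kernel(history, iterations=5):
    """Eqs. (2)-(3): start from S(t) = 1, measure A(d), recompute S(t), repeat."""
    A = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        num, den = defaultdict(float), defaultdict(float)
        for hist, cited in history:
            S = sum(A[d] * n for d, n in hist.items())
            for d, n in hist.items():      # expected citations to degree-d nodes
                den[d] += len(cited) * n / S
            for d in cited:                # observed citations to degree-d nodes
                num[d] += 1.0
        A = defaultdict(lambda: 1.0,
                        {d: num[d] / den[d] for d in num if den[d] > 0})
    return A

A = estimate_kernel(simulate_citations(T=2000, m=3))
ratio = A[9] / A[3]   # the true linear kernel gives (9+1)/(3+1) = 2.5
```

The estimated kernel is defined only up to an overall constant, which is why the check is on a ratio of kernel values rather than on the values themselves.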
2.2 Non-decaying networks
Non-decaying networks are more general than citation networks because connections can be formed between older nodes as well. It is still true, however, that neither edges nor nodes are ever removed from the network. Similarly to the previous section, we assume that the attachment kernel depends on the degree of the vertices, but this time on the degrees of both vertices involved in the potential connection. The probability of forming an edge between nodes i and j in time step t is given by

Pr[i connects j] = (1 - a_ij(t))·A(d_i(t), d_j(t)) / S(t)   (4)
The denominator is the sum of the attachment kernel function applied to all possible (not yet realized) edges in the network; a_kl(t) is 1 if there is an edge between nodes k and l in time step t and 0 otherwise. Using an argument similar to that of the previous section we can estimate A(d*, d**) via

Pr[e connects d*- and d**-degree nodes] = P_e(d*, d**) = A(d*, d**)·N_{d*,d**}(t) / S(t)   (5)
N_{d*,d**}(t) is the number of not yet realized edges between d*- and d**-degree nodes in time step t, and

S(t) = Σ_{k=1}^{N(t)} Σ_{l≠k} (1 - a_kl(t))·A(d_k(t), d_l(t))   (6)

A(d*, d**) = P_e(d*, d**)·S(t) / N_{d*,d**}(t)   (7)
S(t) can be approximated using an iterative approach similar to that introduced in the previous section.
3 Applications
In this section we briefly present results for two applications of the model framework and measurement method. For other applications and details see [6, 7].
3.1 Preferential attachment in citation networks
The preferential attachment model [3, 8] gives a mechanism to generate the scale-free degree distribution often found in various networks. In our framework for citation networks it simply means that the kernel function depends linearly on the degree:
A(d) = d + a ,   (8)
where a is a constant. By using our measurement method, it is possible to measure the kernel function based on node degree for various citation networks and check whether they evolve based on this simple principle. Let us first consider the network of high-energy physics papers from the arXiv e-print archive. We used data for papers submitted between January 1992 and July 2003, which included 28632 papers and 367790 citations among them. The data is available online at http://www.cs.cornell.edu/projects/kddcup/datasets.html. This dataset and other scientific citation networks are well studied; see [15, 10] for examples. First we applied the measurement method based on the node degree to this network and found that indeed the attachment kernel of the network is close to the one predicted by the preferential attachment model, that is,
(9)

gives a reasonably good fit to the data. See the measured form of the kernel in Fig. 1. The small exponent for d is in good agreement with the fact that the degree distribution of this network decays faster than a power law. Next, we applied the measurement method using two properties of the potentially cited nodes: their degree and age, the latter simply defined as the difference between the current time step and the time step when the node was added. We found that the two-variable A(d, a) attachment kernel has the following form:

(10)

This two-variable attachment kernel gives a better understanding of the dynamics of this network: the citation probability increases about linearly with the degree of the nodes and decreases as a power law with their age. Note that these two effects were both present in the degree-only dependent A* attachment kernel; this is why the preferential attachment exponent was smaller there (0.85 < 1.14). Similar results were obtained for the citation network of US patents granted between 1975 and 1999, containing 2,151,314 vertices and 10,565,431 edges:

A**_patent(d, a) = (d^1.2 + 1)·a^(-1.6)   (11)
These two studies show that the preferential attachment phenomenon can be present in a network even if the network does not have a power-law degree distribution, because there is another process (aging, in our case) which prevents nodes from gaining very many edges.
3.2 The dynamics of scientific collaboration networks
In this section we briefly present the results of applying our methods to a non-decaying network: the cond-mat collaboration network. In this network a node is a researcher who published at least one paper in the arXiv cond-mat archive between
Figure 1: The two measured kernel functions for the HEP citation network. The left plot shows the degree-dependent kernel, the right the degree- and age-dependent kernel. On the right plot four sections along the degree axis are shown, for different vertex ages.
1970 and 1997 (this is the date when the paper was submitted to cond-mat, not the actual publication date, but most of the time these two are almost the same). There is an edge between two researchers/nodes if they have published at least one paper together. The data set contains 23708 papers, 17636 authors and 59894 edges. We measured the attachment kernel for this network based on the degrees of the two potential neighbors. See Fig. 2 for the A_cond-mat(d*, d**) function.
Figure 2: The attachment kernel for the cond-mat collaboration network; the surface plot was smoothed by applying a double exponential smoothing kernel. The right plot has logarithmic axes. It shows that the kernel function has high values for zero-degree nodes; this might be because a new researcher will usually write a paper with collaborators and thus will have a high probability of adding links to the network.
We tried to fit various functional forms to the two-dimensional attachment kernel function to check which is a better description of the dynamics. See Table 1 for the functional forms and the results.
Table 1: Four optimization methods were run for each functional form to minimize the least-squares difference: BFGS, Nelder-Mead, CG and SANN; the results of the best fits are included in the table. See [14, 4] for the details of these methods.
The best fit was obtained by

A'_cond-mat(d*, d**) = c1·(d*·d**)^c2 + c3   (12)
where the c_i are constants. See [2, 11] for other studies on collaboration networks.
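The least-squares fitting of such functional forms can be illustrated on synthetic data generated from the form in Eq. (12). The constants below are invented for illustration, and a coarse grid search stands in for the BFGS/Nelder-Mead/CG/SANN runs of Table 1:

```python
def model(c1, c2, c3, d1, d2):
    """Kernel surface of the form c1 * (d1*d2)**c2 + c3, as in Eq. (12)."""
    return c1 * (d1 * d2) ** c2 + c3

# synthetic kernel surface with known, invented constants
true = (0.5, 0.8, 2.0)
data = [(d1, d2, model(*true, d1, d2))
        for d1 in range(1, 20) for d2 in range(1, 20)]

def sse(params):
    """Sum of squared errors of a candidate parameter triple over the surface."""
    return sum((model(*params, d1, d2) - y) ** 2 for d1, d2, y in data)

# coarse grid search as a stand-in for a proper numerical optimizer
grid = [(c1 / 10, c2 / 10, c3 / 2)
        for c1 in range(1, 11) for c2 in range(1, 16) for c3 in range(0, 9)]
best = min(grid, key=sse)
```

On noise-free data the grid search recovers the generating constants exactly; on a measured kernel, one would instead compare the minimized SSE across competing functional forms, as the authors do.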
4 Discussion
We have briefly presented a methodology for understanding the evolution of networks through kernel functions and showed how the kernel functions can be extracted from network data. We discussed two applications of this methodology: first the "fitting" of the preferential attachment model to a network of scientific citations, and then determining how the evolution of a scientific collaboration network depends on the degrees of the vertices. The methodology outlined here is general and can be successfully applied to any kind of evolving network for which time-dependent data is available. By defining the kernel function in terms of the potentially important vertex properties one can check whether these properties really influence network evolution significantly: if a kernel function is not sensitive to one of its arguments, that suggests that this argument does not make an important contribution. Another possible application would be to identify changes in the dynamics of a system by doing the measurements in sliding time windows; see [6] for an example.
5 Acknowledgement
This work was funded in part by the EU FP6 Programme under grant numbers IST-4-027173-STP and IST-4-027819-IP and by the Henry R. Luce Foundation. The authors also thank Mark Newman for providing the cond-mat data set.
Bibliography
[1] ALBERT, Reka, and Albert-Laszlo BARABASI, "Statistical mechanics of complex networks", Reviews of Modern Physics 74 (2002), 47.
[2] BARABASI, A.L., H. JEONG, Z. NEDA, E. RAVASZ, A. SCHUBERT, and T. VICSEK, "Evolution of the social network of scientific collaborations", Physica A 311 (2002), 590-614.
[3] BARABASI, Albert-Laszlo, and Reka ALBERT, "Emergence of scaling in random networks", Science 286, 5439 (1999), 509-512.
[4] BELISLE, C.J.P., "Convergence theorems for a class of simulated annealing algorithms on R^d", Journal of Applied Probability 29 (1992), 885-895.
[5] BERG, Johannes, Michael LASSIG, and Andreas WAGNER, "Structure and evolution of protein interaction networks: a statistical model for link dynamics and gene duplications", BMC Evolutionary Biology 4 (2004), 51.
[6] CSARDI, Gabor, "Dynamics of citation networks", Proceedings of the Conference on Artificial Neural Networks (2006).
[7] CSARDI, Gabor, Katherine J. STRANDBURG, Laszlo ZALANYI, Jan TOBOCHNIK, and Peter ERDI, "Modeling innovation by a kinetic description of the patent system", Physica A 374 (2007), 783-793.
[8] JEONG, Hawoong, Zoltan NEDA, and Albert-Laszlo BARABASI, "Measuring preferential attachment for evolving networks", Europhys. Lett. 61 (2003), 567-572.
[9] KRAPIVSKY, P.L., S. REDNER, and F. LEYVRAZ, "Connectivity of growing random networks", Physical Review Letters 85 (2000), 4629-4632.
[10] LEHMANN, S., B. LAUTRUP, and A.D. JACKSON, "Citation networks in high energy physics", Physical Review E 68 (2003), 026113.
[11] NEWMAN, M.E.J., "Scientific collaboration networks. I. Network construction and fundamental results", Physical Review E 64 (2001), 016131.
[12] NEWMAN, M.E.J., "Clustering and preferential attachment in growing networks", Physical Review E 64 (2001), 025102.
[13] NEWMAN, M.E.J., "The structure and function of complex networks", SIAM Review 45 (2003), 167-256.
[14] NOCEDAL, J., and S.J. WRIGHT, Numerical Optimization, Springer (1999).
[15] REDNER, Sidney, "Citation statistics from 110 years of physical review", Physics Today 58 (2005), 49.
[16] WATTS, Duncan J., and Steven H. STROGATZ, "Collective dynamics of small-world networks", Nature 393 (1998), 440-442.
Chapter 13
Consensus Problems on Small World Graphs: A Structural Study
Pedram Hovareshti and John S. Baras 1
Department of Electrical and Computer Engineering and the Institute for Systems Research, University of Maryland College Park
Consensus problems arise in many instances of collaborative control of multi-agent complex systems, where it is important for the agents to act in coordination with the other agents. To reach coordination, agents need to share information. In large groups of agents the information sharing should be local in some sense, due to energy limitations, reliability, and other constraints. A consensus protocol is an iterative method that provides the group with a common coordination variable. However, local information exchange limits the speed of convergence of such protocols. Therefore, in order to achieve high convergence speed, we should be able to design appropriate network topologies. A reasonable conjecture is that small world graphs should result in good convergence speed for consensus problems because their low average pairwise path length should speed the diffusion of information in the system. In this paper we address this conjecture by simulations and also by studying the spectral properties of a class of matrices corresponding to consensus problems on small world graphs.
1 Introduction
Consensus problems arise in many instances of collaborative control of multi-agent complex systems, where it is important for the agents to act in coordination with the other agents [11, 1, 4, 8, 5, 14, 9]. In this paper we consider Vicsek's model for leaderless coordination and reaching consensus [4, 11], in which at each time instant each agent's state variable is updated using a local rule based on the average of its own state variable plus the state variables of its neighbors at that time. The local neighborhoods are time dependent in general. Each agent's dynamics can be represented as:

¹The material is based upon work supported by the National Aeronautics and Space Administration under award No. NCC8235.
θ_i(t + 1) = (1 / (1 + n_i(t))) (θ_i(t) + Σ_{j ∈ N_i(t)} θ_j(t))    (1)

Here N_i(t) denotes the set of neighbors of agent i at time t and n_i(t) denotes the cardinality of this set. The dynamics of the system can be written in matrix form. Let 𝒢 be the set of possible graphs on n vertices. Let P be a suitably defined set that indexes 𝒢 and let p ∈ P. For each G_p ∈ 𝒢 define a corresponding F-matrix as:
F_p = (I + D_p)^{-1} (I + A_p)    (2)

where A_p is the adjacency matrix of the graph G_p and D_p is the diagonal matrix whose ith diagonal element is the degree of vertex i. This way the simplified Vicsek model is represented as a switched linear system whose switching signal takes values in a set of indices that parameterize the set of underlying graphs.
θ(t + 1) = F_{σ(t)} θ(t)    (3)
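As an illustration, the update (1)/(3) can be simulated directly. The following sketch (node counts, initial states, and helper names are our own illustrative assumptions, not from the paper) runs the fixed-topology protocol on a ring lattice C(n, k); for a regular graph F is doubly stochastic, so all agents converge to the average of the initial states.

```python
# Sketch of the simplified Vicsek update (1)/(3) on a fixed ring lattice
# C(n, k). Parameter values are illustrative assumptions.
import math

def ring_lattice_neighbors(n, k):
    """Neighbors of each vertex in C(n, k): range-k ring, self loop excluded."""
    return [[(i + d) % n for d in range(-k, k + 1) if d != 0] for i in range(n)]

def consensus_step(theta, nbrs):
    """theta_i(t+1) = (theta_i(t) + sum over neighbors) / (1 + n_i)."""
    return [(theta[i] + sum(theta[j] for j in nbrs[i])) / (1 + len(nbrs[i]))
            for i in range(len(theta))]

n, k = 30, 2
nbrs = ring_lattice_neighbors(n, k)
theta = [math.sin(7.0 * i) for i in range(n)]   # arbitrary initial states
avg0 = sum(theta) / n
for _ in range(2000):
    theta = consensus_step(theta, nbrs)

spread = max(theta) - min(theta)
# For a regular graph F is doubly stochastic, so the initial average is
# preserved and is the consensus value.
print(round(avg0, 6), round(spread, 8))
```

The number of iterations needed for a given accuracy is governed by the SLEM discussed in the next section.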
The F matrices are a class of stochastic matrices, and convergence of consensus protocols depends on properties of their infinite products. In this way linear consensus schemes are closely related to Markov chains and random walks on graphs with self loops. Different connectivity assumptions (symmetric vs. asymmetric neighborhoods) as well as different topology assumptions (fixed vs. changing) result in different sufficient conditions for convergence of consensus problems, which can be found in [4, 3, 1] and references therein. In this paper we limit our scope to symmetric topologies, for which being connected is a sufficient condition for convergence; in fact there exist even less restrictive assumptions for convergence. This paper addresses the convergence speed and the effects of structural properties of graphs on the performance of consensus protocols. After discussing measures of convergence speed, we study the convergence of consensus protocols for a class of complex networks known as small world graphs [13], leading us to propose design guidelines for reaching consensus fast. We examine the conjecture that dynamical systems coupled in this way would display enhanced signal propagation and global coordination, compared to regular lattices of the same size. The intuition is that the short paths between distant parts of the network cause high speed spreading of information, which may result in fast global coordination. To the best of our knowledge, the only existing result in the literature
about the convergence of consensus problems on small world graphs is a very recent paper of Olfati-Saber [7], which concerns continuous time consensus protocols and contains some conjectures on the second largest eigenvalue of the Laplacian of small world graphs. The organization of the paper is as follows. First, we use Perron-Frobenius theory of nonnegative matrices to show that the Second Largest Eigenvalue Modulus (SLEM) of the corresponding F matrix is a good measure of the convergence speed of the consensus protocol. Then we study the convergence speed for small world graphs and try to find design guidelines. We use simulations to show a drastic improvement of convergence speed for small values of φ, and use graph spectral methods to reason about this behavior.
2 Speed of convergence in fixed and changing topologies
A very important issue in consensus problems is the speed of convergence. The faster the consensus is reached, the better the performance of the protocol. Since the applications that use consensus protocols involve many agents, it is necessary for all of them to converge quickly. The convergence rate is a function of the topology of the underlying graphs. This problem is actually in close connection with the asymptotic behavior of Markov chains. In fact, if we consider a fixed topology, the convergence rate of the consensus protocol is nothing but the convergence rate to the stationary distribution of the Markov chain corresponding to the stochastic matrix F. Consider the system:
θ(t + 1) = F_{σ(t)} θ(t)    (4)
as before, where the F_p = (I + D_p)^{-1} (I + A_p) are stochastic matrices with nonzero diagonal elements. In the case of a fixed graph topology, the second largest eigenvalue modulus (SLEM) of the corresponding F matrix determines the convergence speed. This is because
θ(∞) − θ(t) = (F^∞ − F^t) θ(0)    (5)
Since F is a primitive stochastic matrix, according to the Perron-Frobenius theorem [10], λ_1 = 1 is a simple eigenvalue with right eigenvector 1 and left eigenvector π such that π^T 1 = 1 and F^∞ = 1 π^T. If λ_2, λ_3, ..., λ_r are the other eigenvalues of F, ordered such that λ_1 = 1 > |λ_2| ≥ |λ_3| ≥ ... ≥ |λ_r|, and m_2 is the algebraic multiplicity of λ_2, then
‖θ(t) − θ(∞)‖ = O(t^{m_2 − 1} |λ_2|^t)    (6)

where O(f(t)) represents a function of t such that there exist α, β ∈ R, with 0 < α ≤ β < ∞, such that α f(t) ≤ O(f(t)) ≤ β f(t) for all t sufficiently large. This shows that the convergence of the consensus protocol is geometric, with
relative speed equal to the SLEM. We denote by μ = 1 − SLEM(G) the spectral gap of a graph, so graphs with larger spectral gaps converge more quickly. For the general case where topology changes are also included, Blondel et al. [1] showed that the joint spectral radius of a set of reduced matrices derived from the corresponding F matrices determines the convergence speed. For a set Σ of finite n × n matrices, the joint spectral radius is defined as:
ρ(Σ) = lim sup_{t → ∞} max_{A_1, ..., A_t ∈ Σ} ‖A_1 A_2 ⋯ A_t‖^{1/t}    (7)

Calculation of the joint spectral radius of a set of matrices is a mathematically hard problem and is not tractable for large sets of matrices. Our goal is to find network topologies which result in good convergence rates. Switching over such topologies will also result in good convergence speed. We limit our scope to the case of fixed topology here and examine the conjecture that small world graphs have high convergence speed.
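For a fixed topology the SLEM can be estimated without a full eigendecomposition. The sketch below (a deflated power iteration; the function name and parameters are our own illustrative assumptions, not code from the paper) exploits F^∞ = 1 π^T, with π_i proportional to 1 + d_i, so that iterating with F − 1 π^T isolates the second largest eigenvalue modulus. It is checked against the circulant closed form for C(n, k) derived in Section 3.1.

```python
import math

def slem_estimate(nbrs, iters=400):
    """Power iteration on F - 1*pi^T for F = (I + D)^(-1)(I + A).
    Matrix-free sketch; pi (with pi_i proportional to 1 + d_i) is the
    left Perron eigenvector of F, so subtracting 1*(pi^T x) deflates
    the eigenvalue lambda_1 = 1."""
    n = len(nbrs)
    w = [1 + len(nb) for nb in nbrs]            # 1 + d_i
    tot = float(sum(w))
    pi = [wi / tot for wi in w]
    x = [math.sin(i + 0.5) for i in range(n)]   # generic start vector
    lam = 1.0
    for _ in range(iters):
        fx = [(x[i] + sum(x[j] for j in nbrs[i])) / w[i] for i in range(n)]
        m = sum(pi[i] * x[i] for i in range(n))
        y = [v - m for v in fx]                 # (F - 1*pi^T) x
        lam = math.sqrt(sum(v * v for v in y))
        x = [v / lam for v in y]
    return lam

# Sanity check against the circulant closed form for C(n, k):
n, k = 30, 2
nbrs = [[(i + d) % n for d in range(-k, k + 1) if d != 0] for i in range(n)]
closed_form = (1 + 2 * sum(math.cos(2 * math.pi * m / n)
                           for m in range(1, k + 1))) / (2 * k + 1)
est = slem_estimate(nbrs)
print(round(est, 6), round(closed_form, 6))
```

The deflation step works for any connected graph with self loops, not only regular ones, since π^T F = π^T holds in general for this F.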
3 Convergence in small world graphs
Watts and Strogatz [12] introduced and studied a simple tunable model that can explain the behavior of many real world complex networks. Their small world model takes a regular lattice and replaces the original edges by random ones with some probability 0 ≤ φ ≤ 1. It is conjectured that dynamical systems coupled in this way would display enhanced signal propagation and global coordination, compared to regular lattices of the same size. The intuition is that the short paths between distant parts of the network cause high speed spreading of information, which may result in fast global coordination. We examine this conjecture. In this study, we use a variant of the Newman-Moore-Watts [6] improved form of the φ-model originally proposed by Watts and Strogatz. The model starts with a ring of n nodes, each connected by undirected edges to its nearest neighbors up to a range k. Shortcut links are added (rather than rewired) between randomly selected pairs of nodes, with probability φ per link on the underlying lattice; thus there are typically nkφ shortcuts. Here we actually force the number of shortcuts to be equal to nkφ (comparable to the Watts φ-model). In our study, we have considered different initial rings (n, k) = (100, 2), (200, 3), (500, 3), (1000, 5), and generated 20 samples of small world graphs G(φ) for 50 different φ values chosen on a logarithmic scale between 0.01 and 1. These choices of (n, k) were selected for comparison purposes with the results of [7]. In Figures 1 and 2, we have depicted the gain in spectral gap of the resulting small world graphs with respect to the spectral gap of the base lattice. We discuss below only the results for the cases (500, 3) and (1000, 5); the others follow a similar pattern. Some important observations and comments follow:
1. In the low range of φ (0 < φ < 0.01) there is no spectral gap gain observed and the SLEM is almost constant; a drastic increase in the spectral gap is observed around φ = 0.1.
Figure 1: Spectral gap gain for (n,k) = (500,3)
Figure 2: Spectral gap gain for (n, k) = (1000, 5)
2. Simulations show that small world graphs possess good convergence properties as far as consensus protocols are concerned. Some analytical results are included in the next section, but the complete analysis is the subject of future work. The results show that adding nkφ shortcuts to a 1-D lattice dramatically improves the convergence properties of consensus schemes for φ ≥ 0.1. For example, in a (500, 3) lattice, by randomly adding 150 edges, we can on average increase the spectral gap approximately by a factor of 100. However, our aim is to find a more clever way of adding edges, so that after adding 150 edges to a (500, 3) lattice we get a much larger increase in the spectral gap. To formulate this problem, we consider a dynamic graph which evolves in time starting from a 1-D lattice G_0 = C(n, k). Let us denote the complete graph on n vertices by K_n. Also, denote the complement of a graph G = (V, E) (which is the graph with the same vertex set but whose edge set consists of the edges not present in G) by Ḡ, so E(Ḡ) = E(K_n) \ E(G). If we denote the operation of adding an edge e to a graph by G ⊕ e, the dynamic graph evolution can be written as G(t + 1) = G(t) ⊕ e(t + 1), with e(t + 1) ∈ E(Ḡ(t)).
So, now the problem to solve is:

min_{e(1), ..., e(nkφ) ∈ E(Ḡ(t))} max [λ_2(F(nkφ)), −λ_N(F(nkφ))]    (8)

subject to:

G(t + 1) = G(t) ⊕ e(t + 1)    (9)

where F(nkφ) = D(G(nkφ))^{-1} A(G(nkφ)). We will now mention some observations which are useful to build a framework for studying the above problem.
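The qualitative effect reported in Figures 1 and 2 can be reproduced in miniature. The following sketch (sizes, seed, and iteration counts are our own assumptions, not the paper's experimental setup) builds C(n, k), adds nkφ random shortcuts in the Newman-Moore-Watts fashion, and compares spectral gaps estimated by a deflated power iteration based on F^∞ = 1 π^T, with π_i proportional to 1 + d_i.

```python
import math, random

def slem(nbrs, iters=3000):
    """Deflated power iteration for SLEM of F = (I + D)^(-1)(I + A)."""
    n = len(nbrs)
    w = [1 + len(nb) for nb in nbrs]
    tot = float(sum(w))
    pi = [wi / tot for wi in w]                 # left Perron eigenvector of F
    x = [math.sin(i + 0.5) for i in range(n)]
    lam = 1.0
    for _ in range(iters):
        fx = [(x[i] + sum(x[j] for j in nbrs[i])) / w[i] for i in range(n)]
        m = sum(pi[i] * x[i] for i in range(n))
        y = [v - m for v in fx]
        lam = math.sqrt(sum(v * v for v in y))
        x = [v / lam for v in y]
    return lam

n, k, phi = 100, 2, 0.2
ring = [set((i + d) % n for d in range(-k, k + 1) if d != 0) for i in range(n)]
gap_ring = 1.0 - slem([sorted(s) for s in ring])

random.seed(1)
sw = [set(s) for s in ring]
added = 0
while added < int(n * k * phi):                 # force exactly nk*phi shortcuts
    u, v = random.randrange(n), random.randrange(n)
    if u != v and v not in sw[u]:
        sw[u].add(v); sw[v].add(u); added += 1
gap_sw = 1.0 - slem([sorted(s) for s in sw])

print("spectral gap gain ~", gap_sw / gap_ring)
```

Even this small instance shows a clear gap gain over the base lattice once the shortcuts are in place.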
103
3.1 Spectral analysis
The choice of G_0 = C(n, k) to be a regular 1-D lattice with self loops means that (possibly after re-labelling vertices) the adjacency matrix of the graph can be written as a circulant matrix:

    A = | a_1  a_n      a_{n-1}  ...  a_2 |
        | a_2  a_1      a_n      ...  a_3 |
        | a_3  a_2      a_1      ...  a_4 |  = circ[a_1, a_2, ..., a_n]    (10)
        | ...  ...      ...      ...  ... |
        | a_n  a_{n-1}  a_{n-2}  ...  a_1 |

in which:

    a = [a_1, a_2, ..., a_n] = [1, ..., 1 (k+1 ones), 0, ..., 0 (n − 2k − 1 zeros), 1, ..., 1 (k ones)]    (11)
Circulant matrices have a special structure which provides them with special properties. All entries in a given diagonal are the same, and each row is determined by the previous row by a shift to the right (modulo n). Consider the n × n permutation matrix Π = circ[0, 1, 0, ..., 0]. Then any circulant matrix can be written A = circ[a_1, a_2, ..., a_n] = a_1 I + a_2 Π + ... + a_n Π^{n−1}. For a vector a = [a_1, a_2, ..., a_n], the polynomial P_a(z) = a_1 + a_2 z + a_3 z^2 + ... + a_n z^{n−1} is called the representer of the circulant. The following theorem, based on [2], states how to calculate the eigenvalues of circulants.
Theorem 3.1 [2] Let ω = e^{2πi/n} be the nth root of unity. The eigenvalues of A = circ[a_1, a_2, ..., a_n] are given by λ_i = P_a(ω^{i−1}), where i = 1, 2, ..., n.
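Theorem 3.1 is easy to check numerically. The sketch below (a hypothetical verification script, not code from the paper) builds the circulant of (10)-(11) for a small C(n, k), scaled as in the F matrix, evaluates the representer at powers of ω, and verifies A v = λ v for the corresponding Fourier eigenvectors.

```python
import cmath

def circulant(a):
    """A[i][j] = a[(i - j) % n], the circ[a_1, ..., a_n] of (10)."""
    n = len(a)
    return [[a[(i - j) % n] for j in range(n)] for i in range(n)]

def representer(a, z):
    """P_a(z) = a_1 + a_2 z + ... + a_n z^(n-1)."""
    return sum(aj * z**j for j, aj in enumerate(a))

# The vector (11) for C(n, k) with self loops, scaled by 1/(2k+1) as in F:
n, k = 12, 2
a = [1.0] * (k + 1) + [0.0] * (n - 2 * k - 1) + [1.0] * k
a = [v / (2 * k + 1) for v in a]
A = circulant(a)
w = cmath.exp(2j * cmath.pi / n)

# Check A v = lambda v with lambda = P_a(w^q) and eigenvector v_c = w^(-q c):
for q in range(n):
    lam = representer(a, w**q)
    v = [w**(-q * c) for c in range(n)]
    Av = [sum(A[r][c] * v[c] for c in range(n)) for r in range(n)]
    err = max(abs(Av[r] - lam * v[r]) for r in range(n))
    assert err < 1e-9
print("Theorem 3.1 verified for C(%d, %d)" % (n, k))
```

Since a is symmetric (a_m = a_{n−m+2}), the eigenvalues come out real, and P_a(ω) = P_a(ω^{n−1}), consistent with the multiplicity claim of Proposition 3.1.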
The main result concerning the spectral properties of G_0 follows.
Proposition 3.1 The corresponding F matrix of G_0 = C(n, k) is circulant. Furthermore, its SLEM has multiplicity at least 2.

Sketch of Proof: Since G_0 = C(n, k) is (2k+1)-regular (including the self loop), F = D^{-1} A = (1/(2k+1)) A. So F is circulant, F = circ((1/(2k+1)) a), where a is as in (11). The representer of this circulant is

P(z) = (1/(2k+1)) (1 + z + ... + z^k + z^{n−k} + ... + z^{n−1})
So, the eigenvalues of this matrix are λ_i = P(ω^{i−1}). It is easy to show that λ_1 = 1, and moreover it is a simple eigenvalue because the underlying graph is connected. Since for integers A and B, ω^{An+B} = ω^B, it follows that λ_2 = λ_n, λ_3 = λ_{n−1}, and so on. In the case that n is odd, apart from λ_1 = 1, all eigenvalues come in pairs. In the case that n is even, it can be shown that λ_{n/2+1} is the only
Figure 3: Adding a shortcut to (1000, 5). The line tangent to the curve shows the SLEM before the new edge
Figure 4: The optimal topology; adding 2 shortcuts to C(16, 2)
eigenvalue apart from λ_1 which can be simple; however, direct calculation shows that its modulus is 1/(2k+1), which is clearly less than λ_2 = λ_n. A simple geometric argument shows that SLEM = λ_2 = λ_n = (1/(2k+1)) [1 + 2Re(ω) + 2Re(ω^2) + ... + 2Re(ω^k)] < 1 and |λ_i| ≤ λ_2 for i ∈ {2, ..., n − 1}. This shows that for the case where k ≪ n, which is the case we are most interested in, as n → ∞ two of the non-unity eigenvalues approach 1. This explains the slow convergence of consensus protocols when the diameter is large.
4 Simulation results: The effect of adding one and two shortcuts
We ran a set of simulations with different purposes based on (8). A counterintuitive result is that the SLEM does not change monotonically with the addition of edges. Specifically, in cases when n is even, adding an edge will increase the SLEM except in the case where a vertex is connected to the farthest vertex from it, that is, i is connected to i + n/2 (modulo n). In this case one of the multiplicities of the SLEM is decreased but the other multiplicity is not changed. Figures 3 and 4 illustrate this effect. The dotted line tangent to the curves shows the SLEM of the original curves. The more distant the two joined vertices, the smaller the increase in the SLEM. Adding two shortcuts can, however, decrease the SLEM. It is worthwhile to mention that in all of our simulations, for a given n, shortcuts that reduced the diameter of the graph more resulted in a larger spectral gap. For example, for the case of adding 2 shortcuts to G_0 = C(16, 2), Figure 4 shows the optimal topology. The analysis of this conjecture is the subject of future work.

In this paper, we presented a structural study of the convergence of consensus problems on small world graphs. Simulations and some preliminary analytical
results were presented. Our future work focuses on the analytical study of small world phenomena in consensus problems.
Bibliography

[1] V. Blondel, J. Hendrickx, A. Olshevsky, and J. Tsitsiklis. Convergence in multiagent coordination, consensus and flocking. Proceedings 44th IEEE Conference on Decision and Control, pp. 2996-3000, 2005.

[2] P. J. Davis. Circulant Matrices. Wiley, 1979.

[3] L. Fang and P. Antsaklis. On communication requirements for multi-agent consensus seeking. Proceedings of Workshop NESC05, Lecture Notes in Control and Information Sciences (LNCIS), Springer, 331:53-68, 2006.

[4] A. Jadbabaie, J. Lin, and A. S. Morse. Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Transactions on Automatic Control, 48(6):988-1001, 2003.

[5] T. Jiang and J. S. Baras. Autonomous trust establishment. Proceedings 2nd International Network Optimization Conference, Lisbon, Portugal, 2005.

[6] M. E. J. Newman, C. Moore, and D. J. Watts. Mean-field solution of the small-world network model. Physical Review Letters, 84:3201-3204, 2000.

[7] R. Olfati-Saber. Ultrafast consensus in small-world networks. Proceedings American Control Conference, 4:2371-2378, 2005.

[8] R. Olfati-Saber and R. M. Murray. Consensus problems in networks of agents with switching topology and time-delays. IEEE Transactions on Automatic Control, 49(9):1520-1533, 2004.

[9] W. Ren, R. W. Beard, and T. W. McLain. Coordination variables and consensus building in multiple vehicle systems. Lecture Notes in Control and Information Sciences, 309:171-188, Springer Verlag, 2004.

[10] E. Seneta. Nonnegative Matrices and Markov Chains. Springer, 1981.

[11] T. Vicsek, A. Czirok, E. Ben-Jacob, I. Cohen, and O. Shochet. Novel type of phase transition in a system of self-driven particles. Physical Review Letters, 75:1226-1229, 1995.

[12] D. J. Watts and S. H. Strogatz. Collective dynamics of small-world networks. Nature, 393:440-442, 1998.

[13] D. J. Watts. Small Worlds: The Dynamics of Networks Between Order and Randomness. Princeton University Press, 1999.

[14] L. Xiao, S. Boyd, and S. Lall. A scheme for robust distributed sensor fusion based on average consensus. Proceedings International Conference on Information Processing in Sensor Networks, pp. 63-70, 2005.
Chapter 14
Complex Knowledge Networks and Invention Collaboration

Thomas F. Brantle†
Wesley J. Howe School of Technology Management
Stevens Institute of Technology
Castle Point on Hudson
Hoboken, NJ 07030 USA
[email protected]

M. Hosein Fallah, Ph.D.
Wesley J. Howe School of Technology Management
Stevens Institute of Technology
Castle Point on Hudson
Hoboken, NJ 07030 USA
[email protected]

Knowledge and innovation flows, as characterized by the network of invention collaboration, are studied; their scale free power law properties are examined, along with their importance to understanding technological advancement. This subject, while traditionally investigated via statistical analysis, may be further examined via complex networks. It is demonstrated that the invention collaboration network's degree distribution may be characterized by a power law, where the probability that an inventor (collaborator) is highly connected is statistically more likely than would be expected via random connections and associations, with the network's properties determined by a relatively small number of highly connected inventors (collaborators) known as hubs. Potential areas of application are suggested.
1. Introduction
Presently we are constructing ever increasingly integrated and interconnected networks for business, technology, communications, information, and the economy. The vital nature of these networks raises issues regarding not only their significance and consequence but also the influence and risk they represent. As a result it is vital

† Corresponding Author
to understand the fundamental nature of these complex networks. During the past several years advances in complex networks have uncovered amazing similarities among such diverse networks as the World Wide Web [Albert et al. (1999)], the Internet [Faloutsos et al. (1999)], movie actors [Amaral et al. (2000)], social networks [Ebel et al. (2002)], phone calls [Aiello et al. (2002)], and neural networks [Watts and Strogatz (1998)]. Additionally, over the last few decades we have experienced what has come to be known as the information age and the knowledge economy. At the center of this phenomenon lies a complex and multifaceted process of continuous and far-reaching innovation advancement and technological change [Amidon (2002), Cross et al. (2003), and Jaffe and Trajtenberg (2002)]. Understanding this process and what drives technological evolution has been of considerable interest to managers, researchers, planners and policy makers worldwide. Complex networks offer a new approach to analyze the information flows and networks underlying this process.
1.1 Knowledge and Innovation Networks
Today, nations and organizations must look for ways of generating increased value from their assets. Human capital and information are the two critical resources. Knowledge networking is an effective way of combining individuals' knowledge and skills in the pursuit of personal and organizational objectives. Knowledge networking is a rich and dynamic phenomenon in which existing knowledge is shared and evolved and new knowledge is created. In addition, in today's complex and constantly changing business climate, successful innovation is much more iterative, interactive and collaborative, involving many people and processes. In brief, success depends on effective knowledge and innovation networks. Knowledge collaboration and shared innovation, where ideas are developed collectively, result in a dynamic network of knowledge and innovation flows, where several entities and individuals work together and interconnect. These networks ebb and flow, with knowledge and innovation the source and basis of technological advantage. Successful knowledge and innovation networks bring forth the faster development of new products and services, better optimization of research and development investments, closer alignment with market needs, and improved anticipation of customer needs, resulting in more successful product introductions along with superior competitor differentiation [Skyrme (1999), Amidon (2002), and Cross et al. (2003)]. This paper discusses knowledge and innovation flows as represented by the network of patents and invention collaboration (inventors and collaborators) and attempts to bridge recent developments in complex networks to the investigation of technological and innovation evolution.
The recent discovery of small-world [Watts and Strogatz (1998)] and scale-free [Barabasi and Albert (1999)] network properties of many natural and artificial real world networks has stimulated a great deal of interest in studying the underlying organizing principles of various complex networks, which has led in turn to dramatic advances in this field of research. Knowledge and innovation flows, as represented by the historical records of patents and inventors, are addressed, with future application to technology and innovation management.
1.2 Gaussian Statistics to Complex Networks
Patents have long been recognized as a very useful and productive source of data for the assessment of technological and innovation development. A number of pioneering efforts and recent empirical studies have attempted to conceptualize and measure the process of knowledge and innovation advancement, as well as the impact of the patenting process, patent quality, litigation and new technologies on innovation advancement [Griliches (1990), Jaffe and Trajtenberg (2002), Cohen and Merrill (2003)]. However, these studies have primarily relied upon traditional (Gaussian) statistical data analysis. Complex networks should reveal new associations and relationships, thus leading to an improved understanding of these processes. Recent studies in complex networks have shown that a network's structure may be characterized by three attributes: the average path length, the clustering coefficient, and the node degree distribution. Watts and Strogatz (1998) proposed that many real world networks have large clustering coefficients with short average path lengths, and networks with these two properties are called "small world." Subsequently it was proposed by Albert et al. (1999) and Barabasi and Albert (1999) that many real world networks have power law degree distributions, with such networks denoted as "scale free." Specifically, scale free networks are characterized by a power law degree distribution in which the probability that a node has k links is proportional to k^−γ (i.e., P(k) ~ k^−γ), where γ is the degree exponent. Thus, the probability that a node is highly connected is statistically more significant than in a random network, with the network's properties often being determined by a relatively small number of highly connected nodes known as hubs. Because the power law is free of any characteristic scale, networks with a power law node degree distribution are called scale free.
[Albert and Barabasi (2002), Newman (2003), and Dorogovtsev and Mendes (2003)] In contrast, a random network [Erdos and Renyi (1959)] is one where the probability that two nodes are linked is no greater than the probability that two nodes are associated by chance, with connectivity following a Poisson (or Normal) distribution. The Barabasi and Albert (BA) (1999) model suggests two main ingredients of self-organization within a scale-free network structure, i.e., growth and preferential attachment. They highlight the fact that most real world networks continuously grow by the addition of new nodes, which are then preferentially attached to existing nodes with large numbers of connections, a.k.a. the rich get richer phenomenon. Barabasi et al. (2002) and Newman (2004) have also previously studied the evolution of the social networks of scientific collaboration, with their results indicating that these may generally be characterized as having small world and scale free network properties.
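The growth-plus-preferential-attachment mechanism can be sketched in a few lines. The standard trick (this is our own illustrative implementation, not the BA authors' code, and the parameters are arbitrary) is to keep a list in which each node appears once per link, so that uniform sampling from the list is sampling proportional to degree.

```python
import random
from collections import Counter

random.seed(7)
m, n = 2, 3000                  # links per new node, final size (illustrative)
repeated = [0, 1]               # node i appears deg(i) times; start: edge 0-1
for t in range(2, n):
    targets = set()
    while len(targets) < m:     # preferential attachment: sample by degree
        targets.add(random.choice(repeated))
    for u in targets:
        repeated += [t, u]      # add edge (t, u)

deg = Counter(repeated)
avg = len(repeated) / n
print("average degree ~", round(avg, 2), " max degree:", max(deg.values()))
```

The run exhibits the hub phenomenon described above: the maximum degree far exceeds the average, whereas a Poisson random graph of the same density would show no such outliers.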
1.3 Invention, Knowledge and Technology
Patents provide a wealth of information and a long time-series of data about inventions, inventors, collaborators, prior knowledge, and assigned owners. Patents and the inventions they represent have several advantages as a technology indicator. In particular, patents and patent citations have long been recognized as a very rich and fertile source of data for studying the progress of knowledge and innovation, and hence provide a valuable tool for public and corporate technology analysis, as well as planning and policy decisions [Griliches (1990), Jaffe and Trajtenberg (2002),
Cohen and Merrill (2003)]. Nevertheless, patents and invention collaboration have undergone limited investigation, thus offering a very rich information resource for knowledge and innovation research that is even less well studied and is yet to be fully exploited [Jaffe and Trajtenberg (2002)]. A companion paper analyzes patents and patent citations from a complex networks perspective [Brantle and Fallah (2007)].
2 Invention Collaboration
Patents and invention collaboration data contain relevant information allowing the possibility of tracing multiple associations among patents, inventors and collaborators. Specifically, invention collaboration linkages allow one to study the respective knowledge and innovation flows, and thus construct indicators of the technological importance and significance of individual patents, inventors and collaborators. An item of particular interest is the connections between patents and invention collaborators. Thus, if inventor A collaborates with inventor B, it implies that inventor A shares or transfers a piece of previously existing knowledge with inventor B, and vice versa, along with the creation of new knowledge as represented by the newly patented invention. As a result, not only is a flow of knowledge shared between the respective invention collaborators, but an invention link or relationship between the individual collaborators is established per the patented invention. The supposition is that invention collaboration is and will be informative of the relationships between inventors and collaborators, as well as of knowledge and innovation. The construction of the invention collaboration network and its bearing on knowledge and information is discussed. Next, summary statistics and probability distributions are considered, and finally the power law degree distribution is analyzed.
2.1 Bipartite Graphs and Affiliation Networks
An invention collaboration network similar to the movie actor network [Watts and Strogatz (1998)] may be constructed for invention collaboration, where the nodes are the collaborators, and two nodes are connected if the two collaborators have coauthored a patent and therefore co-invented the invention. This invention affiliation or collaboration relationship can be easily extended to three or more collaborators. The relationship can be completely described by a bipartite graph or affiliation network where there are two types of nodes, with the edges connecting only nodes of different types. A simple undirected graph is called bipartite if there is a partition of the set of nodes so that both subsets are independent sets. Collaboration necessarily implies the presence of two constituents: the actors or collaborators, and the acts of collaboration, denoted as the events. So the set of collaborators can be represented by a bipartite graph, where collaborators are connected through the acts of collaboration. In bipartite graphs, direct connections between nodes of the same type are impossible, and the edges or links are undirected. Figure 1 provides a bipartite graph or affiliation network representation with two sets of nodes: the first set, labeled "patents," connects or relates the second set, labeled "invention collaborators," who are linked by the shared patent or invention. The two mode network with three patents, labeled P_A, P_B and P_C, and seven patent or invention collaborators, C_1 to C_7, with the edges joining each patent to the respective
collaborators, is on the left. On the right we show the one mode network, or projection of the graph, for the seven collaborators. It is noted that singly authored patents would not be included in the bipartite graph and resulting invention collaboration network.
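The one mode projection described above can be computed mechanically. The sketch below uses a hypothetical patent-to-inventors assignment (the actual memberships in Figure 1 are not reproduced here; labels P_A, P_B, P_C and C_1 to C_7 are illustrative) and derives the collaborator network and its degrees.

```python
from itertools import combinations
from collections import defaultdict

# Hypothetical two-mode data: patent -> list of invention collaborators.
patents = {"PA": ["C1", "C2", "C3"],
           "PB": ["C3", "C4", "C5"],
           "PC": ["C5", "C6", "C7"]}

# One mode projection: two collaborators are linked if they share a patent.
edges = set()
degree = defaultdict(int)
for inventors in patents.values():
    for u, v in combinations(sorted(inventors), 2):
        if (u, v) not in edges:
            edges.add((u, v))
            degree[u] += 1
            degree[v] += 1

print(sorted(edges))
print(dict(sorted(degree.items())))
```

Collaborators appearing on more than one patent (here C3 and C5) become the best connected nodes of the projection, which is exactly the mechanism that produces hubs in the full NBER network.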
Patents and invention collaboration constitute a documented record of knowledge transfer and innovation flow, signifying the fact that two collaborators who coauthor a given patent, or equivalently co-invent said invention, may well indicate knowledge and innovation flowing between the respective collaborators, along with the creation of new knowledge and innovation as represented by the new invention. The patent invention link and collaboration knowledge and innovation flow is illustrated in Figure 2 and can be easily extended to three or more collaborators. Thus, knowledge and innovation information made publicly available by the patent has not only flowed to the invention, but has significantly influenced the invention's collaborators. Several network measures may be applied to the collaboration network in order both to describe the network and to examine the relationships between, and the importance and significance of, individual inventors and collaborators [Newman (2004)].
The invention collaboration network is constructed using the inventor data provided by the NBER (National Bureau of Economic Research) patent inventor file [Jaffe and Trajtenberg (2002)]. This file contains the full names and addresses of the inventors for patents issued from the beginning of 1975 through the end of 1999, comprising a twenty-five year period of patent production and invention collaboration. This includes approximately 4.3M patent-inventor pairs, 2.1M patents and 1.4M inventors.
2.4 Invention Collaboration Distribution

2.4.1 Power Law Degree Distribution
Figure 3 provides the probability degree distribution for the invention collaboration network on logarithmic scales. It may be seen that the best fit line for this distribution follows a power law distribution with an exponent of 2.8. Hence it is concluded that a power law provides a reasonable fit to the data. It is noted that a truncated power law distribution with an exponential cutoff may provide a more suitable representation, with an associated improvement in the explanation of total variance (R² ≈ 1.0). This systematic deviation from a power law distribution means that the highest collaborating inventors are collaborating less often than predicted and, correspondingly, the lowest collaborating inventors are collaborating more often than predicted. A reasonable rationale for this deviation is that many networks in which aging occurs show a connectivity distribution that possesses a power law regime followed by an exponential or Gaussian decay [Amaral et al. (2000)].
Figure 3 - Invention Collaboration: Collaborators Per Inventor

The improved fit of the truncated power law with exponential cutoff model may be attributed to a distinction between the objectives of invention patenting and scientific publishing. Patent invention collaboration, with the sharing of patent rights further dividing any potential economic rewards and financial gains, might have a minimizing, or at least optimizing, effect on any incentive to increase the number of collaborators. It would be expected that inventors would evaluate and weigh the potential technical contribution against the economic and financial impact of the prospective collaboration on the invention and its shared ownership. With respect to scientific publication this objective is much less of a consideration. For the patent invention collaboration network, the degree exponent of the number of patent invention collaborators is approximately 2.8. Thus, it is demonstrated that the number of invention collaborators roughly follows a power law distribution. That is, the number of collaborators per inventor falls off as k^−γ for some constant γ ≈ 2.8, implying that some inventors account for a very large number of collaborations,
while most inventors collaborate with only a few others. These results are consistent with theoretical and empirical work on scale-free networks, where a degree exponent of 2 < γ < 3 is predicted for very large networks under the assumptions of growth and preferential attachment.
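The kind of exponent estimate reported above can be sketched in a few lines. The code below draws synthetic "collaborators per inventor" values from a power law with γ = 2.8 by inverse-transform sampling and recovers the exponent with the continuous maximum-likelihood (Hill-type) estimator. The data, sample size, and cutoff are illustrative assumptions, not the chapter's actual patent data.

```python
import numpy as np

def powerlaw_mle_exponent(samples, xmin=1.0):
    """Continuous maximum-likelihood estimate of gamma for
    p(x) ~ x**(-gamma), x >= xmin:  gamma = 1 + n / sum(ln(x_i/xmin))."""
    x = np.asarray(samples, dtype=float)
    x = x[x >= xmin]
    return 1.0 + len(x) / np.sum(np.log(x / xmin))

# Synthetic degree-like data: if P(X >= x) = x**-(gamma-1) with xmin = 1,
# then X = U**(-1/(gamma-1)) for U uniform on (0, 1).
rng = np.random.default_rng(42)
gamma_true = 2.8
samples = rng.random(50_000) ** (-1.0 / (gamma_true - 1.0))

gamma_hat = powerlaw_mle_exponent(samples)
print(round(gamma_hat, 2))
```

With 50,000 samples the estimator recovers an exponent very close to 2.8; distinguishing a pure power law from one with an exponential cutoff, as discussed above, would additionally require comparing likelihoods of the two candidate models.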
3 Summary, Discussion and Conclusions
Knowledge and innovation, as typified by the network of patents and invention collaboration, and the significance of this network to the advancement of technology have been discussed. This area of research, traditionally investigated via statistical analysis, may be further advanced via complex network analysis. The scale-free power law property of the invention collaboration network was presented, whereby the probability that an inventor or collaborator is highly connected is statistically greater than would be expected under random connections or associations, so that the network's properties are determined by a relatively small number of highly connected inventors and collaborators known as hubs. Immediate areas of potential application and continued investigation include technology clusters and knowledge spillover [Saxenian (1994), Porter (1998), Jaffe et al. (2000), Jaffe and Trajtenberg (2002)] and patent quality, litigation and new technology patenting [Cohen and Merrill (2003), Lanjouw and Schankerman (2004)]. Analyses of invention collaboration in these areas from a complex network perspective should provide a deeper understanding of their underlying structure and evolution, which may in turn inform both private and public policy decision making and planning. Significant research effort has been devoted to the organization, development and progression of knowledge and innovation, and its impact on technological advancement. Complex network analysis offers tremendous potential for providing a theoretical framework and practical tools for understanding the role of knowledge and innovation in today's technology- and information-driven global economy.
References
[1] Aiello, W., Chung, F., and Lu, L. (2002). Random Evolution of Massive Graphs. In Abello, J., Pardalos, P.M., and Resende, M.G.C., eds., Handbook of Massive Data Sets (pp. 97-122). Dordrecht, The Netherlands: Kluwer Academic Publishers.
[2] Albert, R. and Barabasi, A.L. (2002). Statistical Mechanics of Complex Networks. Reviews of Modern Physics, 74, 47-97.
[3] Albert, R., Jeong, H. and Barabasi, A.L. (1999). Diameter of the World-Wide Web. Nature, 401, 130-131.
[4] Amaral, L.A.N., Scala, A., Barthelemy, M., and Stanley, H.E. (2000). Classes of Small World Networks. Proceedings of the National Academy of Sciences, USA, 97(21), 11149-11152.
[5] Amidon, D.M. (2002). The Innovation SuperHighway: Harnessing Intellectual Capital for Collaborative Advantage. Oxford, UK: Butterworth-Heinemann.
[6] Barabasi, A.L. and Albert, R. (1999). Emergence of Scaling in Random Networks. Science, 286, 509-512.
[7] Barabasi, A.L., Jeong, H., Neda, Z., Ravasz, E., Schubert, A., and Vicsek, T. (2002). Evolution of the Social Network of Scientific Collaborations. Physica A, 311, 590-614.
[8] Brantle, T.F. and Fallah, M.H. (2007). Complex Innovation Networks, Patent Citations and Power Laws. Proceedings of PICMET '07, Portland International Conference on Management of Engineering & Technology, August 5-9, 2007, Portland, OR.
[9] Cohen, W.M. and Merrill, S.A., eds. (2003). Patents in the Knowledge-Based Economy. Washington, DC: The National Academies Press.
[10] Cross, R., Parker, A. and Sasson, L., eds. (2003). Networks in the Knowledge Economy. New York, NY: Oxford University Press.
[11] Dorogovtsev, S.N. and Mendes, J.F.F. (2003). Evolution of Networks: From Biological Nets to the Internet and WWW. Oxford, Great Britain: Oxford University Press.
[12] Ebel, H., Mielsch, L.I. and Bornholdt, S. (2002). Scale-Free Topology of E-mail Networks. Physical Review E, 66, 035103.
[13] Erdos, P. and Renyi, A. (1959). On Random Graphs. Publicationes Mathematicae, 6, 290-297.
[14] Faloutsos, M., Faloutsos, P. and Faloutsos, C. (1999). On Power Law Relationships of the Internet Topology. Computer Communication Review, 29(4), 251-262.
[15] Griliches, Z. (1990). Patent Statistics as Economic Indicators: A Survey. Journal of Economic Literature, 28(4), 1661-1707.
[16] Hall, B., Jaffe, A. and Trajtenberg, M. (2005). Market Value and Patent Citations. Rand Journal of Economics, 36, 16-38.
[17] Jaffe, A. and Trajtenberg, M., eds. (2002). Patents, Citations, and Innovations: A Window on the Knowledge Economy. Cambridge, MA: MIT Press.
[18] Jaffe, A., Trajtenberg, M. and Fogarty, M. (2000). Knowledge Spillovers and Patent Citations: Evidence from a Survey of Inventors. American Economic Review, Papers and Proceedings, 90, 215-218.
[19] Lanjouw, J. and Schankerman, M. (2004). Patent Quality and Research Productivity: Measuring Innovation with Multiple Indicators. Economic Journal, 114, 441-465.
[20] Newman, M.E.J. (2003). The Structure and Function of Complex Networks. SIAM Review, 45, 167-256.
[21] Newman, M.E.J. (2004). Who is the Best Connected Scientist? A Study of Scientific Co-authorship Networks. In Ben-Naim, E., Frauenfelder, H., and Toroczkai, Z., eds., Complex Networks (pp. 337-370). Berlin: Springer.
[22] Porter, M.E. (Nov.-Dec. 1998). Clusters and the New Economics of Competition. Harvard Business Review, 76(6), 77-90.
[23] Saxenian, A. (1994). Regional Advantage: Culture and Competition in Silicon Valley and Route 128. Cambridge, MA: Harvard University Press.
[24] Skyrme, D.M. (1999). Knowledge Networking: Creating the Collaborative Enterprise. Oxford, UK: Butterworth-Heinemann.
[25] Watts, D.J. and Strogatz, S.H. (1998). Collective Dynamics of 'Small-World' Networks. Nature, 393, 440-442.
Chapter 15
Complexity, Competitive Intelligence and the "First Mover" Advantage

Philip Vos Fellman
Southern New Hampshire University

Jonathan Vos Post
Computer Futures, Inc.

In the following paper we explore some of the ways in which competitive intelligence and game theory can be employed to assist firms in deciding whether or not to undertake international market diversification, and whether there is an advantage to being a market leader or a market follower overseas. In attempting to answer these questions, we take a somewhat unconventional approach. We first examine how some of the most recent advances in the physical and biological sciences can contribute to the ways in which we understand how firms behave. Subsequently, we propose a formal methodology for competitive intelligence. While space considerations here do not allow for a complete game-theoretic treatment of competitive intelligence and its use with respect to understanding first and second mover advantage in firm internationalization, that treatment can be found in its entirety in the on-line proceedings.
1 Agent-Based Modeling: Mapping the Decision-maker

Agent-based modeling, particularly in the context of dynamic fitness landscape models, offers an alternative framework for analyzing corporate strategy decisions, particularly in the context of firm internationalization and new market entry. While we agree with Caldart and Ricart (2004) that corporate strategy is not yet a mature discipline, it is not our intention to claim that agent-based modeling is either a complete discipline or that it can solve all of the problems of corporate strategy. However, agent-based modeling can help solve a number of problems which pure empirical research, or research that depends upon homogeneity assumptions, cannot readily solve. Working with "local rules of behavior" (Berry, Kiel and Elliott 2002; Bonabeau 2002), agent-based modeling adopts a "bottom up" approach, treating each individual decision maker as an autonomous unit. It is a methodology sharing features with Nelson and Winter's evolutionary economics insofar as it depends on computational power to model repeated interactions among many agents. As Edward Lorenz demonstrated as far back as 1963, even simple "couplings" between agents can give rise to very complex phenomena. Agent-based modeling also avoids excessive assumptions of homogeneity because it allows each agent to behave according to its own rules. Additionally, agent preferences can evolve, and agent-based systems are capable of learning. Agent-based modeling can capture emergent behaviors and recognize phenomena which appear random at the local level but produce recognizable patterns at a more global level (Bonabeau, 2002).
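As a minimal illustration of these ideas, the sketch below implements a toy population of heterogeneous agents, each following its own local threshold rule for a market-entry decision; a single agent with a zero threshold acts as an unconditional first mover and triggers an emergent entry cascade. The rule, population size, and thresholds are illustrative assumptions, not a model taken from the chapter.

```python
class Agent:
    """Autonomous decision maker with a private local rule: enter the
    market once the observed fraction of entrants in the (fully
    connected) population reaches its own threshold."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.entered = False

def step(agents):
    """One synchronous update; returns True if any agent changed state."""
    frac = sum(a.entered for a in agents) / len(agents)
    changed = False
    for a in agents:
        if not a.entered and frac >= a.threshold:
            a.entered = True
            changed = True
    return changed

N = 200
# Heterogeneous thresholds: agent i requires a fraction i/(2N) of
# entrants before entering; agent 0 (threshold 0) is the first mover.
agents = [Agent(i / (2 * N)) for i in range(N)]
rounds = 0
while step(agents):
    rounds += 1
entrants = sum(a.entered for a in agents)
print(entrants, rounds)   # prints: 200 8
```

No agent coordinates with any other, yet the purely local rules produce a global pattern: full market entry in a handful of rounds, an emergent outcome of exactly the kind agent-based models are built to expose.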
2 Modeling the Firm and Its Environment

Industry- and firm-specific factors have recently come to be seen as playing increasingly important roles in the performance of international market entrants. Revisiting earlier regression studies which found country-specific factors to be a greater determinant of risk than industry-specific factors, Cavaglia, Brightman and Aked developed a factor model to compare country versus industry effects on securities returns in a more contemporary setting, and obtained a conclusion quite different from the conventional wisdom. In general, they found that industry-specific factors were a more important determinant of extraordinary returns and investor exposure than country factors (Cavaglia et al., 2000). In this context, traditional tools such as linear regression analysis may not be very useful for dealing with heterogeneous actors, or in settings where industry- or firm-specific factors play a dominant role. Caldart and Ricart (2004), for example, cite the overall ambiguousness of studies not just on international diversification but, following Barney (1991), on diversification strategies in general, and conclude that "as a result of the persistence of mixed results on such important lines of research, there is increasing agreement around the idea that new approaches to the study of the field would be welcomed".
2.1 Kauffman's Dynamic Fitness Landscape Approach

The choice of Kauffman's approach is not incidental to this study. Earlier work by McKelvey (1999) demonstrates both strong practical management applications stemming from Kauffman's work and strong theoretical connections to the strategy literature, established through McKelvey's application of NK Boolean dynamic fitness landscapes to Michael Porter's value chain. Fundamental properties of Kauffman's model include, first, a type of complexity with which we are already familiar, one which is easily described by agent-based modeling: a complex system is a system (whole) comprising numerous interacting entities (parts), each of which behaves in its local context according to some rule(s) or force(s). In responding to their own particular local contexts, these individual parts can, despite acting in parallel without explicit inter-part coordination or communication, cause the system as a whole to display emergent patterns, orderly phenomena and properties, at the global or collective level.
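To make the NK idea concrete, here is a small self-contained sketch of Kauffman's model: N binary loci, each contributing a fitness value that depends on its own state and that of K neighbours via a random lookup table, together with a greedy one-bit-flip adaptive walk to a local peak. The parameter values and circular neighbourhood are illustrative assumptions, not McKelvey's or Kauffman's specific settings.

```python
import itertools
import random

def nk_landscape(N, K, seed=0):
    """Random NK fitness function: locus i's contribution depends on its
    own bit and the next K bits (circular neighbourhood), looked up in a
    random table; total fitness is the mean contribution."""
    rng = random.Random(seed)
    tables = [{bits: rng.random()
               for bits in itertools.product((0, 1), repeat=K + 1)}
              for _ in range(N)]
    def fitness(genome):
        return sum(tables[i][tuple(genome[(i + j) % N] for j in range(K + 1))]
                   for i in range(N)) / N
    return fitness

def hill_climb(fitness, genome):
    """Greedy one-bit-flip adaptive walk to a local optimum."""
    improved = True
    while improved:
        improved = False
        for i in range(len(genome)):
            trial = list(genome)
            trial[i] ^= 1            # flip one locus
            if fitness(trial) > fitness(genome):
                genome, improved = trial, True
    return genome, fitness(genome)

f = nk_landscape(N=10, K=2)
peak, value = hill_climb(f, [0] * 10)
print(round(value, 3))
```

Raising K increases the interdependence among loci, making the landscape more rugged and the local peaks found by such walks both more numerous and lower, which is the intuition behind "complexity catastrophe" in coevolutionary settings.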
2.2 Firm Rivalry and Competitive Intelligence

One might legitimately ask why the authors have taken such a long way around in getting to the role of competitive intelligence in firm competition. The answers are essentially that: (a) real business is tricky and messy; (b) so real businesses are led by people who make satisficing and ad hoc arguments; (c) by dint of repetition over generations, these ad hoc arguments became the received wisdom; (d) formal methods, be they statistical, game-theoretic or otherwise, began to be applied to business; (e) however, these mostly dressed up the ad hoc received wisdom in pretty new clothes; (f) since both mathematicians and business theorists complained about this, a new generation of more sophisticated models sprang up, which we survey; (g) however, a closer look shows that they suffer from the sins of the original ad hocracy, plus flaws in the methodology as such; (h) hence, in conclusion, most everything in the literature that we have surveyed is a dead end, except for a few gems which we think deserve attention; (i) this is even more germane when we admit competitive intelligence as a meta-industry, so that players of business games know things about each other which classically they cannot know, and further the businesses have employees or consultants who know formal methods such as game theory, and so model each other in ways deeper than classical assumptions admit; these problems will grow more acute as computation-intensive methods such as agent-based models are used by all players in a game, so this is a bad time to sweep problems and meta-problems under the rug; hence (j) we attempt a synthesis in complex systems.
3.0 Competitive Intelligence: An Introductory Definition

McGonagle and Vella (1990), among the earliest authors seeking to define competitive intelligence (CI) in a formal fashion, define CI programs as being:
"A formalized, yet continuously evolving process by which the management team assesses the evolution of its industry and the capabilities and behavior of its current and potential competitors to assist in maintaining or developing a competitive advantage". A Competitive Intelligence Program (CIP) tries to ensure that the organization has accurate, current information about its competitors and a plan for using that information to its advantage (Prescott and Gibbons 1993). While this is a good "boilerplate" operational definition, it does not address the deeper question of how we determine whether performing the activities of competitive intelligence does, in fact, convey a competitive advantage to the firms so engaged. Even at the simplest level of the process, such general definitions provide little guidance for evaluating either the type or the magnitude of the value creation involved. The evaluative challenge becomes even more complex when it becomes necessary to assess the value of accurate competitive intelligence for firms whose advantage in international markets is subject to network externalities. Under these circumstances, measuring the role and performance of competitive intelligence becomes one of the central problems of the field. Can competitive intelligence raise a firm's overall fitness? Can competitive intelligence be used to drive a competitive advantage in product and market diversification? In the following section, we suggest a meta-framework (Competitive Intelligence Meta-Analysis, or CIMA) which lays out the criteria that a formal system of competitive intelligence must meet if it is to accomplish these goals.
4.0 Competitive Intelligence Meta-Analysis

Here we diverge from virtually all previous authors: it is critical to ensure that the process by which CI is defined is formal rather than ad hoc. This is an inherent relational property shared between CI and CIMA. An underlying goal of formalizing this process is to ensure that CI evolves adaptively and is a robust element of value creation within the firm. As we have already argued, without such rigorously grounded first principles, the present, largely ad hoc definition of CI may lead those attempting to pursue competitive advantage through CI in an unrewarding, self-limiting or even value-destructive direction. Another question whose answer will frame much of the subsequent discourse is whether CI can actually be developed into an academic discipline which can prosper in business schools or university academic departments. For some areas of CI, especially those related to traditional security concerns, this may not be so great a problem. However, in high-technology fields, particularly where intellectual property rights are at issue or where the success or failure of a move toward internationalization depends upon obtaining accurate information about existing network externalities and their likely course of development, the approach to the CI discipline is absolutely critical.
4.1 Are There "Standard" Tools and Techniques for All Competitive Intelligence?

In a fashion similar to that faced by traditional intelligence analysts, each business organization's CI unit must also face the complexities of choice in sources of
raw data. In its simplest form, the question may revolve around whether the organization should use government sources, online databases, interviews, surveys, drive-bys, or on-site observations. On a more sophisticated level, the organization must determine, to draw from the analysis of Harvard professor and former Chairman of the National Intelligence Council Joseph Nye, whether the issues it addresses are "secrets" to be uncovered or "mysteries" which require extended study of a more academic nature. Although government sources have the advantage of low cost, online databases are preferable for faster turnaround time. Whereas surveys may provide enormous data about products and competitors, interviews would be preferred for getting a more in-depth perspective from a limited sample. These breakdowns indicate essential strategies for CI countermeasures and counter-countermeasures. One methodology for assessing the effects of CI countermeasures and counter-countermeasures is game theory. For example, we can look, in simplified terms, at a matrix of offensive and defensive strategies, such as:

Game Theory Matrix for CI Unit and Target

                          Target Uses Counter    Target Doesn't Use Counter
Unit Uses Tactic          Mixed Results          Good Data
Unit Doesn't Use Tactic   Bad Data               Mixed Results
The matrices grow larger when one considers various CI tactics, various countermeasures, and various counter-countermeasures. Once numerical weights are assigned to outcomes, there are situations in which the matrix leads formally to an optimal statistical mixture of approaches. Imagine the benefit to a CI unit of knowing precisely how much disinformation it should optimally disseminate in a given channel. This is a powerful tool in the arsenal of the sophisticated CI practitioner.
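For a 2x2 zero-sum version of such a matrix, the optimal mixture has a closed form: the row player chooses the probability of each tactic so that the opponent is indifferent between counters. The numerical payoffs below are purely hypothetical stand-ins for the qualitative cells discussed above; they are not taken from the chapter.

```python
def mixed_strategy_2x2(a, b, c, d):
    """Row player's optimal mix for the 2x2 zero-sum game
    [[a, b], [c, d]] (payoffs to the row player, no saddle point).
    p is the probability of playing row 1, chosen so the column
    player is indifferent; the returned value is the game value."""
    p = (d - c) / ((a - b) + (d - c))
    value = p * a + (1 - p) * c
    return p, value

# Hypothetical payoffs to the CI unit:
# rows = (use tactic, don't use tactic), cols = (target counters, doesn't).
# Using a countered tactic is worst (1); using it uncountered yields
# good data (4); abstaining gives middling outcomes (3 and 2).
p, value = mixed_strategy_2x2(1, 4, 3, 2)
print(p, value)   # prints: 0.25 2.5
```

With these stand-in payoffs the unit should use its tactic a quarter of the time, guaranteeing an expected payoff of 2.5 no matter which counter the target chooses; larger tactic/countermeasure matrices require linear programming but follow the same indifference logic.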
5.0 The Formal CI Process: Sources of Error

Beyond this, the firm's CI group, whether internal or external, must address not only the problems of raw data collection but, far more importantly, the transformation of that data into a finished intelligence product. This requires stringent internal controls, not only with respect to assessing the accuracy of the raw data, but also a variety of tests to assess the accuracy and reliability of the analysis in the finished product. Throughout the processes of collection, transformation, production and dissemination, rigorous efforts are needed to eliminate false confirmations and disinformation. Equally strong efforts are required to check for omissions and anomalies. Critical to this process is a clear understanding of first principles and an evaluative process which is not merely consistent but which embodies a degree of appropriateness that serves to maximize the firm's other value-creating activities.
5.1 Omissions

Omission, the apparent lack of cause for a business decision, makes it hard to execute a plausible response. While most omissions are accidental, there is a growing body of evidence that in a rich organizational context "knowledge disavowal" -- the intentional omission of facts which would lead to a decision less than optimally favorable to the person or group possessing those facts -- plays a significant role in organizational decision making. Following the framework of rational choice theory and neo-institutionalism, it is clear that every business decision has both proximate and underlying causes. Because historical events can often be traced more directly to individual decision makers than conventional wisdom would have us believe, avoiding omission must take two forms. The first approach is basically a scientific one, typically expressed in applied mathematics as the twin requirements of consistency and completeness. The second approach to omissions is more complex and requires an organizational and evaluative structure that will, to the greatest extent practicable, eliminate omissions which arise out of either a psychological or an organizational bias.
5.2 Institutional Arrangements

Between the profound and persistent influences of institutional arrangements and the often idiosyncratic nature of corporate decision making, even when decisions are perceived as merely the decision maker's "seat-of-the-pants" ad hoc choices, these decisions generally have far broader causes and consequences. The organizational literature is replete with examples of "satisficing", unmotivated (cognitive) bias and motivated biases (procrastination, bolstering, hypervigilance, etc.) which weigh heavily on decision makers facing difficult choices and complex tradeoffs. Bolstering in particular is especially pernicious, because complex psychological pressures often lead decision makers toward excessive simplification of decisions and toward a focus which relies almost exclusively on the desired outcome rather than on the more scientific joint evaluation of probability and outcome.
5.3 Additional Problems

Other problems associated with the decision process include framing and inappropriate representativeness. All of these tendencies must be rigorously guarded against during both the collection and the production of competitive intelligence. Again, to the extent practicable, the finished product of CI should be presented in a way which minimizes the possibility of its misuse in the fashions indicated above. To some extent these problems are inescapable, but there is a host of historical precedents in traditional intelligence which should offer guidelines for avoiding some of the more common organizational pitfalls.
5.4 Anomalies

Following the above arguments, anomalies -- those data that do not fit -- must not be ignored any more than disturbing data should be omitted. Rather, anomalies often require a reassessment of the working assumptions of the CI system or process (McGonagle & Vella, 1990). In traditional intelligence, anomalies are often regarded as important clues about otherwise unsuspected behavior or events. In the same fashion, anomalies may represent important indicators of change in the business environment. There is also a self-conscious element to the methodology of dealing with anomalies. In the same way that anomalous data can lead researchers to new conclusions about the subject matter, the discovery of anomalies can also lead the researcher to new conclusions about the research process itself; it is precisely anomalies that lead to a change of paradigm in the scientific world. Since conclusions drawn from data must be based on that data, one must never be reluctant to test, modify, and even reject one's basic working hypotheses. Any failure to test and reject what others regard as an established truth can be a major source of error. In this regard, it is essential to CI practice (collection, analysis and production) that omissions and anomalies in the mix of information (and disinformation) available in the marketplace be dealt with in an appropriate fashion.
6.0 Countermeasures

Similarly, it is essential that CI counter-countermeasures incorporate an appropriate methodology for dealing with anomalies and omissions (intentional or accidental). An important element of this process is developing the institutional capability to detect the "signatures" of competitors' attempts to alter the available data. CI countermeasures should be designed with the primary goal of eliminating the omissions and anomalies which occur in the mix of information and disinformation available in the marketplace. Included in this category are:

False Confirmation: One source of data appears to confirm the data obtained from another source. In the real world there is no truly final confirmation, since one source may have obtained its data from the second source, or both sources may have received their data from a third common source.

Disinformation: The data gathered may be flawed because of disinformation, defined as incomplete or inaccurate information designed to mislead the organization's CI efforts. A proper scientific definition of disinformation requires the mathematics of information theory, which the authors detail elsewhere.

Blowback: Blowback happens when the company's own disinformation or misinformation, directed at the target competitor, contaminates its own intelligence channels or CI information. In this scenario, all of the information gathered may be inaccurate, incomplete or misleading.
Bibliography

[1] Arthur, W. Brian, "Inductive Reasoning and Bounded Rationality", American Economic Review, May 1994, Vol. 84, No. 2, pp. 406-411.
[2] Barney, J., "Firm Resources and Sustained Competitive Advantage", Journal of Management, 1991, Vol. 17, No. 1, 99-120.
[3] Berry, L., Kiel, D. and Elliott, E., "Adaptive agents, intelligence, and emergent human organization: Capturing complexity through agent-based modeling", PNAS, 99, Supplement 3, 7187-7194.
[4] Bonabeau, Eric, "Agent Based Modeling: Methods and Techniques for Simulating Human Systems", PNAS, Vol. 99, Supplement 3, 7280-7287.
[5] Cavaglia, J., Brightman, C., Aked, M., "On the Increasing Importance of Industry Factors: Implications for Global Portfolio Management", Financial Analysts Journal, Vol. 56, No. 5 (September/October 2000): 41-54.
[6] Caldart, A., and Ricart, J., "Corporate strategy revisited: a view from complexity theory", European Management Review, 2004, Vol. 1, 96-104.
[7] Gal, Y. and Pfeffer, A., "A language for modeling agents' decision making processes in games", Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, ACM Press, New York, 2003.
[8] Kauffman, Stuart, The Origins of Order: Self-Organization and Selection in Evolution, Oxford University Press, 1993.
[9] McGonagle, J.J. & Vella, C.M., Outsmarting the Competition: Practical Approaches to Finding and Using Competitive Information, Sourcebooks, 1990.
[10] McKelvey, Bill, "Avoiding Complexity Catastrophe in Coevolutionary Pockets: Strategies for Rugged Landscapes", Organization Science, Vol. 10, No. 3, May-June 1999, pp. 294-321.
[11] Malhotra, J., "An Analogy to a Competitive Intelligence Program: Role of Measurement in Organizational Research", in Click & Dagger: Competitive Intelligence, Society of Competitive Intelligence Professionals, 1996.
[12] Prescott, J.E. & Gibbons, P.T. (1993), "Global Competitive Intelligence: An Overview", in Prescott, J.E., and Gibbons, P.T., eds., Global Perspectives on Competitive Intelligence, Alexandria, VA: Society of Competitive Intelligence Professionals.
[13] Ruigrok, W., and Wagner, H. (2004), "Internationalization and firm performance: Meta-analytic review and future research directions", Academy of International Business, Stockholm, Sweden, July 11, 2004.
[14] Sato, Y., E. Akiyama, and J.D. Farmer, "Chaos in Learning a Simple Two-Person Game", Proceedings of the National Academy of Sciences, 99(7), pp. 4748-4751 (2002).
Chapter 16
Mobility of Innovators and Prosperity of Geographical Technology Clusters: A longitudinal examination of innovator networks in the telecommunications industry

Jiang He¹
Wesley J. Howe School of Technology Management
Stevens Institute of Technology
Castle Point on Hudson
Hoboken, NJ 07030 USA
[email protected]

M. Hosein Fallah, Ph.D.
Wesley J. Howe School of Technology Management
Stevens Institute of Technology
Castle Point on Hudson
Hoboken, NJ 07030 USA
[email protected]

Abstract

Knowledge spillovers have long been considered a critical element for the development of technology clusters by facilitating innovation. Based on patent co-authorship data, we construct inventor networks for two geographical telecom clusters - New
¹ Corresponding Author
Jersey and Texas - and investigate how the networks evolved longitudinally as the technology clusters were undergoing different stages of their lifecycles. The telecom industry in the former state had encountered a significant unfavorable environmental change, largely due to the breakup of the Bell System and the evolution of the telecom industry. Meanwhile, the telecom cluster of Texas has been demonstrating a growing trend in innovation output and is gradually taking over New Jersey's leadership in telecom innovation as measured by the number of patents per year. We examine differences and similarities in the dynamics of the innovator networks of the two geographical clusters over different time periods. The results show that TX's innovator networks became significantly better connected and less centralized than those of NJ in the later years of the time series, while the two clusters were experiencing different stages of their lifecycles. Using network visualization tools, we find that the overwhelming importance of Bell System entities in holding the NJ innovator network together persisted for a very long time after the breakup of the company. In contrast, the central hubs of TX's networks are much less important in maintaining those networks.
Key words: Social network, Technology clusters, Innovation, Telecommunications R&D
1. Introduction

Clustering has become one of the key drivers of regional economic growth by promoting local competition and cooperation. The impact of clustering on business competitiveness and regional prosperity has been well documented (Porter, 1998). Identifying the extent to which technology spillovers are associated with regional economic growth is an area of active research. In this study, the authors provide a new approach to monitoring cluster evolution by conducting a longitudinal analysis of the dynamics of inventor networks. The paper focuses on the telecom sectors of New Jersey and Texas. For almost a century, New Jersey was the leader in telecommunications innovation, due to the presence of Bell Laboratories. With the break-up of AT&T and the passage of the 1996 Telecommunications Act, which drove the deregulation of the US telecommunications market, New Jersey's telecom sector went through a period of rapid change. Since the industry downturn of 2000, New Jersey's telecommunications sector has been experiencing a hard time. While NJ is struggling to recover from the downturn, we have observed that some other states, such as Texas, have been able to pull ahead and show greater growth (He and Fallah, 2005) as measured by the number of telecom patents. It seems that New Jersey's telecommunications cluster is currently stuck in a stagnant state. The analysis of inventor networks within the telecom industry can provide further insight into the evolution of NJ's telecom cluster and the influence of such networks on the performance of the cluster.
2. Inventor Networks
Complex networks are often quantified by three attributes: clustering coefficient, average path length, and degree distribution. The clustering coefficient measures the cliquishness of a network, conceptualized as the likelihood that any two nodes connected to the same node are also connected to each other. The average path length measures the typical separation between any two nodes. The degree distribution maps the probability of finding a node with a given number of edges. Following the discovery of the "small world" network phenomenon, characterized by short average path length and a high degree of clustering (Watts and Strogatz, 1998), many empirical studies have shown that small-world properties are prevalent in actual networks such as airline transportation networks and patent citation networks. Dense and clustered relationships encourage trust and close collaboration, whereas distant ties act as bridges for fresh and non-redundant information to flow (Fleming et al., 2004). There is evidence that the rate of knowledge diffusion is highest in small-world networks (Bala and Goyal, 1998; Cowan and Jonard, 2003; Morone and Taylor, 2004). In order to test the relationship between knowledge transfer networks and regional innovation output, Fleming et al. (2004) analyzed co-authorship network data from US patents for the period between 1975 and 2002. Their results are inconsistent with the generally believed proposition that "small world" networks are associated with a high level of innovation. It appeared that decreased path length and component agglomeration are positively related to future innovation output; clustering, however, had a negative impact on subsequent patenting. In fact, the existing empirical literature is not rich enough to illustrate the role of knowledge spillovers, created by inventors' mobility and/or collaboration, in promoting the development of clusters.
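The three attributes can be computed directly from an adjacency structure. The toy five-node graph below is a hypothetical example, not one of the chapter's patent networks; the sketch uses plain breadth-first search rather than a dedicated network library.

```python
from collections import deque

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbours that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u in nbrs for w in nbrs if u < w and w in adj[u])
    return links / (k * (k - 1) / 2)

def average_path_length(adj):
    """Mean shortest-path length over all connected node pairs (BFS)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(d for n, d in dist.items() if n != s)
        pairs += len(dist) - 1
    return total / pairs

# A hypothetical five-inventor collaboration network (adjacency sets)
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2, 5}, 5: {4}}
print(round(clustering_coefficient(adj, 2), 3), average_path_length(adj))
# prints: 0.333 1.7
```

A "small world" in the Watts-Strogatz sense is a network whose average path length is close to that of a comparable random graph while its mean clustering coefficient is much higher.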
In this study, we investigate the evolution of the telecom inventor networks of New Jersey and Texas, and examine significant differences between them that may explain the differences in innovation growth and cluster development of the two states.
3. Data and analysis approach

In this study we map the network of telecom innovators using patent co-authorship data. We believe patent co-authorship is a good quantitative indicator of knowledge exchange. First, as the majority of patents are produced by teams rather than independent individuals, it is reasonable to assume that co-authors know each other and that technical information is exchanged in the process of innovation. Secondly, patent data can reflect the mobility of inventors whenever they create patents in different firms or organizations. Cooper's study (2001) suggests that a higher rate of job mobility corresponds to greater innovation progress, because part of the knowledge generated by a mobile worker can be utilized by both firms involved. The data for this study were originally collected from the United States Patent and Trademark Office (USPTO) and organized by Jaffe and Trajtenberg (2002). This dataset provided us with categorized patent information covering the period between 1975 and 1999. The objective of our study is to analyze the dynamics of inventor networks for different geographical clusters over time. For this study, we selected the telecom
patents granted to inventors in New Jersey and Texas between 1986 and 1999 for analysis. We consider a patent to belong to either New Jersey or Texas as long as one or more inventors of the patent were located within that state. Patents belonging to both states were excluded from this initial study (they account for 0.9% of the total number of patents). For each state, we investigated how the inventor network evolved over time by moving a 3-year window. The patent dataset enables us to develop a bipartite network (upper portion of Fig. 1) which consists of two sets of vertices: patent assignees and patent inventors. This type of affiliation network connects inventors to assignees, not assignees to assignees or inventors to inventors, at least not directly. Bipartite networks are difficult to interpret, as network parameters such as the degree distribution have different meanings for the two sets of vertices. To make the bipartite network more meaningful, we transform it into two one-mode networks. Figure 1 illustrates an example of this transformation from a bipartite network to one-mode networks. The network analysis tool Pajek was used to explore the patent network and visualize the analysis results. For the one-mode network of assignees, a link between two nodes means that the two organizations share at least one common inventor; that is, one or more inventors have created patents for both organizations during the time frame. In practice this happens when an inventor who creates a patent for company A joins a team in company B, or moves to company B and creates a new patent that is assigned to company B. In either scenario, one can assume there would be a knowledge spillover due to R&D collaboration or job movement.
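The bipartite construction and its one-mode projections can be sketched as follows. This is a toy illustration with hypothetical assignees and inventors, not the USPTO data, and networkx stands in for Pajek:

```python
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical patent records: (assignee, inventor) pairs from co-authored patents
patents = [
    ("Bell Labs", "Alice"), ("Bell Labs", "Bob"),
    ("StartupCo", "Bob"),   ("StartupCo", "Carol"),
    ("TelcoTex",  "Carol"), ("TelcoTex",  "Dave"),
]

B = nx.Graph()
assignees = {a for a, _ in patents}
inventors = {i for _, i in patents}
B.add_nodes_from(assignees, bipartite=0)  # one vertex set: assignees
B.add_nodes_from(inventors, bipartite=1)  # other vertex set: inventors
B.add_edges_from(patents)

# One-mode projection onto assignees: two firms are linked when they
# share at least one inventor (a potential knowledge-spillover channel)
firm_net = bipartite.projected_graph(B, assignees)

# One-mode projection onto inventors: two inventors are linked when they
# have patented for the same assignee
inventor_net = bipartite.projected_graph(B, inventors)

print(sorted(firm_net.edges()))
```

In this toy example, Bob links Bell Labs to StartupCo and Carol links StartupCo to TelcoTex, while Bell Labs and TelcoTex remain unconnected because they share no inventor.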
4. Findings and interpretation

As described above, using the Pajek software package we constructed two sets of one-mode networks, one for each geographical cluster, with a longitudinal approach. This paper focuses on the one-mode networks of organizations, in which a tie between any two nodes indicates that at least one patent inventor is shared by both assignees during the window period. Before examining the network structures and the level of inventors' mobility, we noticed from the original patent dataset that some assignees represent entities belonging to a single large organization. The Bell System's multiple entities form linkages between their innovators via patent co-authorship, and those linkages could account for a considerable portion of the linkages over the whole network (He and Fallah, 2006). Since this kind of linkage is not directly associated with the spontaneous job mobility of innovators, which is the focus of our study, we regarded it as noise for our interpretation and therefore treated multiple entities of an organization as one unit by reassigning a single unique name to them.
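The consolidation step can be sketched as a simple renaming pass. The entity names below are hypothetical examples; the source describes only the general procedure of reassigning one unique name per organization (the "Bell Replace" label appears later in the paper):

```python
# Hypothetical consolidation map: multiple patenting entities of one parent
# organization are reassigned a single unique name before network construction.
ENTITY_MAP = {
    "AT&T Corp.": "Bell Replace",
    "AT&T Bell Laboratories": "Bell Replace",
    "Lucent Technologies Inc.": "Bell Replace",
    "Bell Communications Research": "Bell Replace",
}

def normalize_assignee(name: str) -> str:
    """Map an assignee entity to its consolidated organization name."""
    return ENTITY_MAP.get(name, name)

# Example records (hypothetical), normalized before building the network
records = [("AT&T Corp.", "Alice"), ("Lucent Technologies Inc.", "Bob"), ("Nortel", "Carol")]
normalized = [(normalize_assignee(a), inv) for a, inv in records]
print(normalized)
```

After this pass, intra-organization co-authorship no longer produces spurious inter-firm links, so remaining ties more closely track genuine inventor mobility.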
Figure 1: Transformation of a two-mode network to one-mode networks

Figure 2 shows the average node degree over all vertices for each state over different three-year window periods. A higher average degree implies a denser network, which in this case indicates that inventors are more likely to move from one organization to multiple others. Based on this figure, it appears the NJ network was better connected in the earlier years, but the advantage lessened gradually as the time window moved ahead, and NJ finally fell behind TX in the last period of observation. NJ's early patent-network connectedness may correspond to the regulatory adjustment of the telecom industry. The 1984 Bell System Divestiture broke up AT&T's monopoly, which led to some redistribution of employees, including some of the R&D people. Telecom deregulation created a substantial undeveloped new market which could be exploited with new technologies. As new start-ups emerge in a cluster, job mobility among organizations can be expected to grow as well. As can be seen in Figure 2, the connectedness of NJ's patent network experienced dynamic change during the period between 1986 and 1993. The TX network maintained a low level of connectedness in that period because there was very little telecom R&D work in that state. Starting from 1993, the networks in both NJ and TX demonstrated a significant growing trend in connectedness. Indeed, largely due to the further opening of the telecom market, this was also a period in which the total patent output of both states was growing rapidly (Figure 3). In terms of network structure, the major difference between the NJ and TX networks is the level of degree centralization. We observe that the NJ network is more centralized than that of TX, especially in the later periods of observation (Figure 4).
Based on our analysis, we noticed that, compared with its TX counterpart, the main component of the NJ network always accounts for a larger portion of the total connectivity, and the difference becomes more significant in the later periods. This may correspond to the disappearance of many of the start-ups that emerged in the mid-to-late 1990s. Based on the network measurements of overall connectedness, although the NJ network also shows a growing trend after 1993, we conclude that the growth largely corresponded to the size growth of the main component rather than to a balanced growing network. Figures 5 and 7 visualize the one-mode networks of assignees for NJ and TX, respectively (window period 1997-1999). Figures 6 and 8 correspondingly show the main components extracted from the parent networks. Interestingly, we notice that "Bell Replace", which represents the entities of the old Bell System, is the key hub maintaining the main component of the NJ network (Figure 6).

Degree centralization is defined as the variation in the degree of vertices divided by the maximum degree variation which is possible in a network of the same size (De Nooy, 2005). Put differently, a network is more centralized when the vertices vary more with respect to their centrality; a star topology network is an extreme, with degree centralization equal to one.
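The degree centralization measure can be computed as follows. This is a sketch based on the De Nooy et al. definition, not the authors' Pajek workflow:

```python
import networkx as nx

def degree_centralization(G: nx.Graph) -> float:
    """Variation in vertex degree divided by the maximum variation possible
    in a network of the same size (De Nooy et al., 2005); a star equals 1."""
    n = G.number_of_nodes()
    if n < 3:
        return 0.0
    degrees = [d for _, d in G.degree()]
    max_d = max(degrees)
    variation = sum(max_d - d for d in degrees)
    # In a star on n nodes the hub has degree n-1 and each leaf degree 1,
    # giving the maximum possible variation (n-1)(n-2).
    return variation / ((n - 1) * (n - 2))

print(degree_centralization(nx.star_graph(9)))   # star topology: 1.0
print(degree_centralization(nx.cycle_graph(10))) # regular ring: 0.0
```

A highly centralized network such as NJ's AT&T-hubbed one scores near the star extreme, while a decentralized network like TX's scores closer to zero.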
Figure 2: Mean of Degree - NJ vs. TX
Figure 3: Total number of telecom patents - NJ vs. TX
Figure 4: Network Centralization - NJ vs. TX
Consistent with the above-mentioned findings, the visualization of the TX network demonstrates a decentralized pattern, in which most of the network connections would still exist even if the most connected hub were removed (Figures 7, 8).
Figure 5: NJ's one-mode network of assignees - 1997-1999
Figure 6: Main component extracted from Fig. 5
Figure 7: TX's one-mode network of assignees - 1997-1999
Figure 8: Main component extracted from Fig. 7
We interpret the highly centralized network structure as a weakness of NJ's telecom industry which may explain the cluster's performance in innovation output. As a majority of job movements in the cluster originate from a common source, the diversity of the knowledge transferred in such a network is limited compared to a network in which innovators move around more frequently along a variety of random routes. Also, considering the history and background of AT&T and the change of regulatory environment, there is a good possibility that the AT&T-hubbed patent network may, to a large extent, correspond to a firm-level adjustment resulting from the corporate fragmentation, rather than to the macro-dynamics of the labor market; the latter scenario is a more desirable attribute for encouraging cluster development.
5. Conclusions and future work

Our results illustrate that patterns of job mobility may be predictive of the trend in cluster development. The study suggests that, compared with New Jersey, Texas telecom inventors were more frequently changing their employers, starting their own businesses and/or joining other teams from different organizations. The latter scenario may often result from formal collaborations between organizations, such as contracted R&D projects. Either way, these types of ties increase the possibility of technical information flowing within the industry cluster, though the two classifications of ties may vary in their capability for knowledge transfer. One limitation of the network analysis is that, based on the patent dataset itself, the proportion of connections corresponding to each type of tie cannot be explicitly measured, so future research may benefit from interviewing the inventors to further investigate their motivations or the duties involved with the connected patents.
References
Bala, V. and S. Goyal, "Learning from neighbors," Review of Economic Studies, 65, 595-621, 1998.
Cooper, D.P., "Innovation and reciprocal externalities: information transmission via job mobility," Journal of Economic Behavior and Organization, 45, 2001.
Cowan, R. and N. Jonard, "The dynamics of collective invention," Journal of Economic Behavior and Organization, 52, 4, 513-532, 2003.
Cowan, R. and N. Jonard, "Invention on a network," Structural Change and Economic Dynamics, in press.
De Nooy, W., A. Mrvar and V. Batagelj, Exploratory Social Network Analysis with Pajek, Cambridge University Press, 2005.
Jaffe, A.B. and M. Trajtenberg, Patents, Citations, and Innovations: A Window on the Knowledge Economy, MIT Press, 2002.
He, J. and M.H. Fallah, "Reviving telecommunications R&D in New Jersey: can a technology cluster strategy work," PICMET 2005.
He, J. and M.H. Fallah, "Dynamics of inventors' network and growth of geographic clusters," PICMET 2006.
Morone, P. and R. Taylor, "Knowledge diffusion dynamics and network properties of face-to-face interactions," Journal of Evolutionary Economics, 14, 3, 327-351, 2004.
Porter, M.E., "Clusters and the new economics of competition," Harvard Business Review, Nov./Dec. 1998.
Watts, D.J. and S.H. Strogatz, "Collective dynamics of 'small-world' networks," Nature, 393, 440-442, 1998.
Chapter 17
Adaptive capacity of geographical clusters: Complexity science and network theory approach
Vito Albino, Nunzia Carbonara, Ilaria Giannoccaro
DIMEG, Politecnico di Bari, Viale Japigia 182, 70126 Bari, Italy
This paper deals with the adaptive capacity of geographical clusters (GCs), a relevant topic in the literature. To address this topic, the GC is considered as a complex adaptive system (CAS). Three theoretical propositions concerning GC adaptive capacity are formulated using complexity theory. First, we identify three main properties of CASs that affect adaptive capacity, namely interconnectivity, heterogeneity, and level of control, and define how the values of these properties influence the adaptive capacity. Then, we associate these properties with specific GC characteristics, thereby obtaining the key conditions that give GCs their adaptive capacity and so assure their competitive advantage. To test these theoretical propositions, a case study on two real GCs is carried out. The considered GCs are modeled as networks where firms are nodes and inter-firm relationships are links. Heterogeneity, interconnectivity, and level of control are treated as network properties and thus measured using the methods of network theory.
1 Introduction

Geographical clusters (GCs) are geographically defined production systems, characterized by a large number of small and medium-sized firms that are involved at various phases in the production of a homogeneous product family. These firms are highly specialized in a few phases of the production process, and
integrated through a complex network of inter-organizational relationships [Porter 1998]. The literature on GCs is quite rich and involves different streams of research, such as social sciences, regional economics, economic geography, political economy, and industrial organization. Within this literature, studies have mainly provided key notions and models to explain the reasons for GC competitive success [Krugman 1991; Marshall 1920; Sabel 1984]. However, in the recent competitive context the foregoing studies do not explain why some GCs fail and others do not, or why some GCs evolve by assuming different structures to remain competitive while others do not. They in fact adopt a static perspective to analyze GCs, restricting the analysis to the definition of a set of conditions explaining GC competitive advantage in a particular context. In addition, they focus on the system as a whole and not on the single components (firms), observe phenomena only after they have already happened at the system level, and describe them in terms of cause-effect relations by adopting a top-down approach. Our intention is to overcome these limitations by adopting a different approach. We look at GC competitive advantage not as the result of a set of pre-defined features characterizing GCs but as the result of two different capabilities, namely the adaptability and co-evolution of GCs with the external environment. The higher the GC adaptive and co-evolution capabilities, the higher the GC competitive success. In fact, if GCs possess the conditions that allow them to adapt and co-evolve with the environment, they will modify themselves so as to be more successful in that environment. In this way, GCs have competitive advantage not because they are characterized by a set of features but because they are able to evolve, exhibiting features that are the spontaneous result of adaptation to the environment.
This result is not known a priori, but emerges from the interactions among the system components and between them and the environment. This approach is consistent with the perspective adopted in the paper of studying GCs by using complexity science [Cowan et al. 1994], which studies complex adaptive systems (CASs) and explains the causes and processes underlying emergence in CASs. Once GCs have been recognized as CASs, CAS theory on co-evolution is used to look for the GC features that allow the adaptability of GCs in "high velocity" environments [Eisenhardt 1989]. In particular, three theoretical propositions regarding GC structural conditions supporting GC adaptability are formulated. To test these theoretical propositions, a case study on two real GCs is carried out. The considered GCs are modeled as networks where firms are nodes and inter-firm relationships are links. Heterogeneity, interconnectivity, and level of control are considered as network attributes and thus measured by using the methods of network theory. In what follows we give a brief review of CASs, we show that GCs possess relevant properties of CASs, and, based on CAS theory, we derive the propositions on GC co-evolution in "high velocity" environments. Then we present the research methodology and the empirical evidence.
2 The complexity science approach to GC competitive advantage

Complexity science studies CASs and their dynamics. CASs consist of evolving networks of heterogeneous, localized and functionally-integrated interacting agents. These agents interact in a non-linear fashion, can adapt and learn, thereby evolving and developing a form of self-organization that enables them to acquire
collective properties that each of them does not have individually. CASs have adaptive capability and co-evolve with the external environment, modifying it and being modified by it [Axelrod and Cohen 1999; Gell-Mann 1994]. During the 1990s, there was an explosion of interest in complexity science as it relates to organizations and strategy. Complexity science offers a number of new insights that can be used to seek new dynamic sources of competitive advantage. In fact, the application of complexity science to organization and strategy identifies key conditions that determine the success of firms in changing environments, associated with their capacity to self-organize and create a new order, learn, and adapt [Levy 2000]. Complexity science is used in this study to identify the conditions that enable GCs to adapt to the external environment. The basic assumption of this study is therefore that GCs are CASs, given that they exhibit several properties of CASs, such as the existence of different agents (e.g. firms and institutions), non-linearity, different types of interactions among agents and between agents and the environment, distributed decision making, decentralized information flows, and adaptive capacity [Albino et al. 2005]. In what follows, three theoretical propositions concerning GC adaptive capacity are formulated by using CAS theory.

Interconnectivity. CAS theory identifies the number of interconnections within the system as a critical condition for self-organization and emergence. Kauffman [1995] points out that the number of interconnections among the agents of an ecosystem influences the adaptive capacities of the ecosystem. He uses the NK model to investigate the rate of adaptation and level of success of a system in a particular scenario. The adaptation of the system is modeled as a walk on a landscape. During the walk, agents move by looking for positions that improve their fitness, represented by the height of that position.
A successful adaptation is achieved when the highest peak of the landscape is reached. The ruggedness of the landscape influences the rate of adaptation of the system. When the landscape has a very wide global optimum, the adaptive walk will lead toward the global optimum. In a rugged landscape, given that there are many weakly differentiated peaks, the adaptive walk will be trapped on one of the many suboptimal local peaks. By using the concept of a tunable landscape and the NK model, Kauffman [1995] demonstrates that the number of interconnections among agents (K) influences the ruggedness of the landscape. As K increases, the ruggedness rises and the rate of adaptation decreases. Therefore, in order to assure the adaptation of the system to the landscape, the value of K should not be high. This result has been widely applied in organization studies to model organizational change and technological innovation [Rivkin and Siggelkow 2002]. In organization studies the K parameter has an appealing interpretation, namely, the extent to which components of the organization affect each other. Similarly, it can be used to study the adaptation of GCs, by considering that the level of interconnectivity of GCs is determined by the social and economic links among GC firms. When the number of links among firms is high, the behavior of a particular firm is strongly affected by the behavior of the other firms. On the basis of the discussion above, we formulate the following proposition:

Proposition 1. A medium number of links among GC firms assures the highest GC adaptive capacity.

Heterogeneity. Different studies on complexity highlight that variety destroys variety. As an example, Ashby [1956] suggests that successful adaptation requires a system to have an internal variety that at least matches
environmental variety. Systems having agents with the appropriate requisite variety will evolve faster than those without. The same topic is studied by Allen [2001] and LeBaron [2001]. Their agent-based models show that novelty, innovation, and learning all collapse as the nature of the agents collapses from heterogeneity to homogeneity. Dooley [2002] states that one of the main properties of a complex system that supports evolution is diversity. This property is related to the fact that each agent is potentially unique, not only in the resources it holds, but also in terms of the behavioral rules that define how it sees the world and how it reacts. In a complex system, diversity is the key to survival. Without diversity, a complex system converges to a single mode of behavior. Referring to firms, the concept of agent heterogeneity can be associated with the competitive strategy of firms. This in fact results from the resources that a firm possesses and defines the behavioral rules and the actions of firms in the competitive environment [Grant 1998]. Therefore, we assume that:

Proposition 2. The greater the differentiation of the competitive strategies adopted by GC firms, the higher the GC adaptive capacity.

Level of control. The governance of a system is a further important characteristic influencing CAS self-organization and adaptive behaviors. Le Moigne [1990] observes that CASs are not controlled by a hierarchical command-and-control center and manifest a certain form of autonomy. The latter is necessary to allow evolution and adaptation of the system. A strong control orientation tends to produce tall hierarchies that are slow to respond [Carzo and Yanousas 1969] and invariably reduce heterogeneity [Jones 2000]. The presence of "nearly" autonomous subunits characterized by weak but not negligible interactions is essential for the long-term adaptation and survival of organizations [Simon 1996].
The level of control in GCs is determined by the governance of the GC organizational structure: the higher the degree of governance, the higher the level of control exerted by one or more firms on the other GC firms. Therefore, we assume that:

Proposition 3. A medium degree of governance of the GC organizational structure improves the GC adaptive capacity.
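The NK-model intuition behind Proposition 1 can be illustrated with a minimal sketch. This is our own toy implementation of Kauffman's model, not code from the cited studies: counting local optima on random NK landscapes shows ruggedness (and hence the risk of being trapped on suboptimal peaks) growing with K.

```python
import itertools
import random

def make_nk_fitness(N, K, seed=0):
    """Random NK fitness function: locus i's contribution depends on its own
    state and the states of its K neighbors (after Kauffman, 1995)."""
    rng = random.Random(seed)
    neighbors = [[(i + j + 1) % N for j in range(K)] for i in range(N)]
    table = {}  # lazily drawn random contributions, cached for determinism
    def fitness(genome):
        total = 0.0
        for i in range(N):
            key = (i, genome[i]) + tuple(genome[n] for n in neighbors[i])
            if key not in table:
                table[key] = rng.random()
            total += table[key]
        return total / N
    return fitness

def count_local_optima(N, K, seed=0):
    """Count genomes that no single-bit flip can improve: more local optima
    means a more rugged landscape and a slower, more easily trapped walk."""
    fitness = make_nk_fitness(N, K, seed)
    optima = 0
    for g in itertools.product((0, 1), repeat=N):
        f = fitness(g)
        flips = (g[:i] + (1 - g[i],) + g[i + 1:] for i in range(N))
        if all(f >= fitness(h) for h in flips):
            optima += 1
    return optima

smooth = count_local_optima(N=8, K=0)  # independent loci: a single peak
rugged = count_local_optima(N=8, K=7)  # fully coupled: many local peaks
print(smooth, rugged)
```

With K=0 each locus can be optimized independently, so the landscape has a single peak; at K=N-1 the landscape fragments into many local optima, which is the ruggedness that motivates keeping K moderate.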
3 Research methodology

The proposed theoretical propositions have been supported by the results of an empirical investigation. The empirical investigation, adopting a multiple-case study approach [Yin 1989], involves three in-depth case studies on real GCs. To address the purpose of our research we selected the GCs on the basis of their competitive advantage, and we modeled each GC as a network where firms are nodes and inter-firm relationships are links. In particular, we chose three cases characterized by a different degree of success in the competitive scenario, and by using the methods of network theory we measured the GC structural characteristics identified in the three theoretical propositions as network attributes. We considered two Italian industrial districts (industrial districts are a specific production model characterized by the agglomeration of small- and medium-sized firms integrated through a complex network of buyer-supplier relationships and managed by both cooperative and competitive policies): the leather sofa district of Bari-Matera and the agro-industrial district of Foggia, both located in Southern Italy. The leather sofa district of Bari-Matera has been analyzed in two
different years, 1990 and 2000, which correspond to different stages of its lifecycle, Development and Maturity respectively [Carbonara et al. 2002]. The case studies were preceded by an explorative phase aimed at delineating the characteristics of the considered GCs. This phase involved two stages: 1) the collection of data and qualitative information from firm annual reports and the reports of Trade Associations and Chambers of Commerce; 2) a survey of the firms operating in the two GCs, based on the Cerved database (the electronic database of all the Italian Chambers of Commerce). The analysis of the data collected during stage one was used to evaluate the competitive performance of the three cases. In particular, for each case we measured the percentage ratio between export and turnover (Table 1). Through the survey it was possible to define the dimension of the considered GCs in terms of number of firms. Subsequently, a sample of firms was selected within each GC. A reputational sampling technique [Scott 1991] was used rather than a random one. To do this, we asked a key GC informant to select a sample of firms based on their reputations as both main players of the GC and active players in the network. This sampling technique ensures a sample of firms that better represents the population. The three networks that model the considered GCs are: 1) the network of the agro-industrial district of Foggia; 2) the network of the leather sofa district of Bari-Matera in the Development stage; 3) the network of the leather sofa district of Bari-Matera in the Maturity stage. We labeled these three networks "alfa-net", "beta-net", and "gamma-net", respectively.

Table 1: Geographical clusters' dimension and competitive performance.

                         Agro-industrial       Leather sofa district     Leather sofa district
                         district of Foggia    of Bari-Matera,           of Bari-Matera,
                                               Development stage (1990)  Maturity stage (2000)
Number of firms          140                   101                       293
Export/turnover (%)      33%                   60%                       77%
In particular, we selected 66 firms active in the alfa-net, 43 in the beta-net, and 58 in the gamma-net. These samples represent 47 percent, 43 percent, and 20 percent of the respective GC's total firm population. The data on each firm of the three networks were collected through interviews with the managers of the firms and through questionnaires. In particular, we collected network structure data by asking respondents to indicate with which other sample firms they have business exchanges. We then used these data to build the network of business inter-firm relationships characterizing each considered GC.
4 The network analysis

To test the three theoretical propositions in the empirical study, the measurement of the three features of the GC organizational structure, namely heterogeneity, interconnectivity, and level of control, is required. To this aim we first operationalized the three GC structural features in terms of network attributes and then measured the identified network attributes by using the methods of network theory. In particular, we used the following set of measures: network density, network heterogeneity, and network centrality. The test of Proposition 1 is based on a simple measure of the network structure, network density (ND), defined as the proportion of possible linkages that are actually present in a graph. The network density is calculated as the ratio of the number of linkages present, L, to its theoretical maximum in the network, n(n-1), with n being the number of nodes in the network [Borgatti and Everett 1997]:

    ND = L / (n(n-1))

To test Proposition 2 we performed an analysis of the heterogeneity of the coreness of each actor in the network. By coreness we refer to the degree of closeness of each node to a core of densely connected nodes observable in the network [Borgatti and Everett 1999]. Using actor-level coreness data, we calculated the Gini coefficient, which is an index of network heterogeneity. Finally, to test Proposition 3 we used an index of network centrality: the average normalized degree centrality (Average NDC). The degree centrality of a node is defined as the number of edges incident upon that node. Thus, degree centrality refers to the extent to which an actor is central in a network on the basis of the ties it has directly established with other actors of the network. It is measured as the sum of the linkages of node i with the other j nodes of the network:

    DC(n_i) = Σ_j x_ij

Due to the different sizes of the three networks, a normalized degree centrality (NDC) has been used [Giuliani 2005]. This is the degree centrality DC(n_i) standardized by (n-1):

    NDC(n_i) = DC(n_i) / (n-1)
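The three measures can be sketched as follows for an undirected graph. This is our illustration, not the study's UCINET workflow, and the Borgatti-Everett coreness is approximated here by k-core numbers, which is only a rough stand-in:

```python
import networkx as nx

def network_density(G: nx.Graph) -> float:
    """ND = L / (n(n-1)); for an undirected graph each edge is counted
    in both directions, matching the n(n-1) maximum used in the text."""
    n = G.number_of_nodes()
    return 2 * G.number_of_edges() / (n * (n - 1))

def gini(values) -> float:
    """Gini coefficient over actor-level scores, used as an index of
    network heterogeneity: 0 = homogeneous, toward 1 = concentrated."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * cum / (n * total) - (n + 1) / n

def ndc(G: nx.Graph) -> dict:
    """Normalized degree centrality per node: NDC(n_i) = DC(n_i) / (n-1)."""
    n = G.number_of_nodes()
    return {v: d / (n - 1) for v, d in G.degree()}

# Small illustrative network (not GC data)
G = nx.krackhardt_kite_graph()
density = network_density(G)
heterogeneity = gini(nx.core_number(G).values())  # k-core as coreness proxy
centrality = ndc(G)
print(round(density, 3), round(heterogeneity, 3))
```

On real data one would feed the survey-derived inter-firm graph into these functions and compare the three resulting attributes across the alfa-, beta-, and gamma-nets.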
Once we had operationalized the GC structural features in terms of network properties, we applied network analysis techniques using the UCINET software [Borgatti et al. 2002] to represent the three networks and to compute the identified network attributes (Table 2).

Table 2: Measures of the three network attributes.

                                       Alfa-net   Beta-net   Gamma-net
Network density                        0.0138     0.0321     0.0236
Gini coefficient (coreness)            0.041      0.22       0.24
Average normalized degree centrality   2.751      5.648      4.295
Each network attribute represents a structural GC characteristic. In particular, we used: 1) the network density to measure GC interconnectivity, 2) the Gini coefficient to measure GC heterogeneity, and 3) the average normalized degree centrality to measure the level of control inside the GC. As regards the competitive performance of each GC, we used the percentage ratio between export and turnover. Export measures are usually adopted to evaluate competitiveness at different levels, namely country, industry, firm and product [Buckley et al. 1988]. The percentage ratio between export and turnover can therefore be considered a good proxy to compare the competitive advantage of different firms and/or systems of firms. The results are summarized in Table 2 and confirm the three propositions.
5 Conclusion
This paper has used complexity science concepts to make new contributions to the theoretical understanding of Geographical Cluster (GC) competitive advantage. Complexity science has been used as a conceptual framework to investigate the reasons for the success of GCs. This approach is particularly valuable given that it allows the limits of traditional studies on GCs to be overcome. In particular, GC competitive advantage is not the result of a set of pre-defined features characterizing GCs, but the result of dynamic processes of adaptability and evolution of GCs with the external environment. Therefore, GC success is linked to the system's adaptive capacity, which is a key property of complex adaptive systems (CASs). Using CAS theory, the key conditions of GCs that give them adaptive capacity have been identified, namely the number of links among GC firms, the level of differentiation of the competitive strategies adopted by GC firms, and the degree of governance of the GC organizational structure. CAS theory has then been used to identify the values these variables should take to increase the system's adaptive capacity. In this way, three theoretical propositions concerning GC adaptive capacity have been formulated. The proposed theoretical propositions have been supported by the results of an empirical investigation. In particular, the empirical investigation, adopting a multiple-case study approach, involves three in-depth case studies on real GCs. The three cases were analyzed by using the methods of network theory. We measured the GC structural characteristics identified in the three theoretical propositions as network attributes.
Simulation results have confirmed the theoretical propositions, showing that: (i) a medium network density assures the highest performance, measured in terms of the percentage ratio between export and turnover; (ii) the higher the network heterogeneity, measured by the Gini coefficient, the higher the GC performance; and (iii) a medium value of the network centrality, measured by the average normalized degree centrality, determines the highest GC performance.
Bibliography [1] Albino, V., Carbonara, N., & Giannoccaro, I., 2005, Industrial districts as complex adaptive systems: Agent-based models of emergent phenomena, in Industrial Clusters and Inter-Firm Networks, edited by Karlsson, C., Johansson, B., & Stough, R., Edward Elgar Publishing, 73-82. [2] Allen, P.M., 2001, A complex systems approach to learning in adaptive networks, International Journal of Innovation Management, 5, 149-180. [3] Ashby, W.R., 1956, An Introduction to Cybernetics, Chapman & Hall (London). [4] Axelrod, R., & Cohen, M.D., 1999, Harnessing Complexity: Organizational Implications of a Scientific Frontier, The Free Press (New York). [5] Borgatti, S.P., & Everett, M.G., 1999, Models of core/periphery structures, Social Networks, 21: 375-395. [6] Borgatti, S.P., Everett, M.G., & Freeman, L.C., 2002, UCINET 6 for Windows, Harvard: Analytic Technologies. [7] Buckley, P.J., Pass, C.L., & Prescott, K., 1988, Measures of International Competitiveness: A critical survey, Journal of Marketing Management, 4 (2), 175-200.
[8] Carbonara, N., Giannoccaro, I., & Pontrandolfo, P., 2002, Supply chains within industrial districts: a theoretical framework, International Journal of Production Economics, 76 (2), 159-176. [9] Carzo, R., & Yanousas, J.N., 1969, Effects of flat and tall structure, Administrative Science Quarterly, 14, 178-191. [10] Cowan, G.A., Pines, D., & Meltzer, D. (eds.), 1994, Complexity: Metaphors, Models, and Reality, Proceedings of the Santa Fe Institute, Vol. XIX, Addison-Wesley (Reading, MA). [11] Dooley, K.J., 2002, Organizational Complexity, in International Encyclopedia of Business and Management, edited by M. Warner, Thompson Learning (London). [12] Gell-Mann, M., 1994, The quark and the jaguar, Freeman & Co (New York). [13] Giuliani, E., 2005, The structure of cluster knowledge networks: uneven and selective, not pervasive and collective, Working Papers, Copenhagen Business School. [15] Grant, R.M., 1998, Contemporary Strategy Analysis: Concepts, Techniques, Applications, Blackwell (Oxford). [16] Jones, G.R., 2000, Organizational Theory, Addison-Wesley (Reading, MA). [17] Kauffman, S.A., 1993, The Origins of Order: Self-Organization and Selection in Evolution, Oxford University Press (New York/Oxford). [18] Kauffman, S.A., 1995, At home in the universe: the search for laws of self-organization and complexity, Oxford University Press (New York). [19] Krugman, P.R., 1991, Geography and Trade, MIT Press (Cambridge, MA). [21] Le Moigne, J.L., 1990, La Modélisation des Systèmes Complexes, Dunod (Paris). [22] Lebaron, B., 2001, Financial market efficiency in a coevolutionary environment, Proceedings of the Workshop on Simulation of Social Agents: Architectures and Institutions, Argonne National Laboratory and The University of Chicago, 33-51. [25] Levy, D.L., 2000, Applications and Limitations of Complexity Theory in Organization Theory and Strategy, in Handbook of Strategic Management, edited by Rabin, J., Miller, G.J.
, & Hildreth, W.B., Marcel Dekker (New York). [27] Marshall, A., 1920, Principles of Economics, Macmillan (London). [28] McKelvey, B., 2004, Toward a 0th Law of Thermodynamics: Order Creation Complexity Dynamics from Physics and Biology to Bioeconomics, Journal of Bioeconomics, 6, 65-96. [30] Piore, M., & Sabel, C.F., 1984, The Second Industrial Divide, Basic Books (New York). [31] Porter, M., 1998, Clusters and the new economics of competition, Harvard Business Review, 76, 77-90. [32] Rivkin, J.W., & Siggelkow, N., 2002, Organizational Sticking Points on NK Landscapes, Complexity, 7 (5), 31-43. [33] Scott, J., 1991, Social network analysis, Sage (London). [34] Simon, H.A., 1996, The sciences of the artificial, MIT Press (Cambridge, MA). [35] Yin, R., 1989, Case Study Research: Design and Methods, Sage (Newbury Park, CA).
Chapter 18
Corporate Strategy: An Evolutionary Review Philip V. Fellman Southern New Hampshire University As Richard Rumelt indicates in his book, "Fundamental Issues in Strategy: A Research Agenda", corporate strategy is a relatively recent discipline. While pioneers in the field like B.H. Liddell Hart and Bruce Henderson (later to found the Boston Consulting Group and create the famous BCG Growth-Share Matrix) began their research during the Second World War, the modern field of business strategy as an academic discipline, taught in schools and colleges of business, emerged rather later. Rumelt provides an interesting chronicle in the introduction to his volume by noting that historically corporate strategy, even when taught as a capstone course, was not really an organized discipline. Typically, depending on the school's location and resources, the course would be taught either by the most senior professor in the department or by an outside lecturer from industry. The agenda tended to be very much instructor-specific and idiosyncratic rather than drawing in any systematized fashion upon the subject matter of an organized discipline.
1.0 Corporate Strategy as the Conceptual Unifier of the Business One of the most important early thinkers on corporate strategy was Harvard professor Kenneth R. Andrews. Perhaps his most famous work is his book "The Concept of Corporate Strategy",1 which has also appeared elsewhere as both a brief book chapter2 and a rather longer audiotape lecture. Andrews primarily conceptualized corporate strategy as a unifying concept, centered around pattern.3 Many of his notions regarding the corporation may seem a bit dated: they did, after all, reach their final form before the majority of landmark decisions in corporation law and corporate governance which emerged out of the merger mania of the 1980s, and far before the electronic commerce and internet revolution of the following decades. Nonetheless, as with the work of Hart and Henderson, their foundations are essentially sound. Andrews' most important contribution to the literature is most likely his emphasis on "pattern" as the relational foundation of the corporate activities which we call strategy. However, strategy is often emergent in practice. Henry Mintzberg, in a now famous series of articles on "emergent strategies", made some very convincing arguments that for most corporations, strategy is more likely to be a pastiche of structures which emerge (like undocumented, patched operating system code) as the corporation grows, rather than a monolithic set of policy statements formulated at the birth of the corporation.4
2.0 Corporate Pattern as the Core of Corporate Strategy Nonetheless, Andrews' recognition of pattern as a defining variable in corporate strategy is a major contribution to the literature. Among other things, in defining strategic tradeoffs and in evaluating the strategic behavior of a business, it allows the analyst to determine how coherent that overall pattern is. Questions like "does this corporate activity make sense within the broader context of our organizational mission?" fit very neatly into the "coherence test" and in a very real way anticipate the concepts of cross-subsidization and strategic fit among the different activities in a corporation's value chain which were so well expressed by Michael Porter at the end of the 1990s.5
1 Kenneth R. Andrews, "The Concept of Corporate Strategy", Richard Irwin, Inc. (New York: 1980).
2 See "Readings in the Strategy Process", ed. Henry Mintzberg, Prentice Hall (New Jersey: 1998) and "The Concept of Corporate Strategy", Careertapes Enterprises, 1991.
3 Ibid. No. 6, "Corporate strategy is the pattern of decisions in a company that determines and reveals its objectives, purposes or goals, produces the principal policies and plans for achieving those goals, and defines the business the company is to pursue, the kind of economic and human organization it intends to be and the nature of the economic and noneconomic contribution it intends to make to its shareholders, employees, customers and communities..."
4 See Henry Mintzberg, "The Design School: Reconsidering the Basic Premises of Strategic Management", Strategic Management Journal, vol. 11/3, 1990, 171-195. For a grand summation of Mintzberg's views, see "View from the top: Henry Mintzberg on strategy and management" (interview by Daniel J. McCarthy), The Academy of Management Executive, August 2000, v14 i3, p31.
5 See Michael Porter, "What is Strategy", Harvard Business Review, November-December, 1996.
While Andrews draws a strong distinction between strategic planning and strategy implementation, the work of later authors like Mintzberg and Drucker6 demonstrates rather clearly that this is probably not the best way of viewing the strategy process. As the next section of this brief paper discusses, this was the earliest conceptual break with Andrews made by Richard Rumelt, Andrews' "star student" and a leader of the first generation of professional strategists. While Andrews explicitly argued that breaking up a company's activities into a sequence of activities would undermine the company's ability to formulate a coherent strategy, Rumelt argues that in order to assess the appropriateness of corporate strategy, a reduction to basic elements is a necessary prerequisite.
3.0 Appropriateness and Strategy Evaluation: The Next Generation Richard Rumelt, by comparison, is an analyst of an entirely different sort. As mentioned above, Rumelt was a member of the first group of strategy analysts to actually train and specialize in the field of corporate strategy as the discipline to which their careers would be devoted. In his introduction to "Fundamental Issues in Strategy", Rumelt emphasizes that corporate strategy, as a relatively new discipline, has roots in many other fields, particularly within the social sciences. Among the fields which Rumelt cites are history, economics, sociology, political science, and applied mathematics. With the more recent work of Theodore Modis, Michael Lissack, J. Doyne Farmer and Brian Arthur, we might also wish to add physics, pure mathematics, computer science and population biology. In any case, Rumelt offers a variety of unique and powerfully phrased arguments designed not just to pinpoint the general issues facing the modern, large corporation, but to uncover the "first principles" of corporate strategy. From the outset, Rumelt aims at building a different kind of model than Andrews. Where Andrews is largely descriptive and illustrative, Rumelt is analytical and evaluative. In fact, the particular piece whose insights we draw on here, entitled "Evaluating Business Strategy",7 focuses, among other things, on whether a particular conventional methodology for measuring corporate performance is, in fact, the appropriate set of measures to be used. Also, where Andrews is an early strategist, seeking to define in his own way, if not the first principles of strategy, then the basic model of corporate strategy, and is ultimately locked into a model which by its very nature becomes increasingly dated with the passage of time, Rumelt is not only an early researcher in the discipline but also a contemporary author.
In the article in question, Rumelt begins with the rather bold statement that "Strategy can neither be formulated nor adjusted to changing
6 See, for example, "Beyond capitalism", an interview with management expert and author Peter Drucker by Nathan Gardels, New Perspectives Quarterly, Spring 1998, v15 i2, p4(9); see also "'Flashes of genius': Peter Drucker on entrepreneurial complacency and delusions... and the madness of always thinking you're number one", Inc. (Special Issue: The State of Small Business), interview by George Gendron, May 15, 1996, v18 n7, p30(8).
7 Richard Rumelt, "Evaluating Business Strategy", in Readings in the Strategy Process, ed. Henry Mintzberg, Prentice Hall (New Jersey: 1998).
circumstances without a process of strategy evaluation." Like Gordon Donaldson, his first target is the "shopping list strategy" model which is often so central to the preferences of Andrews' "top management" and which holds the value of growth as its sacred cow. Foreshadowing later, prize-winning explanations by Michael Jensen and others, Rumelt is already raising the fundamental issue of "value creation" vs. simple growth models.8 Rumelt further argues that "strategy evaluation is an attempt to look beyond the obvious facts regarding the short-term health of a business, and instead to appraise those more fundamental factors and trends that govern success in the chosen field of endeavor."9 In choosing those factors, Rumelt identifies four critical areas: consistency, consonance, feasibility and advantage. For the purposes of this paper, and the construction of the 3-C test (Consistency, Coherence and Consonance), we will focus primarily upon Rumelt's first two categories and treat feasibility and advantage only in passing. This is not to argue that feasibility and advantage are unimportant, but merely to point out that arguments about feasibility tend to be case-specific, and arguments about competitive advantage have probably been developed at greater length by Michael Porter, and then by Hamel and Prahalad, than by anyone else. This is not to slight Rumelt's efforts in this area, but simply to point the reader to those areas of Richard Rumelt's work which are his strongest, and presumably both his most enduring and his most useful for immediate practical application.
4.0 The Process View In the early part of his essay, Rumelt defines four key characteristics:10
Consistency: The strategy must not present mutually inconsistent goals and policies.
Consonance: The strategy must represent an adaptive response to the environment and to the critical changes occurring within it.
Advantage: The strategy must provide for the creation and/or maintenance of a competitive advantage in the selected area of activity.
Feasibility: The strategy must neither overtax available resources nor create unresolvable subproblems.
In explaining consistency, Rumelt argues that a first reading gives the impression that such bold flaws are unlikely to be found at the lofty levels of corporate strategy formulation and implementation (although he does not define these quite so separately as does Andrews). Rumelt relates his consistency argument to the implicit organizational coherence suggested by Andrews, and even goes so far as to argue that the purpose of consistent organizational policy is to create a coherent pattern of
8 The most lucid explanation of this process is probably given in "The Search for Value: Measuring the Company's Cost of Capital", by Michael C. Ehrhardt, Harvard Business School Press, 1994.
9 Ibid. No. 12
10 Ibid.
organizational action. Rumelt then offers three diagnostic techniques for determining if there are structural inconsistencies in an organization's strategy:11
1. If problems in coordination and planning continue despite changes in personnel and tend to be issue-based rather than people-based, they are probably due to inconsistencies in strategy.
2. If success for one organizational department means, or is interpreted to mean, failure for another department, either the basic objective structure is inconsistent or the organizational structure is wastefully duplicative.
3. If, despite attempts to delegate authority, operating problems continue to be brought to the top for the resolution of policy issues, the basic strategy is probably inconsistent.
We can see from the very phrasing of Rumelt's arguments that he raises powerful issues. Also, in addressing consistency as a first principle of corporate strategy, he has changed the approach to top management. In Andrews' world, top management was the critical locus of corporate strategic planning. Among other things, as Gordon Donaldson points out, they had probably not yet learned that corporations do not necessarily possess the inalienable right to dream the impossible dream.12 In fact, corporations dreaming the impossible dream go bankrupt on a regular basis. Donaldson makes a very strong case that corporations following even moderately inconsistent goals are bound both to under-perform in the short term and to threaten the stability of their financial position in the long term.13 Thus the consistency issue is one of tremendous importance.
4.1 Organizational Learning: A First Cut At the same time, it is a very topical issue, closely related to process re-engineering, organizational learning and modern organizational design. One of the fundamental goals of all these enterprises is to get rid of these inconsistencies, which Rumelt, using mathematical phraseology in his more recent writings, would call the result of built-in zero-sum game strategies. What was a standard modus operandi for the post-war, industrial, large-scale, divisionally organized corporation has long since turned into a competitive disadvantage in the modern demand-driven world. As Evans and Wurster note, the creation of new technologies is often the precursor to the disappearance of entire industries.14 When a company or a cluster of companies is built around structural inconsistencies, no matter how long the history of common practice, extinction is all but assured. In their words, a company's greatest assets may overnight become the company's greatest liabilities. If Rumelt's
11 Ibid.
12 Gordon Donaldson, "Financial Goals and Strategic Consequences", Harvard Business Review, May-June, 1985.
13 Ibid.
14 "Strategy and the New Economics of Information", Philip Evans and Thomas Wurster, Harvard Business Review, September-October, 1997.
arguments are to be taken seriously, then consistency appears to be the first thing which one should look for in understanding corporate strategy.
5.0 Re-examining the Development and Evolution of the Corporation While earlier authors such as Porter15 argued that the investment in and ownership of core technologies was a corporation's principal strength, others, such as Kenichi Ohmae, have argued at least as convincingly that as technologies increase the rapidity with which they evolve, and as research and development costs mount exponentially, a strategy for managing a cluster of proprietary, licensed, and joint-venture technologies is the most cost-effective way of securing a continually improving competitive position. What is interesting to note is that both groups of authors, that is, the group following Porter's arguments16 as well as the groups centered around Ohmae and Hamel and Prahalad,17 indicate that even though technologies are rapidly evolving and a company has to stake out a credible future position now, the time horizon for strategic planning has, in fact, expanded, and the ten-year strategic plan is often more important than the annual review.18 In the context of the above typologies, one can see how the role of the central strategic planning group may be more rather than less important for 21st-century corporations. The big difference between these groups and their predecessors, as has already been hinted at, is that through a flatter management structure and a more empowered work force, feedback loops can be built into the planning process from the near, intermediate and far "ends" of the business (to borrow Porter's "value-chain" terminology) in order to avoid costly future mistakes by a group which is excessively distant from daily operations and daily changes in the marketplace.
15 Porter treats the subject in a variety of fashions, most notably in his articles for the Harvard Business Review: "How Competitive Forces Shape Strategy" (1980), "The Competitive Advantage of Nations" (1989), "Capital Disadvantage: America's Failing Capital Investment System" and "What is Strategy" (1996).
16 See Slater, Stanley F., and Olson, Eric M., "A Fresh Look at Industry and Market Analysis", Business Horizons, January-February, 2002, for an analysis of the ways in which Michael Porter's Five Forces Model (Porter, 1979) can be modified to reflect subsequent globalization and the evolution of technology, in which they develop an "augmented model for market analysis" (pp. 15-16). In particular, they incorporate the additional features of "complementary competition" and "complementors", as well as "composite competition", "customers", "market growth" and "market turbulence".
17 Gary Hamel and C.K. Prahalad, "The Core Competence of the Corporation", Harvard Business Review, May-June, 1990.
18 In a similar vein to No. 29 above, an article from Fast Company (Issue 49, August 2001, pp. 108 ff.) by Jennifer Reingold, entitled "Can C.K. Prahalad Still Pass the Test?", explores some of the ways in which Prahalad has adapted the strategies based around the concept of core competence (which he and Gary Hamel developed in the 1980s and 1990s) for the 21st century. In particular, Prahalad explains that his newer approach is based around universal inter-connectivity and breaking the mold of the print-driven society. In describing his new approach, for which he has drawn on the Sanskrit word "Praja", meaning "the assembly" or "the common people" (alternatively, "the people in common"), he argues that the mass availability of data and the ability of consumers to personalize their experience over the internet is a fundamental driver of changes so profound that he calls them "cosmic" (with reference to their paradigmatic scope).
6.0 Strategy and the Future Complexity strategist Lissack draws on the literature of population biology to describe the competitive environment as a "fitness landscape". Following Kauffman, "a fitness landscape is a mountainous terrain showing the location of the global maximum (highest peak) and global minimum (lowest valley)", where the height of a feature is a measure of its fitness.19 What is interesting about the fitness landscape model is that it is a dynamic rather than a static model. Kauffman argues that real fitness landscapes and environments are not fixed but are constantly changing, the change being a product of the activities of the various species on the landscape. In this sense there is a remarkable degree of congruity between the fitness landscape metaphor and many models of product/market competitiveness. Kauffman and Lissack describe these changes as "deformations" and ascribe them to the same kinds of forces which Michael Porter describes as "jockeying for position". In more precise terms, Kauffman and Macready argue, "real fitness landscapes in evolution and economies are not fixed, but continually deforming. Such deformations occur because the outside world alters, because existing players and technologies change and impact one another, and because new players, species, technologies, or organizational innovations enter the field." Kauffman and Macready argue that a jagged fitness landscape means that solutions tend to be local and incremental rather than global and sweeping. Because we are considering a complex metaphor here, it is important to specify that "local" and "global" are mathematical terms referring to the character of the competitive environment, not geographical terms. In drawing metaphors from the literature of complexity theory, Lissack also uses the concept of "basin of attraction", familiar to those who have read some chaos theory in connection with "strange attractors".
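The claim that jagged landscapes favour local, incremental solutions can be illustrated with a minimal NK-style landscape in the spirit of Kauffman; the parameters (N=6, K=2), the random lookup tables and the bit-string encoding below are our own illustrative assumptions, not taken from Kauffman, Macready or Lissack.

```python
import itertools
import random

# Toy NK-style landscape: N binary traits, each trait's fitness contribution
# depends on itself and K circular neighbours, via a random lookup table.
random.seed(1)
N, K = 6, 2
tables = [{} for _ in range(N)]

def fitness(genome):
    # Mean of per-trait contributions; tables are filled lazily but cached,
    # so every genome always maps to the same fitness value.
    total = 0.0
    for i in range(N):
        key = tuple(genome[(i + j) % N] for j in range(K + 1))
        if key not in tables[i]:
            tables[i][key] = random.random()
        total += tables[i][key]
    return total / N

def hill_climb(genome):
    # Local, incremental search: flip one trait at a time, accept improvements,
    # stop at the first configuration with no better one-flip neighbour.
    while True:
        best, best_f = genome, fitness(genome)
        for i in range(N):
            g = list(genome)
            g[i] = 1 - g[i]
            f = fitness(tuple(g))
            if f > best_f:
                best, best_f = tuple(g), f
        if best == genome:
            return genome, best_f
        genome = best

local, local_f = hill_climb(tuple(random.choice([0, 1]) for _ in range(N)))
global_f = max(fitness(g) for g in itertools.product([0, 1], repeat=N))
print(local_f <= global_f + 1e-9)   # a local peak never exceeds the global one
```

On rugged landscapes (larger K), such one-flip search typically stalls on a local peak below the global maximum, which is the mathematical sense of "local and incremental rather than global and sweeping".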
A major portion of Lissack's argument is first devoted to clearing up the common misconception that a strange attractor is a thing. An attractor, in the mathematical sense, defines a solution basin, a place where solutions to particular problems or iterated equations are likely to occur. This is what is important from a business sense, and where the attractor metaphor has value in relation to the concept of the fitness landscape. In this case, an attractor (chaotic or regular) is most important for its passive nature: it pulls participants on the fitness landscape into particular solution basins. A simple example of this is the "network" effect visible in the growth of the internet. In a networked commerce system, the value of the network is proportional to the number of participants in the network.20
7.0 Conclusion: Strategic Evolution In this sense, Lissack argues that modern strategy must become something very different from the earlier models which relied on control and control mechanisms.
19 Stuart Kauffman, At Home in the Universe, Oxford University Press, 1995; also Kauffman, S. and Macready, W., "Technological Evolution and Adaptive Organizations", Complexity, Vol. 1, No. 2, 26-43.
20 See Hal Varian and Carl Shapiro, "Information Rules: A Strategic Guide to the Network Economy" (Harvard Business School Press, 1998).
He argues that not only are strategy and control different but that their relationship must change as well. Under such conditions, he argues, strategy is as much an attempt to understand control (hence the earlier notion of "sussing out the market") as it is to exercise control, noting:21 "Thus in a complex world, strategy is a set of processes for monitoring the behaviors of both the world and of the agents of the organization, observing where attractors are and attempting to supply resources and incentives for future moves. Command and control are impossible (at least in the aggregate), but the manager does retain the ability to influence the shape of the fitness landscape." Lissack fine-tunes this analysis by citing Kauffman's maxim that "adaptive organizations need to develop flexible internal structures that optimize learning. That flexibility can be achieved, in part, by structures, internal boundaries and incentives that allow some of the constraints to be ignored some of the time. Properly done, such flexibility may help achieve higher peaks on fixed landscapes and optimize tracking on a deforming landscape."22 Both Kauffman and Lissack propose a variety of techniques, such as "simulated annealing" and "patches", for moving firms off local minima and getting companies to higher maxima on both static and deforming fitness landscapes, although the scope of this paper is too brief to consider these approaches in detail. Suffice it to say that chaos and complexity theory have not only expanded our view of the strategy process and moved the consideration of corporate strategy from statics to dynamics, but have also opened up an entirely new range of strategic possibilities for responding to the uncertainties which are an intrinsic part of the strategy process.
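The idea behind simulated annealing, one of the techniques named above, can be sketched briefly: early in the search, fitness-decreasing moves are sometimes accepted, letting a searcher escape a local peak before the "temperature" cools. The one-dimensional toy landscape, cooling schedule and parameter values below are our own illustrative assumptions, not drawn from Kauffman or Lissack.

```python
import math
import random

# A toy rugged landscape over 60 discrete "positions" (e.g. strategy variants).
random.seed(7)
landscape = {x: math.sin(x / 3.0) + 0.4 * math.sin(2.1 * x) for x in range(60)}

def anneal(start, temp=1.0, cooling=0.95, steps=400):
    x, best = start, start
    for _ in range(steps):
        nxt = max(0, min(59, x + random.choice([-1, 1])))   # local move
        delta = landscape[nxt] - landscape[x]
        # Always accept improvements; accept losses with probability
        # exp(delta / temp), which shrinks as the temperature cools.
        if delta >= 0 or random.random() < math.exp(delta / temp):
            x = nxt
        if landscape[x] > landscape[best]:
            best = x
        temp *= cooling
    return best

peak = anneal(start=0)
print(landscape[peak] >= landscape[0])
```

Early high-temperature acceptance of downhill moves is what distinguishes this from the pure hill-climbing that gets stuck on the nearest local peak.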
In our own work at Southern New Hampshire University, we have employed these kinds of complex adaptive systems research tools to study technology succession and strategies for dealing with complex market adaptation and change. While this work is too lengthy to cite in detail here, it can be found at length both in the on-line proceedings of the 5th, 6th and 7th International Conferences on Complex Systems and in InterJournal, a publication of the New England Complex Systems Institute.23
21 Michael R. Lissack, "Chaos and Complexity: What does that have to do with management?", Working Papers, Henley Management College, U.K. (2000).
22 Ibid. No. 26
23 See Fellman, Post, Wright and Dasari, "Adaptation and Coevolution on an Emergent Global Competitive Landscape", InterJournal Complex Systems, 1001, 2004; Mertz, Groothuis and Fellman, "Dynamic Modeling of New Technology Succession: Projecting the Impact of Macro Events and Micro Behaviors on Software Market Cycles", InterJournal Complex Systems, 1702, 2006; and Mertz, Fellman, Groothuis and Wright, "Multi-Agent Systems Modeling of Technology Succession with Short Cycles", on-line proceedings of the 7th International Conference on Complex Systems, http://necsi.org/events/iccs7/viewpaper.php?id=41
Chapter 19
Towards an evaluation framework for complex social systems Diane M. McDonald1,2 and Nigel Kay2 Information Resources Directorate1, Computer and Information Sciences2 University of Strathclyde [email protected], [email protected]
1. Managing and evaluating complex systems
While there is growing realisation that the world in which we live is highly complex, with multiple interdependencies, and irreducibly open to outside influence, how to make these 'systems' more manageable is still a significant outstanding issue. As Bar-Yam (2004) suggests, applying the theoretical principles of complex systems may help solve complex problems in this complex world. While Bar-Yam provides examples of forward-thinking organisations which have begun to see the relevance of complex systems principles, for many organisations the language and concepts of complexity science, such as self-organisation and unpredictability, while making theoretical sense, offer no practical or acceptable method of implementation to those more familiar with definitive facts and classical hierarchical, deterministic approaches to control. Complexity science explains why designed systems or interventions may not function as anticipated in differing environments, without providing a silver bullet which enables control or engineering of the system to ensure the desired results. One familiar process
which might, if implemented with complex systems in mind, provide the basis of an accessible and understandable framework that enables policy makers and practitioners to better design and manage complex socio-technical systems is that of evaluation. Evaluation approaches, according to Stufflebeam (2001), have been driven by the need to show accountability to funders and policy makers examining excellence and value for money in programmes, and more recently have been used internally within organisations to ensure quality, competitiveness and equity in service delivery. Another often used broad categorisation is that of requirement analysis, milestone achievement and impact analysis. Evaluation is therefore a collective term covering differing objectives, applications and methods. The UK Evaluation Society (www.evaluation.org.uk) defines evaluation as "[a]n in-depth study which takes place at a discrete point in time, and in which recognised research procedures are used in a systematic and analytically defensible fashion to form a judgement on the value of an intervention". Others, however, take a less time-dependent approach. For example, on Wikipedia evaluation is defined as "the systematic determination of merit, worth, and significance of something or someone". Evaluation is distinct from assessment, which measures and assesses systems, be it learning, research or development, without making any value judgement. Other terms often applied to designed systems are validation, which measures 'fitness for purpose', and verification, which measures the fitness or correctness of the system itself. Within the context of this paper, we take evaluation to mean the measurement and assessment of the state of a system, in order to form a judgement on the 'value' or 'appropriateness' of the actions being taken by the system. How these different types of evaluation are implemented in practice depends on the specific context and objectives.
Increasingly a mix-and-match approach is adopted, with appropriate evaluation tools being selected as required. See Stufflebeam (2001) for programme evaluation techniques and LTDI (1998) for general evaluation methods; House (1978) and Stufflebeam & Webster (1980) compare different methods. Evaluation of complex systems is of course not new: educational systems, social initiatives and government interventions are complex social systems where effective evaluation is seen as a key process in measuring success. More recently, there has been an increasing recognition that for evaluation of complex social systems to be effective, the evaluation process must take into account the theoretical understanding of complex systems. For example, Eoyang and Berkas (1998) suggest, "[t]o be effective, however, an evaluation program must match the dynamics of the system to which it is applied." Sproles (2000) emphasises that most systems that are part of a 'test & evaluation' process are actually socio-technical systems, and he therefore argues that qualitative data collection and analysis must play an important role. Practical examples also incorporate complexity theory into the design of evaluation programmes (Flynn Research 2003; Webb & Lettice 2005). Classical evaluation is a series of evidence-based value judgements in time. Evaluation determines the merit of a programme or particular artefact or process related to specific stakeholder requirements. It does not necessarily follow that an evaluated system obtains its desired goals or capitalises on its inherent creativity. This leads to the research question: can an embedded evaluation system be used as a generalisable framework to improve the success of complex social systems? Achieving this suggests a non-classical approach to evaluation, where the evaluation forms part of a holistic complex feedback and adaptation framework. Such an approach would require a number of issues to be addressed.
For example, in practice, evaluation within any complex social system tends not to be approached holistically, as individual evaluations are for differing stakeholders; programme-level evaluations tend to be distinct from internal evaluations of specific software or processes. While all stakeholder evaluations may show satisfaction, this does not necessarily enable detection or realisation of system potential. Also, the multiple contexts within many complex social systems mean that evaluations in one context do not necessarily transfer to another. And, importantly, evaluation, like any measurement, changes the system it is evaluating. This paper reports on a preliminary investigation into the requirements and issues surrounding the applicability of an embedded evaluation framework as a practitioner- and policy-maker-friendly framework for managing complex social systems. The case study methodology used is outlined in section 2. An analysis of the characteristics of evaluation found to date is provided in section 3. In section 4, a conceptual model for exploring evaluation as a tool for managing complex social systems is presented and discussed. This highlights concepts, processes and issues that require further detailed examination if a generalisable evaluation framework for improving the success of complex social systems is to be developed. The paper concludes in section 5 by summarising findings and identifying novelty and future research steps.
2. Overview of methodology
To investigate the feasibility of developing an embedded evaluation framework and to develop a conceptual model, a preliminary exploratory study was undertaken. This study examined common themes and differences in evaluation practice across three complex social systems. These case studies were selectively chosen to provide information-rich cases that might inform further detailed research. In particular, cases which were using evaluation in a core, unusual or innovative way were chosen. The cases introduced here, criminal evidence systems and learning communities, are briefly summarised in section 3.1 below. We also draw on some initial anecdotal evidence from innovation systems, which are also being investigated as part of this work; this part of the study is not sufficiently advanced to merit case study status. The case studies themselves were carried out using a semi-structured interview technique, supplemented and triangulated using additional documentation relevant to the cases. The interview questions were developed through a literature survey and synthesis, and were designed to explore the specification of needs and identification of issues for a new framework approach, as well as how evaluation was currently used. Respondents were selected from practitioners and policy makers, as this is the target audience for the research. While the work is ongoing, data from two respondents per case study, as well as secondary documentation, is drawn upon to present preliminary observations.
3. Analysis of the characteristics of evaluation in the case studies
3.1 Overview of the cases studied
3.1.1 Criminal evidence system
In the criminal evidence system, the processes, actors and environment involved from the initial crime scene investigation through the evidence analysis to the presentation of evidence in court proceedings were investigated. The main case used was the Scottish criminal evidence system, which is underpinned by Scots Law; this is based on Roman law, combining features of codified civil law with elements of uncodified common law. As the nature of Scots Law (it is an adversarial system) dictates many of the system processes, the implications of an inquisitorial-based system were also considered. A number of actors are involved in the various stages: police, scene of crime officers, forensic scientists (defence and prosecution), lawyers (defence and prosecution), Procurator Fiscal, judge, defendant and jury.
3.1.2 Learning communities
Learning communities provided a second case study. These educational systems include learners, tutors or mentors, and development staff covering both educational and technology development. As a focus, the learning communities surrounding the DIDET project (www.didet.ac.uk), which spanned both UK and US universities, and the Transactional Learning Environment project (technologies.law.strath.ac.uk/tle2) were examined. In each of the projects, both the novelty of the learning interventions and the project- or transaction-based learning scenarios meant that innovative assessment and evaluation techniques were required. Data from more traditional learning scenarios were used to supplement and compare findings.
3.2 Micro and macro evaluation
A variation in the 'level' at which evaluation took place was evident across the case studies. Within the criminal evidence system, evaluation was core to the daily activity of the various professionals involved. For example, police officers normally evaluate finds and other facts at the scene to direct what is to be collected, rather than instigating a blanket collection. Similarly, forensic scientists must weigh up the possibilities that the evidence collected could have been generated through competing scenarios, and finally the jury must evaluate the evidence as presented by the various experts. Evaluation is at the core of the criminal evidence system. Similarly, within innovation systems evaluation is part of the day-to-day processes of creating and developing new products or processes. In learning communities, traditionally assessment rather than evaluation is much more prominent at the individual learner (micro) level. Evaluation tends to take place early in and at the end of learning programmes, to assess the success of the learning intervention and adapt it as required. Increasingly, however, an element of learner evaluation is being introduced as part of self-reflection practice within learning communities. Macro-level evaluation of the system as a whole was also important in both the innovation system and the learning communities. Macro-level evaluation of interventions or programmes was, however, in general independent of the micro-level evaluations, although in qualitative evaluation of learning there was some use of micro-level reflections of both learners and practitioners.
3.3 The role of context
Two significant issues arose from varying contexts within the case studies. Within the innovation system, a sudden increase in the number of patent applications, one of the innovation measures, was observed. However, to observers on the ground, the overall innovation system did not appear to have dramatically changed. Further investigation revealed that patenting had dramatically increased because a separate initiative had spare money, which it used to help organisations take out patents. The system as a whole was not intrinsically more innovative as a result, although the information available had increased. The problem of differing contexts was also an issue in the learning community case study. While the supporting technology had been highly successful within the learning context, its design had followed an unorthodox approach. Concern regarding how external peers within the technology development community might judge the developments, initially at least, restricted the ability to evaluate the technological tool. Thus, as with complex systems in general, the 'measures' can be influenced by different contexts and uses within or outwith the complex system. While evaluation can and perhaps should be ongoing, due to the dynamic nature of complex social systems any measurement and judgement is only relevant at a specific point in time.
3.4 Complexity
The complexity of the national innovation system gave rise to issues. For example, while evaluation of the various initiatives was always required by the funders and policy makers, a holistic approach was often missing: how different initiatives affected one another, and how the lessons learnt from current evaluations could be fed back into other ongoing initiatives, was rarely considered. Complexity was also a significant issue within the criminal evidence system, with much difficulty in identifying single linear cause-and-effect relations. For example, it is not uncommon for there to be several competing scenarios as to how particular forensic evidence came to be present. Similarly, in one of the learning communities, lack of uptake of a particular technology was not only due to lack of student eLiteracy training, as evaluation first indicated; subsequent evaluations showed that staff eLiteracy issues were also a strong contributing factor. So while evidence was found to substantiate the claim that evaluation did have interdependencies, feedback between micro and macro evaluation was in general lacking. Without such feedback, the potential for emergence is extremely limited.
4 Designing an evaluation system for complex social systems
4.1 Conceptual model for evaluation & management of complex social systems
[Figure 1 is a concept map: 'evaluation as a success framework' suggests 'framework components' and requires 'complex dynamics', which leads to 'co-evolution', 'risk & experimentation' and the question 'is evaluation a complex system?']
Figure 1: Conceptual model for investigating management of complex social systems using evaluation
From the analysis of the exploratory case studies and synthesis from the literature review, the conceptual model for exploring the design and management of evaluation systems for the management of complex social systems, illustrated in Figure 1 above, was developed. The main features of the conceptual model are discussed below.
4.2 Embedded, co-evolving evaluation framework with feedback
The issues surrounding innovation system evaluation highlighted in section 3.4 illustrate the need for embedding the evaluation framework within the complex system itself; programme-end evaluation cannot improve current programmes. Embedding, however, means more than ensuring ongoing evaluation activity. The evaluation framework needs to be able to detect changes and new system potential. The evaluation must also remain relevant and timely; as the system changes through experimentation, the evaluation must also adapt, since the evaluation measures may no longer reasonably represent the desired characteristic, as the patent example of section 3.3 illustrated. For evaluation to be an effective management tool, the results should be fed back to inform general system behaviour and specific interventions. While this means that evaluation will change the system, the converse should also hold: system changes require changes in the evaluation criteria and practices. The evaluation system should co-evolve with its complex social system. Applying complex systems thinking to evaluation also suggests the potential advantages of linking micro- and macro-level evaluation feedback. While such feedback helps build buy-in and a sense that individuals can make a difference, from a complex systems perspective it is such micro-macro linkages that lead to novelty: the emergence of sustainable new patterns of activity (McDonald & Weir 2006). The potential of micro-macro feedback requires further exploration.
4.3 Evaluation history and critical evaluation point identification
The case studies highlighted the run-up to new funding opportunities or new development phases as critical evaluation points from the practitioner and policy-maker perspective. Such points tend to be snapshots in time and do not necessarily identify developing patterns. For a fuller picture, evaluation needs to be ongoing. Continuous evaluation, however, is both a large overhead and potentially counterproductive. For example, one of the case studies reported evaluation skewing the development: the evaluation schedule drove development rather than the actual system requirements. Additionally, participating in evaluation was viewed as time consuming. The richest feedback was obtained when participants understood how the information being gathered would be fed back to improve the system. While impact-type evaluations aim to provide reassurance of quality and of meeting objectives, it is where evaluation outcomes differ from expectations that the future direction of the complex system is indicated. It is at these points that either corrective intervention is required, in the case of undesired behaviour, or new potentials may be capitalised upon. Identification of these bifurcation points, however, requires knowledge of existing patterns. The sharing, often informal, of reflective practice of participants' own experiences was viewed as an effective way to identify emerging phenomena and patterns. Such a second layer of communal reflection may help minimise personal biases, enabling patterns unseen by one participant to be identified by another due to differing perspectives. The use of historical evaluation records for pattern identification and scenario mapping also requires further investigation. Thus, while change processes often dictate evaluation points, a second layer of reflective evaluation potentially offers a more complete picture, enabling critical points to be identified and interventions designed. A framework which supports the feedback of these evaluations in a timely manner is critical. Further work is required to help identify the parameters associated with optimal evaluation and feedback points.
4.4 Risk and experimentation
Complex adaptive systems by nature 'experiment', with various adaptations being selected over time. While such systems can be highly creative, this is not always desirable. Successful management of systems often involves a trade-off between experimentation and the risks of undesirable behaviour. Within the innovation system, a certain amount of risk was viewed as highly desirable, but this needed to be balanced against overall productivity. Within the criminal justice system, with its need to provide detailed forensic evidence, the scope for experimentation was extremely limited. Experiments, such as new forensic techniques, were usually thoroughly tested outside criminal proceedings before they were accepted. Adaptation and variation do take place, however, with new combinations of techniques being used, which must 'compete' against the other side's methods. This suggests the possibility of developing an evaluation framework based on different classes of risk-experimentation trade-off. Classifications of a system may also vary depending on conditions. For example, in the criminal investigation case study, it was reported that the normal 'rules of the game' may change in extremely distressing, high-profile cases such as child murders. Suddenly, the realisation that the evidence will be reported to the child's parents greatly increases the need to minimise risk. This suggests a dynamic trade-off between risk and experimentation. The potential of a classification based on risk-experimentation trade-off, which takes into account system purpose, constraints, contexts and dynamics, requires further investigation.
4.5 Discussion
As Wang et al (2004) observe (in relation to learning environments), the complex nature of artefacts is a "threat to classically ideal evaluation design where all the variables are well controlled except the [facility under evaluation]". But the interactions that we have identified here hint at another important point worth considering: is an evaluation system designed with complexity in mind in fact a complex system itself? If so, this leads to the possibility that self-organisation and emergence could occur within the evaluation system. Such self-organisation and emergence may not follow the desired path, although the coupling created by embedding within the social system itself may provide the guidance required. This is analogous to Ottino's (2004) suggestion of "intelligently guid[ing] systems that design themselves in an intelligent manner". Similarly, the unpredictability we are trying to address may itself arise in the evaluation system, but coupling may keep it pointed in the right direction. Detailed research on the true dynamics of complex evaluation systems is required.
5 Summary, novelty and next steps
In this paper, we have presented insights from a preliminary investigation of evaluation in complex social systems, which examined correlations and differences between three differing complex social systems. The aim of this study was to develop a conceptual model of evaluation and control of complex socio-technical systems which could then be used for further detailed study, ultimately to produce a policy- and practitioner-relevant evaluation and control framework for complex social systems. The preliminary insights derived were: (i) an embedded evaluation system is required which co-evolves with its complex social system, coupling micro- and macro-level evaluations; (ii) evaluation may be most effective when there is feedback between micro and macro evaluation; (iii) context varies both within and outwith the evaluation system, perturbing the evaluation; and (iv) different classes of evaluation system may be appropriate to deal with trade-offs between purpose, constraints and experimentation. The novelty of this work lies in the setting of a future research agenda for exploring evaluation as a success mechanism for complex systems, and the identification of the additional issues which arise when evaluation is applied within a complex systems framework. The next steps are to undertake a detailed investigation based on the conceptual model developed (Figure 1), identify critical evaluation points and explore the complex dynamics of evaluation. This will ultimately lead to a policy-maker and practitioner friendly complex evaluation and management framework for complex social systems.
References
Bar-Yam, Y., 2004, Making Things Work: Solving Complex Problems in a Complex World, NECSI, Knowledge Press.
Eoyang, G.H. and Berkas, T.R., 1998, Evaluating performance in a CAS.
Flynn Research, 2003, Complexity Science: A Conceptual Framework for Making Connections, Denver, Flynn Research.
House, E.R., 1978, Assumptions underlying evaluation models, Educational Researcher, Vol 7, num 3, pp. 4-12.
LTDI, 1998, Evaluation Cookbook, Learning Technology Dissemination Initiative, http://www.icbl.hw.ac.uk/ltdi/cookbook/cookbook.pdf (accessed 14/08/2006).
McDonald, D.M. and Weir, G.R.S., 2006, Developing a conceptual model for exploring emergence, ICCS 2006 (Boston).
Ottino, J.M., 2004, Engineering complex systems, Nature, Vol 427, p. 389.
Sproles, N., 2000, Complex systems, soft science, and test & evaluation, or 'Real men don't collect soft data', Proceedings of the SETE2000 Conference.
Stufflebeam, D.L., 2001, Evaluation models, New Directions for Evaluation, num 89.
Stufflebeam, D.L. and Webster, W.J., 1980, An analysis of alternative approaches to evaluation, Educational Evaluation and Policy Analysis, Vol 2, num 3, pp. 5-19.
Wang, H.-C., Li, T.-Y. and Chang, C.-Y., 2004, Analyzing empirical evaluation of advanced learning environments: complex systems and confounding factors, IEEE Learning Technology Newsletter, Vol 6, num 4, pp. 39-41.
Webb, C. and Lettice, F., 2005, Performance measurement, intangibles, and six complexity science principles, International Conference of Manufacturing Research, Cranfield.
Chapter 20
Operational Synchronization
Kevin Brandt
MITRE Corporation
[email protected]
Complex systems incorporate many elements, links, and actions. OpSync describes adaptive control techniques within complex systems to stimulate coherent synchronization. This approach fuses concepts from complexity theory, network theory, and non-cooperative game theory.
1.0 Coherent Synchronization in Complex Systems
This paper defines coherent synchronization as "the relative and absolute sequencing and adaptive re-sequencing of relevant actions or objects in time and space and their alignment with intent, objectives, or policy in a complex, dynamic environment."
1.1 Emergent Behavior
Complex systems exhibit an array of behaviors. A wide range may emerge from simple interactions of system components. Emergent behavior can be simple, complex, chaotic, or random. Many assume that only desired behaviors will emerge: a fallacy. Most operational concepts recognize the need for synchronization of actions. Given improved information flows and shared situational awareness, some concepts assert that sync always emerges. Advocates cite examples of coherent behavior in nature. These assertions of axiomatic emergent coherent behavior err in their use of inductive arguments, given abundant counter-examples [e.g., Strogatz].
Synchronization is one type of coherent behavior. Coherence may emerge from direct or indirect interactions among components, but external factors may also induce it. Emergent synchronization of coupled oscillators is possible, like fireflies flashing or clocks ticking in unison, and in special cases it is certain [Strogatz]. Likewise, external action may yield coherent behavior, as an electric current induces a magnetic field. In complex systems, elements may align some attributes while other properties remain disordered. In these cases, the order is less apparent. Moreover, some elements may remain out of sync. Thus, synchronization is not universal, and emergent sync in complex systems is not certain. [Manrubia] Complexity studies conclude, "Individual processes in different parts of a system [must be] coordinated [to enable] the system to display coherent performance." [Manrubia] However, central coordination is neither necessary nor sufficient for coherent behavior. Yet complex systems with coherent behavior exhibit organized complexity.
1.2 Organized Complexity
In the development of information theory, Charles Bennett identified two components of organized complexity. First, algorithmic complexity describes the minimal model, formula, program, or elements needed to support a result. Second, logical depth captures the processes needed to generate the result given the initial minimal program [Davies, 1992]. Complex systems exhibiting great logical depth, like synchronization, may arise after many cycles through the minimal program and thus appear to "emerge over time." However, the same logical depth derives more quickly by starting with complex formulae containing information otherwise reconstructed. Mapping the cumulative interactions of independent agents to cycles of minimal programs suggests that, since emergent sync is not a certain outcome, starting with increased algorithmic complexity might be necessary to generate the behavior.
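Bennett's distinction can be made concrete with a toy illustration (ours, not from the paper): a one-line cellular automaton update rule has very low algorithmic complexity, yet the structure of its output only appears after many cycles of the rule, which is what logical depth captures. The rule, grid size, and cycle count below are arbitrary choices for the example.

```python
def rule30_step(cells):
    """One cycle of the minimal program: new cell = left XOR (centre OR right),
    applied on a ring of cells (Wolfram's Rule 30)."""
    n = len(cells)
    return [cells[i - 1] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]

cells = [0] * 31
cells[15] = 1                  # trivially simple initial condition
for _ in range(100):           # many cycles: this is where the "depth" accrues
    cells = rule30_step(cells)
print("".join(str(c) for c in cells))   # a disordered-looking row from a tiny rule
```

The rule fits in a line, but reproducing the final row without running the cycles would require storing essentially the row itself: low algorithmic complexity, high logical depth.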
1.3 Emergent Synchronization
Emergent synchronization in simple systems evolves after many cycles in a minimal program. For example, Thai fireflies developed an adaptive response to external flashes via genetic selection; the firefly adjusts the timing of its signal to achieve sync [Strogatz]. These synchronized oscillators demonstrate four elements: a coupling agent (signal), an adjustment process (minimal program), a feedback or selection process (fitness), and computational cycles (evolutionary time). Emergent synchronization in complex systems exhibits these elements; all of them pose challenges for control. Static guidance (intent) does not provide a dynamic coupling agent. Since regional actions must synchronize globally distributed elements operating over vastly different time horizons, simple signals may not suffice. The activities of elements are complex, not simple threshold reactions like those evidenced in nature. Hence, a simple adjustment process may not suffice. Selection processes that operate for biological ecosystems are not available (mating behaviors) or desirable (predator-prey) in most control systems. Moreover, the time required for an emergent selection process (biological computation) is not sufficiently responsive. The implications are stark: coherent synchronization will not rapidly emerge and adapt without a foundation, and emergent self-sync does not provide the requisite capability reliably. Studies in other domains support this conclusion with the observation that "when adaptive speed is warranted, initial organizational structure is critical." [Manrubia] "It's a basic principle: Structure always affects function." [Strogatz] Operational Synchronization increases algorithmic complexity to provide sufficient structure.
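The four elements above map directly onto the standard Kuramoto model of coupled oscillators, which underlies the firefly example: the mean field is the coupling agent, the phase update is the adjustment process, the coherence measure provides feedback, and the loop supplies the computational cycles. The sketch below is illustrative; the oscillator count, coupling strength `K`, and frequency spread are our assumptions, not values from the paper.

```python
import math
import random

def kuramoto_step(phases, omegas, K, dt):
    """One Euler step of the Kuramoto model: each oscillator nudges its phase
    toward the rest of the population (the coupling agent)."""
    n = len(phases)
    return [
        (phases[i] + dt * (omegas[i]
            + K * sum(math.sin(p - phases[i]) for p in phases) / n)) % (2 * math.pi)
        for i in range(n)
    ]

def order_parameter(phases):
    """Coherence r in [0, 1]: near 0 = incoherent, near 1 = synchronized."""
    n = len(phases)
    re = sum(math.cos(p) for p in phases) / n
    im = sum(math.sin(p) for p in phases) / n
    return math.hypot(re, im)

random.seed(1)
n = 50
phases = [random.uniform(0, 2 * math.pi) for _ in range(n)]   # disordered start
omegas = [random.gauss(1.0, 0.05) for _ in range(n)]          # similar natural rates

r_start = order_parameter(phases)
for _ in range(2000):                                         # computational cycles
    phases = kuramoto_step(phases, omegas, K=2.0, dt=0.05)
r_end = order_parameter(phases)
print(round(r_start, 2), round(r_end, 2))                     # coherence rises
```

With coupling well above the critical strength and nearly identical natural frequencies, coherence emerges reliably; weaken the coupling or widen the frequency spread and it does not, which is precisely the text's point that emergent sync is not a certain outcome.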
2.0 Operational Synchronization
2.1 Threaded Synchronization
A software experiment developed and demonstrated a feasible approach to adaptive sync. It yielded a structural model embodied in the experimental Synchronization, Adaptation, Coordination, and Assessment (SACA) tool. The structure and performance provide direct evidence of the feasibility of a structured approach to coherent sync. A foundation for adaptive, coherent sync evolved during spiral development. This framework extends beyond time to align a wealth of activities along three axes:
• Vertical (motivation) - cascading objectives or policies
• Horizontal (association) - interrelationships between objectives or objects
• Temporal-spatial (location) - relative scheduling of activities
The vertical axis aligns missions and actions with objectives, end states, policies, and needs. These links correspond to mission threads expanded to include intent, constraints, priorities, limits, and resources. In the information technology domain, they might correspond to information services aligned to business services supporting business processes. Initial branching of these threads leads to a traditional decision-tree structure tied to cascading objectives, goals, and tasks [Keeney]. However, complex relationships between successive and supportive objectives force the use of a network of nodes and links. This graph structure proves more robust and allows multiple links to merge and diverge at successive levels. It eliminates the assumption of vertical linearity of cascading objectives or horizontal independence between objectives. Finally, it retains lineage and exposes impacts of actions with complex, multiple inheritance. The horizontal axis links nodes of related activities: a cluster of activity nodes. These core clusters enable the dynamic, adaptive use of assets as envisioned by the network-centric warfare concept [Cares], or group composable information technology services to deliver business services that meet functional needs.
Horizontal links provide structure to fuse and align activities that precede, support, reinforce, or follow primary actions. They capture inter-dependencies between objectives at the same level of organization or between subsequent activities. These connections represent physical or logical ties between elements, independent of the subsequent scheduling of activities: a precursor rather than a result of the derived schedule. Activities may extend across domains and incorporate all available resources. Moreover, they may incorporate Boolean logic.
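The vertical and horizontal structure described above can be sketched as a small graph (a hypothetical illustration with invented node names and API, not SACA's implementation): objectives, goals, and actions form a network rather than a strict tree, so one action may inherit from several goals while its lineage up the vertical axis remains traceable.

```python
class Node:
    """A node in the objective network. Unlike a decision tree, a node may
    support several parents (multiple inheritance) while lineage stays
    traceable up the vertical axis."""
    def __init__(self, name, level):
        self.name, self.level = name, level
        self.parents = []        # vertical links (motivation)
        self.peers = set()       # horizontal links (association)

    def link_up(self, parent):
        self.parents.append(parent)

    def lineage(self):
        """Names of every higher-level node this one ultimately supports."""
        seen, stack = set(), list(self.parents)
        while stack:
            p = stack.pop()
            if p.name not in seen:
                seen.add(p.name)
                stack.extend(p.parents)
        return seen

# Hypothetical thread: one action supports two goals under one objective.
secure_region = Node("secure region", "objective")
deter = Node("deter aggression", "goal")
reassure = Node("reassure allies", "goal")
patrol = Node("air patrol", "action")

deter.link_up(secure_region)
reassure.link_up(secure_region)
patrol.link_up(deter)
patrol.link_up(reassure)               # links merge: multiple inheritance
patrol.peers.add("tanker support")     # horizontal association

print(sorted(patrol.lineage()))
```

Because `lineage` walks all parents, the impact of rescheduling the patrol is exposed against both goals and the top objective, which is the traceability the thread structure is meant to retain.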
In military operations, horizontal links represent a group of aircraft containing strike aircraft, reconnaissance aircraft, unmanned vehicles, electronic warfare assets, and control aircraft. On the ground, similar relationships bind a maneuver force, its screening force, and supporting fires. In the commercial sector, horizontal sets may
include regional flights into a hub that feeds long-haul air routes.
The third axis, location, encompasses space and time. This axis schedules blocks or sets of activities sequenced and emplaced in space-time: action frames. The organizational separation between the previous horizontal axis (associations) and this sequencing is critical: it enables adaptive scheduling and dynamic rescheduling. Diverging from a traditional command-directed approach (a command economy), the third axis envisions, but does not require, the formulation of a marketplace of options that combine and recombine with others in the selection and scheduling process: a free market of "services". This approach expands flexibility beyond a decision-tree structure into a network of rich, dynamic arrays of options. This richness provides adaptive depth and robustness. The experimental synchronization tool, SACA, established the feasibility of this threaded sync process (Figure 1). It developed and used a constrained, sorting genetic algorithm (GA) to construct and select action frames. SACA demonstrated the practicality of structured sync by constructing hundreds of alternative action frames that incorporate activities across all elements of the marketplace of national power and are ALL equally feasible (but not all equally desirable). [Ritter]
• Threaded synchronization structures a complicated problem in many variables.
• A genetic algorithm (GA) builds mixes of activities and schedules that meet established constraints.
• Keeps the best (non-dominated) solutions; these are all feasible solutions that meet identified constraints AND where no objective can be improved without making one or more other objectives worse (Pareto optimal).
Figure 1: Threaded Synchronization
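The "keep the non-dominated solutions" step can be sketched in a few lines (an illustrative sketch, not SACA's code; the `(cost, duration)` scoring of candidate schedules is an assumption for the example):

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly better
    in at least one (objectives to be minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    """Keep only the non-dominated solutions, as the constrained-sorting GA does."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t is not s)]

# Toy candidate schedules scored on (cost, duration), both minimised.
candidates = [(4, 1), (3, 3), (1, 5), (2, 4), (5, 5), (3, 2)]
print(pareto_front(candidates))
```

Here (3, 3) and (5, 5) are dominated and drop out, while the remaining schedules form the Pareto front: each trades cost against duration, and none can improve one objective without worsening the other. A real GA applies this filter generation after generation over constructed action frames.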
Each action frame thus represents an option, a branch, a course of action, or a segment of a larger, connected, continuous storyboard. If one computes the number of options in an unstructured set of activity nodes, one quickly reaches a point where the number of options is intractably large. By structuring the activities in objective-goal-action threads and bounding the range of options to feasible and desirable regions, the constrained-sorting genetic algorithm used for scheduling quickly converges on optimal solutions (if any exist). This approach does NOT generate a single point optimum; it returns the Pareto optimal solutions. One such action frame could encompass an Air Tasking Order. In entertainment, a set might sequence different segments of a film into alternate storylines. In the commercial realm, these blocks could represent a market portfolio of stocks or commercial transportation assets (air, land, or sea). The performance of the software using this approach exceeded initial expectations. The process developed Pareto optimal solution sets for 10,000 activities within 15 minutes. A dynamic re-planning test held most activities constant and forced 300 activities into adaptive rescheduling; the computation converged in 15 seconds. These initial tests of the tool used a 1.5 GHz single-processor Linux workstation. Retested on a 12-node cluster, the runtime for 10,000 actions dropped to under a minute. The implications reach beyond the rapid production of activity schedules to the core processes of course-of-action development and decision analysis. SACA enabled dynamic synchronization for tactical actions. However, threaded synchronization alone falls short of providing the same capability to leaders at operational and strategic levels. They need a viable process that enables coherent adaptation of the action frames, to sequence and shape subordinate actions and supporting plans and operations. Operational Synchronization (OpSync) evolved from threaded synchronization to meet this need.
2.2 Operational Synchronization
Threaded synchronization produces a wealth of independent action frames. Combining and sequencing action frames on a storyboard produces strategic alternatives. Each path embodies a coherent, feasible campaign plan with branches and junctions. However, simply having more alternatives does not provide adaptation if each path is rigid. Hence, branches in paths may spawn alternative paths; separate routes may merge to share a series of action frames before splitting apart again; and completely separate paths may use unique action frames. However, continuity is required; paths may change direction or terminate but discontinuous leaps between pathways cannot occur. The reason for this constraint ties back to the thread that provides resource, action, and intent traceability. While this framework is new, the results are familiar. OpSync is a manifestation of an idealized war game. For example, in the classic movie "WarGames," [Badham] the WOPR computer simultaneously constructs [in OpSync terms] multiple action frames along different paths to explore and evaluate alternative war scenarios and develop a winning strategy. The alternate strategies represented by these paths need not be concurrent (synchronous in time) except (possibly) at junctions of two or more paths. As an example, one approach might embody mostly economic actions that lead to the goal over years while a diplomatic approach might reach the same goal in days. It should be evident that the concept permits the concurrent pursuit of multiple strategies in a single, coherent framework.
Thus, operational synchronization extends beyond threaded synchronization to align activities concurrently along four axes.
The SACA experiment developed a conceptual architecture to use the tool in a control system. Details are beyond the scope of this paper, but the collaborative structure and control feedback envisioned hold promise.
3.1
Modulated Self-Synchronization
In a push to empower users, a network of OpSync hubs could produce and transmit a global sync signal. Akin to Global Positioning System signals or cellular networks, distributed users could receive wireless OpSync signals on a personal handset or OpSync watch - a logical extension of today's wristwatches [Clark]. These devices would provide time, location, orientation within current frames, and alignment on derivative paths. Armed with this data, users could conform their actions to an adaptive execution scheme and input local status and intended actions for integration into OpSync action frames. In this vein, modulated self-synchronization is practicable.
3.2
Decision Support
Generally, people make decisions based on either instinctive analysis or structured analysis [Jones]. Instinctive decisions are marked by their rapidity and by the identification and execution of a single satisfactory solution: "It's good enough!" The instinctive process works well when problems are straightforward and immediate response is critical. This "single track" approach frequently leads to sub-optimal or partial solutions to complex problems, or to even larger blunders [Horgan]. Structured analysis takes longer, but considers more factors and more options. Structured analysis begins with the identification of the problem(s) and the major factors and issues: the decision variables. Next, a divergent process, brainstorming, yields an array of possible solutions. Generally, as many as three "different" courses are considered. In practice, however, the differences between these courses may not be very great. Finally, a convergent process reduces the number to a single preferred solution. A structured process helps the mind cope with complexity. "We settle for partial solutions because our minds simply can't digest or cope with all of the intricacies of complex problems ..." [Jones] OpSync is structured analysis on steroids. The four-axis structure identifies the problems and decision variables being resolved. The genetic algorithm produces options en masse in an unequalled divergent process that then converges from millions of considered options to tens of non-dominated solutions. The final selection of the
specific solution(s) resides with the operational decision maker, who is thus empowered to see multiple options within the context of operational-level decision factors. Liberated from the tyranny of building schedules, decision makers can refocus time and attention on the higher-level decision factors that dominate their thinking and shape the conflict space. By mapping decision attributes into multi-dimensional decisionscapes, the decision maker might see the topography of the complex environment. Using visual or analytic cues, a commander may be able to avoid cusps or boundaries where phase changes abound and chaos reigns. [Casti]
4.0
Mathematical Foundations
4.1
Current Constraints
Most synchronization techniques use one of three general approaches.

1. "Post and avoid" strategies dominate efforts to de-conflict actions. For example, operators post actions on a sync matrix (or activity board) aligned within a desired timeframe and then manually check for conflicts. Planners avoid new conflicts as they add more actions. This method is slow, constrained, fragile, and vulnerable.

2. Linear programs find favor in some areas since they reduce process times and produce the "optimum" solution. However, most operational schedules are not linear.

3. "Advanced tools" use non-linear gradient search techniques to optimize non-linear schedules. These search techniques use complex algorithms that posit that solutions lie in a convex set. Operational experience, empirical evidence, and complexity science suggest that solutions may lie in regions with sharp discontinuities and cusp geometries [Casti]. Hence, the basic conditions for the use of gradient search techniques are not satisfied.
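The manual check at the heart of a "post and avoid" board reduces to interval overlap on shared resources. As a minimal sketch (the resource names and time windows below are invented for illustration, not drawn from SACA):

```python
from typing import List, Tuple

# An action posts a resource and a half-open [start, end) time window.
Action = Tuple[str, float, float]  # (resource, start, end)

def conflicts(new: Action, board: List[Action]) -> List[Action]:
    """Return every posted action that overlaps `new` on the same resource."""
    res, s, e = new
    return [a for a in board if a[0] == res and a[1] < e and s < a[2]]

# A toy sync matrix with three posted actions.
board = [("runway", 0.0, 2.0), ("runway", 3.0, 5.0), ("tanker", 1.0, 4.0)]
print(conflicts(("runway", 1.5, 2.5), board))  # overlaps the first runway slot
```

Because every new posting must be checked against the whole board, the method scales poorly and adapts worse: one slipped action can invalidate many downstream postings, which is the fragility noted above.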
4.2
Applying Non-cooperative Game Theory
Operational synchronization combines aspects from three fields of study: network science, non-linear algorithms, and multi-player non-cooperative game theory. Mission threads link objective (end state) nodes with activity nodes in a regular lattice structure (network). Objectives: (a) tie to multiple activities; (b) set constraints and bounds for supporting activities; and (c) allocate resources to supporting activities. Activities can link to and receive resource allocations from multiple objectives. Intermediate layers in the lattice establish hubs and form the backbone of the network. Thus, partitioned graphs provide a mathematical foundation for these mission threads, and networked interactions support the required dynamic adaptation. The non-linear search technique uses a constrained sorting GA to construct, assess, and evolve feasible solutions. This approach makes no a priori assumptions about the shape of the solution space. Other search techniques may be feasible. Note that GA searches may not produce the same answers each time and may not "discover" the optimal solution (schedule), but in practice the results have been robust and responsive.
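The non-dominated filter at the core of such a sorting GA can be sketched in a few lines; the two objective scores below are hypothetical placeholders, and the GA's construction and evolution machinery is omitted:

```python
def dominates(a, b):
    """a dominates b if a is no worse on every objective and strictly
    better on at least one (objectives are maximized here)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated points."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical schedule scores: (objective coverage, resource efficiency).
scores = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.8, 0.1)]
print(pareto_front(scores))  # (0.5,0.5) and (0.8,0.1) are dominated and drop out
```

The surviving set is what the text calls the tens of non-dominated solutions handed to the decision maker; no single "optimum" is imposed.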
Objective functions for the GA stem from the linked objectives (end states) and not directly from the attributes of individual activities. Sets of all possible activity schedules thus constitute alternative strategies in a theoretical multi-player game, with the competing objectives being the "players." Thus, the dominant options found will approach the idealized points predicted by the Nash equilibrium [Kuhn].
In summary, this paper illustrates the limits inherent in relying on emergent self-sync in control systems. It identifies the need for fundamental structure - algorithmic complexity - as a foundation for modulated sync and documents the feasibility of using operational synchronization techniques to achieve coherent sync. OpSync overlays objectives and actions on a tiered network - a partitioned graph - and uses non-linear optimization techniques - a constrained sorting genetic algorithm - to identify feasible, non-dominated options. Finally, OpSync leverages the Nash equilibrium points defined by competing decision criteria to return a bounded set of non-dominated options. The components of OpSync reflect small advancements in prior art, but it is the assemblage of those components into the whole that establishes this new capability: Operational Synchronization (OpSync).
Bibliography
[1] Badham, J. (director), 1983, Film: "WarGames," MGM Studios (Los Angeles)
[2] Cares, J., 2005, Distributed Networked Operations: The Foundations of Network Centric Warfare, Alidade Press (Newport, RI)
[3] Casti, J., 1994, COMPLEXification: Explaining a Paradoxical World Through the Science of Surprise, HarperCollins (New York)
[4] Clark, A., 2003, Natural-Born Cyborgs: Minds, Technologies, and the Future of Human Intelligence, Oxford University Press (New York)
[5] Davies, P., 1992, The Mind of God: The Scientific Basis for a Rational World, Simon & Schuster (New York)
[6] Horgan, J., November 1992, "Profile: Karl R. Popper," Scientific American (New York)
[7] Jones, M., 1998, The Thinker's Toolkit, Three Rivers Press (New York)
[8] Keeney, R., 1992, Value-Focused Thinking: A Path to Creative Decisionmaking, Harvard University Press (Cambridge, MA)
[9] Kuhn, H., & Nasar, S., 2002, The Essential John Nash, Princeton University Press (Princeton, NJ)
[10] Manrubia, S., Mikhailov, A., & Zanette, D., 2004, Emergence of Dynamical Order: Synchronization Phenomena in Complex Systems, World Scientific Lecture Notes in Complex Systems, Vol. 2, World Scientific Publishing (Singapore)
[11] Ridder, J., & HandUber, J., 2005, "Evolutionary Computational Methods for Synchronization of Effects Based Operations," in Genetic & Evolutionary Computation Conference Proceedings, ACM
[12] Strogatz, S., 2003, SYNC: The Emerging Science of Spontaneous Order, 1st ed., Hyperion (New York)
Chapter 21
The Complexity of Terrorist Networks Philip Vos Fellman Southern New Hampshire University
Complexity science affords a number of novel tools for examining terrorism, particularly network analysis and NK-Boolean fitness landscapes. The following paper explores various aspects of terrorist networks which can be illuminated through applications of non-linear dynamical systems modeling to terrorist network structures. Of particular interest are some of the emergent properties of terrorist networks as typified by the 9-11 hijacker network, properties of centrality, hierarchy and distance, as well as ways in which attempts to disrupt the transmission of information through terrorist networks may be expected to produce greater or lesser levels of fitness in those organizations.
1 Introduction Open source acquisition of information regarding terrorist networks offers a surprising array of data which social network analysis, dynamic network analysis (DNA) and NK-Boolean fitness landscape modeling can transform into potent tools for mapping covert, dynamic terrorist networks and for providing early stage tools for the interdiction of terrorist activities.
2 Mapping Terrorist Networks One of the most useful tools for mapping terrorist organizations has been network analysis. A variety of maps and mapping techniques have emerged in the post-9/11 world. One of the earliest and most influential maps was developed by Valdis Krebs (Krebs, 2001) shortly after 9/11 [1]:
[Figure: Krebs' map of the 9-11 hijacker network. Legend: Flight AA #11 - crashed into WTC North; Flight AA #77 - crashed into Pentagon; Flight UA #93 - crashed in Pennsylvania; Flight UA #175 - crashed into WTC South; other associates of hijackers. Copyright © 2001, Valdis Krebs.]
This map yields a number of interesting properties. Using measures of centrality, Krebs' work analyzes the dynamics of the network. In this regard, he also illuminates the centrality measure's sensitivity to changes in nodes and links. In terms of utility as a counter-intelligence tool, the mapping exposes a concentration of links around the pilots, an organizational weakness which could have been used against the hijackers had the mapping been available prior to, rather than after, the disaster - suggesting the utility of developing these tools as an ongoing mechanism for combating terrorism.
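The centrality measure underlying this analysis is simple to compute. As a rough illustration (the edge list below is a made-up toy with two "pilot" hubs, not Krebs' actual data), degree centrality in pure Python:

```python
from collections import defaultdict

def degree_centrality(edges):
    """Fraction of the other nodes each node is directly tied to."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

# Toy stand-in for a hijacker-style network: one hub per cell, hubs linked.
edges = [("pilot1", x) for x in ("a", "b", "c")] + \
        [("pilot2", x) for x in ("c", "d", "e")] + [("pilot1", "pilot2")]
c = degree_centrality(edges)
print(max(c, key=c.get))  # the pilot hubs carry the highest centrality
```

Even on this toy graph, the concentration of links around the hubs is immediately visible in the scores, which is exactly the organizational weakness the text describes.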
3 Carley's DyNet Model The most developed version of these tools is probably the Dynamic Network Analysis (DNA) model developed by Carley et al. [2]
[Figure: DyNet - a desktop tool for reasoning about dynamic networked and cellular organizations. A database of organizational scenarios and the characteristics of known or hypothetical network or cellular organizations feed DyNet, which produces network profiles, attack scenarios, critical individuals, and observed dynamics.]
As indicated in the figure above, models of the 9-11 network, the Al Qaeda network which bombed the U.S. Embassy in Tanzania, and other terrorist networks also show a characteristic emergent structure, known in the vernacular as "the dragon". One element of this structure is the relatively large distance [3] between a number of its covert elements [4], which suggests that some elements of the network, particularly certain key nodes [5], may be relatively easy to remove through a process of over-compartmentation or successive isolation, thus rendering the organization incapable of transmitting commands across its various echelons.
4 Symmetries and Redundancies On the other hand, the 9-11 Network also demonstrates a high degree of redundancy as illustrated by the following diagram [6].
[Figure: redundancy diagram of the 9-11 network, with rows identifying roles such as pilot and backup pilot. Copyright © 2002, Mark Strathern, Cranfield School of Management.]
This mapping suggests, in turn, that while the network may be highly distributed [7], the redundancies built into it suggest cohesion at a number of levels as well as a hierarchical organization [8]. A number of systems analysis tools [9] have been developed to deal with the problem of covert networks [10] and incomplete information [11].
5 Isolation and Removal of Network Nodes DNA, for example, provides a number of insights into what happens when leadership nodes are removed from different kinds of networks [12], yielding very different kinds of results depending upon whether the network is cohesive or adhesive [2]. In particular, Carley et al. [4] demonstrate how isolation strategies (also suggested in Fellman and Wright [3]) will yield different results based on the nature of the network involved. This is shown below in the comparison of node isolation impact in standard vs. cellular networks [2].
[Figure: impact of isolating a leader over time, comparing a standard hierarchical network with the cellular Al Qaeda network.]
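The qualitative difference between isolating a leader in a steep hierarchy and in a cellular network can be illustrated with a small, self-contained sketch (the star and ring graphs below are toy stand-ins, not Carley's actual model):

```python
from collections import defaultdict, deque

def largest_component(edges, removed=frozenset()):
    """Size of the largest connected component after isolating `removed` nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        if u not in removed and v not in removed:
            adj[u].add(v)
            adj[v].add(u)
    nodes = {n for e in edges for n in e} - set(removed)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        comp, queue = 0, deque([start])
        seen.add(start)
        while queue:                      # breadth-first walk of one component
            node = queue.popleft()
            comp += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, comp)
    return best

# Star "hierarchy": isolating the leader shatters the network entirely.
star = [("leader", i) for i in range(6)]
# Ring "cell": any single isolation leaves the remainder connected.
ring = [(i, (i + 1) % 6) for i in range(6)]
print(largest_component(star, {"leader"}), largest_component(ring, {0}))
```

The star collapses into singletons while the ring barely notices, mirroring the standard-versus-cellular contrast reported for the node isolation experiments.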
6 Terrorist Networks and Fitness Landscapes The dynamic NK-Boolean fitness landscape models developed by Stuart Kauffman [13,14] for evolutionary biology have attracted increasing interest in a number of fields involving social phenomena. In particular, they offer a substantial degree of promise in providing new models for understanding and evaluating the strategic performance [15] of organizations [16]. An agent-based simulation conducted in 1999 and 2000 by Pankaj Ghemawat and Daniel Levinthal [17], using an NK-Boolean fitness landscape framework for strategic management decision-making, has yielded results differentiating the effects of interpolating non-optimal values into the decision-making processes of hierarchical vs. interdependent organizational structures - results with interesting implications for isolation and disinformation operations against terrorist networks [18]. When they explore the difference between an optimal preset of policy configurations across hierarchical, central, and random simulation arrays, they find that disrupting lower-order policy variables is likely to have little relative effect on the robustness of the organization compared to the mis-specification of higher-order values, particularly in hierarchical organizations, as illustrated below.
[Figure: attained fitness as a function of which policy variable is preset, for hierarchy, centrality, and random (K=6) structures.]
In dealing with terrorist organizations, which are primarily hierarchical in nature [19], what this finding says is that disinformation is a useful tactic (or strategy) only if it succeeds in influencing one of the key decisional variables. In other words, disinformation at the local level is unlikely to have any lasting impact on terrorist organizations. This finding also challenges the institutional wisdom [20] of assigning case officers in the field to this type of counter-terrorism operation [21]. In fact, from an operational point of view, the hierarchical nature of terrorist organizations means that there may be something of a mismatch in the entire targeting process. As Ghemawat and Levinthal note, "Less central variables not only do not constrain, or substantially influence the payoff of many other choices, but they themselves are not greatly contingent upon other policy choices. Being contingent on other policy choices facilitates compensatory shifts in policy variables other than the one that is preset. As a result of the absence of such contingencies, the presetting of lower-order policy choices is more damaging to fitness levels under the centrality structure." (p. 27) Another striking feature of this set of simulations concerns how few of the optima with preset mismatches constitute local peaks of the fitness landscape. Given the importance of configurational effects, one might reasonably conjecture that constraining one variable to differ from the global optimum would lead to the selection of a different, non-global, peak in the fitness landscape. However, in the actual simulation runs this result was unexpectedly absent.
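A minimal sketch of the preset-variable idea, under simplifying assumptions of my own (a tiny 8-variable landscape, exhaustive search instead of Ghemawat and Levinthal's agent-based simulation, and a "hierarchy" in which every variable's payoff also depends on a single hub variable):

```python
import random
from itertools import product

def nk_landscape(n, deps, seed=0):
    """Random NK-style landscape: variable i's payoff contribution depends
    on the bits listed in deps[i]; each configuration gets a cached draw."""
    rng = random.Random(seed)
    cache = {}
    def fitness(bits):
        total = 0.0
        for i in range(n):
            key = (i, tuple(bits[j] for j in deps[i]))
            if key not in cache:
                cache[key] = rng.random()
            total += cache[key]
        return total / n
    return fitness

def best(fitness, n, preset=None):
    """Exhaustive search over all 2^n states, honoring any pinned bits."""
    top, arg = -1.0, None
    for bits in product((0, 1), repeat=n):
        if preset and any(bits[i] != v for i, v in preset.items()):
            continue
        f = fitness(list(bits))
        if f > top:
            top, arg = f, bits
    return top, arg

n = 8
# "Hierarchy": every variable's payoff also depends on hub variable 0,
# while variable 7 influences only its own contribution.
deps = [[0, i] for i in range(n)]
fit = nk_landscape(n, deps)
g, opt = best(fit, n)
hub_cost = g - best(fit, n, preset={0: 1 - opt[0]})[0]
leaf_cost = g - best(fit, n, preset={7: 1 - opt[7]})[0]
print(round(hub_cost, 3), round(leaf_cost, 3))  # mis-setting the hub usually costs more
```

Pinning the leaf can cost at most 1/n of fitness (it touches one contribution), while pinning the hub perturbs every contribution - a toy analogue of why mis-specifying higher-order variables is the damaging case.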
7 Conclusion Terrorist networks are complex. They are typical of the structures encountered in the study of conflict in that they possess multiple, irreducible levels of complexity and ambiguity. This complexity is compounded by the covert, dynamic nature of terrorist networks, where key elements may remain hidden for extended periods of time and the network itself is dynamic. Network analysis, agent-based simulation, and NK-Boolean fitness landscapes offer a number of tools which may be particularly useful in sorting out the complexities of terrorist networks and, in particular, in directing long-run operational and strategic planning so that tactics which appear to offer immediately obvious rewards do not result in long-term damage to the organizations fighting terrorism or the societies which they serve.
Bibliography
[1] Valdis Krebs, "Uncloaking Terrorist Networks", First Monday, 2001
[2] Kathleen Carley and Philippa Pattison (eds.), Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers, 313-323. Washington, D.C.: National Academies Press
[3] Philip V. Fellman and Roxana Wright, "Modeling Terrorist Networks: Complex Systems at the Mid-Range", paper prepared for the Joint Complexity Conference, London School of Economics, September 16-18, 2003, http://www.psych.lse.ac.uk/complexity/Conference/FellmanWright.pdf
[4] Kathleen Carley, Jana Diesner, Jeffrey Reminga and Maksim Tsvetovat, "Toward an end-to-end approach for extracting, analyzing and visualizing network data", ISRI, Carnegie Mellon University, 2004
[5] Carter T. Butts, "An Axiomatic Approach to Network Complexity", Journal of Mathematical Sociology, 24(4), 273-301
[6] Philip V. Fellman and Mark Strathern, "The Symmetries and Redundancies of Terror: Patterns in the Dark", Proceedings of the annual meeting of the North American Association for Computation in the Social and Organizational Sciences, Carnegie Mellon University, June 27-29, 2004, http://casos.isri.cmu.edu/events/conferences/2004/2004 proceedings/Fellman, Phil.doc
[7] Carter T. Butts, "Predictability of Large-scale Spatially Embedded Networks", in Ronald Breiger, Kathleen Carley, and Philippa Pattison (eds.), Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers, 313-323. Washington, D.C.: National Academies Press
[8] Kathleen Carley, "Modeling Covert Networks", paper prepared for the National Academy of Science Workshop on Terrorism, December 11, 2002
[9] Kathleen M. Carley and Carter T. Butts (1997), "An Algorithmic Approach to the Comparison of Partially Labeled Graphs", in Proceedings of the 1997 International Symposium on Command and Control Research and Technology, June, Washington, D.C.
[10] Kathleen Carley, Ju-Sung Lee and David Krackhardt, "Destabilizing Networks", Connections 24(3): 31-34, INSNA (2001)
[11] Jonathan P. Clemens and Lauren O'Neill, "Discovering an Optimum Covert Network", Santa Fe Institute, Summer 2004
[12] Kathleen Carley, "Dynamic Network Analysis in Counterterrorism Research", Proceedings of a Workshop on Statistics Analysis, National Academies Press, Washington, D.C., 2007, pp. 169 ff.
[13] Stuart Kauffman, The Origins of Order, Oxford University Press, 1993
[14] Stuart Kauffman, At Home in the Universe, Oxford University Press, 1996
[15] Winfried Ruigrok and Hardy Wagner, "Internationalization and performance: A meta-analysis (1974-2001)", Management International Review, Special Issue, Autumn 2006
[16] A. Caldart and J. E. Ricart, "Corporate Strategy Revisited: A View from Complexity Theory", European Management Review, 1, 96-104, 2004
[17] Pankaj Ghemawat and Daniel Levinthal, "Choice Structures, Business Strategy and Performance: A Generalized NK-Simulation Approach", Reginald H. Jones Center, The Wharton School, University of Pennsylvania, 2000, http://www.people.hbs.edu/pghemawat/pubs/ChoiceStructures.pdf
[18] Philip V. Fellman, Jonathan P. Clemens, Roxana Wright, Jonathan Vos Post and Matthew Dadmun, "Disrupting Terrorist Networks: A Dynamic Fitness Landscape Approach", 7th International Conference on Complex Systems, November 2007, arXiv:0707.4036v1 [nlin.AO]
[19] Mark Sageman, Understanding Terror Networks, University of Pennsylvania Press, 2004
[20] Reuel M. Gerecht, "The Counterterrorist Myth", The Atlantic Monthly, July-August 2001
[21] Angelo Codevilla, "Doing it the Hard Way", Claremont Review of Books, Fall 2004, http://www.claremont.org/writings/crb/fall2004/codevilla.html
Chapter 22
Complexity Studies and Security in the Complex World: An Epistemological Framework of Analysis Czeslaw Mesjasz Cracow University of Economics Cracow, Poland [email protected]
1. Introduction
The impact of systems thinking can be found in numerous security-oriented research efforts: beginning from the early works on the international system (Pitirim Sorokin, Quincy Wright) and the first models of military conflict and war (Frederick Lanchester, Lewis F. Richardson), through national and military security (the origins of the RAND Corporation), the development of game theory-based conflict studies, International Relations, and the classical security studies of Morton A. Kaplan and Karl W. Deutsch [Mesjasz 1988], and ending with contemporary broadened concepts of security proposed by the Copenhagen School [Buzan et al. 1998]. At present it may even be stated that the new military and non-military threats to contemporary complex society, such as low-intensity conflicts, regional conflicts, terrorism, environmental disturbances, etc., cannot be embraced without ideas taken from modern complex systems studies. Murray Gell-Mann [1995, 2002] [Alberts & Czerwinski 2002] attempted to identify the links between security-related issues and broadly defined complex systems studies. He stressed that one of the obstacles was that too broad a definition of security made it difficult to identify the links between security studies and complex systems research. The aim of the paper, treated as an introduction to broader research, is to provide preliminary answers on how to understand the conceptual assumptions behind applications of complex systems concepts in security-oriented studies. These answers should make it possible to
develop an epistemological framework for applications of the ideas taken from complex systems research in security-oriented discourse and practice.
2. Concepts of Security
The term security derives from the Latin securus (safe, secure), from se (without) + cura (care): the quality or state of being secure, or freedom from danger (freedom from fear or anxiety). In the classical sense security - from the Latin securitas - refers to tranquility and freedom from care, or what Cicero termed the absence of anxiety upon which the fulfilled life depends [Liotta 2002: 477].¹ The traditional meaning of security derives from foreign policy and international relations - "objective security" and/or "military security". Security is treated as an attribute of the situation of the state, equivalent to the absence of external military conflict. Such an approach was proposed in the theory of International Relations by realism and neorealism and can be linked with classical security studies and strategic studies. Broadening the neorealist concept of security means the inclusion of a wider range of potential threats, beginning from economic and environmental issues and ending with human rights and migrations. Deepening the agenda of security studies means moving either down to the level of individual or human security, or up to the level of international or global security, with regional and societal security as possible intermediate points. Parallel broadening and deepening of the concept of security has been proposed by the constructivist approach associated with the works of the Copenhagen School [Buzan et al. 1998]. These characteristics can be called the core of the concept of security and can be used as a point of departure for elaborating a survey of systemic attributes which appear in any discourse on security [Mesjasz 2006, 2008]. In the proposed eclectic approach security refers to the following sectors: military, economic, political, environmental and societal. Following Buzan et al. [1998], the concepts of existential threat and securitization are used.
Any public issue can be securitized (meaning the issue is presented as an existential threat, requiring emergency measures and justifying actions outside the normal limits of political procedure). Security is thus a self-referential practice, because it is in this practice that the issue becomes a security issue - not necessarily because a real existential threat exists, but because the issue is depicted as a threat [Buzan et al. 1998].
3. Complexity and security
3.1. Defining systems and complexity
The first attempts to study complex entities go back to the works of Weaver [1948] (disorganized complexity and organized complexity), Simon [1962] - the Architecture of Complexity - and Ashby [1963] - the Law of Requisite Variety. In his search for explaining the meaning of complexity, Seth Lloyd [1989] identified 31
¹ For an extended discussion of interpretations of security, see (Brauch 2008).
definitions of complexity. Later, according to John Horgan [1997: 303], this number increased to 45. In other writings numerous definitions of complexity have been formulated and scrutinized - [Waldrop 1992], [Gell-Mann 1995], [Kauffman 1993, 1995], [Holland 1995], [Bak 1996], [Bar-Yam 1997], [Prigogine 1997], [RAND 2000], [Biggiero 2001]. Complexity can also be characterized by a multitude of other ideas such as artificial life, fractals, bifurcations, co-evolution, spontaneous self-organization, self-organized criticality, chaos, the "butterfly effect", edge of chaos, instability, irreducibility, adaptability, and far-from-equilibrium states, which are extensively depicted in a large number of writings quoted and not quoted in this paper. The above ideas can be called "hard" complexity research, per analogy with "hard" systems thinking.² The "soft" complexity research, also coined per analogy with "soft" systems thinking, includes the ideas of complexity elaborated in other areas - cybernetics and systems thinking, the social sciences, and psychology. Initially they were developed independently, but with the growing impact of complex adaptive systems (CAS) and chaos research, their authors began to treat the "hard" complexity concepts as a source of new ideas. Subjectivity is the first aspect of complexity in the "soft" approach. From the point of view of second-order cybernetics, or in a broader approach, constructivism [Glasersfeld 1995], [Biggiero 2001], complexity is not an intrinsic property of an object but rather depends on the observer. To identify a genuine epistemological meaning of complexity, based on properties of the relationships between observers (human or cognitive systems) and observed systems (all kinds of systems), Biggiero [2001: 3] treats predictability of the behavior of an entity as the fundamental criterion for distinguishing various kinds of complexity.
He proposes three classes of complexity: (a) objects not deterministically or stochastically predictable at all; (b) objects predictable only with infinite computational capacity; (c) objects predictable only with a transcomputational capacity. The typologies presented by Biggiero lead to two conclusions important in studying social systems. Firstly, self-reference characterizes the first class, which relates to the many forms of undecidability and to interactions between observing systems [Foerster 1982]. This property in some sense favors subjective interpretations of complexity. Secondly, human systems are characterized by the presence of all sources and types of complexity [Biggiero 2001: 4-6]. It may then be summarized that human systems are "complexities of complexities". In social sciences, and particularly in sociology, special attention is given to the concepts of complexity of social systems proposed by Niklas Luhmann [1990, 1993, 1995]. First of all, as one of only a few authors, he made an attempt to provide a comprehensive definition of a social system based solely on communication and on the concept of autopoiesis (self-creation) from biological systems. According to Luhmann, a complex system is one in which there are more possibilities than can be actualized [Luhmann 1990: 81].
² The term soft complexity science is used, among others, by Richardson and Cilliers (2001).
The complexity of social systems developed by Luhmann is strongly linked to self-reference, since reduction of complexity is also a property of the system's own self-observation: no system can possess total self-insight. This phenomenon is representative of the epistemology of modern social sciences, where observation and self-observation, reflexivity and self-reflexivity, and subsequently self-reference play a growing role. According to this interpretation, social systems are becoming self-observing, self-reflexive entities trying to solve arising problems through processes of adaptation (learning).
3.2. Complexity and security: Mathematical models, analogies and metaphors
While systems thinking sought holistic ideas and universal patterns in all kinds of systems, complexity research defined its goals in a more specific manner. A common theoretical framework, the vision of an underlying unity illuminating nature and humankind, is viewed as an epistemological foundation of complexity studies [Waldrop 1992: 12-13]. This claim for unity results from the assumption that there are simple sets of mathematical rules that, when followed by a computer, give rise to extremely complicated, or rather complex, patterns. The world also contains many extremely complicated patterns. Thus, in consequence, it can be concluded that simple rules underlie many extremely complicated phenomena in the world. With the help of powerful computers, scientists can root those rules out. Subsequently, at least some rules of complex systems could be unveiled. Although such an approach was criticized as based on a seductive syllogism [Horgan 1995], [Richardson & Cilliers 2001], it appears that it still exists explicitly or implicitly in numerous works in the hard complexity research.
Another important epistemological contribution of complexity, and of nonlinearity in particular, is the impossibility, or at least the very limited capability, of prediction and control, which are viewed as the most important characteristics of complex systems. Although security studies cover a very broad area, including purely technical issues, they can be treated as a part of knowledge about society, sharing numerous components with social sciences. Ideas originating in systems thinking and complexity studies are used in security-oriented research as models, analogies and metaphors. The term "models" is used only for mathematical structures. Models, analogies and metaphors deriving from systems thinking and complexity studies are gaining a special significance in social sciences. For mathematical models it is quite obvious that they are associated with "objective" research. Analogies and metaphors taken from complex systems studies are related to ideas drawn from "rational" science. They are treated as "scientific" and obtain supplementary political influence resulting from "sound" normative (precisely, prescriptive) legitimacy in any debate on security theory and policy. In applications of models, analogies and metaphors in social sciences the following approaches can be identified: descriptive, explanatory, predictive, anticipatory, normative, prescriptive, retrospective, retrodictive, and control and regulation. The above epistemological links between complexity research and social sciences are predominantly associated with the "hard" complexity. The input to this area
exerted by the "soft" complexity research is equally significant. Reflexive complexity of society has become one of the foundations of post-modern social theory. Unfortunately, various abuses and misuses may occur, when analogies and metaphors drawn from "hard" complexity research, and to a lesser extent from "soft" complexity research are treated too carelessly even by eminent social theoreticians of post-modernism/post structuralism. Several examples of such abuses are mirrored in the so-called "Sokal Hoax" and other examples widely described by the originator of that hoax [Sokal & Bricmont 1998]. The warning message conveyed in that book is of a special importance since broadening and deepening the concept of security contributed to the development of critical security research frequently referring to post-modernism, and sometimes directly or indirectly, to complex systems research. [Albert & Hilkermeier 2003].
4. Complex systems in security theory and policy: Can expectations be fulfilled?

An overview of security-related expectations towards complex systems studies should open with a brief sociological survey answering the following questions: Who expects what from whom? What can be delivered by those to whom the expectations are addressed? Expectations towards complex systems research are articulated by specialists in International Relations and in associated areas, namely security studies and peace and conflict research [Rosenau 1990], [Jervis 1997], [Saperstein 1984, 1991, 2002], [Saperstein & Mayer-Kress 1988]. Complexity studies naturally enrich the epistemology of those sciences. It is interesting to observe that complex systems ideas are applied equally by representatives of mainstream security studies, who treat them as a kind of extension of rational choice-based considerations [Axelrod & Cohen 1999], and by the so-called critical approaches in security research and, in a broader sense, in International Relations [Albert & Hilkermeier 2003]. Policy makers are the second group who, rather indirectly, through academic research and/or advisors, express their hopes to ameliorate their understanding of the world with the use of complex systems ideas. Close to the policy makers the military community can be placed. A part of their expectations is similar to those of the policy makers, especially at the strategic level. However, numerous expectations of the military derive from their will to adapt complexity methods at all levels to the situations in which military units can be used: not only in classical military conflicts [Beyerchen 1992], but also in post-conflict situations as well as in various emergency situations [Ilachinski 1996, 1996a], [Alberts 2002]. It is also necessary to mention the media and the societies, or the general public, the last social actors awaiting new insights from complexity research.
The increasing complexity of the surrounding world enhances natural curiosity about the phenomena directly and/or indirectly influencing the life of individuals. Who is the addressee of those expectations and questions? First and foremost, it is a very incoherent community of academics, advisors and other professionals. The second group is professional military analysts, who are involved in developing new methods of accomplishing the functions of military systems at all levels of their hierarchy.
Due to the very wide scope of meaning of security, and to the multitude of complexities, it is obviously impossible to enumerate all expectations towards complex systems research. The fundamental expectation is simple. Although increasing complexity is viewed as a law of nature and society, after the end of the Cold War the process of "complexification" of the world system accelerated substantially. Social systems at the turn of the century are more complex and are labeled as a chaotic society, or a risk society [Beck 1992]. Reflected in all prognoses, the uncertainty, speed of change and complexity of political and economic affairs, as well as environmental challenges, contribute to the incomprehensibility of the world at all levels of its internal hierarchy. Widening and deepening the sense of security also contributes to the increasing real, or perceived, complexity of the world. Since its very beginning, complexity research was perceived as a source of a certain promise and of a new language, and it at the same time contributed to the perception that there were some patterns in complexity which could be disclosed by mathematical models taken from a new field of science. These intellectual and emotional impressions of incomprehensibility, together with the appeal for new approaches, are well reflected by the metaphor of "The Ingenuity Gap" proposed by Homer-Dixon [2002].
Assuming that security is always associated with an unusual disturbance undermining the existence (functioning) of an individual/system, it may be assumed that in all security-oriented theories and policies three basic human desires are expressed: reduction of uncertainty, by enhancing predictive capabilities and strengthening the potential for anticipatory activities; identification of patterns in the functioning of social systems and their components, allowing protection against disturbances to be enhanced, ex ante and ex post; and elaboration of norms and methods allowing the functioning of social systems and of their components to be improved.
5. Conclusions

The fundamental but rather obvious conclusion is that complex systems studies have become an indispensable part of the epistemology of security theory and eventually a useful instrument of security policy at the cognitive (language) level. This concerns both the impact on action and the impact on the processes of social communication, although it would be rather difficult to measure that impact. The uses of complexity-related mathematical models, analogies and metaphors have broadened the epistemological foundations of security research. This obviously does not mean that complex systems studies directly responded to the expectations of security studies with reference to prediction, explanation of causal effects, prescription, normative approaches, retrospection, retrodiction, or to enhancing the (always limited) capabilities to influence social phenomena. The applications of complexity ideas in security discourse have several weaknesses, of which two are most important: first, excessively high expectations from security theory and policy, and second, mutual misuses and abuses. Security specialists, journalists and politicians too often treat complexity-related utterances as an element of a new, modern and to some extent "magic" language. By the same token, scholars familiar with complexity models reduce social phenomena to very simple patterns, irrelevant to reality.
References

Albert, M. & Hilkermeier, L., (eds.), 2003, Observing International Relations: Niklas Luhmann and World Politics, Routledge (London).
Alberts, D. S., 2002, Information Age Transformation: Getting to a 21st Century Military, DoD Command and Control Research Program, Washington, D.C., http://www.dodccrp.org/files/Alberts_IAT.pdf, retrieved 12 December 2006.
Ashby, W. R., 1963, An Introduction to Cybernetics, Wiley (New York).
Axelrod, R. & Cohen, M. D., 1999, Harnessing Complexity: Organizational Implications of a Scientific Frontier, The Free Press (New York).
Bak, P., 1996, How Nature Works: The Science of Self-Organized Criticality, Springer-Verlag (New York).
Bar-Yam, Y., 1997, Dynamics of Complex Systems, Addison-Wesley (Reading, MA).
Beck, U., 1992, Risk Society: Towards a New Modernity, SAGE (London).
Beyerchen, A., 1992, Clausewitz, Nonlinearity and the Unpredictability of War, International Security, 17, 3.
Biggiero, L., 2001, Sources of Complexity in Human Systems, Nonlinear Dynamics, Psychology and Life Sciences, 5, 1.
Brauch, H. G., Grin, J., Mesjasz, C., Dunay, P., Behera, N. C., Chourou, B., Oswald Spring, U., Liotta, P. H., Kameri-Mbote, P., (eds.), 2008, Globalisation and Environmental Challenges: Reconceptualising Security in the 21st Century, Springer-Verlag (Berlin - Heidelberg - New York - Hong Kong - London - Milan - Paris - Tokyo).
Buzan, B., 1991, People, States and Fear, Second Edition, Harvester Wheatsheaf (New York).
Buzan, B., Wæver, O. & de Wilde, J., 1998, Security: A New Framework for Analysis, Lynne Rienner Publishers (Boulder/London).
Epstein, J. M. & Axtell, R. L., 1996, Growing Artificial Societies: Social Science from the Bottom Up, MIT Press (Cambridge, MA).
Foerster, von H., 1982, Observing Systems, Intersystems Publications (Seaside, CA).
Gell-Mann, M., 1995, What is Complexity?, Complexity, 1, 1.
Gell-Mann, M., 2002, The Simple and the Complex, in: Alberts, D. S. & Czerwinski, T. J., (eds.), Complexity, Global Politics and National Security, University Press of the Pacific (Honolulu).
Glasersfeld, von E., 1995, Radical Constructivism: A New Way of Knowing and Learning, The Falmer Press (London).
Holland, J. H., 1995, Hidden Order: How Adaptation Builds Complexity, Basic Books (New York).
Homer-Dixon, T., 2002, The Ingenuity Gap, Vintage Books (New York).
Horgan, J., 1995, From Complexity to Perplexity, Scientific American, 272, 6 (June).
Ilachinski, A., 1996, Land Warfare and Complexity, Part I: Mathematical Background and Technical Sourcebook, Center for Naval Analyses (Alexandria, VA).
Ilachinski, A., 1996a, Land Warfare and Complexity, Part II: An Assessment of the Applicability of Nonlinear Dynamics and Complex Systems Theory to the Studies of Land Warfare, Center for Naval Analyses (Alexandria, VA).
Jervis, R., 1997, System Effects: Complexity in Political and Social Life, Princeton University Press (Princeton, NJ).
Kauffman, S. A., 1993, The Origins of Order: Self-Organization and Selection in Evolution, Oxford University Press (New York/Oxford).
Kauffman, S. A., 1995, At Home in the Universe: The Search for Laws of Self-Organization and Complexity, Oxford University Press (New York/Oxford).
Liotta, P. H., 2002, Boomerang Effect: The Convergence of National and Human Security, Security Dialogue, 33, 4.
Lloyd, S., 1989, Physical Measures of Complexity, in: Jen, E., (ed.), 1989 Lectures in Complex Systems, Addison-Wesley (Redwood City, CA).
Luhmann, N., 1990, Essays on Self-Reference, Columbia University Press (New York).
Luhmann, N., 1993, Risk: A Sociological Theory, Aldine de Gruyter (New York).
Luhmann, N., 1995, Social Systems, Stanford University Press (Palo Alto) (original work published in German in 1984).
Mesjasz, C., 1988, Applications of Systems Modelling in Peace Research, Journal of Peace Research, 25, 3.
Mesjasz, C., 2002, How Complex Systems Studies Could Help in Identification of Threats of Terrorism?, InterJournal, 605, (Submitted), Brief Article.
Mesjasz, C., 2006, Complex Systems Studies and the Concepts of Security, Kybernetes, 35, 3-4.
Mesjasz, C., 2008, Security as Attributes of Social Systems, in: Brauch, H. G., Grin, J., Mesjasz, C., Dunay, P., Behera, N. C., Chourou, B., Oswald Spring, U., Liotta, P. H., Kameri-Mbote, P., (eds.), Globalisation and Environmental Challenges: Reconceptualising Security in the 21st Century, Springer-Verlag (Berlin - Heidelberg - New York - Hong Kong - London - Milan - Paris - Tokyo).
Midgley, G., (ed.), 2003, Systems Thinking, vol. I-IV, SAGE (London).
Prigogine, I., 1997, The End of Certainty, The Free Press (New York).
RAND Workshop on Complexity and Public Policy, Complex Systems and Policy Analysis: New Tools for a New Millennium, 27-28 September 2000, RAND, Arlington, Virginia, http://www.rand.org/scitech/stpi/Complexity/index.html, retrieved 6 September 2006.
Richardson, K. & Cilliers, P., (eds.), 2001, Special Editors' Introduction: What Is Complexity Science? A View from Different Directions, Emergence, 3, 1.
Rosenau, J. N., 1990, Turbulence in World Politics: A Theory of Change and Continuity, Princeton University Press (Princeton, NJ).
Saperstein, A. M., 1984, Chaos: A Model for the Outbreak of War, Nature, 309.
Saperstein, A. M., 1991, The "Long Peace": Result of a Bipolar Competitive World, Journal of Conflict Resolution, 35, 1.
Saperstein, A. M., 2002, Complexity, Chaos and National Security Policy: Metaphors or Tools, in: Alberts, D. S. & Czerwinski, T. J., (eds.), Complexity, Global Politics and National Security, University Press of the Pacific (Honolulu).
Saperstein, A. M. & Mayer-Kress, G., 1988, A Nonlinear Dynamical Model of the Impact of SDI on the Arms Race, Journal of Conflict Resolution, 32, 4.
Simon, H. A., 1962, The Architecture of Complexity, Proceedings of the American Philosophical Society, 106, 6.
Sokal, A. & Bricmont, J., 1998, Fashionable Nonsense: Postmodern Intellectuals' Abuse of Science, Picador (New York).
Waldrop, M. M., 1992, Complexity: The Emerging Science at the Edge of Order and Chaos, Simon & Schuster (New York).
Weaver, W., 1948, Science and Complexity, American Scientist, 36.
Wæver, O., 1995, Securitization and Desecuritization, in: Lipschutz, R. D., (ed.), On Security, Columbia University Press (New York).
Chapter 23
Complexities, Catastrophes and Cities: Emergency Dynamics in Varying Scenarios and Urban Topologies

Giuseppe Narzisi, Venkatesh Mysore, Jeewoong Byeon and Bud Mishra
Courant Institute of Mathematical Sciences, New York University
715 Broadway #1002, New York, NY, USA
{narzisi,mysore,jw.byeon,mishra}@nyu.edu
Project website: www.bioinformatics.nyu.edu/Projects/planc
1 Introduction & Background
Complex Systems are often characterized by agents capable of interacting with each other dynamically, often in non-linear and non-intuitive ways. Trying to characterize their dynamics often results in partial differential equations that are difficult, if not impossible, to solve. A large city or a city-state is an example of such an evolving and self-organizing complex environment that efficiently adapts to different and numerous incremental changes to its social, cultural and technological infrastructure [1]. One powerful technique for analyzing such complex systems is Agent-Based Modeling (ABM) [9], which has seen an increasing number of applications in social science, economics and also biology. The agent-based paradigm facilitates easier transfer of domain-specific knowledge into a model. ABM provides a natural way to describe systems in which the overall dynamics can be described as the result of the behavior of populations of autonomous components: agents, with a fixed set of rules based on local information and possible central control. As part of the NYU Center for Catastrophe Preparedness and Response (CCPR, see www.nyu.edu/ccpr), we have been exploring how ABM can serve as a powerful simulation technique for analyzing large-scale urban disasters. The central problem in Disaster Management is that it is not immediately apparent
whether the current urban emergency plans are robust against such sudden, rare and punctuated catastrophic events. We have been striving towards a methodical and algorithmic approach to both preparedness and response, combining powerful ideas from model-checking, simulation and multi-objective optimization, so that a large urban structure can recover from the effects of a disastrous event quickly and efficiently. Recently, game-theoretic paradigms have also influenced the analysis of complex systems. In our models, persons play "games" with each other for the medical resources; persons and hospitals interact to minimize several factors such as the number of fatalities, average waiting time, average ill-health, cost, etc. Likewise, the heuristics people employ to choose the hospital they should head to, based on prior knowledge about hospital sizes and locations and real-time knowledge about current occupancies, can be seen as an extension of the Santa Fe bar problem [3]. Game theory also discusses different kinds of strategies that can effectively describe different personality, cultural and social traits governing panic behavior: some people imitate their neighbors, some are contrarian, some are rational, some are irrational, some employ a random strategy, etc. Disaster planning is often based on assumptions derived from a conventional wisdom that is at variance with empirical field disaster research studies [2]. Our efforts to avert this error have resulted in a new system, called PLAN C (Planning with Large Agent-Networks against Catastrophes) [5, 6, 8, 7], with well-identified, validated, simple rules and a minimal number of parameters to avoid modeler bias and unnecessary complexity. The persons, hospitals, on-site responders, ambulances and disease prognosis follow deterministic rules with probabilistic parameters that can be modified by the user.
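As a toy illustration of such strategy mixes (this is not PLAN C's rule set; the strategy names, thresholds and parameters below are invented), a minority-game-style hospital-choice loop in the spirit of the Santa Fe bar problem might be sketched as:

```python
import random

def choose(strategy, last_occupancy, capacity, rng):
    """Return True if the person heads to the hospital (hypothetical strategies)."""
    crowded = last_occupancy > 0.6 * capacity
    if strategy == "rational":    # go only if the hospital was not crowded last tick
        return not crowded
    if strategy == "contrarian":  # do the opposite of the rational choice
        return crowded
    if strategy == "imitator":    # follow what the majority appeared to do
        return last_occupancy > capacity / 2
    return rng.random() < 0.5     # random strategy

def simulate(n=100, capacity=60, ticks=50, seed=7):
    rng = random.Random(seed)
    strategies = [rng.choice(["rational", "contrarian", "imitator", "random"])
                  for _ in range(n)]
    occupancy = n // 2            # everyone initially sees a half-full hospital
    history = []
    for _ in range(ticks):
        occupancy = sum(choose(s, occupancy, capacity, rng) for s in strategies)
        history.append(occupancy)
    return history

history = simulate()
print("mean occupancy:", sum(history) / len(history))
```

As in the bar problem, the heterogeneous strategy population keeps the occupancy oscillating around the capacity threshold rather than converging.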
The system is implemented in Repast 3.1 (http://repast.sourceforge.net), a popular and versatile Java-based software toolkit that has been used to model concepts as diverse as intracellular processes and business strategies. We have also integrated ProActive (http://www-sop.inria.fr/oasis/proactive/) with Repast, in order to use the computational power of a cluster of computers to explore the parameter space of the system. Rather than focusing on the intricacies of the modeling problem, in this paper we delve into the nature and sources of complexity in the dynamics of different kinds of catastrophes and various urban topologies.
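Purely to make the agent-based paradigm concrete (this is illustrative Python, not the actual Java/Repast implementation; the movement rule and all names are hypothetical), a stripped-down version of the kind of update loop such a toolkit schedules looks like:

```python
import random

class Agent:
    """A minimal agent on a 1-D space reacting to local information only."""
    def __init__(self, pos):
        self.pos = pos

    def step(self, agents):
        # Hypothetical local rule: drift away from the mean position of
        # nearby agents (crowd avoidance), else wander randomly.
        near = [a.pos for a in agents if a is not self and abs(a.pos - self.pos) < 2.0]
        if near:
            self.pos += 0.1 if self.pos >= sum(near) / len(near) else -0.1
        else:
            self.pos += random.uniform(-0.1, 0.1)

def simulate(n_agents=20, ticks=100, seed=1):
    random.seed(seed)
    agents = [Agent(random.uniform(0, 5)) for _ in range(n_agents)]
    for _ in range(ticks):
        for a in agents:
            a.step(agents)   # each agent uses only local information
    return agents

agents = simulate()
spread = max(a.pos for a in agents) - min(a.pos for a in agents)
print(f"final spread of the population: {spread:.2f}")
```

The macroscopic quantity (here the population spread) emerges from purely local rules, which is exactly the pattern the scenario metrics below are extracted from.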
2 Experimental results
In disaster management, it has been established that "Planning should take into consideration how people and organizations are likely to act, rather than expecting them to change their behavior to conform to the plan" [2]. (A more detailed description of our system can be found in [6], where the Sarin gas exposure scenario is investigated under the constraints defined by Manhattan, New York; in [5], where the Brazilian food poisoning scenario is recreated; and in [7], where the dynamics for different subpopulations and hospital configurations are analyzed.) ABM serves as
a means of describing the behavior of medical facilities (controllable) and evaluating their performance in different disease scenarios for people with different personality and health profiles. See Table 1 of the supplementary material (available at www.bioinformatics.nyu.edu/Projects/planc/appendix_iccs06.pdf) for the choice of parameters.

Single event scenario: As a first scenario, we consider a possible terrorist attack with a chemical warfare agent at the Port Authority Bus Terminal in midtown Manhattan. The left plot in Fig. 1 shows the evolution curves for the average waiting time of the affected population at the hospitals. Three jumps are visible in the first 400 ticks of the curves, corresponding to the crowding effect of the flux of people at the three hospitals nearest to the site of the attack. Each climb phase is a consequence of the hospital state changing rapidly from "available" to "critical", with a resulting increase in the number of waiting non-critical persons. The flat phase that ensues is due to the state change from "critical" to "full", after which all waiting persons are instructed to head to another hospital.

Figure 1: Left plot: evolution curves for the percentage of waiting persons at the hospitals and the average waiting time of the population. Right plot: evolution curves for the percentage of active and admitted persons.

It is interesting to note how a population size of 500 persons seems to produce a more complex scenario than a larger size of 1000, as evident in the higher waiting time at the hospitals. This unforeseen outcome can be explained by observing that after the nearest hospital becomes full, the remaining waiting population that heads to another hospital is unable to fill up the new one. The new hospital remains in a critical state for a longer time, causing an increased waiting time. This effect is visible in the inset plot on the left of Fig. 1, where the curve for the population size of 500 produces the highest percentage of waiting persons around 400 ticks. A similar behavior is produced by an affected population of 2000 individuals, but in this case the scenario unfolded after the three nearest hospitals became full. The right plot of Fig. 1 shows the percentage of active and admitted persons. The term "active" denotes a person who has decided to head to a hospital. As expected, immediately after the attack both the number of active and the number of admitted persons quickly increase, but then different courses are produced by the different population sizes. Another unexpected behavior emerges in the right inset plot of
Fig. 1: an affected population of 1000 individuals produces a higher percentage of admitted persons than one of 2000. A possible explanation can be found by observing that the resources of each hospital are the same for both population sizes, but the number of persons with lethal and severe injuries increases with the population size. These are persons who need more treatment, producing a longer hospitalization time and a higher demand on resources. At the same time, there are also many persons, some lightly and others severely injured, who are awaiting admission.

Multiple event scenario: As a second scenario, we consider a possible terrorist attack involving multiple explosions, in particular caused by three bombs located respectively in Union Square, Times Square and Central Park. The explosions are simulated to occur after 10, 120 and 300 minutes respectively. A population of 5000 persons is involved and initialized to random positions on the map at the beginning of the simulation.

Figure 2: Left plot: evolution curves for active and waiting persons. Right plot: evolution curve for the percentage of admitted persons in the hospitals.

The left plot of Fig. 2 shows the expected increase in the number of active persons after each of the three explosions. It is interesting to note the presence of an unpredicted fourth, but less rapid, increase after 1000 ticks. The waiting curve instead follows a completely different path because of the different spatial positions of the hospitals with respect to the sites of the explosions and their different resource levels. The right plot of Fig. 2 shows the curve for the percentage of admitted persons in the hospitals. As expected, after each explosion there is an increase in the number of admissions, but most of those admitted are probably persons who do not need long-term hospitalization and hence are discharged soon. However, the percentage of admitted persons never becomes zero; random fluctuations after the 700th tick are visible, due to the probabilistic personality factors (irrationality and compliance) of each person.
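The hospital state logic invoked in the single-event discussion above ("available" to "critical" to "full") can be sketched as a tiny classifier; the thresholds here are hypothetical, not PLAN C's actual rules:

```python
def hospital_state(waiting, beds, critical_frac=0.7):
    """Map a hospital's load to the three states discussed above
    (hypothetical thresholds)."""
    if waiting >= beds:
        return "full"       # all waiting persons are redirected elsewhere
    if waiting >= critical_frac * beds:
        return "critical"   # admissions slow down; waiting times climb
    return "available"

# A toy flux of arrivals crowding a 100-bed hospital:
states = [hospital_state(w, beds=100) for w in (10, 65, 80, 100, 120)]
print(states)  # ['available', 'available', 'critical', 'full', 'full']
```

The "climb" phases in Fig. 1 correspond to the available-to-critical transition, and the flat plateaus to the critical-to-full transition, after which the flux of people moves on to the next hospital.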
2.1 Urban topologies and transportation
The behavior of complex systems is strongly affected by the topology of the environment. This dependence has been observed in different domains (biology, social science, economics) and remains particularly true in the context of emergency response for a large urban environment. The location and distribution of the available resources can make the difference in the way a city responds to an attack. The topology of the streets and the transportation system affects how people make decisions as they travel to their destination.
Figure 3: Warfare agent attack in 4 different U.S. cities: left-top: New York City, NY; left-bottom: Boston, MA; right-top: San Francisco, CA; right-bottom: Philadelphia, PA.
Publicly available Geographic Information Systems (GIS) data about the roads and transportation systems of four different U.S. cities (see Fig. 3) was converted into a graph, where nodes are intersections and edges are streets. We have populated the model with hospital resources according to the data already included in the GIS source (if available) or according to publicly available web sites describing the hospital facilities. Agents are constrained to move only along the edges of the graph, with the effective speed at each time-step depending on the health level and on probabilistic terms that simulate congestion effects. A simple variant of the LRTA* [4] algorithm for route computation is used to model a person's panic behavior. We have compared the emergency dynamics of the same warfare agent attack with 5000 casualties in the downtown locations of each city. Fig. 4 shows the dynamics of the percentage of deceased and waiting persons for each city. The results show that San Francisco performs the worst among all the cities studied under an almost identical attack scenario. This discrepancy is most likely due to the distribution of the hospitals, as the majority of them are in fact located far away from the downtown area. On the other hand, Philadelphia and Boston exhibit comparable performance in terms of fatalities, waiting and admissions to hospitals.
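For illustration, a minimal version of the LRTA* search of [4] on a toy street graph (the intersections and edge costs are invented; PLAN C uses its own variant with congestion terms) can be written as:

```python
def lrta_star(graph, h, start, goal, max_steps=100):
    """One agent run of Learning Real-Time A* (Korf 1990).
    graph: {node: {neighbor: edge_cost}}; h: mutable heuristic estimates to goal."""
    path, s = [start], start
    for _ in range(max_steps):
        if s == goal:
            return path
        # Score every neighboring intersection: travel cost plus current estimate.
        f = {n: c + h[n] for n, c in graph[s].items()}
        nxt = min(f, key=f.get)
        h[s] = max(h[s], f[nxt])   # learn: raise the local estimate before moving
        s = nxt
        path.append(s)
    return path

# Hypothetical 4-intersection grid; h starts at 0 everywhere (admissible).
graph = {"A": {"B": 1, "C": 4}, "B": {"A": 1, "D": 2},
         "C": {"A": 4, "D": 1}, "D": {"B": 2, "C": 1}}
h = {n: 0 for n in graph}
route = lrta_star(graph, h, "A", "D")
print(route)  # ['A', 'B', 'A', 'B', 'D']
```

The backtrack in the output is characteristic of LRTA*: the agent commits to moves in real time on partial information and revises its estimates as it goes, which is precisely why it is a plausible stand-in for panicked, locally informed route choice.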
Figure 4: Left plot: percentage of deceased persons (by city). Right plot: percentage of people waiting for treatment at the hospitals (by city).
2.2 ABM model-checking
Unlike statistical analysis of metrics averaged over multiple agents and simulations, the model-checking approach focuses on individual agents' traces. Complex temporal properties may be described in Linear Temporal Logic (LTL) and then model-checked in a model-checker such as XSSYS [Antoniotti et al., The Pacific Symposium on Biocomputing: PSB 2003, 116-127]. The agents' traces produced as output by the system can be read using XSSYS. To demonstrate the technique, we consider an intensive toxic agent exposure in downtown Manhattan and monitor a person and a hospital. Fig. 5 shows two examples of queries that can be expressed in LTL for one of the hospital and person traces, thus allowing finer aspects of the plan to be studied automatically.
Figure 5: Temporal Logic Analysis in XSSYS. Left plot: Time-Trace of a Person. Right plot: Time-Trace of a Hospital.
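XSSYS itself is a separate tool; purely to illustrate the kind of finite-trace LTL checking described above, the operators G ("always"), F ("eventually") and the response pattern G(p -> F q) can be evaluated over a hypothetical hospital trace as follows:

```python
def always(pred, trace):
    """G pred: pred holds at every state of the finite trace."""
    return all(pred(s) for s in trace)

def eventually(pred, trace):
    """F pred: pred holds at some state of the finite trace."""
    return any(pred(s) for s in trace)

def response(p, q, trace):
    """G (p -> F q): every p-state is followed (or met) by a q-state."""
    return all(any(q(s) for s in trace[i:]) for i, s in enumerate(trace) if p(s))

# Hypothetical time-trace of a hospital: (state, waiting persons) per tick.
trace = [("available", 3), ("critical", 40), ("full", 70),
         ("critical", 30), ("available", 5)]
is_full = lambda s: s[0] == "full"
recovers = lambda s: s[0] == "available"

print(eventually(is_full, trace), response(is_full, recovers, trace))  # True True
```

A query such as "does the hospital, once full, eventually become available again?" is exactly this response pattern applied to the trace of a single simulated hospital.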
3 Multi-Objective Optimization for Planning
Response plans involve different, often conflicting, criteria that must be satisfied and optimized in parallel: number of fatalities, average population health, time taken to succumb, waiting time at the hospital, life expectancy, economic cost, etc. In our framework, a response plan is expressed in terms of the system rules and parameters, producing a gargantuan strategy space that must be explored in order to find "optimal" plans. Moreover, the input parameters typically interact in a non-linear fashion. We have been exploring the use of multi-objective evolutionary algorithms (MOEAs) in order to devise plans that optimize multiple objective functions in terms of their Pareto frontier in the high-dimensional space defined by the system [8]. Over the last decade MOEAs have been shown to have many of the properties needed to tackle such challenging computational problems effectively, such as their ability to: (i) generate multiple Pareto-optimal solutions in a single run, (ii) handle a large search space, and (iii) provide robustness against the effects of noise. The PLAN C model produces, as part of its output, several of the relevant objectives/criteria involved in response planning in the form of statistical results of the global system behavior. In this context, a possible multi-objective formulation of the emergency response planning problem may be defined as follows: the selected input parameters of the model are the decision variables, the criteria for plan evaluation are the objectives, the parameter ranges are the variable bounds, and the mutual relations between the sets of parameters are the constraints. In [8] we employed two well-known MOEAs, the Non-dominated Sorting Genetic Algorithm II (NSGA-II) and the Pareto Archived Evolution Strategy (PAES), and calibrated their performance for different pairs of objectives in the context of plan evaluation using PLAN C.
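Both NSGA-II and PAES rest on the notion of Pareto dominance; a minimal non-dominated filter (illustrative only, with invented plan evaluations and all objectives minimized) is:

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and strictly better
    in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of candidate plan evaluations."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical plan evaluations: (fatalities, average waiting time).
plans = [(120, 35.0), (90, 50.0), (90, 40.0), (150, 20.0), (100, 45.0)]
print(pareto_front(plans))  # [(120, 35.0), (90, 40.0), (150, 20.0)]
```

No plan on the resulting frontier can be improved in one objective without worsening another; MOEAs like NSGA-II evolve a whole population towards this frontier in a single run rather than collapsing the objectives into one weighted score.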
4 Conclusions and future investigations
The complex interactions between the affected population and the available resources of a response plan have remained poorly understood, are still beyond the analytical capability of traditional modeling tools, and have resisted systematic investigation. In this research work we have shown that a deep analysis of the sources of complexity generated by the simulation of different kinds of urban emergency scenarios is effectively possible. Our efforts aim to demonstrate that the ABM paradigm, in conjunction with statistical analysis, multi-objective optimization, game theory and model-checking of agent traces, offers a novel way to understand, plan and control the unwieldy dynamics of a large-scale urban emergency response. The same empirical approach to mechanism design and selection in a complex repeated game will very likely find other applications, namely: (1) social networks, (2) swarm robots, (3) power systems design, (4) synthetic and systems biology, etc.
We are also exploring various research questions that these novel applications bring to the forefront.
Bibliography

[1] BATTY, M., Cities and Complexity: Understanding Cities with Cellular Automata, Agent-Based Models, and Fractals, MIT Press (2005).

[2] AUF DER HEIDE, E., "The importance of evidence-based disaster planning", Annals of Emergency Medicine 47, 1 (2006), 34-49.

[3] GREENWALD, A., B. MISHRA, and R. PARIKH, "The Santa Fe bar problem revisited: Theoretical and practical implications", The Proceedings of the Summer Festival on Game Theory: International Conference, (1998).

[4] KORF, R. E., "Real-time heuristic search", Artificial Intelligence 42 (1990), 189-211.

[5] MYSORE, V., O. GILL, R. S. DARUWALA, M. ANTONIOTTI, V. SARASWAT, and B. MISHRA, "Multi-agent modeling and analysis of the Brazilian food-poisoning scenario", The Agent Conference, (2005).

[6] MYSORE, V., et al., and B. MISHRA, "Agent modeling of a sarin attack in Manhattan", In Proc. 1st Intl. Workshop on Agent Technology for Disaster Management, ATDM (2006).

[7] NARZISI, G., J. S. MINCER, S. SMITH, and B. MISHRA, "Resilience in the face of disaster: Accounting for varying disaster magnitudes, resource topologies, and (sub)population distributions in the PLAN C emergency planning tool", In Proc. of the 3rd Intl. Conf. on Industrial Applications of Holonic and Multi-Agent Systems (HoloMAS 2007), vol. 4659, Springer LNAI (2007), 433-446.

[8] NARZISI, G., V. MYSORE, and B. MISHRA, "Multi-objective evolutionary optimization of agent-based models: an application to emergency response planning", In Proc. of the IASTED Intl. Conf. on Computational Intelligence (CI 2006), ACTA Press (2006), 224-230.

[9] PNAS, Adaptive Agents, Intelligence, and Emergent Human Organization: Capturing Complexity through Agent-Based Modeling, vol. 99(3), (May 2002).
Acknowledgment: We acknowledge the support from Department of Homeland Security Grant #2204-GT-TX-0001 and an NSF ITR grant #CCR-0325605. We also wish to thank Dr. O. Gill, currently at Bloomberg, Dr. R.-S. Daruwala, currently at Google, and Mr. F. Menges, currently at the NYU Bioinformatics group, for their contributions to the implementation of PLAN C; close collaborators Dr. V. Saraswat of IBM, Profs. S. Mitter and G. Verghese of MIT and P. Doerschuck of Cornell for their advice and criticisms; and Drs. S. Smith, L. Nelson, D. Rekow, M. Triola, L. Halcomb and I. Portelli for their input to the clinical aspects of the study design and development.
Chapter 24

Systems Biology via Redescription and Ontologies (II): A Tool for Discovery in Complex Systems

Samantha Kleinberg†, Marco Antoniotti†‡, Satish Tadepalli§, Naren Ramakrishnan§, Bud Mishra†
† Courant Institute of Mathematical Sciences, New York University
‡ DISCo, Università Milano-Bicocca, Italy
§ Computer Science, Virginia Tech
A complex system creates a "whole that is larger than the sum of its parts" by coordinating many interacting simpler component processes. Yet each of these processes is difficult to decipher, as their visible signatures are seen only against a syntactic background, devoid of context. Examples of such visible datasets are time-course descriptions of gene-expression abundance levels, neural spike trains, or click-streams for web pages. It has now become rather effortless to collect voluminous datasets of this nature; but how can we make sense of them and draw significant conclusions? For instance, in the case of time-course gene-expression datasets, rather than following small sets of known genes, can we develop a holistic approach that provides a view of the entire system as it evolves through time? We have developed GOALIE (Gene-Ontology for Algorithmic Logic and Invariant Extraction), a systems biology application that presents global and dynamic perspectives (e.g., invariants) inferred collectively over a gene-expression dataset. Such perspectives are important in order to obtain a process-level understanding of the underlying cellular machinery, especially how cells react, respond, and recover from environmental changes. GOALIE uncovers formal temporal logic models of biological
processes by redescribing time-course microarray data into the vocabulary of biological processes and then piecing these redescriptions together into a Kripke structure. In such a model, possible worlds encode transcriptional states and are connected to future possible worlds by state transitions. An HKM (Hidden Kripke Model) constructed in this manner then supports various query, inference, and comparative assessment tasks, besides providing descriptive process-level summaries. The formal basis for GOALIE is a multi-attribute information bottleneck (IB) formulation, in which we aim to retain the most relevant information about states and their transitions while at the same time compressing the number of syntactic signatures used for representing the data. We describe the mathematical formulation, software implementation, and a case study of the yeast (S. cerevisiae) cell cycle.
1 The problem of microarray analysis
Microarrays, which allow the measurement of expression levels for tens of thousands of genes at a time, are a useful technique for gathering biological data, but it can be difficult to make sense of the results they produce. Experiments can be repeated with varying conditions (such as starvation, heat shock, etc.), and with each microarray having upwards of 10,000 probes and many time-course experiments having over 10 time points, a vast amount of data is being generated. One frequent method used to deal with this is to cluster [1] the data into groups that have similar properties. There are two common ways of doing this clustering: by expression patterns over the entire dataset, which fails to take into account variations of the data over time; and by function, using a known ontology of biological processes, which fails to find unknown groupings. What is needed is a method of modeling the data based on both biological function and temporal evolution, and a way to relate it to other experiments.
2 Computation
Our computational methods are based on a temporal redescription, which takes genes and translates them into a controlled vocabulary, and then stitches those translations together to form a picture of the biological system as it evolves over time. To facilitate the construction of our model of the dataset, we begin by breaking the data into small overlapping (or non-overlapping) windows of time. Each window contains all of the genes in the dataset, but with only their expression values for a specific interval of time. The initial cluster analysis, using the numerical gene-expression data, is done on these windows rather than on the entire dataset. By doing this "windowing," we have simplified each computational step while also allowing for the fact that groups of genes may briefly act together but diverge across the whole dataset. Using this windowed approach we can make inferences such as "process A and process B act together beginning in hours 1-2 and continuing through hours 2-4. They are then joined by process C during hours 4-6." Each cluster in each window of time is first redescribed
into the vocabulary of biological processes, and then we redescribe the clusters again, in relation to each other. That is, we connect clusters across time windows by tracking their describing terms. The connection relationships can be both one-to-many and many-to-one, as their meaning is that the terms in the connection persist from the source cluster to the destination cluster. A more rigorous algorithm for constructing these clusters and cluster connections has been developed using a generalization of Shannon-Kolmogorov rate-distortion theory, called the "information bottleneck" approach: in this setting, the redescription is simply viewed as a lossy compression of all the temporal observations by a simpler finite automaton that introduces minimal distortion in terms of the known ontologies. This generalized algorithm appears elsewhere [2]. The basis for the computations we perform is the use of temporal logic in the form of a Kripke structure. Here, this is a directed graph (often acyclic, a DAG), defined by its vertices, edges, and properties (V, E, P). Generalization to probabilistic versions of Kripke structures can be achieved by assigning transition probabilities to the edges. The vertices represent the reachable states of the system (clusters), edges are transitions between states (cluster connections across time windows), and properties (GO categories) are used to annotate the states in which they are true. Terms within the Gene Ontology (GO) [4] have their own hierarchical structure, which is also incorporated into the model. In the case of our yeast (S. cerevisiae) data, describing terms would include "cell cycle", and more specifically "M phase" and "G1/S transition of mitotic cell cycle."
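The windowing step described above can be sketched as follows. This is an illustrative fragment, not the actual GOALIE implementation; the window width and overlap values are arbitrary:

```python
def make_windows(series_len, width, overlap):
    """Return (start, stop) index pairs covering a time course of
    `series_len` points with consecutive, possibly overlapping windows."""
    step = width - overlap
    assert step > 0, "overlap must be smaller than the window width"
    windows, t = [], 0
    while t < series_len:
        stop = min(t + width, series_len)
        windows.append((t, stop))
        if stop == series_len:
            break
        t += step
    return windows

# e.g. 10 time points, 4-point windows overlapping by 2 points
print(make_windows(10, width=4, overlap=2))  # [(0, 4), (2, 6), (4, 8), (6, 10)]
```

Cluster analysis is then run separately on each column slice of the expression matrix selected by these index pairs.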
2.1 Computation in detail: HKM and IB
This section explains in detail the methods used in GOALIE to derive a Kripke model in the form of a DAG from a given time series dataset.

The Information Bottleneck Principle

The Information Bottleneck principle [14] is an information-theoretic approach to clustering. Suppose each instance x_i of a random variable X is associated with an instance y_i of another random variable Y, and we desire a clustered representation T of X which preserves as much information about the relevant variable Y as possible. Typically X represents the features of the objects to be compressed and Y represents some relevant information about their classes. According to the IB principle the clustering scheme T is chosen to minimize the functional

L_min = I(T; X) - β I(T; Y)    (1.1)

where I(T; X) represents the mutual information between T and X, I(T; X) = Σ_{t,x} p(t, x) log [ p(t, x) / (p(t) p(x)) ], and β is the Lagrange parameter that controls the tradeoff between compression and preservation of relevant information. Lower values of β give more importance to compression while higher values give more importance
to preserving relevant information. Equivalently to minimizing the functional in equation (1.1), we can maximize

L_max = I(T; Y) - β⁻¹ I(T; X)    (1.2)
This equation provides a metric to evaluate different clustering schemes of X. The key steps in GOALIE are:

1. Partitioning the given time series data into windows.
2. Clustering expression data in each window.
3. Connecting the clusters in neighboring windows.

Clustering within a window
Let G represent the random variable corresponding to the expression vectors in a given window W. Each expression vector g_i is an instantiation of G. Still et al. [12] show that the k-means clustering algorithm can be derived from the information bottleneck method. Following equation (1.2), the IB formulation of k-means clustering of G is as follows
L_max = I(G̃; G) - β⁻¹ I(G̃; i)    (1.3)
The clustering G̃ compresses the data indices i while preserving similarity in expression space G.

Choosing the partitioning of windows
There are many ways to define an objective criterion for achieving a partitioning of the windows. One approach is as follows. For a given time series dataset, we identify a tiling (overlapping or non-overlapping) of windows such that, when each window is restated in terms of clusters, the cluster identities in one window are informative of cluster identities in the neighboring window. The sum of mutual information across all windows, along with a penalty for overlaps, can then be used to define the IB functional at this stage.

Connecting the clusters in neighboring windows
Given the set of windows that span all the time points in the dataset, our next task is to connect the clusters in neighboring windows to track the temporal relationships in the data. In each window we find the GO ids enriched using the Fisher exact test with Benjamini-Hochberg correction (or with an empirical Bayesian approach). Two clusters C_i and C_j are said to be δ-equivalent if the Jaccard coefficient between the sets of GO ids enriched in C_i and C_j is ≥ δ. With these connections between the clusters, we can introduce a directed graph G = (C, E) whose vertices C are the clusters, with an edge from cluster C_i to C_j if they are in neighboring windows and the Jaccard coefficient of the GO ids in the clusters is at least δ. See [6] for additional details of the window-partitioning and neighbor-connecting algorithms.
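The δ-equivalence test and the resulting cluster graph can be sketched as follows. This is a toy illustration, not GOALIE's code: the GO ids and the δ value are invented for the example, and enrichment testing is assumed to have already produced a set of GO ids per cluster:

```python
def jaccard(a, b):
    """Jaccard coefficient between two sets of enriched GO ids."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def connect_clusters(go_by_window, delta):
    """Edges between clusters in neighboring windows whose enriched
    GO-id sets have Jaccard coefficient >= delta.

    go_by_window: list over windows; each entry maps a cluster id to
    its set of enriched GO ids (toy input)."""
    edges = []
    for w in range(len(go_by_window) - 1):
        for ci, gi in go_by_window[w].items():
            for cj, gj in go_by_window[w + 1].items():
                if jaccard(gi, gj) >= delta:
                    edges.append(((w, ci), (w + 1, cj)))
    return edges

# two windows, two clusters each; GO ids are illustrative
windows = [
    {1: {"GO:0006260", "GO:0000076"}, 2: {"GO:0000910"}},
    {1: {"GO:0006260", "GO:0000076", "GO:0006281"}, 2: {"GO:0007067"}},
]
print(connect_clusters(windows, delta=0.5))  # [((0, 1), (1, 1))]
```

The resulting edge list is exactly the edge set E of the directed cluster graph described above.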
Figure 1: Diagram of GOALIE's output for the yeast cell cycle
Figure 2: GOALIE output of the HKM as a graph of clusters
3 Causal Connections
Equipped with an HKM, as created by the methods described earlier, one can not only investigate which standard temporal logic invariants (e.g., reachability, safety, etc.) hold, but also seek out complex causal relationships that are not completely explicit. When interpreted in the context of gene expression, causal relationships can refer to sets of processes, or genes, modulating one another to establish certain patterns of regulation. Attempts to determine these causal relationships using standard statistical approaches have proven difficult [10]; however, since we can reframe the problem in terms of model checking and use the preexisting framework along with philosophical ideas of probabilistic causality [13], we have been able to infer these causal relationships from the HKM. The algorithm for this performs two key steps: first, generate and represent causal relationships as a set of logical formulas; next, using multiple hypothesis testing, determine a subset of formulas satisfied by the model with reasonably high confidence. We build our formulas from the representation of the following types of causes: prima facie causes (possible causes), which raise the probability of the effect; spurious causes, which seem to raise the probability of an effect but whose relationship may be explained by other factors; and genuine causes (non-spurious prima facie causes). This step results in three sets of temporal logic formulas (in PCTL), relating causes to their effects (which themselves may be logical formulas). See [7] for further details.
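The prima facie condition (a cause raises the probability of its effect) can be illustrated over a simple discretized event sequence. This sketch shows only the basic probabilistic test, not the PCTL model-checking formulation of [7]; the trace and lag are invented for the example:

```python
def prima_facie(events, cause, effect, lag=1):
    """Test the basic prima facie condition
    P(effect occurs `lag` steps after cause) > P(effect)
    over a sequence of event sets, one set per time step (toy setting)."""
    n = len(events) - lag
    effect_total = sum(effect in events[t + lag] for t in range(n))
    cause_times = [t for t in range(n) if cause in events[t]]
    if not cause_times:
        return False
    p_effect = effect_total / n
    p_effect_given_cause = sum(
        effect in events[t + lag] for t in cause_times) / len(cause_times)
    return p_effect_given_cause > p_effect

# toy trace: 'a' is usually followed by 'b'; 'c' is not
trace = [{"a"}, {"b"}, {"a"}, {"b"}, {"c"}, set(), {"a"}, {"b"}]
print(prima_facie(trace, "a", "b"))  # True
print(prima_facie(trace, "c", "b"))  # False
```

Ruling out spurious causes then requires checking whether other factors account for the probability increase, which is where the multiple-hypothesis-testing step described above comes in.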
4 Software and Results
The main output of the software is a representation of the HKM as a directed graph. The graph presents clusters and their connections and allows exploration of the genes and terms comprising each. Integrated links to websites such as Entrez [3] and the Affymetrix database allow further study of the probes. We present a validation process for GOALIE that was tested primarily using the yeast (S. cerevisiae) cell cycle dataset collected by Spellman et al. [9]. There are three components to the Spellman yeast cell cycle data, following the yeast's behavior with alpha factor, cdc15, and elutriation. This paper describes a case study using the alpha factor dataset. Further discussion of the visualization and software facilities in GOALIE is the subject of a forthcoming paper.
4.1 Cluster Graph
We began with time-course gene expression data for 6178 genes at 18 time points. After filtering out genes with missing expression data, 4489 genes remained. The resulting data was partitioned into 5 time windows, with all but the last window having an overlap of two time points. Each window was divided into 15 clusters, yielding a total of 75 clusters. These clusters formed the initial input to GOALIE. They were redescribed using the Fisher exact test with Benjamini-Hochberg correction and a p-value of 0.05. The redescription across
time windows was computed with a Jaccard coefficient δ = 0.8. Using the notation W:C to denote cluster number C in window W, we describe a sampling of our results, as seen in figure 1.

1:4 to 2:15 and 2:3: DNA replication initiation as well as DNA replication checkpoint are initially up-regulated, consistent with the S phase of the cell cycle. 1:4 then splits into two clusters with opposite regulation patterns. In cluster 2:15, the processes labeling the S portion of 1:4 are down-regulated, while cytokinesis, meiotic G2/MI transition, and mitotic metaphase/anaphase transition are up-regulated in 2:3.

3:1: At this time, genes from 2:15 and 2:3, which had been regulated separately, converge, as G2 continues and processes associated with repair are up-regulated. As in the first window, this cluster again splits into two groups.

4:5: Cell organization and biogenesis, cell morphogenesis checkpoint, cytokinesis, and postreplication repair are all up-regulated, signaling the beginning of the M phase. Genes from this cluster divide into oppositely regulated groups in the last window.

4:11: Also part of M phase, but more sharply regulated; bud site selection and chromosome condensation are first high and then go down in window 4. This cluster then separates into two groups, which overlap with genes from 4:5.

5:9 and 5:15: As the M phase ends and G1 begins, 5:9 contains down-regulated processes such as postreplication repair and mismatch repair. In 5:14, there is a positive regulation of the exit from mitosis, AMP biosynthesis, and aspartate biosynthesis as the cell readies for the G1 phase.
5 Conclusion and future directions
Many complex systems, whether natural or engineered, are amenable to GOALIE's semantic analysis within the kinds of logical frameworks it creates. Through the examples used here, such a logical framework has been seen to provide a new way of reasoning about complex biological systems, as well as the interface that makes this information accessible to scientists. In the future, GOALIE is planned to provide support for other ontologies and controlled vocabularies, such as MeSH [8] and KEGG [5]. There are also several interesting technical questions to be answered, such as how to select the optimal size of the Kripke model, mostly determined by the window size and number of clusters [11], as those choices can strongly affect the rate (data compression), distortion, and hence the fidelity of the resulting models. Additionally, GOALIE will need to continually evolve to interface with users from other fields through transparent representations such as Gantt charts, which minimize the information loss and provide more background information on the genetic basis for the displayed terms.
6 Acknowledgements
We would like to acknowledge the funding support for GOALIE from the DARPA BioCOMP program, the Dept. of Homeland Security, and the NSF EMT program.
Bibliography

[1] BAR-JOSEPH, Z., "Analyzing time series gene expression data", Bioinformatics 20, 16 (2004), 2493-2503.
[2] COVER, Thomas M., and Joy A. THOMAS, Elements of Information Theory, John Wiley & Sons (1991).
[3] "Entrez PubMed", http://www.ncbi.nlm.nih.gov/entrez/query.fcgi.
[4] "Gene Ontology consortium site", http://www.geneontology.org.
[5] "KEGG database", http://www.genome.ad.jp/kegg/.
[6] KLEINBERG, Samantha, Kevin CASEY, and Bud MISHRA, "Systems biology via redescription and ontologies (I): Finding phase changes with applications to malaria temporal data", Submitted.
[7] KLEINBERG, Samantha, and Bud MISHRA, "Inferring causation in time course data with temporal logic", Submitted.
[8] "MeSH site", http://www.nlm.nih.gov/mesh/meshhome.html (1999).
[9] SPELLMAN, P. T., G. SHERLOCK, M. Q. ZHANG, V. R. IYER, K. ANDERS, M. B. EISEN, P. O. BROWN, D. BOTSTEIN, and B. FUTCHER, "Comprehensive Identification of Cell Cycle-Regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization", Molecular Biology of the Cell 9 (1998), 3273-3297.
[10] SPIRTES, P., C. GLYMOUR, R. SCHEINES, and S. KAUFFMAN, "Constructing Bayesian Network Models of Gene Expression Networks from Microarray Data", Proceedings of the Atlantic Symposium on Computational Biology, Genome Information Systems and Technology (2001).
[11] STILL, S., and W. BIALEK, "How Many Clusters? An Information-Theoretic Perspective", Neural Computation 16 (2004), 2483-2506.
[12] STILL, Susanne, William BIALEK, and Léon BOTTOU, "Geometric clustering using the information bottleneck method", NIPS (2003).
[13] SUPPES, Patrick, A Probabilistic Theory of Causality, North-Holland (1970).
[14] TISHBY, N., F. C. PEREIRA, and W. BIALEK, "The Information Bottleneck Method", Proceedings of the 37th Annual Allerton Conference on Communication, Control and Computing (1999).
Chapter 25
Biotic Population Dynamics: Creative Biotic Patterns
Hector Sabelli and Lazar Kovacevic
Chicago Center for Creative Development, 2400 Lakeview Avenue, Illinois, 60614, U.S.A.
Hector_Sabelli@rush.edu

Abstract: We present empirical studies and computer models of population dynamics that demonstrate creative features, and we speculate that these creative processes may underlie evolution. Changes in population size of lynx, muskrat, beaver, salmon, and fox display diversification, episodic changes in pattern, novelty, and evidence for nonrandom causation. These features of creativity characterize bios, and rule out random, periodic, chaotic, and random walk patterns. Biotic patterns are also demonstrated in time series generated with multi-agent predator-prey simulations. These results indicate that evolutionary processes are continually operating. In contrast to standard evolutionary theory (random variation, competition for scarce resources, selection by survival of the fittest, and directionless, meaningless evolution), we propose that biological evolution is a creative development from simple to complex in which (1) causal actions generate biological variation; (2) bipolar feedback (synergy and antagonism, abundance and scarcity) generates information (diversification, novelty, and complexity); (3) connections (of molecules, genes, species) construct systems in which simple processes have priority for survival but complex processes acquire supremacy.
1 Introduction

Although evolutionary theory and population dynamics have traditionally been separate scientific fields, evolution can occur rapidly, as illustrated not only by the rapid mutations of viruses but also by observations of birds. Ecological and evolutionary dynamics can occur on similar timescales, so population cycles can be transformed into creative processes [Yoshida et al. 2003]. Population studies provide numerical data for mathematical analysis which indicate creative patterns of change compatible with evolutionary processes rather than with stationary models. Here we formulate the hypothesis that evolution is a causal and creative development that generates diversity,
novelty, and complexity, beyond that present in simpler origins, rather than being a product of chance and selection.
2 Population dynamics

Empirical data were obtained from the Global Population Dynamics Database maintained by the NERC Centre for Population Biology, Imperial College, London. We examined Canadian lynx, Finland voles, North American muskrat, American beaver, American red fox, and Atlantic salmon, species for which there is sufficient data to carry out nonlinear analysis. Data analysis was performed using the Bios Data Analyzer [Sabelli et al. 2005], which measures recurrence and statistical data using a wide range of parameter points.
Figure 1: Recurrence plots of population numbers of two animal species (lynx and muskrat) and of mathematically generated biotic and chaotic series. The biotic and animal series show organized clusters of isometries (complexes) separated by recurrence-free intervals, while chaotic series generate uniform plots such as observed with random data.
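A recurrence plot of the kind shown in Figure 1 can be computed in a few lines. This is a minimal sketch, not the Bios Data Analyzer; the embedding dimension and distance cutoff are arbitrary choices:

```python
def recurrence_plot(series, dim=2, cutoff=0.5):
    """Boolean recurrence matrix: R[i][j] is True when the embedded
    vectors starting at positions i and j are closer (Euclidean)
    than `cutoff`."""
    vecs = [series[k:k + dim] for k in range(len(series) - dim + 1)]

    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

    return [[dist(u, v) < cutoff for v in vecs] for u in vecs]

# a periodic toy series recurs on every cycle
rp = recurrence_plot([0, 1, 0, 1, 0, 1], dim=2, cutoff=0.5)
print(sum(row.count(True) for row in rp))  # 13
```

Plotting the True entries of such a matrix produces the diagonal and clustered structures discussed in the text; aperiodic series with complexes show isolated blocks of recurrences separated by empty regions.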
All the animal populations examined show an increase in standard deviation (SD) with embedding (local diversification), which indicates the generation of new phenomena, as also observed with biotic series, random walks, and many creative natural processes [Sabelli, 2005]. After shuffling the data, the SD does not increase with embedding. Random distributions and chaotic attractors maintain a stable variance regardless of sample duration or embedding (after the first few embeddings). Our results are consistent with previous observations of increased variance in population abundance with the length of the time series [Pimm 1991; Inchausti and Halley 2001]. Population numbers show the existence of episodic patterns, which are most evident in recurrence plots (Fig. 1). Distinct clusters of recurrences with different patterns (complexes) separated by recurrence-free intervals are evident in the recurrence plots of muskrat, beaver, fox, and salmon populations, as well as in biotic series and in brown noise. Shuffling erases these complexes. Lynx and vole populations show more regular clusters, reminiscent of the sequences of identical clusters of recurrences observed with periodic series (e.g. the Volterra-Lotka model). Random and chaotic series show uniform recurrence plots, without clustering. The quantification of recurrence allows one to measure the degree of repetitiveness or of novelty. Since a recurrence is a repetition of pattern, a lower than random recurrence rate indicates that the process under consideration innovates more than chance events. Novelty is thus defined as the increase in recurrence isometry produced by shuffling the data [Sabelli 2001]. Novelty is evident in muskrat, beaver, salmon, lynx, and red-fox populations, but not in voles (Fig. 2). Novelty is an essential feature of bios, which differentiates it from chaos. Novelty is initially a surprising property, but living processes characteristically vary faster than random via specific mechanisms such as sexual reproduction, meiosis, and the induction of mutation by radiation and by stress.

Figure 2: Embedding plots (for mathematical bios and muskrat populations) of the number of isometric recurrences (vertical axis) as a function of the number of embeddings (horizontal axis). Bold line: original series. Thin line: shuffled copy. Randomization increases the number of isometries in animal populations and in biotic series, denoting novelty, and decreases isometries in chaos.
Changes in population numbers are gradual, i.e. the successive terms in the time series are contiguous. A discrete time series is contiguous when consecutive terms are similar to each other, whereas consecutive terms often lie on opposite sides of the median in random and chaotic series. These changes in population number are not random. The observation of organization in the time series of differences indicates causal rather than stochastic origin. Thus, time series of the differences between consecutive terms in population time series showed complexes in all 6 species, and novelty in muskrat, beaver, salmon
and red-fox. The nonrandom origin of these patterns was supported by statistical analysis [Kendall 1973] that showed partial autocorrelation for several lags, as observed in causal processes but not in random walks. These observations demonstrate that population numbers vary according to a pattern that we have called bios [Sabelli et al. 1997; Kauffman and Sabelli 1998]. All six species showed deterministic causation, diversification, and contiguity, and five of them met all the criteria for bios, including contiguity, aperiodic pattern, complexes, and novelty. Diversification, complexes, and novelty exclude white noise or chaos. Mathematical bios is chaotic insofar as it is aperiodic, deterministic, and extremely sensitive to initial conditions. Biotic processes are radically different from chaotic attractors in being contiguous and creative. Creativity can be operationally defined by the demonstration of three related characteristics: diversification, complexes, and novelty in causal processes. Chaotic attractors do not display these properties. These time series analyses show that many processes commonly regarded as chaotic or stochastic are actually biotic. Biotic patterns are found in heartbeat intervals [Sabelli et al. 1997], respiration, sequences of bases in DNA, the shape of rivers and shorelines, meteorological data, economic series [Sabelli 2005], and the temporal distribution of galaxies representing the expansion of the universe [Sabelli and Kovacevic 2006], which suggests that cosmological evolution may be determined by causal physical forces, rather than resulting from the expansion of random non-homogeneities. What are the causal processes that generate life-like (biotic) patterns in nature?
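The novelty measure used above (the increase in recurrence isometry produced by shuffling) can be sketched as follows; a minimal illustration with an arbitrary isometry criterion, not the published Bios Data Analyzer algorithm:

```python
import random

def embed(series, dim):
    """Overlapping vectors of `dim` consecutive terms."""
    return [series[i:i + dim] for i in range(len(series) - dim + 1)]

def isometries(series, dim=3, cutoff=1.0):
    """Count pairs of embedded vectors closer than `cutoff` (Euclidean);
    a crude stand-in for the isometric-recurrence counts in the text."""
    vecs = embed(series, dim)
    count = 0
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            d = sum((a - b) ** 2 for a, b in zip(vecs[i], vecs[j])) ** 0.5
            if d < cutoff:
                count += 1
    return count

def novelty(series, dim=3, cutoff=1.0, seed=0):
    """Novelty as defined in the text: the increase in recurrence
    isometry produced by shuffling the data."""
    shuffled = list(series)
    random.Random(seed).shuffle(shuffled)
    return isometries(shuffled, dim, cutoff) - isometries(series, dim, cutoff)

# a constant series recurs maximally and gains nothing from shuffling
print(isometries([0.0] * 10), novelty([0.0] * 10))  # 28 0
```

A positive novelty value on empirical data corresponds to the situation in Figure 2 where the shuffled copy shows more isometries than the original series.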
The dual, accelerating and decelerating innervation of the heart, the prototype of bios, suggested to us the importance of bipolar (positive and negative) interactions; indeed, bios is generated mathematically by trigonometric functions that model bipolar feedback [Kauffman and Sabelli 1998]. Bipolar feedback is evident in the generation of biotic patterns in empirical processes. For instance, heartbeat intervals are regulated by the antagonistic action of the sympathetic accelerating and parasympathetic decelerating nerves, economic prices are affected in opposite manner by supply and demand, and the Schrödinger equation involves wave functions. In contrast, unipolar feedback, such as the one in the logistic equation, produces chaos but not bios.
Figure 3: Biotic series generated by the recursion of a trigonometric function (left), and the chaotic pattern of the series of differences between consecutive terms (right).
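A recursion of the kind shown in Figure 3 can be generated in a few lines. We assume here the process-equation form x[t+1] = x[t] + g*sin(x[t]) reported in the bios literature; the parameter values are illustrative:

```python
import math

def process_equation(g, x0=1.0, n=500):
    """Iterate the bipolar-feedback recursion x[t+1] = x[t] + g*sin(x[t]).
    Reportedly, moderate g yields a bounded chaotic series, while g above
    roughly 4.6 lets the trajectory escape its 2*pi cell and wander
    to ever-new values (bios)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(xs[-1] + g * math.sin(xs[-1]))
    return xs

chaotic = process_equation(g=4.0)  # stays within one 2*pi cell
biotic = process_equation(g=4.8)   # typically drifts across cells
print(round(max(chaotic) - min(chaotic), 2), round(max(biotic) - min(biotic), 2))
```

The diversification criterion discussed earlier (SD increasing with sample length) distinguishes the drifting biotic series from the bounded chaotic one, even though both are deterministic and aperiodic.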
3 Population Dynamic Models

Do similar processes arise in the complex webs formed by individuals of different species? To examine these issues, we constructed a multi-agent simulation model using REPAST (http://repast.sourceforge.net) that includes three types of agents: plants, which grow in a sinusoidal "seasonal" pattern; herbivores (buffalo) that eat the grass; and carnivores that eat the buffalo. Carnivores die when they reach their "natural death" age (randomly set for each agent), or when they lack food. The model is stable within a relatively small
range of parameters, within which all three populations show biotic time series with diversification, complexes, novelty, and deterministic causation (partial autocorrelation for several lags, pattern in the series of differences between consecutive terms). Thus, using multi-agent models instead of differential equations shows that food chains generate creative biotic patterns instead of periodic or chaotic ones. Previous studies have shown that prey-predator relations indeed often generate complex patterns [Dercole et al. 2003; Hudson and Bjornstad 2003; Gilg et al. 2003].
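A minimal agent-style version of such a food-chain model might look like the following. This is a toy sketch, not the REPAST model used in the study; all agent rules and rates are invented:

```python
import math
import random

def step(plants, herbivores, carnivores, t, rng):
    """One tick of a minimal agent-style food chain. Plants regrow in a
    sinusoidal 'seasonal' pattern; each herbivore must find a plant to
    survive and may reproduce; each carnivore must catch a herbivore."""
    plants += int(20 * (1 + math.sin(t / 6.0)))   # seasonal regrowth
    fed = 0
    for _ in range(herbivores):                   # each herbivore acts
        if plants > 0:
            plants -= 1
            fed += 1 + (rng.random() < 0.3)       # survives, may reproduce
    herbivores = fed
    fed = 0
    for _ in range(carnivores):                   # each carnivore hunts
        if herbivores > 0 and rng.random() < 0.6:
            herbivores -= 1
            fed += 1 + (rng.random() < 0.4)
    carnivores = fed
    return plants, herbivores, carnivores

rng = random.Random(1)
state = (100, 30, 10)
series = [state]
for t in range(60):
    state = step(*state, t, rng)
    series.append(state)
print(len(series))  # 61
```

The resulting population series can then be fed to the recurrence and novelty analyses described in section 2 to check for diversification, complexes, and novelty.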
4 Bios Theory of Evolution

These studies indicate that changes in population are causal and creative processes. In population data, patterns in the sequence of differences and partial autocorrelation demonstrate deterministic causation; diversification and novelty signify creation; and the succession of different complexes indicates rapid change compatible with fast evolution. Standard evolutionary theory postulates that evolution occurs at a slow and gradual pace, as contrasted to the more rapid changes observed in the dynamics of population. Further, standard evolutionary theory postulates (1) biological variation resulting from accidental events (such as random mutations and giant meteorite crashes); (2) competition for scarce resources; and (3) selection by the survival of the fittest. In contrast, the bios model of creation proposes that (1) biological diversification, novelty, and complexity largely result from causal processes; (2) synergy and abundance are as important as antagonism and scarcity, and together constitute bipolar feedback processes that generate diversity, greater-than-random novelty, and complexity; and (3) survival has priority but complex processes have supremacy in both variation and selection. Action, bipolarity, and connectedness are generic features of natural processes sufficient to generate diverse, novel, and complex, life-like (biotic) patterns, as illustrated by mathematical recursions. We hypothesize that causal actions, bipolar interactions, and connectedness significantly contribute to biological evolution. Action: Biological actions are physical actions, i.e. changes of energy in time. Time implies unidirectionality, which is imprinted as asymmetry in biomolecules (Pasteur), embodied as sequential order in processes, and more generally manifested as causality. A physical action causes consequences; there are no isolated events.
Processes are sequences of actions, or more properly lattices, as actions interact, converge, and bifurcate. Causal actions can generate biological variation, as illustrated by the novelty generated by biotic feedback, sexual reproduction, meiosis, the induction of mutation, and the invention and spread of new behaviors by brained organisms that establish new forms of selection. Thus sequences of actions also become imprinted as hierarchies of complexity. Bipolarity: Just as physical processes are organized by fundamental symmetries described by mathematical groups, biological processes involve synergic and antagonistic oppositions which generate diversification and novelty. Sexual procreation is exemplary. Notwithstanding, conflict over scarce resources has been regarded as the fundamental motor of change in nineteenth-century theories of biological evolution and of economic processes still regarded as current. Correspondingly, the effects of intra- and interspecies competition in driving population dynamics have been emphasized. More recently, the importance of positive interactions between species [Bertness and Callaway, 1994; Bruno et al. 2003] as well as the coexistence of positive and negative effects [Jones et al. 1994] have been recognized. Competition over scarce resources generates unipolar feedback, such as the logistic equation, which can generate chaos but not bios (see above), whereas bios is the pattern actually observed in population dynamics as well as in economic processes [Sabelli 2005]. Bipolar opposition can generate bios. Both abundance and scarcity of the multiple biological and non-biological components of life occur in ecological
communities. Cooperation among species is as common as competition. Multicellular organisms cannot live and reproduce without the participation of intracellular mitochondria and extracellular microorganisms. We need our intestinal flora, and plants require symbiotic fungi and pollinators, as well as birds, mammals and insects to spread their seeds. One quarter of all documented fungi are lichenized; trees need fungi to develop; ants have cultivated fungi for over 50 million years. The importance of synergistic processes for life and its evolution is well known, although surprisingly absent from neo-Darwinist discussions. Notably, competition appears to be more important where there is great abundance, such as the tropical areas studied by Darwin, while cooperation among species appears to be more obvious under harsh conditions such as the Siberian plains studied by Kropotkin, the theorist of evolution by mutuality. Ideological differences may have been more important than climatic ones in determining the opposite sign of these two theories of evolution. Connectedness: Life evolves as a narrow layer of matter in which atoms nucleate to form systems: molecules, cells, organisms, communities. The formation of materials by other organisms allows the emergence of new and often more complex organisms, a fact that is obvious but not trivial [Sabelli, 1989]. In genome transfers, pre-formed materials drive and direct change. Genetic exchange is co-creation. Organisms can supply each other with genes and even complete genomes [Syvanen and Kado, 2002]. Aggregation, endosymbiosis, pluricellularity, multispecies communities, and sociality are essential for evolution. The intracellular incorporation of microorganisms as mitochondria and chloroplasts is necessary for higher forms of life, and in fact the very origin of eukaryotes may have been a cellular fusion between bacteria and archaea [Gupta and Golding 1996].
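The contrast drawn above between unipolar feedback (the logistic equation, which yields chaos) and bipolar feedback (which yields bios) can be made concrete. The sketch below compares the logistic map with a constant-gain form of the process equation of Kauffman and Sabelli [1998]; the parameter values (r = 4.0, g = 4.7, the initial conditions) are my own illustrative choices, not values taken from the text:

```python
import math

def logistic(x, r=4.0, steps=1000):
    """Unipolar feedback: x -> r*x*(1 - x), a model of competition
    over scarce resources; chaotic for r = 4 but confined to [0, 1]."""
    out = []
    for _ in range(steps):
        x = r * x * (1.0 - x)
        out.append(x)
    return out

def process_eq(a, g=4.7, steps=1000):
    """Bipolar feedback: a -> a + g*sin(a); the sine term alternates
    between positive (synergic) and negative (antagonistic) pushes."""
    out = []
    for _ in range(steps):
        a = a + g * math.sin(a)
        out.append(a)
    return out

# Chaotic logistic orbits remain trapped in the unit interval...
assert all(0.0 <= x <= 1.0 for x in logistic(0.3))
# ...while the bipolar orbit quickly escapes any fixed neighborhood of
# its starting point (diversification and novelty, not confinement).
assert any(abs(a) > math.pi for a in process_eq(1.0, steps=100))
```

In Sabelli's terminology, the expanding, non-repeating trajectory of the second recursion is the "biotic" pattern reported for population and economic series.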
Far from leaving microorganisms behind on an evolutionary ladder, we more complex creatures are both surrounded by them and composed of them. Endosymbiosis allows pluricellularity: there are no multicellular organisms composed of prokaryotic cells. Symbiogenesis theory states that species arise from the merger of independent organisms through symbiosis [Margulis and Sagan 2002]. Evolution occurs in the context of biological communities in which continual interactions exclude the occurrence of isolated events. Individuals and species do not interact at random like gas molecules but co-evolve within a web with rich topological structure. The biotic web involves hierarchical lattice structures (e.g. food chains) and processes (e.g. evolution from simple to complex organisms) and group properties at multiple levels, such as the circulation of energy and matter within and among organisms. Thus lattice, group and topological properties are intrinsic to the evolving biosphere. Connectedness thus provides a multidimensional form of causation capable of generating biotic patterns of population dynamics, as illustrated by simple multi-agent simulation models. It departs from simple causality, which determines the outcome instead of creating complexity, as well as from stochasticity, which makes the unlikely assumption that events are independent from each other. The circulation of energy, information and matter are cyclic engines that can generate complexity, as illustrated by mathematical recursions. Species do not evolve in isolation while the biosphere remains stable as a homeostatic superorganism (Gaia); the biosphere evolves as a totality, creating multiple levels of complexity. Evolution creates multiple levels of complexity, which are connected in feedback processes.
We thus proposed to split the concept of primacy into the complementary categories of priority and supremacy, taking as a model the mammalian central nervous system, in which the lower and simpler bulbo-spinal levels have evolutionary and functional priority and the more recent and more complex brain cortex has functional supremacy [Sabelli and Carlson-Sabelli 1989]. Simple processes and structures have priority in time and in urgency: respiration precedes and is indispensable for thinking; simple processes also are more global and hence involve greater changes in energy. Complex processes and structures emerge later in evolution but acquire local supremacy because of their greater informational content. The physical and chemical processes and
products of abiotic evolution initiate and modulate biological evolution. The production of materials by simple organisms propels the creation of higher species. The complex biological processes generated by evolution drastically alter the physical composition of the planet, and thereby feed back upon biological evolution. The generation of materials such as oxygen by simpler organisms propels the creation of higher species by providing necessary materials (the material hypothesis of evolution [Sabelli 1989]), and by generating conditions that allow for respiratory processes that are incompatible with many preexisting lifeforms. This illustrates the priority of simple organisms and the supremacy of more developed ones as a cyclic engine of evolution [Sabelli 2005]. Photosynthesis, sexual reproduction, and brain function redirect natural selection. Hierarchical relations among species are more significant than intrinsic "fitness". Cows multiply and elephant populations dwindle because of their relation to the dominant mammalian species, and AIDS viruses burgeon because they have made us their host. Topological connectedness is a generic property of natural processes; positing independent random events and single genes or individuals as the units of evolution represents an unlikely speculation. Genes only function in context: the same set of genes is present in very different tissues, and different species share many genes, differing only in their expression.
5 Evolution is Creation

Unbiased consideration of evolution from prokaryotic cells to mammals indicates an overall increase in complexity. Pointing to extinction and involution (e.g. parasites), neo-Darwinian theorists assert that evolution is directionless. Gould [2002] argues that the fact that most forms of life increase in size and complexity is an "artifact" resulting from the circumstance that life began at the extreme of smallness and simplicity. However, a simple origin is not an artifact. How else could life have started? Since life could not have started large and complex, would that very fact not reflect the basic logic of the universe? Evolutionary progress may be explained as an obligatory sequence: creation must necessarily precede (and exceed) destruction, and simplicity must necessarily precede complexity. Constructive processes create higher levels of complexity. Complexity is the necessary result of the fact that creative processes are inherently self-conserving while destructive processes are limited to what has been constructed before. This asymmetry between construction and destruction is a necessary logical relation (tautology) which is creative. Thus evolution has a direction towards greater complexity, including the formation of minds that create meaning, and in this sense evolution is meaningful. Evolution is creation and creation is evolutionary, not a single event in the distant past. Creation is necessarily accompanied by destruction: evolution includes and requires the death of individuals and the extinction of species. Involution occurs. Progress is not unavoidable. Evolution is a creative process, not a deterministic one. It is thus crucial to consider responsibly our role as dominant species. Given the human implications of evolutionary ideas, it behooves us to evaluate our assumptions in the light of all their implications, and to be critical of ideology. Making "selfish genes" the units of evolution is unscientific.
Genes do not have selves because they do not have brains, and therefore cannot be selfish. What is absurd from the perspective of one science cannot be useful as a metaphor in another discipline, much less true in reality. Likewise, the portrait of evolution as a chronic, bloody competition among individuals and species is at variance with growing knowledge showing that forms multiply and grow more complex by incorporating others. Life does not take over the planet by combat, but by networking [De Duve 2002]. The notion of scarcity and struggle as the motor of change was born in political economics (Malthus) and has led to undesirable consequences.
Mathematics offers biology a richer and more solid set of assumptions. Lattice, group, and topology, the mother structures of mathematics (Bourbaki), describe generic properties of natural processes embodied at multiple levels of organization [Sabelli 2005]. We thus propose that evolution is the necessary consequence of causal action, bipolar feedback (synergy and antagonism), and the combination and conservation of material structures in topological webs with lattice and group organization. Causal action, bipolar opposition (positive and negative, synergy and antagonism), connectedness, and systematic creation of complexity oppose the concepts of random change, scarcity, struggle for survival, independent events, genes or individuals, and directionless evolution advanced by standard evolutionary theory. We propose the notion of organization out of order to replace Nietzsche's "order out of chaos".
Acknowledgments: We are thankful to the Society for the Advancement of Clinical Philosophy (SACP) for its support, and to Mrs. Maria McCormick for her gifts to SACP.
References

Bertness, M.D. & Callaway, R., 1994, Positive interactions in communities. Trends in Ecology and Evolution 9: 191-193.
De Duve, C., 2002, Life Evolving. Oxford University Press.
Dercole, F., Irisson, J.-O. & Rinaldi, S., 2003, Bifurcation analysis of a prey-predator coevolution model. SIAM J. of Applied Math. 63: 1378-1391.
Gilg, O., Hanski, I. & Sittler, B., 2003, Cyclic dynamics in a simple predator-prey community. Science 302: 866-868.
Gould, S. J., 2002, The Structure of Evolutionary Theory. Harvard University Press.
Gupta, R. S. & Golding, G. B., 1996, The origin of the eukaryotic cell. Trends Biochem. Science 21: 166-171.
Hudson, P. J. & Bjornstad, O. N., 2003, Vole stranglers and lemming cycles. Science 302: 797-798.
Inchausti, P. & Halley, J., 2001, Investigating long-term ecological variability using the Global Population Dynamics Database. Science 293: 655-657.
Jones, C.G., Lawton, J.H. & Shachak, M., 1997, Positive and negative effects of organisms as physical ecosystem engineers. Ecology 78: 1946-1957.
Kauffman, L. & Sabelli, H., 1998, The process equation. Cybernetics and Systems 29(4): 345-362.
Margulis, L. & Sagan, D., 2002, Acquiring Genomes. Basic Books.
Pimm, S., 1991, The Balance of Nature? Ecological Issues in the Conservation of Species and Communities. University of Chicago Press, Chicago.
Sabelli, H.C., 1989, Union of Opposites: A Comprehensive Theory of Natural and Human Processes. Lawrenceville, VA: Brunswick Publishing.
Sabelli, H., 2001, Novelty, a measure of creative organization in natural and mathematical time series. Nonlinear Dynamics, Psychology, and Life Sciences 5: 89-113.
Sabelli, H., 2005, Bios: A Study of Creation. World Scientific.
Sabelli, H. & Carlson-Sabelli, L., 1989, Biological priority and psychological supremacy, a new integrative paradigm derived from process theory. Amer. J. Psychiatry 146: 1541-1551.
Sabelli, H. & Kovacevic, L., 2006, Quantum bios and biotic complexity in the distribution of galaxies. Complexity 11: 14-25.
Sabelli, H., Carlson-Sabelli, L., Patel, M. & Sugerman, A., 1997, Dynamics and psychodynamics: process foundations of psychology. J. Mind and Behavior 18: 305-334.
Sabelli, H., Sugerman, A., Kovacevic, L., Kauffman, L., Carlson-Sabelli, L., Patel, M. & Konecki, J., 2005, Bios Data Analyzer. Nonlinear Dynamics, Psychology, and the Life Sciences 9: 505-538.
Syvanen, M. & Kado, C. I., 2002, Horizontal Gene Transfer. Academic Press.
Yoshida, T., Jones, L. E., Ellner, S. P., Fussman, G. F. & Hairston, N. G. Jr., 2003, Rapid evolution drives ecological dynamics in a predator-prey system. Nature 424: 303-306.
Part II Models
Chapter 1
The Growing Canvas of Biological Development: Multiscale Pattern Generation on an Expanding Lattice of Gene Regulatory Nets

Rene Doursat
Department of Computer Science and Engineering
University of Nevada, Reno
http://www.cse.unr.edu/~doursat
The spontaneous generation of an entire organism from a single cell is the epitome of a self-organizing, decentralized complex system. How do nonspatial gene interactions extend in 3-D space? In this work, I present a simple model that simulates some biological developmental principles using an expanding lattice of cells. Each cell contains a gene regulatory network (GRN), modeled as a feedforward hierarchy of switches that can settle in various on/off expression states. Local morphogen gradients provide positional information in input, which is integrated by each GRN to produce differential expression of identity genes in output. Similarly to striping in the Drosophila embryo, the lattice becomes segmented into spatial regions of homogeneous genetic expression that resemble stained-glass motifs. Meanwhile, it also expands by cell proliferation, creating new local gradients of positional information within former single-identity regions. Analogous to a "growing canvas" painting itself, the alternation of growth and patterning results in the creation of a form. This preliminary study attempts to reproduce pattern formation through a multiscale, recursive and modular process. It explores the elusive relationship between nonspatial GRN weights (genotype) and spatial patterns (phenotype). Abstracting from biology in the same spirit as neural networks or swarm optimization, I hope to be contributing to a novel engineering paradigm of system construction that could complement or replace omniscient architects with decentralized collectivities of agents.
1 Introduction

The spontaneous generation of an entire organism from a single cell is the epitome of a self-organizing, decentralized complex system. Through a precise spatiotemporal interplay of genetic switches and chemical signaling, a detailed architecture is created without explicit blueprint or external intervention. Recent dramatic advances in the genetics and evolution of biological development, or "evo-devo" [e.g., Carroll et al. 2001], have laid the foundations of a future discipline of generative development. The goal is to unify organisms beyond their seemingly endless diversity of form and describe them as variations around a common theme. The variations are the specifics of the genome; the theme is the generic elementary laws by which this genome controls its own expression, whether triggering cell division, differentiation, self-assembly or death, toward form generation. On this stage, evolution is the player. "How does the one-dimensional genetic code specify a three-dimensional animal?" [Edelman 1988] How does a static linear genome dynamically unfold in time (regulation dynamics) and space (cell assembly)? The missing genotype-phenotype link in biology's Modern Synthesis constitutes the main question of evo-devo. These issues also potentially open entirely new perspectives in engineering disciplines, including software, electrical, mechanical or even civil engineering. Could an architecture, robot or building construct itself? Would it be possible for a swarm of software agents or small components, each containing a "genome", to self-assemble? Although themselves emergent, our cognitive faculties are strongly biased toward identifying central causes and require great effort to comprehend massively parallel processes. We spontaneously tend to ascribe the generation of order to one or a few highly informed agents, following anthropomorphic stereotypes (designer, architect, manager, etc.).
Yet heteronomous human-designed order is the most sophisticated of all forms of organization. In living systems, autonomous decentralized order is the natural norm because it is the most cost-effective. Information is distributed over a large number of relatively ignorant agents, making it easier to create new states of order by evolving and recombining their local interactions. To adapt Ulam's famous quip, self-organized systems are the "nonelephant" species of systems science, yet they are the least familiar of them. Therefore, since natural systems are not engineered, what can human-made systems learn from them? [Braha et al. 2006, Chap. 1] In this preliminary study, I present a simple prototype simulating some aspects of biological development with dynamical systems coupled as cellular automata. Following the artistic metaphor of a growing canvas that paints itself [Coen 2000], I attempt to reproduce developmental pattern formation by a multiscale recursive process. Starting with a few broad positional and identity regions of cells, details are gradually added by cell proliferation and differentiation using only local information. A "shape" emerges from self-assembling agents that all carry the same program. Parts 2 and 3 lay out the main ideas of the model, which is further discussed in Part 4.
2 Stained-Glass Patterning

This part describes the principles of developmental patterning based on the spatial unfolding of a genetic program [Carroll et al. 2001], a process that could be nicknamed
"shape from switching", and gathers them into an abstract 2-D model. Section 2.1 briefly summarizes current knowledge about gene regulatory networks and positional integration. Section 2.2 proposes a model of "stained-glass" patterning based on an array of multilayered networks. This model is generalized in Part 3 to a multiscale system that incorporates the growth of the organism in a recursive and modular way.
2.1 Genetic switches integrate positional information

An important amount of genetic information critical for development is contained in the nonexpressed parts of DNA. Stretches of nucleotide sequences, called genetic regulatory sites or switches, control gene expression at multiple stages in the developing organism. Switches are generally located upstream of the genes they regulate and are bound by specific proteins, whose effect is to interfere with the enzymatic processes responsible for genetic transcription (Fig. 1a). These proteins selectively attach to segments of the DNA strand, as keys to locks, and can either repress or promote transcription. Switches often combine multiple gap-separated lock-segments that may reuse the same key-genes in different places and stages of development.
Fig. 1: Principles of spatial patterning from a GRN. (a-b) Two diagrams of regulatory interactions: proteins X and Y promote genes A and B by binding to upstream regulatory sites; then protein A promotes, while B represses, gene I. (c) Variation of expression levels in 1-D (at a given Y level): A and B respond to a gradient of X at two different thresholds, hence create boundaries at two different x coordinates; these in turn define the region of identity gene I, where A is high but B is low. (d) Same view in 2-D: I is the intersection of high A and low B.
As regulatory proteins are themselves synthesized by the expression of other genes, the developmental genetic toolkit can be globally described as a complex web of regulatory influences, or a gene regulatory network (GRN) (Fig. 1b). Although the structural and functional properties of GRNs are not fully understood, it seems that they are broadly organized into functional modules [Schlosser & Wagner 2004] that reflect the successive stages and anatomical modules of organismal development. Studies of stripe formation in the Drosophila embryo have identified a cascade of morphological refinements that start with global molecular gradients and progressively lead to the precise positioning of appendages [e.g., Coen 2000, Carroll et al. 2001]. Each period is characterized by the activation of a group of genes that respond to regulatory signals from the previous group, and trigger the next group (Fig. 1c). As an example, initial protein deposits of maternal origin located asymmetrically in the
egg start diffusing across the syncytium. Then, depending on their concentration at each point, these initial molecules regulate the expression of a first set of genes at different levels and positions on the embryo's axes (antero-posterior, dorso-ventral and proximo-distal). The regulatory sites for these genes are sensitive to the concentration of maternal proteins at various thresholds, creating staggered regions of expression in the form of "stripes". These stripes in turn intersect in various ways to give rise to the next generation of gene regions (Fig. 1d). In particular, they form identity regions (Drosophila's "imaginal discs"), where cells are characterized by a common signature of genetic activity, setting the basis for differentiated limb growth (leg, antenna, wing, etc.) and organogenesis. In summary, molecular gradients of morphogenetic factors (the "keys") provide positional information [Wolpert 1969] that is integrated in each cell nucleus by its genetic switches (the "locks") along several spatial dimensions. Developmental genes are regulated by differential levels of lock-key fitness and expressed in specific patterns or territories in the geography of the embryo. A territory represents a combination of multiple gene expression values, i.e., morphogenetic protein concentrations. In this study, I model the territories of embryo partitioning in a way similar to the colorful compartments created by the intersection of lead cames in stained-glass works.
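The lock-and-key logic of this paragraph can be sketched numerically. The toy script below reproduces the Fig. 1c scenario: two genes read one morphogen gradient at different thresholds, and their combination delimits a stripe of identity gene expression. The gradient shape, thresholds and sigmoid gain are my own illustrative choices, not values measured in the embryo:

```python
import math

def sig(u, lam=10.0):
    """Smooth threshold (sigmoid) response of a genetic switch."""
    return 1.0 / (1.0 + math.exp(-lam * u))

# Morphogen X decreases linearly along the axis (a "maternal" gradient).
N = 100
X = [1.0 - i / (N - 1) for i in range(N)]

# Two switch genes read the same gradient at different thresholds...
A = [sig(x - 0.3) for x in X]   # A turns on wherever X > 0.3
B = [sig(x - 0.7) for x in X]   # B turns on only where X > 0.7

# ...and an identity gene I is expressed where A is high but B is low,
# i.e. in the stripe 0.3 < X < 0.7 (cf. Fig. 1c).
I = [sig(a - b - 0.5) for a, b in zip(A, B)]

stripe = [i for i in range(N) if I[i] > 0.5]
print(min(stripe), max(stripe))  # indices bracketing the stripe
```

The stripe sits strictly inside the axis: identity expression is low at both ends of the gradient and high only in the intermediate band, which is the "staggered region" mechanism described above.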
Fig. 2: Numerical simulation of stained-glass region formation from a simple feedforward GRN. (a) Subset of 100x100 identical, coupled GRNs (each contains 4 "boundary" genes and 3 "identity" genes): bottom nodes X and Y are linked to their neighbor counterparts to create diffusion. (c) 2-D region maps created by each node on the lattice, under random weights: X and Y form horizontal and vertical gradients; B1...4 nodes form diagonal boundaries; I1...3 are expressed in various intersection segments of the B1...4 regions. (d) Same 2-D maps, superimposed (using a different color scheme); bottom: X and Y; middle: B1, B3 and B4; top: I1...3.
2.2 A feedforward GRN model of stained-glass segmentation

The core architecture of the virtual organism is a network of networks, construed as a 2-D lattice of identical GRNs (Fig. 2a). Each GRN represents a cell and is connected to neighbor cells via some of their GRN nodes [Mjolsness et al. 1991, Salazar-Ciudad et al. 2000, von Dassow et al. 2000]. In the present work, my first attempt at modeling stained-glass patterning uses a simplistic feedforward GRN template (Fig. 2b)
containing three layers: (1) a bottom layer with two positional nodes, X and Y; (2) a middle layer of boundary nodes {B_i}, i = 1...n; and (3) a top layer of identity nodes {I_k}, k = 1...m (Fig. 2b). Variables X, Y, B_i and I_k denote the gene expression levels or "activity" of the nodes. The boundary nodes compute linear discriminant functions of the positional nodes: B_i = σ(w_iX X + w_iY Y − θ_i), where w_iX, w_iY (i = 1...n) are the regulatory weights from X and Y to B_i, parameter θ_i is B_i's threshold, and σ(u) = 1/(1 + e^(−λu)) (Fig. 1c, middle). The effect of a boundary node is to segment the embryo into two half-planes of strong and weak expression levels, near 1 and 0 (Fig. 2c-d, middle). Then, identity gene levels are given by I_k = σ(Σ_i w'_ki B_i − θ'_k), where the w'_ki are weights from B_i to I_k. Therefore, territories of high identity gene expression are comprised of polygon-shaped regions at the intersection of many boundary lines (Fig. 2c-d, top). In summary, a lattice of coupled perceptron-like GRNs can generate a stained-glass pattern of identity gene regions in two phases: first, by establishing half-planes; second, by combining these half-planes into polygonoid regions. Naturally, boundaries need not be linear, nor GRNs strictly feedforward. Most importantly, there are intermediate genes (the layer of B nodes) providing the elementary boundaries that are further combined by downstream genes into nontrivial, morphology-specific regions. However, a major drawback of this simple three-layered "perceptron"-like architecture is that the number of "hidden units" (B nodes) scales up rapidly as regions become more fractured and scattered into small morphological details (Fig. 3).
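A minimal NumPy implementation of this three-layer model might look as follows. The random weight ranges and sigmoid gain are my own illustrative choices, and the cell-to-cell diffusion of X and Y is replaced by exact coordinate gradients, so this is a sketch of the patterning step rather than of the full coupled-lattice dynamics:

```python
import numpy as np

rng = np.random.default_rng(0)
n_bound, n_id, N = 4, 3, 100           # 4 boundary genes, 3 identity genes

def sigma(u, lam=8.0):
    """Sigmoid response sigma(u) = 1/(1 + exp(-lam*u))."""
    return 1.0 / (1.0 + np.exp(-lam * u))

# Positional nodes: two orthogonal morphogen gradients over the lattice.
Y, X = np.mgrid[0:N, 0:N] / (N - 1.0)

# Boundary layer: B_i = sigma(w_iX*X + w_iY*Y - theta_i), random weights.
W = rng.uniform(-1, 1, size=(n_bound, 2))
theta = rng.uniform(-0.5, 0.5, size=n_bound)
B = np.stack([sigma(W[i, 0] * X + W[i, 1] * Y - theta[i])
              for i in range(n_bound)])

# Identity layer: I_k = sigma(sum_i w'_ki * B_i - theta'_k).
Wp = rng.uniform(-1, 1, size=(n_id, n_bound))
thetap = rng.uniform(0, 1, size=n_id)
Id = sigma(np.einsum('ki,ixy->kxy', Wp, B) - thetap[:, None, None])

# Each cell's identity = index of its most expressed identity gene;
# contiguous polygonal patches form the "stained-glass" pattern.
regions = Id.argmax(axis=0)
print(np.unique(regions))
```

Plotting `regions` (e.g. with `matplotlib.pyplot.imshow`) would display the polygonoid territories of Fig. 2d: half-planes from the B layer, intersected into identity regions by the I layer.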
3 Multiscale Segmentation: The Growing Canvas

An organism's shape is not fully generated in every detail in just two phases but rather grows in incremental stages. Biological observations indicate that transcription regulation "cascades" down from site to site in a broadly directed fashion, so that a GRN structure consists of many more than three layers. In general, early developmental genes seem to pave the way for later sets of genes and are generally not reused. A frequent exception to this principle are multivalent genes reappearing in different organs and at different times during development. Yet, their capacity for repeated expression relies on independent combinations of switches, which could be represented by duplicate nodes in different layers of a feedforward GRN architecture. Therefore, to account for progressive growth and morphological refinement, a more adequate GRN model consists of a hierarchy of subnetworks, where each subnetwork performs a function similar to the simple feedforward network presented above. Instead of relying on a single group of boundary nodes to cover fine-grain details (Fig. 3a), the image can be iteratively refined by the action of several mapping modules and submodules (Fig. 3b). First, the network at the bottom of the hierarchy only establishes broad identity regions; then these identity regions in turn trigger specialized subnetworks that further create local partitioning at a finer spatial scale, etc. Morphological details are added in a hierarchical fashion, analogous to inclusions of small stained-glass motifs into bigger ones (Fig. 4a). In parallel to this hierarchical refinement, the medium expands, i.e., cells multiply and expression regions enlarge. Thus, morphological refinement basically proceeds by alternation of two fundamental steps: (1) subsegmentation of a uniform identity
region into finer identity regions, by creation of new local gradients of positional information; (2) enlargement of the new identity regions by cell proliferation, in which daughter cells inherit mother cells' node values (i.e., the internal gene expression state corresponding to RNA, protein and metabolic concentrations). Expanding regions are mapped by new local coordinate systems that activate the entry points of a regional subnetwork of the GRN in the next tier of the hierarchy. These new local gradients emerge from a process similar to the original diffusion of global coordinates X, Y, for example by asymmetrical release of new proteins from the borders of an expanding region. Naturally, steps (1) and (2) are not strictly separate; they can overlap and unfold at different rates in different body parts without global synchrony.
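The alternation of steps (1) and (2) can be caricatured in a few lines of code. In the sketch below, proliferation duplicates each cell's inherited state and subsegmentation splits one uniform region using a new local coordinate; the hard midpoint threshold stands in for what would really be a regional GRN subnetwork reading a local gradient, so everything here is illustrative:

```python
import numpy as np

def proliferate(canvas, factor=2):
    """Cell division: each cell splits into factor x factor daughters
    that inherit the mother's gene-expression state (here, its label)."""
    return np.kron(canvas, np.ones((factor, factor), dtype=canvas.dtype))

def subsegment(canvas, label, next_label):
    """Re-pattern one uniform identity region: a new local gradient along
    x, thresholded at the region's midpoint, splits it in two."""
    ys, xs = np.nonzero(canvas == label)
    mid = (xs.min() + xs.max()) / 2.0          # local positional information
    out = canvas.copy()
    cols = np.arange(canvas.shape[1])[None, :]
    out[(canvas == label) & (cols > mid)] = next_label
    return out

# Alternate growth and patterning, as in steps (1)-(2) above.
canvas = np.zeros((4, 4), dtype=int)      # one broad identity region
canvas = proliferate(canvas)              # 8x8: the region enlarges
canvas = subsegment(canvas, 0, 1)         # split into identities 0 and 1
canvas = proliferate(canvas)              # 16x16
canvas = subsegment(canvas, 1, 2)         # refine region 1 further
print(np.unique(canvas))                  # -> [0 1 2]
```

Each pass adds detail only inside an existing region, which is the "growing canvas" idea: former single-identity territories acquire internal structure as the medium expands.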
Fig. 3: Illustration of recursive morphological refinement based on a hierarchical GRN. (a) The B layer rapidly increases with the amount of morphological detail. (b) A more economical and realistic approach relies on a hierarchy of sub-GRNs: the identity nodes of the lower (earlier) module define broad regions that trigger the higher (later) modules via new local gradients.
4 Discussion and Future Work

In this article, I have shown the possibility of multiscale pattern formation based on an expanding lattice of hierarchical, feedforward gene regulatory networks. The alternation of growth and patterning ultimately results in the creation of a "form", represented by an image or shape on the lattice. The hierarchical GRN structure adequately supports both the temporal sequence of developmental stages and the spatial accumulation of details. Most importantly, compared to a single layer of discriminant functions (Fig. 3a), a hierarchy allows the reuse of modules and thus can greatly reduce the number of intermediate gene nodes needed to generate a full mapping. In the fictitious example of Fig. 3, the number of B nodes is actually not smaller in the hierarchical network than in the flat network because the pattern does not contain repeated or "homologous" parts. In biological development, however, a key property is modularity: organisms are made of repeated segments, most apparent in arthropods or the vertebrates' column and digits. Genetic sequencing has revealed many identical or highly overlapping stretches of DNA program, not only within individuals but also across species, which indicates that during evolution segments might have duplicated then diverged [Carroll et al. 2001]. In the present model, this would correspond to reusing the same subnetwork multiple times within one hierarchical tier (as in Fig. 4b, second tier from bottom), then mutating these copies to create variants. In this study, I attempted to clarify the elusive relationship between genotype and phenotype, specifically how a nonspatial genetic network can unfold in space to become a 2-D or 3-D shape. The problem of the quantity of information or "genetic cost" is of central importance here: What is the minimal number of gene nodes
needed to cover a given amount of morphological details? Organisms obviously contain far fewer genes than cells or even local regions of homogeneous cell identity. This means that a GRN is actually extremely sparse and development cannot be entirely specified by positional information and switch-based logic. Modularity and component reuse are also not sufficient to explain the relative paucity of genetic instructions. Therefore, other "epigenetic" factors must contribute to the generation of morphological details: cell-to-cell signaling interactions (differential adhesion, Turing patterns), shape-sculpting programmed cell death (e.g., between digits), differential growth rates and topological transformations (gastrulation, mechanistic folding, etc.). A more comprehensive model of self-organized development should contain all of the above mechanisms (Fig. 5).
Fig. 4: The growing canvas of morphogenesis: pattern formation proceeds by successive refinements (series of images below the line) on an expanding medium (same series redrawn above the line at different sizes). (a) Simulation using a 2-tier hierarchy of 3-layer GRNs, as in Fig. 3b; each GRN contains 2 horizontal and 3 vertical boundary nodes creating 12 rectangular identity regions; two of these regions are similarly subdivided. (b-e) Same idea illustrated by a portrait at multiple scales of resolution [after Coen 2000] and a generic network diagram.
In sum, the complete morphogenetic program (the "theme") consists of a set of developmental laws at the cellular level, while the parameters of this program (the "variations") are represented by specific GRNs. Fed into the same program, different GRNs will give rise to different shapes. At this point, evolution enters the stage (the "player") and two complementary issues arise: given specific GRN weights, what pattern will the growing canvas create? Conversely, given a desired pattern as a target, what values should the GRN weights take to produce this pattern? Beyond the biological challenge of unraveling real-world molecular pathways, these questions also raise a technological challenge: dynamic self-assembly and autonomous design in the absence of a global symbolic blueprint, e.g., as in swarm robotics or distributed software agents. Previous artificial models of development have mainly followed a bottom-up approach by observing, classifying and selecting patterns emerging from given GRNs, whether randomly wired or biologically detailed [e.g., Salazar-Ciudad et al. 2000]. My intention is to explore the top-down, "reverse-engineering" approach and suggest through this preliminary study that potentially any given spatially-explicit blueprint could be encoded in the weights of a nonspatial GRN. Methods to compute the weights could be investigated and adapted, whether by reverse compilation of global effects into local rules [as in Nagpal 2002], fitness-based evolutionary algorithms, or by a combination of both.
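As one concrete instance of the fitness-based route just mentioned, a (1+1) evolution strategy can search for GRN weights that reproduce a target map. The sketch below is a deliberately tiny stand-in of my own devising (2 boundary nodes, 1 identity gene, a quadrant-shaped target), not the author's method:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 32
Y, X = np.mgrid[0:N, 0:N] / (N - 1.0)

def pattern(w):
    """Decode a flat 9-weight vector into a 2-boundary / 1-identity GRN
    and return its binary expression map over the lattice."""
    b1 = 1 / (1 + np.exp(-8 * (w[0] * X + w[1] * Y - w[2])))
    b2 = 1 / (1 + np.exp(-8 * (w[3] * X + w[4] * Y - w[5])))
    return (1 / (1 + np.exp(-8 * (w[6] * b1 + w[7] * b2 - w[8]))) > 0.5)

target = (X > 0.5) & (Y < 0.5)            # desired phenotype: a quadrant

# (1+1) evolution strategy: mutate the genotype (weights) and keep any
# mutant whose phenotype matches the target at least as well.
w = rng.uniform(-1, 1, 9)
best = np.mean(pattern(w) == target)
start = best
for _ in range(2000):
    m = w + rng.normal(0, 0.2, 9)
    fit = np.mean(pattern(m) == target)
    if fit >= best:
        w, best = m, fit
print(best)   # fraction of cells with the correct identity
```

The quadrant target is exactly representable here (two half-planes combined by an AND-like identity gene), so the search illustrates how a spatially-explicit blueprint can be pushed into nonspatial weights; richer targets would call for the hierarchical architecture of Part 3.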
Fig. 5: The big picture of morphogenesis. A comprehensive model of form development would integrate several elementary genetic and epigenetic mechanisms. (1) Guided patterning: GRN-controlled establishment of expression regions (this article). (2) Differential growth: deformations (bulges, offshoots, limbs) created by region-specific proliferation rates. (3) Free patterning: texture (stripes, spots) emerging from Turing instabilities. (4) Elastic folding: transformations produced by mechanical cellular forces. (5) Cell death: detail-sculpting by region removal.
In the same spirit as artificial neural networks or ant colony optimization, my goal is less a faithful reproduction of biological mechanisms than their abstraction and potential application to computational and technological problems. Drawing from biological development, I hope to contribute to a novel engineering paradigm of system construction (virtual or physical) [Braha et al. 2006] in which the emergence of complex structures does not rest exclusively on one omniscient architect, but partly or fully on a decentralized collectivity of simple agents, each endowed with a low-cost, incomplete network of instructions.
Bibliography
[1] Carroll, S. B., Grenier, J. K., & Weatherbee, S. D., 2001, From DNA to Diversity, Blackwell Scientific (Malden, MA).
[2] Coen, E., 2000, The Art of Genes, Oxford University Press.
[3] Edelman, G. M., 1988, Topobiology, Basic Books.
[4] Braha, D., Bar-Yam, Y., & Minai, A. A. (eds.), 2006, Complex Engineered Systems: Science Meets Technology, Springer Verlag.
[5] Mjolsness, E., Sharp, D. H., & Reinitz, J., 1991, A connectionist model of development, Journal of Theoretical Biology, 152: 429-453.
[6] Nagpal, R., 2002, Programmable self-assembly using biologically-inspired multi-agent control, 1st International Conference on Autonomous Agents, July 15-19, Bologna, Italy.
[7] Salazar-Ciudad, I., Garcia-Fernandez, J., & Sole, R., 2000, Gene networks capable of pattern formation, Journal of Theoretical Biology, 205: 587-603.
[8] Schlosser, G., & Wagner, G. P. (eds.), 2004, Modularity in Development and Evolution, The University of Chicago Press.
[9] von Dassow, G., Meir, E., Munro, E. M., & Odell, G. M., 2000, The segment polarity network is a robust developmental module, Nature, 406: 188-192.
[10] Wolpert, L., 1969, Positional information and the spatial pattern of cellular differentiation, Journal of Theoretical Biology, 25: 1-47.
Chapter 2
Compound clustering and consensus scopes of metabolic networks

Franziska Matthäus
Interdisciplinary Center for Scientific Computing
University of Heidelberg, Germany
[email protected]
Carlos Salazar
Department of Biology
Humboldt University
Berlin, Germany
[email protected]
Oliver Ebenhöh
Department of Biology
Humboldt University
Berlin, Germany
[email protected]
We investigate the structure of metabolic networks by identifying sets of metabolites having a similar synthesizing capacity. We measure the synthesizing capacity of a compound by determining all metabolites that can be produced from it, and call this set the scope of the compound. We then define a distance measure based on the Jaccard coefficient and apply a hierarchical clustering method. Compounds within the same cluster are chemically similar and often appear in the same metabolic pathway. For each cluster we define a consensus scope by determining a set of metabolites that is most similar to all scopes within the cluster. We find that only a few of the resulting consensus scopes are mutually disjoint while others overlap, and some consensus scopes are fully contained in others. Thus, our approach reveals a number of functional subunits of the metabolic network which are arranged in a hierarchical setting.
1.1 Introduction
Cellular metabolism is mediated by highly efficient and specialized enzymes which catalyze chemical transformations of substrates into products. Since in general the products of a particular reaction may serve as substrates for other reactions, the entirety of the biochemical reactions forms a complex and highly connected metabolic network. With the sequencing of whole genomes of an ever increasing number of organisms and the emergence of biochemical databases such as KEGG [6] or Brenda [8], which are based on genomic information, large-scale metabolic networks have become accessible.

Several approaches to analyse the structure of large-scale metabolic networks have emerged in recent years. Graph theoretical approaches have revealed characteristic global features [1]. It was shown that metabolic networks exhibit a small-world character [9], possess a scale-free topology [5] and display a hierarchical organization [7]. The representation of a metabolic network as a graph has, however, the disadvantage that information is lost during the simplification process, which makes it impossible to reconstruct the original metabolic network from the graph.

We have recently developed a novel strategy for the analysis of large-scale metabolic networks. The so-called method of network expansion [2, 3], which in a natural way links structural and functional properties of metabolic networks, is based on the basic biochemical fact that only those reactions may take place which use the available substrates, and that the products of these reactions may in turn be utilized by other reactions. With a number of given substrates (the seed), a series of metabolic networks is constructed, where in each step the network is expanded by those reactions which utilize only the seed and those metabolites which are products of reactions incorporated in previous steps. The set of metabolites within the final network is called the scope of the seed.
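The expansion loop just described can be sketched in a few lines. This is a hedged reconstruction from the text, not the authors' code, and the reaction and metabolite names in the toy network are invented for illustration:

```python
def scope(seed, reactions):
    """Network expansion: repeatedly add every reaction whose substrates are
    all already available, collecting its products, until nothing changes.
    `reactions` is a list of (substrates, products) pairs of metabolite names."""
    available = set(seed)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)  # reaction fires; products become usable
                changed = True
    return available

# Toy network: A -> B, then B + W -> C (W standing in for water)
rxns = [({"A"}, {"B"}), ({"B", "W"}, {"C"})]
print(scope({"A", "W"}, rxns))  # the scope of the seed {A, W}: all four metabolites
```

Note that the second reaction only fires after the first has produced B, which is exactly the stepwise construction of the series of networks described above.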
By construction, the scope describes the synthesizing capacity of a metabolic network when only the seed compounds are available as external resources. In the present work, we aim at elucidating the global organization of functional aspects of metabolism by comparing the synthesizing capacities of the different biochemical compounds. We present several ways to cluster metabolites with respect to their scopes. Based on our observation that many compounds exhibit very similar synthesizing capacities, we introduce the notion of consensus scopes, which characterize the synthesizing capacities of large groups of metabolites. Comparison of the determined clusters as well as the consensus scopes reveals interesting hierarchical structures which shed new light on the structural organization of large-scale metabolic networks.
1.2 Clustering by principal components analysis
In this section, we characterize all biochemical compounds by their synthesizing capacity. By the synthesizing capacity of a particular metabolite we understand the set of all metabolites which can in principle be synthesized by all available enzymatic reactions when exclusively the metabolite itself, water and oxygen are available as substrates. This quantity is determined using the network expansion algorithm as defined in [3] and, following their terminology, will be called the scope of the compound. For our calculations we retrieve a biochemical interaction network from the KEGG database which contains 4811 reactions from over 200 organisms connecting n = 4104 metabolites. For all these metabolites we calculate the scopes, resulting in the n-dimensional binary vectors S_1, ..., S_n. The entry of a vector equals 1 if the respective metabolite is in the scope and 0 otherwise. In a first step we apply a common method of dimensionality reduction (principal components analysis, short PCA) [4] and visualize the data by plotting the first versus the second principal component. The result is shown in Figure 1.1, where one can immediately distinguish six well separated clusters. Almost 84% of the scopes are contained in cluster 1, which is made up of small scopes with a size below 70. Only about 16% of the scopes form the remaining 5 clusters. The identification of the compounds of these clusters will follow in Section 1.3.
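The projection step can be sketched generically as an SVD-based PCA on the binary scope matrix. The tiny example data are invented, and this is a generic sketch rather than the authors' exact pipeline:

```python
import numpy as np

def pca_project(S, k=2):
    """Project the rows of S (binary scope vectors) onto the first k principal
    components, via SVD of the mean-centered data matrix."""
    X = S - S.mean(axis=0)                   # center each metabolite column
    U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
    explained = sigma**2 / np.sum(sigma**2)  # variance fraction per component
    return X @ Vt[:k].T, explained[:k]

# Four toy scopes over five metabolites: two roughly similar pairs
S = np.array([[1, 1, 0, 0, 0],
              [1, 1, 1, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 1, 1, 1]], dtype=float)
coords, frac = pca_project(S)  # coords: one 2D point per scope, as in Fig. 1.1
```

Plotting the two columns of `coords` against each other reproduces the kind of scatter shown in Figure 1.1, with `frac` giving the explained-variance fractions quoted in the caption.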
Figure 1.1: Result of PCA. Projection of the data onto the two-dimensional space spanned by the first and second principal component. The data points can be grouped into six well distinguishable clusters. The amount of variation contained in the principal components is 63.3% for the first and 20.8% for the second principal component.
It is in principle possible to further divide the clusters into subclusters by reapplying PCA on single clusters. For cluster 6, for example, this results in two subclusters, a larger one containing metabolites with a scope similar to that of ATP, and a smaller one containing four compounds which all possess the same scope of size 2183. As is known from previous analyses [3], this is the largest existing scope of a single compound, the scope of adenosine phosphosulfate (APS). The problem, however, is that the clustering based on PCA assigns all of the small scopes to the same cluster. The reason is that PCA is based on a Euclidean distance, and therefore two scopes with many zero entries are necessarily similar, even if they do not have a single metabolite in common. In order to achieve a better resolution in the clustering of small scopes we will now derive a distance measure which compares scopes only according to the similarity in their sets of metabolites, and then apply a hierarchical clustering algorithm.
1.3 Hierarchical clustering
A distance measure which better captures the dissimilarities between small scopes is based on the Jaccard coefficient. For two scopes S_i and S_j, the Jaccard coefficient is given as the ratio between the number of entries that are equal to one in both scopes, |S_i ∩ S_j|, and the number of entries that equal one in at least one of the scopes, |S_i ∪ S_j|. With this we define the distance between two scopes as:

    d_S(S_i, S_j) = 1 - |S_i ∩ S_j| / |S_i ∪ S_j|.    (1.1)

The distance d_S is zero if the two scopes are identical, and one if they have no single metabolite in common. We choose a nearest neighbor group-average clustering algorithm [4], a bottom-up clustering method where iteratively the elements or clusters with the smallest distance are joined. Group-averaging refers to the method of defining the distance between two clusters based on the distances between the cluster elements. The result obtained in this procedure is a clustering of the data on various scales. During the first iterations only very similar elements obtain the same cluster label and the clustering is very fine. Towards the end, elements or clusters with large distances are joined, resulting in a coarse clustering with a smaller number of clusters. For our analysis we choose a scale of clustering where the elements within a cluster have a distance of at most 0.2. This value is chosen because in the range of distance values between 0.1 and 0.2 the number of larger clusters is practically constant, which indicates a certain robustness of the clustering at this scale. Furthermore, the distance value of 0.2 guarantees that elements within the same cluster are indeed very similar. At the chosen scale we find 12 clusters that contain at least 10 elements and label them cluster I to cluster XII. A summary of the clusters is given in Table 1.1.
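The distance of Eq. (1.1) and the group-average merging loop can be sketched as follows. This is a hedged, pure-Python illustration with invented toy scopes (represented as sets); the authors' implementation and data are of course far larger:

```python
def jaccard_distance(a, b):
    """Eq. (1.1): d = 1 - |intersection| / |union| for scopes given as sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def average_link_clusters(scopes, cutoff=0.2):
    """Bottom-up group-average clustering: repeatedly merge the two clusters
    with the smallest average inter-scope distance, until that distance
    exceeds `cutoff` (0.2, the scale chosen in the text)."""
    clusters = [[s] for s in scopes]

    def avg_dist(c1, c2):
        return sum(jaccard_distance(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), avg_dist(clusters[i], clusters[j]))
             for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: pair[1])
        if d > cutoff:
            break
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters

toy = [{"A", "B"}, {"A", "B"}, {"C", "D", "E", "F"}, {"C", "D", "E", "F", "G"}]
merged = average_link_clusters(toy)
print(len(merged))  # 2: the identical pair and the pair at distance 0.2
```

The two nearly identical scopes merge (distance 0.2, within the cutoff), while the cross-group distance of 1 keeps the two groups apart, mirroring the fine-to-coarse behavior described above.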
The compounds within a cluster are often chemically similar or appear in the same metabolic pathway; for instance, cluster I contains mainly amino acids, cluster V mono- and polysaccharides, and the elements of cluster XI all appear in the pathway of indole and ipecac alkaloid biosynthesis. To relate the results from the two methods (PCA and hierarchical clustering), we color each data point in Figure 1.1 according to which of the clusters I-XII it belongs. The result is shown in Figure 1.2. It can be seen that some of the clusters determined with the two different methods coincide. Other clusters from PCA are composed of several subclusters. In particular, the large cluster of small scopes that was obtained by the PCA approach is now split into a number of subclusters, which demonstrates the success of the hierarchical clustering with distance measure (1.1) in obtaining a better clustering resolution for small scopes.
Figure 1.2: Clusters obtained from hierarchical clustering. The clusters are plotted in the same two-dimensional space shown in Figure 1.1.
1.4 Consensus scopes
The observation that compounds form clearly distinguishable groups which are characterized by a similar synthesizing capacity suggests that there actually exist only a very small number of really distinct scopes. Even though scopes of different compounds are rarely completely identical, every scope is at least similar to one of a small set of typical scopes. These thoughts lead to the following generalization of the notion of the scope of a compound:

Definition of the consensus scope of a cluster. For a cluster of compounds with similar synthesizing capacity, we define the consensus scope by constructing a scope vector c, in which a component is set to one if the corresponding metabolite appears in the majority of the scopes within the cluster and zero otherwise.

The sizes of the consensus scopes are also listed in Table 1.1. The consensus scope can be larger than, smaller than or equal to the cluster size. If the consensus scope is larger than or equal to the cluster size, then all (or most) of the metabolites from the cluster also appear in the consensus scope. In the original notion of the scope of a compound, the compound is always included in its own scope. When talking about consensus scopes of a cluster, the situation can be different. In the case when the consensus scope is smaller than the cluster size, some of the compounds of the cluster are certainly not contained in the consensus scope. The consensus scopes for each cluster have a different size, but they are not necessarily mutually disjoint. To test for consensus scope overlap, we compute for every pair of consensus scopes i and j the number of metabolites they have in common and the number of metabolites that are different, and compare these numbers to the consensus scope size. From the mutual overlap of a pair of consensus scopes we obtain a scheme showing the consensus scopes in a hierarchical setting (see Figure 1.3).
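The majority-vote definition translates directly into code. This hedged sketch uses invented metabolite names and represents scopes as sets rather than binary vectors:

```python
def consensus_scope(scopes):
    """A metabolite belongs to the consensus scope iff it appears in more
    than half of the scopes within the cluster (majority vote)."""
    counts = {}
    for s in scopes:
        for m in s:
            counts[m] = counts.get(m, 0) + 1
    return {m for m, c in counts.items() if c > len(scopes) / 2}

cluster = [{"A", "B"}, {"A", "B"}, {"A", "C"}]
print(consensus_scope(cluster))  # A and B qualify; C appears in only 1 of 3 scopes
```

Note how C is dropped even though it occurs in the cluster, illustrating how a consensus scope can be smaller than the union of the cluster's scopes.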
Figure 1.3: Consensus scope overlap for the 12 clusters obtained with the hierarchical clustering method.

The largest consensus scope is reached by metabolites from cluster III, which contains organic compounds consisting of heterocyclic bases, sugars and phosphate groups, for example nucleotides, deoxynucleotides (except those with thymine as base), nucleotide sugars, coenzymes except coenzyme A, and second messengers such as cAMP. Cluster VI consists predominantly of those deoxynucleotides and deoxynucleotide sugars with thymine as their base. Since its consensus scope is a subset of the consensus scope of cluster III, their synthesizing capacity is smaller than that of the other deoxynucleotides. Yet less can be produced from members of cluster IV, which consists mainly of sugar phosphates. This can be explained by the fact that sugar phosphates are chemical groups of metabolites in clusters III and VI; from sugar phosphates alone, however, nucleotides, for example, cannot be produced. Sugars form cluster V. Since the phosphate group is not available, their synthesizing capacity is even smaller, and consequently their consensus scope is completely contained in the consensus scope of cluster IV. Most other inclusion relations can also be explained by the presence or absence of characteristic chemical groups. Interestingly, there are two clusters (VII and XI) whose consensus scopes do not overlap with other consensus scopes. Metabolites within cluster VII are all derived from 20-carbon polyunsaturated essential fatty acids, known as eicosanoids. Our results indicate that only a very special group of chemicals can be produced from them and, conversely, that those chemicals can exclusively be produced from eicosanoids. Cluster XI represents a group of nitrogen heterocyclic compounds with the common feature that all contain an indole group. These compounds are involved in indole and ipecac alkaloid biosynthesis.
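The containment and disjointness tests behind this hierarchy can be sketched as follows, using toy consensus scopes with invented single-letter metabolites that mirror the V ⊂ IV ⊂ III chain and the isolated cluster VII described above:

```python
def scope_hierarchy(consensus):
    """Given a dict mapping cluster label -> consensus scope (a set),
    return the strict containments (i inside j) and the disjoint pairs."""
    labels = sorted(consensus)
    contained = [(i, j) for i in labels for j in labels
                 if i != j and consensus[i] < consensus[j]]   # strict subset
    disjoint = [(labels[a], labels[b])
                for a in range(len(labels)) for b in range(a + 1, len(labels))
                if not consensus[labels[a]] & consensus[labels[b]]]
    return contained, disjoint

# Toy data: n = nucleobase, s = sugar, p = phosphate, e = eicosanoid-derived
cs = {"III": {"n", "s", "p"}, "IV": {"s", "p"}, "V": {"s"}, "VII": {"e"}}
contained, disjoint = scope_hierarchy(cs)
```

On this toy input, `contained` records that V lies inside IV and IV inside III, while `disjoint` pairs VII with every other cluster, which is exactly the kind of scheme drawn in Figure 1.3.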
Table 1.1: Clusters of biochemical compounds determined by a hierarchical clustering algorithm. For each cluster, we list structural categories to which the majority of the cluster members belong, the cluster size and the consensus scope size.

label  cluster elements (representative)                               cluster size  consensus scope size
I      organic compounds containing nitrogen                           261           423
II     organic compounds not containing nitrogen                       183           148
III    compounds with heterocyclic bases, sugars and phosphate groups  102           1549
IV     sugar phosphates                                                57            109
V      sugars                                                          41            31
VI     deoxynucleotides and their sugars with thymine as base          34            283
VII    eicosanoids                                                     23            23
VIII   dicarboxylic acids, keto acids and hydroxy acids                22            12
IX     coenzyme A compounds                                            19            203
X      activated forms of terpenes and terpenoids                      13            49
XI     nitrogen heterocyclic compounds with an indole group            12            11
XII    aromatic organic compounds with a benzene ring                  10            9
1.5 Summary
By grouping metabolites with respect to their synthesizing capacity, the huge variability of biochemical compounds involved in metabolism can be represented in a relatively concise form. Apparently, there exist only a small number of typical sets of metabolites (the consensus scopes) which can be produced from one single precursor. These sets display a hierarchy which in some cases can be explained by the chemical groups contained in the precursors. In other cases, the underlying chemical reasons for the hierarchical structuring are not so apparent. The hierarchy is a characteristic of the metabolic network comprising all biochemical reactions. The catalyzing enzymes are a product of a long evolutionary process which was governed by selection and mutation principles. In total, they catalyze only a small fraction of all theoretically possible chemical transformations. The analysis of the hierarchical structuring of metabolism may put forth valuable hints on the underlying principles which resulted in the selection of the particular set of enzymatic reactions found in contemporary organisms. To further elucidate this problem we plan to expand our analysis to organism-specific metabolic networks. The identification of organism-specific hierarchies and the comparison among related organisms may further help to understand the principles and selection pressures which guided the evolution of metabolism.
Supplementary Online Material
The metabolic network which was retrieved from KEGG and subsequently curated is available as a list of KEGG reaction IDs. A full list of the clusters determined with PCA and the hierarchical clustering method is provided, along with a list of all corresponding consensus scopes. Supplementary online information is available for download.
Bibliography
[1] BARABASI, A. L., and Z. N. OLTVAI, "Network biology: Understanding the cell's functional organization", Nat Rev Genet 5, 2 (2004), 101-113.
[2] EBENHÖH, O., T. HANDORF, and R. HEINRICH, "Structural analysis of expanding metabolic networks", Genome Informatics 15, 1 (2004), 35-45.
[3] HANDORF, T., O. EBENHÖH, and R. HEINRICH, "Expanding metabolic networks: Scopes of compounds, robustness and evolution", J. Mol. Evol. 61 (2005), 498-512.
[4] HASTIE, T., R. TIBSHIRANI, and J. H. FRIEDMAN, The Elements of Statistical Learning, Springer (2001).
[5] JEONG, H., B. TOMBOR, R. ALBERT, Z. N. OLTVAI, and A.-L. BARABASI, "The large-scale organization of metabolic networks", Nature 407 (2000), 651-654.
[6] KANEHISA, M., S. GOTO, M. HATTORI, K. F. AOKI-KINOSHITA, M. ITOH, S. KAWASHIMA, T. KATAYAMA, M. ARAKI, and M. HIRAKAWA, "From genomics to chemical genomics: new developments in KEGG", Nucleic Acids Res. 34 (2006), D354-357.
[7] RAVASZ, E., A. L. SOMERA, D. A. MONGRU, Z. N. OLTVAI, and A.-L. BARABASI, "Hierarchical organization of modularity in metabolic networks", Science 297 (2002), 1551-1555.
[8] SCHOMBURG, I., A. CHANG, and D. SCHOMBURG, "Brenda, enzyme data and metabolic information", Nucleic Acids Research 30, 1 (2002), 47-49.
[9] WAGNER, A., and D. A. FELL, "The small world inside large metabolic networks", Proc. R. Soc. Lond. B 268 (2001), 1803-1810.
Chapter 3
Endocannabinoids: Multi-scaled, Global Homeostatic Regulators of Cells and Society

Robert Melamede
University of Colorado, Colorado Springs, CO
rmelamed@uccs.edu
Living systems are far-from-equilibrium open systems that exhibit many scales of emergent behavior. They may be abstractly viewed as a complex weave of dissipative structures that maintain organization by passing electrons from reduced hydrocarbons to oxygen. Free radicals are unavoidable byproducts of biological electron flow. Due to their highly reactive chemical properties, free radicals modify all classes of biological molecules (carbohydrates, lipids, nucleic acids, and proteins). As a result, free radicals are destructive. The generally disruptive nature of free radicals makes them the "friction of life." As such, they are believed to be the etiological agents behind age-related illnesses such as cardiovascular, immunological, and neurological diseases, cancer, and ageing itself. Free radicals also play a critical constructive role in living systems. From a thermodynamic perspective, life can only exist if a living system takes in sufficient negative entropy from its environment to overcome the obligatory increase in entropy that would result if the system could not appropriately exchange mass, energy and information with its environment. Free radicals are generated in response to perturbations in the relationship between a living system and its environment. However, evolution has selected for biological response systems to free radicals, so that the cellular biochemistry can adapt to environmental perturbations by modifying cellular gene expression and biochemistry. Endocannabinoids are marijuana-like compounds that have their origins hundreds of millions of years in the evolutionary past. They serve as fundamental modulators of energy homeostasis in all vertebrates. Their widespread biological activities may often be attributed to their ability to minimize the negative consequences of free radicals. In fact, since cannabinoids (endo and exo) possess many anti-aging properties, they may be viewed as the "oil of life". The biological effects of cannabinoids transcend many scales of organization. Cannabinoids regulate sub-cellular biochemistry, intercellular communication, and all body systems (cardiovascular, digestive, endocrine, immunological, nervous, musculoskeletal, reproductive, respiratory, and integumentary). It is proposed that their emergent properties extend to social, political, and economic phenomena.
1 Introduction
The intent of this paper is to integrate a far-from-equilibrium perspective of biology, from which emergent behavior is intrinsic, with the explosion of scientific investigations into the endocannabinoid system. Endocannabinoids are marijuana-like compounds [Devane et al., 1992] produced by all deuterostomes [McPartland et al., 2006]. They are believed to have their evolutionary origins 600 million years in the past. Over the past decade and a half, since the identification of cannabinoid receptors [Herkenham et al., 1990], research into the cannabinoid system has grown exponentially. Major international pharmaceutical companies are engaged in cannabinoid research, and products to turn the system on or off are in the pipeline [2003; Russo, 2004; Piomelli et al., 2006; Tomida et al., 2006; Russo et al., 2007; Narang et al., 2008].
2 Hypothesis
The endocannabinoid system is a global homeostatic regulator [Melamede, 2005]. The actions of the cannabinoid system transcend the scales of organization, ranging from the sub-cellular within an organism to beyond an organism's boundary, where it regulates extra-organismic, yet population-dependent, hierarchical dissipative structures such as social, political, economic and religious systems. With such broad, multi-scaled activities, which have evolved over 600 million years, the cannabinoid system may underlie evolutionarily advanced phenomena. For example, it has been postulated that the endocannabinoid system may provide the mind-body link that emerges as the placebo effect [Melamede, 2006a]. Through their behavioral consequences, cannabinoids (and potentially other behavioral biochemical regulators) create a hypervariable interface between an organism and its environment, thus linking behavior and evolution. Specifically, it is suggested that due to man's unprecedented impact on his environment, unique demands are placed on man's behavioral repertoire, such that novel adaptive behavior is necessary for man's survival.
3 Far From Equilibrium Thermodynamics
For many years life, characterized by high levels of organization, appeared to contradict the Second Law of Thermodynamics, which states that in an isolated system entropy must always increase and free energy must decrease. The concepts developed by Ilya Prigogine describe how, as long as sufficient negative entropy and matter flow into a far-from-equilibrium system, it can overcome the intrinsically positive entropy production of an isolated system [Kondepudi and Prigogine, 1998]. Thus entropy flow is necessary to maintain flow-dependent (dissipative) structures such as living systems. The biosphere may be viewed as the grand dissipative structure of life, with species and individuals as component dissipative structures contained within [Nicolis and Prigogine, 1989]. Similarly, as the level of magnification increases, body systems, tissues and subcellular components must each maintain a successful far-from-equilibrium entropic balance:

    dS_total/dt = dS_exchange/dt + dS_internal/dt
Homeostasis is the process by which the inputs and outputs of entropy exchange flow to and from characteristic internal, flow-dependent structures, essentially allowing them to constantly adapt to their constantly changing environment. The survival time of an individual or population depends on the rate of movement towards equilibrium, as measured by illness and, ultimately, death.
3.1 Energy Flow
As energy flows through a species, three categories of possibilities regarding stability are evident: (i) a system may remain stable; (ii) the energy flow through a system might increase sufficiently to destabilize the system, in which case it may successfully bifurcate to a state of lower entropy (health and fitness) or collapse to a state of higher entropy (apoptosis on a cellular scale, illness and death on an organismic scale, extinction on the species scale); (iii) there may be insufficient energy flow through a system, so that it collapses, either totally or to a lower, yet still flow-dependent, level of organization (again apoptosis, illness and death, or extinction on the respective scales).
4 Endocannabinoid System
The cannabinoid system is composed of ligands, receptors, and ligand transporting and degrading enzymes. Endocannabinoids are lipid metabolites that bind to the cannabinoid receptors (CB1 and CB2). CB1 receptors, originally thought to be found mainly in nervous tissue, have now been found in numerous other tissue types including skin, muscle, etc. In contrast, the CB2 receptors are largely limited to cells of the immune system, but are also found in other tissues including the brain. Recent publications indicate that the CB2 receptor is up-regulated in any tissue under conditions of pathology. Cannabinoids are involved in the fundamental life, death and differentiation alternatives of cells, and thereby extend throughout the levels of biological organization.
4.1 Evolution
A far-from-equilibrium perspective of the evolutionary progression of living systems from single-celled species to man suggests successive bifurcations in which systems became more complex so that they could more efficiently generate external entropy [Melamede, 2006b]. Successful feeding behavior and the passage of waste products is an evolutionary prerequisite for the energy-driven, ongoing, nonlinear rearrangements that characterize speciation. We now know that endocannabinoids are critical homeostatic regulators of all body systems, and perhaps most importantly of energy flow in general [Cota et al., 2003]. The activities of cannabinoids transcend scales from the sub-cellular to the organism. Thus, cannabinoids have the potential for their activities to become manifest as a whole that is greater than the sum of its parts. This concept becomes particularly applicable when one considers the impact of the cannabinoid system on the functioning of the brain [Fride, 2005]. Cannabinoids in the nervous system work via a novel retrograde synaptic mechanism.
4.2 Cannabinoids and Behavior
Behavioral studies performed with CB1 knockout mice provide important insights as to how these compounds function. Mice deficient in CB1 receptors initially learn better than their wild-type counterparts [Bilkei-Gorzo et al., 2005]. However, as they get older, learning occurs more efficiently in the wild-type strain. This observation suggests that when memories are initially established, forgetting is not involved. However, as memories grow more complex and abstract thinking emerges, forgetting old knowledge becomes an important part of setting down new knowledge. Experiments using a Morris water maze demonstrate the critical role that cannabinoids play in relearning [Varvel and Lichtman, 2002]. Wild-type and CB1 knockout mice both learn how to solve the maze, since that is how they get out of the water. However, if the position of the platform is moved, the wild-type mice readily learn to go to the new position, whereas the CB1 knockout mice continue to return to the place that no longer has the platform. The ability of cannabinoids to regulate relearning has important, far-reaching consequences that will be discussed in the section on politics. CB1 knockout mice exhibit a number of behavioral abnormalities, including increased aggressive, anxiogenic and depressive-like behavior, as well as anhedonia [Zimmer et al., 1999].
5 Health
A hundred years ago, the leading cause of death in America was infectious disease (http://www.cdc.gov/nchs/fastats/lcod.htm). As public health improved and antibiotics were developed, people lived longer. Today the leading causes of death are "age-related illnesses". They include cardiovascular diseases [Steffens et al., 2005] (heart attack (#1 cause of death) and stroke (#3)), neurological diseases [Milton, 2002; Ramirez et al., 2005] [Hill and Gorzalka, 2005] (Alzheimer's (#8), depression), immune disorders (diabetes [Li et al., 2001] (#6), multiple sclerosis [Shohami and Mechoulam, 2006], Crohn's disease [Massa and Monory, 2006]), and cancer [Guzman, 2003] (#2). There are numerous peer-reviewed studies indicating that a beneficial effect on all these conditions can result from activating the endocannabinoid system. The endocannabinoid system regulates all body systems (cardiovascular, digestive [Izzo and Coutts, 2005], endocrine [Maccarrone and Wenger, 2005], excretory [Brady et al., 2004], immunological [Carrier et al., 2005], musculo-skeletal [Casanova et al., 2003] [Ofek et al., 2006], neurological [Fride, 2005], integumentary [Casanova et al., 2003], reproductive [Wang et al., 2006]), and through these systems regulates body temperature [Hollister, 1971], food intake [Cota et al., 2005], sleep, reproduction [Wang et al., 2006], pain [Burns and Ineck, 2006] and mental attitude [Piomelli et al., 2006]. In fact, mice knocked out for their CB1 receptor have shortened life spans [Zimmer et al., 1999]. Therefore, in keeping with man's evolutionary history, in which endocannabinoids are found in the most evolutionarily advanced areas of the brain, it appears that the need to extend the cannabinoid system is still with us.
6 Politics

By extending research done largely with mice to humans, one can speculate on a possible relationship between a person's cannabinoid system and their politics, and on what behavior might emerge as the level of cannabinoid activity rises in the population. This rise can occur slowly through genetic changes that affect endocannabinoid levels, or more rapidly through the consumption of the essential fatty acid precursors to endocannabinoids and through marijuana consumption. The brain has the capacity to regenerate nerve cells, and this process is largely controlled by endocannabinoids [Jiang et al., 2005]. Regeneration is involved in neuronal plasticity and learning [Chevaleyre et al., 2006]. It is hypothesized that people with an endocannabinoid deficiency in critical areas of the brain will tend to look backwards in time because that view minimizes the need for re-learning. Conversely, a robust endocannabinoid system equips an individual to adjust to the future by controlling the reformulation of old memories and patterns of behavior as new learning dictates. It is self-evident that in a population there will be some who are more endowed with endocannabinoid activity than others. The relative level of endocannabinoid activity can vary from one tissue to another depending on an individual's genetics and environmental history. Individuals with a relative endocannabinoid deficiency in critical areas of the brain will have a greater tendency to agree with one another because they have a greater probability of looking into the past and trying to preserve the status quo. In contrast, individuals endowed with an above-average endocannabinoid system can better adjust to the novelty of a developing situation. They will have a greater tendency to optimistically look into the unknowns of the future because they have the adaptive biochemical machinery. This viewpoint is supported by the finding that cannabis users have an enhanced sense of well-being
[Barnwell et al., 2006] and decreased levels of depression [Denson and Earleywine, 2006]. The greater tendency for conservative consensus will tend to give the cannabinoid-deficient population greater political power. As a result, they will preferentially gather in government (somewhat independent of political party). In fact, there appear to be basic biochemical differences in neurocognitive processes between liberals and conservatives [Amodio et al., 2007]. If this hypothesis is correct, people with relatively lower endocannabinoid activity are the same individuals who make laws against marijuana use, even for medicinal purposes. The biological activity of cannabinoids runs counter to their genetics. They make laws that ignore the overwhelming scientific evidence supporting the thousands who use marijuana medicinally. For example, the FDA announced in April of 2006 that marijuana has no medical value.
7 Conclusions

Because of the broad impact that politically motivated policy has on all aspects of our lives, such as healthcare, the war on marijuana is actually an example of evolution in action. Mankind is engaged in a genetic battle based on genetically defined behavioral determinants. The ability of cannabinoids to reduce age-related illnesses, and also to regulate open-mindedness [Hill et al., 2006], emphasizes the importance of having marijuana available for the health and survival of a population. The rapidly changing world that we live in, with its associated possible dangers for the survival of our own and other species (global warming, nuclear warfare, pandemic diseases), demands that the population as a whole work cooperatively to promote policies that respond in a timely manner to changes that potentially threaten the very survival of mankind. As the cannabinoid activity in the human population increases, what emergent behavior might result? If humans consume essential fatty acids so that they can maximize their endocannabinoid production, and consume appropriate amounts of supplemental marijuana, they should become less depressed [Denson and Earleywine, 2006], more optimistic and forward-looking, less subject to age-related illnesses, suffer less pain, and become more cooperative. The dissipative structures that are our political, economic, religious, and social systems might undergo drastic character changes, perhaps for the betterment of all.
Bibliography

1. Amodio, DM, JT Jost, SL Master, and CM Yee. 2007. Neurocognitive correlates of liberalism and conservatism. Nature Neuroscience 10, no. 10: 1246-1247.
2. Barnwell, SS, M Earleywine, and R Wilcox. 2006. Cannabis, motivation, and life satisfaction in an internet sample. Subst Abuse Treat Prev Policy 1, no. 1: 2.
3. Bilkei-Gorzo, A, I Racz, O Valverde, M Otto, K Michel, M Sastre, and A Zimmer. 2005. Early age-related cognitive impairment in mice lacking cannabinoid CB1 receptors. Proceedings of the National Academy of Sciences of the United States of America 102, no. 43: 15670-15675.
4. Brady, CM, R DasGupta, C Dalton, OJ Wiseman, KJ Berkley, and CJ Fowler. 2004. An open-label pilot study of cannabis-based extracts for bladder dysfunction in advanced multiple sclerosis. Multiple Sclerosis 10, no. 4: 425-433.
5. Burns, TL, and JR Ineck. 2006. Cannabinoid analgesia as a potential new therapeutic option in the treatment of chronic pain. Ann Pharmacother 40, no. 2: 251-260.
6. 2003. Cannabis-based medicines--GW Pharmaceuticals: high CBD, high THC, medicinal cannabis--GW Pharmaceuticals, THC:CBD. Drugs R D 4, no. 5: 306-309.
7. Carrier, EJ, S Patel, and CJ Hillard. 2005. Endocannabinoids in neuroimmunology and stress. Curr Drug Targets CNS Neurol Disord 4, no. 6: 657-665.
8. Casanova, ML, C Blazquez, J Martinez-Palacio, C Villanueva, MJ Fernandez-Acenero, JW Huffman, JL Jorcano, and M Guzman. 2003. Inhibition of skin tumor growth and angiogenesis in vivo by activation of cannabinoid receptors. Journal of Clinical Investigation 111, no. 1: 43-50.
9. Chevaleyre, V, KA Takahashi, and PE Castillo. 2006. Endocannabinoid-mediated synaptic plasticity in the CNS. Annual Review of Neuroscience 29: 37-76.
10. Cota, D, G Marsicano, M Tschop, Y Grubler, C Flachskamm, M Schubert, D Auer, A Yassouridis, C Thone-Reineke, S Ortmann, F Tomassoni, C Cervino, E Nisoli, AC Linthorst, R Pasquali, B Lutz, GK Stalla, and U Pagotto. 2003. The endogenous cannabinoid system affects energy balance via central orexigenic drive and peripheral lipogenesis. Journal of Clinical Investigation 112, no. 3: 423-431.
11. Cota, D, MH Tschop, TL Horvath, and AS Levine. 2005. Cannabinoids, opioids and eating behavior: The molecular face of hedonism? Brain Res Brain Res Rev.
12. Denson, TF, and M Earleywine. 2006. Decreased depression in marijuana users. Addict Behav 31, no. 4: 738-742.
13. Devane, WA, L Hanus, A Breuer, RG Pertwee, LA Stevenson, G Griffin, D Gibson, A Mandelbaum, A Etinger, and R Mechoulam. 1992. Isolation and structure of a brain constituent that binds to the cannabinoid receptor. Science 258, no. 5090: 1946-1949.
14. Fride, E. 2005. Endocannabinoids in the central nervous system: from neuronal networks to behavior. Curr Drug Targets CNS Neurol Disord 4, no. 6: 633-642.
15. Guzman, M. 2003. Cannabinoids: potential anticancer agents. Nat Rev Cancer 3, no. 10: 745-755.
16. Guzman, M. 2005. Effects on cell viability. Handb Exp Pharmacol no. 168: 627-642.
17. Herkenham, M, AB Lynn, MD Little, MR Johnson, LS Melvin, BR de Costa, and KC Rice. 1990. Cannabinoid receptor localization in brain. Proceedings of the National Academy of Sciences of the United States of America 87, no. 5: 1932-1936.
18. Hill, MN, LM Froese, AC Morrish, JC Sun, and SB Floresco. 2006. Alterations in behavioral flexibility by cannabinoid CB(1) receptor agonists and antagonists. Psychopharmacology (Berl).
19. Hill, MN, and BB Gorzalka. 2005. Is there a role for the endocannabinoid system in the etiology and treatment of melancholic depression? Behavioural Pharmacology 16, no. 5-6: 333-352.
20. Hollister, LE. 1971. Actions of various marihuana derivatives in man. Pharmacol Rev 23, no. 4: 349-357.
21. Izzo, AA, and AA Coutts. 2005. Cannabinoids and the digestive tract. Handb Exp Pharmacol no. 168: 573-598.
22. Jiang, W, Y Zhang, L Xiao, J Van Cleemput, SP Ji, G Bai, and X Zhang. 2005. Cannabinoids promote embryonic and adult hippocampus neurogenesis and produce anxiolytic- and antidepressant-like effects. Journal of Clinical Investigation 115, no. 11: 3104-3116.
23. Kondepudi, Dilip, and I. Prigogine. 1998. Modern Thermodynamics: From Heat Engines to Dissipative Structures. John Wiley & Sons.
24. Li, X, NE Kaminski, and LJ Fischer. 2001. Examination of the immunosuppressive effect of delta9-tetrahydrocannabinol in streptozotocin-induced autoimmune diabetes. Int Immunopharmacol 1, no. 4: 699-712.
25. Maccarrone, M, and T Wenger. 2005. Effects of cannabinoids on hypothalamic and reproductive function. Handb Exp Pharmacol no. 168: 555-571.
26. Massa, F, and K Monory. 2006. Endocannabinoids and the gastrointestinal tract. Journal of Endocrinological Investigation 29, no. 3 Suppl: 47-57.
27. McPartland, JM, J Agraval, D Gleeson, K Heasman, and M Glass. 2006. Cannabinoid receptors in invertebrates. J Evol Biol 19, no. 2: 366-373.
28. Melamede, R. 2005. Harm reduction--the cannabis paradox. Harm Reduct J 2: 17.
29. Melamede, RJ. 2006a. Cannabinoids and the Physics of Life. Fourth National Conference on Clinical Cannabinoids.
30. Melamede, RJ. 2006b. Dissipative Structures and the Origins of Life. 601.
31. Milton, NG. 2002. Anandamide and noladin ether prevent neurotoxicity of the human amyloid-beta peptide. Neuroscience Letters 332, no. 2: 127-130.
32. Narang, S, D Gibson, AD Wasan, EL Ross, E Michna, SS Nedeljkovic, and RN Jamison. 2008. Efficacy of Dronabinol as an Adjuvant Treatment for Chronic Pain Patients on Opioid Therapy. J Pain 9, no. 3: 254-264.
33. Nicolis, Gregoire, and Ilya Prigogine. 1989. Exploring Complexity: An Introduction. W.H. Freeman & Company.
34. Ofek, O, M Karsak, N Leclerc, M Fogel, B Frenkel, K Wright, J Tam, M Attar-Namdar, V Kram, E Shohami, R Mechoulam, A Zimmer, and I Bab. 2006. Peripheral cannabinoid receptor, CB2, regulates bone mass. Proceedings of the National Academy of Sciences of the United States of America 103, no. 3: 696-701.
35. Piomelli, D, G Tarzia, A Duranti, A Tontini, M Mor, TR Compton, O Dasse, EP Monaghan, JA Parrott, and D Putman. 2006. Pharmacological Profile of the Selective FAAH Inhibitor KDS-4103 (URB597). CNS Drug Rev 12, no. 1: 21-38.
36. Ramirez, BG, C Blazquez, T Gomez del Pulgar, M Guzman, and ML de Ceballos. 2005. Prevention of Alzheimer's disease pathology by cannabinoids: neuroprotection mediated by blockade of microglial activation. Journal of Neuroscience 25, no. 8: 1904-1913.
37. Russo, EB. 2004. Clinical endocannabinoid deficiency (CECD): can this concept explain therapeutic benefits of cannabis in migraine, fibromyalgia, irritable bowel syndrome and other treatment-resistant conditions? Neuro Endocrinol Lett 25, no. 1-2: 31-39.
38. Russo, EB, GW Guy, and PJ Robson. 2007. Cannabis, Pain, and Sleep: Lessons from Therapeutic Clinical Trials of Sativex(R), a Cannabis-Based Medicine. Chem Biodivers 4, no. 8: 1729-1743.
39. Shohami, E, and R Mechoulam. 2006. Multiple sclerosis may disrupt endocannabinoid brain protection mechanism. Proceedings of the National Academy of Sciences of the United States of America 103, no. 16: 6087-6088.
40. Steffens, S, NR Veillard, C Arnaud, G Pelli, F Burger, C Staub, A Zimmer, JL Frossard, and F Mach. 2005. Low dose oral cannabinoid therapy reduces progression of atherosclerosis in mice. Nature 434, no. 7034: 782-786.
41. Tomida, I, A Azuara-Blanco, H House, M Flint, RG Pertwee, and PJ Robson. 2006. Effect of Sublingual Application of Cannabinoids on Intraocular Pressure: A Pilot Study. Journal of Glaucoma 15, no. 5: 349-353.
42. Varvel, SA, and AH Lichtman. 2002. Evaluation of CB1 receptor knockout mice in the Morris water maze. J Pharmacol Exp Ther 301, no. 3: 915-924.
43. Wang, H, SK Dey, and M Maccarrone. 2006. Jekyll and Hyde: Two Faces of Cannabinoid Signaling in Male and Female Fertility. Endocrine Reviews.
44. Zimmer, A, AM Zimmer, AG Hohmann, M Herkenham, and TI Bonner. 1999. Increased mortality, hypoactivity, and hypoalgesia in cannabinoid CB1 receptor knockout mice. Proceedings of the National Academy of Sciences of the United States of America 96, no. 10: 5780-5785.
Chapter 4
Different Neuron Population Distributions Correlate with Topologic-Temporal Dynamic Acoustic Information Flow

Walter Riofrio 1 and Luis Angel Aguilar 1,2
1 Neuroscience and Behaviour Division, Universidad Peruana Cayetano Heredia, [email protected]
2 Castilla y Leon Neuroscience Institute, Salamanca, Spain, [email protected]
In this study, we focus on two aspects of neural interconnections. One is the way in which the information flow is produced, and the other has to do with the neural distribution in specific architectural arrangements in the brain. It is very important to realize that both aspects are related, but it is possible to argue, regarding the former, that the information flow is governed not only by the number of spikes in the neurons but by a series of other factors as well. Here we show the role played by GABAergic neurons in acoustic information transmission in the Central Nucleus of the Inferior Colliculus (CNIC). We report a neural spatial-temporal cluster distribution associated with each isofrequency region. With these results, we shed some light on the emergence of certain mental properties from dynamic neural interactions.
1 Introduction

The Inferior Colliculus (IC) is the processing center of ascending information to the thalamus and brain cortex. It controls the nuclei of the lower auditory pathway, and it plays an important role in multi-sensorial integration and in motor-auditory reflex production [Oliver & Huerta 1992]. These different functions must be a consequence of the integration of excitatory and inhibitory signals from the ascendant, descendant, and commissural projections [Li & Kelly 1992, Nelson & Erulkar 1963, Shneiderman & Henkel 1987]. The inhibitory action of GABA and Gly is what shapes the temporal and spectral response of IC neurons. Given the relevance of inhibitory processes in the physiologic responses of IC neurons, it is realistic to think that we might obtain some explanation of the acoustic information flow dynamics from the neural populations and the histochemical and cytoarchitectural organization of this nucleus. Tonotopic organization is a fundamental property of the auditory system [von Bekesy 1960], and it is now well established that the morphological substrate for such tonotopy in the CNIC is its laminar organization [Oliver & Morest 1984]. It is further known that the rate code measures the intensity of a signal by means of the number of spikes of a single neuron or a population of neurons over a period of time [Adrian & Zotterman 1926]. Once the rate code has been determined, we take into account the total number of spikes over a period of time, but the order or timing of the spikes is not relevant. For a considerable period of time, it was thought that the variability in the inter-spike intervals would limit the sensory information that neurons could transmit. Notwithstanding, recent studies indicate that variability could confer certain important advantages in information flow: variability, or noise, could enhance sensitivity to weak signals, a phenomenon called stochastic resonance [Jaramillo & Wiesenfeld 1998].
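The stochastic resonance effect mentioned above can be illustrated with a toy simulation (this is only a sketch; the threshold, signal amplitude, and noise levels are arbitrary illustrative assumptions, not values from the cited study). A subthreshold periodic signal never crosses a detection threshold on its own, but adding moderate noise allows it to be detected:

```python
import math
import random

def count_crossings(signal_amp, noise_sd, threshold=1.0, n=1000, seed=0):
    """Count threshold crossings of a weak sine wave plus Gaussian noise.

    All parameter values are arbitrary illustration choices.
    """
    rng = random.Random(seed)
    crossings = 0
    for t in range(n):
        # subthreshold periodic input (period = 50 samples)
        s = signal_amp * math.sin(2 * math.pi * t / 50.0)
        if s + rng.gauss(0.0, noise_sd) > threshold:
            crossings += 1
    return crossings
```

With an amplitude of 0.5 against a threshold of 1.0, `count_crossings(0.5, 0.0)` is zero (the signal alone is undetectable), while a moderate noise level such as `count_crossings(0.5, 0.3)` yields a nonzero count, with crossings clustering near the signal's peaks: the noise carries the weak signal over threshold.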
Others suggest additional important advantages of variability or noise. They put forward that this apparent spike-generation variability by neurons represents signals from excitatory postsynaptic potentials (EPSPs), which is commonly referred to as a temporal code [Mainen & Sejnowski 1995]. The main idea of the temporal code concept is that the exact timing of nerve spikes carries more information than the rate alone. This research into information flow tells us that the so-called 'noise' associated with variability in neural inter-spike intervals would correlate with cognitive processes. A common agreement has emerged recently: the temporal notion is a very important component for understanding the way in which information flow is transmitted, and comprehending this phenomenon will have strong implications for the study of mental processes [Abeles 2004, Carr 2004, Vanrullen et al. 2005]. In this paper, we propose a working hypothesis on the implications of the sensorial information flow through the neural network, and of its neural distribution, for the most basic capacities involved in mental or cognitive properties.
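The contrast between rate and temporal coding described above can be made concrete with a minimal sketch (the spike trains below are hypothetical examples, not recorded data): two trains with identical spike counts are indistinguishable to a rate code, yet carry different inter-spike interval patterns.

```python
def firing_rate(spike_times, window):
    """Rate code: only the number of spikes per unit time matters."""
    return len(spike_times) / window

def inter_spike_intervals(spike_times):
    """Temporal code: the precise timing of spikes carries extra information."""
    return [b - a for a, b in zip(spike_times, spike_times[1:])]

# Two hypothetical spike trains (times in ms) over the same 50 ms window.
regular = [10, 20, 30, 40, 50]   # evenly spaced spikes
bursty = [10, 12, 14, 40, 50]    # an early burst, then sparse spikes
```

Both trains have the same rate (5 spikes per 50 ms), so a pure rate code cannot tell them apart; their interval sequences, however, differ, which is exactly the extra information a temporal code can exploit.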
2 Cytoarchitectural Organization in the Central Nucleus of the Inferior Colliculus

In a recent study [Merchan et al. 2005] of the distribution of GABA-IR and GABA-IN neurons across and within the frequency-band laminae, we found that there are neurochemical differences with regard to the tonotopic organization of the CNIC (see Methods): "...differences within the laminae are greater along the dorsomedial-ventrolateral axis than along the rostro-caudal axis...differences across frequency regions are minor..." [page 920]. We studied the quantitative anatomical distribution of GABAergic neurons in each portion of the IC; the CNIC in rats possesses up to 30% GABAergic neurons [see page 921]. The presence of GABA-IR and GABA-IN neurons throughout the entire IC is shown in fig. 1.
Figure 1: Scatter plots showing the distribution of the grey values (normalized density) obtained for all IC neurons used in the densitometric analysis together with the control neurons (granule, Golgi, and Purkinje cells from the cerebellum, and blood vessels). Note that the valley that separates the two peaks in the histograms approximates the threshold of OD that separates GABA-IR from GABA-IN neurons.
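The valley-based thresholding idea described in the caption of fig. 1 can be sketched as follows. This is a minimal illustration, not the published analysis: it assumes a simple fixed binning and a crude peak-separation rule, and the function name and parameters are our own.

```python
import random

def valley_threshold(values, bins=20):
    """Return the grey value at the valley between the two main histogram peaks.

    Sketch of the idea: bin the optical-density values, find the two dominant,
    well-separated peaks, and take the least-populated bin between them as the
    threshold separating the two populations (e.g. GABA-IR vs GABA-IN).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        i = min(int((v - lo) / width), bins - 1)
        counts[i] += 1
    # bins ordered by decreasing population
    order = sorted(range(bins), key=lambda i: -counts[i])
    peak1 = order[0]
    # second peak: the most populated bin not adjacent to the first peak
    peak2 = next(i for i in order[1:] if abs(i - peak1) > 2)
    left, right = min(peak1, peak2), max(peak1, peak2)
    # the valley is the least populated bin between the two peaks
    valley = min(range(left, right + 1), key=lambda i: counts[i])
    return lo + (valley + 0.5) * width
```

On synthetic bimodal data (two Gaussian clusters of grey values centered at 0.2 and 0.8), the returned threshold falls between the two modes, mimicking the valley in the histograms of fig. 1.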
When the study was performed in the CNIC, we found that the distribution of GABAergic neurons was different within each isofrequency-band lamina, and the differences across frequency regions were almost negligible (see fig. 2).
Figure 2: Graphic representation of the density gradient (GABA-IR in CNIC), coded by color. We can observe a clear topological distribution in rostral and caudal sections. For more details see reference [Merchan et al. 2005].
The different distributions of neurons in every tonotopic plane are correlated with a GABA concentration gradient in the dorso-ventral direction. These findings could be interpreted as a topological relationship with the laminar organization in the Central Nucleus of the Inferior Colliculus. GABAergic neurons oriented almost transversally to the isofrequency bands are shown in fig. 3.
Figure 3: The data show the distribution of GABA-IR and GABA-IN neurons. Compare the difference between normalized OD, perimeter, GABA puncta, and Gly puncta (in green) along and across frequency regions.
3 Evolvable Complex Information

"Evolvability is an organism's capacity to generate heritable phenotypic variation" [Kirschner & Gerhart 1998]. Once a linkage between evolvability and robustness is established, we can see neutral mutation as the key to increased levels of phenotypic variability, which enhance the chances of innovative roads in evolution [Wagner 2005]. One of these innovations is the evolvable increase of complexity in a trait like information management by neural aggregations. We begin with certain clarifications on the notion of information, the central characteristic of a theoretical construct that we developed to study the emergence of the pre-biotic world. Our construct, called the Informational Dynamic System, contains the essential capacities of autonomy, function, and information [Riofrio 2007]. In this way, our conceptual exploration leads us to realize that neurons manage and process - with certain levels of success - different kinds of signals or signs. And when they are processing and transmitting a message through the neuronal circuit, they incorporate some degree of 'meaning' into these actions. Concerning information, any type of signal or sign can be a carrier of what might be information (it is the carrier of 'potential information'). We consider a signal or sign to be any matter-energy variation (sound wave, electromagnetism, concentration of a chemical compound, change in pH, etc.). The sign or signal that is a 'potential information' carrier must be in the surroundings, on the inside, or have transmitted the information to some system component.
The important thing for our idea is that the information, properly said - the Information - has a meaning (a very basic semantics) that is created on the inside of the system. As Menant would say, "It is meaningful information." [Menant 2003]. From a naturalist perspective, any signal, sign, or information must always be with respect to something else and not imposed by some outside observer. Therefore, we do not accept the existence of things like signs, signals, or information by themselves. As with the notions of autonomy and function, we could state that the information notion is also a relational concept and, together with the other two notions, defines the Informational Dynamic Systems:

1. The emergence of information (meaningful information) is made possible by the previous existence of a matter-energy variation. What is also needed is the existence of a kind of system having the capacity to process the incident matter-energy variation that influences it. Then, the meaningful information (or, properly stated, Information) is a resultant property produced in the system by its processing capacity together with the incident matter-energy variation.

2. The reasons for understanding that the system will generate a 'sense or meaning' from the matter-energy variation are found in the system's intrinsic interdependence among its different processes, which produces - in one way or another - a counterbalance or a decrease/increase of the performance conditions in a process, which will in turn have certain influences over one or more processes in the system.

3. For the purposes of our work, we will use Collier's notion of cohesion [Collier 2003]. Cohesion is the idea that provides the informational dynamic system its identity through all its transformations in time. It is a relational and dynamic definition that encompasses the nature of the system organization.

4. Thus, the possible information that the signal carries must have been incorporated into some process. It is in that particular process where the information might be transmitted.

5. Then, the 'potential information' becomes Information (information with meaning for the system) since it has the capacity to produce something (an effect) in the process that incorporated it, or in some other process that is directly or indirectly connected to the initial process, or with some of the constrictions that effectively maintain the system far from thermodynamic equilibrium.

6. The effect has a repercussion in the system, influencing its own dynamic organization, and this can be in the maintenance/increase of the system cohesion. As well, the effect could produce some level of interference in the system cohesion, possibly interrupting one or more processes. In all cases, whether the effect favors or opposes cohesion, the system will develop some type of response that will be correlated to that meaningful information and to the process or processes enveloped by the effect.
It is reasonable to think that through evolution neurons became those cellular entities that explore the potentialities of electromagnetic field management. In this respect, we support our studies with the results and proposals which, precisely,
claim that those things known as mental phenomena are found in the endogenous electromagnetic field produced by the brain [McFadden 2000, Pockett 2000]. One of the basic characteristics of the first ancestors of living systems, those that opened the door to the pre-biotic world, was the capacity to generate "meaningful information" about their environment and about their dynamical internal milieu. Since neurons are dynamic systems and, in our proposal, descendants of the Informational Dynamic System, their dynamism will carry metabolic informational-functional messages to their interior, to their environment, and to other neuronal connections. In the end, each subdivision within the isofrequency-band laminae in the Central Nucleus of the Inferior Colliculus will react in different dynamic ways to the incident acoustic tone. If a dynamic topological laminar organization relationship exists, then it is possible to posit the presence of a temporally variable excitatory flow. We could state the following: It is the increase in the constraint levels that makes the incremental expression of semantic levels possible in more sophisticated ways on neural integrations, whose complexity levels increase in an open-ended evolution.
4 Implications for Future Investigations

If we accept that mental phenomena, in biological evolution, passed through increasing hierarchical levels of complexification, it is reasonable to postulate the following claim: the complexity levels of the brains of different biological species would grow and, at the same time, mental properties would diversify and increase in complexity and sophistication. Of the different ways in which it is possible to observe the growing levels of brain complexity, we postulate, one is related to the specific features by which a neural network is structured (in its components as much as in its connections). We are proposing that the different arrangements by which the specific components of a given information transmission road are connected would be one of the keys to improving our understanding of the relationships between the brain and the mind. In this way, we believe it important not only to study what the components of the so-called neural code (or temporal code) would be: finding the possible rules by which the parts of the neural (temporal) code are related among themselves is also important. One direction is to propose understanding how the possible constituents of the neural (or temporal) code are related among themselves, connecting each other with certain linkage-rules. Our proposal on this issue is an attempt to contribute to the investigations in Biosemantics [Emmeche 2004, Kull 2005, Millikan 2002] and other related themes. Let us focus on an ideal concept of language for a moment. Grammar is linked by rules that allow us to distinguish an adequate relationship between words in a certain language. So, in this ideal language, we would have the criteria to differentiate the construction of adequate sentences from that of inadequate sentences, because we already know what the word order is for constructing a sentence in this language [Lance & O'Leary-Hawthorne 1997, Langacker 1991].
If we had reports that the sensorial information flow is correlated with determined classes of topologies in specific portions inside the sensorial signal transmission roads, together with studies on the organizational architecture of these roads showing us that there is a difference in the spatial-temporal topological distribution of the types of neurons, then, taking into account these experimental results and the consequences derived from our conceptual construct, we could conclude something about mental properties. Indeed, the constraints on the acoustic information flow due to the topological distribution of neural populations control the ways in which the information is transmitted. The specific distribution of neural cell types (in particular, inhibitory neurons) producing gradients of inhibitory and/or excitatory signals is linked, we assume, to mental rules: the grammar of the mind?
5 Conclusions

We can ask what role the inter-spike variability in neurons would play, and what role the different distributions of neural types would play in a determined sensorial-information transmission. Actually, the data presented on the Central Nucleus of the Inferior Colliculus would show some vestiges, certain indications, of how this mental grammar - registered and frozen in this interrelated articulation of neural types and networks - expresses the rules by which the firings, silences, and 'delays/accelerations' of the acoustic information are incorporated into the brain's endogenous electromagnetic field as basic or low-level mental representations.
Bibliography

[1] Abeles, M., 2004, Time is precious, Science, 304: 523-524.
[2] Adrian, E. & Zotterman, Y., 1926, The impulses produced by sensory nerve endings. Part 3. Impulses set up by touch and pressure, J. Physiol. (Lond.), 61: 465-483.
[3] Carr, C.E., 2004, Timing is everything: organization of timing circuits in auditory and electrical sensory systems, J. Comp. Neurol., 472: 131-133.
[4] Collier, J., 2003, Hierarchical Dynamical Information Systems with a Focus on Biology, Entropy, 5(2): 100-124.
[5] Emmeche, C., 2004, A-life, Organism and Body: the semiotics of emergent levels, edited by Bedeau, M., Husbands, P., Hutton, T., Kumar, S., Suzuki, H., Workshop and Tutorial Proceedings, Ninth International Conference on the Simulation and Synthesis of Living Systems (Alife IX), 117-124.
[6] Jaramillo, F. & Wiesenfeld, K., 1998, Mechanoelectrical transduction assisted by Brownian motion: a role for noise in the auditory system, Nature Neurosci., 1: 384-388.
[7] Kirschner, M. & Gerhart, J., 1998, Evolvability, PNAS, 95: 8420-8427.
[8] Kull, K., 2005, A brief history of Biosemiotics, Journal of Biosemiotics, 1: 1-34.
[9] Lance, M.N. & O'Leary-Hawthorne, J., 1997, The Grammar of Meaning: Normativity and Semantic Discourse, Cambridge University Press.
[10] Langacker, R.W., 1991, Foundations of Cognitive Grammar: Descriptive Application, Stanford University Press.
[11] Li, L. & Kelly, J.B., 1992, Inhibitory influence of the dorsal nucleus of the lateral lemniscus on binaural responses in the rat's inferior colliculus, J. Neurosci., 12: 4530-4539.
[12] Mainen, Z.F. & Sejnowski, T.J., 1995, Reliability of spike timing in neocortical neurons, Science, 268: 1503-1506.
[13] McFadden, J., 2000, Quantum Evolution, HarperCollins, London.
[14] Menant, C., 2003, Information and Meaning, Entropy, 5: 193-204.
[15] Merchan, M., Aguilar, L.A., Lopez-Poveda, E.A. & Malmierca, M.S., 2005, Immunocytochemical and semiquantitative study on Gamma-aminobutyric acid and glycine in the Inferior Colliculus of rat, Neuroscience, 136(3): 907-925.
[16] Millikan, R.G., 2002, Biofunctions: Two Paradigms, edited by R. Cummins, A. Ariew, M. Perlman, Functions: New Readings in the Philosophy of Psychology and Biology, Oxford University Press, 113-143.
[17] Nelson, P.G. & Erulkar, S.D., 1963, Synaptic mechanisms of excitation and inhibition in the central auditory pathway, J. Neurophysiol., 26: 908-923.
[18] Oliver, D.L. & Huerta, M., 1992, Inferior and superior colliculi, edited by D.B. Webster, A.N. Popper, R.R. Fay, The mammalian auditory pathway: neuroanatomy, Springer-Verlag, Berlin, 168-221.
[19] Oliver, D.L. & Morest, D.K., 1984, The central nucleus of the inferior colliculus in the cat, J. Comp. Neurol., 222: 237-264.
[20] Pockett, S., 2000, The Nature of Consciousness: A Hypothesis, Writers Club Press, Lincoln NE.
[21] Riofrio, W., 2007, Informational Dynamic Systems: Autonomy, Information, Function, edited by C. Gershenson, D. Aerts and B. Edmonds, Worldviews, Science, and Us: Philosophy and Complexity, World Scientific, Singapore, 232-249.
[22] Shneiderman, A & Henkel, CX., 1987, Banding of lateral superior olivary nucleus afferents in the inferior coll iculu s: a possible substrate for sensory integration, J. CompoNeurol., 266 : 5 19- 534. [23] Vanrullen, R., Guyonneau, R. & Thorpe, S.1., 2005, Spike times make sens e, Trends Neurosci ., 28 : 1-4. [24] von Bekesy, G., 1960, Experiments in hearing, McGraw-Hill, New York . [25] Wagner, A., 2005, Robustness, evol vability, and neutrality, FEBS Letters, 579: 1772-1778.
Chapter 5
Modeling the Dynamics of Task Allocation and Specialization in Honeybee Societies

Mark Hoogendoorn, Martijn C. Schut, and Jan Treur
Vrije Universiteit Amsterdam, Department of Artificial Intelligence
{mhoogen, schut, treur}@cs.vu.nl
1 Introduction

The concept of organization has been studied in sciences such as social science and economics, but recently also in artificial intelligence [Furtado 2005, Giorgini 2004, McCallum 2005]. With the desire to analyze and design more complex systems consisting of larger numbers of agents (e.g., in nature, society, or software), the need arises for a concept of higher abstraction than the concept of an agent. To this end, organizational modeling is becoming a practiced stage in the analysis and design of multi-agent systems, hereby taking into consideration the environment of the organization. An environment can have a high degree of variability, which might require organizations to adapt to the environment's dynamics to ensure a continuous proper functioning of the organization. Hence, such change processes are a crucial function of the organization and should be part of the organizational model. An organizational model incorporating organizational change can be specified in two ways. The first is from a centralized perspective, in which a central authority determines the changes to be performed within the organization, taking into account the current goals and environment; see e.g. [Hoogendoorn 2004]. A second possibility is to create a model for organizational change from a decentralized perspective, in which each agent decides if and how to change its own role allocations. In the latter approach, it is much more difficult for the organization as a whole to change in a coherent way while still satisfying the goals set for the organization, as there is no overall view of the organizational change. The approach might, however, be the only possibility for an
organization to perform change, as a central authority for performing change could be non-existent or infeasible due to the nature of the organization. In the domain of social insects, such as honeybees and wasps, organizations are known to adapt in a decentralized fashion to environmental changes. This paper presents a model for decentralized organizational change appropriate for such phenomena as they occur in nature. This model can aid in creating and analyzing such an organization. The description of the model is done from a generic perspective, abstracting from the actual tasks being performed by the organization. The scope of the model is broader than social insects: the mechanisms incorporated may work in other types of organizations as well. In [Bonebeau 2000], for example, a comparable approach is used for finding an optimal allocation of cars to paint booths. To evaluate the model proposed, the honeybee (Apis mellifera) has been analysed. The model instantiated for this domain has been validated against properties as acquired from biological experts. A number of different roles have been identified in the literature (see e.g., [Schultz 2002, Winston 1982]). For the sake of brevity only five will be addressed here: (1) a brood carer takes care of feeding the larvae within the bee hive; (2) a patroller guards the hive by killing enemies entering the hive; (3) a forager harvests food from external sources; (4) an undertaker cleans the hive of corpses; and (5) a resting worker simply does nothing. Switching between roles is triggered by changes in the environment observed by the bees. Each role has a specific trigger, for which a bee has a certain threshold that determines whether this is the role it should play. The bee always plays the role for which it is most triggered. For example, bees are triggered to start playing the brood carer role when they observe the larvae emitting a too high level of hunger pheromones.
Once they are allocated to the role, they start getting food from the combs and feed the larvae that are emitting the pheromones. A trigger for the patroller role is the number of enemies observed around the hive. Foragers that have returned from their hunt for food communicate the location where they found the food by means of the honeybee dance (see [Camazine 2001]). For other bees currently not playing the forager role, such a dance is a trigger to start playing the forager role. The more corpses there are, the more bees are triggered to switch from their current role to being undertaker. Bees perform the resting worker role in case they are not sufficiently triggered for any other role. The generic model for decentralized organizational change is described in Sections 2 (properties at the organization level) and 3 (role properties). Results of a simulation of the generic organizational model instantiated with domain-specific knowledge of the bee colony are shown in Section 4, and finally Section 5 concludes the paper. For a more detailed overview of this work, see [Hoogendoorn 2007].
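The threshold-driven role switching described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the role names follow the paper, but the stimulus values and thresholds are invented for the example.

```python
def select_role(stimuli, thresholds):
    """Play the role whose environmental trigger exceeds the bee's
    threshold for it by the largest margin; rest when nothing does."""
    best_role, best_excess = "resting worker", 0.0
    for role, stimulus in stimuli.items():
        excess = stimulus - thresholds[role]
        if excess > best_excess:
            best_role, best_excess = role, excess
    return best_role

# A bee observing strong hunger pheromones and a moderate forager dance:
stimuli = {"brood carer": 0.9, "patroller": 0.2,
           "forager": 0.6, "undertaker": 0.1}
thresholds = {"brood carer": 0.5, "patroller": 0.5,
              "forager": 0.5, "undertaker": 0.5}
print(select_role(stimuli, thresholds))  # brood carer
```

A bee with no stimulus above any of its thresholds falls back to the resting worker role, matching the text.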
2 Modeling Organizational Properties

To enable modeling an organization, an expressive language is needed that has the ability to describe the dynamics of such an organization. For this purpose TTL (Temporal Trace Language) has been adopted; cf. [Jonker 2002]. TTL allows for the formal specification of dynamic properties on multiple levels of aggregation. The bottom level addresses role properties, describing the required behavior for each of the roles within the organization. On the top level, organization properties are defined, expressing the overall goals or requirements for the organization. Within TTL an executable subset has been defined, called LEADSTO; cf. [Bosse 2007]. In case role
properties are expressed in this executable format, the organizational model can be simulated for (e.g., environmental) scenarios, resulting in a trace of the organizational behavior. The organization properties can thereafter be checked against the trace by means of an automated tool called the TTL checker, to see whether the organizational model indeed satisfies the goals or requirements set for it, given the scenario. More details and the semantics of TTL can be found in [Sharpanskykh 2005]. Examples and explanations of properties expressed in TTL are shown in more detail in [Hoogendoorn 2007]. The model for decentralized organizational change presented here takes the form of a hierarchy of dynamic properties at two aggregation levels: that of the organization, and that of the roles within the organization. This section describes a number of such properties as well as the relationships between them. The highest-level requirement for the organization as a whole, as inspired by the biological domain experts, is survival of the population given a fluctuating environment; in other words, the population size needs to stay above a certain threshold M.

OP1(M) Surviving Population
For any time t, a time point t' ≥ t exists such that at t' the population size is at least M.
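On a finite simulation trace, a property like OP1 can be checked mechanically. The sketch below is not the TTL checker itself, just a minimal illustration of the check, with the trace represented as a list of population sizes per time point.

```python
def op1_holds(population_trace, M):
    """OP1(M): for every time point t there exists t' >= t at which
    the population size is at least M (finite-trace reading)."""
    return all(any(size >= M for size in population_trace[t:])
               for t in range(len(population_trace)))

print(op1_holds([25, 22, 21, 20, 24], M=20))  # True
print(op1_holds([25, 18, 22, 19], M=20))      # False: from t=3 on, the
                                              # population never reaches 20
```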
Such a high-level requirement is refined by means of a property hierarchy, depicted as a tree in Figure 1. At the highest level OP1 is depicted, which can be refined into a number of properties (in Figure 1, n properties), each expressing that for a certain aspect the society is in good condition, characterized by a certain value for a variable (the aspect variable) that is to be maintained. The property template for an aspect X is as follows:

OP2(X, P1, P2) Organization Aspect Maintenance
For all time points t, if v is the value of aspect variable X at t, then v is between P1 and P2.
Sometimes one of the two bounds is omitted, and it is only required that value v is at least P1 (resp., at most P2). For the example bee society the aspects considered are well-fed brood, safety, food storage, and cleanness (addressed, respectively, by the Brood Carer, Patroller, Forager, and Undertaker roles). For each of these aspects a variable was defined to indicate the societal state for that aspect. For example, for well-fed brood, this variable concerns relative larvae hunger, indicated by the larvae pheromone rate. In order to maintain the value of an aspect variable X, a certain effort is needed all the time. To specify this, a property that expresses the effort made by the organization on the aspect is introduced. Notice that the notion of provided effort at a time point t can be taken in an absolute sense (for example, effort as the amount of feeding work per time unit), but it can also be useful to take it in a relative sense with respect to a certain overall amount, which itself can vary over time (for example, effort as the amount of feeding work per time unit divided by the overall number of larvae). Below, the latter, relative form will be taken. The general template property for aspect effort is as follows:

OP3(X, W1, W2) Sufficient Aspect Effort
For all time points t, the effort for aspect X provided by the organization is at least W1 and at most W2.
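Checked over a finite trace of aspect-variable values, the OP2 template amounts to interval maintenance (OP3 has the same shape, applied to effort values instead). A minimal sketch, not the TTL formalization:

```python
def op2_holds(values, P1=None, P2=None):
    """OP2(X, P1, P2): at all time points the aspect value v satisfies
    P1 <= v <= P2; either bound may be omitted (None)."""
    return all((P1 is None or P1 <= v) and (P2 is None or v <= P2)
               for v in values)

# Relative larvae hunger staying within a maintained band [0, 0.9]:
print(op2_holds([0.3, 0.5, 0.8], P1=0, P2=0.9))  # True
print(op2_holds([0.3, 1.1, 0.8], P1=0, P2=0.9))  # False
```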
For the bee colony, for instance, the brood care workers take care that the larvae are well-fed. The effort to maintain the hunger of the larvae at a certain low level is feeding the larvae. Here the provided effort for brood care is defined as the brood care work per time unit divided by the larvae population size. Brood care work is taken as the amount of the (average) brood care work for one individual brood carer times the number of brood carers.

Figure 1. Property hierarchy for decentralized organizational change (a tree with OP1(M) Surviving Population at the top, refined per aspect into OP2 Organization Aspect Maintenance and OP3 Sufficient Aspect Effort properties, the Adaptation Flexibility properties, and at the bottom the role properties RP(w(a_i), d_i, W_i), RP1(M), RP2(M), and RP3(M)).

Whether the refined properties given above will always hold depends on the flexibility of the organization. For example, in the bee colony case, if the number of larvae or enemies increases, the number of brood care workers, respectively patrollers, should also increase. If the adaptation to the new situation takes too much time, the property Brood Care Effort will not hold for a certain time. In principle, such circumstances will damage the success of the organization. Therefore, an adaptation mechanism is needed that is sufficiently flexible to guarantee properties such as Brood Care Effort. For this reason, the adaptation flexibility property is introduced, which expresses that when the effort for a certain organization aspect that is to be maintained is below a certain value, then within a certain time duration d it will increase to become at least this value. The smaller this parameter d is, the more flexible the adaptation; for example, if d is very large, the organization is practically not adapting. The generic property is expressed as follows:
OP4(X, B, d) Adaptation Flexibility
At any point in time t, if at t the effort for aspect X provided by the organization is lower than B, then within time duration d the effort will become at least B.

An assumption underlying this property is that not all aspects in the initial situation are critical; otherwise the adaptation mechanism will not work. OP3, expressing that sufficient effort is being provided, directly depends on this adaptation mechanism, as shown in Figure 1. OP4 depends on role properties at the lowest level of the hierarchy, which are addressed in the next section.
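On a finite trace of effort values, this adaptation flexibility check can be sketched as below (an illustration, writing B for the threshold value; dips too close to the end of the trace to observe recovery are conservatively reported as violations):

```python
def op4_holds(effort_trace, B, d):
    """OP4(X, B, d): whenever effort drops below B at time t, there is
    a t' with t <= t' <= t + d at which effort is at least B."""
    for t, e in enumerate(effort_trace):
        if e < B and not any(v >= B for v in effort_trace[t:t + d + 1]):
            return False
    return True

print(op4_holds([0.5, 0.1, 0.2, 0.6], B=0.4, d=2))  # True: recovers in time
print(op4_holds([0.5, 0.1, 0.2, 0.3], B=0.4, d=2))  # False: never recovers
```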
3 Role Properties

Roles are the engines of an organization model: they are the elements in an organization model where the work that is done is specified. The properties described in Section 2 in a hierarchical manner have to be grounded in role behavior properties as the lowest-level properties of the hierarchy. In other words, specifications of role properties are needed that entail the properties at the organizational level described in Section 2. In the behavioral model two types of roles are distinguished: Worker roles, which provide the effort needed to maintain the different aspects throughout the organization, and Member roles, which have the function to change Worker roles. Each Member role has exactly one shared allocation with a Worker role. The role behavior for the Worker roles within the organization is shown in Section 3.1, whereas Section 3.2 specifies the behavior for the Member roles.
3.1 Worker Role Behavior

Once a certain Worker role exists as an active role, it performs the corresponding work. What this work exactly is depends on the application: it is not part of the generic organization model. The property directly relates to OP4, which specifies the overall effort provided, as shown in Figure 1. Note that Figure 1 only shows the generic form of the role property (depicted as RP(w(a_i), d_i, W_i), where a_i is the specific aspect and w(a_i) the Worker role belonging to that aspect), whereas in an instantiated model a role property is present for each instance of the Worker role providing the effort for the specific aspect. In a generic form this is specified by:

RP(R, d, W) Worker Contribution
For all t there is a t' with t ≤ t' ≤ t + d such that at t' the Worker role R delivers a work contribution of at least W.
3.2 Member Role Behavior

By a Member role M, decisions about taking up or switching between Worker roles are made. As input to this decision process, information is used about the well-being of the organization, in particular about the different aspects distinguished as to be maintained; these are input state properties indicating the value of an aspect variable X: has_value(X, v). Based on this input, the Member role M generates an intermediate state property representing an indication of the aspect that is most urgent in the current situation. In the generic model the decision mechanism is indicated by a priority relation priority_relation(X1, v1, w1, ..., Xn, vn, wn, X), indicating that aspect X has priority in the context of values vi, respectively norms wi, for aspects X1, ..., Xn. This priority relation can be specialized to a particular form, as shown by an example specialization in the last paragraph of this section.

RP1(M) Aspect Urgency
At any t, if at t Member role M has norms w1 to wn for aspects X1 to Xn, and receives values v1 to vn for X1 to Xn at its input, and has a priority relation that indicates X as the most urgent aspect for the combination of these norms and values, then at some t' ≥ t it will generate that X is the most urgent aspect.
Based on this, the appropriate role for the aspect indicated as most urgent is determined. If it is not the current role sharing an allocation with M, then another intermediate state property is generated, expressing that the current Worker role sharing an allocation with M should be changed to the role supporting the most urgent aspect. In other words, the shared allocation of Member role M in the Change Group should change from one (the current) Worker role R1 in Worker Group WG1 to another one, Worker role R2 in Worker Group WG2:

RP2(M) Role Change Determination
At any t, if at t Member role M generated that X is the most urgent aspect, and Worker role R2 is responsible for this aspect, and R1 is the current Worker role sharing an allocation with M, and R1 ≠ R2, then at some t' ≥ t it will generate that role R2 has to become the Worker role sharing an allocation with M, instead of R1.
Based on this intermediate state property, the Member role M generates output indicating which role should become a shared allocation and which should not anymore:
RP3(M) Role Reallocation
At any t, if at t Member role M generated that Worker role R2 has to become sharing an allocation with M, instead of Worker role R1, then at some t' ≥ t it will generate the output that role R1 will no longer share an allocation with M and R2 will share an allocation with M.
All three role properties for the Member roles are depicted in Figure 1. The adaptation flexibility property OP4 for each organizational aspect depends upon all three of them, so each of the OP4 branches depends upon RP1, RP2, and RP3, which have therefore been depicted twice in the figure. The generic description of the Member role behavior can be specialized one step further by incorporating a specific decision mechanism. This gives a specific definition of the priority relation priority_relation(X1, v1, w1, ..., Xn, vn, wn, X), as has been done for the following decision mechanism based on norms used as thresholds (see e.g. [Theraulaz 1998]).

1. For each aspect X to be maintained, a norm w(X) is present. For the Worker role R1 for X sharing an allocation with Member role M, each time unit the norm has a decay described by fraction r.
2. For each X, it is determined to what extent the current value is unsatisfactory, expressed in a degree of urgency u(X) for that aspect.
3. For each aspect with urgency above the norm, i.e., with u(X) > w(X), the relative urgency u(X)/w(X) is determined.
4. The most urgent aspect X is the one with the highest relative urgency.
4 Simulation Results

This section discusses some of the results of simulations that have been performed based on the generic organizational model; in particular, the role properties presented in Section 3 have been put in an executable format and have been instantiated with domain-specific information for bee colonies. To validate the instantiated simulation model, the high-level dynamic properties from Section 2 were used (in accordance with biological experts). Proper functioning of such an organization in nature is not self-evident; therefore two simulation runs are compared: one using the adaptation mechanism, and one without. Note that the results presented here are the results of a simulation of the instantiated organizational model, abstracting from allocated agents. Performing such high-level simulations of an executable organizational model enables the verification of properties against these simulation runs. Hence, it can be checked whether or not the model satisfies the properties or goals considered important. When such properties are indeed satisfied, then by allocating agents to the roles that comply with the role properties, the multi-agent system delivers the desired results as well. The circumstances are kept identical for both simulations (see [Hoogendoorn 2007] for details on these settings). Figure 2 shows results on the performance of the two settings of the organizational model. Figure 2a shows the overall population size over time. The population size of the simulation with adaptation remains relatively stable, whereas without adaptation it drops to a colony of size 3, which is equal to the number of larvae living without being fed. Figures 2b and 2c show information regarding brood care: firstly, the average pheromone level, the trigger that activates the allocation to brood carer, and furthermore the number of active brood carers in the colony.
In the case with adaptation their number increases significantly in the beginning of the simulation, as the amount of
Figure 2. Results of simulating the bee colony with and without adaptation. Note that (D) only shows the worker types for the adaptive case.
pheromones observed is relatively high. Therefore, many brood carer roles are allocated. For example, at time point 300, 15 out of a population of 28 are brood carers. Despite the fact that the overall pheromone level is not decreasing rapidly, the number of brood carer roles drops significantly after time point 300. This is due to the fact that Member roles can only share an allocation with one Worker role at a time. When another role receives a higher urgency (e.g., there is a huge attack, demanding many patrollers), a switch of Worker role takes place. Figure 2d shows the number of Worker roles of the different types (except the resting workers) within the bee colony for the setting with adaptation. The number of brood carers decreases after time point 300 due to an increase in the number of shared allocations to the undertaker and forager roles. This results in an increase in the pheromone level again, causing a higher urgency for brood care again, resulting in more brood carers, etc. The pheromone level finally stabilizes around 0.5 in the organizational model with adaptation. For the setting without adaptation, the brood carers simply cease to exist due to the fact that none of the larvae are growing up, and the pheromone level stabilizes at a higher level. The properties from Section 2 have been checked by the automated TTL checker. With the following parameter settings, the properties were validated and confirmed for the organizational model with adaptation and falsified for the one without adaptation: OP1(20), OP2(broodcare, 0, 0.9), OP3(broodcare, 0.15, 10000), OP4(broodcare, 0.3, 200).
5 Conclusion and Discussion

The organizational model for decentralized organizational change was inspired by mechanisms observed in nature for a honeybee colony case study. The scope of the model is not limited to being a model for social insects: in [Bonebeau 2000] the effectiveness of such approaches is shown for other domains as well. The model can therefore support modelers and analysts working with organizations in dynamic environments without a central control of change. The formal specification of the behavior of the organization is described by dynamic properties at different aggregation levels. Once the lowest-level properties within the organization are specified in an executable form, these can be used for simulation, abstracting from agents (to be) allocated. Such low-level properties can be indicative of the behavior of the agent allocated to that particular role. The possibility also exists to specify the role properties at the lowest aggregation level in a more abstract manner, in a non-executable format. Hierarchical relations between the properties can be identified to show that fulfillment of properties at a lower level entails the fulfillment of the higher-level properties. Simulations using agents can be performed and checked for fulfillment of these properties. The case study of the honeybee colony was analysed using the model. Simulation showed that, given the external circumstances, the model was effective with respect to the overall properties put forward by biological experts.
References

Bonebeau, E. and Theraulaz, G., 2000, Swarm Smarts, Scientific American, 282(3): 72-79.
Bosse, T., Jonker, C.M., Meij, L. van der, and Treur, J., 2007, A Language and Environment for Analysis of Dynamics by Simulation, Intern. J. of AI Tools, 16: 435-464.
Camazine, S., Deneubourg, J.L., Franks, N.R., Sneyd, J., Theraulaz, G., Bonabeau, E., 2001, Self-Organization in Biological Systems, Princeton University Press, Princeton.
Furtado, V., Melo, A., Dignum, V., Dignum, F., Sonenberg, L., 2005, Exploring congruence between organizational structure and task performance: a simulation approach. In: Boissier, O., Dignum, V., Matson, E., Sichman, J. (eds.), Proc. of the 1st OOOP Workshop.
Giorgini, P., Müller, J., Odell, J. (eds.), 2004, AOSE IV, LNCS 2935, Springer-Verlag, Berlin.
Hoogendoorn, M., Jonker, C.M., Schut, M., and Treur, J., 2004, Modelling the Organisation of Organisational Change. In: Giorgini, P., and Winikoff, M. (eds.), Proceedings of the International Workshop on Agent-Oriented Information Systems (AOIS'04), pp. 29-46.
Hoogendoorn, M., Schut, M.C., and Treur, J., 2007, Modeling Decentralized Organizational Change in Honeybee Societies. In: Costa, F.A. et al. (eds.), Advances in Artificial Life, Proc. of the 9th European Conf. on Artificial Life, ECAL'07, LNCS 4648, Springer-Verlag, pp. 615-624.
Jonker, C.M., Treur, J., 2002, Compositional verification of multi-agent systems: a formal analysis of pro-activeness and reactiveness, Int. J. of Coop. Inf. Systems, 11: 51-92.
McCallum, M., Vasconcelos, W.W., and Norman, T.J., 2005, Verification and Analysis of Organisational Change. In: Boissier, O. et al. (eds.), Proc. of the 1st OOOP Workshop.
Schultz, D.J., Barron, A.B., Robinson, G.E., 2002, A Role for Octopamine in Honey Bee Division of Labor, Brain, Behavior and Evolution, 60: 350-359.
Sharpanskykh, A., Treur, J., 2005, Temporal Trace Language: Syntax and Semantics, Technical Report, Vrije Universiteit Amsterdam, Department of Artificial Intelligence, Amsterdam.
Theraulaz, G., Bonabeau, E., and Deneubourg, J.L., 1998, Response thresholds reinforcement and division of labor in insect societies, Proc. of the Royal Soc. of London Series B-Biological Sciences, 265: 327-332.
Winston, M.L. and Punnett, E.N., 1982, Factors determining temporal division of labor in honeybees, Canadian Journal of Zoology, 60: 2947-2952.
Chapter 6

An agent-based model for Leishmania major infection

Garrett M. Dancik¹, Douglas E. Jones³, Karin S. Dorman¹,²,⁴
¹Departments of Statistics, ²Genetics, Development & Cell Biology, ³Veterinary Pathology, and the ⁴Program in Bioinformatics and Computational Biology, Iowa State University
[email protected], [email protected], [email protected]
1. Introduction

Leishmania are protozoan parasites transmitted by bites of infected sandflies. Over 20 species of Leishmania, endemic in 88 countries, are capable of causing human disease. Disease is either cutaneous, where skin ulcers occur on exposed surfaces of the body, or visceral, with near certain mortality if left untreated. C3HeB/FeJ mice are resistant to L. major, but develop chronic cutaneous lesions when infected with another species, L. amazonensis. The well-characterized mechanism of resistance to L. major depends on a CD4+ Th1 immune response, macrophage activation, and elimination of the parasite [Sacks 2002]. The factors that account for host susceptibility to L. amazonensis, however, are not completely understood, despite being generally attributed to a weakened Th1 response [Vanloubbeeck 2004]. Computer simulation can provide insight into the differences between these species. Toward this goal, we describe an agent-based model for L. major infection and explore the sensitivity of predictions to model parameters. Results indicate that the strength of the Th1 response, resting macrophage speed, and parasite transfer threshold of infected macrophages (which determines when infected macrophages transfer parasite to additional cells) influence time to heal infection, while the timing of the adaptive immune response, macrophage speed, and transfer threshold impact parasite load at the peak of infection.
2. An agent-based model of Leishmania major infection

Agent-based models (ABMs) inherently capture the dynamics of complex systems whose properties depend on the collective behavior of the system's interacting components. An ABM contains distinct entities, or agents, that inhabit a spatial environment. A simulation visualizes agents as they move and interact according to update rules that are executed at discrete time steps. We describe an ABM of the immune response to L. major infection. The structure of our model follows that of Segovia-Juarez [2004], who explore granuloma formation
during infection with another macrophage-tropic parasite, Mycobacterium tuberculosis. All model parameters are in Table A.1 of the appendix.

2.1. The environment

The experimental injection of L. major in the footpad of a mouse is a common biological model for studying the immune response to parasite challenge. We model a 2mm x 2mm cross section of footpad as a 100 x 100 grid of square micro-compartments. Because we assume the grid is contained within a larger infected area, the environment is toroidal, so an object leaving the grid will re-enter at the opposite end. A single micro-compartment can hold up to one macrophage and one T cell, with no restriction on chemokine molecules or parasites. We select four evenly distributed micro-compartments to serve as source compartments where new cells enter the system. We label micro-compartments (i,j), starting from (0,0) at the bottom left. Define a Moore neighborhood of length r at position (x0, y0) to be the space

M_r(x0, y0) = {(x, y) : |x - x0| <= r, |y - y0| <= r}.

Furthermore, define M1(x0, y0) to be the immediate Moore neighborhood of a micro-compartment (x0, y0).
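The toroidal Moore neighborhood can be sketched as follows. This is an illustration consistent with the definition above, under the assumption that the center compartment itself is excluded; n is the grid side length.

```python
def moore_neighborhood(x0, y0, r, n=100):
    """M_r(x0, y0) on an n x n toroidal grid: the micro-compartments
    within Chebyshev distance r of (x0, y0), wrapped modulo n."""
    return {((x0 + dx) % n, (y0 + dy) % n)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if (dx, dy) != (0, 0)}

print(len(moore_neighborhood(0, 0, 1)))         # 8 immediate neighbors
print((99, 99) in moore_neighborhood(0, 0, 1))  # True: toroidal wrap-around
```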
Furthermore, define MJ(xoo Yo) to be the immediate Moore neighborhood of a microcompartment (xo,Yo). 2.2. Stages of Infection 2.2.1. Initial Conditions Mice are infected with L. major promastigotes in the experimental infection we study [Vanloubbeeck 2004]. Resident tissue macrophages take up promastigotes, and the parasite changes to the non-motile amastigote form. We start simulating two days post infection to avoid initial infection events, which reduces the number of model parameters . In addition, probably more is known about footpad conditions two days post infection than at initial inoculation . Based on parasite counts at this time [Doug Jones, unpublished observations], we randomly place 105 macrophages on the grid. Fifty parasites infect macrophages, with parasites per macrophage uniform between I and Pm. All uninfected macrophages are given random lifespans uniform between 0 and 100 days. 2.2.2. Infection of Macrophages Macrophages are the primary cells that Leishmania parasites infect. We use the term resting macrophage to refer to an uninfected, unactivated cell. Intracellular parasites experience logistic growth at rate Ur with carrying capacity KJ + 30. If the macrophage is not activated, intracellular parasites grow until their number exceeds the transfer threshold KJ. Then, the macrophage enters a dying state, where it begins transferring parasite to macrophages in its length two Moore neighborhood . In Leishmania infection, parasite transmission is thought to be direct, as extracellular parasites are seldom observed [Chang 2003]. Segovia-Juarez [2004] use a similar model, but allow extracellular parasite and restrict take-up to the length I Moore neighborhood . All macrophages that are not dying, including those already infected, can take up parasite.
Activated macrophages will eliminate parasite they ingest. A dying macrophage is removed from the system once all of its intracellular parasites are gone.
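The intracellular growth and transfer-threshold rule of Section 2.2.2 can be sketched as below. This is a hedged illustration: the parameter values are placeholders, and the transfer threshold, whose subscript is garbled in the text, is written K_T here.

```python
def step_parasite_load(p, u_r=0.01, K_T=20):
    """One time step of logistic growth of intracellular parasites,
    with carrying capacity K_T + 30; once the load exceeds the transfer
    threshold K_T, an unactivated macrophage enters the dying state.
    Returns (new_load, dying)."""
    p = p + u_r * p * (1 - p / (K_T + 30))
    return p, p > K_T

load, dying = step_parasite_load(1.0)
print(dying)                        # False: far below the threshold
print(step_parasite_load(21.0)[1])  # True: threshold exceeded, transfer begins
```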
2.2.3. Chemokines and Cell Movement

Chemokines, chemical attractants that influence cell movement, play an important role in Leishmania infection [Roychoudhury 2004]. We include one generic chemokine as an attractant for both macrophages and T cells. Its diffusion and decay properties are based on interleukin-8 (IL-8), an important chemokine involved in early infection. Cell movement has been described as a biased random walk in the presence of chemokine [Tranquillo 1988]. Let

c_{i,j} = amount of chemokine in micro-compartment (i,j),
c_{M1(i,j)} = amount of chemokine in the immediate Moore neighborhood M1(i,j).

Propose that a cell currently in micro-compartment (i,j) moves to micro-compartment (k,l) ∈ M1(i,j) with probability c_{k,l} / c_{M1(i,j)}. If the proposed micro-compartment contains no other cells, then the move proceeds. If the current cell is a macrophage (T cell) and the proposed micro-compartment contains a T cell (macrophage), then the cell will move with probability T_move. Otherwise, the cell will not move.
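The biased-random-walk proposal can be sketched as follows (an illustration, not the authors' code; chemokine is assumed to be a mapping from micro-compartment to chemokine amount, and the grid wraps toroidally):

```python
import random

def propose_move(i, j, chemokine, n=100):
    """Propose a destination (k, l) in the immediate Moore neighborhood
    M1(i, j), chosen with probability proportional to its chemokine
    amount c_{k,l}; uniform when the neighborhood holds no chemokine."""
    neighbors = [((i + dx) % n, (j + dy) % n)
                 for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                 if (dx, dy) != (0, 0)]
    weights = [chemokine.get(cell, 0.0) for cell in neighbors]
    if sum(weights) == 0:
        return random.choice(neighbors)
    return random.choices(neighbors, weights=weights, k=1)[0]

# All chemokine sits in (1, 0), so the proposal is deterministic:
print(propose_move(0, 0, {(1, 0): 5.0}))  # (1, 0)
```

Whether the proposed move actually proceeds is then decided by the occupancy rules above (empty compartment, or T_move when the destination holds the other cell type).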
2.2.4. Recruitment and the adaptive immune response

During infection, T cells and macrophages are recruited to infected areas. Macrophages are actively recruited around two days post infection with L. major [Sunderkotter 1993]. During infection, antigen-presenting cells take up pathogen from the site of infection, migrate to the draining lymph node, and present antigen (processed pathogen) to naive T cells. T cells then proliferate and mature into several classes of T cells, including CD4+ Th1 cells. In L. major infection, CD4+ Th1 cells are directed to the infected area and activate infected macrophages. The arrival of T cells takes between 4 and 7 days in the absence of prior pathogen exposure [Janeway 2005], and occurs only after a prolonged period of parasite growth in low-dose L. major infection [Belkaid 2000]. We assume that the timing of T cell recruitment is related to pathogen load at the infection site, and recruit T cells once a threshold pathogen level, T_delay, is reached. In the ABM, we use source compartments to represent blood vessels where recruited cells enter. At each time step and at each unoccupied source compartment, macrophages enter with probability M_recr and T cells, after T_delay is reached, enter with probability T_recr. Given recruitment with these probabilities, movement into occupied source compartments is the same as described for movement around the grid.
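One way to sketch this recruitment rule (probabilities are placeholders standing in for M_recr and T_recr; a source compartment may admit both a macrophage and a T cell in one step, since each micro-compartment holds up to one of each):

```python
import random

def recruit(sources, occupied, pathogen_threshold_reached,
            M_recr=0.1, T_recr=0.1):
    """At each unoccupied source compartment, a macrophage enters with
    probability M_recr; once pathogen load has reached the T_delay
    threshold, a T cell also enters with probability T_recr."""
    arrivals = []
    for s in sources:
        if s in occupied:
            continue
        if random.random() < M_recr:
            arrivals.append(("macrophage", s))
        if pathogen_threshold_reached and random.random() < T_recr:
            arrivals.append(("T cell", s))
    return arrivals

# With certain recruitment, one free source yields one cell of each type:
print(recruit([(0, 0)], occupied=set(), pathogen_threshold_reached=True,
              M_recr=1.0, T_recr=1.0))
```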
2.2.5. The role of inflammatory macrophages
During the acute stage of infection, macrophages systematically migrate to the draining lymph node after taking up foreign antigen, apoptotic immune cells, and necrotic tissue. These inflammatory macrophages have a shorter tissue lifespan than resident macrophages [Bellingan 1996]. During infection, the macrophage population is a heterogeneous mix of resident, activated, and inflammatory macrophages. For simplicity, and because inflammatory macrophages likely dominate during active infection, we transform all uninfected resident macrophages into inflammatory macrophages when T cell recruitment begins, assigning each a lifespan drawn uniformly between 2 and Mtls days. This rule is consistent with the observed decrease in macrophages around the time the T cell response peaks in L. major infection [Belkaid 2000]. This decrease cannot be explained merely by the death of infected macrophages, but requires a systematic change such as the proposed phenotype switch [data not shown].
2.2.6. Macrophage activation
All T cells in our model are considered antigen-specific CD4+ Th1 cells that are equally capable of activating infected macrophages. T cells within the immediate Moore neighborhood of an infected macrophage activate it with probability Tactm. In L. major infection, macrophage activation is sufficient for elimination of intracellular parasite. Mals days after T cell activation, the macrophage destroys all intracellular parasite, undergoes apoptosis, and is removed from the grid.
2.3. Time scales
Each time step in the ABM is equal to approximately six seconds of real time. Chemokine diffusion and decay, as well as parasite growth, occur each time step. T cells move every 200 time steps (20 minutes). Macrophages move on slower time scales that we allow to vary (Table A.1). Additional update rules, such as the uptake of parasite and the activation of macrophages, are applied every Update minutes.
3. Sensitivity analysis
3.1. Choices of model parameters
There are 24 parameters in our model. All parameters and ranges are given in Table A.1. For each parameter, we assign ranges biologically consistent with both L. major and L. amazonensis. References for our choices of ranges are given in the table.
3.2. Experimental design
For simplicity we vary only 11 model parameters. Sensitivity to the remaining parameters will be explored in later work. Parameter values are chosen from their ranges using Latin hypercube sampling (LHS) as described in McKay [1992]. Each parameter assumes one of 11 possible values during a single simulation. We perform 484 runs using 22 LHS structures and two replicates for each input configuration. We simulate infection for 71 days or until all parasite is cleared. In the model, two cells of the same type cannot occupy a single micro-compartment. This rule makes it possible for infected macrophages to impede macrophage entry at the source compartments. Although a pathogen may influence the immune response in this way, this is not thought to be the case for Leishmania. To prevent this unintended effect from biasing our results, we discard runs where at least one infected macrophage occupies a source compartment at the end of any simulation day. A total of 261 runs are used in our final analysis. Of these, 255 infections heal before 71 days.
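A Latin hypercube design of this kind can be sketched in a few lines: each parameter's range is cut into as many equal strata as there are runs, one value is drawn per stratum, and the strata are shuffled independently across parameters (a numpy-only sketch in the spirit of McKay [1992]; the two ranges shown are illustrative placeholders, not the values of Table A.1):

```python
import numpy as np

def latin_hypercube(n, ranges, rng):
    """Draw n samples over len(ranges) parameters: one value per stratum
    per parameter, with strata permuted independently across parameters."""
    k = len(ranges)
    strata = rng.permuted(np.tile(np.arange(n), (k, 1)), axis=1).T  # (n, k)
    u = (strata + rng.random((n, k))) / n      # stratified uniforms in [0, 1)
    lo = np.array([r[0] for r in ranges], dtype=float)
    hi = np.array([r[1] for r in ranges], dtype=float)
    return lo + u * (hi - lo)

rng = np.random.default_rng(0)
design = latin_hypercube(11, [(0.5, 1.5), (5.0, 90.0)], rng)  # 11 runs, 2 params
```

By construction, every column of the design hits each of the 11 strata exactly once, which is what lets each parameter assume 11 distinct values over a single LHS structure.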
3.3. Results
We calculated R2 values for each of our parameters against various simulation output measures, as described in McKay [1992]. The R2 statistic measures the proportion of total variation in the response that can be accounted for by variation in each parameter. These results appear in Table 3.1. Most of the variation in time until clearance is explained by variation in the T cell recruitment rate (Trecr, 27.88%), the probability that a T cell activates an infected macrophage (Tactm, 11.51%), resting macrophage cell speed (Mrsp, 11.02%), and the transfer threshold (Kt, 9.83%); most variation in peak parasite amount is explained by variation in Tdelay (55.56%); and most of the variation in the maximum number of infected macrophages is explained by variation in Kt (48.08%).
Table 3.1. R2 values: percent of total variation accounted for by variation in model parameters for the following responses: time until parasite clearance, maximum amount of parasite, parasite load at 2, 3, 5, 7, and 10 weeks post infection, and maximum numbers of macrophages (Mac), T cells (T), and infected macrophages (Mi). In each column, values in bold indicate R2 values that exceed the average R2 value for that column.
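Because each varied parameter takes one of 11 discrete levels per run, an R2 value of the kind reported in Table 3.1 can be computed as the between-level sum of squares divided by the total sum of squares. A hedged sketch (function name and toy data are ours, not the paper's analysis code):

```python
import numpy as np

def r_squared(levels, response):
    """Proportion of total variation in `response` accounted for by the
    discrete factor `levels`: between-level SS over total SS."""
    levels = np.asarray(levels)
    y = np.asarray(response, dtype=float)
    ss_total = ((y - y.mean()) ** 2).sum()
    ss_between = sum(
        (levels == v).sum() * (y[levels == v].mean() - y.mean()) ** 2
        for v in np.unique(levels)
    )
    return ss_between / ss_total
```

A response fully determined by the factor gives R2 = 1; a response whose level means all equal the grand mean gives R2 = 0.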
3.4. Discussion
A parameter with a high R2 value indicates that the simulation response is sensitive to the value of that parameter. In the context of Leishmania infection, a parameter that has a large impact on infection characteristics indicates that 1) careful characterization of that parameter is necessary for accurate simulation of the infection, 2) species-specific (i.e., L. major vs. L. amazonensis) parameter values may account for differences in disease dynamics, and 3) data collected on this response will aid in parameter estimation. Several parameters are important determinants of time to clearance, including some that may explain the difference between Leishmania species. Trecr and Tactm, which together contribute to the efficiency of the Th1 response, are known to differ between the two species of parasite [Soong 1997; Vanloubbeeck 2004]. The parameter Mrsp is likely tissue specific and is not expected to vary with pathogen. One reason our model is sensitive to this parameter is that there is a relatively large amount of uncertainty about its value. The range of values we use is 0.5-1.5 µm/min, centered at the value of 1.0 µm/min used in the model of Segovia-Juarez [2004]. Actual in vivo speeds of macrophages could not be determined, though Webb [1996] observes speeds of 0.1-0.5 µm/min using a Dunn chemotaxis chamber. Sensitivity to these smaller values has not been explored, though current results indicate that using accurate values of this parameter will be important for understanding Leishmania infection. The transfer threshold also has a noticeable, though less substantial, impact on disease severity. In particular, both the partial rank correlation between transfer threshold and maximum number of infected cells (-0.768) and that between transfer threshold and time until infection clearance (-0.123) are negative, indicating that a lower transfer threshold is associated with a larger and longer infection. Interestingly, in vitro experiments in which macrophages are infected with L. amazonensis produce more infected cells than similar experiments with L. major, despite the fact that initial numbers of macrophages and parasites are the same [Mukbel 2006]. Based on these observations, we hypothesize that pathogen-specific differences in transfer threshold may influence disease outcome, making mice more susceptible to L. amazonensis than to L. major. An important aspect of computer model evaluation is the efficient estimation of model parameters and assessment of model bias using data observed in the field. Sensitivity analysis is useful in this context since a parameter with a large effect on a response can be estimated using field data of that response.
For example, our results indicate that biological data for peak parasite load and peak infected macrophage level can be used to estimate the transfer threshold; Tdelay is best estimated from maximum parasite load or parasite load at 2 weeks post infection; and Trecr can be estimated by examining time until clearance. In biological experiments, measurement of peak counts is usually not practical, since it requires frequent sampling. It is standard, however, to measure parasite load at various time points post infection. Computer experiments such as ours can be used to identify time points for data collection that will be most informative about model parameters of interest. Bayesian methods can then be implemented for parameter estimation when ranges or prior distributions for these parameters are known [Kennedy 2001].
4. Conclusions
In this work we have described an agent-based model for simulating L. major infection. An initial sensitivity analysis indicates that the time until clearance is most sensitive to the T cell recruitment rate and the probability that a T cell activates an infected macrophage, measures of the Th1 immune response that biologists suspect are associated with host susceptibility to L. amazonensis. Time until clearance and maximum parasite load are sensitive to the choice of resting macrophage cell speed, highlighting the need for better data about macrophage movement. Time until clearance is also influenced by the transfer threshold of infected macrophages, and differences
between L. major and L. amazonensis that affect this parameter may be partly responsible for the different disease outcomes observed.

Acknowledgements
The authors are grateful to Max Morris for insight into the analysis of the computer model. GMD was partially supported by PHS grant GM068955 to KSD and a USDA IFAFS Multidisciplinary Graduate Education Training Grant (2001-52100-11506).

References
Belkaid, Y., Mendez, S., Lira, R., Kadambi, N., Milon, G., Sacks, S., 2000. A natural model of Leishmania major infection reveals a prolonged "silent" phase of parasite amplification in the skin before onset of lesion formation and immunity. J. Immunol. 165, 969-77.
Bellingan, G.J., Caldwell, H., Howie, S.E.M., Dransfield, I., Haslett, C., 1996. In vivo fate of the inflammatory macrophage during resolution of inflammation. J. Immunol. 157, 2577-85.
Chang, K., Reed, S.G., McGwire, B.S., Soong, L., 2002. Leishmania model for microbial virulence: the relevance of parasite multiplication and pathoantigenicity. Acta Tropica 85, 375-90.
Janeway, C.A., Travers, P., Walport, M., Shlomchik, M.J., 2005. Immunobiology: The Immune System in Health and Disease, 6th ed. Garland Science Publishing, New York.
Kennedy, M.C., O'Hagan, A., 2001. Bayesian calibration of computer models. J. R. Statist. Soc. B 63, 425-64.
McKay, M.D., 1992. Latin hypercube sampling as a tool in uncertainty analysis of computer models. Proceedings of the 24th Conference on Winter Simulation. Association for Computing Machinery, 557-64.
Mukbel, R.M., Patten Jr., C., Petersen, C., Jones, D., 2006. Macrophage killing of Leishmania amazonensis amastigotes requires both nitric oxide and superoxide. Manuscript submitted for publication.
Roychoudhury, K., Roy, S., 2004. Role of chemokines in Leishmania infection. Curr. Mol. Med. 4, 691-96.
Sacks, D., Noben-Trauth, N., 2002. The immunology of susceptibility and resistance to Leishmania major in mice. Nature Rev. Immunol. 2, 845-58.
Segovia-Juarez, J.L., Ganguli, S., Kirschner, D., 2004. Identifying control mechanisms of granuloma formation during M. tuberculosis infection using an agent-based model. J. Theor. Biol. 231, 357-76.
Soong, L., Chang, C.H., Sun, J., Longley Jr., B.J., Ruddle, N.H., Flavell, R.A., McMahon-Pratt, D., 1997. Role of CD4+ T cells in pathogenesis associated with Leishmania amazonensis infection. J. Immunol. 158, 5374-83.
Sunderkotter, C., Kunz, M., Steinbrink, G., Meinardus-Hager, M., Goebeler, H.B., Song, C., 1993. Resistance of mice to experimental leishmaniasis is associated with more rapid appearance of mature macrophages in vitro and in vivo. J. Immunol. 151, 4891-901.
Tranquillo, R.T., Lauffenburger, D.A., Zigmond, S.H., 1988. A stochastic model for leukocyte random motility and chemotaxis based on receptor binding fluctuations. J. Cell Biol. 106, 303-9.
Vanloubbeeck, Y.F., Ramer, A.E., Jie, F., Jones, D.E., 2004. CD4+ Th1 cells induced by dendritic cell-based immunotherapy in mice chronically infected with Leishmania amazonensis do not promote healing. Infect. Immun. 72, 4455-63.
Webb, S., Pollard, J., Jones, G., 1996. Direct observation and quantification of macrophage chemoattraction to the growth factor CSF-1. J. Cell Sci. 109, 793-803.
Appendix - Agent-based model parameters
Table A.1 lists the 24 ABM parameters under the columns Class, Param, Description, Range/Value, Unit, and Ref. The chemokine parameters are the diffusion coefficient (0.64), the degradation coefficient (0.001), and CR, the minimum amount of chemokine necessary to influence cell movement. The macrophage parameters are the activated macrophage lifespan, the inflammatory macrophage lifespan (uniform between 2 and Mtls days), the transfer threshold of infected macrophages Kt, the resting (Mrsp), activated, and infected macrophage speeds in µm/min, the recruitment probability Mrecr, and the initial number of macrophages. The parasite parameters are the initial amount of parasite, the maximum number of parasites per initially infected macrophage, and Tdelay, the minimum number of parasites required for the T cell response to begin. The T cell parameters are Tmove (probability of T cell movement), Tactm (probability that a T cell activates a macrophage), the recruitment probability Trecr, the T cell lifespan, and the T cell speed. Updates occur every Update = 10 minutes.
Table A.1. ABM parameters. Reference codes: 1. Segovia-Juarez [2004], and references within; 2. Belkaid [2000]; e. estimate.
Chapter 7
To be or twice to be? The life cycle development of the spruce bark beetle under climate change
Holger Lange, Bjørn Økland and Paal Krokene
Norwegian Forest and Landscape Institute, Høgskoleveien 8, N-1432 Ås, Norway
[email protected]
Abstract We analyze the impact of climate change on the life cycle of the spruce bark beetle Ips typographus by means of a temperature-driven threshold model and temperature data from a network of more than 300 climate stations in Scandinavia. Using observed temperatures as well as climate model simulations, our model exhibits univoltine behavior under current conditions, but predicts almost strictly bivoltine behavior for southern Norway in 2071-2100. The dynamics of this threshold phenomenon is investigated in detail. By logistic regression, the impact of regional warming can be described as a northward movement of bivoltinism by some 600 kilometers. Extension to two generations per year (bivoltinism) might increase the risk of devastating bark beetle outbreaks, although the impact of photoperiod-induced diapause in late summer and the ratio of soil to under-bark hibernation should be taken into account.
1. Introduction Insects are physiologically sensitive to temperature, have short life cycles and great mobility, and their developmental rates and geographical distributions are therefore highly responsive to changes in temperatures. Even a small increase in mean yearly temperature may have severe consequences for agriculture and forestry through insect pests (Logan et al. 2003).
One important aspect of insect development is voltinism (the number of generations per year), which varies both between species and geographically within one species. In temperate regions, where winters are too cold for development to proceed, the number of generations is limited by the length of the growing season. There is usually only a certain developmental stage that is able to survive the winter, and the insects need to synchronize their development with the phenology (Logan and Bentz 1999), e.g. by completing either one or two generations per year (uni- or bivoltinism). Studies from North America, for example, have indicated that changes in voltinism can have profound effects on the outbreak dynamics of tree-killing bark beetles, and thus have severe consequences for tree mortality (Hansen and Bentz 2003). On the other hand, the role of a second generation may be less important for the frequency of outbreaks when resource-depletion dynamics is a dominating factor (Økland and Bjørnstad 2006). The Eurasian spruce bark beetle Ips typographus (L.) is one of the most destructive forest insects in Europe. In Norway, Sweden and Finland I. typographus normally has only one generation per year (Annila 1969), but in particularly warm summers a second generation has been initiated in southern Sweden (Butovitsch 1938) and southern Norway (Austarå et al. 1977). Empirical studies under Norwegian conditions have indicated that the summer is too short to complete the second generation. Most individuals reach the pupal stage, which is less cold tolerant than the adult stage and does not survive the following winter (Austarå et al. 1977). Bivoltinism is, however, common in Central Europe, and up to three generations of I. typographus have been assumed to develop in warm years (Harding and Ravn 1985). If global warming extends the growing season, a higher proportion of the second generation may reach the cold-hardy adult stage and survive the winter.
Future temperature increase may thus lead to a northward expansion of the areas experiencing two beetle generations per year. This work attempts to estimate the northward spread of Ips typographus bivoltinism using regional climate scenarios for Norway. The development of bark beetles and other insects has often been modelled accurately on the basis of temperature alone (Logan and Powell 2001). Developmental rates are almost zero below a lower, developmental-stage-dependent temperature threshold, and increase more or less linearly with temperature over a restricted (but ecologically relevant) temperature range above this threshold. Here we present a phenologically detailed model that describes the seasonal development of the spruce bark beetle based on degree-day sums, which has been validated in a rearing cage experiment (Wermelinger and Seifert 1998) and also reproduces the few observations of bivoltinism from southern Norway well. We use a large set of historical temperature time series to explore the current geographical distribution of bivoltine development in Norway, and its possible future spread using regional climate scenarios.
2. The model of bark beetle development
The current model was developed based on experimental data from Wermelinger and Seifert (1998), as this study provides all necessary parameters to model development, and their values agree reasonably well with other studies (Annila 1969; Netherer and Pennerstorfer 2001). Our model calculates bark beetle development using daily mean (air) temperatures from a time series of several years as input data. For each of five developmental stages a (= 1, 2, 3, 4, 5) (Table 1), the onset and closure Julian dates within a given year, d_a^+ and d_a^-, are determined through the condition that the degree-day sum

  DD_a = Σ_{d = d_a^+}^{d_a^-} (T(d) − T_a) Θ(T(d) − T_a)    (1)

equals or just exceeds the stage-specific degree-day threshold F_a. Here, d is the Julian day within a year, T(d) is the daily mean temperature, T_a is the stage-specific threshold temperature for development, below which development stops, Θ is the Heaviside function (Θ(x) = 0 for x < 0 and 1 for x > 0), and DD is the degree-days function. The model is used to represent the centre value of a beetle cohort that may be followed throughout the year, provided that its dispersion is sufficiently small (which seems to be a plausible assumption). In addition, initiation of mass flight in spring requires a maximum daily temperature above 19.5 °C (Annila 1969). Beetle development thus proceeds according to accumulated degree days above certain stage-specific developmental threshold temperatures (egg stage, larval stages, pupae, immature adults; Table 1).
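The condition in eq. (1) amounts to scanning the daily temperature series and accumulating the temperature excess above the stage threshold until the heat sum F_a is reached. A minimal Python sketch (the function name and loop structure are our own illustrative assumptions, not the authors' implementation; the 10.6 °C threshold and 51.8 K·days heat sum used in the example are the egg-stage values from Table 1):

```python
def stage_completion_day(temps, t_threshold, dd_required, start_day=0):
    """Return the Julian day on which the degree days accumulated above
    `t_threshold` first reach `dd_required` (eq. 1), or None if the heat
    sum is never reached. `temps[d]` is the daily mean temperature."""
    dd = 0.0
    for d in range(start_day, len(temps)):
        dd += max(temps[d] - t_threshold, 0.0)   # Heaviside-gated increment
        if dd >= dd_required:
            return d
    return None

# toy series: a constant 15.6 degC contributes 5 degree days per day above
# the 10.6 degC egg threshold, so the 51.8 DD egg stage closes on day 10
temps = [15.6] * 30
assert stage_completion_day(temps, 10.6, 51.8) == 10
```

Chaining such calls through the five stages of Table 1, each starting where the previous stage closed, follows the cohort through the year.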
2.1 Bivoltine potential and bivoltine fraction
When the first generation (spring generation) has completed its development, a second summer generation is initiated when temperatures are favorable for flight. If development of the second generation cannot be completed before the temperature drops below zero in the autumn, the model records the fraction of completion of the pupal stage (the number of stage-specific degree days reached, divided by the total degree days required for completion of pupation). This fraction is termed the bivoltine potential (BP), a real-valued variable to be correlated with site and climatic properties in the next section. If the second generation can be completed to the adult stage, the BP of the site in the given year is one. The spatial distribution of bivoltinism in Norway is investigated by dividing, for each station, the number of years in which the second generation has actually been completed (BP = 1) by the observation length (usually 30 years). This ratio is the bivoltine fraction (BF) of a location. Since an immature second generation is not expected to survive the next winter, BF gives a better impression of the long-term spatial spread of bivoltinism than BP.

Table 1. Standard parameter values for stage-specific developmental threshold temperatures (T_a) and heat sum requirements (DD_a) for Ips typographus (from Annila 1969 and Wermelinger and Seifert 1998).
Stadium a                  T_a (°C)   DD_a (K·days)
Flight of 1st generation   5          110
Egg                        10.6       51.8
Larvae                     8.2        204.4
Pupae                      9.9        57.7
Immature adult             3.2        238.5
2.2 Diapause and photoperiod
It is well known that certain environmental conditions can trigger the onset of diapause, i.e. a physiological state of dormancy, in insects, which is considered a risk-avoiding behaviour. An example of such a trigger is light availability, determined by the photoperiod or daylength for a given calendar date and location, combined with unfavorable temperature conditions. However, the mechanisms behind the onset of diapause and the inhibition of development and late-summer swarming are poorly understood, making a model assessment difficult (Jönsson et al. 2007). A detailed phenological model (Baier et al. 2007) concluded that 14.5 hours is to be considered the critical daylength at which bark beetles in the Austrian Alps start diapausing, whereas rearing cage studies indicate a photoperiod between 16 and 18 hours for Scandinavian and central European populations (Dolezal and Sehnal 2007). This phenotypic plasticity renders the effect of photoperiod limitation uncertain. We comment briefly on its importance in our context in the results section.
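Whether a candidate critical daylength is ever crossed at a given site and date can be checked with a standard astronomical approximation of daylength. This formula (ignoring atmospheric refraction) is our illustrative addition, not part of the authors' model:

```python
import math

def daylength(lat_deg, day_of_year):
    """Hours of daylight from a standard solar-declination approximation
    (no atmospheric refraction); clips to 0 or 24 h for polar night/day."""
    decl = math.radians(-23.44) * math.cos(2.0 * math.pi * (day_of_year + 10) / 365.0)
    cos_hour_angle = -math.tan(math.radians(lat_deg)) * math.tan(decl)
    cos_hour_angle = min(1.0, max(-1.0, cos_hour_angle))   # polar day / night
    return 24.0 / math.pi * math.acos(cos_hour_angle)
```

At the latitude of Ås (about 59.7° N) this approximation gives a midsummer maximum near 18.4 h, consistent with the 18.5 h maximum quoted in the results below.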
3 Temperature data used in the model
3.1 Measured temperature data
The historical data used were based on 337 meteorological stations covering conterminous Norway (Skaugen and Tveito 2004). Daily means and, where available, daily maxima were used. In accordance with standards in climate research, the 30-year reference period 1961 to 1990 was selected as the "present". To further explore the geographical distribution of bivoltine development in Europe, the developmental model was also run with historical mean daily temperature data from 16 selected localities along a latitudinal gradient through Sweden, Denmark and Germany.
3.2 Regional climate scenarios
We ran the developmental model with three dynamically downscaled temperature scenarios for Norway from two different climate models (the Hadley AGCM model with the A2 and B2 emission scenarios, and the ECHAM4 model with the B2 scenario). The scenario period was 2071-2100. Using the REGClim approach (Skaugen and Tveito 2004), the downscaling procedure yields gap-free daily mean temperatures for the full 30-year scenario period for each of the 337 Norwegian stations.
4 Exploring the developmental model
4.1 Sensitivity analysis for the model
Each of the 10 parameters given in Table 1 was subjected to changes to investigate the crucial factors determining BP, using a site with a medium value (BP = 0.3) for the standard parameter set and measured temperatures. Apart from the immature adult threshold and heat sum requirement, all parameters strongly influence BP; e.g., a 20% decrease of the heat sum parameter for the larval stage increases BP from 0.3 to 0.7. Thus, the parameters of the model are relatively well-defined.
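The one-at-a-time perturbation scheme described above can be sketched generically as follows; the model function and parameter names here are toy stand-ins, not the beetle model itself:

```python
def one_at_a_time(model, params, frac=0.2):
    """Perturb each parameter by +/- frac (one at a time, all others held
    at their standard values) and record the change in the response."""
    base = model(params)
    deltas = {}
    for name, value in params.items():
        for sign in (-1, +1):
            perturbed = dict(params, **{name: value * (1 + sign * frac)})
            deltas[(name, sign)] = model(perturbed) - base
    return deltas
```

For the beetle model, `model` would map the 10 parameters of Table 1 to the BP of the reference site, and large entries in `deltas` flag the crucial parameters.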
For the location Ås, where a longer uninterrupted temperature series (1952-2005) was available, the impact of mean monthly temperatures was calculated by simply shifting daily values by constant amounts. This impact turned out to be rather strong; particularly decisive were the August and September values, where a shift of less than +1 °C changes the BP from 0.1 to 1. Swarming start dates for the second generation of the year always lay between July 15th and September 10th, which is in good agreement with this result. The strength of the temperature-bivoltinism relation was studied in a first step in a highly aggregated manner, using annual average temperatures for all 353 sites on the one hand, and the 30-year average of BP on the other (Fig. 1).
Figure 1. The relation between annual mean temperatures and bivoltine potential for 353 stations.

It is obvious that this relationship is double-sided thresholded, with approximate limit temperatures of 4 and 9.5 °C, respectively. The relation between these limits is not particularly strong but clearly nonlinear (S-shaped). Although the chosen independent variable is extremely aggregated and simple, the correlation is significant. We also tried to determine the part of the year for which the temperature history is most decisive for the development. Degree-day values were calculated for a time window starting at a given day of the year and with a certain length, and these sums were correlated with the BP, again for Ås (1952-2005). The optimal correlation (r = 0.8) is achieved for a window running from May 12th to September 13th. For this window, a steep ramp-like connection between DD and BP exists, switching from BP = 0 at 1200 K·days to BP = 1 at 1300 K·days. Our developmental model so far corresponds to a critical daylength (CD) of 0 hours. The effect of diapausing was investigated by simply varying CD between 0 and 18 hours. When the actual daylength falls below this threshold, the development of the second generation ceases independent of temperature. The BP remains stable up to approximately 12-13 hours and then starts to drop. For observed temperatures in Ås, BP is halved at 14.5 hours, the estimate of Baier et al. (2007), and vanishes for 17 hours and
more, the latter representing an implausible value given the maximum daylength of 18.5 hours for this location. The photoperiod limitation requires further investigation.

4.2 Bivoltinism under current and future climatic conditions
Under current (1961-1990) climate conditions, a nonvanishing BP is restricted to the region around the Oslofjord in SE Norway, confirming empirical observations (Fig. 2, left panel). North of 65 degrees latitude, not even a completed first generation could be found by the model. The picture changes dramatically when using the climate scenarios (2071-2100). The right panel of Fig. 2 shows the results for the Hadley B2 scenario, which is the mildest among the three investigated in terms of the predicted spatiotemporal average temperature increase (2.5 °C from 1961-1990 to 2071-2100). According to these temperature predictions, the occurrence of bivoltinism becomes very common at least for large parts of southern Norway. Several locations show two generations in every single year within the 30-year period; a few sites at higher elevations continue to be unfavorable for a second generation.

4.3 Climate change as a shift in latitude for Ips typographus
A phenomenological description of the bivoltinism-latitude relationship was performed in the following way. We seek a parametrization of the bivoltine fraction which interpolates between a value of 100% south of a threshold latitude, as indicated by the temperature series from Central Europe, and zero in regions far north unsuitable for spruce trees, and which is S-shaped between these two extremes. The following (logistic) function fulfils these requirements:
  BF = 1 / (1 + exp(α log((φ − φ_s) / (φ_90 − φ)) − γ))    (2)
Figure 2. Bivoltine potential (BP) according to our model (eq. (1)) for southern Norway. Seven classes for the 30-year average BP have been built. Left panel: using observed temperatures from 1961 to 1990; right panel: Hadley model B2 scenario for 2071 to 2100.
Figure 3. Bivoltine fractions calculated from observed temperature series (1961-1990) and the HadAM B2 scenario for 2071-2100, and respective fits using a logistic function with three parameters. The resulting northward shift is indicated by the arrow.
where φ is the geographical latitude; for simplicity, we require the bivoltine fraction to vanish only at the North Pole (φ_90 = 90°); φ_s is the southern threshold latitude; and α and γ are empirical shape parameters of the function. For observed temperature series, this approach leads to the satisfying fit shown in Fig. 3 (R² = 0.75, RMSE = 0.17; both the α and γ estimates were highly significantly different from zero). The optimal value for the latitude threshold was found to be φ_s(obs) = 48°. Assuming that this empirical relationship also holds, unchanged in structure, for the climate scenarios, we fixed the values obtained for α and γ and readjusted the threshold latitude. Fig. 3 shows the result for the HadAM B2 climate scenario. The optimal value found was φ_s(HadAM B2) = 53.4°, and the performance of the fit was even slightly better (R² = 0.80, RMSE = 0.16). Thus, for this climate scenario, a northward movement of the change in reproduction cycle of Δφ_s = 5.4°, or 600 km with a standard deviation of only 10 km, is predicted. Other scenarios give qualitatively similar results.
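A logistic fit of this form can be reproduced with a simple numpy-only scheme: for a fixed φ_s, the logit log(1/BF − 1) = α·log((φ − φ_s)/(90 − φ)) − γ is linear in the transformed latitude, so one can grid-search φ_s and solve the inner linear least-squares problem for α and γ. The sketch below runs on synthetic data generated from known parameters, not the paper's station data or code:

```python
import numpy as np

def bivoltine_fraction(phi, alpha, gamma, phi_s):
    """Logistic latitude model of eq. (2): BF -> 1 just north of phi_s
    and BF -> 0 at the North Pole (phi = 90)."""
    x = np.log((phi - phi_s) / (90.0 - phi))
    return 1.0 / (1.0 + np.exp(alpha * x - gamma))

def fit_threshold(phi, bf, phi_s_grid, eps=1e-9):
    """Grid-search the threshold latitude; alpha and gamma come from a
    linear least-squares fit of the logit for each candidate phi_s."""
    y = np.log(1.0 / np.clip(bf, eps, 1.0 - eps) - 1.0)   # = alpha*x - gamma
    best = None
    for phi_s in phi_s_grid:
        x = np.log((phi - phi_s) / (90.0 - phi))
        A = np.column_stack([x, -np.ones_like(x)])
        coef = np.linalg.lstsq(A, y, rcond=None)[0]
        sse = float(((A @ coef - y) ** 2).sum())
        if best is None or sse < best[0]:
            best = (sse, coef[0], coef[1], phi_s)
    _, alpha, gamma, phi_s = best
    return alpha, gamma, phi_s

# synthetic stations between 50 and 70 deg N, generated from known parameters
phi = np.linspace(50.0, 70.0, 40)
bf = bivoltine_fraction(phi, 2.0, 1.0, 48.0)
alpha, gamma, phi_s = fit_threshold(phi, bf, np.arange(44.0, 50.0, 0.5))
```

On these noiseless data the search recovers α ≈ 2, γ ≈ 1 and φ_s = 48°; with real station BF values it would play the role of the fits shown in Fig. 3.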
5. Summary
We developed a simple model that makes use of easily available data, can be run with relatively well-defined parameters, and agrees with field observations. Using this model on historical as well as climate scenario temperature series, Norway appears to be in a highly transient situation in which the life cycle of Ips typographus changes from univoltine to bivoltine. This shift has potentially profound effects on the spruce forest ecosystem and on forestry. With two generations per year, there will also be two attack periods on spruce annually, one in the spring and one in July/August. The situation is worsened by the fact that Norway spruce is probably more susceptible to beetle attacks later in the summer than during the current flight period in mid-May. The current study does not analyze the role of adaptation strategies of Norway spruce and Ips typographus for life cycle development and outbreak dynamics under future conditions.

Acknowledgments. We would like to thank the ICCS6 participants for interesting discussions during the conference, in particular Guy Hoelzer and Lael Parrott. This work has been supported by the Norwegian Research Council under grant no. NFR 155893/720.
References
Annila, E. (1969): Influence of temperature upon the development and voltinism of Ips typographus L. (Coleoptera, Scolytidae). Ann. Zool. Fennici 6, 161-208.
Austarå, Ø., Pettersen, H. and Bakke, A. (1977): Bivoltinism in Ips typographus in Norway, and winter mortality in second generation. Medd. Nor. Inst. Skogforsk. 33 (7), 272-281.
Baier, P., Pennerstorfer, J. and Schopf, A. (2007): PHENIPS, a comprehensive phenology model of Ips typographus (L.) (Col., Scolytinae) as a tool for hazard rating of bark beetle infestation. Forest Ecology and Management 249 (3), 171-186.
Butovitsch, V. (1938): Om granbarkborrens massförökning i södra Dalarne. Norrlands Skogsvårdsförbunds Tidskrift 1938, 91-126.
Dolezal, P. and Sehnal, F. (2007): Effects of photoperiod and temperature on the development and diapause of the bark beetle Ips typographus. Journal of Applied Entomology 131 (3), 165-173.
Hansen, E.M. and Bentz, B.J. (2003): Comparison of reproductive capacity among univoltine, semivoltine, and re-emerged parent spruce beetles (Coleoptera: Scolytidae). The Canadian Entomologist 135, 697-712.
Harding, S. and Ravn, H.P. (1985): Seasonal activity of Ips typographus in Denmark. Zeitschrift für Angewandte Entomologie 99, 123-131.
Jönsson, A.M., Harding, S., Bärring, L. and Ravn, H.P. (2007): Impact of climate change on the population dynamics of Ips typographus in southern Sweden. Agricultural and Forest Meteorology 146 (1-2), 70-81.
Logan, J.A. and Bentz, B.J. (1999): Model Analysis of Mountain Pine Beetle (Coleoptera: Scolytidae) Seasonality. Environmental Entomology 28 (6), 924-934.
Logan, J.A. and Powell, J.A. (2001): Ghost Forests, Global Warming, and the Mountain Pine Beetle (Coleoptera: Scolytidae). American Entomologist 47 (3), 160-172.
Logan, J.A., Régnière, J. and Powell, J.A. (2003): Assessing the impacts of global warming on forest pest dynamics. Frontiers in Ecology and the Environment 1 (3), 130-137.
Netherer, S. and Pennerstorfer, J. (2001): Parameters Relevant for Modelling the Potential Development of Ips typographus (Coleoptera: Scolytidae). Integrated Pest Management Reviews 6 (3-4), 177-184.
Skaugen, T.E. and Tveito, O.E. (2004): Growing-season and degree-day scenario in Norway for 2021-2050. Climate Research 26 (3), 221-232.
Wermelinger, B. and Seifert, M. (1998): Analysis of the temperature dependent development of the spruce bark beetle Ips typographus (L.) (Col., Scolytidae). Journal of Applied Entomology 122, 185-191.
Økland, B. and Bjørnstad, O.N. (2006): A resource-depletion model of forest insect outbreaks. Ecology 87 (2), 283-290.
Chapter 8

A Formal Analysis of Complexity Monotonicity

Tibor Bosse, Alexei Sharpanskykh, and Jan Treur
Vrije Universiteit Amsterdam, Department of Artificial Intelligence
{tbosse, sharp, treur}@cs.vu.nl
1 Introduction

Behaviour of organisms can occur in different types and complexities, varying from very simple behaviour to more sophisticated forms. Depending on the complexity of the externally observable behaviour, the internal mental representations and capabilities required to generate the behaviour also show a large variety in complexity. From an evolutionary viewpoint, for example, Wilson [1992] and Darwin [1871] point out how the development of behaviour relates to the development of more complex cognitive capabilities. Godfrey-Smith [1996, p. 3] assumes a relationship between the complexity of the environment and the development of mental representations and capabilities. He formulates the main theme of his book in condensed form as follows: 'The function of cognition is to enable the agent to deal with environmental complexity' (the Environmental Complexity Thesis). In this paper, this thesis is refined as follows:

• the more complex the environment, the more sophisticated is the behaviour required to deal with this environment;
• the more sophisticated the behaviour, the more complex are the mental representations and capabilities needed.

This refined thesis will be called the Complexity Monotonicity Thesis. The idea is that, to deal with the physical environment, the evolution process has generated and still generates a variety of organisms that show new forms of behaviour. These new forms of behaviour are the result of new architectures of organisms, including cognitive systems with mental representations and capabilities of various degrees of complexity. The occurrence of such more complex architectures for organisms, and the induced more complex behaviour, itself increases the complexity of the environment during the evolution process.
New organisms that have to deal with the behaviour of such already occurring organisms live in a more complex environment, and therefore need more complex behaviour to deal with this environment, (to be) realised by an architecture with again more complex mental capabilities. In particular, more complex environments often require taking into account more complex histories, which requires more complex internal cognitive representations and dynamics, by which more complex behaviour is generated. This perspective generates a number of questions. First, how can the Complexity Monotonicity Thesis be formalised, and in particular how can the 'more complex' relation be formalised for (1) the environment, (2) externally observable agent
behaviour and (3) internal cognitive dynamics? Second, connecting the three items, how can one formalise (a) when a behaviour fits an environment: which types of externally observable behaviour are sufficient to cope with which types of environments, and (b) when a cognitive system generates a certain behaviour: which types of internal cognitive dynamics are sufficient to generate which types of externally observable agent behaviour? In this paper these questions are addressed from a dynamics perspective, and formalised by a temporal logical approach. Complexity of the dynamics of the environment, the externally observable agent behaviour and the internal cognitive system is formalised in terms of the structure of the formalised temporal specifications describing them, thus answering (1) to (3). Moreover, (a) and (b) are addressed by establishing formalised logical (entailment) relations between the respective temporal specifications. Furthermore, four cases of an environment, suitable behaviour and realising cognitive system are analysed and compared with respect to complexity, thus testing the Complexity Monotonicity Thesis. More details of this study can be found in [Bosse et al. 2008].
2 The Complexity Monotonicity Thesis

The environment imposes certain requirements that an organism's behaviour needs to satisfy; these requirements change due to changing environmental circumstances. The general pattern is as follows. Suppose a certain goal G for an organism (e.g., sufficient food uptake over time) is reached under certain environmental conditions ES1 (Environmental Specification 1), due to its Behavioural Specification BS1, realised by its internal architecture CS1 (Cognitive Specification 1). In other words, the behavioural properties BS1 are sufficient to guarantee G under environmental conditions ES1, formally ES1 & BS1 ⇒ G, and the internal dynamics CS1 are sufficient to guarantee BS1, formally CS1 ⇒ BS1. In other environmental circumstances, described by environmental specification ES2 (for example, more complex), the old circumstances ES1 may no longer hold, so that the goal G may no longer be reached by behavioural properties BS1. An environmental change from ES1 to ES2 may entail that behaviour BS1 becomes insufficient. It has to be replaced by new behavioural properties BS2 (also more complex) which express how goal G can be achieved under environment ES2, i.e., ES2 & BS2 ⇒ G. Thus, a population is challenged to realise such behaviour BS2 by changing its internal architecture and its dynamics, and as a consequence fulfil goal G again. This challenge expresses a redesign problem: the given architecture of the organism as described by CS1 (which entails the old behavioural specification BS1) is insufficient to entail the new behavioural requirements BS2 imposed by the new environmental circumstances ES2; the evolution process has to redesign the architecture into one with internal dynamics described by some CS2 (also more complex), with CS2 ⇒ BS2, to realise the new requirements on behaviour. The Complexity Monotonicity Thesis can be formalised in the following manner.
Suppose <E1, B1, C1> and <E2, B2, C2> are triples of environment, behaviour and
cognitive system, respectively, such that the behaviours Bi are adequate for the respective environments Ei and realised by the cognitive systems Ci. Then the Complexity Monotonicity Thesis states that

E1 ≤c E2 ⇒ B1 ≤c B2  &  B1 ≤c B2 ⇒ C1 ≤c C2

Here ≤c is a partial ordering in complexity, where X ≤c Y indicates that Y is at least as complex as X. A special case is when the complexity ordering is assumed to be a total ordering, where for every two elements X, Y either X ≤c Y or Y ≤c X (i.e., they are comparable), and when some complexity measure cm is available, assigning degrees of complexity to environments, behaviours and cognitive systems, such that

X ≤c Y ⇔ cm(X) ≤ cm(Y)

where ≤ is the standard ordering relation on (real or natural) numbers. In this case the Complexity Monotonicity Thesis can be reformulated as

cm(E1) ≤ cm(E2) ⇒ cm(B1) ≤ cm(B2)  &  cm(B1) ≤ cm(B2) ⇒ cm(C1) ≤ cm(C2)

The Complexity Monotonicity Thesis can be used to explain the increase of complexity during evolution in the following manner. Make the following assumption on Addition of Environmental Complexity by Adaptation: 'adaptation of a species to an environment adds complexity to this environment'. Suppose an initial environment is described by ES0, and the adapted species by BS0. Then this transforms ES0 into a more complex environmental description ES1. Based on ES1, the adapted species will have description BS1. As ES1 is more complex than ES0, by the Complexity Monotonicity Thesis it follows that this BS1 is more complex than BS0: ES0 ≤c ES1 ⇒ BS0 ≤c BS1. Therefore BS1 again adds complexity to the environment, leading to ES2, which is more complex than ES1, et cetera. This argument shows that the increase of complexity during evolution can be related to and explained by two assumptions: the Complexity Monotonicity Thesis, and the Addition of Environmental Complexity by Adaptation assumption.
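In the total-ordering case the thesis reduces to two numeric monotonicity checks over observed triples. A minimal sketch (not part of the original paper) in Python, using the cm values reported for the four cases analysed in the case study of this paper:

```python
# Checking the Complexity Monotonicity Thesis for a total complexity
# ordering given by a numeric measure cm.
# Triples (cm(E), cm(B), cm(C)) taken from the four analysed cases.
triples = [
    (262, 85, 85),    # stimulus-response
    (345, 119, 152),  # delayed response
    (387, 234, 352),  # goal-directed
    (661, 476, 562),  # learning
]

def monotonic(triples):
    """True iff cm(E1) <= cm(E2) implies cm(B1) <= cm(B2), and
    cm(B1) <= cm(B2) implies cm(C1) <= cm(C2), over all pairs of triples."""
    for (e1, b1, c1) in triples:
        for (e2, b2, c2) in triples:
            if e1 <= e2 and not b1 <= b2:
                return False
            if b1 <= b2 and not c1 <= c2:
                return False
    return True

print(monotonic(triples))  # -> True
```

A counterexample triple set (e.g. a more complex environment paired with a simpler behaviour) would make the check fail, which is exactly what the thesis rules out.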
This paper focuses on the former assumption.
3 Variations in Behaviour and Environment

To evaluate the approach put forward, a number of cases of increasing complexity are analysed, starting from very simple stimulus-response behaviour, which depends solely on the stimuli the agent gets as input at a given point in time. This can be described by a very simple temporal structure: direct associations between the input state at one time point and the (behavioural) output state at a next time point. A next class of behaviours analysed, with slightly higher complexity, is delayed response behaviour: behaviour that not only depends on the current stimuli, but may also depend on input of the agent in the past. This pattern of behaviour cannot be described by direct functional associations between one input state and one output state; it increases temporal complexity compared to stimulus-response behaviour. For this case, the description relating input states and output states necessarily needs a reference to inputs received in the past. Viewed from an internal perspective, to describe mental capabilities generating such behaviour, it is often assumed that it involves a
memory in the form of an internal model of the world state. Elements of this world state model mediate between the agent's input and output states. Other types of behaviour go beyond the types of reactive behaviour sketched above, for example, behaviour that depends in a more indirect manner on the agent's input in the present or in the past. Observed from the outside, this behaviour seems to come from within the agent itself, since no direct relation to current inputs is recognised. It may suggest that the agent is motivated by itself or acts in a goal-directed manner. For a study of goal-directed behaviour and foraging, see, for example, [Hill 2006]. Goal-directed behaviour to search for invisible food is a next case of behaviour analysed. In this case the temporal description of the externally observable behavioural dynamics may become still more complex, as it has to take into account more complex temporal relations to (more) events in the past, such as the positions already visited during a search process. Also the internal dynamics may become more complex. To describe mental capabilities generating such a type of behaviour from an internal perspective, a mental state property goal can be used. A goal may depend on a history of inputs. Finally, a fourth class of behaviour analysed, which also goes beyond reactive behaviour, is learning behaviour (e.g., conditioning). In this case, depending on its history comprising a (possibly large) number of events, the agent's externally observable behaviour is tuned. As this history of events may relate to several time points during the learning process, this again adds temporal complexity to the specifications of the behaviour and of the internal dynamics. To analyse these four different types of behaviour in more detail, four cases of a food-supplying environment are considered, in which suitable food gathering behaviours are needed.
These cases are chosen in such a way that they correspond to the types of behaviour mentioned above. For example, in case 1 it is expected that stimulus-response behaviour is sufficient to cope with the environment, whilst in cases 2, 3 and 4, respectively, delayed response behaviour, goal-directed behaviour, and learning behaviour is needed. The basic setup is inspired by experimental literature in animal behaviour such as [Tinklepaugh 1932]. The world consists of a number of positions which have distances to each other. The agent can walk over these positions. Time is partitioned into fixed periods (days) of a duration of d time units (hours). Every day the environment generates food at certain positions, but this food may or may not be visible, accessible and persistent at given points in time. The different types of environment with increasing temporal complexity considered are:

(1) Food is always visible and accessible. It persists until it is taken.
(2) Food is visible at least at one point in time and accessible at least at one later time point. It persists until it is taken.
(3) Food either is visible at least at one point in time and accessible at least at one later time point, or it is invisible and accessible the whole day. It persists until it is taken.
(4) One of the following cases holds:
a) Food is visible at least at one point in time and accessible at least at one later time point. It persists until it is taken.
b) Food is invisible and accessible the whole day. It persists until it is taken.
c) Food pieces can disappear, and later new pieces can appear, possibly at different positions. For every position where food appears, there are at least three different pieces in one day. Each piece that is present is visible. Each position will be accessible at least after the second food piece has disappeared.
4 Modelling Approach

For describing different variations in behaviour of an agent and environment, a formal modelling approach is needed. The simplest type of behaviour, stimulus-response behaviour, can be formalised by a functional input-output association, i.e., a (mathematical) function F : InputStates → OutputStates from the set of possible input states to the set of possible output states. A state at a certain point in time, as it is used here, is an indication of which of the state properties of the system and its environment are true (hold) at that time point. Note that according to this formalisation, stimulus-response behaviour is deterministic. Behaviour of this type does not depend on earlier processes, nor on (non-observable) internal states. If non-deterministic behaviour is also taken into account, the function in the definition above can be replaced by a relation between input states and output states, which relates each input state to a number of behavioural alternatives, i.e., R : InputStates × OutputStates. For example, a simple behaviour of an animal that, after seeing food at position p, goes to this position on condition that no obstacles are present, can be formalised using a functional association between an input state in which it sees food at p and no obstacles, and an output state in which it goes to p. As opposed to stimulus-response behaviour, in less simple cases an agent's behaviour often takes into account previous processes in which it was involved; for example, an agent that observed food in the past at position p may still go to p, although it does not observe it in the present.
Instead of a description as a function or relation from the set of possible input states to the set of possible output states, in more general cases a more appropriate description of behaviour, by an input-output correlation, is given in the following definition:

a) A trace (or trajectory) is defined as a time-indexed sequence of states, where time points can be expressed, for example, by real or integer values. If these states are input states, such a trace is called an input trace; similarly for an output trace. An interaction trace is a trace of (combined) states consisting of an input part and an output part.
b) An input-output correlation is defined as a binary relation C : Input_traces × Output_traces between the set of possible input traces and the set of possible output traces.
c) A behavioural specification S is a set of dynamic properties in the form of temporal statements on interaction traces.
d) A given interaction trace T fulfils or satisfies a behavioural specification S if all dynamic properties in S are true for the interaction trace T.
e) A behavioural specification S is a specification of an input-output correlation C if and only if, for all interaction traces T, the input-output correlation C holds for T if and only if T fulfils S.
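Definitions (a), (c) and (d) above can be sketched directly: a behavioural specification is a set of dynamic properties, and fulfilment is satisfaction of all of them. An illustrative encoding (our own simplification: discrete time, interaction states as pairs of token sets, properties as Python predicates):

```python
# Sketch of traces and behavioural specifications. An interaction trace maps
# integer time points to (input part, output part) pairs of atoms.
def fulfils(trace, spec):
    """A trace fulfils a specification iff every dynamic property holds."""
    return all(prop(trace) for prop in spec)

# A toy dynamic property: whenever 'food_seen' is in the input state at t
# (and t+1 is in the trace), 'goto_food' is in the output state at t+1.
def responds_to_food(trace):
    return all("goto_food" in trace[t + 1][1]
               for t in trace
               if "food_seen" in trace[t][0] and t + 1 in trace)

trace = {0: ({"food_seen"}, set()),
         1: (set(), {"goto_food"}),
         2: (set(), set())}
print(fulfils(trace, {responds_to_food}))  # -> True
```

A specification with several properties is just a larger set passed to `fulfils`; this mirrors definition (d) exactly.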
To express formal specifications for environmental, behavioural and cognitive dynamics of agents, the Temporal Trace Language (TTL, see [Bosse et al. 2006]) is used. This language is a variant of order-sorted predicate logic. In dynamic property
expressions, TTL allows explicit references to time points and traces. If a is a state property, then, for example, state(γ, t, input(agent)) ⊨ a denotes that this state property holds in trace γ at time point t in the input state of the agent. Based on such building blocks, dynamic properties can be formulated. For example, a dynamic property describing stimulus-response behaviour of an agent that goes to observed food can be formalised as follows:

∀t ∀x ∀p ∀p' [ state(γ, t, input(agent)) ⊨ observed(at(agent, p)) ∧ observed(at(food(x), p')) ∧ observed(accessible(p'))
⇒ state(γ, t+1, output(agent)) ⊨ performing_action(goto(p')) ]

Using this approach, the four variations in behaviour and environment have been formalised in detail. The results can be found in [Bosse et al. 2008].
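As an illustration of how such a dynamic property can be checked against a finite interaction trace, here is a sketch; the string encoding of atoms and the function names are our own, and this is a re-encoding of the TTL statement, not the TTL toolset itself:

```python
# Sketch: evaluating the stimulus-response property above on a finite trace,
# with states represented as sets of ground atoms (strings).
def holds(trace, t, atom):
    return atom in trace.get(t, set())

def goes_to_observed_food(trace, positions, foods):
    """For all t, x, p, p': if the agent is at p and observes accessible
    food x at p', then at t+1 it performs goto(p')."""
    for t in trace:
        for p in positions:
            for p2 in positions:
                for x in foods:
                    if (holds(trace, t, f"observed(at(agent,{p}))")
                            and holds(trace, t, f"observed(at(food({x}),{p2}))")
                            and holds(trace, t, f"observed(accessible({p2}))")):
                        if not holds(trace, t + 1,
                                     f"performing_action(goto({p2}))"):
                            return False
    return True

trace = {0: {"observed(at(agent,p1))", "observed(at(food(f1),p2))",
             "observed(accessible(p2))"},
         1: {"performing_action(goto(p2))"}}
print(goes_to_observed_food(trace, ["p1", "p2"], ["f1"]))  # -> True
```

The explicit quantification over time points, positions and food items mirrors the ∀-prefix of the TTL formula; over finite traces and finite sorts, evaluation is a straightforward nested loop.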
5 Formalisation of Temporal Complexity

The Complexity Monotonicity Thesis discussed earlier involves environmental, behavioural and cognitive dynamics of living systems. In an earlier section it was shown that, based on a given complexity measure cm, this thesis can be formalised in the following manner:

cm(E1) ≤ cm(E2) ⇒ cm(B1) ≤ cm(B2)  &  cm(B1) ≤ cm(B2) ⇒ cm(C1) ≤ cm(C2)

where <E1, B1, C1> and <E2, B2, C2> are triples of environments, behaviours and cognitive systems, respectively, such that the behaviours Bi are adequate for the respective environments Ei and realised by the cognitive systems Ci. What remains is the existence or choice of the complexity measure function cm. To measure degrees of complexity for the three aspects considered, a temporal perspective is chosen: complexity in terms of the temporal relationships describing them. For example, if references have to be made to a larger number of events that happened at different time points in the past, the temporal complexity is higher. The temporal relationships have been formalised in the temporal language TTL based on predicate logic. This translates the question of how to measure complexity into the question of how to define the complexity of syntactical expressions in such a language. In the literature an approach is available to define the complexity of expressions in predicate logic in general, by defining a function that assigns a size to every expression [Huth and Ryan 2000]. To measure complexity, this approach was adopted and specialised to the case of the temporal language TTL. Roughly speaking, the complexity (or size) of an expression is (recursively) calculated as the sum of the complexities of its components plus 1 for the composing operator. In more detail it runs as follows. Similarly to standard predicate logic, predicates in TTL are defined as relations on terms. The size of a TTL-term t is a natural number s(t) recursively defined as:

(1) s(x) = 1, for all variables x.
(2) s(c) = 1, for all constant symbols c.
(3) s(f(t1, ..., tn)) = s(t1) + ... + s(tn) + 1, for all function symbols f.

For example, the size of the term observed(not(at(food(x), p))) from the property BP1 (see [Bosse et al. 2008]) is equal to 6.
Furthermore, the size of a TTL-formula φ is a positive natural number s(φ) recursively defined as follows:

(1) s(p(t1, ..., tn)) = s(t1) + ... + s(tn) + 1, for all predicate symbols p.
(2) s(¬φ) = s((∀x)φ) = s((∃x)φ) = s(φ) + 1, for all TTL-formulae φ and variables x.
(3) s(φ & χ) = s(φ ∨ χ) = s(φ ⇒ χ) = s(φ) + s(χ) + 1, for all TTL-formulae φ, χ.
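The two size functions are plain structural recursions. A sketch (our own encoding of terms and formulae as nested tuples, not the paper's implementation), reproducing the size 6 computed above for observed(not(at(food(x), p))):

```python
# Size measure by structural recursion. A variable or constant is a string;
# a compound term or atom is (symbol, arg1, ...); formulae use the operator
# tags 'not', 'forall', 'exists', 'and', 'or', 'implies'.
def size_term(t):
    if isinstance(t, str):                        # variable or constant: 1
        return 1
    return 1 + sum(size_term(a) for a in t[1:])   # function symbol + args

def size_formula(f):
    op = f[0]
    if op in ("not", "forall", "exists"):         # unary op / quantifier: +1
        return 1 + size_formula(f[-1])
    if op in ("and", "or", "implies"):            # binary connective: +1
        return 1 + size_formula(f[1]) + size_formula(f[2])
    return 1 + sum(size_term(a) for a in f[1:])   # atomic predicate

# The term from property BP1: observed(not(at(food(x), p)))
t = ("observed", ("not", ("at", ("food", "x"), "p")))
print(size_term(t))  # -> 6
```

In this encoding the formula size of, say, p(x) & ¬q(y) works out to 1 + 2 + 3 = 6, matching rules (1)-(3).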
In this way, for example, the complexity of behavioural property BP1 amounts to 53, and the complexity of behavioural property BP2 is 32. As a result, the complexity of the complete behavioural specification for the stimulus-response case (which is determined by BP1 & BP2) is 85 (see [Bosse et al. 2008] for the properties). Using this formalisation of a complexity measure, the complexity measures for environmental, behavioural and internal cognitive dynamics for the considered cases of stimulus-response, delayed response, goal-directed and learning behaviours have been determined. Table 1 provides the results.

Table 1. Temporal complexity of environmental, behavioural and cognitive dynamics.

Case                Environmental dynamics   Behavioural dynamics   Cognitive dynamics
Stimulus-response   262                      85                     85
Delayed response    345                      119                    152
Goal-directed       387                      234                    352
Learning            661                      476                    562
The data given in Table 1 confirm the Complexity Monotonicity Thesis put forward in this paper: the more complex the environmental dynamics, the more complex the types of behaviour an organism needs to deal with the environmental complexity; and the more complex the behaviour, the more complex the internal cognitive dynamics.
6 Discussion

In this paper, the temporal complexity of environmental, behavioural and cognitive dynamics, and their mutual dependencies, were explored. As a refinement of Godfrey-Smith's [1996] Environmental Complexity Thesis, the Complexity Monotonicity Thesis was formulated: for more complex environments, more complex behaviours are needed, and more complex behaviours need more complex internal cognitive dynamics. A number of example scenarios were formalised in a temporal language, and the complexity of the different formalisations was measured. Complexity of environment, behaviour and cognition was taken as the temporal complexity of the dynamics of these three aspects, and the formalisation of the measurement of this temporal complexity was based on the complexity of the syntactic expressions characterising these dynamics in a predicate logic language, as known from, e.g., [Huth and Ryan 2000]. The outcome is that the results confirm the Complexity Monotonicity Thesis. In [Godfrey-Smith 1996], in particular in chapters 7 and 8, mathematical models are discussed to support his Environmental Complexity Thesis, following, among others, [Sober 1994]. These models are made at an abstract level, abstracting from the
temporal dimension of the behaviour and the underlying cognitive architectures and processes. Therefore, the more detailed temporal complexity as addressed in this paper is not covered. Based on the model considered, Godfrey-Smith [1996, Ch. 7, p. 216; see also p. 118] concludes that the flexibility to accommodate behaviour to environmental conditions, as offered by cognition, is favoured when the environment shows (i) unpredictability in distal conditions of importance to the organism, and (ii) predictability in the links between (observable) proximal and distal conditions. This conclusion has been confirmed to a large extent by the formal analysis described in this paper. Comparable claims on the evolutionary development of learning capabilities in animals are made by authors such as Stephens [1991]. According to these authors, learning is an adaptation to environmental change. All these are conclusions at a global level, compared to the more detailed types of temporal complexity considered here, where cognitive processes and behaviour extend over time, and their complexity can be measured in a detailed manner as the temporal complexity of their dynamics.
Bibliography

[1] Bosse, T., Jonker, C.M., Meij, L. van der, Sharpanskykh, A., & Treur, J. (2006). Specification and Verification of Dynamics in Cognitive Agent Models. In: Proceedings of the Sixth Int. Conf. on Intelligent Agent Technology, IAT'06. IEEE Computer Society Press, 247-255.
[2] Bosse, T., Sharpanskykh, A., & Treur, J. (2008). On the Complexity Monotonicity Thesis for Environment, Behaviour and Cognition. In: Baldoni, M., Son, T.C., Riemsdijk, M.B. van, and Winikoff, M. (eds.), Proc. of the Fifth Int. Workshop on Declarative Agent Languages and Technologies, DALT'07. Lecture Notes in AI, vol. 4897. Springer Verlag, 2008, pp. 175-192.
[3] Darwin, C. (1871). The Descent of Man. John Murray, London.
[4] Godfrey-Smith, P. (1996). Complexity and the Function of Mind in Nature. Cambridge University Press.
[5] Hill, T.T. (2006). Animal Foraging and the Evolution of Goal-Directed Cognition. Cognitive Science, vol. 30, pp. 3-41.
[6] Huth, M. & Ryan, M. (2000). Logic in Computer Science: Modelling and Reasoning about Computer Systems. Cambridge University Press.
[7] Sober, E. (1994). The adaptive advantage of learning versus a priori prejudice. In: From a Biological Point of View. Cambridge University Press, Cambridge.
[8] Stephens, D. (1991). Change, regularity and value in evolution of animal learning. Behavioral Ecology, vol. 2, pp. 77-89.
[9] Tinklepaugh, O.L. (1932). Multiple delayed reaction with chimpanzees and monkeys. Journal of Comparative Psychology, vol. 13, pp. 207-243.
[10] Wilson, E.O. (1992). The Diversity of Life. Harvard University Press, Cambridge, Massachusetts.
Chapter 9
Complex Features in Lotka-Volterra Systems with Behavioral Adaptation

Claudio Tebaldi¹ and Deborah Lacitignola²
¹Department of Mathematics, Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, Italy
²Department of Mathematics, University of Lecce, Via Prov. Lecce-Arnesano, Lecce, Italy
[email protected][email protected]
1.1. Introduction

Lotka-Volterra systems have played a fundamental role in mathematical modelling in many branches of theoretical biology and have proved to describe, at least qualitatively, the essential features of many phenomena; see for example Murray [Murray 2002]. Furthermore, models of that kind have also been considered successfully in quite different and less mathematically formalized contexts: Goodwin's model of economic growth cycles [Goodwin 1967] and urban dynamics [Dendrinos 1992] are only two of a number of examples. Such systems can certainly be defined as complex ones, and in fact the aim of modelling was essentially to clarify mechanisms rather than to provide actual precise simulations and predictions. With regard to complex systems, we recall that one of their main features, whatever specific definition one has in mind, is
adaptation, i.e. the ability to adjust.
Lotka-Volterra systems are a large class of models for interaction among species. Depending on such interactions, competition, cooperation or predator-prey situations can occur, giving rise to further classifications. The dynamics depends on parameters intrinsic to the species, typically growth rate and carrying capacity, and on the coefficients of interaction among the species, which however are often more difficult to specify. Here we focus on competition among species and, differently from the classical case, we consider for them a kind of "learning skill": the ability to compete is proportional to the average number of contacts between species in their past, with a weak exponential delay kernel providing a "fade-out" memory effect. Adaptation in such a form is shown to be a mechanism able to establish the appearance of a variety of behaviors different from equilibria, such as distinct kinds of oscillations and chaotic patterns. Furthermore, even for given parameter values, the system can show striking features of multiplicity of attractors. This kind of "complexity" comes out as collective behavior emerging from the interactions among the species involved.
1.2. The model

We consider the general competitive Lotka-Volterra system for n species

    dN_i/dt = r_i [1 − N_i/k_i] N_i − Σ_{j≠i} α_ij N_i N_j        (1)

with

    α_ij(t) = ∫_{−∞}^{t} N_i(u) N_j(u) K_T(t − u) du,    1 ≤ i, j ≤ n, j ≠ i.        (2)
N_i(t) denotes the density of the i-th species at time t; the positive parameters r_i and k_i stand respectively for the intrinsic growth rate and the carrying capacity of the i-th species. The positive continuous function α_ij represents the interaction coefficient between species j and species i. The delay kernel K_T is chosen as in [Noonburg 1986],

    K_T(t) = e^(−t/T) / T,

as it provides a reasonable effect of short term memory. In this case, the set of integro-differential equations (1)-(2) is equivalent to the following set of ordinary differential equations [Lacitignola & Tebaldi 2005]:
    dN_i/dt = r_i [1 − c_i N_i] N_i − Σ_{j≠i} α_ij N_i N_j
    dα_ij/dt = (1/T) [ N_i N_j − α_ij ],    1 ≤ i, j ≤ n, j ≠ i,        (3)

where c_i = 1/k_i, i = 1, ..., n.
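With the exponential kernel, each α_ij relaxes towards the current contact rate N_i N_j on the time scale T, which is what makes the integro-differential system reducible to ODEs. A minimal numerical sketch (not from the paper; forward Euler, all parameter values illustrative):

```python
# Sketch: integrating the adaptive competition system, i.e. equations (1)
# with the memory (2) replaced by its equivalent ODE
# d(alpha_ij)/dt = (N_i*N_j - alpha_ij)/T. Illustrative parameters only.

def step(N, alpha, r, k, T, dt):
    """One forward-Euler step for densities N and interaction matrix alpha."""
    n = len(N)
    dN = [r[i] * (1 - N[i] / k[i]) * N[i]
          - N[i] * sum(alpha[i][j] * N[j] for j in range(n) if j != i)
          for i in range(n)]
    dA = [[(N[i] * N[j] - alpha[i][j]) / T if i != j else 0.0
           for j in range(n)] for i in range(n)]
    return ([N[i] + dt * dN[i] for i in range(n)],
            [[alpha[i][j] + dt * dA[i][j] for j in range(n)] for i in range(n)])

n, T, dt = 4, 26.5, 0.01
r = [0.2, 1.0, 1.0, 1.0]       # the first species reproduces more slowly
k = [1.0] * n                  # equal carrying capacities
N = [0.5, 0.4, 0.6, 0.3]       # initial densities
alpha = [[0.1 if i != j else 0.0 for j in range(n)] for i in range(n)]
for _ in range(10000):         # integrate over 100 time units
    N, alpha = step(N, alpha, r, k, T, dt)
print([round(x, 3) for x in N])
```

For serious exploration of the attractors discussed below, an adaptive-step integrator would be preferable to forward Euler; the sketch only illustrates the structure of system (3).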
Having the aim to discuss the role of the interactions, we consider species with the same adaptation rate T and the same carrying capacity k, except for one. Such a model allows one to investigate the connectivity problem [May 1973], strictly related to the role of interactions, while also starting to take into account some kind of species differentiation. The n-species system with r_i = 1, c_i = c, T_i = T, for all i = 1, ..., n has been extensively investigated [Barone & Tebaldi 1998], [Barone & Tebaldi 2000]: even in the presence of such strong symmetry, i.e. when all the n species are characterized by the same ecological parameters, the system is able to provide patterns in which the species are differentiated. Coexistence can appear as dominance of one species over the others through a variety of forms, i.e. equilibria, periodic oscillations or even strange attractors. In this symmetric case, the existence of a family of invariant subspaces has been shown and a 4-dimensional reduced model introduced, with n as a parameter. Such a reduced model is proven to give a full account of the existence and stability of the equilibria in the complete system. Correspondence between the reduced model and the complete one has been found for a large range of parameter values also in the time dependent regimes, even in the presence of strange attractors. Such striking reduction results, also with multiplicity of attractors, very useful in the study of competition phenomena involving a large number of species, are a consequence of the symmetry properties of the system. It was along this line, to clarify this aspect, that we have chosen to differentiate some species on the ground of both the characterizing parameters, carrying capacity and intrinsic growth rate [Lacitignola & Tebaldi 2003].
The analysis of the equilibria of (3) has been completely described according to the size of the ecological advantage or disadvantage of the first species: the case c_1 ≪ c exhibits the richest variety of equilibria, which have been investigated in full detail in [Lacitignola & Tebaldi 2004], also describing the phenomenology after their destabilization. The existence of a certain class of invariant subspaces for system (3) allows, also in this case, the introduction of a 7-dimensional reduced model, where n appears as a parameter: striking reduction properties are therefore still maintained [Lacitignola & Tebaldi 2005].
In this study, we focus on some interesting aspects of time dependent regimes and provide an example of coexistence in the form of a complicated alternation between chaotic and periodic behavior, in both cases with multiplicity of attractors.
1.2.1. The Equilibria

Investigations of the structure and properties of the equilibria of (3) can be efficiently performed making use of the reduced model. Recalling the symmetry properties of the system, we remark that any solution of this reduced model corresponds in general to (n−1) such solutions in the complete system (3). Choosing the time scale, it is assumed r = 1, observing that the condition r_1 = 1 means equal reproduction rates for all the species, whereas r_1 < 1 or r_1 > 1 indicates that the first species reproduces respectively more slowly or faster than the remaining ones. In the reduced model we have at most five interior fixed points, i.e. with all non-zero components, depending on the parameters r_1, c_1 and c, namely
R, S and S*, each characterized by the value of X_1 and by the common value of the remaining components, and B, B*, characterized by the two values b_1 and b_2 of their components.
As a consequence, in the complete system we have the three internal equilibria R, S and S*, the (n-1) equilibria B_i, whose i-th component takes the value b_1 while the remaining components share a common value, and the (n-1) equilibria B*_i, defined analogously with b_2,
where 2 ≤ i ≤ n. To characterize the equilibria, we report only the n-tuple of the X_i's, since a_ij = X_i X_j at equilibrium. We also stress that any critical point of (3) with one or more components X_i = 0 is not considered here, since it is unstable for all values of the parameters. While the structure of the equilibria is essentially the same as in the symmetric case, their features depend on both the level of differentiation of the first species and the ecological conditions of the remaining species, while their stability properties depend on the adaptation parameter T.
1.3. Complex Behavior in the Time Dependent Regimes
In this section we focus on the adaptive competition among four species and discuss an interesting example of complex behavior which arises in the time dependent regimes as an effect of adaptation. We consider the following intervals for the relevant parameters: 0.1 ≤ r_1 ≤ 3.5, 0.01 ≤ c_1 ≤ 0.2, 0.2 ≤ c ≤ 0.8, 0 < T < 180. The equilibria for system (3) are represented in Figure 1, where the parameter plane (c, T) is divided into four regions.
Figure 1. The equilibria for the case c_1 ≪ c, c_1 = 0.01.
As presented in [Lacitignola & Tebaldi 2004], Region 1 has the richest phenomenology, in particular after the equilibria B_i's have become unstable [Lacitignola & Tebaldi 2006]. We recall that the B_i's are coexistence equilibria with a strong dominance of the i-th species and, in this region, they lose stability at T = T*_B(c, r_1) either by supercritical or subcritical Hopf bifurcations. The case c = 0.35 provides a variety of interesting dynamical patterns: for T > T*_B and initial conditions near B_i, according to the value of r_1, it is possible to have exclusion by the first species, i.e. the equilibrium S, or species coexistence in the form of periodic or complicated patterns. The most interesting phenomena are obtained when T is varied and 0.1 < r_1 < 0.42; the case r_1 = 0.2 is chosen as representative of the above range and results are presented for investigations in the time dependent regimes, after destabilization of the equilibrium B_2. For this range of parameter values, such an equilibrium loses its stability at T = T*_B via a subcritical Hopf bifurcation and, as in the Lorenz system, for a set of initial conditions near B_2 the system exhibits complicated behavior, showing a chaotic attractor surrounding this unstable fixed point. In Figure 2 the chaotic attractor is shown for T > T*_B, i.e. T = 26.5.
Figure 2. The strange attractor in the (X_1, X_2, X_3) phase space for the case c = 0.35, r_1 = 0.2 and T = 26.5.
Increasing further the value of the parameter T, a stable order-7 cycle is found, where we denote as an order-m cycle a periodic solution having m spikes around B_2. Such a cycle persists up to T = 29.3, when it is replaced by a chaotic attractor, Figure 3.
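The notion of an order-m cycle can be made operational by counting spikes (local maxima) of one coordinate over a period. A hedged sketch, using a synthetic signal in place of the integrated X_1(t) trajectory:

```python
import math

# Illustrative spike counter for classifying an "order-m cycle":
# count strict local maxima in one sampled period of a scalar signal.
# The signal below is synthetic, standing in for a trajectory near B_2.
def count_spikes(series):
    """Count strict interior local maxima of a sampled signal."""
    return sum(
        1 for k in range(1, len(series) - 1)
        if series[k - 1] < series[k] > series[k + 1]
    )

# A synthetic period with 7 oscillations, mimicking an order-7 cycle.
N = 1400
signal = [math.sin(2 * math.pi * 7 * k / N) for k in range(N)]
print(count_spikes(signal))  # -> 7
```

In practice the same counter, applied to successive periods of the integrated reduced model, distinguishes the order-7, order-6, order-5 and order-4 windows described below.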
Figure 3. Projection in the (X_1, X_2, X_3) phase space: (a) the order-7 cycle at T = 28.9; (b) the chaotic attractor at T = 29.3.
At T = 29.5 such an attractor gives way to a stable order-6 cycle, which persists up to T = 30.5, when a chaotic attractor is found, Figure 4.
Figure 4. The order-6 cycle at T = 29.5 and the chaotic attractor at T = 30.5.
This phenomenology reveals an interesting alternation of chaotic and periodic windows which are strictly linked to each other. Progressively increasing the value of T, the shape of the cycle is changed by removing, step by step, a loop around B_2; equivalently, the period decreases because of the disappearance of a spike. We show the occurrence of such an alternation up to the order-4 cycle, Figure 5; investigations are still in progress to clarify the system behavior when T is further increased, also with the use of continuation analysis tools. Figures 3-5 provide an interesting example of a period-adding phenomenon as reported in [Ott 1993] when read for decreasing T, strictly related to the chaotic windows. The behavior described above has been found making use of the reduced model, which considerably diminishes the computational effort required. In all cases, persistence of the results in the complete model has been checked; as a consequence, three cycles or strange attractors of the kind shown are present in the system, starting close to each B_i.
Figure 5. The order-5 cycle at T = 31.1, the chaotic attractor at T = 31.7 and the order-4 cycle at T = 33.
The symmetry of the equilibrium B_i is however maintained: in fact, possibly after a transient, it turns out that X_j(t) = X_h(t) for j, h ≠ 1, i, with striking properties of (partial) synchronization even in the chaotic regimes. An interesting point under investigation is the eventual breaking of such a symmetry.
1.4. Conclusions
We have studied a competitive n-species Lotka-Volterra system with behavioral adaptation in which one species is differentiated with respect to the others by carrying capacity and intrinsic growth rate. A 7-dimensional reduced model is obtained, where n appears as a parameter, which gives a full account of the existence and stability of equilibria for the complete system and is also effective in describing the time dependent
regimes for a large range of parameter values. Such a reduced model can be very useful for the study of adaptive competition involving a large number of species, since the computational effort required is greatly reduced. We have presented interesting aspects of the phenomenology for certain values of the carrying capacities, c and c_1, and of the intrinsic growth rate r_1, when the parameter T, characterizing behavioral adaptation, is varied. According to the value of the parameter r_1, we have found, as expected, exclusion by the most advantaged species as one of the possible outcomes. However, because of adaptation, coexistence among the species is also possible, and in different forms: equilibria, periodic oscillations or even strange attractors. This aspect is interesting because it addresses one of the main criticisms of the classical competitive Lotka-Volterra systems, namely the fact that they lead to exclusion of species far more often than is observed. Furthermore, the complex phenomenology presented here also makes contact with the varied behavior observed in nature, especially when the number of species with relevant interactions is not very small. Going back to Goodwin's model of the economic growth cycle, the introduction of behavioral adaptation along the same lines discussed here [Colacchio, Sparro & Tebaldi, submitted] has substantially extended the validity of the model and provided a rich phenomenology more closely related to actual historical data [Harvie 2000].
References
Barone, E. & Tebaldi, C., 2000, Stability of Equilibria in a Neural Network Model, Math. Meth. Appl. Sci. 23, 1179.
Bortone, C. & Tebaldi, C., 1998, Adaptive Lotka-Volterra Systems as Neural Networks, Dyn. Cont. Impul. Sys. 4, 379.
Colacchio, G., Sparro, M. & Tebaldi, C., submitted, Sequences of Cycles and Transition to Chaos in a Modified Goodwin's Growth Cycle Model, Int. Jour. Bif. Chaos.
Dendrinos, D., 1992, The Dynamics of Cities, Routledge.
Goodwin, R.M., 1967, A Growth Cycle, in Feinstein, C.H. (ed.), Socialism, Capitalism and Economic Growth, Cambridge University Press (Cambridge).
Harvie, D., 2000, Testing Goodwin: Growth Cycles in Ten OECD Countries, Cambridge Journal of Economics 24, 349.
Lacitignola, D. & Tebaldi, C., 2003, Symmetry breaking effects and time dependent regimes on adaptive Lotka-Volterra systems, Int. Jour. Bif. Chaos 13, 375.
Lacitignola, D. & Tebaldi, C., 2004, Effects of Adaptation on Competition among Species, in New Trends in Mathematical Physics, World Scientific (Singapore).
Lacitignola, D. & Tebaldi, C., 2005, Effects of ecological differentiation on Lotka-Volterra systems for species with behavioral adaptation and variable growth rates, Mathematical Biosciences 194, 95.
Lacitignola, D. & Tebaldi, C., 2006, Chaotic Patterns in Lotka-Volterra Systems with Behavioral Adaptation, in Proceedings, Waves and Stability in Continuous Media, World Scientific Publishing (River Edge, NJ).
May, R.M., 1973, Stability and Complexity in Model Ecosystems, Princeton University Press.
Murray, J.D., 2002, Mathematical Biology, Springer.
Noonburg, V.W., 1986, Competing species model with behavioral adaptation, J. Math. Biol. 24.
Ott, E., 1993, Chaos in Dynamical Systems, Cambridge University Press.
Chapter 10
A Dynamic Theory of Strategic Decision Making applied to the Prisoner's Dilemma
Gerald H. Thomas and Keelan Kane
Milwaukee School of Engineering
Chicago Center for Creative Development
The classic prisoner's dilemma has been extensively investigated by game theorists since the late 1950s, and has been scrutinized in both theoretical and empirical contexts. Many researchers have concluded that the Nash equilibrium does not apply to this game. Here we reexamine the prisoner's dilemma game from the perspective of physical decision theory (Thomas 2006), a rich dynamic framework constructed along the lines of physics that provides a program for examining general decision-making processes. From this larger perspective we demonstrate that 1) the Nash equilibrium can be extended to include dynamics and 2) interactions between players simultaneously involve both self-interest and the interests of others, even if one starts by adopting the assumption that agents are driven only by self-interest or only by other-interest. These results have implications far beyond the simple example of the prisoner's dilemma.
1. Introduction
Physical decision theory (Thomas 2006) models the relationships between observed behaviors (strategies) and observed results of those behaviors (payoffs). As such it is subject to direct observation and refinement using the scientific method. Here, we describe players that are either egoists (who maximize self-interest) or altruists (who maximize other-interest; see, e.g., Eshel et al., 1998) and combine this description with physical decision theory in order to examine the prisoner's dilemma game. This paper begins with the usual formulation of the prisoner's dilemma as a two-person game with a non-zero value; the game is not symmetric. We transform this game into an equivalent symmetric game using the device of Von Neumann (for a review see
Luce and Raiffa, 1957) that identifies a third player whose payoffs are specified to ensure both games have the same Nash equilibrium and payoffs for the original players. This symmetric game is extended to a dynamic game whose fixed points are precisely the Nash equilibrium. By construction all three games have the same dynamic content at the static equilibrium point. However, the physical framework goes beyond the traditional framework by predicting non-equilibrium behavior, and it identifies steady-state (stationary flow) behavior that is distinct from the static behavior.
2. Prisoner's Dilemma Formulation
We start with the usual and illustrative formulation of the prisoner's dilemma as a game in normal form between two players/prisoners, specified by a payoff matrix for each player. In our example, we write the payoff matrix for player 1 separately as:

G_1      N_2     C_2
N_1     -0.1    -1.0
C_1      0      -0.9

There is an identical payoff matrix for player 2. Using the standard game theory analysis we conclude that player 1 confesses, where the strategies for player 1 are to confess (C_1) or the alternative to not confess (N_1). Player 2 also confesses since he sees exactly the same game matrix, with similar strategies to confess (C_2) or his alternative to not confess (N_2). The identified solution is considered the Nash equilibrium in game theory, despite the "dilemma" that if each player were not to confess they would both be better off. Indeed, the latter also squares better with what some of us would expect (see, e.g., Sally, 1995).
To apply the new dynamic theory of decisions, we transform the above non-symmetric game for each player to an equivalent symmetric zero-value game by adding a hedge strategy H, with an arbitrary scale m that does not affect the strategy choices. With rows and columns ordered (N_1, C_1, N_2, C_2, H), the skew-symmetric game matrix is

F = [  0       0      0.1     1     -1/m   ]
    [  0       0      0       0.9   -0.9/m ]
    [ -0.1     0      0       0      0     ]
    [ -1      -0.9    0       0      0.9/m ]
    [  1/m     0.9/m  0      -0.9/m  0     ]

The Nash equilibrium strategy for this game is q = {0 1 0 1 m}, with the property that the payoff matrix times this strategy (as a matrix operation) is zero; we say that the strategy is in the null space of the matrix. The original two-person game is embedded in this symmetric game. By construction the two games have the same Nash equilibrium. Symmetric games can be solved by linear programming or differential equation techniques (see Luce and Raiffa, 1957). The classic game theory differential equation

dV/dτ = F · V,
as we show in the next section, provides a basis for a dynamic theory that has the same limiting behavior for the static and stationary cases.
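The null-space property of the symmetrized game can be checked numerically. The entries below are one reconstruction of the skew-symmetric matrix consistent with the embedded player-1 payoffs and the stated Nash strategy; the hedge scale is set to m = 1 for concreteness.

```python
import numpy as np

# Reconstructed 5x5 symmetrized game matrix (rows/cols N1, C1, N2, C2, H),
# an assumption consistent with the payoffs above; m = 1 for concreteness.
m = 1.0
F = np.array([
    [ 0.0,      0.0,     0.1,  1.0,    -1.0 / m],
    [ 0.0,      0.0,     0.0,  0.9,    -0.9 / m],
    [-0.1,      0.0,     0.0,  0.0,     0.0    ],
    [-1.0,     -0.9,     0.0,  0.0,     0.9 / m],
    [ 1.0 / m,  0.9 / m, 0.0, -0.9 / m, 0.0    ],
])
q = np.array([0.0, 1.0, 0.0, 1.0, m])   # Nash strategy {0 1 0 1 m}

print(np.allclose(F, -F.T))             # skew-symmetric (zero-value game)
print(np.allclose(F @ q, 0))            # q lies in the null space of F
```

Both checks print True: the game is zero-value and the confess-confess strategy is stationary under the flow dV/dτ = F · V.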
3. Egoists and altruists in a dynamic theory
3.1 Thomas' Dynamic Theory of Strategic Decisions
Thomas' physical decision theory (2006) represents a game as a differential equation, identifying the hedge strategy with time. Thomas proposes a general form of the differential equation that takes into account the geometry of the strategy space through active geometry metric elements g_ab, inactive geometry metric elements γ_αβ and mixed active-inactive metric elements γ_αβ A^β_a (expressed in terms of a "vector potential" A^α_a) for each player α. If the metric elements are independent of a strategy, the strategy is considered inactive; otherwise it is active. The metric elements provide the specification of the speed that determines the "active" kinetic and "inactive" potential energy describing the motion of the game from one play to the next. The vector potential governs the mix between active and inactive; in particular, it defines for each player the antisymmetric matrix F^α_ab = ∂_a A^α_b - ∂_b A^α_a, which we identify with the decision matrix from the previous section. The inactive strategies correspond to the personal outcomes or utilities of the players, so that the vector potential generates the potential energy due to the interaction of the personal utilities with the active strategy behaviors. Successive plays of the same game represent a "flow" in this geometry with a flow direction, density and pressure. For active strategies, the flow direction is represented as V^a; for each inactive strategy the flow is a "charge" and is represented by V_α. The behavior of a new play of a game is determined by those games already played, through a set of causal differential equations. In this paper we mention those that are relevant to our treatment of the prisoner's dilemma game. The full set of equations is an economic version of Einstein's equations applied to this geometry.
We require the resultant flow equations, the economic version of Euler's equations, which result from the conservation of energy and momentum:

g_ab DV^b/dτ = V_α F^α_ab V^b - ½ V_α V_β ∂_a γ^αβ + h_a^b ∂_b p / (μ + p).

Here, the weighted sum V_α F^α_ab represents the composite payoff that determines the behavior of the game, and in general can be different from the separate payoffs for non-zero-sum games. There are four important consequences of the above equations: 1) any definition of equilibrium is dynamic and based on the flows being stationary; 2) the "game" aspect of the equation is represented by the specific composite sum V_α F^α_ab; 3) there are "non-game" aspects that influence the dynamics; and 4) the equations with boundary conditions provide a complete and unique quantitative solution for all variables. We propose that stationary behavior is behavior that does not change over time. Nash equilibrium is a special case of such stationary behavior because Nash equilibrium is defined as a set of strategies from which players will not deviate. In terms of our differential equation, Nash equilibrium occurs when the product of the payoff matrix and the flow equals zero. In general, stationary flow is a superset of static behaviors. The dynamic notion of stationary flow replaces the game theory notion
of equilibrium in dynamic games. In addition, the differential equations of physical decision theory describe dynamic behavior that is not stationary. In the composite sum, the elements are not arbitrary, but are themselves governed by differential equations that derive from the economic Einstein equations:
(1/√|g|) ∂_b ( √|g| g^ac g^bd γ_αβ F^β_cd ) = κ (μ + p) V_α V^a.
They show that the charge density (the product of the "charge" V_α, the matter density μ + p and the coupling constant κ) and the motion (flow V^a) determine the payoff matrix F^α_ab. Drawing from physics, we split these equations into two distinct sets. The first set, the time component of the equations, has the form of "Coulomb's" Law, ∇·E = κ j⁰. The important consequence is that like "charges" repel and opposite "charges" attract. If like "charges" are nearby at the start, the form of the equations implies that they move away from each other as time increases. For this reason we argue that systems comprised of only one kind of "charge" (only egoist players or only altruist players) are expected to fly apart. As a rule, matter consists of equal mixtures of charges separated by short distances. We suggest the same holds here. We identify the positive "charges" with altruistic behavior (because they contribute to others), the negative "charges" with egoist behavior (because they take away from others) and equal mixtures with normal behavior. We assert that normal behavior is likely composed of balanced amounts of altruist and egoist behavior. The second set, the space components of the equations, has a form analogous to Ampère's Law, ∇×B + ∂E/∂t = κ j, in which currents generate the magnetic field. For physical decision theory, helical currents define the direction of the Nash equilibrium. Thus these equations may help define and explain the origin of the equilibrium behaviors observed. There are new dynamic behaviors that are not reflections of ordinary games. For any geometry, Maxwell-like equations can be converted to wave equations involving second-order derivatives in space and time of a "vector" potential. Such waves radiate with a fixed and finite velocity whenever the active space-time dimension D_A is greater than two, which it is for the prisoner's dilemma example. There should be radiation, and this "radiation" determines which past events can influence any given event.
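The "like charges repel" reading of the Coulomb-like equation can be illustrated with a toy one-dimensional simulation; the inverse-square force law, initial separation and coupling constant below are purely illustrative assumptions.

```python
# Toy illustration: two point "charges" on a line under an inverse-square
# force, integrated with explicit Euler steps. Like charges (q1*q2 > 0,
# e.g. egoist-egoist) feel a repulsive force and drift apart.
def separation_history(q1, q2, d0=1.0, k=1.0, dt=1e-3, steps=5000):
    d, v = d0, 0.0                   # separation and relative velocity
    history = [d]
    for _ in range(steps):
        f = k * q1 * q2 / d**2       # positive (repulsive) for like charges
        v += f * dt
        d += v * dt
        history.append(d)
    return history

like = separation_history(+1.0, +1.0)   # two like charges, starting at rest
print(like[-1] > like[0])               # separation grows -> True
```

The monotone growth of the separation is the toy analogue of the claim that a population of only egoists, or only altruists, is expected to fly apart.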
The radiation is an intrinsic property of the strategic decision fields of each player and follows from their local invariance or gauge properties; its confirmation would provide strong support for this physical approach. An intriguing possibility is that if a player's behavior consisted of cyclical choices reflecting opposite charges both identified with himself, then there would be acceleration and hence radiation leading to the charges collapsing onto each other. In physics, this is prevented by the quantum view of matter. We thus predict striking consequences that extend far beyond the prisoner's dilemma game.
3.2 Composite Fields, Charges and Null Behaviors
If a game is played the same way over and over, the game creates a "current" for each player that creates the decision field F^α_ab through the homologous Maxwell equation above. A test play of the game will then see a force through the homologous Euler flow
equations. Since the equations are highly non-linear, we believe it may be helpful to think of these equations sequentially. In the process we develop two concepts that we use to analyze the prisoner's dilemma. In this conceptualization, the force based on the decision for a given player is determined by the size of V_α F^α_ab V^b. The magnitude of the force depends on whether the flow aligns with the null vector (Nash equilibrium) q^b of the decision matrix. We use the word null because the contribution to the force by the flow is then zero. It can be shown that non-equilibrium motion that is near equilibrium is a helix around this direction, with a direction of rotation and a frequency given by the strength of the charge density. There is a decision matrix for each player, and the corresponding null vectors are in general not equal. In our analysis of the prisoner's dilemma, we identify the null direction of each player, along with the charge V_α. We emphasize that the null direction is determined by the sources that generate it, and so is not really independent of the charge. In a later work, we drop this assumption and solve the complete set of coupled equations. The results in the next section are suggestive but provisional.
4. Prisoner's Dilemma Analysis
There are two charges: altruist (positive) and egoist (negative). In addition, for each prisoner there are two null behaviors (Nash equilibria) depending on whether the decision matrix derived from the Maxwell-like equations results from an altruist or an egoist charge. There are sixteen cases in all. We create the composite V_α F^α_ab by taking the sum of the player decision matrices weighted by their player's charge, assuming that a player is either altruistic or egoistic. The sixteen possible cases can be reduced to the following four cases:
• There is a composite behavior constructed from two egoists (egoistic null behavior);
• There is a composite behavior constructed from two altruists (altruistic null behavior);
• There is a composite behavior constructed from an altruist and an egoist where each acts appropriately or each acts oppositely to their null behavior;
• And there is a composite behavior constructed from an altruist and an egoist where one acts appropriately and the other oppositely.
We believe the composite behavior, the payoff matrix for the prisoner's dilemma as usually described in the literature, reflects an egoist null behavior. In this and the other examples, we compute V_α F^α_ab assuming that the charge for an egoist is -1, for an altruist is +1, and that the egoist null behavior is that given in the earlier section.
4.1 Egoistic null behavior from two egoists
With this in mind the composite field is made up of two identical payoff matrices. The composite matrix is:
V_α F^α_ab = (-1)·F^(1) + (-1)·F^(2), the charge-weighted sum of the two identical (up to player labeling) payoff matrices, whose null vector is again the egoist Nash strategy {0 1 0 1 m}.
4.2 Altruistic null behavior from two altruists
An altruistic player differs from an egoistic player in that such a player has a different personal utility, leading to a different payoff matrix. In an altruistic world, the sign of the charge changes in the Maxwell-like equation relative to the purely egoistic world, and the payoff matrix keeps the same sign. Thus the flow will in general be reversed. To keep the flow positive, we change the sign of the payoff matrix. The altruist makes the same type of min-max argument that the egoist does, but the altruist player uses their own payoff matrix. The composite payoff matrix is:
V_α F^α_ab = (+1)·(-F^(1)) + (+1)·(-F^(2)), the charge-weighted sum of the two sign-reversed payoff matrices.
The first term is player 1. Such a player would look at the possibilities as follows: if she does not confess, the worst that can happen is that player 2 does not confess, and she gains 0.1 units; if she confesses, the worst that can happen is that player 2 does not confess, and she gains 0 units. Of these two cases the best for her is not to confess. The optimal strategy as she sees it is {1 0 1 0 m}. Player 2 sees the same possibilities. Thus this is the opposite of two egoists. We obtain the important result that with two altruistic players, game theory applies using the usual rules for computing the null behavior, if we allow the utilities to reflect altruistic payoffs.
4.3 Composite game with egoist and altruist where each acts appropriately (or both oppositely) to their behavior
In the two previous cases, the null behavior of the composite game does not depend on the charges of the players. We deduced the composite null behavior by taking the null behavior of each player individually. In the remaining cases, the composite null behavior need not be the null behavior of any one player. The null behavior is the stationary direction q^b along which the composite payoff produces no force, V_α F^α_ab q^b = 0. We find the composite null behavior by finding those vectors in the null space of the
composite matrix. The null vectors can be found using standard techniques and verified by inspection. To illustrate, we take the charge for player 1 to be -1 and the charge for player 2 to be +1, corresponding to the payoff. We take the form for player 1 from Sec. 4.1 (first term) and for player 2 from Sec. 4.2 (second term). Neither the pure egoist nor the pure altruist solution is an equilibrium value. The composite decision matrix is:

V_α F^α_ab = [  0       0      0      1      -0.1/m ]
             [  0       0     -1      0       0.9/m ]
             [  0       1      0      0      -0.9/m ]
             [ -1       0      0      0       0.1/m ]
             [  0.1/m  -0.9/m  0.9/m -0.1/m   0     ]
The null behavior is {0.1 0.9 0.9 0.1 m}. The altruist (player 2) predominantly chooses the option to confess (with odds of 9:1) and the egoist (player 1) predominantly chooses the option to not confess (with odds of 9:1). Each player is impacted significantly by the presence of the oppositely behaving player, the player mismatch.
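The stated null behavior can be verified against the mixed-pair composite matrix; the entries below are a skew-symmetric reconstruction consistent with Sec. 4.3, again with m = 1.

```python
import numpy as np

# Mixed egoist-altruist composite decision matrix (reconstruction, m = 1)
# and a check that {0.1 0.9 0.9 0.1 m} lies in its null space.
m = 1.0
M = np.array([
    [ 0.0,      0.0,      0.0,      1.0,     -0.1 / m],
    [ 0.0,      0.0,     -1.0,      0.0,      0.9 / m],
    [ 0.0,      1.0,      0.0,      0.0,     -0.9 / m],
    [-1.0,      0.0,      0.0,      0.0,      0.1 / m],
    [ 0.1 / m, -0.9 / m,  0.9 / m, -0.1 / m,  0.0    ],
])
q = np.array([0.1, 0.9, 0.9, 0.1, m])   # mixed-pair null behavior

print(np.allclose(M @ q, 0))            # stationary: zero composite force
```

The product M q vanishes term by term, so this mixed strategy produces no composite force, exactly the stationarity condition V_α F^α_ab q^b = 0 used to define the null behavior.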
4.4 Composite game with egoist and altruist where only one acts appropriately
To illustrate what happens in a game with an egoist and an altruist where one acts appropriately and the other oppositely (i.e. we subtract the altruist payoff), we select as representative an egoist null behavior and an altruistic null behavior where both players are egoists (both charges negative). This example and some similar cases exhaust the possibilities. The composite game is:
V_α F^α_ab, whose active 4×4 block is

[  0     0    -0.2   -1   ]
[  0     0    -1     -1.8 ]
[  0.2   1     0      0   ]
[  1     1.8   0      0   ]

bordered by the hedge row and column, with entries proportional to 1/m.
The composite null behavior is {-1 9 9 -18 m}. The negative strategy suggests that a more detailed dynamic analysis is called for. A preliminary analysis suggests that when there is a mismatch between expectations (both here and in the previous section), a bipolar behavior is generated where each player acts with both charges.
5. Conclusions
Dynamic theories are characterized by fixed points, which exhibit stationary or constant flow that can be attractive or repulsive, and can have periodic or semi-periodic behaviors. A dynamic theory of decision-making extends the standard static game theory, and structurally changes the notion and context of game theory equilibrium. Our analysis is based on a causal evolution of behavior from some initial point in time that depends only on past behaviors. Because of the nature of the dynamics we are led to the conclusion that games are not played only by egoists, or indeed only by altruists, but by players reflecting a balance of both attributes (charges). To the extent
that such charges occupy nearby spaces, it may be that some "quantum" formulation is necessary for the theory to be consistent. Analysis at the "quantum" level might be achieved by a psychologically oriented framework. In any case, empirical investigations of human participants presented with prisoner's-dilemma game situations have yielded interesting results that contradict standard game theory analysis (see Sally, 1995 for a review). Some of these results have motivated so-called "psychological" game theories (e.g., Dufwenberg & Kirchsteiger, 1998; Rabin, 1993). The hallmark of these psychological frameworks is that they attempt to model players' "fairness" or "kindness," as well as each player's beliefs about whether his or her own actions will be reciprocated with (un)kind or (un)fair actions. As Rabin (1993) notes in his psychological model, the notion of altruism can bear on the notion of fairness: "the same people who are altruistic to other altruistic people are also motivated to hurt those who hurt them" (p. 1281, emphasis in original). This paper provides a preliminary behaviorist (rather than psychological) view of the prisoner's dilemma, and a formulation of this view inside a dynamic theory. There are many other interesting questions that relate to the prisoner's dilemma, and in the process of studying these problems we have created, and are now pursuing, a program to flesh out and extend the dynamic theory to include both behavioral and psychological factors and to create solutions to these and other similar problems.
References
[1] Dufwenberg, Martin & Kirchsteiger, Georg (1998). A theory of sequential reciprocity. Tilburg Center for Economic Research discussion paper 9837.
[2] Eshel, Ilan; Samuelson, Larry; & Shaked, Avner (1998). Altruists, egoists, and hooligans in a local interaction model. The American Economic Review, March, 157-179.
[3] Luce, R. D., and Raiffa, H. (1957). Games and Decisions (Dover Publications, NY).
[4] Rabin, Matthew (1993). Incorporating fairness into game theory and economics. American Economic Review, 83, 1281-1302.
[5] Sally, David (1995). Conversation and cooperation in social dilemmas: A meta-analysis of experiments from 1958 to 1992. Rationality and Society, 7(1), 58-92.
[6] Thomas, Gerald H. (2006). Geometry, Language and Strategy (World Scientific).
Chapter 11
Animal network phenomena: insights from triadic games Mi ke Mesterton-Gibbons Department of Mathematics, Florida State University mesterto@math .fsu.edu
Tom N. Sherratt
Department of Biology, Carleton University
[email protected]
Games of animal conflict in networks rely heavily on computer simulation because analysis is difficult, the degree of difficulty increasing sharply with the size of the network. For this reason, virtually the entire analytical literature on evolutionary game theory has assumed either dyadic interaction or a high degree of symmetry, or both. Yet we cannot rely exclusively on computer simulation in the study of any complex system. So the study of triadic interactions has an important role to play, because triads are both the simplest groups in which asymmetric network phenomena can be studied and the groups beyond dyads in which analysis of population games is most likely to be tractable, especially when allowing for intrinsic variation. Here we demonstrate how such analyses can illuminate a variety of behavioral phenomena within networks, including coalition formation, eavesdropping (the strategic observation of contests between neighbors) and victory displays (which are performed by the winners of contests but not by the losers). In particular, we show that eavesdropping acts to lower aggression thresholds compared to games without it, and that victory displays to bystanders will be most intense when there is little difference in payoff between dominating an opponent and not subordinating.
1 Triadic games
The essential ingredients for mathematical analysis of a continuous population game are a well defined reward function f, such that f (u,v) yields the reward to a focal u-strategist in a population of v-strategists, and the concept of an evolutionary stable strategy or ESS (4). Strategy v is a (strong) ESS if it is uniquely the best reply to itself, i.e., if f(v, v) > f(u, v) for all u -I- v. Two kinds of continuous triadic game of conflict have proven especially amenable to analysis. The first kind of game, which we call Type I, is one in which strategies are intensities, variance of fighting strength is zero, and the set of all possible outcomes from the triadic interaction has a discrete probability distribution for every conceivable strategy combination (u,v). Let there be K such outcomes in all, let Wi (u, v) be the probability associated with outcome i and let Pi(u) be the corresponding payoff to the focal individual. Then K
f(u, v) = Σ_{i=1}^K w_i(u, v) P_i(u),   with   Σ_{i=1}^K w_i(u, v) = 1.   (1.1)
The second kind of game, which we call Type II, is one in which strategies are thresholds, the variance of strength is non-zero and strength is continuously distributed with probability density function g on [0, 1]; nevertheless, for all (u, v) the sample space [0, 1]³ of the triad's three strengths (assumed independent) can be decomposed into a finite number K of mutually exclusive events. Let Ω_i(u, v) denote the i-th such event, and let P_i(X, Y, Z) denote the corresponding payoff to the focal individual when its strength is X and the other two strengths in the triad are Y and Z. Then
f(u, v) = Σ_{i=1}^K ∫∫∫_{(x,y,z) ∈ Ω_i(u,v)} P_i(x, y, z) g(x) g(y) g(z) dx dy dz.   (1.2)
We provide examples of each kind of game.
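As an illustration of how (1.1) assembles a reward function from an outcome table, the following sketch evaluates f(u, v) for a toy two-outcome game. The outcome probabilities and payoffs here are invented purely for illustration; they belong to neither of the models analyzed below.

```python
# A toy illustration of (1.1): a reward function assembled from an
# outcome table of probabilities w_i(u, v) and payoffs P_i(u). The two
# outcomes below are invented; they are not Model A or Model B.

def reward(u, v, outcomes):
    """f(u, v) = sum_i w_i(u, v) * P_i(u) over a list of (w, P) pairs."""
    total_w = sum(w(u, v) for w, _ in outcomes)
    assert abs(total_w - 1.0) < 1e-9, "outcome probabilities must sum to 1"
    return sum(w(u, v) * P(u) for w, P in outcomes)

toy_outcomes = [
    (lambda u, v: u / (u + v), lambda u: 1.0 - u),   # focal-favoured outcome
    (lambda u, v: v / (u + v), lambda u: -u),        # rival-favoured outcome
]

print(reward(0.2, 0.4, toy_outcomes))  # = 0.2*0.8/0.6 + 0.4*(-0.2)/0.6
```

For the Type II games of §3 and §4 the sum over outcomes is replaced by the integrals of (1.2), but the bookkeeping — enumerate mutually exclusive events, weight each payoff by its probability — is the same.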
2 Victory displays
Victory displays, ranging from sporting laps of honor to military parades, are well known in human societies and have been reported in various other species (1), the best-known example being the celebrated "triumph ceremony" of the greylag goose (3). Two models of such victory displays exemplify the Type I game. Bower (1) defined a victory display to be a display performed by the winner of a contest but not the loser, and proposed two explanations for their function. The "advertising" rationale is that victory displays are attempts to communicate victory to other members of a social group that do not pay attention to contests or cannot otherwise identify the winner. The "browbeating" rationale is that
victory displays are attempts to decrease the probability that the loser of a contest will initiate a future contest with the same individual. Our models, distinguished by A for advertising and B for browbeating, explore the logic of these rationales. Both models assume that the members of a triad participate in three pairwise contests, and that more intense victory displays are more costly to an individual but also more effective in terms of either being seen by conspecifics (Model A) or deterring further attack (Model B): at intensity s, the cost of signalling is c(s), and the probability of the desired effect is p(s). In either model, dominating another individual increases fitness by α, and a contest in which neither individual dominates the other increases the fitness of each by bα, where b ≤ 1. In Model A, we assume that a bystander that has seen an individual win will subsequently defer to it with fixed probability λ_i, where i = 0, i = 1 or i = 2 according to whether the observer is an untested individual, a prior loser or a prior winner, respectively, with 0 ≤ λ_2 ≤ λ_0 ≤ λ_1 ≤ 1. Deferring eliminates the cost of a fight, which we denote by c_0. We also allow for a prior loser to defer to an observed loser with probability λ_3, and we allow for a potential "loser effect" (8): an (indirectly) observed loser subsequently loses against the observer with probability ½(1 + l), where 0 ≤ l ≤ 1. The reward function is most readily obtained by first recording the payoffs and probabilities associated with each outcome in a table having K rows; then (1.1) determines f. Because K = 36 for Model A, however, only excerpts are shown as Table 1: the full table appears in (5). As presented, the table assumes that displays are obligate.
One could argue, however, that — at least among animals with sufficient cognitive ability — victory displays should be facultative: in a triadic interaction, there is no need to advertise after an individual's final contest, because there is no other individual that can be influenced by the display. Our model is readily adapted to deal with this possibility, as described in (5). For the sake of definiteness, we analyze the game with
c(s) = γθαs,   p(s) = ε + (1 − ε)(1 − e^{−θs}),   (1.3)
where θ (> 0) has the dimensions of INTENSITY⁻¹, so that γ (> 0) is a dimensionless measure of the marginal cost of displaying, and 0 ≤ ε ≤ 1. The analysis shows that for any values of the positive parameters c_0, γ, l, b, λ_0, λ_1, λ_2 and λ_3 (the last six of which cannot exceed 1), there is a unique ESS at which animals display when ε, the baseline probability of observing victors in the absence of a display, lies below a critical value, but otherwise do not display. This critical value is zero if the display cost γ is too large but otherwise positive; it decreases with respect to γ or l, increases with respect to any of the other six parameters and is higher for facultative than for obligate signallers (except that it is independent of λ_3 for facultative signallers). For subcritical values of the baseline probability of observation ε, the intensity of signalling at the ESS decreases with respect to γ or l, increases with respect to any of the other six parameters and is higher for facultative than for obligate signallers (with the same exception as before). Moreover, it largely does not matter whether the effect of signalling is
interpreted as increasing the probability of being seen or of being deferred to.

Table 1: Model A payoff to a focal individual F whose first and second opponents are O1 and O2, respectively, conditional on participation in the last two of the three contests. Parentheses indicate a contest in which the focal individual is not involved. A bold letter indicates that the individual's opponent deferred. Note that O1 and O2 do not label specific individuals: O1 is whichever individual happens to be the focal individual's first opponent for a given order of interaction; the other individual is O2.
[Table 1, excerpted: columns give the CASE number, the WINNERS of the 1st, 2nd and 3rd contests, the PROBABILITY w_i(u, v) and the PAYOFF P_i(u). Representative payoffs include {2 − 2c(u) − c_0}α, {2 − 2c(u) − 2c_0}α, {1 + b − c(u) − c_0}α, {1 − c(u) − 2c_0}α, {b − c_0}α, −c_0α and −2c_0α; the probabilities combine factors such as λ_0 p(u), λ_1 p(v), {1 − λ_2 p(v)} and {1 + l p(u)}. The full 36-row table appears in (5).]
In Model B, we assume that contestants subordinate to a current winner with a probability that increases with the intensity of the victory display, and we reinterpret ε as the baseline probability of submission (i.e., the probability that a victor elicits permanent submission from a loser in the absence of a display). As before, we construct a table of payoffs and associated probabilities. Because the order of interaction does not matter in this case, there are fewer possible outcomes; specifically, K = 10 in (1.1). We again find that there is a unique ESS with a critical value of ε, above which winners do not display and below which intensity decreases with ε (5). In this regard, predictions from the models are similar; however, there is also an important difference. In the case of advertising, the intensity of display at the ESS increases with respect to the parameter b, an inverse measure of the reproductive advantage of dominating an opponent compared to simply not subordinating; by contrast, in the case of browbeating, the intensity of display at the ESS decreases with respect to b, as illustrated by Fig. 1. Therefore, all other things being equal, the intensity of advertising victory displays will be highest when there is little difference between dominating an opponent and not subordinating, a set of conditions likely to generate low reproductive skew (as in monogamous species). By contrast, the intensity of browbeating victory displays will be highest when there are greater rewards to dominating an opponent, a set
Figure 1: Comparison of advertising and browbeating ESSs. Evolutionarily stable signalling intensity (scaled with respect to θ to make it dimensionless) is plotted as a function of dominance advantage b. Values of the other parameters (all dimensionless) are c_0 = 0.1 for the fixed cost of a contest, l = 0.5 for the loser effect (i.e., the probability that an observed loser again loses is 0.75), γ = 0.05 for the marginal cost of displaying, λ_i = 0.9 for all i for the probability of deference and ε = 0.1 for the baseline probability of the desired effect (bystander attention to victor in Model A, submission to current opponent in Model B) in the absence of a display. For obligate signallers, the advertising ESS is shown dashed; for facultative signallers, it is shown dotted.
of conditions that is likely to generate high reproductive skew. These predictions appear to accord quite well with our current understanding of the taxonomic distribution of victory displays (1; 5).
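The signalling cost and effect functions in (1.3) are easy to evaluate directly. In this sketch γ = 0.05 and ε = 0.1 follow the Fig. 1 parameter values, while θ and α are set to 1 purely for illustration.

```python
import math

# Cost and effect functions from (1.3): c(s) = gamma*theta*alpha*s and
# p(s) = eps + (1 - eps)*(1 - exp(-theta*s)). gamma and eps follow the
# Fig. 1 values; theta = alpha = 1 are illustrative placeholders.

def c(s, gamma=0.05, theta=1.0, alpha=1.0):
    """Cost of displaying at intensity s (linear in s)."""
    return gamma * theta * alpha * s

def p(s, eps=0.1, theta=1.0):
    """Probability of the desired effect at intensity s."""
    return eps + (1.0 - eps) * (1.0 - math.exp(-theta * s))

# p rises from the baseline eps at s = 0 toward 1 for large s, while
# the cost grows without bound, so intense displays pay only when the
# baseline probability eps is sufficiently small.
assert p(0.0) == 0.1
assert c(0.0) == 0.0
```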
3 Coalition formation
A model of coalition formation exemplifies the Type II game. We merely sketch this model here; full details are in (6). We assume that each member of a triad knows its own strength but not that of either partner. All three strengths are drawn from the same symmetric Beta distribution on [0, 1] with variance σ². Stronger animals tend to escalate when involved in a fight; weaker animals tend not to escalate. If an animal considers itself too weak to have a chance of being the alpha (dominant) individual in a dominance hierarchy, then it attempts to form a coalition with everyone else: a coalition means a mutual defence pact and an equal share of benefits. Let A denote total group fitness. Then it costs δA (≥ 0) to attempt a coalition; the attempt may not be successful, but if all agree to it, then there are no fights. If there is a dominance hierarchy with three distinct ranks after fighting, then the alpha individual gets αA (where α > ½), the beta individual gets (1 − α)A and the gamma individual gets zero. If there is a three-way coalition, or if the animals fight one another and end up winning and losing a fight apiece, then each gets ⅓A; however, in the second case they also incur a fighting cost. If a coalition of two defeats the third individual, then each member of the pair
Table 2: Payoff to a focal individual F of strength X whose partners are A and B with strengths Y and Z, respectively, with Δ = q{X + Z} − Y and ζ(X, Y, Z) = αp(X − Y)p(X − Z) + (1/3){p(X − Y)p(Z − X)p(Y − Z) + p(X − Z)p(Y − X)p(Z − Y)} + (1 − α){p(X − Y)p(Z − X)p(Z − Y) + p(X − Z)p(Y − X)p(Y − Z)}.

CASE  COALITION STRUCTURE  EVENT Ω_i(u, v)        PAYOFF P_i(X, Y, Z)
1     {F, A, B}            X < u, Y < v, Z < v    {⅓ − δ}A
2     {F, B}, {A}          X < u, Y > v, Z < v    ½{αp(Δ) + 1 − α − 2δ − c(Δ)}A
3     {F, A}, {B}          X < u, Y < v, Z > v    P_2(X, Z, Y)
4     {F}, {A, B}          X > u, Y < v, Z < v    {αp(X − q{Y + Z}) − c(X − q{Y + Z})}A
5     {F}, {A}, {B}        X < u, Y > v, Z > v    −δA
6     {F}, {A}, {B}        X > u, Y > v, Z < v    {(2α − 1)p(X − Y) + 1 − α − c(X − Y)}A
7     {F}, {A}, {B}        X > u, Y < v, Z > v    {(2α − 1)p(X − Z) + 1 − α − c(X − Z)}A
8     {F}, {A}, {B}        X > u, Y > v, Z > v    {ζ(X, Y, Z) − c(X − Y) − c(X − Z)}A
obtains ½A while the individual obtains zero; and if the individual defeats the pair, then it obtains αA while each member of the pair obtains ½(1 − α)A. We assume that there is at least potentially a synergistic effect, so that the effective strength of a coalition of two whose individual strengths are S1 and S2 is not simply S1 + S2 but rather q{S1 + S2}, where q need not equal 1. Let p(Δs) denote the probability of winning for a coalition (or individual) whose combined strength exceeds that of its opponent by Δs; p increases sigmoidally with Δs at a rate determined by a parameter r measuring the reliability of strength difference as a predictor of fight outcome. Note that p(Δs) + p(−Δs) = 1 with p(−2) = 0, p(0) = ½ and p(2) = 1. We assume that fighting costs are equally borne by all members of a coalition. Let c(Δs)A be the cost of a fight between coalitions whose effective strengths differ by Δs. Costs are greater for more closely matched opponents; so, from a maximum c_0, cost decreases nonlinearly with |Δs| at a rate determined by a parameter k measuring sensitivity of cost to strength difference. Let u be the coalition threshold for Player 1, the potential mutant: if its strength fails to exceed this value, then it attempts to make a mutual defence pact with each of its conspecifics. Let v be the corresponding threshold for Player 2, who represents the population. Let X be the strength of the u-strategist, and let Y and Z be the strengths of the two v-strategists. We can now describe the set of mutually exclusive events with associated payoffs as in Table 2, and the reward follows from (1.2) with K = 8. For this game, the evolutionarily stable strategy set depends on seven parameters, namely, c_0 (maximum fighting cost), q (synergy multiplier), δ (pact cost), α (proportion of additional group fitness to a dominant), r (reliability of strength difference as predictor of fight outcome), k (sensitivity of cost to
strength difference) and σ² (variance). It is a complicated dependence, but it enables us to calculate, among other things, the probability that two animals will make a pact against the third in an ESS population. Details appear in (6).
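The model's ingredients can be sketched in code: sampling the symmetric Beta strengths, a win-probability curve with the stated endpoints, and the all-fight expected share ζ from Table 2's caption. Two pieces here are our own illustrative choices, not the paper's: the tanh form of p (the text fixes only p(−2) = 0, p(0) = ½, p(2) = 1 and sigmoidal growth at rate r) and the variance-to-shape inversion for the Beta distribution.

```python
import math
import random

def beta_shape(sigma2):
    """Shape a of the symmetric Beta(a, a) on [0, 1] with variance sigma2.
    Uses Var = 1/(4(2a + 1)), so a = (1 - 4*sigma2)/(8*sigma2); this
    inversion is our own arithmetic, valid for 0 < sigma2 < 1/4."""
    assert 0.0 < sigma2 < 0.25
    return (1.0 - 4.0 * sigma2) / (8.0 * sigma2)

def draw_triad(sigma2, rng=random):
    """Independent strengths X, Y, Z for the triad."""
    a = beta_shape(sigma2)
    return tuple(rng.betavariate(a, a) for _ in range(3))

def p(ds, r=2.0):
    """Assumed sigmoid: probability that a strength advantage ds wins.
    Satisfies p(-2) = 0, p(0) = 1/2, p(2) = 1 for any rate r > 0."""
    return 0.5 * (1.0 + math.tanh(r * ds) / math.tanh(2.0 * r))

def zeta(X, Y, Z, alpha=0.7):
    """Expected fitness share to F when all three fight pairwise:
    alpha for winning both fights, 1/3 for a cyclic outcome,
    1 - alpha for beta rank, zero for losing both."""
    return (alpha * p(X - Y) * p(X - Z)
            + (1.0 / 3.0) * (p(X - Y) * p(Z - X) * p(Y - Z)
                             + p(X - Z) * p(Y - X) * p(Z - Y))
            + (1.0 - alpha) * (p(X - Y) * p(Z - X) * p(Z - Y)
                               + p(X - Z) * p(Y - X) * p(Y - Z)))

# Sanity checks: sigma^2 = 1/12 recovers the uniform case (a = 1), and
# equal strengths give every animal the expected share 1/3.
assert abs(beta_shape(1.0 / 12.0) - 1.0) < 1e-9
assert abs(zeta(0.5, 0.5, 0.5) - 1.0 / 3.0) < 1e-9
```

The second check is a useful symmetry test: with equal strengths, every ranking is equally likely, so the α-dependence cancels and each animal expects one third of the group fitness.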
4 Eavesdropping
As noted in §2, animals can eavesdrop on the outcomes of contests between neighbors and modify their behavior towards observed winners and losers. A model of such eavesdropping (7) further exemplifies the Type II game, in this case with K = 28. The model, which extends the classic Hawk-Dove model of animal conflict to allow for both continuous variation in fighting ability and costs that are greater for more closely matched opponents (as in §3), was motivated by earlier work showing that eavesdropping actually increases the frequency of mutually aggressive contests (2). But that conclusion was predicated on zero variance of strength. To obtain a tractable model with non-zero variance, we had to
Figure 2: The evolutionarily stable aggression threshold under eavesdropping (V*, solid curve) as a function of variance σ² for various values of the dominance advantage parameter b when strength has a symmetric Beta distribution on [0, 1] and the cost of a fight between animals whose strengths differ by Δs is 1 − |Δs|^0.2 for Δs ∈ [−1, 1]. In each case, the corresponding basic threshold (v*, dashed curve) is also shown.
make several simplifying assumptions. In particular, we assumed that fights are always won by the stronger animal (the limit of §3 as the reliability parameter r → ∞). Furthermore, we first determined a basic aggression threshold for animals that do not eavesdrop, and then considered eavesdropping only among animals whose strengths at least equal that basic threshold. Thus the question becomes whether eavesdropping raises the threshold. We found that it always does, suggesting that eavesdropping reduces rather than increases aggressive behavior in Hawk-Dove games. Typical results are shown in Fig. 2, where the parameter b has the same meaning as in §2. Details appear in (7).
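The two simplifying assumptions are easily made concrete. The deterministic fight rule is the r → ∞ limit described above, and the cost exponent 0.2 is the example used in the Fig. 2 caption.

```python
# The eavesdropping model's two simplifying assumptions in code:
# fights are always won by the stronger animal (the r -> infinity
# limit), and fight cost is 1 - |ds|**0.2, as in the Fig. 2 example.

def wins(x, y):
    """Deterministic fight: strength x beats strength y iff x > y."""
    return x > y

def cost(ds):
    """Cost of a fight between animals whose strengths differ by ds."""
    assert -1.0 <= ds <= 1.0
    return 1.0 - abs(ds) ** 0.2

# Closely matched opponents pay nearly the maximum cost of 1, while a
# large strength gap makes the fight nearly free.
print(cost(0.0), cost(1.0))  # 1.0 0.0
```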
5 Conclusion
We have shown how to obtain insights on animal network phenomena by studying them in their simplest possible setting, namely, a triad. Our analysis of victory displays (Type I, strategies as intensities) has confirmed that such behavior can occur either as an advertisement to bystanders or to browbeat a current opponent. Our analyses of coalition formation and eavesdropping (Type II, strategies as thresholds) have helped elucidate the fundamental conditions under which coalitions will form, and indicate for the first time that eavesdropping acts to reduce the frequency of escalated fighting in Hawk-Dove models. We hope that our analyses of these triadic interactions serve as important benchmarks for understanding analogous phenomena in larger networks.
6 Acknowledgments
This research was supported by National Science Foundation award DMS-0421827 to MM-G and an NSERC Discovery Grant to TNS.
Bibliography

[1] BOWER, J. L., "The occurrence and function of victory displays within communication networks", in Animal Communication Networks (P. McGregor, ed.), Cambridge University Press, Cambridge (2005), pp. 114-126.
[2] JOHNSTONE, R. A., "Eavesdropping and animal conflict", Proceedings of the National Academy of Sciences USA 98 (2001), 9177-9180.
[3] LORENZ, K. Z., "The triumph ceremony of the greylag goose, Anser anser L.", Philosophical Transactions of the Royal Society of London B 251 (1966), 477-478.
[4] MAYNARD SMITH, J., Evolution and the Theory of Games, Cambridge University Press, Cambridge (1982).
[5] MESTERTON-GIBBONS, M., and T. N. SHERRATT, "Victory displays: a game-theoretic analysis", Behavioral Ecology 17 (2006), 597-605.
[6] MESTERTON-GIBBONS, M., and T. N. SHERRATT, "Coalition formation: a game-theoretic analysis", Behavioral Ecology 18 (2007), 277-286.
[7] MESTERTON-GIBBONS, M., and T. N. SHERRATT, "Social eavesdropping: a game-theoretic analysis", Bulletin of Mathematical Biology 69 (2007), 1255-1276.
[8] RUTTE, C., M. TABORSKY, and M. W. G. BRINKHOF, "What sets the odds of winning and losing?", Trends in Ecology and Evolution 21 (2006), 16-21.
Chapter 12
Endogenous Cooperation Network Formation
S. Angus
Department of Economics, Monash University, Melbourne, Australia. [email protected].
This paper employs insights from the Complex Systems literature to develop a computational model of endogenous strategic network formation. Artificial Adaptive Agents (AAAs), implemented as finite state automata, play a modified two-player Iterated Prisoner's Dilemma game with an option to further develop the interaction space as part of their strategy. Several insights result from this relatively minor modification: first, I find that network formation is a necessary condition for cooperation to be sustainable, and that both the frequency of interaction and the degree to which edge formation impacts agent mixing are necessary conditions for cooperative networks. Second, within the FSA-modified IPD framework, a rich ecology of agents and network topologies is observed, with consequent payoff symmetry and network 'purity' seen to be further contributors to robust cooperative networks. Third, the dynamics of the strategic system under network formation show that initially simple dynamics with small interaction length between agents give way to complex, aperiodic dynamics when interaction lengths are increased by a single step.
1 Introduction
The strategic literature has seen a long-standing interest in the nature of cooperation, with many contributions considering the simple but insightful two-player Prisoner's Dilemma (PD) game. Traditionally, such games were analysed under a uniform interaction specification such that agents met equiprobably to play a single (or repeated) two-player game. More recently, however, authors have relaxed this condition, and have analysed strategic games of cooperation and coordination under both non-uniform interaction and non-uniform learning
environments [3, 1]. The topological significance of the interaction space has been stressed by these authors, as it appears to influence the degree to which cooperation can be sustained. In the present work, constraints concerning agent rationality and rigid agent interactions are relaxed within a fundamentally agent-based modelling framework. Moreover, in contrast to one related approach in the literature [5], agents are given strategic abilities to change the interaction space themselves (i.e. to change interaction probabilities) during pair-wise game-play. It is in this sense that a 'network' arises in the model, and hence, such a network is said to be a truly endogenous feature of the modelling framework; a feature which to this author's knowledge has not been previously handled with boundedly rational adaptive agents. The key insights of the present work can be summarised as follows: first, an analysis without network formation reveals that the modification to the standard iterated PD (IPD) framework introduced below does not change the canonical behaviour of the system; second, when network formation is afforded, stable cooperation networks are observed, but only if both a type-selection and an enhanced 'activity' benefit of the network are present; third, the extended system under certain interaction lengths is inherently self-defeating, with both cooperation and defection networks transiently observed in a long-run specification; and fourth, the network formation process displays self-organized criticality and thus appears to drive the complex dynamics observed in the long run.
2 The Model
Let N = {1, . . . , n} be a constant population of agents and denote by i and j two representative members of the population. Initially, members of the population are uniformly paired to play the modified IPD game G described below. When two agents are paired together, they are said to have an interaction. Within an interaction, agents play the IPD for up to a maximum of τ iterations, receiving a payoff equal to the sum of the individual payoffs they receive in each iteration of the IPD. An interaction ends prematurely if either player plays a 'signal', thus unilaterally stopping the interaction. A strategy s for a player describes a complete plan of action for their play within an interaction, to be explained presently. In addition to the normal moves of cooperate (C) and defect (D), an agent can also play one of two signal actions, #s and #w respectively. Thus, in any one iteration of the IPD, the action set for an agent is {C, D, #s, #w}. As mentioned above, the playing of a signal by either player leads to the interaction stopping, possibly prior to τ iterations being reached. The playing of a signal can thus serve as an exit move for a player. The interpretation of the two types of signal is as follows. Although initial pairing probabilities between all players are uniform random, agents can influence these interaction probabilities through the use of the signals. Formally, let
some agent i maintain a preference vector,

γ_i = (γ_i^j), j ∈ N\{i},   (1.1)

where γ_i^j is the preference status of agent i towards agent j, and p_s > p_0 > p_w are natural numbers denoting strengthen, untried and weaken preferences respectively. Initially all entries are set to p_0 for all j ∈ N\{i}. A probability vector r_i for each agent is constructed from the preference vector by simple normalisation onto the real line,
r_i^j = γ_i^j / Σ_{k ∈ N\{i}} γ_i^k,   j ∈ N\{i},   (1.2)
such that each opponent occupies a finite, non-zero length on the line [0, 1] with arbitrary ordering. Since we study here a model of mutual network/trust formation, preferences can be strengthened only by mutual agreement. Specifically, if agents i and j are paired to play the IPD, then when the interaction ends in iteration t ≤ τ,

γ_i^j = γ_j^i = { p_s   if s_t^i = s_t^j = #s,
                  p_w   otherwise,              (1.3)

where s_t^i denotes the play of agent i in iteration t. That is, in all cases other than mutual coordinated agreement, the two agents will lower their relative likelihood of being paired again (though the playing of #w might cause the interaction to end prematurely with the same result). Payoffs for each iteration of the PD are given by (1.4) below.
          #s       C        D        #w
  #s    (0,0)    (0,0)    (0,0)    (0,0)
  C     (0,0)    (3,3)    (0,5)    (0,0)
  D     (0,0)    (5,0)    (1,1)    (0,0)
  #w    (0,0)    (0,0)    (0,0)    (0,0)      (1.4)
The playing of signals is costly: the instantaneous cost for that period is the foregone payoff from a successful iteration of the IPD.
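A minimal encoding of the stage-game payoffs (1.4): any signal play pays both players zero (and would end the interaction), so the signal's cost is exactly the foregone payoff just described.

```python
# Stage-game payoffs (1.4) as a lookup: signals (#s, #w) pay both
# players zero against every action and end the interaction.

SIGNALS = {"#s", "#w"}

PD = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
      ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def payoff(a1, a2):
    """Per-iteration payoffs (to player 1, to player 2)."""
    if a1 in SIGNALS or a2 in SIGNALS:
        return (0, 0)
    return PD[(a1, a2)]

assert payoff("C", "D") == (0, 5)
assert payoff("#s", "C") == (0, 0)  # the signal's cost is the foregone payoff
```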
2.1 Game Play
In a period, each agent is addressed once in uniformly random order to undergo m interactions with players drawn from the rest of the population (N\{i}). An agent is paired randomly in accordance with its interaction probability vector r_i, with replacement after each interaction. Preference and probability vectors are updated after every interaction. Thus, it is possible that, having previously interacted with all agents, an agent retains only one preferred agent, whilst all others are non-preferred, causing a high proportion (if not all) of their m interactions to be conducted with
their preferred partner. However, it is to be noted that the value of m is only a minimum number of interactions for an agent in one period, since they will be on the 'receiving end' of other agents' interactions in the same period. In this way, agents who incur an immediate cost of tie strengthening (foregoing iteration payoffs) can gain a long-term benefit through further preferential interactions. At the end of T periods, the population undergoes selection. A fraction B of the population is retained (the 'elites'), whilst the remainder (1 − B) are replaced by new agents as described below. Selection is based on a ranking by total agent payoffs over the whole period. Where two agents have the same total payoff in a period, the older player remains.¹
2.2 Agent Modeling
Each agent is modeled as a k-state (maximum) FSA. Since signals (#x) have only one public interpretation, each state must include three transition responses: R(C), R(D) and R(#) (or just two in the case of unilateral stopping).² After each period, a fraction B will stay in the population, with the remaining agents being replaced by new entrants. Here, the process of imitation and innovation/mistake-making is implemented via two foundational processes from the genetic algorithm (GA) literature. Initially, two agents are randomly selected (with replacement) from the elite population. A one-point crossover operator is applied to each agent, and two new agents are formed. The strategy encodings (bit-strings) of these new agents then undergo point mutations at a pre-determined rate (5 bits per 1000). This process (random selection, crossover and mutation) continues until all the remaining spots are filled.
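The replacement step can be sketched as follows. The 16-bit string length is an arbitrary choice for illustration; the chapter's actual strategy encoding is described in its footnote 2 and in [4].

```python
import random

# Sketch of the replacement step: one-point crossover on two elite
# bit-strings followed by point mutation at 5 bits per 1000.

MUTATION_RATE = 5 / 1000

def crossover(a, b, rng=random):
    """One-point crossover: children swap tails at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(bits, rng=random):
    """Flip each bit independently with probability MUTATION_RATE."""
    return [bit ^ 1 if rng.random() < MUTATION_RATE else bit for bit in bits]

parent_a, parent_b = [0] * 16, [1] * 16
child_a, child_b = crossover(parent_a, parent_b)
child_a, child_b = mutate(child_a), mutate(child_b)
```

Repeating random selection, crossover and mutation until the (1 − B) vacant slots are filled completes one generation.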
3 Results & Discussion

3.1 Uniform interactions
To begin, we study a static uniform interaction space to check for any unwanted outcomes due to the modified IPD set-up. In this situation, rather than agents updating their preference vector after each interaction, the preference vector is uniform and unchanged throughout the model. In this way, the effect of the modification to the standard IPD framework can be analysed. Under such a scenario, the action set for each agent reduces to {C, D, #}, since the signal action # has no interaction-space interpretation but still provides a means of prematurely ending the interaction (thus we may drop the subscript). To keep matters simple, we consider a model in which the maximum interaction length τ = 2, which yields a maximum FSA state count of k = 3. Under these conditions, a strategy will be composed simply of a first play, and response plays to C and D.

¹Following SSA [5].
²To facilitate the computational modeling of this environment, agent strategies were encoded into binary format. See [4] for an analogous description of this method for FSA.
In this setting, no evolutionarily stable strategy will include # as a first play, since the payoff for such a strategy against any other agent is 0.³ This leaves strategies in the form of a triplet,

S : {P1, R(C), R(D)},

where P1 ∈ {C, D} and R(·) ∈ {C, D, #} indicates the subsequent play in response to either a C or a D play by the opponent. In all, 18 unique strategies can be constructed. It is instructive to consider whether cooperative strategies might be evolutionarily stable in this scenario. Clearly, a strategy S_C : {C, C, C} will yield strictly worse payoffs than the strategy S_D : {D, D, D} in a mixed environment of the two. However, it can be shown⁴ that the strategy S_A : {C, C, #} is uniquely evolutionarily stable in an environment of S_D only. However, S_A is itself susceptible to attack by a 'mimic' agent such as S_B = {C, D, D}, which itself will yield to the familiar S_D. In this way, even with the added facility of being able to end the interaction prematurely, the only evolutionarily stable strategy with respect to the full strategy space is that of S_D. Any intermediate resting place for the community will soon falter and move to this end.
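The τ = 2 interactions behind this argument can be checked directly. The sketch below scores two triplet strategies against each other under the payoffs of (1.4): S_D earns 5 against S_A while S_A earns 6 among its own kind (so S_D cannot invade), but the mimic S_B earns 8 against S_A and can.

```python
# Sketch of the tau = 2 analysis: a strategy is a triplet
# {P1, R(C), R(D)}; a signal (#) pays zero and ends the interaction.

PD = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
      ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def step(a1, a2):
    return (0, 0) if "#" in (a1, a2) else PD[(a1, a2)]

def interact(s, t):
    """Total payoffs when triplet s meets triplet t over two iterations."""
    p1, q1 = s[0], t[0]
    r1, r2 = step(p1, q1)
    if "#" in (p1, q1):                  # a first-play signal ends it
        return r1, r2
    idx = {"C": 1, "D": 2}
    a1, a2 = step(s[idx[q1]], t[idx[p1]])
    return r1 + a1, r2 + a2

SD = ("D", "D", "D")   # always defect
SA = ("C", "C", "#")   # cooperate, but exit on defection
SB = ("C", "D", "D")   # the 'mimic' that exploits SA

assert interact(SA, SA) == (6, 6)  # SA thrives among its own kind
assert interact(SD, SA) == (5, 0)  # SD earns 5 < 6: it cannot invade SA
assert interact(SB, SA) == (8, 3)  # but the mimic SB earns 8 and can
```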
3.2 Uniform Interactions: Computational Results
Computational experiments were run under uniform mixing as described above as a method of model validation. As predicted, the model showed the clear dominance of S_D under uniform mixing. Additionally, the initial 'shake-out' periods (t < 30) gave rise to interesting wave-like strategic jostling. Agents playing cooperation first and replying to D with # were the first to have an early, if short-lived, peak, which is not unexpected, since playing the signal is not the best response to any subsequent play. Thereafter TFT-nice (C) peaked, but was soon overcome by the turn-coat type (which dominates TFT-nice). However, as the stock of C players diminished, 'turn-coat' too yielded to the D-response type strategies (such as TFT-nasty (D)). We may conclude, then, that the presence of the signal play (#) does little to affect strategic outcomes in the standard IPD set-up; defection still reigns supreme in the uniform IPD environment. The impact of network formation decisions by agents was parameterised in the computational experiments as follows,
p_w = (1 − η)²,   (1.5)
p_s = (1 + η)²,   (1.6)
where η ∈ [0, 1). The choice of the expression is somewhat arbitrary; however, the current specification retains symmetry about p_0 = 1 for all values of η, and by

³The interaction would end after the first iteration, and g(#x | y) = 0 for all x ∈ {s, w} and y ∈ {C, D, #s, #w}.
⁴Proofs available from the author on request.
taking the squared deviation from 1, the ratio p_s/p_w could be easily varied over a wide range. To determine what conditions are favourable for network formation, a second computational experiment was conducted, this time 'turning up' the interaction-space impact of any signalling play by the agents. Specifically, the network tuning parameter η was varied in the range [0.2, 0.95] together with the minimum interaction parameter m over [2, 20]. It was found that necessary conditions for sustainable network formation were η ≥ 0.8 and m ≥ 10. In terms of the population, these accord with a ratio of p_s to p_w (by (1.5), (1.6)) of around 80 times,⁵ and a minimum fraction of interactions per period of around 10% of the population. Further, the fraction of mutual cooperative plays (of all PD plays) moved in a highly correlated way with degree. It would appear, therefore, that network formation in this model is due to agents who play C first and have R(C) = #s.⁶ A closer look at the dynamics of prevalent strategies under network-forming conditions confirms this conclusion.
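The parameterisation (1.5)-(1.6) and the normalisation (1.2) fit in a few lines; the threshold η = 0.8 found in the experiments reproduces the "around 80 times" ratio quoted above, since (1.8/0.2)² = 81.

```python
# Network-impact parameterisation (1.5)-(1.6) and normalisation (1.2).

def preference_weights(eta):
    """p_s and p_w as functions of the network tuning parameter eta."""
    assert 0.0 <= eta < 1.0
    return (1.0 + eta) ** 2, (1.0 - eta) ** 2

def interaction_probs(prefs):
    """Normalise a preference vector {j: gamma_i_j} as in (1.2)."""
    total = sum(prefs.values())
    return {j: g / total for j, g in prefs.items()}

p_s, p_w = preference_weights(0.8)
print(round(p_s / p_w))  # 81
```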
Figure 1: Example network dynamics (m = 20, η = 0.8): network state at the end of the indicated period (panels (a) 10, (b) 12, (c) 13, (d) 16); agent ID shown next to each node; agent coloring distinguishes robust cooperative, robust defection, opportunist and 'sucker' types (see text for explanation).

⁵That is, an agent is 80 times more likely to interact with a preferred agent rather than a disliked agent in a given period, based on a two-agent comparison.
⁶Recall, agents are free to form networks with any kind of behavioural basis.
3.3 Multiple Equilibria & the Long Run
In the previous section, conditions were identified in which stable networks were formed under parsimonious agent specification (τ = 2, implying k = 3) to enable correlation with established results in the analytic literature. Here, this constraint is relaxed and instead agent interactions of up to four iterations of the IPD game (τ = 4) are considered and their long-run dynamics studied. Recall, by increasing the length of the IPD game, the maximal FSA state count increases markedly: for τ = {3, 4} the maximum state count is k = {7, 15}. Previous conditions were retained, with η = 0.8 and m = 20, and each trial allowed to run for 1000 periods. Since a full description of the state is not feasible,⁷ we consider an aggregate description of two fundamental state characteristics: f(C,C), the fraction of plays in a period where mutual cooperation is observed (strategic behaviour); and ⟨d⟩, mean agent degree (network formation). Results are presented for five long-run trials in Fig. 2. Under low interaction length the system moves within 100 steps to one of two stable equilibria: either a stable cooperation network is formed (as was studied in the previous section) or no network arises and a stable defection population sets in. However, as the interaction length increases (and so the associated complexity of behaviour that each agent can display), the dynamics become increasingly erratic, with multiple, apparently stable, equilibria visible in each case, but transient transitions between these equilibria observed. This situation is synonymous with that of complex system dynamics.
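The state counts quoted above follow a simple pattern, k = 2^τ − 1. The following one-liner is only an observation on the quoted values (it is not the authors' FSA construction):

```python
# Maximal FSA state count k for interaction length tau. The text quotes
# tau = 2 -> k = 3 and tau = {3, 4} -> k = {7, 15}; these match k = 2**tau - 1.
def max_states(tau):
    return 2 ** tau - 1

print([max_states(t) for t in (2, 3, 4)])   # [3, 7, 15]
```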
Figure 2: Long-run system dynamics under different maximum interaction lengths indicating increasing complexity; five trials shown at each value of k (data smoothed over 20 steps). Panels: (a) k = 2, ⟨d⟩; (b) k = 2, f(C,C); (c) k = 3, ⟨d⟩; (d) k = 3, f(C,C); (e) k = 4, ⟨d⟩; (f) k = 4, f(C,C).
Surprisingly, such complex dynamics arise in a relatively simple model of network formation. Recall that the longest that any of the agent interactions could be in these studies was just two, three or four iterations of the modified Prisoner's Dilemma. To be very sure that such dynamics are not a consequence of the encoding of the automata themselves, an identical study was run with τ = 4, but setting η = 0 such that all interactions would continue to be of uniform probabilities. However, in all cases, the system moved to a zero cooperation regime within the first 100 periods and remained there. We conclude that endogeneity of network formation is driving such complex dynamics as observed above.

⁷Consider that each time period, a population constitutes n × |s| bits, where |s| is the length of a string needed to represent each agent's strategy, and the network n(n − 1)/2 bits; taken together, this gives rise to a possible 2^{n(n−1)/2 + n|s|} states, which is astronomically large even for τ = 2. (It is possible to reduce this number by conducting automata autopsies, but the problem remains.)
4 Conclusions
In contrast to previous attempts to capture the dynamics of strategic network formation (e.g. [5]), the present model provides a relatively simple foundation, but a powerfully rich behavioural and topological environment within which to study the dynamics of strategic network formation. Analytical and subsequent computational components of the present paper indicate that in this simple modified IPD set-up, cooperation is not sustainable without the additional benefits conferred by the type-selection and type-protection network externalities. Furthermore, even with parsimonious descriptions of boundedly-rational agent strategies, complex dynamics are observed in this model, with multiple and transient stationary locations a feature of the state space.⁸ These dynamics increased in complexity with increasing agent 'intelligence'.
Bibliography

[1] Elgazzar, A.S., "A model for the evolution of economic systems in social networks", Physica A: Statistical Mechanics and its Applications 303, 3-4 (2002), 543-551.

[2] Lindgren, Kristian, "Evolutionary phenomena in simple dynamics", Artificial Life II (C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, eds.), vol. X of Santa Fe Institute Studies in the Science of Complexity, Santa Fe Institute, Addison-Wesley (1992); Proceedings of the Workshop on Artificial Life held February 1990 in Santa Fe, New Mexico.

[3] Masuda, Naoki, and Kazuyuki Aihara, "Spatial prisoner's dilemma optimally played in small-world networks", Physics Letters A 313 (2003), 55-61.

[4] Miller, John H., Carter T. Butts, and David Rode, "Communication and cooperation", Journal of Economic Behavior & Organization 47 (2002), 179-195.

[5] Smucker, M. D., E. A. Stanley, and D. Ashlock, "Analyzing social network structures in the iterated prisoner's dilemma with choice and refusal", Technical Report CS-TR-94-1259, University of Wisconsin-Madison, Department of Computer Sciences, 1210 West Dayton Street, Madison, WI (1994).

⁸Compare Lindgren's classic paper with similar conclusions [2].
Chapter 13
Mathematical model of conflict and cooperation with non-annihilating multi-opponent

Khan Md. Mahbubush Salam
Department of Information Management Science, The University of Electro-Communications, Tokyo, Japan
[email protected]

Kazuyuki Ikko Takahashi
Department of Political Science, Meiji University, Tokyo, Japan
[email protected]
We introduce first our multi-opponent conflict model and consider the associated dynamical system for a finite collection of positions. Opponents have no strategic priority with respect to each other. The conflict interaction among the opponents only produces a certain redistribution of the common area of interests. The limiting distribution of the conflicting areas, as a result of infinite conflict interaction for existence space, is investigated. Next we extend our conflict model and propose a conflict and cooperation model, where some opponents cooperate with each other in the conflict interaction. Here we investigate the evolution of the redistribution of the probabilities with respect to the conflict and cooperation composition, and determine invariant states by using computer experiments.
1 Introduction
Decades of research on social conflict have contributed to our understanding of a variety of key social and community-based aspects of conflict escalation. However, the field has yet to put forth a formal theoretical model that links these components to the basic underlying mechanisms. This paper presents such models: dynamical-systems models of conflict and cooperation. We propose that it is particularly useful to conceptualize ongoing conflict as a dynamical system. In biology and social science, conflict theory states that a society or organization functions in such a way that each individual participant and its groups struggle to maximize their benefits, which inevitably contributes to social change such as changes in politics and revolutions. This struggle generates conflict interaction. Usually conflict interaction takes place at the micro level, i.e. in individual interactions, or at the semi-macro level, i.e. in group interactions. These interactions then have an impact at the macro level. Here we would like to highlight the relation between macro level phenomena and semi-macro level dynamics. We construct a framework of conflict and cooperation models by using group dynamics. First we introduce a conflict composition for multiple opponents and consider the associated dynamical system for a finite collection of positions. Opponents have no strategic priority with respect to each other. The conflict interaction among the opponents only produces a certain redistribution of the common area of interests. We have developed this model based on some recent papers by V. Koshmanenko, which describe a conflict model for two non-annihilating opponents. We show how segregation emerges in society by means of conflict among races. Next we extend our conflict model to a conflict and cooperation model, where some opponents cooperate with each other in the conflict interaction. Here we investigate the evolution of the redistribution of the probabilities with respect to the conflict and cooperation composition, and determine invariant states.
2 Mathematical model of conflict with multi-opponent
In some recent papers V. Koshmanenko (2003, 2004) describes a conflict model for two non-annihilating opponent groups through their group dynamics. But we observe that there are many multi-opponent situations in our social phenomena, where the parties are in conflict with each other. For example, multiple races (e.g., Black, White, Chinese, Hispanic, etc.), multiple religions (e.g., Islam, Christianity, Hinduism, etc.) and different political opinions exist in society, and because of their differences they have conflicts with each other. Therefore it is very important to construct a conflict model for the multi-opponent situation to understand realistic conflict situations in society. In order to give a good understanding of our model to the reader, we first explain it for the case of four opponents, denoted by A_1, A_2, A_3 and A_4, and four positions. We denote by Ω = {ω_1, ω_2, ω_3, ω_4} the set of positions which A_1, A_2, A_3 and A_4 try to occupy. Hence ω_1, ω_2, ω_3 and ω_4 represent different positions in Ω. By a social scientific interpretation, each ω_j, j = 1, 2, 3, 4 represents an area of a big city Ω. Let μ_0, ν_0, γ_0 and η_0 denote the probability measures on Ω. We define the probability
that the opponents A_1, A_2, A_3 and A_4 occupy the position ω_j, j = 1, 2, 3, 4, with probabilities μ_0(ω_j), ν_0(ω_j), γ_0(ω_j) and η_0(ω_j) respectively. As we are thinking about probability measures, and a priori the opponents are assumed to be non-annihilating, it holds that

    Σ_{j=1}^{4} μ_0(ω_j) = 1,   Σ_{j=1}^{4} ν_0(ω_j) = 1,   Σ_{j=1}^{4} γ_0(ω_j) = 1,   Σ_{j=1}^{4} η_0(ω_j) = 1.   (1.1)
Since A_1, A_2, A_3 and A_4 are incompatible, this generates a conflicting interaction, and we express this mathematically in the form of a conflict composition. Namely, we define the conflict composition in terms of the conditional probability to occupy, for example, ω_1 by each of the opponents. Therefore for the opponent A_1 this conditional probability should be proportional to the product

    μ_0({ω_1}) × ν_0({ω_2}, {ω_3}, {ω_4}) × γ_0({ω_2}, {ω_3}, {ω_4}) × η_0({ω_2}, {ω_3}, {ω_4}).   (1.2)

We note that this corresponds to the probability for A_1 to occupy ω_1 and the probability for A_2, A_3 and A_4 to be absent from that position ω_1. Similarly for the opponents A_2, A_3 and A_4 we define the corresponding quantities. As a result, we obtain a re-distribution of the conflicting areas. We can repeat the above described procedure an infinite number of times, which generates a trajectory of the conflict dynamical system. The limiting distribution of the conflicting areas is investigated. The essence of the conflict is that the opponents A_1, A_2, A_3 and A_4 cannot simultaneously occupy a questionable position ω_j. Given the initial probability distribution

    P^(0) = ( p_11^(0)  p_12^(0)  p_13^(0)  p_14^(0) )
            ( p_21^(0)  p_22^(0)  p_23^(0)  p_24^(0) )
            ( p_31^(0)  p_32^(0)  p_33^(0)  p_34^(0) )
            ( p_41^(0)  p_42^(0)  p_43^(0)  p_44^(0) )   (1.3)

the conflict interaction for each opponent for each position is defined as follows:

    p_11^(1) := (1/z_1^(0)) p_11^(0) (1 − p_21^(0)) (1 − p_31^(0)) (1 − p_41^(0)),
    p_12^(1) := (1/z_1^(0)) p_12^(0) (1 − p_22^(0)) (1 − p_32^(0)) (1 − p_42^(0)),   (1.4)

and so on, where the normalizing coefficient

    z_1^(0) = Σ_{j=1}^{4} p_1j^(0) (1 − p_2j^(0)) (1 − p_3j^(0)) (1 − p_4j^(0)).   (1.5)
Thus after one conflict the probability distributions change in the following way:

           ω_1       ω_2       ω_3       ω_4                 ω_1       ω_2       ω_3       ω_4
    A_1 ( p_11^(0)  p_12^(0)  p_13^(0)  p_14^(0) )    A_1 ( p_11^(1)  p_12^(1)  p_13^(1)  p_14^(1) )
    A_2 ( p_21^(0)  p_22^(0)  p_23^(0)  p_24^(0) )    A_2 ( p_21^(1)  p_22^(1)  p_23^(1)  p_24^(1) )
    A_3 ( p_31^(0)  p_32^(0)  p_33^(0)  p_34^(0) ) →  A_3 ( p_31^(1)  p_32^(1)  p_33^(1)  p_34^(1) )
    A_4 ( p_41^(0)  p_42^(0)  p_43^(0)  p_44^(0) )    A_4 ( p_41^(1)  p_42^(1)  p_43^(1)  p_44^(1) )   (1.6)

Thus by induction, after the kth conflict the probability distributions change in the following way:

           ω_1       ω_2       ω_3       ω_4
    A_1 ( p_11^(k)  p_12^(k)  p_13^(k)  p_14^(k) )
    A_2 ( p_21^(k)  p_22^(k)  p_23^(k)  p_24^(k) )
    A_3 ( p_31^(k)  p_32^(k)  p_33^(k)  p_34^(k) )
    A_4 ( p_41^(k)  p_42^(k)  p_43^(k)  p_44^(k) )   (1.7)

The general formulation of this model for multiple opponents and multiple positions, and its theorem for the limiting distribution, is given in our recent paper Salam, Takahashi (2006). We also investigated this model using empirical data, but because of the page restriction we cannot include that in this paper.
2.1 Computer Experimental Results
In our simulation results M^(0) is the initial matrix, where row vectors represent the distribution of each race. There are four races, white, black, Asian and Hispanic, denoted by A_1, A_2, A_3 and A_4 respectively. ω_1, ω_2, ...., represent the districts of a city. All four races move to occupy these districts; thus the conflict appears. M^(∞) gives the convergent or equilibrium matrix. There are several graphs in each figure. Each graph shows the trajectory corresponding to one element of the matrix: in each graph the x-axis represents the number of conflicts and the y-axis represents the probability to occupy that position. In result 1, which is given below, we observe that opponent A_1 has the biggest probability in city ω_1, and after 9 interactions it occupies this city. Opponent A_2 has bigger probability to occupy cities ω_2 and ω_4. But in city ω_4 opponent A_4 has the biggest probability to occupy; since the opponents are non-annihilating, opponent A_2 gathers in ω_2 and occupies this city after 9 conflict interactions, and opponent A_4 occupies the city ω_4 after 9 conflict interactions. Opponent A_3 also has bigger probability to occupy cities ω_2 and ω_3. As the opponents are non-annihilating and A_2 occupies ω_2, opponent A_3 occupies ω_3 after 9 conflict interactions. Thus each race segregates into one of the cities. This result shows how segregation appears due to conflict.
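The iteration (1.4)-(1.5) is easy to reproduce. The following minimal sketch (ours, not the authors' code) applies the conflict composition to the initial matrix M^(0) of result 1:

```python
def conflict_step(P):
    """One conflict interaction, eqs. (1.4)-(1.5): each entry p_ij is
    multiplied by the probability that every other opponent is absent from
    position j, then each opponent's row is renormalized."""
    n = len(P)                      # number of opponents
    m = len(P[0])                   # number of positions
    out = []
    for i in range(n):
        row = []
        for j in range(m):
            prod = P[i][j]
            for k in range(n):
                if k != i:
                    prod *= 1.0 - P[k][j]
            row.append(prod)
        z = sum(row)                # normalizing coefficient z_i
        out.append([x / z for x in row])
    return out

# M^(0) from result 1 (rows A_1..A_4, columns w_1..w_4)
M = [[0.6, 0.1, 0.2, 0.1],
     [0.2, 0.3, 0.1, 0.4],
     [0.1, 0.4, 0.3, 0.2],
     [0.3, 0.1, 0.1, 0.5]]
for _ in range(60):
    M = conflict_step(M)
print([round(p, 3) for p in M[0]])  # -> [1.0, 0.0, 0.0, 0.0]: A_1 holds w_1
```

Each row converges to a point mass on 'its' district, i.e. M^(∞) is the identity matrix, matching the segregation result described above.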
Result 1:

              ω_1  ω_2  ω_3  ω_4                  ω_1  ω_2  ω_3  ω_4
    M^(0) = A_1 ( 0.6  0.1  0.2  0.1 )   M^(∞) = A_1 ( 1  0  0  0 )
            A_2 ( 0.2  0.3  0.1  0.4 )           A_2 ( 0  1  0  0 )
            A_3 ( 0.1  0.4  0.3  0.2 )           A_3 ( 0  0  1  0 )
            A_4 ( 0.3  0.1  0.1  0.5 )           A_4 ( 0  0  0  1 )

[Figure: one graph per matrix element, plotting the probability to occupy each position against the number of conflicts; each opponent A_i converges to occupy position ω_i after 9 conflict interactions.]
3 Mathematical Model of Conflict and Cooperation
Suppose that A_1 and A_2 cooperate with each other in this conflict interaction. We express this mathematically in the form of a conflict and cooperation composition. Namely, we define the conflict and cooperation composition in terms of the conditional probability to occupy, for example, ω_1 by each of the opponents. Therefore for the opponents A_1 and A_2 this conditional probability should be proportional to the product
    [μ_0({ω_1}) + ν_0({ω_1}) − μ_0({ω_1}) × ν_0({ω_1})] × γ_0({ω_2}, {ω_3}, {ω_4}) × η_0({ω_2}, {ω_3}, {ω_4}).   (1.8)

We note that this corresponds to the probability for A_1 and A_2 to occupy ω_1 and the probability for A_3 and A_4 to be absent from that position ω_1. For the opponent A_3 this conditional probability should be proportional to the product

    γ_0({ω_1}) × μ_0({ω_2}, {ω_3}, {ω_4}) × ν_0({ω_2}, {ω_3}, {ω_4}) × η_0({ω_2}, {ω_3}, {ω_4}).   (1.9)

Similarly for the opponent A_4 we define the corresponding quantities. As a result, we obtain a re-distribution of the conflicting areas. We can repeat the above described procedure an infinite number of times, which generates a trajectory of the conflict and cooperation dynamical system. The limiting distribution of the conflicting areas is investigated by using computer experiments. Given the initial probability distribution (1.3), the conflict and cooperation composition for each opponent for each position is defined analogously to (1.4), e.g.

    p_11^(1) := (1/z_1^(0)) [p_11^(0) + p_21^(0) − p_11^(0) p_21^(0)] (1 − p_31^(0)) (1 − p_41^(0)),

and so on, where the z_i^(0) are the normalizing coefficients. Thus after one conflict the probability distributions change as in (1.6), but the quantities are different from the previous model; and by induction, after the kth conflict the probability distributions change as in (1.7), but the quantities are also different.
3.1 Computer Experimental Results
In this computer experimental result opponents A_1 and A_2 cooperate with each other. We observe that in position ω_1 opponent A_1 has the biggest probability to occupy this position. As opponents A_1 and A_2 cooperate with each other, both of them occupy this position after 23 interactions. In position ω_3 opponent A_3 has the biggest probability to occupy, but as A_1 and A_2 cooperate with each other, they occupy this position after 23 interactions. Since the opponents are non-annihilating, opponents A_3 and A_4 occupy the positions ω_2 and ω_4 respectively.
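The cooperative variant can be sketched the same way. The following is again our own reading of (1.8)-(1.9), not the authors' code: the pair A_1, A_2 acts through its joint occupation probability μ + ν − μν, while the remaining opponents follow the plain conflict composition:

```python
def coop_pair_step(P, pair=(0, 1)):
    """One conflict-and-cooperation interaction: opponents in `pair` share
    the joint occupation probability p_a + p_b - p_a*p_b (eq. (1.8)); the
    others follow the plain conflict composition (eq. (1.9))."""
    n, m = len(P), len(P[0])
    a, b = pair
    joint = [P[a][j] + P[b][j] - P[a][j] * P[b][j] for j in range(m)]
    out = []
    for i in range(n):
        row = []
        for j in range(m):
            if i in pair:
                prod = joint[j]
                rivals = [k for k in range(n) if k not in pair]
            else:
                prod = P[i][j]
                rivals = [k for k in range(n) if k != i]
            for k in rivals:
                prod *= 1.0 - P[k][j]
            row.append(prod)
        z = sum(row)                # normalizing coefficient z_i
        out.append([x / z for x in row])
    return out

# Same M^(0) as in result 1; now A_1 and A_2 cooperate.
M = [[0.6, 0.1, 0.2, 0.1],
     [0.2, 0.3, 0.1, 0.4],
     [0.1, 0.4, 0.3, 0.2],
     [0.3, 0.1, 0.1, 0.5]]
for _ in range(500):
    M = coop_pair_step(M)
# The cooperating pair comes to share w_1 and w_3 (probability 0.5 each),
# while A_3 settles on w_2 and A_4 on w_4, as in the experiment above.
```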
Result 2 (A_1 and A_2 cooperate):

              ω_1  ω_2  ω_3  ω_4                  ω_1  ω_2  ω_3  ω_4
    M^(0) = A_1 ( 0.6  0.1  0.2  0.1 )   M^(∞) = A_1 ( 0.5  0  0.5  0 )
            A_2 ( 0.2  0.3  0.1  0.4 )           A_2 ( 0.5  0  0.5  0 )
            A_3 ( 0.1  0.4  0.3  0.2 )           A_3 (  0   1   0   0 )
            A_4 ( 0.3  0.1  0.1  0.5 )           A_4 (  0   0   0   1 )

[Figure: one graph per matrix element, plotting the probability to occupy each position against the number of conflicts; the cooperating pair A_1, A_2 comes to share ω_1 and ω_3, while A_3 occupies ω_2 and A_4 occupies ω_4.]
4 Conclusion
Social, biological, and environmental problems are often extremely complex, involving many hard-to-pinpoint variables interacting in hard-to-pinpoint ways, and it is often necessary to make rather severe simplifying assumptions in order to be able to handle such problems; our model can be refined by including more parameters to cover broader conflict situations. Our conflict model did not have destructive effects. One way to alter this assumption is to make the population mortality rate grow with conflict efforts. We suspect these changes would dampen the dynamics. We observed that in the multi-opponent conflict model each opponent can occupy only one position, but because of cooperation, two opponents who cooperate with each other can occupy two positions with the same initial distribution. We emphasize that our framework differs from the traditional game theoretical approach. Game theory makes use of the payoff matrix, reflecting the assumption that the set of outcomes is known. The Nash equilibrium, the main solution concept in analytical game theory, cannot make precise predictions about the outcome of repeated games. Nor can it tell us much about the dynamics by which a population of players moves from one equilibrium to another. These limitations have motivated us to use stochastic dynamics in our conflict model. Our framework also differs from Schelling's segregation model in several respects. In particular, Schelling's results are derived from an extremely small population, and his model is limited to only two race-ethnic groups. Unlike Schelling's model, we do not suppose individuals' choices; here we consider the group's choice.
Bibliography

[1] Jones, A. J., "Game theory: Mathematical models of conflict", Horwood Publishing (1980).

[2] Koshmanenko, V. D., "Theorem on conflicts for a pair of stochastic vectors", Ukrainian Math. Journal 55, No. 4 (2003), 671-678.

[3] Koshmanenko, V. D., "Theorem of conflicts for a pair of probability measures", Math. Met. Oper. Res. 59 (2003), 303-313.

[4] Schelling, T. C., "Models of Segregation", American Economic Review, Papers and Proc. 59 (1969), 488-493.

[5] Salam, K. M. M., Takahashi, K., "Mathematical model of conflict with non-annihilating multi-opponent", Journal of Interdisciplinary Mathematics, in press (2006).

[6] Salam, K. M. M., Takahashi, K., "Segregation through conflict", Journal of Theoretical Politics, submitted.
Chapter 14
Simulation of Pedestrian Agent Crowds, with Crisis

M. Lyell, R. Flo*, M. Mejia-Tellez
Intelligent Automation, Inc.
[email protected]
*Air Force Research Laboratory
1.1. Introduction

Multiple application areas have an interest in pedestrian dynamics. These range from urban design of public areas to evacuation dynamics to effective product placement within a store. In Hoogendoorn et al [Hoogendoorn 2002] multiple abstractions utilized in simulations or calculations involving pedestrian agents include (1) cost models for selected route choice, (2) macroscopic pedestrian operations, and (3) microscopic behavior. A variety of mathematical and computational techniques have been used in studying aspects of pedestrian behavior, including regression models, queuing models that describe pedestrian movement from one node to another, macroscopic models that make use of Boltzmann-like equations, and microscopic approaches. Microscopic approaches include social force models and cellular automata models. The 'social force' models can involve ad hoc analogies to physical forces. For example, a floor may be viewed as having a 'repulsive' or 'attractive' force, depending on the amount of previous pedestrian traffic. Cellular automata models are based on pedestrian walking rules that have been gleaned from observations, such as those developed by Blue and Adler [2000]. Entities that constitute 'pedestrians' have undergone some development. Still [2000] includes additional rules on his cellular automata 'agents' in his modeling of crowd flow in stadiums and concourses. The agents in 'El Botellon' [Rowe and Gomez, 2003] have a 'bottle and conversation' tropism which was inspired by the social phenomenon of
crowds of people "wandering the streets in search of a party". This work modeled city squares as a set of nodes in a graph. Agents on a square acquire a probability of moving to another square; this probability lessens if there are other agents or a bar in the agent's current square. We are interested in agents that are more reflective of 'real people' who are pedestrians in an urban area, with goals that reflect the reason for their presence in the city. In the course of their trip to the urban area, a fire (or some other crisis) occurs. With such agents, there is no single original goal location. While some of the pedestrians want to keep as far away from the fire as possible, others might be assessing their personal business needs versus their safety needs. Their different considerations and responses should reflect their personalities, beliefs, and logical assessments. We adopt a software agent approach to the modeling of pedestrian agents in crowds. The pedestrian agents that we have developed incorporate cognitive and locomotive abilities, and have personality and emotion. The locomotive abilities are based on a translation of the Blue and Adler [2000] cellular automata rules into a software agent framework. Our model utilizes the OCC appraisal model, which allows for emotional effects in decision making. Note that our emotion list involves a slight extension from those of the OCC list. Our pedestrian software agent design also includes a belief structure in each agent that is consistent with its personality. The Five Factor personality model [Digman, 1990] provides the framework. These affective features are integrated into the agent's cognitive processes. Such a pedestrian software agent is hybrid in the sense that it also has physical locomotion ability. We note that coupling of psychological models with individual pedestrian agents in order to investigate crowd behavior is relatively new.
In addition to our work, reported upon here and in Lyell and Becker [2005], we note the work of Pelechano and colleagues [2005]. In their work, they couple previous work on behavior representation and performance moderators with a social force model. We also include police officer agents as part of the simulation framework. These officer agents do not incorporate a psychological/emotional framework. Rather, they encapsulate rules that reflect their training and characteristics; however, they do have the locomotive capabilities consistent with the other pedestrian agent types. The pedestrian software agents are hosted in an agent-based development and execution environment. After an initial prototype effort, we are in the process of developing a simulation framework in which to host pedestrian software agents; this will facilitate studies of pedestrian agent crowds. The paper is organized as follows. Section 2 discusses our earlier work and results from the prototype effort. Section 3 discusses our current effort on the simulation framework.
1.2. Early Work: The Prototype Effort

The focus of the prototype effort was three-fold: (1) incorporate both personality and emotion and locomotion frameworks into a pedestrian agent model, (2) conduct validation studies, and (3) conduct initial crowd simulation experiments. We briefly report on results in this section. Further details are found in Lyell and Becker [2005].
1.2.1. Pedestrian Agents, Personality Caricatures, Goals and Locomotion
For the initial effort, we considered three personality types, two of which were caricatures, for use in pedestrian agents. An extremely fearful (neurotic) personality, an excessively open, extremely curious personality, and an agreeable, social personality were utilized. Both the curious and the fearful (neurotic) personalities were designed to be caricatures. Each of the agent personality types was supported by an emotion set and a goal set. The goal set included higher level goals, such as "seek safety" or "attend to business goal". Not each personality type had the same goal set; the caricature personalities each had a sub-set of the possible goals. For example, an extremely fearful pedestrian did not have the "seek safety compassionately" goal as part of its possible goal set. Concrete actions that could be taken in support of a selected goal were (a) attempted movement to a new (calculated, specified) location and (b) message sending to another pedestrian agent (or agents). Actual movement to a new location utilized the walking rules from Blue and Adler [2000] that had been re-cast into an agent framework.
1.2.2. Verification and Validation Efforts

From the CMU Software Engineering Institute's web site [CMU-SEI 2006], verification asks "did you build the product right" (meet specs) and validation asks "did you build the right product". For the verification effort, we "turned off" the cognition in the pedestrian agents and recovered the walking behavior that was found from the Blue and Adler studies [2000]. One aspect, that of spontaneous lane formation of pedestrians moving in the same direction, is shown in Figure 1. This is an example of emergent behavior. Additionally, in the verification effort we also investigated separately the behavior of each of the three agent personality types, and found that they exhibited their expected behavior. Figure 2 shows this for the extremely curious agent type. The validation effort in the early work was provided by a 'reasonableness' check on the results. The next sub-section presents results for one of the initial investigations.
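For a flavor of cellular automaton locomotion of this kind, the toy model below moves bidirectional pedestrians on a periodic grid: step forward if the cell ahead is free, otherwise try a sideways lane change. This is a deliberately crude sketch of ours, not the actual Blue and Adler rule set or the agent framework described here, but it illustrates the mechanics (cell exclusion, forward motion, lane changes) on which emergent effects such as lane formation rest.

```python
import random

def step(grid_w, grid_h, agents):
    """One update of a toy bidirectional pedestrian CA: each agent moves
    forward if the cell ahead is free, otherwise tries a lane change."""
    occupied = {(a['x'], a['y']) for a in agents}
    random.shuffle(agents)                      # random sequential update
    for a in agents:
        ahead = ((a['x'] + a['dir']) % grid_w, a['y'])
        if ahead not in occupied:
            occupied.remove((a['x'], a['y']))
            a['x'] = ahead[0]
            occupied.add(ahead)
        else:
            for dy in random.sample((-1, 1), 2):  # try lanes in random order
                side = (a['x'], (a['y'] + dy) % grid_h)
                if side not in occupied:
                    occupied.remove((a['x'], a['y']))
                    a['y'] = side[1]
                    occupied.add(side)
                    break
    return agents

random.seed(1)
W, H, N = 50, 10, 20
cells = random.sample([(x, y) for x in range(W) for y in range(H)], N)
# half walk east (+1), half walk west (-1)
agents = [{'x': x, 'y': y, 'dir': 1 if i < N // 2 else -1}
          for i, (x, y) in enumerate(cells)]
for _ in range(100):
    step(W, H, agents)
```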
Figure 1: Spontaneous lane formation. Westward (red dots) moving pedestrians and eastward (blue dots) moving pedestrians separate into lanes. Not only is there lane formation; the entire westward flow has a sharp boundary with the eastward moving pedestrian flow. (Initial placement; lower density: 20 agents, 500 squares.)
Figure 2: Extremely curious pedestrian agent behavior: "learning about a fire means viewing a fire". The agents that are found at the goal sites (right hand side) are those that had traveled past the location before the fire had erupted.
1.2.3. Initial Investigations

We investigated several similar scenarios, each involving different pedestrian agent population mixes and different fire locations. Here, we present the result of one investigation, shown in Figure 3. For each of these initial investigations, the following characteristics held:

• Fearful Agent
  o If it learns of fire, will seek safety
  o Never pro-actively helpful with 'fire exists' messages
  o Will infrequently respond to direct questions from other agents
• Social/Agreeable Agents
  o Most complex emotional range
  o Proactively helpful: send 'fire exists' messages to nearby agents
• Officer: in all cases, moves towards fire, orbits fire, redirects adjacent pedestrians
  o All agent types obey police officer directive to leave area
• City Area Description
  o City grid 10 cells high, 50 cells wide
  o Agents enter left at metro
  o Business goals at upper and lower city edge (right)
  o Fire radius 2, fire appears at SimTime 200
[Figure content: annotated timelines for population mixes 50% A / 50% F, 10% A / 90% F, and 100% F, with distinguished simulation time points (e.g. 354, 414, 474) marking when agents learn of the fire from direct observation, from the officer's redirection, or from other agents' messages; in the 100% F case, 16 agents never learn of the fire.]

Figure 3: Results for different population mixes of Agreeable and Fearful type pedestrian agents. The axis represents simulation time, and distinguished time points are shown. The results represent multiple runs for each population mix.
1.3. Framework for Simulation of Pedestrian Agent Crowds

1.3.1. Why a Framework?

Among the drawbacks of the initial effort were:
• goal selection (dependent upon the environmental state, the emotional state of the agent, and on the agent's history) was restricted to a single goal,
• caricatures were used for two of the pedestrian agent personality types,
• the developed pedestrian agents were not 'tune-able',
• the urban geometry was too simple,
• it was difficult to simulate 'excursions' on the primary scenario,
• much of the simulation particulars were hard-coded rather than selectable.
A motivation for the development of a simulation framework that allows the study of pedestrian crowds in an urban area with a crisis situation is to enable simulation studies for which the user/analyst does not have to engage in software development. Multiple agent personality types should be provided for use in simulation variations. The urban area design should be configurable. In our current simulation infrastructure development effort, the goal is to support the user/analyst in devising, executing and analyzing simulations in the domain of pedestrian crowd modeling in an urban environment through the use of simulation framework services. In particular, the user/analyst will be able to:

• develop the geometry of the urban area using a template,
• develop realistic pedestrian agent personalities using templates,
• assign resources to the scenario (the resources include police officer agents),
• construct variations of the scenario, for different simulation investigations.

The variations may include: (a) utilization of different resources (objects or police officers), (b) utilization of different pedestrian population mixes, (c) different densities of pedestrians in the city area, (d) variations in the details of the city area (geometry, buildings, etc.). Of course, all of the simulation variations must be within the scope of "pedestrian agent crowds in crisis, with police officers". We are in the process of developing a software framework for simulating pedestrian agent crowds in an urban area. The control functionality for the crisis situation may be provided by the police officer agents and their interactions. The major framework elements are shown graphically in Figure 4. These include the aforementioned templates as well as the simulation application (simulation engine), which is layered over the Cybele agent platform. Note that there is an open source version of Cybele [CYB 2006].
Rule engine support for pedestrian agent emotional change and goal selection rules (developed using the agent builder template) is also provided.
Figure 4: Elements of the Simulation Framework for Pedestrian Agent Crowd Studies, including the Geometry Building Template and the Agent Building Template.
1.3.2. Agent Builder Templates for the User/Analyst
One of our challenges has been to develop specific pedestrian agent personalities in such a manner that the personalities are 'tune-able', within reason. Guidelines on the extent to which parameters may be varied are provided by the template. The psychological framework for the 'real personality' agents that underlies the agent builder template had to be developed; details are given in Lyell, Kambe-Gelke and Flo [2006]. The agent builder template presents to the user the aspects of an agent's beliefs, its allowable emotion set and range for each emotion, and initial emotional status. The emotion elicitors are presented in the context of situations that can occur in this simulation. The user has the ability to develop rules for goal selection. Rule development is guided by the template. A rule for goal selection involves a situation, with a given context element, and the presence of emotions specified within a range. Lists of allowable situations, contexts, and emotions are presented to the user for potential selection. The variable elements of the template provide the 'tune-able' range for the particular agent personality. The fixed elements have been developed for each of six pedestrian agent personality types that are offered by the agent builder template: (1) Social, (2) Egocentric, (3) Troublemaker, (4) Complainer, (5) Good and (6) Generic. Each of these types is consistent with the Five Factor personality model; each of their emotions and responses are consistent with the OCC model.
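A goal-selection rule of the kind described above, combining a situation, a context element, and emotions within specified ranges, can be sketched as follows. All names used here (situations, contexts, emotions, goals) are hypothetical placeholders, not the framework's actual vocabulary or API:

```python
from dataclasses import dataclass

@dataclass
class GoalSelectionRule:
    # All string values are illustrative; the real template presents
    # lists of allowable situations, contexts and emotions to the user.
    situation: str        # e.g. "explosion_heard"
    context: str          # e.g. "near_exit"
    emotion_ranges: dict  # emotion -> (low, high), e.g. {"fear": (0.6, 1.0)}
    goal: str             # e.g. "flee_area"

    def matches(self, situation, context, emotions):
        """Rule fires when situation and context match and every listed
        emotion's current intensity falls within its specified range."""
        return (situation == self.situation
                and context == self.context
                and all(lo <= emotions.get(e, 0.0) <= hi
                        for e, (lo, hi) in self.emotion_ranges.items()))

def select_goal(rules, situation, context, emotions):
    # First matching rule wins; fall back to a default goal.
    for rule in rules:
        if rule.matches(situation, context, emotions):
            return rule.goal
    return "continue_current_activity"
```

A rule engine of this shape restricts the user to template-approved vocabularies while leaving the numeric ranges 'tune-able'.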
References
Blue, V. and Adler, J., 2000, Cellular Automata Microsimulation of Bi-Directional Pedestrian Flows, Journal of the Transportation Research Board, Vol. 1678, 135-141.
CYB, Cybele Agent Platform, http://www.opencybele.org/ Retrieved May 22, 2006.
CMU-SEI, http://www.sei.cmu.edu/cmmi/presentations/euro-sepg-tutorial/tsldI23.htm. Retrieved May 22, 2006.
Digman, J., 1990, Personality Structure: Emergence of the five factor model, Ann. Rev. Psychology, 41, 417-440.
Hoogendoorn, S., Bovy, P., Daamen, W., 2002, Pedestrian Wayfinding and Dynamics Modeling, in Pedestrian and Evacuation Dynamics, Berlin: Springer-Verlag. Eds. Schrekenberg, M. and Sharma, S.D.
Lyell, M. and Becker, M., 2005, Simulation of Cognitive Pedestrian Agent Crowds in Crisis Situations, In Proceedings of the 9th World Multiconference on Systemics, Cybernetics, and Informatics.
Lyell, M., Kambe Gelke, G. and Flo, R., 2006, Developing Pedestrian Agents for Crowd Simulations, Proc. BRIMS 2006 Conference, 301-302.
Ortony, A., Clore, G., Collins, A., 1988, The Cognitive Structure of Emotions, Cambridge: Cambridge University Press.
Pelechano, N., O'Brien, K., Silverman, B., Badler, N., 2005, Crowd Simulation Incorporating Agent Psychological Models, Roles and Communication, CROWDS 05, First International Workshop on Crowd Simulation, Lausanne, Switzerland.
Rowe, J.E. and Gomez, R., 2003, El Botellon: Modelling the movement of crowds in a city, Journal of Complex Systems, Volume 14, Number 4.
Still, G.K., 2000, Crowd Dynamics, PhD Thesis, Mathematics Department, Warwick University.
Chapter 15
Traffic flow in a spatial network model
Michael T. Gastner
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501
[email protected]
A quantity of practical importance in the design of an infrastructure network is the amount of traffic along different parts in the network. Traffic patterns primarily depend on the users' preference for short paths through the network and spatial constraints for building the necessary connections. Here we study the traffic distribution in a spatial network model which takes both of these considerations into account. Assuming users always travel along the shortest path available, the appropriate measure for traffic flow along the links is a generalization of the usual concept of "edge betweenness". We find that for networks with a minimal total maintenance cost, a small number of connections must handle a disproportionate amount of traffic. However, if users can travel more directly between different points in the network, the maximum traffic can be greatly reduced.
1 Introduction
In the last few years there has been a broad interdisciplinary effort in the analysis and modeling of networked systems such as the world wide web, the Internet, and biological, social, and infrastructure networks [16]. A network in its simplest form is a set of nodes or vertices joined together in pairs by lines or edges. In many examples, such as biochemical networks and citation networks, the vertices exist only in an abstract "network space" without a meaningful geometric interpretation. But in many other cases, such as the Internet, transportation or communication networks, vertices have well-defined positions in literal physical space, such as computers in the Internet, airports in airline networks, or cell phones in wireless communication networks.
The spatial structure of these networks is of great importance for a better understanding of the networks' function and topology. Recently, several authors have proposed network models which depend explicitly on geometric space [1, 3, 6, 7, 9, 12, 13, 15, 17, 18, 19, 20, 21, 22]. In all of these models, nearby vertices are more likely to be connected than vertices far apart. However, the importance of geometry manifests itself not only in the tendency to build short edges, but also in the traffic flow on the network: given a choice between different paths connecting two (not necessarily adjacent) vertices in the network, users will generally prefer the shortest path. With few exceptions [2, 4], the literature on spatial networks has rarely analyzed traffic patterns emerging from the various models. To address this issue, this paper takes a closer look at one particular model [10] and analyzes the distribution of traffic along the edges in the network.
2 A model for optimal spatial networks
Suppose we are given the positions of n vertices, e.g. cities or airports, and we are charged with designing a network connecting these vertices together, e.g. with roads or flights. The efficiency of the network, as we will consider it here, depends on two factors. On the one hand, the smaller the sum of the lengths of all edges, the cheaper the network is to construct and maintain. On the other hand, the shorter the distances through the network, the faster the network can perform its intended function (e.g., transportation of passengers between nodes or distribution of mail or cargo). These two objectives generally oppose each other: a network with few and short connections will not provide many direct links between distant points and, consequently, paths through the network will tend to be circuitous, while a network with a large number of direct links is usually expensive to build and operate. The optimal solution lies somewhere between these extremes. Let us define l_ij to be the shortest geometric distance between two vertices i and j measured along the edges in the network. If there is no path between i and j, we formally set l_ij = ∞. Introducing the adjacency matrix A with elements A_ij = 1 if there is an edge between i and j and A_ij = 0 otherwise, we can write the total length of all edges as T = Σ_{i<j} A_ij l_ij. We assume this quantity to be proportional to the cost of maintaining the network. Clearly this assumption is only approximately correct: networked systems in the real world will have many factors affecting their maintenance costs that are not accounted for here. It is however the obvious first assumption to make and, as we will see, can provide us with good insight about network structure. Besides maintenance, there is also a cost Z due to traveling through the network for the user.
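These definitions translate directly into code. The following is a minimal sketch (plain Floyd-Warshall over Euclidean edge lengths), intended only as an illustration of the quantities T and l_ij, not the authors' implementation:

```python
import math

def maintenance_and_distances(xy, edges):
    """Given vertex coordinates xy[i] = (x, y) and an undirected edge list,
    return the total edge length T = sum over i < j of A_ij * l_ij, and the
    matrix d of shortest geometric distances measured along the network
    (d[i][j] = infinity if no path exists)."""
    n = len(xy)
    d = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0
    T = 0.0
    for i, j in edges:
        w = math.dist(xy[i], xy[j])   # Euclidean edge length
        T += w
        d[i][j] = d[j][i] = min(d[i][j], w)
    # Floyd-Warshall all-pairs shortest paths over edge lengths
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return T, d
```

For a chain of three unit-spaced vertices, for instance, T = 2 and the end-to-end network distance d[0][2] = 2.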
In a spirit similar to our assumption about maintenance costs, we will assume that the total travel cost is given by the sum of the distances between all vertex pairs. There is, however, one complicating factor. The travel costs are not necessarily proportional to geometric distances between vertices. In some cases, e.g. road networks, the quickest and cheapest route will indeed not be very different from the shortest route measured in kilometers. But in other
networks, travel costs depend more strongly on the graph distance, i.e. the number of legs in a journey. In an airline network, for instance, passengers often spend a lot of time waiting for connecting flights, so that they care more about the number of stopovers they have to make than about the physical distance traveled. To model both cases we introduce two different expressions for the travel costs. For a road network, these costs are approximately Z1 = Σ_{i<j} l_ij where l_ij is again the shortest geometric distance between i and j. For an airline network, a better approximation is Z2 = Σ_{i<j} h_ij where h_ij is the minimum number of legs in the journey. The total cost of running the network is then proportional to the sum T + γZ1 or T + γZ2, respectively, with γ ≥ 0 a constant that measures the relative importance of the two terms. The optimal network in our model is the one minimizing the thus defined total cost [5, 14].¹ The number of edges in the network depends on the parameter γ. If γ → 0, the cost of travel γZ_{1,2} vanishes and the optimal network is the one that simply minimizes the total length of all edges. That is, it is the minimum spanning tree (MST), with exactly n − 1 edges between the n vertices. Conversely, if γ → ∞ then Z_{1,2} dominates the optimization, regardless of the cost T of maintaining the network, so that the optimum is a fully connected network or clique with all ½n(n − 1) possible edges present. For intermediate values of γ, finding the optimal network is a non-trivial combinatorial optimization problem, for which we can derive good, though usually not perfect, solutions using the method of simulated annealing [8]. We show networks obtained in this manner in Fig. 1. For γ = 0.0002 the optimal networks are almost identical to MSTs independent of the users' preference for either short mileage or a small number of stopovers.
As γ increases, however, the two models show very distinct behaviors. In the first case the number of edges grows, whereas in the second case the networks remain trees for all but very large γ. If users wish to minimize graph distances, a small number of highly connected vertices appear with increasing γ. Like hubs in an airline network, these vertices collect most of the traffic from other vertices in the vicinity. On the other hand, if users care about geometric distances, there are no such hubs. These differences influence the traffic patterns as we will now show.
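A generic annealing search over networks of this kind can be sketched as follows. The move set (single random edge flips), cooling schedule and starting configuration are our illustrative assumptions, not the procedure of [8]; the sketch minimizes T + γZ1 for a handful of vertices:

```python
import math, random

def total_cost(xy, adj, gamma):
    """T + gamma * Z1, with Z1 the sum of shortest geometric distances
    over all vertex pairs (infinite if the network is disconnected)."""
    n = len(xy)
    d = [[math.inf] * n for _ in range(n)]
    T = 0.0
    for i in range(n):
        d[i][i] = 0.0
        for j in range(i + 1, n):
            if adj[i][j]:
                w = math.dist(xy[i], xy[j])
                T += w
                d[i][j] = d[j][i] = w
    for k in range(n):          # Floyd-Warshall
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    Z1 = sum(d[i][j] for i in range(n) for j in range(i + 1, n))
    return T + gamma * Z1

def anneal(xy, gamma, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over single edge flips; returns the best cost seen."""
    rng = random.Random(seed)
    n = len(xy)
    adj = [[i != j for j in range(n)] for i in range(n)]  # start from the clique
    cost = total_cost(xy, adj, gamma)
    best, temp = cost, t0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        adj[i][j] = adj[j][i] = not adj[i][j]             # flip one edge
        new = total_cost(xy, adj, gamma)
        # Metropolis criterion: always accept improvements,
        # accept uphill moves with a temperature-dependent probability
        if new <= cost or rng.random() < math.exp((cost - new) / temp):
            cost = new
            best = min(best, cost)
        else:
            adj[i][j] = adj[j][i] = not adj[i][j]         # undo the flip
        temp *= cooling
    return best
```

Since annealing only approximates the optimum, repeated runs with different seeds are the usual safeguard; the best cost found never exceeds the starting (clique) cost.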
3 Edge betweenness as a measure for traffic flow
The definitions of Z1 and Z2 imply three assumptions. First, there is an equal demand for traveling between all origin-destination pairs. Second, all edges have infinite capacities, so that there are no delays due to congestion. Third, all the traffic is along the shortest paths through the network, either measured
¹The critical reader might have noticed that Z1 has the dimension of a length whereas Z2 is dimensionless. In this paper, we will get rid of Z1's dimension by setting the average Euclidean "crow flies" distance between a vertex and its nearest neighbor equal to one. This will be accomplished by placing n vertices in a square of side 2√n and imposing periodic boundary conditions.
Figure 1: Networks minimizing T + γZ1 (top) and T + γZ2 (bottom) for different values of γ and n = 200 vertices each. The networks in the top row are obtained by minimizing geometric distances between vertices; the bottom row shows the results if the relevant distance for the user is the graph distance. The thickness of the edges represents the betweenness defined in Sec. 3. Note that we have imposed periodic boundary conditions, i.e., a line leaving the square at the top enters the square again at the bottom, and similarly a line at the left end reappears on the right.
by geometric or graph distance; in other words, users do not take intentional detours. The situation in real networks is, of course, more complicated, but these assumptions are a plausible starting point. An appropriate way to measure traffic flow under these assumptions is a generalization of the "edge betweenness" which was first introduced in [11]: we send one unit of flow between every possible origin-destination pair along the shortest path and count the number of units that have passed through one particular edge. Equivalently, the edge betweenness is the number of shortest paths in the network running along that edge. In [11], distances are measured as graph distances, but if the user measures path lengths as geometric distances, we can generalize the idea in an obvious manner. A sample calculation is shown in Fig. 2. For the models based on costs Z1 and Z2 we have constructed optimal networks for n = 200 randomly placed vertices and measured the betweenness of all edges for several values of γ. In Fig. 3, we plot the cumulative distributions, i.e., the fraction of the edges in the network whose betweenness is larger than a certain value b. Panel (a) shows the result for the first model where the user cost Z1 depends on geometric distances; panel (b) shows the distribution for our second model with the user cost Z2 depending on graph distances. For γ → 0, where the optimal networks are MSTs, both models possess a long-stretched tail indicating that in this case some edges, like main arteries, have to support a large
Figure 2: Calculation of the edge betweenness. (a) A simple illustrative network. Numbers refer to Euclidean edge lengths. (b) Edge betweennesses for the same network. For every pair of vertices, we send one unit of flow along the shortest geometric path between them. The amount of flow is indicated by bold numbers on the edges. Since there is no edge between A and C, the shortest path between these two vertices is A-D-C, which is slightly shorter than A-B-C. Therefore, the edges A-D and C-D have a betweenness of 2, all other edges a betweenness of 1. We could have also used graph distance instead of geometric distance, which amounts to setting all distances in (a) equal to one, but this generally gives different results. In terms of graph distance, there are, for example, two shortest paths between vertices A and C, namely via B and via D. Both paths would contribute one half unit of flow to the edge betweenness.
portion of the flow in the network. If such an edge fails, for example because of construction or congestion, many routes in the network will be affected. The distributions for both models become narrower as γ increases. The effect, however, is much stronger in the first than in the second model. For γ = 0.02, for example, no edge in the first model has a betweenness larger than 1200, whereas in the second model the maximum is around 3300. This difference is closely related to the different network structures. As pointed out in Sec. 2, the second model, unlike the first, possesses a small number of highly connected vertices. These hubs collect most of the traffic and, since the networks are trees, the traffic must inevitably pass through the few edges between the hubs, which explains their high betweenness. Networks generated by the first model, on the other hand, have no hubs but more edges, so that the maximum betweenness is smaller. Towards the left-hand side most curves in Fig. 3 have jumps at b = n − 1. These jumps are present if a large fraction of the vertices have a degree of one, because these vertices can only be reached along one edge, so that traffic from all other n − 1 vertices must go through that edge. Since the second model leads to increasingly many such "dead ends" as γ grows, the jumps in Fig. 3(b) become bigger. However, a smaller betweenness than n − 1 is possible, as the curve for γ = 0.2 in Fig. 3(a) proves. For γ → ∞ the optimal networks contain all ½n(n − 1) possible edges, hence every edge has in this limit a betweenness
Figure 3: Cumulative edge betweenness distributions for networks with n = 200 randomly placed vertices. (a) Distributions for networks minimizing the total cost T + γZ1, where the user costs depend on geometric distances. (b) Distributions for networks minimizing the total cost T + γZ2, where the user costs depend on graph distances. The curves correspond to the MST limit (γ → 0) and several increasing values of γ.
equal to one.
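The generalized edge betweenness can be reproduced with a brute-force calculation on a small square network like that of Fig. 2; the edge lengths below are our own illustrative assumptions (the figure's actual numbers are not reproduced here). One unit of flow travels between every vertex pair along the shortest geometric path; ties would be split equally:

```python
from itertools import permutations
from collections import defaultdict

def edge_betweenness(nodes, lengths):
    """Brute-force generalized edge betweenness for a tiny undirected graph.
    `lengths` maps frozenset({u, v}) -> geometric edge length. One unit of
    flow is sent between every vertex pair along the shortest path; if
    several shortest paths tie, the unit is split equally among them."""
    def simple_paths(s, t):
        others = [v for v in nodes if v not in (s, t)]
        for k in range(len(others) + 1):
            for mid in permutations(others, k):
                path = (s,) + mid + (t,)
                if all(frozenset(e) in lengths for e in zip(path, path[1:])):
                    yield path
    bet = defaultdict(float)
    for idx, s in enumerate(nodes):
        for t in nodes[idx + 1:]:
            paths = list(simple_paths(s, t))
            cost = lambda p: sum(lengths[frozenset(e)] for e in zip(p, p[1:]))
            best = min(cost(p) for p in paths)
            shortest = [p for p in paths if abs(cost(p) - best) < 1e-12]
            for p in shortest:
                for e in zip(p, p[1:]):
                    bet[frozenset(e)] += 1.0 / len(shortest)
    return dict(bet)

# Square A-B-C-D with no A-C edge; the lengths are illustrative assumptions.
L = {frozenset('AB'): 1.0, frozenset('BC'): 1.1,
     frozenset('CD'): 1.0, frozenset('AD'): 1.0}
b = edge_betweenness(['A', 'B', 'C', 'D'], L)
```

With these lengths, A-D lies on the shortest paths for the pairs (A,D), (A,C) and (B,D), so its betweenness is 3, while B-C carries only its own pair. Exhaustive path enumeration is only feasible for toy graphs; for the n = 200 networks of the text, Brandes-style algorithms are the standard approach.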
4 Conclusion
In this paper we have studied the traffic distribution in a spatial network model. The model is based on the optimization of maintenance costs, measured by the length of all edges, and the ease of travel, measured by the sum of all distances between vertex pairs. A single parameter γ determines the relative weight of both considerations. If users prefer short geometric distances, more edges are added to the network. On the other hand, if the user prefers short graph distances, a hub-and-spoke network emerges with only few additional edges. The traffic along one edge can be measured as its "betweenness", which is the number of shortest paths in the network using this edge. The cheapest network to maintain, the MST, has a small number of edges with very high betweenness. If more weight is given to user-friendliness, the highest betweenness in the network decreases. The effect, however, is stronger if we minimize geometric rather than graph distance. In the first case, the additional edges can reduce the traffic to a large extent, whereas in the second case, the connections between the hubs still carry a substantial amount of traffic. In our model, we assumed that all edges can in principle handle infinite amounts of traffic. For future work, one could consider edges with finite capacities, so that some edges along the shortest path might become congested and hence unavailable. This problem, however, possesses some non-trivial features [23] requiring a more careful analysis of the users' strategies.
5 Acknowledgments
Helpful discussions with M. E. J. Newman are acknowledged.
Bibliography [1] ALVAREZ-HAMELIN, Jose Ignacio, and Nicolas SCHABANEL, "An internet graph model based on trade-off optimization", European Physical Journal B 38 (2004), 231-237. [2] BARRAT, Alain, Marc BARTHELEMY, and Alessandro VESPIGNANI, "The effects of spatial constraints on the evolution of weighted complex networks", Journal of Statistical Mechanics (2005), P05003. [3] BARTHELEMY, Marc, "Crossover from scale-free to spatial networks", Europhysics Letters 63 (2003), 915-921. [4] BARTHELEMY, Marc, and Alessandro FLAMMINI, "Optimal traffic networks", Journal of Statistical Mechanics (2006), L07002. [5] BILLHEIMER, John W., and Paul GRAY, "Network design with fixed and variable cost elements", Transportation Science 7 (1973), 49-74. [6] FABRIKANT, Alex, Elias KOUTSOUPIAS, and Christos H. PAPADIMITRIOU, "Heuristically optimized trade-offs: A new paradigm for power laws in the internet", ICALP, vol. 2380 of Lecture Notes in Computer Science, Springer (2002), 110-112. [7] FLAXMAN, Abraham D., Alan M. FRIEZE, and Juan VERA, "A geometric preferential attachment model of networks", Preprint http://www.math.cmu.edu/aflp/Texfiles/NewGeoWeb.pdf (2006). [8] GASTNER, Michael T., Spatial Distributions: Density-equalizing map projections, facility location, and two-dimensional networks, PhD thesis, University of Michigan, Ann Arbor (2005). [9] GASTNER, Michael T., and Mark E. J. NEWMAN, "Shape and efficiency in spatial distribution networks", Journal of Statistical Mechanics (2006), P01015. [10] GASTNER, Michael T., and Mark E. J. NEWMAN, "The spatial structure of networks", European Physical Journal B 49 (2006), 247-252. [11] GIRVAN, Michelle, and Mark E. J. NEWMAN, "Community structure in social and biological networks", Proceedings of the National Academy of Sciences of the United States of America 99 (2002), 7821-7826. [12] GUIMERA, Roger, and Luis A.
Nunes AMARAL, "Modeling the world-wide airport network", European Physical Journal B 38 (2004), 381-385.
[13] KAISER, Marcus, and Claus C. HILGETAG, "Spatial growth of real-world networks", Physical Review E 69 (2004), 036103. [14] LOS, Marc, and Christian LARDINOIS, "Combinatorial programming, statistical optimization and the optimal transportation network problem", Transportation Research-B 16B (1982), 89-124. [15] MANNA, Subhrangshu S., and Parongama SEN, "Modulated scale-free network in Euclidean space", Physical Review E 66 (2002), 066114. [16] NEWMAN, Mark E. J., Albert-Laszlo BARABASI, and Duncan J. WATTS, eds., The structure and dynamics of networks, Princeton University Press, Princeton (2006). [17] PETERMANN, Thomas, and Paolo DE LOS RIOS, "Physical realizability of small-world networks", Physical Review E 73 (2006), 026114. [18] ROZENFELD, Alejandro F., Reuven COHEN, Daniel ben AVRAHAM, and Shlomo HAVLIN, "Scale-free networks on lattices", Physical Review Letters 89 (2002), 218701. [19] SEN, Parongama, Kinjal BANERJEE, and Turbasu BISWAS, "Phase transitions in a network with a range-dependent connection probability", Physical Review E 66 (2002), 037102. [20] WARREN, Christopher P., Leonard M. SANDER, and Igor M. SOKOLOV, "Geography in a scale-free network model", Physical Review E 66 (2002), 056105. [21] XULVI-BRUNET, Ramon, and Igor M. SOKOLOV, "Evolving networks with disadvantaged long-range connections", Physical Review E 66 (2002), 026118. [22] YOOK, Soon-Hyung, Hawoong JEONG, and Albert-Laszlo BARABASI, "Modeling the internet's large-scale topology", Proceedings of the National Academy of Sciences of the United States of America 99 (2002), 13382-13386. [23] YOUN, Hyejin, Michael T. GASTNER, and Hawoong JEONG, "The price of anarchy in transportation networks: efficiency and optimality control", Preprint arXiv:0712.1598 (2008).
Chapter 16
AUGMENTED NETWORK MODEL FOR ENGINEERING SYSTEM DESIGN
Gergana Bounova, Olivier L. de Weck
Massachusetts Institute of Technology
{gergana,deweck}@mit.edu
I. Introduction - Motivation
Models are usually domain-specific and traditionally even system-specific. As system complexity and data quantity creep up, there is an increasing need for general unifying models. Some examples are state-space and nonlinear dynamics models, mostly used in mathematical domains of engineering applications. Network modeling has been introduced more recently to unify the representation of systems across fields with abstract graph models. In this paper, we examine the relevance, benefits and deficiencies of network representation and analysis for engineering systems. The disadvantage of simple network models of engineering systems is that the hybrid nature and dynamics of nodes and links are not captured. In an acquaintance network, the relation of knowing someone is reversible and uniform across all node pairs (relationships). Geometry or order is mostly irrelevant. This is not the case for most engineering systems, where at any level of abstraction components are assembled or arranged in particular ways to work properly. Moreover, the nodes and links rarely can be put in the same category. These are hybrid networks: networks comprised of nodes, and possibly links, of different types. For example, modeling the components of a vehicle, the parts (or subsystems) of an airplane or the states of a formation of flying vehicles is not as simple as pointing out the nodes and the physical connections. Depending on the level of abstraction, links can be physical connections, such as an electric connector or a welded joint; influence connections, like magnetic fields, chemical bonds or concentration levels in fuel mixtures; or abstract connections, such as a transportation route. Links and nodes can have different capacities, costs, maintenance routines and dynamics. Link existence or strength can vary with time. All of these properties of real systems make a simple graph representation inadequate for a useful model.
Simple metrics from a pure graph model could still be useful if they are i) at the right level of abstraction, and ii) encoded at the right level of detail. We call this approach "augmented network modeling for engineering design".
In this paper, we propose a state-space-like augmented network model, with a high-level simple abstract network description and a deeper level of engineering detail description content. We give four examples, with their network description and a brief discussion of simple network statistics. Then, we discuss the benefits and deficiencies of network representation.
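A minimal sketch of what such an augmented representation might look like: typed nodes and links carry the deeper engineering detail, while a plain adjacency projection remains available for standard network analysis. The class and attribute names here are our illustrative assumptions, not a specification from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    kind: str                                    # e.g. 'orbit', 'surface', 'component'
    detail: dict = field(default_factory=dict)   # deeper engineering data

@dataclass
class Link:
    a: str
    b: str
    kind: str                                    # e.g. 'fuel burn', 'welded joint'
    directed: bool = False
    detail: dict = field(default_factory=dict)   # capacities, costs, dynamics

class AugmentedNetwork:
    """Hybrid network: typed nodes/links plus an abstract graph projection."""
    def __init__(self):
        self.nodes, self.links = {}, []

    def add_node(self, node):
        self.nodes[node.name] = node

    def add_link(self, link):
        self.links.append(link)

    def adjacency(self):
        # Strip types and detail to recover a simple graph for metrics.
        adj = {name: set() for name in self.nodes}
        for l in self.links:
            adj[l.a].add(l.b)
            if not l.directed:
                adj[l.b].add(l.a)
        return adj
```

The point of the two-layer design is that domain models plug in at the `detail` level while any standard graph algorithm operates on the `adjacency()` projection.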
II. Models - System Examples
An augmented network model is an attempt to capture domain-specific knowledge and yet be able to extract and analyze higher-level network properties. This method does not claim to solve the modeling problems of all systems imaginable, and it does require hard additional work for application adaptation. However, such a hybrid representation allows a general plug-in of many models to the same network analysis. To investigate the relevance of network models, we analyze four different systems, spanning biological, social and technological domains. These are described below.
1. Journal publication network for the MIT Engineering System Division
This is a social network example, consisting of 196 journals as nodes: two journals are connected if one faculty member publishes in both. All nodes and links are of the same type, and all links are bidirectional by definition. The data was gathered by taking a poll and recording citations. Figure 1 shows the entire dataset with its giant component and isolated clusters. The general journal theme is indicated for the different clusters. The clustering per topic confirms that every faculty member publishes in a certain area, as expected. The heterogeneous giant component (labeled "core") indicates that many researchers collaborate in an interdisciplinary fashion. The most connected journal is Management Science.
Figure 1: Journal publications network: two journals are connected if one faculty member publishes in both. 16 connected clusters are identified.
2. MAPK reference pathway network [1]
The Mitogen-Activated Protein Kinase (MAPK) pathways transduce a large variety of external signals, leading to a wide range of cellular responses, including growth, differentiation, inflammation and apoptosis. Proteins are modeled as nodes, and two proteins are connected if they interact. The data is experimentally verified, available from the KEGG database [2]. Nodes are of the same type (but different molecules). Links can vary depending on the type of interaction, such as activation or inhibition. This pathway was analyzed for three species: Drosophila, yeast and human. The analysis includes investigation of structural similarity via coarse-graining and motif analysis.
Figure 2: MAPK pathway representations: traditional representation (left) from the KEGG database [2] and graph theoretic representation of the same pathway (right).
3. Space transportation network model [3][4]
There are states in space which a spacecraft can occupy with minimal energy expense (e.g., orbiting). State transitions require energy leaps, in the form of fuel burns, provided by the vehicle itself or an external force (another vehicle, by-passing a planet, etc.). States can be modeled as nodes and transitions as links, so that the mission time-value is concentrated in the nodes, while the mission cost (energy spent) is contained mainly in the links. This is not a perfect assumption, because transitions are not instantaneous and states are not cost-free (e.g., stationkeeping and correction maneuvers). An example is shown in Figure 4. There are three types of nodes (surface, orbit and operational sequence nodes) and around 16 types of links, such as deorbit burn, orbit injection, landing and so on. This model was created with the purpose of generating and evaluating many architectures for lunar missions.
Figure 4: Space transportation network model for a lunar mission scenario. OPN (Object-Process Network [5]) representation (left) and graph theoretic plot (right).
III. Comparative Analysis, Network Statistics
For all the systems described above, we discuss structure and dynamics in view of network modeling. Traditional network metrics [6] are shown in Table I. A preliminary look at this limited set of data confirms the great variety of directed/undirected models, dense/sparse systems, and physical/abstract models. The average path length and diameter measures show the relative size of the network. For example, an average path length which is a small percentage of the number of nodes is one of the characteristics of a small world. For the systems presented here, technological systems are in general not small worlds, because of the effect of geometry and the importance of Euclidean distance. Degree correlations vary regardless of the type of system, sociological, biological or technological, contrary to claims that the domain matters [6][7]. Mean degree is smaller for technological systems in general, due to capacity, geometry and degrees-of-freedom constraints [8], but that varies from abstract to physical models. The space transportation network, as an abstract network, has an average nodal degree twice as high. Finally, all systems have different objectives for operations or performance measures. A lot of network analysis concentrates on understanding system structure, modules, cohesiveness and critical components or nodes. A few structural experiments were done to gauge the relevance of these methods for technological systems. The physically meaningful component breakdown was compared to the Newman-Girvan algorithm breakdown [6]. Due to space constraints, only one example is presented here. Figure 6 shows the natural breakdown of the space transportation model versus the Newman-Girvan partitioning. Earth and near-Earth nodes are grouped together on the left (with a rectangle). All transfer nodes together with Lagrange point orbits make another set (marked with filled circles). Finally, all near-Moon and lunar surface nodes are marked with unfilled circles.
The Newman-Girvan results are on the right, with the same markings for each group. The unmarked nodes are classified in 1-node clusters by the algorithm. In this, as well as with other datasets, the mismatch is evident. This means that clustering and linkage are designed in specific ways in mechanical systems, with characteristics and purpose that cannot necessarily be uncovered by traditional network analysis.
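Traditional metrics of the kind reported in Table I (mean degree, average path length, diameter) follow directly from the plain adjacency projection. A minimal hop-count sketch, assuming a connected undirected graph:

```python
from collections import deque

def basic_metrics(adj):
    """Mean degree, average shortest path length and diameter for a
    connected undirected graph given as {node: set_of_neighbours};
    distances are hop counts found by BFS from every node."""
    n = len(adj)
    mean_degree = sum(len(nbrs) for nbrs in adj.values()) / n
    dists = []
    for s in adj:
        seen = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dists.extend(d for t, d in seen.items() if t != s)
    return mean_degree, sum(dists) / len(dists), max(dists)
```

On a 4-node cycle, for instance, the mean degree is 2, the average path length 4/3 and the diameter 2; comparing such numbers to the node count is the quick small-world check mentioned above.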
Figure 6: Space network modularized: (left) natural division into Earth-related, orbit/trajectory-related and Moon-related nodes; (right) Newman-Girvan clusters.
Another focus in the network literature for system analysis is node centrality. Various measures of centrality exist, the simplest of which are degree centrality and betweenness centrality (also closeness, eigen-centrality, etc.). Degree distributions and node significance have been widely employed in understanding structure and dynamics of social networks. To see how these considerations relate to the variety of
systems we explore, we look at adjacency matrix dot alignments. That means aligning nodes against nodes and drawing a dot if a link between two of them exists. Depending on the node order in this plot, interesting patterns may or may not emerge. Figure 7 shows the matrix alignments for the systems in this study. The matrices are ordered by increasing nodal degree, increasing betweenness and then increasing eigen-centrality. It is evident that for different systems, different metrics contain the important structural information. For example, the degree plot does not tell much about the MAPK pathway. The betweenness plot, on the other hand, shows some structure. In the case of the technological networks, it can be argued that the design plot, which is probably recorded component-wise, contains the most information. In any case, our aim was to show that traditional network metrics and structural considerations cannot be applied to all systems equally.
Figure 7: Adjacency matrix dot-alignments, with node ordering by increasing degree, increasing betweenness, and eigen-centrality: a) ESD journal faculty network, b) MAPK human pathway, c) space transportation network model, including ordering by design.
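A dot alignment of this kind is simply the sparsity pattern of the adjacency matrix under a node permutation. As a minimal sketch (the edge list below is made up, not one of the studied systems), reordering by increasing degree is a one-line permutation in numpy:

```python
import numpy as np

# Made-up undirected edge list standing in for one of the studied systems.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5), (3, 5), (0, 4)]
n = 6
A = np.zeros((n, n), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1

# Permute rows and columns so nodes appear in order of increasing degree;
# plotting the nonzeros of A_ordered gives the "dot alignment".
order = np.argsort(A.sum(axis=0), kind="stable")
A_ordered = A[np.ix_(order, order)]
print(order.tolist(), int(A_ordered.sum()) // 2)  # node order, edge count
```

Ordering by betweenness or eigen-centrality swaps only the sort key; the permutation machinery is identical.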
In engineering systems, path finding can be more important than node analysis. This is because parts and components are more likely to be well-defined, engineered and understood, whereas the whole system might have emergent behavior. Often an engineer cares about change propagation (stress, cracks, innovation) or, in mission analysis, optimal path finding given thousands of options. Thus understanding the network relationships in the system has an inherently different purpose. Technological systems have to be operated, and often fulfill a single purpose or have some finite functional spectrum. The main difference is that they do not emerge, but are designed to operate. So any structural analysis should, at a minimum, uncover the schematics of the designer. As we saw, simple network models do not represent the structure of complex engineering systems very well. Our interest is less in structure, and more in dynamics, operation at various levels and eventually growth, expansion or shrinking of the system. A car frame propagates vibration and has to be integral, yet somewhat modular, to withstand design changes without complete replacement. A space transportation network only serves to provide route
optimization for mission planning. To serve a design challenge well, such a model has to convey the right level of information: where can one get to physically, and how much effort does it take? While there is politics involved in design decisions, one can only attempt to make an objective decision.
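The kind of query such a model must answer — the cheapest route by delta-V — is a weighted shortest-path problem. A minimal sketch follows; the node names echo the paper's conventions, but the delta-V values are invented for illustration:

```python
import networkx as nx

# Hypothetical delta-V costs (km/s) on a toy route graph; only the node
# naming (EO, TO, LLO) follows the paper, the numbers are illustrative.
G = nx.Graph()
G.add_weighted_edges_from([("Earth", "EO", 9.3), ("EO", "TO", 3.1),
                           ("TO", "LLO", 0.9), ("LLO", "Moon", 1.9),
                           ("EO", "LLO", 4.04)])

# Dijkstra shortest path by accumulated delta-V.
path = nx.shortest_path(G, "Earth", "Moon", weight="weight")
cost = nx.path_weight(G, path, weight="weight")
print(path, round(cost, 2))
```

Here the two-burn EO→TO→LLO route (4.0) narrowly beats the direct EO→LLO edge (4.04), so the planner prefers the longer hop sequence.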
Table I: Graph statistical measures for the journal publication network, the MAPK reference pathway, and the space transportation model.
Network type | n | m | m/n | Directed? | Hybrid nodes? | Hybrid links? | Dynamic (changes with time)? | Number of connected components | [max degree, mean degree, min degree] | Most connected node | Degree correlation (Pearson coeff.) | [max in-betw., mean in-betw., min in-betw.] | Most in-between node | Highest clustering coefficient node | Average clustering coefficient | # of triangle loops | # of rectangular loops | Network diameter | Mean path length
Journal publication network for an MIT community: Social, 196, 547, 2.791, No, No, No, Static (as modeled)
Tot: {16, 6.4, 1}, in: {8, 3.2, 1}, out: {10, 3.2, 0}; Tot: TO, in: 5, out: TO
-0.0229
-0.306
{1029, 375, 228} (5.25 times n)
{4639, 1035.5, 384}
Journal of Personality and Social Psychology; Many (85)
(=15)
0.0402 (GC 0.2011)
(32.44 times n) 121
Earth surface, LLO
0.807 (GC)
0.008
0.1389, in: 0.1792
697, 76
3, 400
10, 0
6 (3% n), 3.3 (55% d, 1% n)
17 (11% n), 6.454 (37% d, 4% n); Newman-Girvan: 4
7 (46% n), 2.5561 (36% d, 17% n)
# of communities
7
Performance metrics
Teamwork? Collective publishing success?
Cost metrics
Effective response to cell stimuli
Phys: 3, Newman-Girvan: 3 or 8; Overall mission success, easiness of getting from one location to another; Mission cost (fuel, life support, power)
IV. Augmented Network Model [4] The key characteristics of an augmented network model are that i) it contains more information than an edge list, ii) every model is domain-specific, and iii) network theoretical tools can be used for global analysis, especially in pointing to interesting areas for research.
Node refinement
As argued in previous sections, understanding and designing an engineering system requires domain knowledge and models augmented with engineering detail. We have tested a simple methodology for such proposed analysis using the space transportation model, described in Section II.3. Nodes have three types (surface, sequence and orbit) and are described with relevant names, types, geometrical coordinates, set of states, internal node time counter, and associated delta V (change in velocity magnitude), if any.

node(TO).name = 'TO';
node(TO).type = orbit;
node(TO).origin = EO;
node(TO).dest = L1;
node(TO).status.a = 205850;
node(TO).status.i = 0.4958;
node(TO).status.W = 0.2988;
node(TO).status.w = 4.3550;
node(TO).status.e = 0.9236;
node(TO).status.M = 0.2744;
node(TO).start = 0;
node(TO).fin = 0.5*Period(node(TO).status.a);
node(TO).t = 46472;
node(TO).x = [r; v] = [77989.448 84168.757 3109 0.993 0.613 1.900 0.885];
node(TO).child = [E_EDL EO TO L1 MO M_EDL ISS];
node(TO).parent = [E_Sub_Orbital EO TO L1 MO M_Sub_Orbital ISS];
node(TO).dV = {delta V for stationkeeping / orbital correction}
Figure 8: Example of transfer orbit (TO) node definition.
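The same augmented-node idea carries over to any language with records. A minimal Python sketch follows; the field names mirror Figure 8, but the class, types and defaults are our own illustration, not the authors' code:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Augmented network node: identity plus domain-specific state."""
    name: str
    ntype: str                                   # 'surface', 'sequence' or 'orbit'
    origin: str = ""
    dest: str = ""
    status: dict = field(default_factory=dict)   # orbital elements a, e, i, ...
    start: float = 0.0
    fin: float = 0.0
    children: list = field(default_factory=list)
    parents: list = field(default_factory=list)
    dV: float = 0.0                              # stationkeeping / correction delta-V

# Hypothetical instance in the spirit of Figure 8.
to = Node(name="TO", ntype="orbit", origin="EO", dest="L1",
          status={"a": 205850, "e": 0.9236, "i": 0.4958})
print(to.ntype, to.status["a"])
```

Keeping the engineering state in a structured record, rather than only an edge list, is what makes the later rules of dynamics expressible.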
Edge refinement
The next step is edge refinement. Given the model used (state-space, physical, geometrical), links can have different meanings. An acquaintance link can be described simply as a node pair, while a protein interaction edge can be augmented by environmental conditions under which interaction was detected, frequency of interaction, activation or inhibition, and so on. In the case of the space network model, a link has an associated delta V (fuel burn) and an associated type (describing the state transition). For example, the TMI link described in Figure 9 connects an orbit node to another orbit node, hence its type is orb2orb.
.val = 1; .name = 'TMI Burning'; .type = orb2orb; .dV = 3.1378;
Figure 9: Translunar injection link example. It is specified in an adjacency matrix, as a link between EO (Earth orbit) and TO (transfer orbit).

Rules of Dynamics
The use of a more detailed, design-relevant node and link description is to be able to encode rules of dynamics. For example, a link might be temporal; that is, it exists, but not at all times. There is a tight launch window from Earth orbit to lunar orbit, and then to the lunar surface, that depends on the desired landing location. In general, the rules of dynamics describe what transitions (links) are possible under what conditions. The system or parts of it can be simulated with given initial conditions or for a given purpose. The output, such as evaluating a pathway, can be linked to another set of models, for dedicated component design for example. After the physical simulation and a better understanding of the system, this process can be iterated by investigating different network perspectives of the entire model.
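One way to encode such a rule is to attach a validity window to each link and filter transitions by time. In the small sketch below only the idea of a launch-window-constrained link comes from the text; the window values and costs are invented:

```python
# Each link carries an optional list of (open, close) time windows; a
# transition is feasible only if the current time falls inside a window.
links = {
    ("EO", "TO"):  {"dV": 3.1, "windows": [(10.0, 12.0), (34.0, 36.0)]},
    ("TO", "LLO"): {"dV": 0.9, "windows": None},  # None: always available
}

def feasible(link, t):
    """Rule of dynamics: the link exists only inside one of its time windows."""
    w = links[link]["windows"]
    return w is None or any(lo <= t <= hi for lo, hi in w)

print(feasible(("EO", "TO"), 11.0),
      feasible(("EO", "TO"), 20.0),
      feasible(("TO", "LLO"), 20.0))
```

A path search that consults `feasible` at each expansion step then respects launch windows automatically, which a static edge list cannot express.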
V. Conclusion We have presented systems of different domains, sizes and levels of heterogeneity. The relevance of network modeling varies in each case, because of domain biases and data collection problems, as well as the implications for analysis and design. For example, social network studies have long been employed to study community structure, functional clusters and prominent nodes. Network structure and topology studies are fairly straightforward to do with homogeneous networks. In biological and technological systems, domain knowledge becomes essential. We discussed an example of a technological system where structure is not simple to detect, and which cannot be modeled with homogeneous models. Finally, we provided a limited description of an augmented methodology for engineering design. Our model has been applied to design in the case of the space transportation network [4].
References
1. MAPK Signaling Pathway Analysis, Advanced Systems Architecture Final Report, May 16, 2006
2. KEGG database, source: http://www.genome.jp/kegg/pathway.html
3. Bounova et al., Selection and Technology Evaluation of Moon/Mars Transportation Architectures, AIAA-2005-6790, Space 2005, Long Beach, California
4. Bounova et al., Space Transportation Network Model for Rapid Lunar Architectures Analysis, Space Transportation Symposium, 57th International Astronautical Congress, 2006
5. Simmons et al., Mission Mode Architecture Generation for Moon-Mars Exploration Using an Executable Meta-Language, AIAA-2005-6726, Space 2005, Long Beach, California, August 30 - September 1, 2005
6. Newman, M. E. J., The structure and function of complex networks, SIAM Review 45, 167-256 (2003)
7. Whitney, D., Alderson, D. (2006). "Are technological and social networks really different?" InterJournal
8. Whitney, D., Connectivity Limits of Mechanical Assemblies Modeled as Networks, ESD Working Paper Series
Chapter 17
Network Models of Mechanical Assemblies
Daniel E. Whitney
Engineering Systems Division
Massachusetts Institute of Technology
[email protected]
1.1. Introduction
Recent network research has sought to characterize complex systems with a number of statistical metrics, such as power law exponent (if any), clustering coefficient, community behavior, and degree correlation. Use of such metrics represents a choice of level of abstraction, a balance of generality and detailed accuracy. It has been noted that "social networks" consistently display clustering coefficients that are higher than those of random or generalized random networks, that they have small world properties such as short path lengths, and that they have positive degree correlations (assortative mixing). "Technological" or "non-social" networks display many of these characteristics except that they generally have negative degree correlations (disassortative mixing) [Newman 2003 i]. In this paper we examine network models of mechanical assemblies. Such systems are well understood functionally. We show that there is a cap on their average nodal degree and that they have negative degree correlations (disassortative mixing). We identify specific constraints arising from first principles, their structural patterns, and engineering practice that suggest why they have these properties. In addition, we note that their main "motif" is closed loops (as it is for electric and electronic circuits), a pattern that conventional network analysis does not detect but which is used by software intended to aid in the design of such systems.
1.1.1. Literature
Recent network analysis research has evolved from studying scale-free behavior [Albert and Barabasi ii] to a search for understanding the functional behavior of systems. Various authors have studied software [Myers iii], the Internet [Li et al iv], biological systems [Milo et al v], [Ravasz et al vi] and electronic circuits [Ferrer i Cancho et al vii], among others. Each of these authors sought to find, with varying degrees of success, relationships between network statistics or structure, usually in the form of clusters (sometimes called modules) or internal patterns, and function. Deeper study has indicated that complex networks can be classified by domain, such as social, informational, biological, and technological, and their statistical properties compared [Newman 2003 i, Table II]. [Bourjault viii] introduced the "liaison" graph model of assemblies, in which a part is represented by a node and a joint between two parts by an edge called a liaison, and used network and circuit theory to find all possible assembly sequences. [Bjorke ix] is one of many authors who use network methods to analyze tolerances and predict propagation of part size and shape variation in assemblies. Assemblies are equivalent to kinematic mechanisms, whose motion and constraint properties have been analyzed by network methods for decades [Phillips x], [Whitehead xi], [Blanding xii], [Konkar and Cutkosky xiii], [Shukla and Whitney xiv], [Whitney 2004 xv].
1.2. Properties of Assemblies
1.2.1. Network Models
Mechanical assemblies are technological networks. Compared to the Internet, mechanical assembly networks are small, but they can be comparable in size to food webs and other recently studied networks, often having in the low to mid hundreds of nodes and edges.¹ Assemblies are typically designed to have a hierarchical structure, with subunits called subassemblies, which can be nested to a few levels (perhaps three to five, sometimes more). A hierarchical decomposition of an automobile might have top level assemblies {body, chassis, power train, interior} with respective main subassemblies {roof, car body side, car frame, doors, hood, trunk lid}, {springs, shock absorbers, steering gear}, {engine, transmission, drive axles, brakes}, and {seats, dashboard, steering wheel, shifter console, interior trim}, and so on. A graph model of an assembly is usually formed by considering unitary parts² as nodes and joints³ between parts as symmetric edges. Such models may be aggregated by collecting unitary parts into subassemblies, representing each subassembly as a collective node, and extending from this collective node only those edges that go to parts not in the subassembly. Graphs of assemblies are simple (no self-loops or multiple edges between nodes) and connected. Assemblies are always modeled as undirected networks.
² A unitary part cannot be separated into two or more without using destructive methods that eliminate its identity and geometric coherence and thus end its ability to function as intended in the assembly.
³ Joints typically exert kinematic constraint on the parts they connect, or exert enough mutual force to determine the joined parts' relative locations.
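The aggregation step described above — collapsing a subassembly into a collective node that keeps only outward edges — can be sketched with networkx's node contraction. The part names below are invented for illustration:

```python
import networkx as nx

# Liaison graph: unitary parts as nodes, joints as undirected edges.
G = nx.Graph([("frame", "fork"), ("fork", "front_wheel"), ("frame", "seat"),
              ("frame", "crank"), ("crank", "left_pedal"),
              ("crank", "right_pedal")])

# Aggregate the {crank, left_pedal, right_pedal} subassembly into "crank";
# only edges leaving the subassembly survive (here: crank-frame).
H = nx.contracted_nodes(G, "crank", "left_pedal", self_loops=False)
H = nx.contracted_nodes(H, "crank", "right_pedal", self_loops=False)
print(sorted(H.nodes()), H.degree("crank"))
```

The contracted graph stays simple and undirected, matching the modeling conventions stated in the text.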
1.2.2. Statistical Properties
In this section, we give in Table 1 the usual statistics about some assembly networks and compare them to Table 2 in [i]. The subject assemblies are, in increasing order of number of nodes, a rifle, a model airplane engine, an exercise walker, a bicycle, and a V-8 automobile engine. These assemblies are modeled at the unitary part level and do not contain significant subassemblies. Exceptions are the coaster brake subassembly in the bike and items attached to the outside of the V-8 engine, such as the air conditioning compressor and the alternator.
Assembly | C* | Average path length | Random clustering coeff. | Degree correlation (Pearson)
1885 Winchester Single Shot Rifle | 0.309 | 3.121 | 0.088 | -0.177
Hyper 21 Model Airplane Engine | 0.388 | 3.579 | 0.064 | -0.235
Exercise Walker** | 0.623 | 2.812 | 0.08 | -0.213
Bicycle*** | 0.415 | 4.051 | 0.024 | -0.202
V-8 Automobile Engine**** | 0.225 | 4.345 | 0.0124 | -0.2695

Table 1. Statistical Properties of Some Assemblies. *Clustering coefficient C calculated ignoring nodes whose nodal clustering coefficient = 0. **The walker assembly is symmetric and only half of it was analyzed. The version analyzed has 40 nodes and 64 edges. ***The bicycle was analyzed using only 10 identical spoke sets on each wheel, whereas the front has 32 and the rear has 40. Thus the version analyzed has 131 nodes and 208 edges. ****The engine network contains only 8 of the engine's 32 identical valve trains. Thus the version analyzed has 246 nodes and 372 edges. Many other small parts are also omitted. The above simplifications do not alter the conclusions. Clustering coefficient and path length calculated using UCINET v6.9 [Borgatti et al xvi]. Pearson correlation calculated using software provided by Mark Newman.
1.2.3. Mean Degree of Assemblies
It was stated above that there is a limit on mean degree in assemblies. This constraint is discussed here, first empirically and then theoretically. Figure 1 shows that the network connectivity (links/part) of assemblies having from 6 to 462 parts does not increase with the number of parts. In both real world scale-free and random networks, it is predicted that connectivity should grow with the number of nodes [ii]. For the data available, mechanical assemblies do not behave this way. Assemblies with more than 100 or 200 parts are rare, for various engineering reasons. Figure 1 also shows visually the fact that, except for the Chinese Puzzle, network connectivities for these assemblies do not exceed ~2.1.
[Figure 1: scatter plot of liaisons per part vs. number of parts; labeled points include the Chinese Puzzle, Rugged Stapler, Paper Shredder, Exercise Walker, Bike, V-8 Engine, and many consumer products.]
Figure 1. Liaisons Per Part vs Number of Parts for 35 Mechanical Products. There is no correlation between the network connectivity of these assemblies and the number of parts in them. The Chinese Puzzle is an outlier for reasons discussed in the text. The average connectivity for this dataset is 1.55, close to the value for the V-8 Engine (1.51). Data in this figure are a combination of those gathered by the author and those in [Van Wie et al. xvii].
Figure 1 shows that typical assemblies do not have anywhere near the connectivity that they might. To see a physical reason why, we begin by making the analogy between mechanical assemblies and kinematic mechanisms. All mechanisms are assemblies. While not all assemblies move in the sense that typical mechanisms do, assemblies nevertheless obey the same fundamental principles of statics. Among the issues of concern in kinematics is the state of constraint of the assembly: is it under-constrained and thus capable of movement; is it "exactly" constrained, having just enough links to prevent motion; or is it over-constrained, having more than enough links to prevent motion? The last case is considered undesirable [xi] [xii] because it could result in locked-in stress in the assembly, leading to assembly difficulties or field failure. The implication is that constraint plays the role of a limit or cost related to adding arcs to a node, as suggested by [Amaral et al xviii] for other kinds of networks. The Grübler criterion [x] is typically used to determine the numerical value of the number of degrees of freedom M in a planar mechanism:

M = 3(n - g - 1) + Σᵢ fᵢ    (Equation 1)

where n = number of parts, g = number of joints, and fᵢ = degrees of freedom of joint i.
If M > 0, the mechanism has M under-constrained degrees of freedom. If M < 0, the mechanism has |M| more links than necessary to prevent motion and is over-constrained. If M = 0, the mechanism is exactly constrained. If we define α to be the number of joints (liaisons) divided by the number of parts (equivalent to the average network connectivity) and define the average number of degrees of freedom allowed per joint as β, then we may obtain from Eq (1)

g = αn,    Σf = gβ = αβn,    and    M = 3(n - αn - 1) + αβn = αn(β - 3) + 3(n - 1)    (Equation 2)

Note that unless β > 3, increasing α will drive M negative and generate over-constraint. Furthermore, larger α and β mean more complex parts and a more complex product. If the mechanism is to be exactly constrained, then M = 0 and we can solve for α to yield Equation (3):

α = 3(n - 1) / [n(3 - β)]  →  3 / (3 - β)    as n gets large
This expression is based on assuming that the mechanism is planar. If it is spatial, like all those in Table 1, then "3" is replaced by "6" and Eq (1) is called the Kutzbach criterion, but everything else stays the same. Table 2 evaluates Equation (3) for both planar and spatial mechanisms.
β | α planar | α spatial
0 | 1 | 1
1 | 1.5 | 1.2
2 | 3 | 1.5
Table 2. Relationship Between Number of Liaisons Per Part and Number of Joint Freedoms for Exactly Constrained Mechanisms (M = 0). Table 2 shows that α cannot be very large or else the mechanism will be over-constrained. If a planar mechanism has several two degree-of-freedom joints (pin-slot, for example), then a relatively large number of liaisons per part can be tolerated. But this is rare in typical assemblies. Otherwise, the numbers in this table confirm the data in Figure 1. Most assemblies are exactly constrained or have one operating degree of freedom. Thus β = 0 or β = 1, yielding small values for α, consistent with our data. The Chinese Puzzle is an outlier because it is highly over-constrained according to the Kutzbach criterion. It is possible to assemble only because its joints are deliberately made loose. Nonetheless, the overabundance of constraints is the reason why it has only one assembly sequence; that is, why it is a puzzle.
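The mobility count and the exact-constraint connectivity limit are easy to check numerically. The helper below is our own sketch of Eq (1) and Eq (3), with d = 3 for planar and d = 6 for spatial mechanisms; the four-bar linkage is a standard textbook example, not one of the paper's assemblies:

```python
def mobility(n, g, joint_freedoms, d=3):
    """Grübler (d=3) / Kutzbach (d=6) criterion, Eq (1)."""
    return d * (n - g - 1) + sum(joint_freedoms)

def alpha_exact(beta, d=3, n=None):
    """Joints-per-part ratio alpha giving M = 0, Eq (3); n=None gives the large-n limit."""
    if n is None:
        return d / (d - beta)
    return d * (n - 1) / (n * (d - beta))

# Planar four-bar linkage: 4 links, 4 single-freedom pin joints -> M = 1.
print(mobility(4, 4, [1, 1, 1, 1]))
# Large-n limits reproduce the beta = 1 row of Table 2: 1.5 planar, 1.2 spatial.
print(alpha_exact(1), alpha_exact(1, d=6))
```

Evaluating `alpha_exact` at β = 0, 1, 2 reproduces all six entries of Table 2.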
2.1 Degree Correlation
Why do the mechanical assemblies in Table 1 (and, most likely, many other mechanical assemblies) have negative r? Here we offer a number of promising suggestions. First, from a numerical standpoint, negative r goes with the tendency for highly connected (high <k>) nodes (sometimes called hubs) to link to weakly connected ones, and vice versa. This is true of all the assemblies studied so far. Many of the nodes attached to high degree nodes in an assembly are degree-one pendants, the presence of which tends to drive r negative. One contributing factor encouraging positive r is many tightly linked clusters. But such configurations are discouraged by the constraint conditions discussed in the previous section. From a formal network point of view, assemblies typically are hierarchical and similar in structure to trees, although they rarely are pure trees. It is straightforward to show that a balanced binary tree has r = -1/3 asymptotically as the tree grows, and that a balanced binary tree with nearest neighbor cross-linking at each hierarchical level has r = -1/5 asymptotically. Such trees are similar to most distributive systems such as water, electricity, and blood. From a functional/physical point of view, highly connected nodes in a mechanical assembly typically play either high load-carrying or load-distributing roles, or provide common alignment foundations for multiple parts, or both. The frames, pivot pins, and pedal arms of the walker, the frame and front fork of the bicycle, and the cylinder block, cylinder head, and crankshaft of the engine perform important load-carrying and alignment functions in their respective assemblies. Such highly connected parts are generally few in most assemblies, and they provide support for a much larger number of low degree parts. Almost without exception these highly connected nodes do not connect directly to each other.
In high power assemblies like engines, there are always interface parts, such as bearings, shims, seals, and gaskets, between these parts to provide load-sharing, smoothing of surfaces, appropriate materials, prevention of leaks, or other services. Such interface parts are necessary and not gratuitous. In addition, because they are often big, so as to be able to withstand large loads, the high-k parts have extensive surfaces and can act as foundations for other parts that would otherwise have no place to hang on. On the engine, such parts include pipes, hoses, wires, pumps and other accessories, and so on. Several of these must be located accurately on the block with respect to the crank, but many need not be. These are the degree-2 and degree-1 nodes that make up the majority in all the assemblies studied. Summarizing, most assemblies have only a few high-k foundational, load bearing, or common locating parts, and many other parts mate to them while mating with low probability to each other. Thus even if the few high-k parts mated to each other, the assortativity calculation would still be overwhelmed by many (high k - low k) pairs, yielding negative values for r.
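The tree claim above is easy to probe numerically. The sketch below computes the Pearson degree correlation of balanced binary trees with networkx; the asymptotic value -1/3 is the text's claim, and finite trees only approach it:

```python
import networkx as nx

# Degree correlation r (Pearson, computed over edges) of balanced binary
# trees of increasing height; the text states r -> -1/3 as the tree grows.
for height in (3, 6, 9):
    T = nx.balanced_tree(r=2, h=height)
    r = nx.degree_assortativity_coefficient(T)
    print(height, round(r, 3))
```

Most edges in such a tree join a degree-3 internal node to a degree-1 leaf, which is exactly the hub-to-pendant pattern that drives r negative in assemblies.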
3.1 Functional Motifs of Assemblies In many technological networks, the motifs that generate function are closed loops. This is certainly true for both mechanical assemblies and electric/electronic circuits. The V-8 engine's main loops are shown in Figure 2. Some loops are contained entirely within the engine while others (air, output power) close only when other parts of the car
or its environment are included. Note that some of them stay within communities identified by the Girvan-Newman algorithm [Girvan and Newman xix] while others extend beyond or link and coordinate different communities. These loops cannot be drawn by inspecting the graph but require domain knowledge. The clustering coefficient of a network is obtained by counting triangles, that is, by enumerating the shortest possible loops. In general, the operative loops of an assembly are longer than three (typically 6 or 8) [xv] and thus do not contribute to the clustering coefficient. In fact, the conventionally defined clustering coefficient reveals nothing about the presence, number, or length of loops longer than three. Software for finding motifs would be helpful here, but only some of the motifs thus found would be functionally significant, and domain knowledge would be needed to identify them. Since positive degree correlation is related to more kinematic constraint while negative degree correlation is related to less kinematic constraint, the degree correlation calculation, when applied to mechanical assemblies, can be thought of as a simplified cousin of the Grübler-Kutzbach criterion, because the former simply counts links and considers them of equal strength, whereas the latter uses more information about both links and nodes and can draw a more nuanced conclusion.
Figure 2. V-8 Engine with Five Main Functional Loops Indicated and Named .
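The blindness of triangle counting to longer loops is easy to demonstrate: a single loop of length six — the typical operative loop length in assemblies — has clustering coefficient exactly zero, while a cycle basis does recover it. A minimal sketch with networkx:

```python
import networkx as nx

# A lone 6-cycle: no triangles anywhere, so clustering is zero,
# yet the functional loop is plainly present in the cycle basis.
C6 = nx.cycle_graph(6)
print(nx.average_clustering(C6))
print([len(c) for c in nx.cycle_basis(C6)])
```

A cycle basis only bounds the loop structure, of course; as the text notes, picking out which loops are functionally significant still requires domain knowledge.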
4.1 Conclusions and Observations
Different kinds of systems have different operating motifs, and to understand a system is in some sense to know what those motifs are and how to find them. All of the assemblies analyzed here have a number of hubs. These are obviously important parts but they do not perform any functions by themselves. Instead, the identified functional loops are the main motifs, and they seem to include at least one hub, perhaps more. In systems where large amounts of power are involved, hubs often act as absorbers or
distributors of that power or of static or dynamic mechanical loads. In other systems, the hubs can act as concentrators or distributors of material flow or information flow. Generally, material, mechanical loads, and power/energy/information all flow in closed loops in technological or energetic systems. All the assemblies analyzed display negative degree correlation. This follows from physical principles, the assembly's structure, or engineering design reasoning.
Acknowledgements This paper benefited substantially from discussions with Professor Christopher Magee and Dr David Alderson. The author thanks Professors Dan Braha and Carliss Baldwin for stimulating discussions, Prof Mark Newman for sharing his software for calculating degree correlation, and Dr Sergei Maslov for sharing his Matlab routines for network rewiring. Many of the calculations and diagrams in this paper were made using UCINET and Netdraw from Analytic Technologies, Harvard MA.
i M. E. J. Newman, SIAM Review 45, 167 (2003).
ii R. Albert and A.-L. Barabasi, Reviews of Modern Physics 74, 47 (2002).
iii C. R. Myers, Physical Review E 68, 046116 (2003).
iv L. Li, D. Alderson, W. Willinger, and J. Doyle, SIGCOMM '04, Portland OR (2004).
v R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, Science 298, 824 (2002).
vi E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barabasi, Science 297, 1551 (2002).
vii R. Ferrer i Cancho, C. Janssen, and R. Sole, Physical Review E 64, 046119 (2001).
viii A. Bourjault, "Contribution a une Approche Methodologique de l'Assemblage Automatise: Elaboration Automatique des Sequences Operatoires," Thesis to obtain Grade de Docteur des Sciences Physiques at l'Universite de Franche-Comte, Nov. 1984.
ix O. Bjorke, Computer-Aided Tolerancing (ASME Press, New York, 1989).
x J. Phillips, Freedom in Machinery (Cambridge University Press, Cambridge, 1984 (v1); 1989 (v2)), Vol. 1 and 2.
xi T. N. Whitehead, The Design and Use of Instruments and Accurate Mechanism (Dover Press, 1954).
xii D. Blanding, Exact Constraint Design (ASME Press, New York, 2001).
xiii R. Konkar and M. Cutkosky, ASME Journal of Mechanical Design 117, 589 (1995).
xiv G. Shukla and D. Whitney, IEEE Transactions on Automation Science and Technology 2 (2), 184 (2005).
xv D. Whitney, Mechanical Assemblies: Their Design, Manufacture, and Role in Product Development (Oxford University Press, New York, 2004).
xvi S. P. Borgatti, M. G. Everett, and L. C. Freeman, "Ucinet for Windows: Software for Social Network Analysis," Harvard MA: Analytic Technologies, 2002.
xvii M. Van Wie, J. Greer, M. Campbell, R. Stone, and K. Wood, ASME DETC, Pittsburgh, DETC01/DTM-21689 (2001).
xviii L. Amaral, A. Scala, M. Barthelemy, and H. Stanley, PNAS 97, 11149 (2000).
xix M. Girvan and M. E. J. Newman, PNAS 99, 7821 (2002).
Chapter 18
Complex dynamic behavior on transition in a solid combustion model
Jun Yu
The University of Vermont
[email protected]

Christopher M. Danforth
The University of Vermont
[email protected]
Through examples in a free-boundary model of solid combustion, this study concerns nonlinear transition behavior of small disturbances of front propagation and temperature as they evolve in time. This includes complex dynamics of period doubling, quadrupling, and six-folding, and it eventually leads to chaotic oscillations. The mathematical problem is interesting as solutions to the linearized equations are unstable when a bifurcation parameter related to the activation energy passes through a critical value. Therefore, it is crucial to account for the cumulative effect of small nonlinearities to obtain a correct description of the evolution over long times. Both asymptotic and numerical solutions are studied. We show that for special parameters our method with some dominant modes captures the formation of coherent structures. Weakly nonlinear analysis for a general case is difficult because of the complex dynamics of the problem, which lead to chaos. We discuss possible methods to improve our prediction of the solutions in the chaotic case.
1 Introduction
We study the nonuniform dynamics of front propagation in solid combustion: a chemical reaction that converts a solid fuel directly into solid products with no intermediate gas phase formation. For example, in self-propagating high-temperature synthesis (SHS), a flame wave advancing through powdered ingredients leaves high-quality ceramic materials or metallic alloys in its wake. (See, for instance, [7].) The propagation results from the interplay between heat generation and heat diffusion in the medium. A balance exists between the two in some parametric regimes, producing a constant burning rate. In other cases, competition between reaction and diffusion results in a wide variety of nonuniform behaviors, some leading to chaos. In studying the nonlinear transition behavior of small disturbances of front propagation and temperature as they evolve in time, we compare quantitatively the results of weakly nonlinear analysis with direct simulations. We also propose techniques for the accurate simulation of chaotic solutions.
2 Mathematical analysis
We use a version of the sharp-interface model of solid combustion introduced by Matkowsky and Sivashinsky [6]. It includes the heat equation on a semi-infinite domain and a nonlinear kinetic condition imposed on the moving boundary. Specifically, we seek the temperature distribution u(x, t) in one spatial dimension and the interface position Γ(t) = {x : x = f(t)} that satisfy the appropriately non-dimensionalized free-boundary problem
∂u/∂t = ∂²u/∂x²,   x > f(t),   t > 0,      (1.1)

V = G(u|_Γ),   t > 0,      (1.2)

∂u/∂x|_Γ = −V,   t > 0.      (1.3)
Here V is the velocity of the rightward-traveling interface, i.e. V = df/dt. In addition, the temperature satisfies the condition u → 0 as x → ∞; that is, the ambient temperature is normalized to zero at infinity. To model solid combustion, we take the Arrhenius function as the kinetics function G in the non-equilibrium interface condition (1.2) [1, 8]. Then, with appropriate nondimensionalization, the velocity of propagation relates to the interface temperature as

V = exp[(1/ν) (u − 1)/(σ + (1 − σ)u)]      (1.4)

at the interface Γ. Here ν is inversely proportional to the activation energy of the exothermic chemical reaction that occurs at the interface, and 0 < σ < 1
is the ambient temperature nondimensionalized by the adiabatic temperature of combustion products. (See [3].) The free-boundary problem admits a traveling-wave solution

u(x, t) = exp(−x + t),   f(t) = t.      (1.5)
It is linearly unstable when ν is less than the critical value ν_c = 1/3. (See, for example, [4, 10].) For the weakly nonlinear analysis, let ε² be a small deviation from the neutrally stable value of ν, namely

ε² = ν_c − ν = 1/3 − ν.      (1.6)
We perturb the basic solution (1.5) by ε times the most linearly unstable mode, evaluated at both the neutrally stable parameter value ν = 1/3 and the corresponding neutrally stable eigenvalue, together with complex-conjugate terms. In the velocity expansion, we also include ε times the constant solution to the linearized problem (although we do not mention it explicitly in the sequel). See [5].
The normal-mode perturbation is modulated by a complex-valued, slowly varying amplitude function A(T), where T = ε²t. The amplitude envelope satisfies the solvability condition

dA/dT = χA + βA²Ā,      (1.7)

where χ and β are complex constants. (See [5] for details.) The evolution equation (1.7) has circular limit cycles in the complex-A plane for all values of the kinetic parameter σ in the interval 0 < σ < 1 (i.e. for all physical values of σ). To find A(T), we integrate the ordinary differential equation (1.7) using a fourth-order Runge-Kutta method.
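The integration of the amplitude equation can be sketched as follows. This is a minimal illustration, not the production code of [5]: the constants chi and beta below are illustrative placeholders, not the values computed in the weakly nonlinear analysis.

```python
# Hedged sketch: fourth-order Runge-Kutta integration of the amplitude
# (Stuart-Landau-type) equation dA/dT = chi*A + beta*A^2*conj(A).
# chi and beta here are illustrative placeholders, NOT the constants of [5].

def rk4_amplitude(chi, beta, A0, dT, n_steps):
    """Integrate dA/dT = chi*A + beta*|A|^2*A with classical RK4."""
    def f(A):
        return chi * A + beta * A * A * A.conjugate()

    A = complex(A0)
    history = [A]
    for _ in range(n_steps):
        k1 = f(A)
        k2 = f(A + 0.5 * dT * k1)
        k3 = f(A + 0.5 * dT * k2)
        k4 = f(A + dT * k3)
        A = A + (dT / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        history.append(A)
    return history

# With Re(chi) > 0 and Re(beta) < 0, |A| saturates on a circular limit
# cycle of radius sqrt(-Re(chi)/Re(beta)), here equal to 1.
traj = rk4_amplitude(chi=1.0 + 0.5j, beta=-1.0 + 0.2j, A0=0.1, dT=0.01, n_steps=5000)
print(abs(traj[-1]))
```

The saturation of |A| on a circle mirrors the steady oscillation of the front speed perturbation discussed below.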
3 Results and discussion
To compare quantitatively the asymptotics with numerics, we first consider ε = 0.1. The value of ν remains at the marginally unstable value ν_c − ε², as in equation (1.6), so ν ≈ 0.323. We show in this section that this choice of ε corresponds to a mix of dynamics as σ varies. Subsequently, we comment on the impact on the front behavior of both decreasing and increasing ε. To start, take σ = 0.48 in the kinetics function (1.4). For the remainder of this paper we take the initial condition A(0) = 0.1, unless otherwise indicated. Figure 1 shows the numerical (solid line) and asymptotic (dashed line) values of front speed perturbation as a function of time t in the interval 0 ≤ t ≤ 60. To find the numerical solution, we used the Crank-Nicolson method to solve the problem in a front-attached coordinate frame, reformulating the boundary condition (1.3) for robustness. (See [5] for details.) As for the asymptotic
Figure 1: Velocity perturbation versus time: comparison between numerical (solid line) and asymptotic (dashed line) for Arrhenius kinetics, σ = 0.48, ε = 0.1, A(0) = 0.1 (ν ≈ ν_c − ε² = 1/3 − (0.1)² ≈ 0.323)

solution, the previous section describes the order-ε perturbation to the traveling-wave solution (1.5). In the figure we have additionally included an order-ε² correction. Figure 1 reveals that from t = 0 to about t = 30, the small front speed perturbation is linearly unstable, and its amplitude grows exponentially in time. As this amplitude becomes large, nonlinearity takes effect. At around t = 30, the front speed perturbation has reached steady oscillation. The asymptotic solution accurately captures the period in both the transient behavior for t = 0 to 30 and the long-time behavior after t = 30. The amplitude and phase differ somewhat. This is an example in which the weakly nonlinear approach describes well the marginally unstable large-time behaviors: a single modulated temporal mode captures the dynamics. To identify additional such regimes systematically, we calculate numerically the velocity perturbation data on the time interval 35 < t < 85, throughout the range of physical values of the kinetics parameter σ (i.e. 0 < σ < 1). Figure 2 summarizes the Fourier-transformed velocity data. For each σ value and each frequency, the color indicates the corresponding amplitude, with the red end of the spectrum standing for larger numbers than the violet end. For roughly 0.3 < σ < 0.6, the figure shows the dominance of the lowest-order mode, suggesting the appropriateness of the weakly nonlinear analysis in this range. For other values of σ, a single mode cannot be expected to capture the full dynamics of the solution. In particular, when σ is greater than approximately 0.6, solutions have sharp peaks, even sharper than the numerical solution in Figure 1.
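The spectral diagnostic behind Figures 2 and 3 can be sketched as follows. This is a hedged illustration: the signal below is synthetic (a saturated oscillation with a weak harmonic), standing in for the Crank-Nicolson output, and the sampling step is an assumption.

```python
import numpy as np

# Hedged sketch: Fourier analysis of a velocity-perturbation time series,
# as used to build the amplitude-vs-frequency maps of Figures 2 and 3.
# The signal here is synthetic; it is not the solver output of [5].

def dominant_frequency(signal, dt):
    """Return frequencies, spectral amplitudes, and the dominant frequency."""
    amps = np.abs(np.fft.rfft(signal - np.mean(signal)))  # drop the mean (DC)
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    return freqs, amps, freqs[np.argmax(amps)]

dt = 0.05
t = np.arange(35.0, 85.0, dt)            # the window 35 < t < 85 used in the text
v = 0.8 * np.sin(2 * np.pi * 0.5 * t)    # fundamental at frequency 0.5
v += 0.2 * np.sin(2 * np.pi * 1.0 * t)   # weaker first harmonic
freqs, amps, f_dom = dominant_frequency(v, dt)
print(f_dom)  # close to 0.5
```

Repeating this for each value of σ and stacking the amplitude arrays gives the color maps of Figures 2 and 3.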
Figure 2 shows that when σ is smaller than approximately 0.3, the Fourier spectrum has a complicated character, starting with the emergence of a period-doubling solution for σ ≈ 0.25. Figure 3 gives a closer look at the dominant modes for the case of small σ; notice the bifurcation to a six-folding solution near σ = 0.201. The four numerical solutions in Figures 4 and 5 illustrate the cascade of period-replicating solutions, including doubling (σ = 0.22), quadrupling (σ = 0.21), and six-folding (σ = 0.20075). Note that Figure 2 reflects the breakdown of the numerical solution for σ less than approximately 0.15.
Figure 2: Amplitudes corresponding to each frequency of the Fourier-transformed velocity perturbation data for the Arrhenius kinetics parameter σ in the interval (0, 1), ε = 0.1, A(0) = 0.1, 35 < t < 85 (ν ≈ ν_c − ε² = 1/3 − (0.1)² ≈ 0.323)
Figure 3: Amplitudes corresponding to each frequency of the Fourier-transformed velocity perturbation data for the Arrhenius kinetics parameter σ in the interval (0.19, 0.22), ε = 0.1, A(0) = 0.1, 35 < t < 85 (ν ≈ ν_c − ε² = 1/3 − (0.1)² ≈ 0.323)
The cascade of period-replicating solutions for decreasing σ leads to chaos. Figure 6 (corresponding to σ = 0.185) shows the sensitivity of the velocity perturbation to initial conditions. In the figure, note that from t = 0 to approximately t = 25, the small front speed perturbation is linearly unstable, and its amplitude grows exponentially in time, similar to the profile in Figure 1. As the amplitude becomes large, nonlinearity again comes into play. Still, the curves corresponding to two initial conditions (one with A(0) = 0.1 and the other with A(0) = 0.1000001) remain indistinguishable for a long time. However, as time approaches 100 the two profiles begin to diverge, and as time evolves past 120 they disagree wildly.
Figure 4: Velocity perturbations versus time (ε = 0.1, A(0) = 0.1, ν ≈ ν_c − ε² = 1/3 − (0.1)² ≈ 0.323), clockwise from upper left: periodic solution for σ = 0.48 (cf. Figure 1), period doubling (σ = 0.22), period quadrupling (σ = 0.21), period six-folding (σ = 0.20075)
We propose a couple of techniques to improve model predictions in the chaotic case. One, ensemble forecasting, requires the generation of velocity profiles that correspond to slightly different initial conditions. The degree of agreement among curves in the collection (ensemble) demonstrates the level of reliability of predictions. In the spirit of the jet-stream forecasts in [9], additional data can be provided at the points at which the individual members of the ensemble diverge. For example, Figure 6, which shows an "ensemble" of only two curves, gives a preliminary indication of the need for more data at t = 100. Alternatively, we can more accurately represent solid combustion by using statistical methods to "train" the model. Comparisons with experimental data can reveal systematic and predictable error, as in Figure 7 (courtesy of [2]). The figure, which provides an analogy to the problem under consideration, shows that temperature forecasts near the sea surface off the coast of Japan are typically too warm [2]. That is, the actual temperatures minus the predicted temperatures are negative values, represented as yellow, blue, and violet in the figure. In describing combustion, as in describing sea temperatures, one can compensate methodically for such error. As the bifurcation parameter ν approaches ever closer to the neutrally stable value (i.e. as ε decreases), the complex dynamics, including chaos, disappear. For example, when ε = 0.06, the asymptotic and numerical solutions agree closely throughout the physical range of σ (0 < σ < 1). By contrast, when ε grows to 0.12, the σ interval in which one mode dominates strongly has a length of only 0.01. Varying ε quantifies the domain of applicability of the weakly nonlinear analysis and delineates the role of σ in the dynamics. (See [5].) In summary, linear instability provides a mechanism for transition to nonlinear coherent structures.
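The ensemble idea can be sketched in a few lines. As a hedged stand-in for the combustion solver (which is not reproduced here), we use the chaotic logistic map; the tolerance and map parameter are illustrative assumptions, and the two initial conditions echo the pair used in Figure 6.

```python
# Hedged sketch of ensemble forecasting: run the same model from slightly
# different initial conditions and flag the first time the members disagree
# beyond a tolerance.  The logistic map (r = 3.9, chaotic) stands in for
# the solid-combustion solver; it is NOT the model of Section 2.

def ensemble_divergence_step(initial_conditions, tol, n_steps, r=3.9):
    """Return the first iteration at which the ensemble spread exceeds tol."""
    members = list(initial_conditions)
    for step in range(n_steps):
        members = [r * x * (1.0 - x) for x in members]
        if max(members) - min(members) > tol:
            return step  # here more data would be needed to keep forecasting
    return None  # ensemble stayed coherent for the whole run

# Two members differing by 1e-7, echoing A(0) = 0.1 vs 0.1000001 in Figure 6.
step = ensemble_divergence_step([0.1, 0.1000001], tol=0.1, n_steps=500)
print(step)
```

The returned step plays the role of the time (near t = 100 in Figure 6) at which the ensemble signals that its prediction is no longer reliable.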
Weakly nonlinear analysis allows the asymptotic study of the evolution of small disturbances during this transition, providing insight
Figure 5: Phase plots of the four solutions in Figure 4: velocity perturbation v(t) versus dv/dt
Figure 6: Velocity perturbation versus time: numerical solution for σ = 0.185, ε = 0.1, A(0) = 0.1 and A(0) = 0.1000001 (ν ≈ ν_c − ε² = 1/3 − (0.1)² ≈ 0.323)
into nonlinear dynamics, which can be investigated numerically. We also proposed techniques to improve predictions of solution behavior in the chaotic case. The ensemble method may provide accuracy over long time intervals. Also, given experimental data, statistical procedures can be used to train the model.
Bibliography

[1] BRAILOVSKY, I., and G. SIVASHINSKY, "Chaotic dynamics in solid fuel combustion", Physica D 65 (1993), 191-198.

[2] DANFORTH, C. M., E. KALNAY, and T. MIYOSHI, "Estimating and correcting global weather model error", Monthly Weather Review (2006), in press.
Figure 7: Curves of predicted (constant) near sea-surface temperature, along with colored bands of associated error (courtesy of [2])
[3] FRANKEL, M., V. ROYTBURD, and G. SIVASHINSKY, "A sequence of period doubling and chaotic pulsations in a free boundary problem modeling thermal instabilities", SIAM J. Appl. Math. 54 (1994), 1101-1112.
[4] GROSS, L. K., "Weakly nonlinear dynamics of interface propagation", Stud. Appl. Math. 108, 4 (2002), 323-350.
[5] GROSS, L. K., and J. YU, "Weakly nonlinear and numerical analyses of dynamics in a solid combustion model", SIAM J. Appl. Math. 65, 5 (2005), 1708-1725.

[6] MATKOWSKY, B. J., and G. I. SIVASHINSKY, "Propagation of a pulsating reaction front in solid fuel combustion", SIAM J. Appl. Math. 35 (1978), 465-478.

[7] MERZHANOV, A. G., "SHS processes: combustion theory and practice", Arch. Combustionis 1 (1981), 23-48.

[8] MUNIR, Z. A., and U. ANSELMI-TAMBURINI, "Self-propagating exothermic reactions: the synthesis of high-temperature materials by combustion", Mat. Sci. Rep. 3 (1989), 277-365.

[9] TOTH, Z., and E. KALNAY, "Ensemble forecasting at NMC: The generation of perturbations", Bulletin of the American Meteorological Society 74, 12 (1993), 2317-2330.

[10] YU, J., and L. K. GROSS, "The onset of linear instabilities in a solid combustion model", Stud. Appl. Math. 107, 1 (2001), 81-101.
Chapter 19
Modeling the Structural Dynamics of Industrial Networks

Ian F. Wilkinson
School of Marketing, University of New South Wales, Australia
[email protected]

James B. Wiley
Faculty of Commerce, Victoria University of Wellington, New Zealand
[email protected]

Aizhong Lin
School of Computing Sciences, University of Technology, Sydney
1. Introduction

Market systems consist of locally interacting agents who continuously pursue advantageous opportunities. Since the time of Adam Smith, a fundamental task of economics has been to understand how market systems develop and to explain their operation. During the intervening years, theory largely has stressed comparative statics analysis. Based on the assumptions of rational, utility- or profit-maximizing agents and negative (diminishing returns) feedback processes, traditional economic analysis seeks to describe the (generally) unique state of an economy corresponding to an initial set of assumptions. The analysis is static in the sense that it does not describe the process by which an economy might get from one state to another.
In recent years, an alternative view has developed. Associated with this view are three major insights. One of these is that market processes are characterized by positive feedback as well as the negative returns stressed in classical economics (e.g., Arthur 1994). The second insight is that market systems may be studied in the framework of complex adaptive system theory. "A 'complex system' is a system consisting of many agents that interact with each other in various ways. Such a system is 'adaptive' if these agents change their actions as a result of the events in the process of interaction" (Vriend, 1995, p. 205). Viewed from the perspective of adaptive systems, market interactions depend in a crucial way on local knowledge of the identity of some potential trading partners. "A market, then, is not a central place where a certain good is exchanged, nor is it the aggregate supply and demand of a good. In general, markets emerge as the result of locally interacting individual agents who are themselves actively pursuing those interactions that are the most advantageous ones, i.e., they are self-organized" (Vriend, 1995, p. 205). How self-organized markets emerge in decentralized economies is a question that formal analysis of such systems seeks to answer. The third insight associated with the alternative view is that there are parallels of economic processes with biological evolution. This insight, in turn, suggests that ideas and tools of biological evolution may fruitfully be applied to the study of economics. Among the promising tools are computer-based algorithms that model the evolution of artificial life. If the tools used to model artificial life may be applied to institutions, industries, or entire economies, then their evolution and performance may be studied using computer simulation.
1.1. Objectives

The aim of our work is to apply the above insights to the study of industrial market systems (IMS's). IMS's consist of interrelated organizations involved in creating and delivering products and services to end-users. The present paper describes computer models which are capable of mimicking the evolutionary process of IMS's, drawing in particular on the NK models developed by Stuart Kauffman (1992, 1995). The modeling effort has two interrelated but distinct purposes. The first is to help us to better understand the processes that shape the creation and evolution of firms and networks in IMS's. This will provide a base both for predicting, and perhaps influencing, the evolution of industrial marketing systems. Secondly, the models may be used for optimizing purposes, to help us design better-performing market structures. The specific objectives of this research are to examine: the processes by which structure evolves in IMS's; the factors driving these processes; and the conditions under which better-performing structures may evolve.
2. Background

It is typical of complex adaptive systems in general, and those that mimic life processes in particular, that order emerges in a bottom-up fashion through the interaction of many
dispersed units acting in parallel. No central controlling agent organizes and directs behaviour. "[T]he actions of each unit depend upon the states and actions of a limited number of other units, and the overall direction of the system is determined by competition and coordination among the units subject to structural constraints. The complexity of the system thus tends to arise more from the interactions among the units than from any complexity inherent in the individual units per se" (Tesfatsion 1997, p. 534). IMS's may be described in terms of four sets of interrelated elements or components: actors, activities, resources, and ideas or schemas (Hakansson and Snehota 1995, Welch and Wilkinson 2000). The actors consist of various types of firms that operate in industrial markets. A "type" of firm (such as wholesaler, drop shipper, manufacturer's agent, rack jobber, broker, and so forth) may be described in terms of the activities that it is capable of performing, as well as in terms of its schema or "theories in use," which underlie the actor's actions and reactions (Gell-Mann 1998). In IMS's, business entities seldom perform all of the necessary activities or functions required for a transaction to take place. Rather, they perform some of them. The firm and other firms with complementary specialization collectively perform the requisite activities for transactions to occur. One way to look at the evolution of firms in market systems is to conceive that they evolve to establish competitive niches much as organisms do in natural environments. That is, firms retain, add, or drop activities and functions as part of an on-going process. The outcome of this process is (or is not) a set which gives the firm a competitive advantage. The resulting interdependence, however, makes the "fitness" of any firm's pattern of specialization dependent on the specialization patterns of other firms in the market system.
Each firm of a given type may be further described in terms of its resources (such as inventories, cost structures, wealth, and the relationships it has established with other actors in the market). Patterns of relationships may be complex, and they will themselves evolve. The pattern of relationships that determines firms' interdependencies also establishes the fitness of a specific pattern of activities and functions. Firms with which a firm has relationships have themselves relationships with other firms. These other firms have relationships with yet other firms, and so forth, including the possibility of relationships (perhaps of different types) with the original firm. Relationship patterns may shift and become unstable. For example, the merger of two advertising agencies may result in their having to give up one of two clients in the same industry. The forsaken client may hire a new agency that, in turn, must give up a client in the industry. "Avalanches" of changes may follow in an analogous way to Per Bak's (1996) sand piles. Depending on the nature and degree of connectivity and responsiveness among firms and relationships, changes in one part of the network can bring forth "avalanches" of changes of different sizes (Hartvigsen et al. 2000). Each agency that changes clients loses links to the suppliers and customers of the old client and gains links to the suppliers and customers of the new client. Relationships may be of different types. For example, two merging automobile firms may require management consulting services. The consulting company may be allied with, or
even a subsidiary of, an accounting firm, which in turn gains links to the merged auto firm through its relationship with the consulting firm. The consulting firm in turn may gain links to the merged company's public relations agency. Because of the services provided by the consulting and accounting firms, the auto firm may require the services of a computer service bureau. The service bureau in turn may gain links to the auto firm's investment bank. However, the exclusionary restrictions described in the previous paragraph may result in changes in the consulting, accounting, banking, and service bureau industries. A second objective of this research, and the primary objective of the present paper, is to gain understanding of what drives the formation of relationship patterns and to look at the patterns of stable and unstable relationships that may occur.
3. Modeling

A differentiating characteristic of this research is the way in which relationships are viewed. Typically, descriptions of relations among firms make the implicit assumption that it is organizations that "organize" and direct the flows of activities in IMS's. As Resnik (1998) points out, such centralist thinking tends to be common in business: "People seem to have strong attachments to centralized ways of thinking: they assume that a pattern can exist only if someone (or something) creates and orchestrates the pattern. When people see a flock of birds, they generally assume the bird in front is leading the others - when in fact it is not. And when people design new organizational structures, they tend to impose centralized control even when it is not needed" (p. 27). Recent developments in the science of complex adaptive systems show how structure emerges in bottom-up, self-organizing ways. The observed behaviour of the system is the outcome of independent actions of entities that have imperfect understanding of each other's activities and objectives and who interfere with and/or facilitate each other's activities and outcomes. Structure emerges as a property of the system, rather than in a top-down, managed and directed fashion (Holland 1998). From the ongoing processes of interaction, actors' bonds, resource ties, activity links and mutual understandings emerge and evolve, and these constitute the structure of the IMS's. This structure can be more or less stable. Over time, the structure of the IMS's evolves and co-evolves as a result of interaction with other IMS's. We make use of recent developments in computer-based simulation techniques to model the evolution of relationships in IMS's. We adopt this approach for two reasons. First, the complex interacting processes that take place in such systems are beyond the scope of traditional analytical techniques.
Second, the actual IMS's we observe in the real world, no matter how diverse they may be, are only samples of those that could arise. They are the outcomes of particular historical circumstances and accidents. As Langton (1996) observes in a biological context: "We trust implicitly that there are lawful regularities at work in the determination of this set [of realized entities], but it is unlikely that we will discover many of these regularities by restricting ourselves only to the set of biological entities that nature
actually provided us with. Rather, such regularities will be found only by exploring the much larger set of possible biological entities" (p. x). So it is with IMS's. Recent developments in the modeling and simulation of complex adaptive systems suggest ways in which we may explore the range of possible networks that might arise. In the next section, we describe the models of IMS's we have developed and are developing based on Kauffman's NK Models.
3.1. NK Models

Kauffman (1992, 1995) has developed a way of representing a network of interacting elements in terms of a set of N elements (actors, chemicals, firms, or whatever), each of whose behaviors is affected by the behaviour of K other elements. The model is a discrete-time simulation model; what exists in time t+1 is determined by what existed in time t. Two versions of the NK model are relevant to our research: NK Boolean Models, used to model relationship interaction; and NK Fitness Landscapes with Patch Formation, used to model the emergence of cooperating groups of entities, such as firms. Only the first has so far been implemented in our research, and it is the version described in the next section of the paper. A description of the approach we are taking using NK Fitness Landscapes follows this.
3.2. NK Boolean Models and Relationship Interaction

An important question is how to characterize network structure in terms of NK Models. We chose to use relationships between firms as the unit of analysis. The pattern of relationships at time t+1 is determined by the pattern of relationships at time t and Boolean logic rules for transforming one to the other. Note that by choosing relationships we are effectively operating at a second-order level compared with traditional industrial network models. Actors are defined by the relationships they have and not, ab initio, as network entities in their own right. A fuller account of the model is to be found in Wilkinson and Easton (1997). The behavior of relationships is modelled in binary terms; they either exist (i.e., are active) and have the value 1, or they do not exist (i.e., are inactive) and take on the value 0. The behavior of a relationship at time t depends on the behavior of K other relationships (possibly including its own behavior) in the previous period. Boolean operators specify the behavior of a relation in period t+1 for each combination of behaviors of the K connected relations in the previous period. The focal relation is called the output or regulated relation, and the K connected relations affecting its behavior are called input relations. Our contention is that Boolean operators may be constructed that have conventional economic/business interpretations such as complementary supply relations, competing relations, temporally connected relations, and so forth. If this is so, then both the type of relationships and the degree of interconnection K can be modelled, and the effect of both dimensions on the character of IMS attractors simulated.
Our model is an autonomous Boolean network because there are no inputs from outside the network itself (although these can be added; see below). For the purposes of our analysis, we assume that the Boolean network is a synchronous, parallel-processing network in which all elements compute their behavior in the next period at the same time. Our general methodology for examining the behavior of our NK Boolean models follows that of Kauffman: "To analyze the typical behaviour of Boolean networks with N elements, each receiving K inputs, it is necessary to sample at random from the populations of all such networks, examine their behaviors, and accumulate statistics. Numerical simulations to accomplish this therefore construct exemplars of the ensemble entirely at random. Thus the K inputs to each element are first chosen at random and then fixed, and the Boolean function assigned to each element is also chosen at random and then fixed. The resulting network is a specific member of the ensemble of NK networks" (Kauffman 1992, p. 192).
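Kauffman's sampling procedure can be sketched as follows: build one random member of the NK Boolean ensemble (random inputs and random Boolean functions, both then fixed), iterate it synchronously, and detect the attractor it settles onto. The sizes N = 12, K = 3 below are illustrative choices, not parameters from our study.

```python
import random

# Hedged sketch of sampling one member of the NK Boolean ensemble and
# finding the attractor reached from a random starting state.

def random_nk_network(n, k, rng):
    """For each of n elements: k fixed random inputs and a fixed random rule."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    # A rule maps each of the 2^k input combinations to 0 or 1.
    rules = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, rules

def step(state, inputs, rules):
    """Synchronous update: every element computes its next value at once."""
    new = []
    for i in range(len(state)):
        idx = 0
        for j in inputs[i]:
            idx = (idx << 1) | state[j]  # encode the input combination
        new.append(rules[i][idx])
    return tuple(new)

def find_attractor(state, inputs, rules, max_steps=10000):
    """Iterate until a state repeats; return the attractor's cycle length."""
    seen = {state: 0}
    for t in range(1, max_steps + 1):
        state = step(state, inputs, rules)
        if state in seen:
            return t - seen[state]
        seen[state] = t
    return None

rng = random.Random(0)
inputs, rules = random_nk_network(n=12, k=3, rng=rng)
start = tuple(rng.randint(0, 1) for _ in range(12))
print(find_attractor(start, inputs, rules))  # cycle length of the attractor
```

Because the state space is finite (2^N states), a deterministic synchronous network must eventually revisit a state, so an attractor is always found; accumulating statistics over many such random networks gives the ensemble behaviour Kauffman describes.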
3.3. The role of K

Kauffman has shown that K, the number of other entities to which an entity is linked, is a critical parameter in determining the pattern of behaviour of the network. For low values of K, say 0 or 1, each entity is essentially an isolated unit. Its state does not influence other elements, and so the network consists of "frozen" relationships that do not change or regularly switch on and off. For high values of K, the interactions are very complex and destabilizing, resulting in chaotic behaviour. For values of K around 3 to 6, self-organizing patterns emerge within the network that are stable with respect to small perturbations (i.e., random changes in the state of individual relationships do not change the state of the system as a whole). As K is increased, the isolated actors become isolated interacting groups. Gradually, as K increases, more of the elements are joined in the network. Attractors are the sequences of relatively stable states to which the system settles for protracted periods and, as Kauffman (1992) observes, attractors "literally are most of what the systems do" (p. 191). In the present context, attractors are interpreted as relatively stable patterns of relationships that may occur in IMS's. The pattern of relationships that might be observed in an actual industrial market might correspond to the patterns on one such attractor, whereas IMS's are likely to have many possible attractors depending on N, K, the mix of Boolean operators involved, and system starting conditions. The patterns of relationships corresponding to other attractors might correspond to patterns observed in other market systems or, possibly, to feasible patterns that have never been observed in IMS's. System behaviour is also affected by a biasing or canalising factor, reflected in a parameter p. This reflects how sensitive an element is to the behavior of the K other elements to which a given element is connected.
The value of p depends on the character of the Boolean operator and is measured in terms of the proportion of situations in which an entity will take on the most modal value, 0 or 1. The lowest value for p is 0.5, which occurs when an element will take on the value 1 in the next period in 50% of situations and 0 in the other 50%. If it will take on the value 1 in 90% of situations, p is 0.9. Higher K may lead to order
with higher values of p because p acts as a kind of damping function that "prevents" chaos. The p value of networks constructed using different combinations of Boolean operators will be used to aid our analysis of the behaviour of the network.
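The measurement of p just described can be made concrete in a few lines; this is a sketch of our reading of the definition (the fraction of input combinations mapping to the operator's modal output), with two standard K = 2 operators as examples.

```python
# Hedged sketch: the bias p of a Boolean operator, computed as the fraction
# of input combinations that map to the operator's most common (modal) output.

def bias_p(rule):
    """rule: list of 0/1 outputs, one per input combination."""
    ones = sum(rule)
    return max(ones, len(rule) - ones) / len(rule)

# K = 2 examples: AND outputs 1 in one of four cases, so p = 0.75;
# XOR is perfectly balanced, so p = 0.5 (the minimum possible bias).
print(bias_p([0, 0, 0, 1]))  # 0.75
print(bias_p([0, 1, 1, 0]))  # 0.5
```

Averaging bias_p over the operators assigned in a network gives the kind of p value used to compare networks built from different mixes of Boolean operators.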
4. An Example of an NK Boolean Model of Network Relationships

To keep things simple, we will consider the network in terms of the N=8 possible output relationships between suppliers and distributors, each of which can either be active (=1) or inactive (=0) in any period. Figure 1 illustrates the network. The system comprises four suppliers (S0 to S3), and the main competitor of each supplier is the one to its left, e.g., the main competitor of S0 is S1. For S3 the main competitor is S0.
[Figure 1 diagram: suppliers S0-S3 above distributors D0 and D1, with one output relation and its two input relations marked.]
Figure 1. A Four Supplier Two Distributor Network

Figure 1 focuses on the output relation between S0 and D0, R_D0,S0; R_D0,S1 and R_D1,S0 are the input relations. The state of R_D0,S0 in period t+1 depends on the states of R_D0,S1 and R_D1,S0 in the previous period. R_D0,S1 and R_D1,S0 are the input variables, and we can say R_D0,S0 is connected to R_D0,S1 and R_D1,S0. For simple exposition, we will for the moment assume relation R_D0,S0 is not an input variable, i.e., the relation is not connected to itself. In this network K=2, so there are 2^K = 4 different combinations of the input variables (00, 01, 10, 11), and there are sixteen Boolean operators that may be defined on the combinations. These are shown in Table 1 in terms of the behavior of the output or regulated variable, relation R_D0,S0, in period t+1, for each of the four possible combinations of the input variables, R_D0,S1 and R_D1,S0, in period t. For example, rule 1 should be read as follows: R_D0,S0 is inactive in period t+1 for all four possible combinations of behavior of R_D0,S1 and R_D1,S0 in period t. Wilkinson and Easton (1997) examine in greater detail the sixteen Boolean operators in the context of an IMS. In the next section, we briefly consider the logic of simulation. We then describe simulations of two situations where the existence of a relationship between a supplier and a distributor depends on the preferences each has for exclusive dealing. We identify two types of exclusive dealing: "Supplier Exclusive Dealing" (Rule 10, Table 1) and "Mutually Exclusive Dealing" (Rule 3, Table 1). They are discussed in more detail below.
4.1. Role of Simulations
Simulations are particularly useful for discovering the variety of system states that could be observed, including ones that are rare or unobserved in nature. They are useful for exploring the conditions that produce the respective states and for evaluating the relative performance of the states. They are also useful for investigating the stability of the states. Stability may be explored by perturbing the system and following what happens, something that would be difficult, if ethical at all, to do in natural environments. Perturbation may take many forms. The most obvious would be to induce the formation or extinction of a type of entity or relationship. It is useful to find out how such changes propagate through the system over time, determine what the final configuration(s) is (are), and observe how the process and outcome differ in relation to the type and scale of the perturbation. We have developed an NK simulation program to allow us to explore the dynamics of relationships in networks (Wilkinson, Hibbert, Easton and Lin 1999). The program is written in C++ and permits the analyst to design a particular network of connected relations and Boolean operators of interest, or to construct such networks randomly. Once the network is constructed, the input and output relations specified and the Boolean operators chosen, the program computes the system states over time. The starting conditions can be specified or randomly generated, and the option exists to systematically examine the behaviour of the network for different starting conditions or alternative combinations of Boolean operators. To illustrate the program we examine the behaviour of a simple network of firms, each wishing to follow exclusive dealing arrangements with other firms. Referring to the network depicted in Figure 1, we hypothesize that the existence of relationships between suppliers and distributors depends on the preferences each has for exclusive dealing.
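The C++ program itself is not reproduced here; the following Python sketch illustrates the same logic under our own (hypothetical) naming: a network is a set of relations, each with a list of input relations and a Boolean operator, and the whole state vector is updated synchronously each period.

```python
# Sketch of an NK Boolean network simulator: each of the N relations has
# K input relations and a Boolean operator; relations update synchronously.

def step(state, inputs, operators):
    """Compute the next state of every relation from the current state.

    state:     dict relation -> 0/1
    inputs:    dict relation -> tuple of its K input relations
    operators: dict relation -> function of K input states, returning 0/1
    """
    return {r: operators[r](*(state[i] for i in inputs[r])) for r in state}

def run(state, inputs, operators, periods):
    """Return the trajectory of states over the given number of periods."""
    history = [dict(state)]
    for _ in range(periods):
        state = step(state, inputs, operators)
        history.append(dict(state))
    return history

# Tiny example: two relations, each the negation of the other (K=1).
wiring = {"a": ("b",), "b": ("a",)}
ops = {"a": lambda x: 1 - x, "b": lambda x: 1 - x}
traj = run({"a": 1, "b": 1}, wiring, ops, 2)
print(traj)  # oscillates: both active, both inactive, both active
```

The simulation experiments below all reduce to choosing a wiring, choosing one of the sixteen operators for each relation, and iterating this step function from different starting states.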
These preferences may be represented in terms of Boolean operators governing the behaviour of a focal relationship in terms of the existence of other relationships. For example, in condition (a), supplier exclusive dealing, a supplier only deals with the distributor if it is not dealing with its main competitor. This condition is operationalized by Rule 10 of Table 1. In condition (b), mutual exclusive dealing, the distributor wants an exclusive dealing relationship as well as the supplier, and hence the distributor will not deal with the supplier if the supplier is also dealing with other distributors. This condition is operationalized by Rule 3 of Table 1.
4.2. Supplier Exclusive Dealing
The Boolean operator here is read off the row labelled "Rule 10" in Table 1. This shows the state of the output relation in period t+1 for each possible combination of states of the two input relations in period t. In this condition the state of Input Relation 2, i.e., the existence or not of a relationship between the focal distributor and the closest competitor of the supplier, dominates the rule. If this relation is active in period t, the output relation will not be active in period t+1. The distributor does not care whether the supplier also deals with the other distributor. This is the Boolean rule that the output is active if neither Input Relation 1 nor Input Relation 2 is active, or if Input Relation 1 and not Input Relation 2 is active. More simply, the Boolean rule may be specified in terms of Input Relation 2 only, i.e., not Input Relation 2.

The columns below give the state of the output relation R(D0,S0) in period t+1 (1 = active, 0 = inactive) for each combination of the states of (Input Relation 1, Input Relation 2) in period t.

                      (0,0)  (0,1)  (1,0)  (1,1)   Rule in Boolean form
(Insulation)
Rule 1                  0      0      0      0     Never active
Rule 2                  1      1      1      1     Always active
(One condition)
Rule 3                  1      0      0      0     1 nor 2
Rule 4                  0      1      0      0     2 and not 1
Rule 5                  0      0      1      0     1 and not 2
Rule 6                  0      0      0      1     1 and 2
(Or, two conditions)
Rule 7                  0      1      1      0     1 exc. or 2
Rule 8                  1      0      0      1     (1 nor 2) or (1 and 2)
Rule 9                  1      1      0      0     not 1
Rule 10                 1      0      1      0     not 2
Rule 11                 0      1      0      1     2
Rule 12                 0      0      1      1     1
(Or, three conditions)
Rule 13                 0      1      1      1     1 or 2
Rule 14                 1      1      1      0     1 nand 2
Rule 15                 1      1      0      1     not 1, or 2
Rule 16                 1      0      1      1     1, or not 2

Table 1. Boolean Functions for two input variables (Source: Wilkinson and Easton, 1997)
Our simulations show that in conditions of supplier exclusive dealing, one type of equilibrium state or attractor exists where distributors deal with alternate suppliers, e.g., D0 deals with S0 and S2 and D1 deals with S1 and S3, as shown in Figure 2. In this situation the existing relations continue because the distributor is not also dealing with the supplier's closest competitor, and the inactive relations will not change because the suppliers will not deal with distributors who are also dealing with their closest competitor.
[Figure 2 here: the four supplier two distributor network with D0 linked to S0 and S2, and D1 linked to S1 and S3.]
Figure 2. An Attractor for the Four Supplier Two Distributor Network: Supplier Exclusive Dealing Condition

An equivalent attractor is when D0 deals with S1 and S3 and D1 deals with S0 and S2. If we start the network in this position it will remain there. However, if we change one of the starting conditions at random we find that this attractor is not very stable, and the network goes to another attractor involving various kinds of cyclical behaviour. Because there are eight output relations in our model there are 2^8 = 256 possible starting configurations. In order to examine the types of attractors that can emerge we used the program to systematically explore the behavior of the model for different starting conditions. The attractors that emerge from a sample of 108 of these starting conditions are shown in the appendix. Starting conditions are grouped according to the attractor that results, into broader types of attractors, and in terms of the number of states on the attractor, as indicated by the percentage of periods a particular output relation is active. The results show that the situation depicted in Figure 2 rarely occurs and that any disturbance from that state will result in the emergence of another attractor. Note that there are no transients leading to the attractors in our "simple" model; in other words, all states of the model are on an attractor. The most common attractors are cyclical patterns in which all the relations switch between active and inactive states. An example is given in Figure 3 for the starting condition 10110001. Here a four period attractor results in which two relations are active 50% of the time (2 of the 4 periods), one is always inactive and one always active. This is classified as a type b attractor, and three other attractors have this mix of behavior. The most common type of attractor is a four period attractor in which four relations are active half the time, two are active 1 out of 4 periods and two active 3 out of 4 periods.
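Under supplier exclusive dealing the update reduces to: R(Dj,Si) is active in period t+1 exactly when Dj is not dealing with Si's main competitor in period t. A sketch of the full 8-relation network (the wiring is our reading of Figure 1) confirms both claims above: the alternating configuration is a fixed point, and every one of the 256 starting states already lies on a cycle whose length divides four.

```python
from itertools import product

SUPPLIERS = range(4)      # S0..S3; the main competitor of Si is S(i+1) mod 4
DISTRIBUTORS = range(2)   # D0 and D1

def step(state):
    """Supplier exclusive dealing (Rule 10): R(Dj,Si) is active in t+1
    iff Dj is not dealing with Si's main competitor in t."""
    return {(d, s): 1 - state[(d, (s + 1) % 4)]
            for d in DISTRIBUTORS for s in SUPPLIERS}

def cycle_length(state):
    """Number of periods before a state first recurs (this map is
    invertible, so every state sits on a cycle and the loop terminates)."""
    current, n = step(state), 1
    while current != state:
        current, n = step(current), n + 1
    return n

# The attractor of Figure 2: D0 deals with S0 and S2, D1 with S1 and S3.
figure2 = {(0, 0): 1, (0, 1): 0, (0, 2): 1, (0, 3): 0,
           (1, 0): 0, (1, 1): 1, (1, 2): 0, (1, 3): 1}
print(step(figure2) == figure2)  # prints True: a fixed point

# All 2**8 = 256 starting states lie on cycles whose length divides four.
keys = [(d, s) for d in DISTRIBUTORS for s in SUPPLIERS]
lengths = {cycle_length(dict(zip(keys, bits)))
           for bits in product((0, 1), repeat=8)}
print(sorted(lengths))  # prints [1, 2, 4]
```

The absence of transients falls out of the structure: applying the "negate the next supplier's relation" update four times returns any state to itself, so the state space decomposes entirely into one, two and four period cycles.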
4.3. Mutual Exclusive Dealing
When the mutual exclusive dealing rule is used the number and type of attractors changes. The relevant Boolean operator is Rule 3 in Table 1. Here, the output relation will be active in period t+1 only if both input relations are not active in period t. This is the "nor" rule, i.e., neither Input Relation 1 nor Input Relation 2 can be active for the output relation to exist in the next period. The same equilibrium state or attractor exists as before, when the distributors deal with alternate suppliers, but it only arises if firms start in that situation. Some of the attractors emerging are shown in Appendix 2 for different starting conditions. An eight period cycle is a common attractor in which four relations are active 6 out of 8 periods and four are active 1 out of 8. Transient states also exist leading to an attractor. The foregoing demonstrates that even an apparently simple network can produce surprisingly complex behavior. Kauffman (1992) shows that the number of attractors of randomly connected Boolean nets depends on K and N. When K=2 the number of attractors is about equal to the square root of N. He also shows that the behavior of the network over time starting at any two different states will tend to converge (ibid. p. 200), i.e., the behaviour of the network is not highly sensitive to starting conditions. Lastly, the median number of states on the attractor cycle is about equal to the square root of N. Kauffman's results are for randomly connected Boolean nets, whereas we are interested in the behaviour of Boolean networks that conform to conditions found in industrial networks. In order to explore the number and kind of attractors existing for our focal networks, we simulate the behaviour of the network over time, keeping the Boolean rules fixed, while varying the starting conditions in terms of which relations are active and which inactive in the first period. The following questions may be investigated using this model simulation:
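The mutual exclusive dealing dynamics can be sketched the same way, with each relation now a NOR of its two inputs (again using our reading of the wiring: the same distributor's relation with the supplier's main competitor, and the other distributor's relation with the same supplier). The alternating configuration remains a fixed point, but unlike the Rule 10 case the map is no longer invertible, so transient states leading into attractors appear.

```python
from itertools import product

def step(state):
    """Mutual exclusive dealing (Rule 3, NOR): R(Dj,Si) becomes active
    iff neither input relation is active in the current period."""
    return {(d, s): 1 if state[(d, (s + 1) % 4)] == 0
                         and state[(1 - d, s)] == 0 else 0
            for d in (0, 1) for s in range(4)}

figure2 = {(0, 0): 1, (0, 1): 0, (0, 2): 1, (0, 3): 0,
           (1, 0): 0, (1, 1): 1, (1, 2): 0, (1, 3): 1}
print(step(figure2) == figure2)  # prints True: still an equilibrium

# Count states with no predecessor: such states can only ever be starting
# points, i.e. they are transient and lie on no attractor.
keys = [(d, s) for d in (0, 1) for s in range(4)]
all_states = [dict(zip(keys, bits)) for bits in product((0, 1), repeat=8)]
images = {tuple(step(s)[k] for k in keys) for s in all_states}
transient = 256 - len(images)
print(transient)
```

That the count is positive is easy to see by hand: the all-active state and any state with exactly one inactive relation both map to the all-inactive state, so the update cannot be one-to-one.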
• What happens to the behaviour of the network over time under various starting conditions and relevant Boolean rules? In other words, what are the attractors for the behaviour of the network? Which patterns of relations being active are mutually compatible, and what other repeated cyclical patterns of behaviour can arise?
• How likely is each attractor to arise, i.e., how large is the basin of attraction for each attractor?
• How stable are different attractors, i.e., does a slight move away from the attractor result in a return to the same or another attractor?
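These questions translate directly into a brute-force census over all 2^N starting states: iterate each state until it enters its attractor, then group states by attractor to obtain basin sizes and cycle lengths. A sketch (the `step` argument stands for whichever update rule is being studied; the function and parameter names are our own):

```python
def find_attractor(state, step):
    """Iterate from a starting state until a state repeats; return the
    attractor as a canonically rotated tuple of states."""
    seen, trajectory = {}, []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state)
    cycle = trajectory[seen[state]:]        # the states on the attractor
    start = cycle.index(min(cycle))         # canonical rotation for grouping
    return tuple(cycle[start:] + cycle[:start])

def basin_sizes(n, step):
    """Map each attractor of an n-relation network to its basin size."""
    basins = {}
    for x in range(2 ** n):
        state = tuple((x >> i) & 1 for i in range(n))
        att = find_attractor(state, step)
        basins[att] = basins.get(att, 0) + 1
    return basins

# Example: a 3-relation ring in which each relation copies its neighbour.
ring = lambda s: (s[1], s[2], s[0])
sizes = basin_sizes(3, ring)
print({len(att): size for att, size in sizes.items()})  # prints {1: 1, 3: 3}
```

The relative basin sizes answer the "how likely" question, the cycle lengths the "what patterns" question, and stability can be probed by flipping one bit of an attractor state and re-running `find_attractor`.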
We are particularly interested in the conditions under which "edge-of-chaos" attractors emerge. This is because attractors of this sort resemble a partially frozen pond, in that they involve combinations of a "frozen" set of fixed-on relations (i.e., long-term relations) or fixed-off relations (i.e., no relation), interspersed with "islands" of relations that exhibit a pattern of behaviour over time. These islands are separated from each other by the frozen skeleton of relations. This turns out to be an important characteristic as it means that changes in behavior in one part of the network do not interfere with behavior in other parts. Therefore beneficial mutations or innovations taking place within these "islands" can be retained more easily than if they caused damage elsewhere. These kinds of attractors have been identified by Kauffman, Langton and others as central to the emergence and organisation of complex systems including molecules, cells, organisms, and economic and social organizations (e.g. Kauffman 1992, 1995, Langton 1992). Such network attractors also correspond to what others have observed in real network patterns, i.e., a mix of stability and change (e.g., Gadde and Mattsson 1987, Hakansson 1992), and this parallel in part motivates our research.
[Figure 3 here: the network's states in Periods 1 to 4 of the four period attractor.]
Figure 3. A Four Period Attractor for Starting Condition 10110001
5. Next Steps
Further developments of the Boolean model involve introducing lagged operators and interactions between different industrial networks. The latter step leads to larger scale industrial and economic organisation. The extension is straightforward. For example, the behaviour of some relations (or actors) in one industrial network can be made to depend on the behavior of relations (or actors) in another network, as well as on those in their own network. Both the number of networks S and the extent of inter-coupling among networks C may be varied and may evolve. Finally, an exogenous environment can introduce noise or other exogenous effects directly on some or all parts of the network(s) (Wiley 1999).
5.1. NK Fitness Landscapes and Patches
Another realization of the NK model is in terms of NK fitness landscapes. Here, an entity's behavior is modeled in binary form, with 1 = on or active and 0 = off or inactive. The specific fitness or utility of an entity at time t+1 is determined by the entity's state in the period and by the states of K other entities. In terms of IMSs, a firm's (or relationship's) fitness depends
on its own behavior in the period as well as on the behavior of K other firms (e.g. suppliers, complementors, customers and competitors, or connected relationships). An entity changes its binary state from one period to the next (0 to 1, or 1 to 0) if it "expects" the change to improve its fitness. Entities are assumed to change their state based on the assumption that other entities' behavior will remain unchanged in the next period. Fitness values, usually between 0 and 1, are allocated randomly to each possible combination of states that describe K+1 entities (the actor and K other entities). Here we do not plan to model directly the behavior of relationships, as we did in the foregoing NK Boolean model. Instead, relations among actors emerge as a result of the formation of patches of cooperating actors. Kauffman (1995) introduced the concept of "patches" to refer to groups of actors coordinating their actions to achieve better group fitness. We use the concept of patches in two ways. The first focuses on the development of cooperative relations among actors in a network, and the second focuses on activities as the primary unit of analysis and how groups of activities emerge to define firms.
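A minimal sketch of this fitness-landscape update, assuming random fitness draws over the 2^(K+1) relevant state combinations and the one-step-lookahead rule just described (each firm flips its state only if, holding the K others fixed, the flipped configuration has higher fitness; the neighbour assignment is an illustrative choice):

```python
import random

random.seed(0)
N, K = 6, 2
# Each firm i is influenced by K other firms (here simply the next two).
neighbours = {i: [(i + j + 1) % N for j in range(K)] for i in range(N)}
# Random fitness in [0, 1] for every combination of the K+1 relevant states.
combos = [tuple((c >> b) & 1 for b in range(K + 1)) for c in range(2 ** (K + 1))]
fitness_table = {i: {combo: random.random() for combo in combos}
                 for i in range(N)}

def fitness(i, state):
    """Fitness of firm i given its own state and its K neighbours' states."""
    return fitness_table[i][(state[i],) + tuple(state[j] for j in neighbours[i])]

def step(state):
    """Each firm flips its state iff it expects the flip to raise its own
    fitness, assuming all other firms stay unchanged."""
    new = list(state)
    for i in range(N):
        flipped = list(state)
        flipped[i] = 1 - flipped[i]
        if fitness(i, tuple(flipped)) > fitness(i, tuple(state)):
            new[i] = flipped[i]
        # otherwise the firm keeps its current state
    return tuple(new)

state = tuple(random.randint(0, 1) for _ in range(N))
for _ in range(10):
    state = step(state)
print(state)
```

Because every firm extrapolates the others' behavior naively, such adaptive walks can oscillate rather than settle, which is exactly the coordination problem that patches are introduced to address.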
5.2. Modeling the Emergence of Patches of Cooperating Actors
The first approach follows that of Kauffman, in the sense that patches define sets of cooperating elements. With a patch size of one, entities operate independently in deciding their behavior in the next period. Larger patch sizes correspond to multiple actors cooperating to jointly improve group fitness, and it is assumed that gains and losses are distributed equally among group members; therefore average fitness in the group is the driving force of behavior. Kauffman (1995) shows that patch size in this sense has a dramatic effect on the ability of a system to achieve higher performing structures. We plan to use NK fitness landscape type models to model the patch formation process endogenously and to explore the way network dynamics and evolution are impacted by patch formation. Previous modelling work has tended to focus on patch size as an externally imposed parameter, whereas the formation of patches in an IMS is a central feature of structural change and evolution. Firms and interfirm alliances in an IMS are examples of patches formed in real networks in a self-organizing way. Modeling the way patches may form and reform over time, and how this shapes network performance, will yield important insights into the deep processes of industrial network evolution. Four potentially fruitful approaches to modelling patch formation have been identified. The first uses payoff rules for patch formation, based on work by Kauffman and McCready. Here, an IMS may begin with individual behavior, i.e. a patch size of 1, or some other starting configuration of interest, and each period actors join and leave patches according to certain rules. Actors try to join a patch (and thereby leave their existing patch) if the average payoff is greater in the other patch than in their existing patch. However, actors may or may not be accepted into a patch.
New patch members are accepted only if the average patch payoff is expected to increase as a result of their joining. These expectations are determined by
examining what the payoffs would have been if the actor had been a member of the patch in the previous period. The evolution of patches can be modelled over time using different starting configurations, different values of K, different patch churning rules, as well as different types of fitness distributions (e.g. normal versus other distributions). The second approach is based on models of iterated prisoner's dilemma games with choice and refusal of partners (IPD/CR) pioneered by researchers at Iowa State University (e.g. Tesfatsion 1997). These models allow actors to form links with other entities and learn from the experience of interacting with them. Depending on the outcomes of interaction over a fixed number of IPD plays, links may be strengthened, extended, undermined, or broken. Over time actors form groups (patches) of interacting actors that are more or less cooperative, or they remain isolated "wallflowers", i.e., patches of size 1. Actors can employ different strategies in their interactions with others, which will affect the outcomes of interaction and the types of groups of interacting actors that form. Strategies can also be modified over time depending on the performance of an actor and its awareness of the performance of others. The third approach models the probability of cooperating with other actors directly and can be viewed as a variant of the IPD/CR approach, except that there is no modelling of interaction in terms of IPD games. This approach is based on the models developed by Hartvigsen et al. (2000). Populations of actors are located on a two-dimensional square toroidal lattice and have a defined number of neighbours with whom they can interact. Each actor has a probability of cooperating, Pi, and those that interact are considered cooperative and open to communication. Those with low Pi interact infrequently and those with Pi = 0 are pure defectors.
A parameter specifies the amount by which each actor's Pi is changed in response to the interaction experience each period. Interactions in any period are governed by two random numbers generated for each actor that determine whether they interact and whether a neighbour cooperates or defects. If the chosen neighbour cooperates, the target actor's Pi is increased by this amount; otherwise it is decreased. By simulating the pattern of interaction over time, interacting groups (patches) emerge. The final approach is suggested by models of firm growth developed by Epstein and Axtell (1996) and Axtell (1999). In these models, actors join firms (patches) to gain economies of scale and the rewards are divided equally among members of a firm. Actors vary in terms of their work versus leisure trade-offs and this affects their "cooperation" within a firm. Problems of shirking and free riding result from work-leisure tradeoffs, and this leads to the breakup of firms and the formation of other firms. These models have been shown to be capable of mimicking the actual size distribution of firms in an economy.
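The third approach can be sketched directly. The names `DELTA` (the adjustment amount) and the lattice size are our own, and the update details are a simplification of the published model rather than a reproduction of it:

```python
import random

random.seed(1)
SIZE, DELTA = 10, 0.05   # lattice side and adjustment amount (assumed values)

# Each actor on a SIZE x SIZE toroidal lattice has a probability of cooperating.
p = [[random.random() for _ in range(SIZE)] for _ in range(SIZE)]

def neighbours(x, y):
    """Von Neumann neighbours on the torus."""
    return [((x + 1) % SIZE, y), ((x - 1) % SIZE, y),
            (x, (y + 1) % SIZE), (x, (y - 1) % SIZE)]

def period():
    """One period: each actor may interact with one random neighbour; its
    cooperation probability rises if the neighbour cooperates, falls if not."""
    for x in range(SIZE):
        for y in range(SIZE):
            if random.random() < p[x][y]:          # does the actor interact?
                nx, ny = random.choice(neighbours(x, y))
                if random.random() < p[nx][ny]:    # neighbour cooperates?
                    p[x][y] = min(1.0, p[x][y] + DELTA)
                else:
                    p[x][y] = max(0.0, p[x][y] - DELTA)

for _ in range(50):
    period()
print(round(sum(sum(row) for row in p) / SIZE ** 2, 2))
```

Plotting the lattice after many periods reveals the spatial clusters of high-Pi actors, i.e. the emergent patches of cooperators, separated by regions of defectors.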
5.3. Modeling Activity Interactions using NK Fitness Landscapes
The standard NK fitness model represents the N elements as scalars (binary values of 1s and 0s). Fitness is computed for the K+1 vector describing the state of the firm and the states of K other firms that influence it. The firm's state in period t+1 is determined to be the one that has the highest fitness.
The NK fitness model can be adapted to model a firm in which the states are vector valued, rather than scalar valued. For a given set of activities A, Ai is the vector of 0s and 1s indicating whether firm i does or does not perform each activity. Firms can perform different combinations of activities, which are the equivalent of patches, i.e. they can alter the combination of activities performed in the expectation of improving the overall fitness of the vector. The fitness of performing or not performing an activity depends on the K other activities it is connected to that are performed within the firm or by other firms. Patches here are a natural descriptor of a firm in terms of the set of activities included in the vector. Patch sizes vary in terms of the number of activities firms consider in combination. Firms can add and drop activities in their vector (i.e. change patch size) as well as alter the state of each activity in their vector or patch. The approach is similar in some ways to that proposed by McKelvey (1999), in which he uses the activities specified in Porter's value chain as the basis for representing firms in terms of NK models.
6. Summary, Managerial Implications, and Conclusions
Traditionally, the study of business and economic systems has focused on the actions of agents as rational actors who interact with each other and their environment to control scarce resources. An alternative view has developed in recent years, which sees such systems as more like biological and ecological systems, as complex adaptive self-organising systems. This view profoundly changes the underlying metaphor of economic processes, from what might be called an engineering metaphor to a biological one involving stochastic, dynamic processes with myopic, "satisficing" agents. Formal representations of such processes can be difficult, if not impossible, to solve. There is an approach based on evolutionary computer algorithms that avoids the need for analytic solutions of a formal model. Among the promising tools are agent based computer models of complex adaptive systems and the emerging Science of Complexity. The relevance of the Science of Complexity to management is beginning to be appreciated by industry. Examples are the Embracing Complexity conferences run by Ernst and Young, which bring together scientists from various disciplines working on aspects of complex systems behaviour with business, and the ICCS conferences run by the New England Complex Systems Institute, which attract business people as well as academics. New types of models are being developed to better understand, sensitize managers to, and perhaps control the behaviour of complex intra- and inter-organizational systems. An important insight of the new types of models is that firms operate in systems in which control is distributed. While there may be exceptions, in the main no actor or entity coordinates or directs the behaviour of the network. Instead, each firm participates, learns and responds to the local circumstances it encounters and tries to achieve its particular objectives.
Coordinated action results from interrelated yet separate operations occurring in parallel. This leads to a different concept of management and strategy: one in which the firm participates and responds to the system in which it operates rather than tries to control and
direct it (Wilkinson and Young 2002). Firms jointly create both their own destiny and the destiny of others. In this regard, firms act to preserve and create the ability to act through the futures they shape for the other firms on whom they depend, as well as for themselves. Together, they co-create the winning games the winning actors play (Kauffman 1996). In the mainstream business academic literature, we are beginning to see the emergence of this type of thinking, as firms come to see themselves as parts of business ecosystems in which cooperative and competitive processes act to shape the dynamics and evolution of the ecosystem (Anderson et al 1999, Haeckel 1999, Moore 1995, Moore 1996).
References
Anderson, Philip, Meyer, Alan, Eisenhardt, Kathleen, Carley, Kathleen and Pettigrew, Andrew, 1999, Introduction to the Special Issue: Applications of Complexity Theory to Organization Science, Organization Science, 10 (May-June): 233-236.
Arthur, Brian, 1994, Increasing Returns and Path Dependence in the Economy, University of Michigan Press (Ann Arbor).
Axtell, Robert, 1999, The Emergence of Firms in a Population of Agents, Center on Social and Economic Dynamics, Brookings Institution (Washington, DC).
Bak, Per, 1996, How Nature Works, Springer Verlag (New York).
Easton, G., 1995, Methodology and Industrial Networks, in Business Marketing: An Interaction and Network Perspective, edited by K. Moller and D. T. Wilson, Kluwer (Norwell, Mass).
Easton, Geoff, Wilkinson, Ian F. and Georgieva, Christina, 1997, On the Edge of Chaos: Towards Evolutionary Models of Industrial Networks, in Relationships and Networks in International Markets, edited by Hans Georg Gemunden and Thomas Ritter, Elsevier, 273-294.
Epstein, Joshua M. and Axtell, Robert, 1996, Growing Artificial Societies, MIT Press (Cambridge, MA).
Gadde, Lars-Eric and Mattsson, Lars-Gunnar, 1987, Stability and Change in Network Relations, International Journal of Research in Marketing, 4, 29-41.
Gell-Mann, Murray, 1995, Complex Adaptive Systems, in The Mind, The Brain, and Complex Adaptive Systems, edited by H. J. Morowitz and J. L. Singer, Santa Fe Institute Studies in the Science of Complexity, Addison-Wesley (Reading), 11-24.
Haeckel, Stephan H., 1999, Adaptive Enterprise, Harvard Business School Press (Boston).
Hakansson, Hakan, 1992, Evolution Processes in Industrial Networks, in Industrial Networks: A New View of Reality, edited by B. Axelsson and G. Easton, Routledge (London), 129-143.
Hakansson, Hakan and Snehota, Ivan, 1995, Developing Relationships in Business Networks, Routledge (London).
Hartvigsen, G., Worden, L. and Levin, S. A., 2000, Global Cooperation Achieved Through Small Behavioral Changes Among Strangers, Complexity, 5 (3), 14-19.
Holland, John H., 1998, Emergence, Addison-Wesley (Reading, MA).
Kauffman, S., 1992, Origins of Order: Self-Organisation and Selection in Evolution, Oxford University Press (New York).
Kauffman, S., 1995, At Home in the Universe, Oxford University Press (New York).
Kauffman, Stuart, 1996, Investigations: The Nature of Autonomous Agents and the Worlds they Mutually Create, Santa Fe Institute Working Paper 96-08-072.
Langton, Chris, 1992, Adaptation to the Edge of Chaos, in Artificial Life II: Proceedings Volume in the Santa Fe Institute Studies in the Science of Complexity, edited by C. G. Langton, J. D. Farmer, S. Rasmussen and C. Taylor, Addison-Wesley (Reading).
Langton, Chris, ed., 1996, Artificial Life: An Overview, MIT Press (Cambridge).
Levitan, Bennett, Lobo, Jose, Kauffman, Stuart and Schuler, Richard, 1999, Optimal Organization Size in a Stochastic Environment with Externalities, Santa Fe Institute Working Paper 99-04-024.
Moore, James, 1996, The Death of Competition, John Wiley (Chichester).
Moore, Geoffrey, 1995, Inside the Tornado, Harper Collins (New York).
McKelvey, William, 1999, Avoiding Complexity Catastrophe in Coevolutionary Pockets: Strategies for Rugged Landscapes, Organization Science, 10 (May-June), 294-321.
Resnick, Mitchel, 1998, Unblocking the Traffic Jam in Corporate Thinking, Complexity, 3 (4), 27-30.
Tesfatsion, L., 1997, How Economists Can Get Alife, in The Economy as an Evolving Complex System II, edited by W. B. Arthur, S. Durlauf and D. A. Lane, Addison-Wesley (Redwood City), 534-564.
Vriend, N., 1995, Self Organization of Markets: An Example of a Computational Approach, Journal of Computational Economics, 8 (3), 205-231.
Welch, Catherine and Wilkinson, Ian F., 2000, From AAR to AARI? Incorporating Idea Linkages into Network Theory, Industrial Marketing and Purchasing Conference, University of Bath, September.
Wiley, James B., 1999, Evolutionary Economics, Artificial Economies, and Market Simulations, Working Paper, University of Western Sydney, School of Marketing, International Business and Asian Studies.
Wilkinson, Ian F. and Easton, G., 1997, Edge of Chaos II: Industrial Network Interpretation of Boolean Functions in NK Models, in Interaction, Relationships and Networks in Business Markets: 13th IMP Conference Vol 2, edited by F. Mazet, R. Salle and J-P. Valla, Groupe ESC Lyon, 687-702.
Wilkinson, Ian F. and Young, Louise C., 2002, On Cooperating: Firms, Relations and Networks, Journal of Business Research, 55 (2), 123-132.
Wilkinson, Ian F., Hibbert, Bryn, Easton, Geoff and Lin, Aizhong, 1999, Boolean NK Program Version 2.0: A C++ program to simulate the behaviour of NK Boolean Networks, School of Marketing, University of New South Wales.
Wilkinson, I. F., Wiley, James B. and Easton, Geoff, 1999, Simulating Industrial Relationships with Evolutionary Models, Proceedings of the 28th European Marketing Academy Annual Conference, Humboldt University, Berlin.
Appendix 1. Supplier Exclusive Dealing Attractors for 108 Starting Conditions
[Appendix table: starting conditions (e.g. 01010101, 10101010) grouped by attractor type, with each listed output relation active 50% of the periods.]
Chapter 20
Can Models Capture the Complexity of the Systems Engineering Process?
Krishna Boppana, Sam Chow, Olivier L. de Weck, Christian LaFon, Spyridon D. Lekkakos, James Lyneis, Matthew Rinaldi, Zhiyong Wang, Paul Wheeler, Marat Zborovskiy
Engineering Systems Division (ESD), Massachusetts Institute of Technology (MIT)
Leonard A. Wojcik Center for Advanced Aviation System Development (CAASD) The MITRE Corporation
1.1. Introduction
Many large-scale, complex systems engineering (SE) programs have been problematic; a few examples are listed below (Bar-Yam, 2003 and Cullen, 2004), and many others have been late, well over budget, or have failed: • Hilton/Marriott/American Airlines system for hotel reservations and flights; 1988-1992; $125 million; "scrapped" • Federal Aviation Administration Advanced Automation System; 1982-1994; $3+ billion; "scrapped" • Internal Revenue Service tax system modernization; 1989-1997; $4 billion; "scrapped" • Boston "Big Dig" highway infrastructure project; roughly $15 billion; about $9 to $10 billion over budget and late.
Traditional systems engineering (TSE) defines a step-by-step planned approach to systems engineering, which has proven effective across many systems engineering efforts. However, some systems engineering efforts defy the TSE process, due to various complexity-related factors along a set of characteristics summarized in Figure 1 (White, 2005). In this paper, we compare three modeling approaches in terms of their ability to represent the complexity of the SE process. We apply two of these approaches to the specific case of the Federal Aviation Administration (FAA) Advanced Automation System program.
[Figure 1 here: characterization template spanning the system, strategic, and stakeholder contexts.]
Figure 1. Enterprise systems engineering environment characterization template (White, 2005). Greater SE complexity corresponds to greater distance from the circle's center.
1.2. The Advanced Automation System (AAS) Program as a Case Study to Model the SE Process
The AAS program (1982-1994) was chosen as a case study to compare modeling approaches to the SE process because of AAS program complexity and difficulty, and because extensive information is available about the AAS program in the open literature - we base the model comparison entirely on open literature information. The AAS schedule, at the time of the contract award to IBM (1988), was broken up into five major portions. These were: • PAMRI - The Peripheral Adapter Module Replacement Item - This was the initial project of AAS and improved the radar and communication systems of the air traffic control system. This was completed on time. • ISSS - The Initial Sector Suite System - This is the project which was intended as the initial replacement of the controller workstations.
• TAAS - The Terminal Advanced Automation System - The updating of the terminal area departure and approach control. • TCCC - Tower Control Computer Complex - This was intended as the automation system for the control towers. • ACCC - Area Control Computer Complex - This was intended as the automation system for enroute aircraft. PAMRI, consisting of the peripheral hardware updates, was completed on time. The schedule and budget problems were associated with the software development, notably the ISSS development effort. The AAS event and budget timeline, based on open literature sources (Krebs and Snyder, 1988; Scott, 1988; Levin, et al., 1992; Del Balzo, 1993; Ebker, 1993; Lewyn, 1993; Barlas, 1996; Beitel, et al., 1998), is briefly summarized below: • 1982: The FAA sets the initial requirements for AAS and seeks contractors. • 1984: IBM and Hughes are named the finalists to build the prototype. At this point $500 million has been spent developing the bid. • 1988: The FAA awards the prime contract, worth $3.6 billion, to IBM. Hughes protests the award, causing an initial project delay. • 1989: IBM begins work on the AAS. The software component of the project is estimated at 2 million lines of code. • 1990: Requirements are still unclear for ISSS, as indicated by the 500-700 requirements change requests for the year. To help finalize requirements, IBM builds a prototype center in Gaithersburg, Maryland so that controllers can try out the software under development. Despite the fact that requirements were not clear, approximately 1 million lines of code have already been written. Estimates indicate that 150,000 lines of code will need to be rewritten due to the requirements changes and the resulting bugs. To date the cost overrun is $242 million. • 1992: The FAA announces a 14-month delay to the project completion. FAA and IBM shake up their management. • 1993 April: IBM and the FAA freeze the requirements for ISSS.
• 1993: IBM announces that the project will not be ready until after the year 2000. IBM starts working on a more methodical, communication-oriented project management philosophy with new managers.
• 1994: The AAS program ceases to exist as originally conceived, leaving its various elements terminated, restructured, or as parts of smaller programs.
1.3. Application of System Dynamics to the AAS Program
System Dynamics (SD) is a methodology for understanding the structures which create dynamic behavior. On a project, those structures consist of (1) work backlogs and the cycling of work among tasks to be done, tasks completed but needing rework (often with significant delays in discovering the rework), and tasks really completed; (2) the feedback control mechanisms by which managers attempt to accomplish the project (adding resources, exerting schedule pressure, and working overtime); and (3) the secondary and tertiary impacts of project control (experience dilution, overcrowding, "haste makes waste", fatigue, etc.) that undermine the recovery strategy. The interplay of these structures, in the face of project problems (e.g., changing requirements,
changing scope, resource bottlenecks, etc.) determines how disruptive these problems become, and ultimately project cost and schedule overrun (Lyneis, 2001). An important part of the AAS program was the ISSS, and we focused on modeling that aspect of AAS with SD. A simple SD project model developed for the MIT course ESD.36 (System and Project Management) was adapted to represent ISSS. We did this by altering budgets, schedule, normal productivity, and normal work quality to achieve the budgeted ISSS program, and then exogenously specifying problems that the project experienced. It is very clear from the background research on AAS that uncertain and unrealistic requirements profoundly affected the program. In fact, GAO reports of the mid to late eighties indicate that requirements knowledge was very poor, to the point where the GAO had little confidence the AAS would be delivered on time. Requirements were not frozen until 1993, approximately 48 months after the project started (Del Balzo, 1993). We represented these requirements problems in two ways: (1) a time-varying impact of uncertain requirements on work quality (a 50% reduction for the first 18 months of ISSS), which was significantly reduced when the Gaithersburg facility came on line¹; and (2) a constant "penalty" from unrealistic requirements to project work quality and productivity which persisted throughout the project (a 20% penalty).
Figure 2. Impact of uncertainty and penalty on ISSS performance.
Figure 2 shows four simulations from the SD model: (1) Budget; (2) 20% Penalty; (3) 50% Uncertainty; and (4) a combination of penalty and uncertainty. The budget finishes in month 58, approximately on schedule. However, with the requirements problems actually experienced by the project (represented by the combination simulation), the project is still not completed by month 120. Because we assume that staffing levels are limited to those actually experienced on the project, any unbudgeted rework generated by the requirements problems translates directly into delayed completion of the project. The other two simulations shown in the figure represent the impacts of uncertain requirements and unrealistic requirements operating alone on the project. Note that with the severe impact of uncertainty early in the project, very little work actually gets completed. However, once Gaithersburg comes online and uncertainty is eliminated, progress accelerates. In contrast, with a requirements penalty
¹ Gaithersburg coming on line also significantly reduced the time to discover any rework.
progress occurs sooner, but the pace never really accelerates, as the penalty persists throughout the project. With either the penalty or uncertain requirements alone, the project would have finished around month 88.² We then conducted a series of sensitivity tests to see which aspect of the project problems had the greatest impact on project performance. We varied: sensitivity to penalty (to 10% and 30% from the base case of 20%); sensitivity to uncertainty (to 25% and 75% from the base case of 50%); and sensitivity to Gaithersburg online (to month 9 and month 27 from the base case of month 18). The results are shown in Figure 3. Note that the severity of any penalty from unrealistic requirements has the greatest impact on performance. This is largely because the assumed impact persists throughout the project. Even strong uncertainty has little incremental effect as long as it is eliminated early. When Gaithersburg comes online has little impact within the range tested. This suggests (and we emphasize suggest) that the unrealistic requirements had a greater effect on the failure of ISSS than the skipped testing and the late opening of the Gaithersburg testing center.
Figure 3. Sensitivity to penalty, uncertainty, and Gaithersburg online (panels: a. Penalty; b. Uncertainty; c. Gaithersburg online).
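The rework cycle described above (work flows from the backlog to "apparently done", with undiscovered rework returning to the backlog after a delay) can be sketched as a minimal simulation. This is an illustrative toy with made-up parameters, not the actual ESD.36 model:

```python
# Minimal rework-cycle project model (illustrative; not the actual ESD.36 model).
# Each month, work flows from the backlog to "done"; a fraction (1 - quality)
# is really rework, rediscovered after a delay and returned to the backlog.

def simulate(total_work=800.0, rate=20.0, quality=0.8,
             discovery_delay=6.0, max_months=240):
    backlog, done, undiscovered = total_work, 0.0, 0.0
    for month in range(1, max_months + 1):
        work_done = min(rate, backlog)
        backlog -= work_done
        done += quality * work_done              # truly completed
        undiscovered += (1 - quality) * work_done
        rediscovered = undiscovered / discovery_delay
        undiscovered -= rediscovered
        backlog += rediscovered                  # rework returns to the backlog
        if done >= 0.99 * total_work:
            return month                         # month of (99%) completion
    return max_months

print(simulate(quality=1.0))  # perfect quality: finishes in total_work/rate months
print(simulate(quality=0.8))  # a 20% quality penalty stretches the schedule
```

The same structure, with a time-varying quality reduction for the first 18 months, corresponds to the "uncertainty" experiment; a constant reduction corresponds to the "penalty" experiment.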
² The magnitudes of the uncertainty and the penalty are both estimates. We chose these values for the base case simulation as they produced approximately equal impacts on project progress.

1.4. Application of HOT to the AAS Program
Highly optimized tolerance (HOT) is a framework for understanding certain aspects of complexity in designed or engineered systems. Carlson and Doyle (1999) originally developed the HOT concept and applied it to forest fire management. They showed how power laws in event size emerge from minimization of expected cost in the face of design tradeoffs. Since then, HOT has been associated with tradeoff analyses in such systems as internet architecture (Alderson, et al., 2004). The HOT approach has been adapted to the SE process (Wojcik, 2004), resulting in an open-loop control model. In the HOT model of the SE process, the function s(t) represents the progress of the SE program and may reach a maximum value of 1 if the program is completed. An incomplete SE program will have a maximum value of s(t) less than 1. The HOT model is based on minimizing total cost across the program lifetime, where cost per unit time has three terms. The first term in the cost density function is the pressure to finish the project. Coefficients A and r generate cost pressure to complete the program: the larger the value of A, the more cost pressure to complete the program. The parameter r is a constant representing how pressure builds up over time; the larger the value of r, the more time it takes for pressure to build up. The second term in the cost density function is the impact from random events, rolled up into a stochastic function of time to simulate stakeholder interactions, external factors and other unanticipated events. Two parameters, B (magnitude of random event cost impact) and p (probability per unit time of random events), combine to form a product coefficient Bp. The third term in the cost density function models the inherent technical difficulty of the SE program, parameterized by a coefficient D.
Figure 4. Modeled AAS cost compared to program appropriations (dotted curve). Each solid curve corresponds to a different set of HOT model parameters.
To apply the HOT model to the whole AAS program, we used the project plan and original budget to calibrate the values of A, Bp, D and r as they appeared at the beginning of the AAS program. In particular, Bp is set equal to 0. Then, a revised set of parameters A*, Bp*, D* and r* was estimated from the actual trajectory of the AAS program, using a design of experiment (DOE) based on an orthogonal array for four factors at three levels, and three confirmation runs. The results are summarized in Figure 4, which shows realized cumulative cost of the program for various parameter value combinations and a "best fit" to the actual cumulative cost. We found that the best-fit B*p* has a non-zero value, A* > A, and D* > D. From these parameter comparisons, combined with post-mortem assessments of the AAS program, our interpretation is that the original AAS plan underestimated both the level of uncertainty of the effort (B*p* > Bp) and the inherent technical difficulty of the AAS program (D* > D). We also suggest that A* > A may indicate a problem with managerial effectiveness in the AAS program; more pressure to complete was applied from outside the program than originally anticipated.
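The "orthogonal array for four factors at three levels" used in the DOE is, in standard notation, an L9(3^4) array. The sketch below builds it explicitly; the factor names follow the HOT parameters, while the numeric levels are placeholders, since the calibrated candidate values are not given in the text:

```python
from itertools import combinations

# Standard L9(3^4) orthogonal array: 9 runs cover 4 factors at 3 levels each,
# and every pair of factors sees all 9 level combinations exactly once.
L9 = [
    (1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3),
    (2, 1, 2, 3), (2, 2, 3, 1), (2, 3, 1, 2),
    (3, 1, 3, 2), (3, 2, 1, 3), (3, 3, 2, 1),
]
factors = ["A", "Bp", "D", "r"]  # the four HOT parameters varied in the DOE

for run, levels in enumerate(L9, start=1):
    print(f"run {run}: " + ", ".join(f"{f}=L{v}" for f, v in zip(factors, levels)))

# Orthogonality check: every factor pair exhibits all 9 level combinations.
for i, j in combinations(range(4), 2):
    assert len({(row[i], row[j]) for row in L9}) == 9
```

Nine model runs (plus the confirmation runs) thus stand in for the 3^4 = 81 runs a full factorial sweep would require.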
1.5. Extensions to the Original HOT Model Based on COSYSMO
COSYSMO is a parametric cost model for systems engineering projects derived from a combination of data from past systems engineering efforts and subjective inputs from systems engineering experts (Valerdi, Miller and Thomas, 2004). Data used to generate COSYSMO parameters come from systems engineering efforts across multiple systems and companies (Center for Software Engineering, 2005). We did not attempt to apply COSYSMO to the AAS SE process; instead we extend the ESE Environment Characterization Template (Figure 1) to include factors in the COSYSMO model. These potentially could be scored subjectively to permit estimation of the HOT parameters A, Bp and D, as follows. COSYSMO drivers related to the impact of pressure to complete (parameter A in the HOT model) include productivity, motivation, experience and rework likelihood; these parameters characterize whether there are good people on board who can minimize the impact of schedule pressure. COSYSMO drivers related to uncertainty (parameter Bp in the HOT model) include the requirements environment, time to rediscover rework, system understanding, technical maturity, stakeholder environment, manageability, system reliability, and the nature of the contract. COSYSMO drivers related to cost due to build speed (parameter D in the HOT model) include quality, experience, productivity, process capability, degree of difficulty, scope/scale, system reliability, and the nature of the contract. In our initial attempt to relate COSYSMO drivers to the HOT parameters, we experimented with applying a score of either 0 or 1 to each COSYSMO driver and then adding up the driver scores corresponding to each HOT parameter to generate a complete set of HOT parameters. We did not draw any firm conclusions from these experiments; further work is needed to connect the HOT and COSYSMO models.
1.6. Conclusions
SD emphasizes schedule and is validated against significant past experience. SD potentially can show emergent behaviors through interactions and feedback loops. HOT models the whole SE "iron triangle" (cost, schedule, performance) with a relatively simple model, but it is not well validated against experience. HOT has promise as a higher-level model showing emergent behaviors from planning and replanning cycles. COSYSMO is calibrated with expert inputs and past experience and covers the whole triangle, but has less potential for displaying truly emergent behaviors. Retrospective application of SD and HOT to the single example of the AAS program is inconclusive on whether modeling can provide useful insight into complexity and emergence in the SE process, but we see integration of the three approaches as a promising direction for future research into complexity in SE, building on the strengths of all three: SD for relatively detailed process modeling, HOT for coarse, higher-level modeling, and COSYSMO as a calibration reference tying the modeling to key factors in previous SE experience. Modeling of both successes and failures in past large-scale engineering programs will provide further insights.
Chapter 21
Biological Event Modeling for Response Planning

Clement McGowan, Ph.D.
Fred Cecere, M.D.
Robert Darneille
Nate Laverdure
Noblis, Inc. 3150 Fairview Park South Falls Church, VA 22042 [email protected]
1.0. Introduction
People worldwide continue to fear a naturally occurring or terrorist-initiated biological event. Responsible decision makers have begun to prepare for such a biological event, but critical policy and system questions remain: What are the best courses of action to prepare for and react to such an outbreak? Where should resources be stockpiled? How many hospital resources-
introduction of a contagious disease) that spreads to, or even commences in, multiple cities. The model provides analytic support to aid in response planning. Our primary objective in this modeling is to reveal the relative effectiveness of different prevention and mitigation strategies. Our goal was to identify some critical leverage points, where policies or actions will make a substantial difference (in terms of lives saved). Further, we wanted to determine the requirements for new response capabilities that, based upon our simulation modeling, will improve outcomes. We extended the classic SIR model using system dynamics, an established modeling approach well suited to ensuring that major policy dimensions, alternatives, and outcomes are addressed. To help frame and answer critical policy decisions, our bio-event model goes beyond the standard SIR disease spread to address the:
• Effectiveness of different therapies as a function of resource availability;
• Impact of resource inventories, initial positioning, and adaptive surging;
• Need to protect key health personnel who are vulnerable to infection;
• Requirement to restock consumable resources such as drug-based therapies;
• Potential for disease spread across independent cities, employing different preparation and response strategies; and
• Limiting of two-way population movement among the cities.
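For reference, the classic SIR model being extended can be sketched as a discrete-time simulation (the parameter values are illustrative, not taken from the chapter):

```python
# Classic SIR model, discrete-time sketch: Susceptible -> Infected -> Recovered.
# beta is the infection rate, gamma the recovery rate (values illustrative).

def sir(population=100_000, infected0=10, beta=0.3, gamma=0.1, days=160):
    s, i, r = population - infected0, infected0, 0.0
    history = []
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

final_s, final_i, final_r = sir()[-1]
# With beta/gamma = R0 = 3, most of the population is eventually infected.
print(f"recovered by day 160: {final_r:.0f}")
```

The extensions listed above add compartments, resources, and policies on top of this three-compartment core.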
2.0 Description of the Model
We have developed a system dynamics model for the spread of a biological agent (disease) covering up to four distinct interacting geographical areas (called "cities"). This model supports a variety of resource-based ways to diminish the spread and impact of the disease. The model's settings can be adjusted to correspond to essential characteristics of any communicable disease. This bio-event model has the following major components with user-settable characteristics:
• Disease (infectivity, lethality, stages and their durations);
• Treatments (effectiveness, number of medical personnel and duration required to administer);
• Cities (population, movement between cities, initial number of infectious people, contact rate within the population, available key resources including personnel and what is required by the treatments);
• Surge (timing and amount of supplied resources and personnel); and
• Policies (e.g., attempting to confine and/or reduce the contact rate with associated timings and different rates of success).
2.1. Disease Stages
For greater realism in modeling disease spread, we refined the Infected (I) stage of the standard SIR model. When infected, members of the Susceptible (S) population move first into a "Disease Incubation Period" stage for a user-settable time duration. Next they move into an "Infectious Pre-Symptomatic" stage for another user-settable
duration. In this stage they can infect people they come into contact with and, since they are pre-symptomatic, no treatment or isolation actions will be taken. Then, after entering the "Infectious Symptomatic" stage, they are partitioned between "Confined Population" and "Unconfined Infectious" stages based upon a user-settable effectiveness (or likelihood of recognition) setting. Confined people will no longer spread the disease, but will continue to require treatment resources. Key Personnel follow a path similar to that of the general population in that they can become infected and thus require resources for treatment. Further, having infected key personnel reduces the capacity of the overall system to treat those who are infected. Ultimately those infected will end up as part of the Recovered or the Deceased Population (i.e., the (R) stage for Recovered or Removed in the SIR model). The lethality of the particular simulated disease and the effectiveness of the treatment received are the key determinants of what percentage of those infected end up in the recovered stage. Of course, disease lethality and treatment effectiveness are user-settable.
Figure 1: Disease Stages Progressing from Susceptible to Recovered or Deceased
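The stage progression in Figure 1 can be sketched as a compartment update; all rates and the confinement likelihood below are placeholder values standing in for the model's user-settable parameters:

```python
# Refined disease stages (a sketch; every rate here is user-settable in the
# actual model): Susceptible -> Incubation -> Infectious Pre-Symptomatic ->
# Infectious Symptomatic (Confined or Unconfined) -> Recovered or Deceased.

def step(state, contact_rate=0.4, incubation_days=3.0, presymp_days=2.0,
         symptomatic_days=5.0, confine_odds=0.5, lethality=0.2):
    s, e, p, conf, unconf, r, d = state
    n = sum(state)
    # Only pre-symptomatic and unconfined symptomatic people spread the disease.
    new_exposed = contact_rate * s * (p + unconf) / n
    to_presymp = e / incubation_days
    to_symp = p / presymp_days
    leaving = (conf + unconf) / symptomatic_days   # recover or die
    return (s - new_exposed,
            e + new_exposed - to_presymp,
            p + to_presymp - to_symp,
            conf + confine_odds * to_symp - conf / symptomatic_days,
            unconf + (1 - confine_odds) * to_symp - unconf / symptomatic_days,
            r + (1 - lethality) * leaving,
            d + lethality * leaving)

state = (99_990.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0)
for _ in range(200):
    state = step(state)
print(f"recovered: {state[5]:.0f}, deceased: {state[6]:.0f}")
```

Note how the confined compartment requires treatment resources but contributes nothing to new infections, exactly as described in the text.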
2.2. Disease Treatment by Key Personnel
The finite resources needed (by key personnel) to treat infected patients can be reusable, such as Intensive Care Unit (ICU) beds, or one-use-only resources, such as medications. When reusable resources are freed up they are returned to the pool of available resources. For example, once an ICU bed is assigned to a patient it will be "in use" until that patient either dies or recovers. This means both Key Personnel and the renewable resources have feedback loops of becoming available after a certain period of being "assigned." (See Figure 3.) This response model incorporates two types of resource-based treatments to reflect initial medical practice and a subsequent, more effective Treatment 2 (after more is known about the disease). If there are enough "Treatment 2" resources available, it
will be administered to all who enter the treatment phase of the model. Otherwise, "Treatment 2" will be administered to as many people entering the treatment phase of the model as possible, based on the current supply of resources and the "Key Personnel Available to Administer Treatment 2" (i.e., those who are alive, not symptomatic, and not currently busy administering treatment). Subject to available resources, "Treatment 1" will be administered to as many people as possible entering the treatment phase who do not receive "Treatment 2." (Key Personnel will always receive the best treatment that has been introduced to date in the City.)
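The allocation rule just described (Treatment 2 first, limited by its supply and by the key personnel free to administer it; Treatment 1 to the remainder, limited by its own supply) can be sketched as follows; the patients-per-person capacity is our assumption:

```python
# Allocate treatments to patients entering the treatment phase.
# Treatment 2 (more effective) goes first, limited by both its resource supply
# and the key personnel free to administer it; Treatment 1 covers the remainder
# subject to its own supply. The capacity per staff member is illustrative.

def allocate(entrants, t2_supply, t1_supply, free_personnel,
             patients_per_person=5):
    personnel_capacity = free_personnel * patients_per_person
    treated_2 = min(entrants, t2_supply, personnel_capacity)
    remaining = entrants - treated_2
    treated_1 = min(remaining, t1_supply)
    untreated = remaining - treated_1
    return treated_2, treated_1, untreated

print(allocate(entrants=120, t2_supply=50, t1_supply=60, free_personnel=8))
# -> (40, 60, 20): the personnel cap (8 * 5 = 40) binds Treatment 2 before supply
```

In the full model the freed personnel and reusable resources return to these pools via the feedback loops of Figure 3.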
Figure 2: User Interface for Disease Lethality and Effectiveness of Treatments
Figure 3: Feedback Loop of Reusable Resources

2.3. Surge Response: Timing and Amount
It is not feasible for cities to maintain a supply of the best possible treatment for each known disease in sufficient quantity to treat all members of their population. As a result, to deal with a major disease outbreak the nation needs the capability to rapidly supply (i.e., to surge in) additional key personnel and treatment resources to where
they are needed. The elements of surge capacity are evident in its definition: the "ability to manage a sudden, unexpected increase in patient volume that would otherwise severely challenge or exceed the current capacity of the health-care system." Clearly, surge must deal with major unknowns, including a high volume of patients, some way to identify the bioagents used, treatment for the bioagents, and the attack locations. The model allows the user to surge key personnel and treatment resources (both renewable and non-renewable) to each city in whatever quantity the user wants at whatever time the user wants. The user is also able to set the starting times for Treatment 1 and Treatment 2 for each city individually in order to represent those treatments arriving in the cities at different times. Thus the user can experiment with scenarios that vary the timing and the amount of the surge response.
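One way to represent such a surge response is as a schedule of timed deliveries per city; the structure and quantities below are illustrative assumptions, not values from the model:

```python
# Surge schedule: timed deliveries of personnel and treatment resources per
# city. Each entry is (day, city, resource, amount); values are illustrative.
surge_schedule = [
    (3, "city_1", "key_personnel", 40),
    (3, "city_1", "treatment_1", 5_000),
    (7, "city_1", "treatment_2", 2_000),
    (10, "city_2", "treatment_2", 1_500),
]

def apply_surges(day, city_resources, schedule):
    """Add any deliveries scheduled for `day` to the cities' resource pools."""
    for when, city, resource, amount in schedule:
        if when == day:
            city_resources[city][resource] = (
                city_resources[city].get(resource, 0) + amount
            )
    return city_resources

resources = {"city_1": {"key_personnel": 25}, "city_2": {}}
for day in range(1, 11):
    resources = apply_surges(day, resources, surge_schedule)
print(resources["city_1"]["key_personnel"])  # 25 + 40 = 65
```

Varying the `day` and `amount` entries is exactly the kind of timing-versus-quantity experiment the section describes.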
2.4. Multiple "Cities" and Population Movement
The model provides for two-way movement between cities, which is specified as the percentage of the population in city Ci that moves to city Ck each day (as i and k range from 1 to 4). The four cities with population movement allow users to investigate disease spread as well as different pre-positioning and surge strategies for different cities. With the four cities and population movement, with disease characteristics, and with treatments involving key personnel and renewable (e.g., ICU) and non-renewable (e.g., medicine) resources, some pre-positioned and some surged at user-specified times, our Extended Bio-Event model has 118 different, user-settable parameters.
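The two-way movement specification can be represented as a 4 x 4 matrix of daily movement fractions; the rates below are illustrative:

```python
# Daily population movement between cities: move[i][k] is the fraction of
# city i's population that travels to city k each day (values illustrative).
move = [
    [0.000, 0.010, 0.005, 0.000],
    [0.010, 0.000, 0.000, 0.020],
    [0.005, 0.000, 0.000, 0.010],
    [0.000, 0.020, 0.010, 0.000],
]

def apply_movement(populations, move):
    outflow = [p * sum(row) for p, row in zip(populations, move)]
    inflow = [sum(populations[i] * move[i][k] for i in range(len(move)))
              for k in range(len(move))]
    return [p - o + f for p, o, f in zip(populations, outflow, inflow)]

pops = [1_000_000, 500_000, 250_000, 750_000]
new_pops = apply_movement(pops, move)
print([round(p) for p in new_pops])  # movement conserves the total population
```

In the full model the movement applies compartment by compartment, so infectious travelers seed the epidemic in the cities they visit.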
2.5. Results
The model provides a variety of ways to try to limit the spread and impact of a disease. These ways include to:
• Use initial and improved medical treatments (with specified effectiveness);
• Confine symptomatic people (by isolating a percentage of the population immediately after they become symptomatic);
• Modify the contact rate and/or the disease infectivity at a specific time (e.g., reflecting a policy that advocates staying at home and using surgical masks);
• Modify the odds of confining symptomatic people at a specific time (to reflect improved recognition of those infected); and
• Surge extra supplies of key personnel and/or the various types of treatment resources into the affected areas.
Sensitivity analysis of the model's results can help identify "leverage points" where earlier and/or more effective action has a major impact on, say, lives saved. We built an Excel interface to iThink in order to run sensitivity analyses and to produce three-dimensional graphs of the results (e.g., comparing the number of fatalities as parameters X and Y each vary over 20 distinct values). For example, the surface in Figure 4 shows how the number of deaths can vary with respect to changes in two key parameters. One axis varies the likelihood (from 5% to 100%) of confining a symptomatic person. The other axis varies when the contact rate is modified (from day 1 to day 20). These results reveal that the early reduction of the contact rate (e.g., through voluntary measures and surgical masks) is more important than even near perfection in confining those who are infectious.
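The two-parameter sweep behind such a surface can be sketched generically; the toy response function here merely stands in for a full run of the iThink model, and its shape is our assumption:

```python
# Two-parameter sensitivity sweep, producing the grid behind a surface plot.
# `deaths_model` is a stand-in for one run of the full bio-event model.

def deaths_model(confine_odds, contact_mod_day, base_deaths=2500.0):
    # Toy response surface: earlier contact-rate reduction and better
    # confinement both reduce deaths (the shape is illustrative only).
    return base_deaths * (0.2 + 0.8 * contact_mod_day / 20) * (1 - 0.5 * confine_odds)

grid = [
    [deaths_model(odds / 100, day) for day in range(1, 21)]
    for odds in range(5, 101, 5)          # confinement odds: 5% .. 100%
]
# Each row fixes the confinement odds; each column fixes the day the contact
# rate is modified: a 20 x 20 grid ready for a three-dimensional surface plot.
best = min(min(row) for row in grid)
worst = max(max(row) for row in grid)
print(f"deaths range across the grid: {best:.0f} .. {worst:.0f}")
```

Replacing `deaths_model` with a call into the simulation yields exactly the 20-by-20 fatality grids plotted in Figures 4 and 5.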
Figure 4: Reducing the Contact Rate vs. Confining the Symptomatic
The results represented in Figure 5 provide a striking comparison of timing versus the effectiveness of treatment in responding to a major bio-event. The scenario depicted here is for Plague with a severe Lethality (i.e., Odds of Dying without Treatment if Infected) of 80%. Treatment 2 is significantly more effective than Treatment 1. (In this case the Odds of Dying when Treatment 1 is applied are still 50%, whereas the Odds of Dying when Treatment 2 is applied are much lower at 20%.) The surface reflects deaths as the timing of the availability of (an adequate amount of) the two treatments varies from 1 to 20 days. Clearly timing is of the essence. In this scenario early use (e.g., Day 2) of the less effective Treatment 1 can save about the same number of lives as providing the much more effective Treatment 2 at the end of only the first week (viz., Day 7).
Figure 5: Treatment effectiveness and when available (deaths vs. the times when Treatment 1 and Treatment 2 are surged to the area)
3.0 Policy Inferences
Systems engineering, as a discipline and approach to large complex problems, grew out of the perceived threat that the former Soviet Union posed to the US. The perceived threat today is no longer bombers and missiles but terrorists attacking with biological, chemical, radiological, or explosive means. And as it did previously, the US will have to make major investments in planning, equipment, and personnel to address the bioterrorism threat: system investments that are a natural consequence of agreeing upon an overriding national defense requirement to act effectively to prevent and respond to an attack. And if such a requirement is in place, surge-capacity planning necessarily follows. As a "system" the nation's biomedical surge capacity could be viewed as a specialized supply chain. And instead of pre-positioning extensive resources (using forecasting), modern supply chains are generally organized for rapid replenishment. How long will it take to replenish resources as they are used in response to a bio-event? The replenishment supply-chain strategy ideally holds enough medical resources in a city to handle the initial demand surge until the resources can be reliably replenished. So the greater the system's ability to resupply, the less the need for pre-positioned resources. Adopting a replenishment supply-chain approach to medical surge capacity would generate at least two major requirements. First, because they are now part of a supply chain, the disaster-response tiers (individual facility, local coalition, jurisdiction of an emergency operations center, region, state, and federal) would have resupply obligations. Second, rapid resupply implies a major investment in a national
capability (including transport), not unlike the investment the US made in the 1950s (and beyond) to have an early-warning capability that addressed a perceived threat.
4.0 Conclusions
Jay W. Forrester, the developer of system dynamics, said recently (2001) that "the most important use of system dynamics should be for the design of policies." We have used sensitivity analyses of our system dynamics bio-event model to explore the consequences of different response policies and strategies. To further leverage our work, we transferred this bio-event model to a systems engineering student team from George Mason University (GMU). As part of a year-long project the GMU team added resource management components to our bio-model and investigated resupply strategies as part of the surge response.
References
Amin, Y. et al., 2006, Bio-Event Resource Management System (BRMS), IEEE Systems and Information Engineering Design Symposium.
Barrett, C.L. et al., 2005, If smallpox strikes Portland..., Scientific American, 292 (3), 42-49.
Hethcote, H., 2000, The Mathematics of Infectious Diseases, SIAM Review, 42 (4), 599-653.
Kaplan, E. et al., 2002, Emergency response to a smallpox attack: The case for mass vaccination, PNAS, 99 (16), 10935-10940.
Manley, W., Homer, J. et al., 2005, A Dynamic Model to Support Surge Capacity Planning in a Rural Hospital, International System Dynamics Conference.
Sterman, J., 2000, Business Dynamics: Systems Thinking and Modeling for a Complex World, McGraw-Hill.
Chapter 22
Principles of self-organization and sustainable development of the world economy are the basis of global security

Dmitry CHISTILIN
Institute World Economy and International Relations Ukraine Academy of Science 5.Leontovich str., Kiev, 01030 Ukraine E-mail: [email protected]
The phenomenon of state changes of the world economy during the last 200 years shows that there is a certain 70-year regularity in its development, which is expressed in increased structural complexity of the global economic system every 70 years. The development happens after certain periods of bifurcation (up to 50 years) accompanied by lower rates of economic development, and periods of adaptation (up to 20 years) with higher rates. The theoretical justification of this process shows that the increased structural complexity of the global economic system is the external manifestation of the self-organization process in a large complex system we call the "world economy". This process of development is based on two fundamental laws of nature: the principle of minimum dissipation of energy (or resources), and the law of conservation of accumulated energy (economic efficiency); and is realized via two types of development mechanisms: bifurcation and adaptation. Formation of the world-security system should rest on applying the natural laws of development, and lead towards the creation of a complex, two-level (regional and global) structure with the institution of geopolitical pluralism, based on implementing the "principle of minimum dissipation". This will contribute to the development of the global system on a conflict-free basis.
1 Introduction
The objective of the work is to reveal and define phenomena which are characteristic of the behavior of complex systems in the process of their development, on the basis of factual material of the world economy's development during the period 1825-2000. Some elements of the work were presented at scientific forums taking place in the USA, Spain, Italy, and Russia. Theoretical issues are explained in the monograph "Self-Organization of the World Economy. Euro-Asian Context". The social system "the world economy" is considered as a complex system consisting of two global subsystems: economic and political. Common agents for both subsystems are national economies, which interact in economic and political spheres and form the connections and structure of the social system of the world economy. We could use here the term "world system" (or global system), which could include both economic and political subsystems. But we believe that the term "social system - world economy" is more suitable because we assume that the political system, being an object of non-economic scientific disciplines, should be included into the economic scientific domain. Such an approach allows us to view the political system as an object of economics through the understanding that a certain type of political system has an influence on the economic result of the world socium's functioning. Thus, hereinafter the term "world economy" will mean a complex, combined social system. We understand the economic system of the world as the aggregate of the subjects of the world economy - national economies - and the economic relations between them that appear in the process of international exchange of resources on the basis of the international division of labour.
The economic system "the world economy" can be structurally expressed by a system of international economic relations: geographical and product structure of international trade; geographical and social structure of international migration of labour resources; and international flow of capital. Functionality of international economic relations is achieved on the basis of the created international monetary system. The political system of the world economy is an aggregate of direct and indirect (i.e., through accepted collective decisions in super-national institutes) diplomatic relations between countries participating in the international division of labour, providing legitimacy of functionality of international economic relations. Functionality of the social system "the world economy" means creation of economic and political relations in the process of international exchange of resources on the basis of the international division of labour, with the aim of the most effective distribution of these resources for production in circumstances of scarceness of resources, and distribution of manufactured products in circumstances of unlimited growth of consumption. The main functional purpose of the social system "the world economy" is implementation of self-regulation of the agents of the economic system through the political system (because national economies are the common agents for both economic and political systems), which results in a state of maximum dynamic equilibrium of the economic system "the world economy" - the state of economic growth of the world economy. Today it is known that the main problems of the world economy are: threat of ecological cataclysms, poverty and destitution of two thirds of the world population, mass hunger, illiteracy, etc., in circumstances of continuous growth of the population of the planet and limited world resources. Therefore, availability of a mechanism allowing effective re-distribution of valuable
resources at the scale of the developing global economy is a vital necessity for solving global problems. Development of a social system means a process of increasing its stability under the influence of the outside environment (keeping stability in the given limits - homeostasis) by means of accumulation of structural information, changing the quantity of organization (effectiveness) of the system and making its structure more complicated. Increasing stability is expressed in accumulation of economic effectiveness and formation of a more complicated structure of society. Development of the world economy means a co-evolution of economic and political subsystems resulting in further gradual complication of the social structure, which strengthens the system's stability under the influence of the environment, pressure of population growth and limited resources. Development means a change of equilibrium states with different macroeconomic characteristics. Each state is expressed in structural and quantitative characteristics. For the world economy we will consider the international monetary system (IMS) as a structural characteristic. We will consider the growth rate of the gross national product of countries participating in international economic relations as a quantitative characteristic. Our task is to formulate a verbal model of development and self-organization of the world economy as a social system and, using basic concepts of the theory of social development [1], to determine and formulate systemic regularities which are the basis for the world economy's development and, consequently, for the new structure and organization of the world economy. As a result, on the basis of the concluded regularities, it will be possible to estimate a forecasting variant of the state of the world economy in the nearest future.
2 Idea and description of the model of the world economy development
The idea of the model is based on the simple assumption "...that the phenomenon of development as a whole can be considered as a struggle between two opposite tendencies: organization and disorganization. The process of development, beginning with maximum disorganization, can be described as a process of accumulation of structural information, which is calculated as the difference between the real and maximum values of entropy. Therefore, it is advisable to consider the phenomena of development in coordination with the concepts of entropy (information) and with the possibility of measuring the level of organization (or disorganization) of the system at each stage of its development..." [2]. The model of self-organization of the world economy system is represented as a vertical spiral converging around an axis that reflects economic effectiveness as a quantitative expression of development. The spiral itself is, in essence, the trajectory of development of the world economy through time. The loops of the spiral reflect repeated cycles of development, each at a qualitatively higher organizational level. The projection of the loops onto the plane, and their radius of remoteness from the axis of effectiveness, show the quantity of organization in the world economy system for every observed state. The gradual decrease of the radius from loop to loop shows the process of decreasing disorganization (entropy) and increasing organization of the economic system. All this demonstrates the essence and dynamics of the process of development.
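The quoted definition of structural information, the difference between the maximum and the real value of entropy, can be written out assuming Shannon's entropy for a system whose state has probabilities p_1, ..., p_n (the notation is mine, not taken from [2]):

```latex
% Structural information I accumulated by a system whose state has
% probability distribution p_1, ..., p_n (Shannon entropy assumed):
I \;=\; H_{\max} - H,
\qquad
H \;=\; -\sum_{i=1}^{n} p_i \log p_i,
\qquad
H_{\max} \;=\; \log n .
% I = 0 at maximum disorganization (uniform p_i); I grows as the real
% entropy H decreases, i.e. as the system becomes more organized.
```

On this reading, each loop of the development spiral corresponds to a larger accumulated value of I and hence a smaller remaining entropy H.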
3 Structural and quantitative characteristics of the states of the world system
Development of the world economy as a big complex system represents the change of its states. Each state has structural and quantitative characteristics. Thus, for an adequate description and simulation of the process of development of the world economy system, we need to distinguish its states in the process of development and to define structural and quantitative characteristics for each distinguished state. Most researchers of world economic development consider relations between the countries of Western Europe, the USA, Canada, and Japan as the core of the development of international economic relations for the period 1825-2000. Over this interval, international economic relations are based on international trade, migration of population, and flows of capital between Western Europe, the USA, Canada, and Japan. The skeleton and foundation of all the above-mentioned elements of international economic relations is the international monetary system, which provides for the implementation of all such relations.
Therefore, we consider the structure of the international monetary system as the structural characteristic of the organization of international economic relations and, consequently, of the world economy system.
The international monetary system is classified into three types: the gold standard, the Bretton Woods system, and the Jamaica system, which define three states of the world economy system. The period of time in which the principal structural elements of an international monetary system are created and preserved is the period of existence of one state of the world economy system in the process of its development. One state corresponds to one cycle of development. One cycle of development of the world economy system includes two distinct periods, defined by the type of development mechanism and, for the description of the model, by the quantitative characteristic as well: a period characterized by a bifurcation mechanism of development, and a period characterized by an adaptation mechanism of development. Accordingly, each type of international monetary system passes through two stages of its development and functioning. The system is formed at the stage of the bifurcation mechanism; at the stage of the adaptation mechanism the system is actively functioning. Different rates of world economic growth are connected with these periods: in the period of the bifurcation mechanism one observes a decrease in the rate of economic growth of the world economy; in the period of the adaptation mechanism one observes a spasmodic increase, as a synergetic effect is realized.
To simplify the calculation of the quantitative parameters introduced into the model, we also take this prerequisite into consideration. Thus, we calculate the quantitative characteristic of the states of the world economy system for the interval 1875-2000 as the average growth rate of the gross national product of the above-mentioned countries for each observed period of time (in %). For the period up to 1875 the growth rate of the world economy is assumed to be 1.5%.
4 Development and self-organization of the world economy system
According to the stated criteria, the analyzed historical period of development of the world economy system runs from 1825 to 2035. On the basis of the temporal intervals of the indicated systems of international monetary relations and the quantitative characteristics, we establish the temporal boundaries of the existence of three states of the world economy system in the process of its development. Thus, we define the temporal intervals of three cycles and six periods of development of the world economy system. They have the following structure. The first state. The first cycle of development of the world economy system is the period of functioning of the gold standard system: 1825-1875-1895. The first period of the first cycle of development: span of time 1825-1875; transformation period; formation of the gold standard system; action of the bifurcation mechanism of development; growth rate of the world economy 1.5%. The second period of the first cycle of development: span of time 1875-1895; active functioning of the gold standard system; action of the adaptation mechanism of development; growth rate of the world economy 2.6%. In the sphere of policy: the countries participating in international economic relations reached mutual collective agreement on a bilateral basis to exchange their currencies for gold; the principle of achieving social, collective agreement in international economic relations in the sphere of monetary relations was implemented. The world political system was thus being created mainly on the basis of direct diplomatic relations and in the form of military unions, coalitions, etc. The second state. The second cycle of development of the "world economy" system is the period of functioning of the Bretton Woods system: 1895-1945-1965.
The third period of the second cycle of development: span of time 1895-1945; transformation period of formation of the Bretton Woods system; action of the bifurcation mechanism of development; growth rate of the world economy 1.8%.
The main result of the third period of the second cycle of world economic development was a supranational institution, including systems regulating international financial and trade relations, that realized the principle of social, collective agreement of the participating countries in the formation and implementation of international economic relations. Such relations assumed mutual regulation of currency rates, crediting of the balances of payments of participating countries, decrease of tariff limits in international trade, etc. The fourth period of the second cycle of development: span of time 1945-
1965. In the sphere of economy: during this period, the growth rate of the world economy was high and increased spasmodically up to 5-6%, compared with the analogous indicators of the previous period. The mechanism of development was of the adaptation type. By the end of the 1960s and beginning of the 1970s (1972-1973) the system of international economic relations had reached its maximum effectiveness for the current state and started transforming into the next stage. The change of principles of monetary relations defined by the Jamaica Conference became the logical conclusion of the fourth stage of development of the world economy system. The formation of conditions for international monetary relations had not been completed; the Jamaica Conference merely initiated such changes.
In the sphere of policy: regional and global supranational institutions started their formation; they received further development in the fifth period of the third cycle of development. As a whole, the bipolar political system and ideological opposition kept global political and economic stability. The world political system received an instrument for solving conflict situations by mutual agreement, the UN, which included not only institutions of regulation of the world economy but also the Security Council, an institution of regulation of military and conflict situations. The third state. The third cycle of development of the world economy system is the period of the Jamaica system: 1965-2015-2035. The fifth period of the third cycle of development is the period of transformation of international monetary relations and the formation of its new structure; span of time 1965-2015; action of the bifurcation mechanism of development; growth rate of the world economy system 3.4%. This period defined two phenomena of global development. One of them was the phenomenon of regionalization: by the 1990s, 23 international organizations of the integrated type with regional location had been created (EU, NAFTA, MERCOSUR, ASEAN, etc.). They possessed more than 60% of the world gross domestic product and the major share of the volume of international trade, about 8 trillion dollars. Besides, 85 regional trading and economic agreements were concluded in the 1990s. The second phenomenon was globalization. Thousands of transnational corporations in the world possessed up to 50% of world production and up to 63% of foreign trade. Such capital and resource flows were not supervised by states, and it should be noticed that we are speaking about effectively distributed resources. This allows us to state that the national state, as a form of organization of people's activity connected with the distribution of resources and benefits within a limited territory and a limited quantity of people, became less effective compared with the regional variant of creating groups of countries and global resource flows.
The political situation in the world dictates two ways out. The first is the creation of a new political order of the world on a multipolar and two-level base, regional and global (due to the new leaders with significant economic and military potential who have appeared on the world scene: the EU, China, India, and Japan). The second way out: the structure of the political system of the world should foresee the implementation of principles of group agreement between developed, developing, and underdeveloped countries. The sixth period of the third cycle of development is a forecast: the period of active functioning of the new system of international monetary relations; span of time 2015-2035 (forecast); action of the adaptation mechanism of development; growth rate of the world economy 8-9% (forecast). According to the stated structural and quantitative characteristics of the organization of the world economy system, we distinguish three states. Each state corresponds to one of three types of system of international monetary relations and to one of three cycles of development, which correspond to six periods of development of the world economy system, five of which are real and the sixth theoretical, or forecast. On the basis of the distinguished states of the system we build a model of self-organization and development of the world economy (Figure 1).
Fig. 1. The model of development and self-organization of the world economy for the interval of time 1825-2035.
Conclusions:
1. Each successive distinguished state of the world economy has a more complicated organization of its political system and international monetary system. This demonstrates the tendency toward complication of the structure of the world community.
2. Each successive state is more effective from the economic point of view and has a higher rate of economic growth. This allows the world economy to develop steadily under conditions of a grown planetary population and limited resources. A tendency of growth of economic effectiveness of the whole system is observed over the long-lasting interval of time.
3. The process of formation of successive structures of both political and economic organization of the world economy took place under conditions of a strongly non-equilibrium environment, in the form of numerous military and civil conflicts and economic crises.
4. All the above allows us to conclude that the system of the world economy possesses the characteristic feature of complex systems: self-organization. The Onsager-Prigogine principle of minimum energy dissipation is realized in the process of development: each successive organization of the world economy produces less entropy than the previous one. The category of energy in a physical system corresponds to the category of resources in a social system. Thus, a principle of minimum dissipation of limited resources acts in social systems. The model reflects the realization of this phenomenon in the process of development and self-organization of the world economy. The direction of development of the world economy system is defined by the law of conservation of accumulated effectiveness, and this allows us to say that the model has predictive potential for prognosis of the future organization of the world economy.
5. Global socium security assumes the self-preservation of the socium under the impact of various destructive processes: the increasing risk of ecological disaster, mass dissemination of epidemic diseases that people are not able
to cure yet, a growing death rate due to ubiquitous poverty and famine, and an increasing deficit of finite energy resources because of their non-reproducibility (the world oil and gas reserves will last for 30-35 years of the civilization's existence). According to experts, global warming (the greenhouse effect) could occur within the nearest 50 years. Besides, the threats of nuclear catastrophe and war conflicts still exist in the world. The possible ways of solving the above problems are complicated by the multi-religious and multi-cultural environment, the peculiarities of the mentality of populations throughout the world, and differences in natural resource availability in various regions of our planet. Thus, provision of worldwide security means the necessity to create institutions of the global socium that would support the socium's stability in the above circumstances. Worldwide security is undoubtedly impossible without the realization of the joint interests of diverse geographical, cultural, ethnic, and other groups of the population of the whole world. These joint interests could be realized by establishing institutions of geopolitical pluralism on the basis of the principle of public harmony at the regional and global levels. The institution of public harmony is a mechanism for optimizing the solution of existing problems on the platform of the functioning of natural laws: minimal dissipation of energy (resources) and the law of conservation, both of which form the trajectory of the civilization's development. As is evident from the above, such laws underlie the big and complicated system of the world economy. Worldwide security means the establishment of institutions that can embody and support these laws of Nature. Such institutions will provide stable and conflict-free development of the global socium for the long term.
Within the system, an effective mechanism should be created for restricting the activity of those countries or groups of countries whose military potential incites them toward actions that do not take into consideration the interests of individual nations or of the whole planet. The system should rest on the creation of an All-Planetary Constitution, the Fundamental Law, which all existing forms of the organization of people shall observe. Within the Fundamental Law, a wide spectrum of problems of preservation of the global socium, from local conflicts to preservation of the ecosystem and of life on the Planet as a whole, can and must be solved.
References
1. D. Chistilin. Self-Organization of the World Economy: Euro-Asian Context. Moscow: Economika, 2004.
2. R. Abdeev. The Philosophy of Information Civilization. Moscow: Vlados, 1993.
Chapter 23
Evolutionary Paths to Corrupt Societies of Artificial Agents
Walid Nasrallah
American University of Beirut, Lebanon
walid@alum.mit.edu
Virtual corrupt societies can be defined as groups of interacting computer-generated agents who predominantly choose behavior that gives short-term personal gain at the expense of a higher aggregate cost to others. This paper focuses on corrupt societies that, unlike published models in which cooperation must evolve in order for the society to continue to survive, do not naturally die out as the corrupt class siphons off the resources. For example, a very computationally simple strategy of avoiding confrontation can allow a majority of "unethical" individuals to survive off the efforts of an "ethical" but productive minority. Analogies are drawn to actual human societies in which similar conditions gave rise to behavior traditionally defined as economic or political corruption.
1 Introduction
"The evolution of cooperation," or of "altruism," is an idea whose popularity has turned it into a bit of a cliché in the world of artificial life modeling. Briefly, a person, animal, or computational agent needs to take resources away from the fulfilment of its own needs in order to contribute to the well-being of others in its social unit. Doing so does not seem consistent with the imperative of natural selection, which rewards efficiency in using resources towards the fulfilment of a creature's own needs. But whenever the presence of others in the environment is part of the basic needs of a creature, purely self-serving behavior over a course of generations runs the risk of leaving the creature alone, and hence unable
to continue to propagate. Therefore, a creature that somehow acquires the habit of helping others would over the long term enjoy an evolutionary advantage over a similar one that did not. Since the same argument can be made for any member of a population, it stands to reason, and has been corroborated by simulations and biological observations, that cooperative behavior should be widespread in any population where social interaction is valuable. Of course, freeloaders and parasites can always come along, but they need to be eliminated, avoided, or at least kept in check in order to ward off extinction of the population. This paper examines the incidence of what I call "corrupt societies", which are characterized by a prevalence of non-cooperative behavior in a social system that nonetheless continues to survive. These societies are studied from the abstract vantage point of iterated prisoners' dilemma interactions between computational agents. In particular, I look at five different ways in which a corrupt society can evolve: two of them from past literature, two observed by myself in the course of my research, and one that I expect to be discovered from future simulations. The latter would follow logically from the previous scenarios, but the exact details will, at least initially, depend on interactions best revealed by simulation. I conclude with some speculation about how the incidence of corruption (by the familiar economic/political definition) in contemporary human societies seems to correlate with factors analogous to the evolutionary paths in the simulation model.
2 Background
2.1 Iterated Prisoners' Dilemma (IPD)
Studied under "reciprocal altruism" in sociobiology and "eusociality" in entomology, the idea of the "evolution of cooperation" found its purest mathematical expression in the "Iterated Prisoners' Dilemma" problem. Characterized by the lack of a "core", the original Prisoners' Dilemma of game theory provides each participant or player with two choices: to "cooperate" with the other participant or to "renege" on the implicit promise to cooperate. The payoffs are arranged to provide a temptation t for an individual to renege when the other cooperates, a punishment p when both renege, and a reward r when both cooperate. The least favorable outcome is s (for sucker) when a cooperator faces a reneging player; this makes it more advantageous for that sole cooperator to shift to reneging and receive p instead. A Nash equilibrium exists when both players defect, because a player that changes from defect to cooperate without any guarantee from the other player must lose, by going from p to s. In addition to being an instinctively undesired state of affairs, the (p, p) outcome is also not stable under the effect of a coalition between the two players, who would move to the (r, r) state if both cooperate. But then the coalition can be undermined by either player yielding to the temptation to obtain t instead of r. Hence no outcome is stable under free coalition formation and destruction. In other words, the game has no core.
             Cooperate   Defect
Cooperate    r, r        s, t
Defect       t, s        p, p

Figure 1: Payoffs for the classic Prisoners' Dilemma. t > r > p > s and 2r > (t + s). (The second entry in each cell is the column-chooser's payoff.)

This changes when the game is played multiple times. Under multiple iterations, players can decide to punish a defector and reward a cooperator. In the long run, strategies based on cooperation can be shown to be more stable under an evolutionary regime where payoffs allow a player to continue to exist and possibly give rise to copies of itself. Within the general state where cooperation is favored, many different strategies can be dreamt up to maximize cumulative rewards. Different ways to remember past defections and to react to them can be pitted against different ways to sneak in an occasional defection among a string of cooperations. The defections give a temporary boost in reward, while the cooperations serve the dual purpose of keeping the opponent alive and of lulling against retaliation. The most famous outcome of pitting multiple strategies dreamt up by human researchers against each other is reported by Axelrod [2, 3]. It was found that the injunction to "keep it simple, stupid" worked best: a strategy that simply cooperated the first time and then responded to every opponent by repeating the last action of the opponent was the winner against many other more sophisticated strategies. This strategy was descriptively and enduringly named "tit-for-tat" (summarized TFT). Even strategies designed to fool tit-for-tat, or at least to overcome its one defect of an indefinitely repeated cycle of retaliation and atonement following a misunderstanding, were not able to survive contact with a multitude of other possible strategies as well as tit-for-tat itself did [3].
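The tournament setting can be made concrete with a minimal IPD sketch. The payoff values t=5, r=3, p=1, s=0 are the classic illustrative choices satisfying the inequalities of Figure 1; they are assumptions for this sketch, not values taken from the paper.

```python
# Minimal iterated prisoners' dilemma (IPD) sketch with illustrative
# payoffs satisfying t > r > p > s and 2r > t + s.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {('C', 'C'): (R, R), ('C', 'D'): (S, T),
          ('D', 'C'): (T, S), ('D', 'D'): (P, P)}

def tit_for_tat(my_hist, opp_hist):
    """Cooperate first, then repeat the opponent's last move."""
    return 'C' if not opp_hist else opp_hist[-1]

def always_defect(my_hist, opp_hist):
    return 'D'

def play(strat_a, strat_b, rounds=10):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

# TFT loses only the first round to a habitual defector, then punishes:
print(play(tit_for_tat, always_defect))   # -> (9, 14)
```

Over many pairings this is what makes habitual defection an evolutionary losing proposition: the defector's one-round temptation payoff is swamped by the stream of punishment payoffs that follows.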
2.2 Corruption in IPD and real-life analogues
"Tit-for-tat" is a strong guardian against corruption in a society. The strategy survives very well and continues to immediately detect and respond to the slightest attempt to swindle others by pretending to cooperate for a while. This is done without any large investment in memory, since neither free-loaders, who try to sneak in occasional defections, nor other retaliators attain greater success even when they have a memory of many past interactions. The simplicity of the winner of Axelrod's competitions inspired [8] to scour the complete strategy space consisting of one-encounter-deep memory of both the agent's own move and that of the opponent. It was found that another simple strategy can be even more successful. "Pavlov" was the name given to the strategy after it was found to be more evolutionarily persistent than initially expected by its initiators, who had originally named it "Simpleton". "Pavlov" (summarized as
Pav) does not cooperate when it sees a cooperator, but instead repeats its own last action. Similarly, when it sees a defection, it does not defect, but rather switches its past action to the opposite move. Under random evolution, the ability of Pavlov to exploit a partner that always cooperates is of more value than tit-for-tat's ability to persistently retaliate against a partner that always defects. Applied to human interactions, TFT reminds one of the reluctant gunfighter of movie westerns, or of the citizen-soldier creed of America's George Washington, Rome's Lucius Quinctius Cincinnatus, and Homer's Ulysses. Pavlov is an amoral opportunist, more like Homer's Agamemnon or any of a number of more recent politicians. In any case, the presence of either or both TFT and Pavlov provides the immediate means to reduce the initial success of habitual defectors. In the long term, the presence of someone who can retaliate and survive is necessary to prevent the extinction of the whole society, which would otherwise be the less direct consequence of the initial success of defectors within a society.
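The Pavlov rule described above (win-stay, lose-shift) can be sketched in a few lines. The forced-error round and the eight-round horizon are illustrative assumptions for the demonstration, not details from [8].

```python
def pavlov(my_hist, opp_hist):
    """Win-stay, lose-shift: cooperate first; afterwards repeat your own
    last move if the opponent cooperated, flip it if the opponent defected."""
    if not my_hist:
        return 'C'
    if opp_hist[-1] == 'C':
        return my_hist[-1]                       # "win": keep the same action
    return 'D' if my_hist[-1] == 'C' else 'C'    # "lose": switch action

# Against an unconditional cooperator, a single accidental defection
# (forced here at round 3) locks Pavlov into permanent exploitation,
# where tit-for-tat would have drifted back to mutual cooperation:
my_hist, opp_hist = [], []
for rnd in range(8):
    move = 'D' if rnd == 3 else pavlov(my_hist, opp_hist)
    my_hist.append(move)
    opp_hist.append('C')        # the partner always cooperates
print(''.join(my_hist))         # -> CCCDDDDD
```

The same switching rule also lets Pavlov escape the endless retaliation-atonement cycle that two tit-for-tat players fall into after a misunderstanding.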
3 Previously Studied Paths
What, then, allows a society to be overcome by people or agents that defect more than they cooperate when different types of retaliators are always around? The answers are still a topic of debate, since simulations that produce an example of a corrupt society can be shown to be special cases of a general situation where defection remains an evolutionarily losing strategy.
3.1 Fast Predators
Since agents only learn from their past experience, a habitual defector can thrive if it can seek new victims more quickly than it can be tracked down by those who would retaliate against it [4, 5]. An equilibrium ratio of fast-moving defectors to other players, including TFT, can be calculated from the speed of movement, the density of agents, and the size of the "patch" or immediate neighborhood within which agents can see and interact with each other [4]. This type of model raised worries that cooperation occurs more in the real world of animal, plant, and human interactions than in the abstract simulation. This was explained by factors from outside the original model, such as the fast spread of information between agents about who is a defector, and a pre-existing assumption that a stranger is a defector until otherwise proven [5]. Augmenting the model in that way resolves the immediate issue of disparity between the model and observations, but it raises the possibility of second-order strategies such as false information, camouflage, and other stratagems observed in the animal kingdom. What we have is one path, mostly of metaphorical value, to a society that can indefinitely maintain a large number of defectors. In a human society, this theoretical path to corruption can be compared to mounted nomadic raiders ravaging medieval peasants' fields, or militiamen on pick-up trucks assailing contemporary villagers in a failed nation.
3.2 Watered-down punishment
It is also possible to play with the payoff matrix of the interactions to lower the cumulative cost to society of defection, i.e. to increase s + t vis-à-vis 2r [7]. When the value of t in the payoff matrix is between 1.85 and 2, s and p are 0, and r is 1, then a society can evolve with a majority of defectors. In addition, deterministic rules for changing strategy from "defect" to "cooperate" and vice versa give rise to a chaotic or random shifting pattern in which the ratios are roughly maintained. Although this simulation was done with immobile agents that spread their strategy to immediate neighbors like lichens or trees, the conclusion that parasitic behavior can flourish when its ill effects are limited is a useful analogy in other fields. One analogy in a human society might be low-intensity interactions, such as queue-jumping or tax evasion. The range of real behaviors that can be modeled with such a restricted subset of possible interaction payoffs is small, but the metaphor has its place in characterizations of human social behavior.
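A lattice dynamic of this kind can be sketched in the style of Nowak and May's spatial games; the torus size, the best-neighbor update with self-interaction, the random seed, and the step count are my assumptions for illustration, and the simulation in [7] may differ in detail.

```python
import random

# Spatial PD in the "watered-down punishment" regime: r = 1, s = p = 0,
# and 1.85 < t < 2.  Immobile agents on a torus copy the strategy of
# their best-scoring neighbor each generation.
T, R, P, S = 1.9, 1.0, 0.0, 0.0
N = 20                                  # N x N torus of immobile agents

def payoff(me, other):
    if me == 'C':
        return R if other == 'C' else S
    return T if other == 'C' else P

def neighbors(i, j):
    return [((i + di) % N, (j + dj) % N)
            for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]

def step(grid):
    # Score = games against all 8 neighbors plus oneself; then each cell
    # adopts the strategy of its best-scoring neighbor (or keeps its own).
    score = {c: sum(payoff(grid[c], grid[n]) for n in neighbors(*c))
                + payoff(grid[c], grid[c])
             for c in grid}
    return {c: grid[max(neighbors(*c) + [c], key=score.get)] for c in grid}

random.seed(0)
grid = {(i, j): random.choice('CD') for i in range(N) for j in range(N)}
for _ in range(50):
    grid = step(grid)
frac_d = sum(v == 'D' for v in grid.values()) / N ** 2
print(f"defector share after 50 steps: {frac_d:.2f}")
```

Because the update rule is deterministic once the initial grid is fixed, the shifting patterns of cooperator and defector patches come entirely from the spatial geometry, matching the chaotic but roughly stable ratios described above.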
4 Corruption among equally mobile agents
A series of simulations [1] was conducted to test what happens when mobility is no longer exclusive to defectors. It was found that a cooperator that can avoid defectors is much more successful across a wide range of conditions, including density, payoff matrix differences, and mobility. In a human society, this corresponds to the familiar trait of social opprobrium: a school bully does not get invited to birthday parties; a politician who successfully evades the law loses the next election. The predominance of cooperation is reasserted in a metaphor for human behavior in which all have equal mobility and some level of heterogeneity exists among the population.
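The advantage of avoidance can be illustrated with a toy sketch. Both variants below are pure cooperators; the "avoider" abandons any partner that defects and re-matches at random, while the "stayer" keeps its first partner. The payoff values, partner pool, round budget, and matching rule are my assumptions, not the simulation setup of [1].

```python
import random

# Each POOL entry is that partner's probability of defecting in a round.
R, S = 3, 0                          # reward for mutual cooperation, sucker payoff
POOL = [0.9, 0.9, 0.9, 0.05]

def lifetime_score(avoider, rounds=200, seed=1):
    rng = random.Random(seed)
    partner = POOL[0]                # everyone starts with a habitual defector
    score = 0
    for _ in range(rounds):
        move = 'D' if rng.random() < partner else 'C'
        score += R if move == 'C' else S     # we always cooperate
        if avoider and move == 'D':
            partner = rng.choice(POOL)       # walk away, seek a new partner
    return score

# Avoidance lets a cooperator drift toward cooperative company:
print("avoider:", lifetime_score(avoider=True))
print("stayer: ", lifetime_score(avoider=False))
```

The avoider spends most of its lifetime paired with the rare cooperative partner, while the stayer keeps absorbing the sucker payoff, which is the essential mechanism behind social opprobrium in the mobile-agent model.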
4.1 High-stakes anonymous interactions
The same strategy that helps a society avoid becoming corrupt when mobility is introduced can, in some situations, itself become the source of a new form of corruption [6]. Although [1] showed that cooperation predominates, there are distinct areas of the model's parameter space where the mobile defectors come to outnumber the other mobile strategies. Typically, in the few instances when this happens, the TFT and Pav agents are driven to extinction, and a mobile cooperator that avoids defectors without retaliating becomes the source of productivity in the population while remaining numerically in the minority. This type of corrupt society arises when density is high, mobility is high, and agents interact repeatedly before disengaging and looking for a new partner. This means that meeting a defector can lead to repeated exposure to defection, which has the effect of weeding out the retaliators in the first few hundred simulation cycles. In addition, agents do not recall the identity of their last interaction partner, so a re-encounter with the same defector is met with
Figure 2: Dominant strategies with anonymity [6]

initial cooperation even by retaliators. This may seem a far-fetched metaphor for human interactions. However, the evolution of the internet has given rise to a level of anonymity in many high-stakes interactions, such that this particular abstract path to corruption can potentially teach us something about what to look for and what to avoid in planning certain human social actions. The "Nigerian 419 scams" [10] are one example that springs to mind of a human social phenomenon that became prevalent due to this underlying dynamic.
4.2 Dense, fragmented societies
In Fig. 2 and Fig. 3, the dark bars at the top show the proportion of simulations where a defecting avoider has the highest numbers after 1000 runs, i.e. where the resulting society can be said to be corrupt. The most prominent observation is that the ability to detect when a new encounter is with an agent previously seen leaves a possibility of corruption only when small neighborhood size is combined with high density. This combination has the same effect as §3.1 above: a predator can always find more victims in the high-density population, but victims cannot hide as easily because of the small neighborhood size. (Interactions are only possible within the patch or neighborhood unless the agent moves away.) The interesting thing is that the predators do not need to be more mobile than the other agents in order for this effect to be seen.
5 Kinship Bias: A future path for research
§4.2 described how small neighborhoods and high density, if enforced as physical constraints, lead to a higher preponderance of corrupt societies. One human analogue to this abstract path to corruption is ethnic-based, ghetto-like communities in an urban setting. Today, most ethnic neighborhoods in a multi-ethnic city are self-selected, not enforced. This does not make much of a difference when mapping something as abstract as an IPD simulation with memoryless mobile agents to human social experience: it is simply one more feature not in the model. But one can wonder how the incidence of corruption changes when the agents reduce their pool of immediate interaction candidates in a different way, more analogous to un-integrated ethnic sub-populations.

[Figure 3: Dominant strategies with 1-partner-deep memory [6]. Bar charts of the dominant strategy plotted against density, mobility, repetitions, and neighborhood size; the plotted data is not recoverable from the scan.]

From an evolutionary standpoint, it is widely accepted that costly altruistic behavior is more advantageous when directed at kin. In the context of the IPD, predicting whether a newly encountered agent will be a defector can boost survival, and one way to do this is to look at how the prospective partner shares ancestors with other previously encountered agents, or with the evaluating agent itself. It has been suggested [9] that human societies whose members value kinship are more likely to become weaker than societies whose members treat each other equally. The increased corruption due to the small-neighborhood effect might be one way to explain this idea in abstract dynamic terms. Future research in which kinship values are directly coded into agents' strategies will be needed to show whether other, possibly advantageous, effects of kinship valuation can increase or decrease the incidence of corruption, and hence the susceptibility of the society to invasion by a less corrupt group.
Bibliography

[1] AKTIPIS, C. Athena, "Know when to walk away: contingent movement and the evolution of cooperation", Journal of Theoretical Biology 231, 2 (2004), 249-260.
[2] AXELROD, Robert, "More effective choice in prisoner's dilemma", The Journal of Conflict Resolution 24, 3 (September 1980), 379-403.
[3] AXELROD, Robert, The Evolution of Strategies in the Iterated Prisoner's Dilemma, Morgan Kaufmann, Los Altos, CA (1987), ch. 2.
[4] DUGATKIN, Lee Alan, and David Sloan WILSON, "Rover: a strategy for exploiting cooperators in a patchy environment", American Naturalist 138, 3 (1991), 687-701.
[5] ENQUIST, Magnus, and Olof LEIMAR, "The evolution of cooperation in mobile organisms", Animal Behaviour 45, 4 (1993), 747-757.
[6] NASRALLAH, Walid Fawzi, and Youssef George SAAD, "The role of partnering velocity and spatial mobility in the evolution of cooperative and parasitic behavior", Adaptive Behavior Journal (2006), under review.
[7] NOWAK, Martin, and R. M. MAY, "Evolutionary games and spatial chaos", Nature 359, 6398 (1992), 826-829.
[8] NOWAK, Martin, and Karl SIGMUND, "A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoner's dilemma game", Nature 364, 6432 (1993), 56-58.
[9] PETERS, Ralph, "Spotting the losers: seven signs of non-competitive states", Parameters 28, 1 (1998), 36-47.
[10] ZUCKOFF, Mitchell, "The perfect mark: how a Massachusetts psychotherapist fell for a Nigerian e-mail scam", The New Yorker Magazine (May 15, 2006).
Chapter 24

Path Dependence, Transformation and Convergence: A Mathematical Model of Transition to Market

Roxana Wright, Plymouth State University
Philip V. Fellman, Southern New Hampshire University
Jonathan Vos Post, Computer Futures, Inc.
1.0 Introduction

The economies of Central and Eastern Europe (CEE) have been recognized as posing unique challenges to study. The scope and nature of their political and economic development processes vary widely across the region, are characterized by substantial volatility, and are inherently complex. In this context, a renewed interest in the mechanisms of economic transition has arisen in order to explain the effects of Eastern European integration into the European Union as well as the consequent integration into global economic structures. This approach, which we shall refer to as "Transition Dynamics" (TD), differs significantly from the more traditional macroeconomic and political science focus on purely internal processes of change. In this regard, the CEE transition to market can be described as a process characterized by out-of-equilibrium dynamics and a heterogeneous competitive landscape with high levels of complexity.
1.1 Previous Studies

Previous studies of the CEE transition process have employed a variety of approaches, most of which can be classified as "economic transition" theories ("NET", for Neoclassical Economic Transition theories). A smaller, but perhaps more interesting, group of studies is based upon economic transformation theories (of which transformation dynamics is a particular subset). This approach to the problem of economic change in Central and Eastern Europe is built around a dynamic or systems-evolution approach to economic change and tends to rely upon more quantitative models of economic processes, aiming to explain dynamics rather than stable equilibria. As mentioned above, the NET approach is typically associated with neoliberal and neoclassical economic interpretations of political and economic events. While the transformation approach to Eastern European economic change is generally more concerned with modeling dynamic processes, it too carries with it interpretations stemming from Marxist political economy and regulation theory (Pavlinek, 2002), something not typically seen in evolutionary economics studies of Western European economies.
1.2 Embedded Markets

Additional concepts which have come into general use in the literature of transition are the "plurality of transitions" (Stark, 1992) and the treatment of continuous negotiation between simultaneous processes of privatization, democratization, and globalization. All of this, in turn, takes place in an institutional context of embedded markets. In the present paper, we advance a mathematical model that clearly differentiates amongst these various mechanisms with minimal generalization. The evolutionary nature of markets and the complexity of institution-building are characteristic of the entire CEE region, even for the countries in the CEE that are now part of the European Union. In all of these cases there are ongoing obstacles and complications in the movement towards free markets. These
concerns are particularly noticeable in the areas of re-nationalization and privatization in Hungary, administrative obstacles to entrepreneurship in the Czech Republic, and high unemployment and deficits in Poland. From this complex perspective, the transition process can be characterized in terms of, first, the revolutions and political changes that signaled the end of the centrally-planned party-state system; second, the economic reform; and, third, the impact of reform. The TD framework views reform as a broad mechanism of internal and external change; that is, the processes of marketization and globalization led by the liberalization of foreign trade and the accession to international structures. A more general feature of the TD framework is the use of evolutionary models to describe institutional roles and structures.
1.3 Foreign Direct Investment

A key feature of Central and Eastern European markets and foreign direct investment flows in these countries is the growing integration with the European Union. About 60% of the region's Foreign Direct Investment (FDI) stock has historically been held by MNCs based in countries of the EU (UNCTAD, 2000). The EU has established strict criteria that aspiring members must adhere to in order to join the European Union. The economic environment, as well as the competitive advantages that MNCs seek in Central and Eastern Europe, is likely to undergo considerable change as these countries' integration into the EU advances (Tihanyi and Roath, 2002). The institutions of regional integration include EU-conform regulations, enforcement of property rights, free flow of products and resources, specific anti-inflation measures, promotion of economic growth and local competition, and a prudent fiscal policy. CEE markets occupy widely varying positions with respect to the economic environment of the European Union, but even among those countries aspiring to integrate rapidly, many remain burdened with unprofitable state-run enterprises, poor corporate governance structures, weak or inefficient institutions, and troubled socialist-era welfare programs. EU accession is not yet a panacea for these countries, and institutional idiosyncrasies are likely to persist in the years to come. The governments of the CEE countries play one of the most vital roles in relation to the strategic options that are available to MNCs. Governments and the policies they set have a direct impact, for example, on taxes, interest rates, incorporation laws, ownership rights, repatriation of profits, and antitrust laws. Although the model of transition proposed here is focused on the transition economies of Central and Eastern Europe, it has some broader applicability to other countries going through the same processes in other regions of the world.
Among these countries are the rest of the NIS countries, Mongolia, China, Vietnam, Algeria, Cambodia, the Lao People's Democratic Republic, Nicaragua, the Democratic People's Republic of Korea, and Tanzania.
2.0 A Typology of Transition to Market

The transition from a centrally-planned economy to a market-based economy raises a series of broad, general issues. For example, the literature on transition recognizes the fact that reform policy implementation by the governments of countries in transition is only partially effective in even the most successful transitions. History and geography also shape the inherited structure of the economy, its administrative structure, institutional capacity, and political system (The World Bank, World Development Report, 1996). Finally, the transformation of the CEE economies occurs under the influence of external forces, such as foreign investors and international institutions. In analyzing these factors, we have found it increasingly apparent that the transition process can best be characterized by focusing along three dimensions:

1. Initial conditions and their effects on the transition path, including their overall relative impact (Initial Conditions);
2. Elements internal to the transition process itself which are linked to economic and institutional transformation (Institutional Structure); and
3. External forces influencing transition (The Convergence Factor).
The present study, however, offers a novel theoretical explanation of the mechanisms of transformation and further argues for a normative view of transition that expresses in functional form not only the numerical evidence of transition (e.g. macroeconomic indicators, amount of foreign direct investment, etc.) but also the more qualitative aspects of path dependence and convergence. Table 1 presents a typology which describes the three dimensions of transition along with their respective mechanisms of action.

[Table 1: Typology of the three dimensions of transition and their mechanisms. Legible fragments include: initial level of development and economic distortions; policy-induced distortions; sectoral structure; property and bankruptcy law developments; state ownership; transition indicators; and FDI with EU liberalization and integration. The table layout itself is not recoverable from the scan.]
3.0 Characterization of the Transition Process

A complete treatment of this process can be found in the on-line proceedings of the 6th International Conference on Complex Systems, at: http://necsi.org/events/iccs6/papers/0301ee087eb27e12c777e7f242bc.pdf.
However, in brief, we can characterize the CEE transition process as follows. First, an accurate characterization of the transition process in the CEE region needs to take into account the heterogeneity of starting points and historical events that have marked the beginning and advancement of the various CEE countries' change process from planned economy to capitalism. Initial conditions in the system are of decisive importance when determining the next step. The institutional system has advanced through long evolutionary developments and short revolutionary episodes. Among the CEE states, Slovenia, Croatia, the Czech and Slovak Republics, and Hungary were the countries with more modest structural distortions. Preliminary comparisons between successful reformers (Poland, Hungary and the Czech Republic) and less advanced market economies in the CEE provide support for the view that the differences in restructuring, effective competition and economic performance are related to both policy factors (the development of credit and technical assistance programs) and non-policy factors (path dependence). Contrary to conventional wisdom, economic development does not naturally converge towards a stable solution; it is neither fortuitous nor deterministic. According to the World Bank Report (2000), the transition process in the CEE started with a sharp decline in GDP followed by recovery. The onset of transition was accompanied by severe shocks, tied also to the disruption of institutional and technological links, the supply of inputs and the delivery of outputs. The financial crises of the 1990s, such as those in Mexico, East Asia, and Russia, also contributed to delaying or interrupting the recovery of output. War and civil struggles in Moldova in 1992, and in FYR Macedonia in 1991-94, had a negative impact on infrastructure development and on the reforms needed for successful transition.
The transition recession is now considered to have ended, as all countries in the region have recorded growth subsequent to the year 2000. The recovery has varied greatly across countries: for example, Bulgaria and Romania had about two years of output decline after the initial recovery, and Albania returned to recession after the 1997 financial crisis. Most of the countries in the region were affected to varying degrees by the Russian crisis of 1998. It would appear that the recovery of output benefited from foreign investment as a source of capital and new technology, and also as a signal of confidence in the transition's progress. World Bank research in 2000 found that initial conditions explain more of the differences across countries during the initial period of output decline (1990-94) than over the subsequent years of transition. Initial distortions in the economy, including severe repressed inflation or high black-market exchange rates and the absence of pre-transition policy reforms, are most closely associated with lower performance during the first years of transition. Initial institutions and the presence or absence of market memory (number of years under socialism) were found to be strongly correlated with variations in subsequent performance. However, while the initial conditions had a greater impact on the initial collapse of output than on the subsequent recovery, the impact of institutional policies became stronger as transition progressed.
4.0 A Mathematical Model of Transition to Market

The purpose of transition, the evolution towards market, can be described as an iterative process of sequential improvement (although the dimensions of transition are affected by a series of factors that may not necessarily lead to improvement, but rather to deterioration). A mathematical illustration of the evolution process is presented as follows:

$$X^i_{n+1} = X^i_n + \frac{1}{w+n}\left[q^i_n(X_n) - X^i_n\right] + \frac{1}{w+n}\,\mu^i_n(X_n) + \frac{1}{w+n}\,\frac{X^{WE,i} - X^i_n}{\gamma_n}$$

where:

$X_n = (X^1_n, X^2_n, \ldots, X^N_n)$ is a vector of economic and institutional features ($1$ to $N$) at time $n$ (after $n$ iterations);

$\{q_n\}$ is a sequence of functions mapping features into probabilities of improvement towards market at time $n$;

$b_1$ is the vector of initial conditions, and $w = \sum_i b^i_1$;

improvements towards market follow the dynamics $b^i_{n+1} = b^i_n + \beta^i_n(X_n)$, where the random variable $\beta^i_n(X) = 1$ with probability $q^i_n(X)$ and $0$ with probability $1 - q^i_n(X)$, and $\mu^i_n(X) = \beta^i_n(X) - q^i_n(X)$ is the resulting zero-mean perturbation;

$X^{WE}$ represents the vector of economic and institutional features characterizing western market economies (WE); and

$\gamma_n$ is a factor determining the rate of convergence towards the western features of a market economy; it can be expressed as an exponential function $\gamma_n = e^{c(n-1)}$, where $c$ is a convergence factor.

The mathematical expression reflecting the evolution towards market structures includes a "driver" (the first two terms on the right of the equation above) that takes into consideration path dependence (the role of initial conditions), a perturbation component (the middle term) that considers the internal factors affecting improvements towards market, and, finally, a convergence component (the last term) which is a measure of external influences (mainly FDI and EU accession/agreements), determining the rate of convergence towards Western features of market economies.
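The recursion can be simulated directly. The sketch below is our own illustration under assumed functional forms: the logistic choice of $q_n$, the benchmark $X^{WE} = (1, \ldots, 1)$, and all parameter values are arbitrary assumptions, since the paper does not commit to specific forms.

```python
import numpy as np

# Numerical sketch of the transition recursion. The logistic form of q_n,
# the benchmark X_WE = 1, and all parameter values below are illustrative
# assumptions, not values from the paper.
rng = np.random.default_rng(0)

N = 3
b = np.array([2.0, 1.0, 3.0])   # assumed initial conditions b_1
w = b.sum()                     # w = sum_i b_1^i
X = b / w                       # assumed starting feature levels
X_WE = np.ones(N)               # Western-economy benchmark features
c = 0.05                        # assumed convergence factor

def q(X):
    # Assumed improvement probabilities: better features -> better odds.
    return 1.0 / (1.0 + np.exp(-2.0 * (X - 0.5)))

for n in range(1, 501):
    qn = q(X)
    beta = (rng.random(N) < qn).astype(float)  # Bernoulli improvement draws
    mu = beta - qn                             # zero-mean perturbation
    gamma = np.exp(c * (n - 1))                # convergence-rate factor
    step = 1.0 / (w + n)
    X = X + step * (qn - X) + step * mu + step * (X_WE - X) / gamma

print(X)  # feature levels after 500 iterations
```

The shrinking step size $1/(w+n)$ is what encodes path dependence: early draws, weighted against the initial conditions through $w$, move the state far more than late ones.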
5.0 Conclusion

The model of transition suggested in this paper allows for the overlap of effects while accounting for all aspects of change. The mechanisms at play are obviously different for the various countries. The equation is flexible in allowing one mechanism to take predominance over others, which reflects real developments in the region. For countries that are more advanced towards a market economy and more integrated within the European Union, the mechanisms of change revolve around harmonization with EU institutions and advantageous positioning in Europe in terms of competitive advantages and value-added activities. Many of these former socialist countries have liberalized their capital and labor markets. Other countries, such as the Balkan states and Latvia, continue to maintain a stronger state bureaucracy well after the collapse of central planning. Governments have, however, relinquished their coordinating power in the economy to a great extent. As these countries are also in advanced stages of becoming integrated in the EU structures, policies are guided towards harmonization and macroeconomic internationalization through Western European industrial links. For countries less advanced in transition, strategic coordination from governments still prevails. It may be argued that some of these countries are still bounded by initial distortions in economic coordination and level of development, as well as by inefficiencies in political markets. Despite a general growth trend, individual country trajectories show a variety of paths, differing in the timing of significant reform implementation and the initial increase in FDI inflows, although some regularities have been pointed out across the region. Countries with low initial distortions have overcome liberalization difficulties more promptly, whereas countries with less favorable conditions at the end of the communist years have taken longer to adopt momentous reforms. Hungary, for example, had the most noticeable increase in FDI inflows immediately after 1989, as the country proceeded more quickly in liberalizing its economy.

Like Hungary, the Czech Republic and Estonia recorded relatively high foreign investments early in transition, but unlike in Hungary, FDI inflows did not level off later on. It was not until towards the end of the 1990s that the Balkan states and the former CIS countries adopted a coherent reform strategy and began recording noteworthy investment flows. Although the beginning of coherent transformation macroeconomic policies and openness to foreign investments and Western European structures marks a break from initial distortions, it is difficult to assess to what extent countries in the CEE have overcome these distortions and made a break from past institutions. Most of the trajectories in transition observed in the region are the result of rearrangements, reconfigurations, and re-combinations that yield the current conditions. But although this paper advocates a path-dependency approach, it should be noted that the mechanism does not condemn the CEE economies and the economic agents acting within them to simple repetition or retrogression. Rather than rigorously locking economies into certain paths, initial conditions leave a mark on the creation of future institutions.
Bibliography

[1] Aoki, Masahiko (2001a), "Subjective-Game Models and the Mechanism of Institutional Change", in Toward a Comparative Institutional Analysis, MIT Press, 2001.
[2] Aoki, Masahiko (2001b), "What Are Institutions? How Should We Approach Them?", The Institutional Foundations of a Market Economy, WDR 220112, Stanford Institute for Economic Policy Research, 2001.
[3] Arthur, W. Brian (1994), Increasing Returns and Path Dependence in the Economy, University of Michigan Press.
[4] Bandelj, N. (2004), Institutional Foundations of Economic Transformations in Central and Eastern Europe, Center for the Study of Democracy, University of California, Irvine, Paper No. 04-14.
[5] Dyker, D. (2001), "The Dynamic Impact on the Central-Eastern European Economies of Accession to the European Union: Social Capability and Technology Absorption", Europe-Asia Studies, 53-7, 1001-1021.
[6] Estrin, S., and Meyer, K.E. (2004), Investment Strategies in Emerging Markets, Edward Elgar Publishing.
[7] Gros, D., and Suhrcke, M. (2000), "Ten Years After: What is Special about Transition Countries?", Hamburg Institute of International Economics Working Paper, ISSN 1432-4458.
[8] Mygind, N. (1994), Societies in Transition, Center for East European Studies.
[9] Pavlinek, P. (2002), "Theoretical and Conceptual Approaches to Post-Communist Transformations in Central and Eastern Europe", Invited Research Presentation at the Institute of Social Sciences, Chuo University, Tokyo, January 16th.
[10] Stark, David (1992), "Path Dependence and Privatization Strategies in Eastern and Central Europe", European Politics and Societies 6:1 (1992), pp. 17-54.
[11] Stark, David (1998), Postsocialist Pathways: Transforming Politics and Property in East Central Europe, Cambridge University Press.
[12] Tihanyi, L., and Roath, A.S. (2002), "Technology Transfer and Institutional Development in Central and Eastern Europe", Journal of World Business 37, 188-198.
[13] UNCTAD (2000), Trade and Development Report, Geneva: UNCTAD.
Chapter 25

Emergence of Networks in Distance-Constrained Trade

Kumar Venkat
CleanMetrics Corp.
4888 NW Bethany Blvd., Suite K5, #191
Portland, OR 97229, USA
kvenkat (at) cleanmetrics.com

Wayne Wakeland
Systems Science Ph.D. Program
Portland State University
Portland, OR 97207, USA
Abstract

Long-distance trade has been rapidly increasing in recent years. As traders from around the world exchange goods, they form networks with traders as nodes and transactions as links. We use an agent-based model of a simple artificial economy to examine the emergence of trade networks when the distance between traders matters. Distance can become an issue if fuel for transportation becomes expensive or if greenhouse gas emissions from transportation become a major concern. We model the distance constraint as a transaction cost proportional to the amount of goods traded and the distance that those goods must be transported. We find that the resulting network topology is a good indicator of the stability and resilience of the economic system. The topology is random when there is no distance constraint. As the transaction cost increases, the topology transitions into a stable scale-free structure with some clustering, and a large fraction of trade occurs within local regions around the network hubs. Under these conditions, the final welfare of the traders decreases only modestly and environmental efficiency increases significantly when each region has a diverse combination of tradable goods.
1.1. Introduction

Long-distance trade is an integral part of globalization and has been rapidly increasing in recent years. As traders from around the world exchange goods, they form networks where the traders represent nodes and transactions between them represent links. We examine the emergence of these trade networks using an agent-based model of a simple artificial economy, in which the distance between traders significantly influences the cost of trading. Distance can become an issue if fuel for transportation becomes expensive, or if greenhouse gas emissions from the fast-growing transportation sector become a major concern [Venkat 2003]. While information technology is rapidly removing many long-standing obstacles to free trade, the ultimate constraint to trade may well be our ability to physically move material goods between traders over long distances at an acceptable real cost. We hypothesize that a distance constraint might lead to a restructuring of the fast-growing society of global traders, and stimulate new kinds of trade relationships and networks. We test our hypothesis in this study using the techniques of agent-based computational economics [Tesfatsion 2006] in a simple setting as a first step. While other studies have focused on the effects of fixed network structures [Wilhite 2001; Wilhite 2006], we take the view that trade networks are highly malleable and arise from the same constraints that influence economic performance. Given the evidence so far that complex systems encode their organizing principles at some level in their topology [Barabasi et al. 2004], we investigate the evolution and structure of the networks in order to characterize the organization and functioning of our artificial economy. We model the distance constraint as a transaction cost. This cost reflects some degree of internalization of the real environmental costs of long-distance trade, including fossil fuel depletion and greenhouse gas emissions. We study the effect of this transaction cost under two different initial allocations of tradable goods, one where there are regional differences and the other where the goods are uniformly distributed throughout the world. We are particularly interested in the properties of trade networks that emerge as we vary both the transaction cost and the initial allocation, and we examine how the network properties correlate with economic performance and environmental efficiency.
1.2. Trade Model

We formulate the trade problem based on our previous work [Venkat and Wakeland 2006], adapting a simple barter economy that has been used to study economic activity on fixed networks [Wilhite 2001; Wilhite 2006]. Our artificial world consists of 1024 traders spaced uniformly in the four quadrants of a flat space, as shown in Figure 1(a). Each trader is an agent who remains at a fixed location, and is able to trade with others who may be at other arbitrary locations. Traders are presumed to find potential trade partners and negotiate the terms of trade through mechanisms that are independent of their locations, such as globally-accessible electronic trade exchanges. Each trader starts out with an initial endowment of two durable goods, g1 and g2, ranging from 0 to 1500 units each. The two goods suffer no degradation over time and serve as assets that can be exchanged. There is no production, and the aggregate stock of goods changes only to account for the transaction cost as described later. The initial allocation can follow two distinct scenarios, maintaining nearly equal amounts of g1 and g2 in our artificial world:

• "Globally mixed random" (GMR): There are no regional differences. Each trader gets random quantities of the two goods such that the total quantity of both goods together is exactly 1500 units.

• "Local comparative advantage random" (LCAR): The eastern half of the world has more g1 than g2, and the western half has more g2 than g1. Each trader in the east receives at least 1200 units of g1 and no more than 300 units of g2. Each trader in the west receives at least 1200 units of g2 and no more than 300 units of g1. The actual amounts are allocated randomly such that each trader starts with a total quantity of 1500 units.

Each trader attempts to maximize the same symmetric Cobb-Douglas welfare function, U = g1 * g2. Trade is organized in the form of trade rounds. In each round of trade, traders are chosen in random order and each trader is given a chance to initiate four consecutive trades. The trader then searches the world and finds the best possible trade partners for the four trades. Two traders consummate a trade if their marginal rates of substitution differ, and if the welfare functions of both traders increase as a result. The trade price between agent i and agent k is determined by the following rule: Price = (g2_i + g2_k) / (g1_i + g1_k). In each trade, the initiating trader buys or sells one unit of g1 in exchange for an appropriate quantity of g2. Successive trade rounds proceed in this fashion and finally terminate when there are no further profitable trading opportunities. Each trade incurs a transaction cost computed as: Total transaction cost = distance * quantity of goods bought * unit transaction cost. We vary the unit transaction cost from 0 to 0.25 in our experiments. The total transaction cost is subtracted from the quantity of goods received by each trader in a trade. Traders evaluate this cost in advance and proceed with a trade only if it would still increase their welfare.
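The trading rules above can be sketched as follows. The Trader scaffolding, endowments, and locations are our own illustrative assumptions, and we apply the distance-proportional cost to the goods each side receives; the paper's NetLogo implementation is the authoritative version.

```python
import math

# Minimal sketch of a single trade under the rules above. The Trader class
# and the specific endowments/locations are illustrative scaffolding.
class Trader:
    def __init__(self, g1, g2, x, y):
        self.g1, self.g2 = g1, g2
        self.x, self.y = x, y            # fixed location in the flat world

    def welfare(self):
        return self.g1 * self.g2         # symmetric Cobb-Douglas: U = g1 * g2

def distance(a, b):
    return math.hypot(a.x - b.x, a.y - b.y)

def trade(i, k, unit_cost):
    """Initiator i buys 1 unit of g1 from k at the paper's price rule."""
    price = (i.g2 + k.g2) / (i.g1 + k.g1)
    d = distance(i, k)
    # Transaction cost = distance * quantity * unit cost, subtracted from
    # the goods each trader receives (our reading of the rule above).
    i_after = (i.g1 + 1 - d * 1 * unit_cost, i.g2 - price)
    k_after = (k.g1 - 1, k.g2 + price - d * price * unit_cost)
    # Execute only if both welfare functions increase.
    if i_after[0] * i_after[1] > i.welfare() and k_after[0] * k_after[1] > k.welfare():
        i.g1, i.g2 = i_after
        k.g1, k.g2 = k_after
        return True
    return False

buyer = Trader(200, 1300, 0, 0)      # g2-rich, wants g1
seller = Trader(1300, 200, 3, 4)     # g1-rich, wants g2
executed = trade(buyer, seller, unit_cost=0.05)
print(executed, buyer.welfare(), seller.welfare())
```

At higher unit costs or longer distances, the same pair of traders would fail the welfare check and the trade would simply not occur, which is how distance prunes long links from the network.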
1.3. Results and Discussion

The trade model was implemented and simulated using NetLogo [Wilensky 1999]. Given the fixed positions of all traders in Figure 1(a), a typical network structure that emerges under the distance-based transaction cost is shown in Figure 1(b). Traders represent nodes of the network and transactions between them represent links. In this section, we probe the origin and structure of these networks and show how they relate to aggregate economic performance in this artificial society.

[Figure 1. (a) Location of traders in the artificial world. (b) Typical network structure emerging from distance-constrained trade. Images not recoverable from the scan.]
When the transaction cost is zero, anyone can trade with anyone else in the world. Figure 2(a) shows that both LCAR and GMR produce the same level of final welfare under these conditions. This demonstrates that unconstrained trade can efficiently move goods between traders and achieve a level of welfare that is nearly independent of initial allocations. LCAR does require more trades in order to overcome geographical differences in the initial allocation, as seen in Figure 2(b). However, LCAR responds poorly to increases in transaction cost, with welfare dropping to less than 50 percent of the unconstrained case at medium and high costs. The largest drop occurs as the unit transaction cost approaches 0.05, suggesting a change in the underlying structure analogous to a phase transition, and the welfare characteristic stabilizes at about a cost of 0.1. In contrast, the average final welfare in the GMR case is within 15 percent of the unconstrained case for all transaction costs.
[Figure 2. (a) Average final welfare and (b) total number of trades as functions of unit transaction cost. Plots not recoverable from the scan.]

As the distance-based transaction cost increases, the average trade distance drops sharply, as seen in Figure 3(a). The trade distance stabilizes at a very low cost in the GMR case, whereas this occurs at a higher cost for LCAR, suggesting that changes in network topology may be occurring at different unit transaction costs in the two cases. Assuming that the greenhouse gas (GHG) emissions produced by each trade are proportional to the quantity of goods and the shipping distance, Figure 3(b) shows the environmental efficiency of the two initial allocations. Clearly, the transaction cost is effective in dramatically increasing the average welfare per unit of greenhouse gas emissions. GMR performs much better than LCAR because the average welfare and the number of trades are very stable while there is a large reduction in the trade distance. The environmental efficiency is relatively stable in this case beyond a cost of 0.05. We now examine the network structure in more detail. As seen in Figure 4(a), the reduction in average network degree closely follows the distance characteristics. The average degree in the GMR case is fairly stable for unit transaction costs between 0.05 and 0.25. The average degree for LCAR goes through significant changes until a cost of 0.1, and then continues to change at a slower rate. Stability of the network structure, as measured by the average degree, correlates strongly with relative stability in trade distance, number of trades, welfare and environmental efficiency for both GMR and LCAR as the transaction cost is varied. The performance of the GMR case is significantly more stable than LCAR and less sensitive to changes in transaction cost.
Figure 3. (a) Average trade distance and (b) environmental efficiency of welfare as functions of unit transaction cost.
Figure 4. (a) Average network degree as a function of unit transaction cost. (b) Evolution of the network over time (GMR, Cost = 0.1).
Figure 5 shows the degree distribution for zero and a low unit transaction cost for both GMR and LCAR. Without a transaction cost, the network is clearly random, with a high average degree as also seen in Figure 4. At a low unit transaction cost, the network still remains random, but the average degree is now much smaller since many potential longer-distance links have been eliminated by the transaction cost. Figure 6 shows the degree distribution for medium and high unit transaction costs. The GMR network displays some scale-free characteristics [Barabasi and Albert 1999] at costs of 0.05 and higher. Most nodes have a small degree while a few hubs have noticeably larger degrees, but the characteristic is limited by the small network size. It is also not an ideal scale-free model because the preferential attachment function [Barabasi et al. 2004] is highly nonlinear due to the distance constraint. More than 70 percent of the nodes have at least two connections, suggesting some degree of clustering in the neighborhoods around the hubs. The probability that a node connects to k other nodes decays as a power law, as in all scale-free networks: P(k) ∝ k^(−γ), where the exponent γ typically ranges between 1 and 3 in our experiments. In the LCAR scenario, the network is decidedly random at a cost of 0.05, and moves closer to a scale-free structure at higher costs. Note that the formation of scale-free networks corresponds to a regime where trade is generally less sensitive to changes in transaction cost.
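A rough way to check such a power-law tail is to fit log P(k) against log k on an empirical degree distribution. The sketch below is illustrative only: it uses a simple least-squares fit on synthetic degrees, not our experimental data, and is not a rigorous maximum-likelihood estimator.

```python
import math
from collections import Counter

def powerlaw_exponent(degrees):
    """Estimate gamma in P(k) ~ k**(-gamma) by a least-squares fit of
    log P(k) against log k (a quick diagnostic, not a rigorous MLE)."""
    counts = Counter(degrees)
    n = len(degrees)
    pts = [(math.log(k), math.log(c / n)) for k, c in counts.items() if k > 0]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    slope = (sum((x - mx) * (y - my) for x, y in pts)
             / sum((x - mx) ** 2 for x, _ in pts))
    return -slope

# Synthetic degree list whose count at degree k is proportional to k**-2,
# so the recovered exponent should be close to 2.
degrees = []
for k in range(1, 11):
    degrees.extend([k] * round(1000 * k ** -2.0))
print(round(powerlaw_exponent(degrees), 1))  # close to 2.0
```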
Figure 5. Degree distribution for zero and low unit transaction costs: (a) GMR. (b) LCAR.
Figure 6. Degree distribution for medium and high unit transaction costs: (a) GMR. (b) LCAR.

Figure 4(b) captures the evolution of the network over time in the GMR case, where time steps correspond to trade rounds. The formation of hubs, corresponding to the power-law tail in the degree distribution, is reinforced as trade proceeds and is nearly complete after a sufficient number of trade rounds. Once all of the traders have been added to the network, the network stops growing although new connections may still be formed and old ones may be deleted. Figure 7(a) shows the distribution of initial welfare as a function of the final node degree. Figure 7(b) shows that hubs are likely to trade over longer distances on average, driven by their need to increase their welfare.
Hubs are also likely to engage in significantly more trade than other nodes, as seen in Figure 7(c). In the case of well-known scale-free networks such as the World Wide Web, new nodes link with higher probability to existing nodes that already have a large number of connections. In our trade model, the self-organizing principle turns out to be quite different but still purposeful. Recent analyses of real-world data have shown that international trade networks tend to have a scale-free structure [Baskaran and Bruck 2005; Bhattacharya et al. 2007]. One of the reasons is that countries typically export goods in which they are specialized and have an advantage, and these countries become hubs in the scale-free network for specific goods. This corresponds to the case of traders starting with highly unequal quantities of g1 and g2 due to the random initial allocation of goods in our model. These traders can be considered to have a comparative advantage in one of the goods. They also start with the lowest initial welfare in the artificial world as computed by the symmetric Cobb-Douglas function, and have the largest motivation to engage in trade in order to improve their individual welfare. Thus, the nodes with a high comparative advantage in one of the goods and low initial welfare are likely to become trading hubs for their neighboring nodes.
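The symmetric Cobb-Douglas welfare function mentioned above makes this incentive easy to see: for a fixed total endowment, welfare is highest when the two goods are held in equal quantities. A minimal sketch follows; the exponent alpha = 0.5 reflects symmetry, while the numerical endowments are purely illustrative.

```python
def welfare(g1, g2, alpha=0.5):
    """Symmetric Cobb-Douglas welfare U = g1**alpha * g2**(1 - alpha)."""
    return (g1 ** alpha) * (g2 ** (1 - alpha))

# With the same total endowment of 100 units, a trader holding a highly
# unequal mix of g1 and g2 starts with much lower welfare, and so has the
# strongest motivation to trade.
balanced = welfare(50, 50)   # ~50.0
skewed = welfare(95, 5)      # sqrt(95 * 5), roughly 21.8
print(balanced > skewed)     # True
```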
Figure 7. Trade characteristics as a function of final node degree (GMR): (a) Average initial welfare. (b) Average trade distance. (c) Average number of trades.
1.4. Conclusion

We have seen that economic performance in the GMR case, as measured by average final welfare, degrades only modestly in response to the transaction cost, while environmental efficiency increases sharply and the number of trades is nearly unchanged. The average degree of the network stabilizes very quickly at a low unit transaction cost and the network gels into a scale-free structure with a certain degree of clustering. In contrast, the response of the LCAR scenario to the transaction cost is much more severe. There is a dramatic decline in welfare coupled with an increased number of trades. The network structure remains random until the transaction cost is quite high, at which point it too approaches a scale-free structure. Economic performance stabilizes to some extent when the network takes on scale-free characteristics, but remains considerably worse than in the GMR case. What lessons can we draw from these experiments with a simple artificial world? First, the network topology appears to be a good indicator of the stability and resilience of the economic system. It is a useful way to characterize economic interactions that can provide insights into both organization and function. Second, a distance-based transaction cost, whether imposed by markets or through environmental regulation in the real world, could lead to stable trade networks where most trade occurs within local regions and a small fraction of trade spans longer distances. The hub structure and clustering that emerge in our experiments are ideally suited for local trade. Once such a network has formed, the loss of welfare would be limited if each region has a diverse combination of tradable goods. This suggests that diversified local economies may adapt better to distance constraints than trade regimes where each region specializes in a small number of goods.
References

Barabasi, A., and Albert, R., 1999, Emergence of Scaling in Random Networks, Science, 286: 509-512.
Barabasi, A., Dezso, Z., Ravasz, E., Yook, S-H., and Oltvai, Z., 2004, Scale-Free and Hierarchical Structures in Complex Networks, Sitges Proceedings on Complex Networks.
Baskaran, T., and Bruck, T., 2005, Scale-Free Networks in International Trade, German Institute for Economic Research, Discussion Paper 493.
Bhattacharya, K., Mukherjee, G., and Manna, S., 2007, The International Trade Network, in Econophysics of Markets and Business Networks, Springer-Verlag (Milan).
Tesfatsion, L., 2006, Agent-Based Computational Economics: A Constructive Approach to Economic Theory, in Handbook of Computational Economics, Vol. 2, North-Holland (Burlington).
Venkat, K., 2003, Global Trade and Climate Change, GreenBiz (www.greenbiz.com).
Venkat, K., and Wakeland, W., 2006, An Agent-Based Model of Trade with Distance-Based Transaction Cost, Proceedings of the Summer Computer Simulation Conference, The Society for Modeling and Simulation International (San Diego).
Wilensky, U., 1999, NetLogo, Center for Connected Learning and Computer-Based Modeling, Northwestern University (Evanston).
Wilhite, A., 2001, Bilateral Trade and 'Small-World' Networks, Computational Economics, 18: 49-64.
Wilhite, A., 2006, Economic Activity on Fixed Networks, in Handbook of Computational Economics, Vol. 2, North-Holland (Burlington).
Chapter 26
Toward Agent-Based Models of the Development and Evolution of Business Relations and Networks

Ian F. Wilkinson
School of Marketing, University of New South Wales
[email protected]

Robert E. Marks
Economics/AGSM, University of New South Wales
[email protected]

Louise Young
School of Marketing, University of Technology, Sydney
louise. [email protected]

Firms achieve competitive advantage in part through the development of cooperative relations with other firms and organisations. We describe a program of research designed to map and model the development of cooperative inter-firm relations, including the processes and paths by which firms may evolve from adversarial to more cooperative relations. Narrative-event-history methods will be used to develop stylised histories of the emergence of business relations in various contexts and to identify relevant causal mechanisms to be included in the agent-based models of relationship and network evolution. The relationship histories will provide the means of assuring the agent-based models developed.
1 Introduction

The importance of a firm's business relations and networks in creating and sustaining its competitive advantage in domestic and international markets is being given ever more attention by academics and practitioners. In particular, the development of collaborative relations among and within firms is an important potential source of competitive advantage because the resources created through such collaboration are valuable, rare, inimitable and non-substitutable (Daugherty et al
2006, Davis and Spekman 2004, Dyer 2000). Such relations also present special problems and challenges for managers (and policy makers) because relations and networks are not controlled by individual firms or government: they are examples of complex adaptive systems that self-organise over time through the micro interactions and processes taking place among firms. The business relations and networks a firm operates in are part of the extended enterprise of the firm, affecting what its managers can do, see, know, learn and think. These relations extend the competences, skills, resources, and knowledge of the firm in positive and negative ways (Wilkinson and Young 2005). As a result, business relations and networks, including direct and indirect relations with customers, suppliers, distributors, competitors, technology partners and complementors, play a key role in the creation and delivery of value in the form of products and services and in the development and evolution of the value-creation process itself. A major lacuna in the literature is the lack of theory and evidence about the way business relations and networks develop and evolve over time and the role managers (and government) can play in influencing these processes, despite substantial research in the area (Wilkinson 2001). Most research and theory in business and social science is dominated by comparative-static, variance-based, survey-type approaches to explaining relationship behaviour and performance, which ignore dynamic processes, interaction, order effects and feedback effects (Abell 2004, Buttriss and Wilkinson 2006, Van de Ven and Poole 2005, Parke et al 2006).
The aim of our research is to overcome this gap by building and assuring (Midgley et al 2007) agent-based models of business relationship and network development and evolution as complex adaptive systems, including the following characteristics:
- the psychological, social, managerial, and economic mechanisms involved in the development and evolution of business relations and networks, as identified in existing theories and through a systematic narrative-event-history mapping of a sample of actual business relationship and network histories;
- capability of reproducing the stylised patterns of development and evolution of a sample of actual business relations and networks;
- ability to examine the stability and sensitivity of the outcomes to different types of initial conditions and contingencies;
- ability to examine the kind of roles managers (and governments) can and cannot play in shaping the patterns of development and evolution emerging; and
- ability to identify more or less effective strategies for management intervention (and government policy) to enable business relations and networks to be established and develop in productive ways.
The models will incorporate the main generative mechanisms and processes driving relationship and network development as identified in the literature and from
studies of actual relationship and network histories, including group-selection as well as individual-selection mechanisms as part of the evolutionary processes. Group selection includes both the direct and indirect or group-level effects of the behaviour and resources of individuals, where business relations and networks are the groups of interest here. Group-selection effects (Griffing 1967, Price 1970) can help explain the emergence of anomalous cooperative or pro-social behaviour (Henrich 2004, Muir 2005), particularly relevant in modelling the evolution of collaborative relations and networks in business (see Ladley et al 2007).
2 Theory and Methods

2.1 Three methodologies

Three methodologies will be used to analyse the results of prior studies of actual relationship and network histories, in order to inform the models developed and to realise and implement the ABM: first, narrative-event-history methods to map the event sequences and causal mechanisms underlying a sample of actual business relationship and network histories already gathered (Abell 2004, Buttriss and Wilkinson 2006, Van de Ven and Engleman 2005, Van de Ven and Poole 2005); second, the analysis of the histories will be further aided by use of an AI semantic text-analysing tool, Leximancer (Smith 2006); third, the agent-based models will be developed and assured (Marks 2007).
2.2 Theoretical background

The roles and functions of business relations and networks have been described and classified in various ways (Walter et al 2001, 2003, Wiley et al 2006a and b). In essence such relations and networks enable and constrain the functioning and evolution of economic systems in two main ways. First, they are the means by which the fruits of the division of labour in a society are realised. They are the means by which the activities, skills, resources and outputs of people and firms specialising in different tasks are accessed, combined and coordinated in order to produce and deliver value in the form of desired products and services. The tasks of value creation and delivery are shared among specialists and are reintegrated through the relations and networks they develop among them. Second, they play an important role in shaping the way an economic system develops and evolves through their impact on innovation, learning and knowledge development (Wilkinson 2006). They are the means by which new ideas are developed and new types of opportunities are discovered and exploited. They affect the flow of knowledge and ideas in an economy and the way they are interrelated and integrated (Burt 2004, Hargadon 2003, North 2005). Business relations and networks lie at the heart of the institutional structure of an economy. Four types of theories of organisational change may be identified: life cycle, teleological, dialectic, and evolutionary (Van de Ven and Poole 1995). Life-cycle theories are of limited use since they assume implicitly that relationship development
is some rigidly unfolding process, with pre-determined stages. They have limited predictive value and leave little room for management, ambiguity, uncertainty and external contingencies in shaping relationship development. A teleological approach implicitly assumes that the relationship is controlled by a firm that can manage it for its purposes, or that the most efficient structure will eventually emerge, as in transaction-cost theory. Behavioural issues such as power and conflict and trust and commitment are assumed to affect only the details of the transition process, not the eventual outcomes. But existing relations do not necessarily conform to these ideals, and behavioural processes do affect these outcomes. To some extent, all firms are interdependent and therefore have some influence over each other, and the process of development and evolution arises from these micro processes of action and interaction. An evolutionary perspective comprises four generic processes: variation, selection, retention, and diffusion (Aldrich 1999). Variation concerns the mechanisms by which new forms of organisation emerge, whether intentionally or not. Selection is the process by which internal or external forces support, develop or undermine new forms. Retention refers to the way selected forms are preserved or reproduced over time. Diffusion is the process by which new forms are adopted or imitated by others in the relevant population. These processes are inter-related and together shape the pattern of evolution of organizations. There have been few in-depth studies of the way business relationships and/or networks develop and evolve (e.g. De Rond and Bouchikhi 2004, Doz 1996, Hakansson and Snehota 1995, Havila and Wilkinson 2003), and these offer only limited insight into the kinds of mechanisms and processes operating.
Other research has focused on particular types of causal mechanisms or processes involved in relationship and network development, but these have not been integrated into a comprehensive theory or model. These mechanisms include: (a) business-mating processes (Wilkinson et al 2005), including partner search, identification and choice processes; (b) business-dancing or -interacting processes (Wilkinson and Young 1994), such as leading, following, collaborating, negotiation, communication, influence, and conflict management processes; (c) business-learning and knowledge-development processes, including imitation and knowledge sharing, and the way information diffuses through business relations and networks (Burt 2004, Granovetter 1983); (d) business innovation and adaptation processes, such as the way new ideas and opportunities are discovered, developed and exploited and the way people and firms develop and adapt their activities, resources, feelings and ideas to each other (Hagel and Brown 2005, Hargadon 2003, Roy et al 2004); and (e) business-relationship and network interconnection processes, the way activities, interactions and processes taking place in one relation affect other relations (Anderson et al 1994). The development and evolution of business relations and networks is a co-evolutionary process in which firms, relations and networks constitute each other's environment (March 1996). Relations and networks also respond and adapt to the realities of the more general environmental systems in which they operate, including social-cultural, economic, technological, material and biological dimensions, which also develop and evolve over time.
A simple example of this co-evolutionary process is shown in Figure 1 in terms of a business relation involving two firms. The structure of business relations may be described in terms of actors, activities, resources and schemas (Hakansson and Snehota 1995, Welch and Wilkinson 2002). The actors are the people and firms involved in the relation, who act and interact in various ways using their resources (including knowledge, skills and competences). Actions and interactions are guided by each actor's relational schema and the bonds that exist between them. Relational schemas are the theories in use or mental models actors have regarding the nature and role of the relation, including what they hope to get out of it, their ideas and beliefs about themselves and the other actors involved, and their expectations regarding each actor's behaviour and contribution. Actor bonds refer to various types of emotions and feelings that can arise between people and firms, including affection, trust, dependence, commitment, respect and sympathy, which may or may not be reciprocated.
Figure 1. The Relationship Development Process (showing the relations connected to firms A and B and the feedback loop for each firm).
The experience and outcomes of the actions and interactions taking place over time in a business relation (the relationship process) have various types of feedback effects. These are depicted in Fig. 1 in terms of two coupled feedback loops, one for each firm. In actual business relations and networks, many such interconnected feedback loops operate simultaneously. Each relation produces its own history as a result of its initial conditions and the particular sequence of actions, interactions, events and contexts it has to deal with. Both virtuous and vicious spirals can occur. A business relation is continually being made and remade through the ongoing actions and interactions taking place. A relationship structure persists when the patterns of actions and interactions occurring for each actor reproduce the same relationship structure that produces these actions and interactions. Such relations are said to be balanced or in dynamic equilibrium, and may be more or less stable and robust in withstanding internal and external shocks. A relation continues to evolve and structural adaptations take place until it approaches a balanced state or it ends. Different kinds of balanced relations
(or relational attractors) emerge as a result of history and the conditions in which the relation operates, including other relations and networks. The various types of relational attractors are reflected in the empirical taxonomies of business relations (Bensaou 1999, Cannon and Perreault 1999). Of particular interest in this research are the conditions under which forms of collaborative business relations can emerge and survive.
3 Summary

Relationship and network development and evolution are complex in multiple ways. They involve many types of interpersonal, inter-departmental and inter-firm interactions and include the simultaneous and ongoing operation of many different types of interconnected generative or driving mechanisms. The pace of action and change varies for different aspects of business, with slower dynamics constraining and influencing faster dynamics. This results in extremely complex dynamics and feedback effects which cannot be handled by traditional analytical methods. Business relations and networks are examples of complex adaptive systems in which the micro interactions and processes taking place among people and within and among firms produce large-scale patterns of change and evolution in a bottom-up, self-organising manner. In order to analyse such systems, we argue for simulation methods in which the behaviour of the system is played out over time under various conditions and assumptions.
Bibliography

1. Abell, Peter (2004) "Narrative Explanation: An Alternative to Variable-Centred Explanation?" Annual Review of Sociology, 30: 287-310.
2. Aldrich, H. (1999) Organizations Evolving, Sage.
3. Anderson, J.C., Hakansson, H. and Johanson, J. (1994) "Dyadic Business Relationships within a Business Network Context," Journal of Marketing, 58: 1-15.
4. Bensaou, M. (1999) "Portfolios of Buyer-Supplier Relationships," Sloan Management Review, 35-44.
5. Burt, R.S. (2004) "Structural Holes and Good Ideas," American Journal of Sociology, 110 (2): 349-399.
6. Buttriss, G. and Wilkinson, I.F. (2006) "Using Narrative Sequence Methods to Advance International Entrepreneurship Theory," Journal of International Entrepreneurship, 4 (4): 157-174.
7. Cannon, J.P. and Perreault, W.D. (1999) "Buyer-Seller Relationships in Business Markets," Journal of Marketing Research, 36 (4): 439-460.
8. Daugherty, P.J., Richey, R.G., Roath, A.S., Min, S., Chen, H., Arndt, A.D. and Genchev, S.E. (2006) "Is Collaboration Paying off for Firms?" Business Horizons, 49: 61-70.
9. Davis, E. and Spekman, R. (2004) Extended Enterprise, Financial Times/Prentice Hall.
10. De Rond, M. and Bouchikhi, H. (2004) "On the Dialectics of Strategic Alliances," Organization Science, 15 (1): 56-69.
11. Doz, Y.L. (1996) "The Evolution of Cooperation in Strategic Alliances: Initial Conditions or Learning Processes?" Strategic Management Journal, 17: 55-83.
12. Dyer, J. (2000) Collaborative Advantage, Oxford University Press.
13. Granovetter, M. (1983) "The Strength of Weak Ties," Sociological Theory, 1: 201-233.
14. Griffing, B. (1967) "Selection in reference to biological groups. I. Individual and group selection applied to populations of unordered groups," Aust. J. Biol. Sci., 10: 127-139.
15. Hakansson, H. and Snehota, I. (1995) Developing Relationships in Business Networks, Routledge.
16. Hagel III, J. and Brown, J.S. (2005) "Productive friction: how difficult business partnerships can accelerate innovation," Harvard Business Review, 83 (2): 82-91, 148.
17. Hargadon, A. (2003) How Breakthroughs Happen, Harvard Business School Press, Cambridge, MA.
18. Havila, V. and Wilkinson, I. (2003) "The principle of the conservation of business relationship energy: or many kinds of new beginnings," Industrial Marketing Management, 31: 191-203.
19. Henrich, J. (2004) "Cultural group selection, coevolutionary processes and large-scale cooperation," Journal of Economic Behaviour and Organisation, 53: 3-35.
20. Ladley, D., Wilkinson, I.F. and Young, L.C. (2007) "Group Selection versus Individual Selection and the Evolution of Cooperation in Business Networks," paper presented at the IMP Annual Conference, University of Manchester, UK.
21. March, J.G. (1996) "Continuity and Change in Theories of Organizational Action," Administrative Science Quarterly, 41 (June): 278-287.
22. Marks, R.E. (2007) "Validating simulation models: a general framework and four applied examples," Computational Economics, 30 (3): 265-290.
23. Midgley, D.F., Marks, R.E. and Kunchamwar, D. (2007) "The Building and Assurance of Agent-Based Models: An Example and Challenge to the Field," Journal of Business Research, 60: 884-893.
24. Muir, W. (2005) "Incorporation of Competitive Effects in Forest Tree or Animal Breeding Programs," Genetics, 1247-1259.
25. North, D. (2005) Understanding the Process of Economic Change, Princeton University Press.
26. Parke, A., Wasserman, S. and Ralston, D.A. (2006) "New frontiers in network theory development," Academy of Management Review, 31 (3): 560-568.
27. Price, G. (1970) "Selection and covariance," Nature, 227: 520-521.
28. Roy, Subroto, Sivakumar, K. and Wilkinson, I.F. (2004) "Innovation Generation in Supply Chain Relationships: A Conceptual Model and Research Propositions," Journal of the Academy of Marketing Science, 32 (1): 61-79.
29. Smith, A.E. and Humphreys, M.S. (2006) "Evaluation of Unsupervised Semantic Mapping of Natural Language with Leximancer Concept Mapping," Behavior Research Methods, 38 (2): 262-279.
30. Van de Ven, A. and Engleman, R. (2004) "Event- and Outcome-driven Explanations of Entrepreneurship," Journal of Business Venturing, 19: 343-358.
31. Van de Ven, A. and Poole, M.S. (1995) "Explaining Development and Change in Organizations," The Academy of Management Review, 20 (3): 510-540.
32. Van de Ven, A. and Poole, M.S. (2005) "Alternative Approaches for Studying Organizational Change," Organization Studies, 26 (9): 1377-1404.
33. Walter, A., Ritter, T. and Gemunden, H.G. (2001) "Value Creation in Buyer-Seller Relations," Industrial Marketing Management, 30: 365-377.
34. Walter, A., Muller, T.A., Helfert, G. and Ritter, T. (2003) "Functions of Industrial Supplier Relationships and Their Impact on Relationship Quality," Industrial Marketing Management, 32: 159-169.
35. Welch, C. and Wilkinson, I. (2002) "Idea Logics and Network Theory in Business Marketing," Journal of Business to Business Marketing, 8 (3): 27-48.
36. Wiley, J., Wilkinson, I.F. and Young, L. (2006) "The Impact of Connected Relations on Relationship Performance: A Comparison of European and Chinese International Business Relations," Journal of Business and Industrial Marketing, 21 (1): 3-13.
37. Wiley, J., Wilkinson, I.F., Young, L. and Denize, S. (2006) "The Direct and Indirect Functions of International Business Relationships for Suppliers and Customers: A Comparative Study of European and Chinese Firms," European Marketing Academy Annual Conference, May.
38. Wilkinson, I. and Young, L. (1994) "Business Dancing: An Alternative Paradigm for Relationship Marketing," Asia-Australia Marketing Journal, 2 (1): 67-80.
39. Wilkinson, I.F. (2001) "A History of Channels and Network Thinking in Marketing in the 20th Century," Australasian Marketing Journal, 9 (2): 23-53.
40. Wilkinson, I.F. (2006) "The Evolvability of Business and the Role of Antitrust," Antitrust Bulletin.
41. Wilkinson, Ian F., Freytag, Per and Young, Louise (2005) "Business Mating: Who Chooses Whom and Gets Chosen?" Industrial Marketing Management, 34 (7): 669-680.
42. Wilkinson, I., Mattsson, L-G. and Easton, G. (2000) "International Competitiveness and Trade Promotion Policy from a Network Perspective," Journal of World Business, 35 (3): 275-299.
43. Wilkinson, I. and Young, L. (2002) "On Cooperating: Firms, Relations and Networks," Journal of Business Research, 55 (2): 123-132.
44. Wilkinson, I. and Young, L. (2005) "Toward A Normative Theory of Normative Marketing Theory," Journal of Marketing Theory, 5 (4): 363-396.
45. Young, L. and Wilkinson, I.F. (2004) "Evolution of Networks and Cognitive Balance," IMP Conference, Copenhagen, September 2004.
Chapter 27
Dynamic Modeling of New Technology Succession: Projecting the Impact of Macro Events and Micro Behaviors on Software Market Cycles

Sharon A. Mertz, Adam Groothuis, Philip Vos Fellman
International Business Department, Southern New Hampshire University
The subject of technology succession and new technology adoption in a generalized sense has been addressed by numerous authors for over one hundred years. Models which accommodate macro-level events as well as micro-level actions are needed to gain insight into future market outcomes. In the ICT industry, macro-level factors affecting technology adoption include global events and shocks, economic factors, and global regulatory trends. Micro-level elements involve individual agent actions and interactions, such as the behaviors of buyers and suppliers in reaction to each other, and to macro events. Projecting technology adoption and software market composition and growth requires evaluating a special set of technology characteristics, buyer behaviors, and supplier issues and responses which make this effort particularly challenging.
1.0 Literature Review

A number of models were developed in the 1980s specifically focused on the issue of technology adoption and market shifts which result from the actions of buyers and suppliers. Farrell and Saloner [1] address the benefits of standardization, and the direct network externalities that can result from compatibility between suppliers. Paul David discusses the de facto adoption of the QWERTY keyboard arrangement, where a potentially inferior technology is locked in due to a path-dependent sequence of economic changes [2]. Katz and Shapiro address the effect of sponsorship on technology adoption, market shifts, and consumer choice [3]. Sponsorship occurs when suppliers have proprietary technologies which they discount initially to gain market share and buyer mindset, recouping costs over time through highly profitable maintenance and services pricing [4]. Consumer choice can be influenced by the number of other consumers selecting compatible technologies, and consumer benefits increase with higher adoption rates or increasing network
externalities.i W. Brian Arthur takes the position that insignificant events may by chance advantage one technology over another when two or more increasing-return technologies compete." Using a two -agent two-technology model , demand and supply are kept stable to examine the effects of historical events affecting agent choice of technology. The model is tested in different cases of constant, diminishing, and increasing returns to determine whether fluctuations in the order of choices introduced makes a difference in technology adoption.' Testing with homogeneous agents shows choice order does not matter, and first technology chosen is selected again and again , producing comparable results to David's QWERTY example. Gartner research identifies three broad stages of 7 - 10 year technology life cycles that track the major economic cycles. Each decade shows overlapping cycles of maturity, growth, and emergence of a major style of computing.t Industry life cycles contain generational product life cycles which evolve through stages of emergence, growth, maturity, consolidation, and decline." Buyers can be characterized within three categories, where "type A" (aggressive, leader, features focused) represents 15%, "type B" (mainstream, adaptive, benefits-focused) I Farrell, Joseph, and Garth Saloner, "Standardization, Compatibility, and Innovation". M.I.T. Working Paper #345, April, 1984. 2 David, Paul A. "Clio and the Economics of QWERTY". Economic History, Vol. 75 No.2, pp. 332-337, May, 1985. 3 Katz, Michael L., and Carl Shapiro, "Technology Adoption in the Presence of Network Externalities". The University of Chicago: Journal of Political Economy, 1986, vol. 94, no. 4, pp. 822-841. 4 See Fellman, Philip V., Post, J. V., Wright R., and Dasari, U. "Adaptation and Coevolution on an Emergent Global Competitive Landscape" /nterJourna/ Complex Systems, 1001 ,2004 5 Ibid.; Farrell and Saloner, op.cit., p.l. 6 Arthur, W. 
Brian, "Competing Technologies, Increasing Returns, and Lock-in by Historical Events". The Economic Journal, 99, (March 1989), pp. 116-131. 71bid., p. 118. 8 Correia, Joanne, and Mertz, Sharon, "CRM Market Trends". Gartner. Inc.: Gartner Customer RelationshipManagementSummit, San Diego, October, 2005. 9 Longwood, Jim, Tom Austin and Betsy Burton, "Understand the Challenges and Opportunities of the Market Life Cycle". Gartner, Inc., ID Number GOOI27583, February 16,2006.
represents 56%, and "type C" (late adopter, follower, cost-focused) represents 29% of the buyer population.10
2.0 The Windrum-Birchenhall Model
Windrum and Birchenhall use agent-based modeling to develop a model of technological successions where heterogeneous populations of adaptive buyers and suppliers evolve over time.11 The model simulates groups of buyers entering and exiting the market who base their technology choices on production costs, price, and performance quality. Buyers elect to adopt the new technology if the value of the existing technology is less than the design advantage and financial returns that are gained from adopting the new solution. Suppliers react by adjusting capacity, production levels and design features, and set price by calculating mark-up over cost. Firms base their decisions on revenues, capacity, profit and monetary wealth. Future design enhancement decisions are driven by the firm's ability to increase business benefit to the buyer. Technological shocks are also introduced by new entrants providing new features consumed by new buyers who enjoy the new technology at a lower unit cost.12 Findings indicate that new technology adoption ultimately depends upon consumer preferences for the advantages of new design features and lower production costs. The objective of the research is to use systems dynamics modeling to reproduce the model, then further evaluate it to determine:
1. Its applicability to illustrate technology adoption in a scenario involving multiple suppliers (competing technologies) and buyers (adopters)
2. Appropriate modifications to key elements which refine the model's usefulness in both a theoretical and commercial environment
3. How the model can be generalized, then adapted, to reflect other macro events and micro-level agent responses in the software technology markets
Model testing shows that buyer and supplier characteristics in the original design require modification in order to adequately reflect actual technology adoption cycles.
Initial enhancements result in a generalized base framework which can then be extrapolated to reflect the impact of economic shocks and market shifts. Continued testing indicates observable trends driven by price/design ratios and impacted by both externalities and agent behaviors.
3.0 The SNHU Model of Technology Succession
The model was initially developed using many of the default variables outlined by Windrum and Birchenhall to simulate agent behaviors within the enterprise
10 Kirwin, Bill, Joseph Feiman, Diane Morello and Phillip Redman, "Enterprise Personality Profile: How Did We Get There?". Gartner, Inc., ID Number: COM-22-3093, March 16, 2004.
11 Windrum and Birchenhall, op. cit.
12 Ibid., pp. 14-15.
software market.13 The revised model consists of two groups of primary agents: buyers, or purchasers of enterprise software, and suppliers, or software vendors. This model focuses on enterprise application software technology adoption, and does not address consumer buying patterns. Buyer actions are influenced by software features/functionality (Design), price and/or required capital investment, delivery model, and perception of vendor viability. Behaviors and interactions within the market are also influenced by a set of macro-level conditions which are disturbed by intermittent shocks. Design requirements and related technology adoption are driven by external events such as privatization in emerging economies,14 legal requirements, social demands, and economic conditions affecting business strategy and technology spending. When shocks are severe, buyer behaviors can produce a temporary atrophy in new technology adoption unless the technology guards against immediate potential risk, as with new security and monitoring technologies. Supplier behavior and investment decisions are influenced by stakeholder and shareholder expectations, buyer consumption demands for product features, pricing, and delivery models, and competitive pressures. Market maturity, resource considerations, and imperfect knowledge of new market opportunities impact supplier decisions, controlling availability of new technologies and affecting buyer choice. Suppliers can also manipulate technology adoption and buyer behavior by introducing alternative licensing schemes which can reduce cost and risk.
" .. ..,I.Q~.J_D Pac ~,,=C ..:. It: I'»'
t>e .....
Lr""..
CwalmlF.t C) ::n't lll ([" Lf
~ _ • • a..:o
~ ....
d....
d (l~""
..
~
(1.1
A ~ &Q TftI":
u...
Figure. I: The SNHU Systems Dynamics of Technology Succession
13 For a list of the Windrum-Birchenhall variables see Windrum-Birchenhall, op. cit., Table 1, p. 22.
14 Technologies and applications which are required to enable privatization of basic infrastructure, such as utilities and telecom, show early adoption in transition economies.
A systems dynamics modeling tool15 was chosen for model development over Windrum-Birchenhall's original series of batch programming runs with randomized variable values. An advantage of this tool is the ability to model changes and observe the impact of relationships between agent variables in real time. The model allows for shifts in product design, costs, pricing, and consumer budget to simulate market fluctuations. At the core of the model is the design space, representing the interaction point between buyers and suppliers. Several variations were made to the model. The concept of "utility" was adjusted for enterprise application software markets to more accurately reflect market characteristics. Three variables were initially selected to represent the model core for both supply and demand decisions: price, design, and customer budget. Price is determined by market forces and consumer preferences for alternative designs and consumption models. Supplier inputs to price are driven by both engineering and sales costs rather than the fixed asset considerations of physical plant and production assumed by Windrum-Birchenhall. Costs, affecting price, climb as R&D staffs expand or sales campaigns execute. Firms applying sponsorship strategies to capture share16 can influence buyer decisions through early price incentives, then reap increasing returns as technologies mature. The Gartner ABC model was used to allocate three buyer groups. Group size ratios were assigned using the variable Adopter Population Percentage Coefficient illustrated in Figure 1. The model simulates competition between five firms offering various price and design levels to customers with equal budgets. Price and Design variables were set for firms to illustrate levels of price-feature combinations where greater functionality commands higher prices. Prices are manipulated through a Price Setting variable.
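The base-model setup described above can be sketched in a few lines. Only the five-firm count and the 15%/56%/29% adopter split come from the text; the price/design values, the budget, and the utility function are illustrative assumptions, not the actual Vensim formulation.

```python
# Hedged sketch of the base model described above. Only the five-firm count
# and the 15%/56%/29% adopter split come from the text; the price/design
# values, the budget, and the utility function itself are illustrative.
FIRMS = {f"Firm{i}": {"price": p, "design": d}
         for i, (p, d) in enumerate([(10, 8), (12, 11), (15, 14), (8, 5), (20, 19)], 1)}

def allocate_buyers(n, pcts=(0.15, 0.56, 0.29)):
    """Split n buyers into type A/B/C groups (Adopter Population Percentage Coefficient)."""
    counts = [round(n * p) for p in pcts]
    counts[-1] = n - sum(counts[:-1])       # absorb rounding error in the last group
    return dict(zip("ABC", counts))

def utility(firm, budget):
    """Toy utility: design delivered per unit price, zero if the firm is unaffordable."""
    if firm["price"] > budget:
        return 0.0
    return firm["design"] / firm["price"]

buyers = allocate_buyers(1000)
best = max(FIRMS, key=lambda name: utility(FIRMS[name], 15))
print(buyers, best)
```

With these toy numbers the fixed budget excludes the highest-priced firm, so the best price-feature combination among affordable firms wins, mirroring the price/design ratio logic described above.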
4.0 Base Model Test Results
Output in Figure 2 shows the interaction of firm utility and customer desired utility illustrated by the firm/adopter pairs within the design space. Customer desired utility is higher for early adopters than late adopters, as new technologies are considered strategic imperatives. Adopter types in each firm aggregate together in three discernible bands. Late adopters show the greatest variation over time. Behavior of early adopters is the most constant, characteristic of type A corporations. Mainstream adopters show only slightly greater modulation than the early adopters over time. Positioning is due to the relationship of the price/design ratio vis-à-vis the fixed customer budget.
15 Vensim® (Ventana® Simulation Environment, copyright 1988-2002, Ventana Systems, Inc.) was used to construct the model.
16 Katz and Shapiro, op. cit.
Figure 2: Design Space (legend: Late Adopters, Mainstream Adopters, Early Adopters)
4.1 Simulating a Macroeconomic Shock
Given the known baseline, a macroeconomic shock was simulated by reducing buyer consumption budgets equally across all firms. The firms with the most unfavorable price-to-design ratio either exit the model or become subject to reduced market performance. Firms that fall out of the model represent firms selling product below cost as a sponsorship strategy to gain mindshare. Budgets were then reduced for only firms with unfavorable design-to-price ratios. Results show that firms under an economic shock price themselves out of the market, simulating conditions of either process inefficiencies or uncontrolled development costs. A third scenario was tested for firms that have higher design functionality and associated prices to represent a business-critical vertical industry specialization immune to economic downturn. Budgets for those firms were increased, resulting in a market leader shift as shown in Figure 3. Results indicate that, given differing economic conditions and varying functionality, the model is able to mimic macroeconomic effects that result in a shift of market leaders depending on price and product design.
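The budget-cut mechanism of the first shock scenario can be mimicked directly: cut every buyer's budget and see which firms are priced out. The firm names, (price, design) pairs and the size of the cut are hypothetical; only the mechanism follows the text.

```python
# Hedged sketch of the budget-shock experiment above: cut all buyer budgets
# and see which firms are priced out. Firm names, (price, design) pairs and
# the shock size are hypothetical; only the mechanism follows the text.
firms = {"Firm1": (10, 8), "Firm2": (12, 11), "Firm3": (15, 14),
         "Firm4": (8, 5), "Firm5": (20, 19)}            # (price, design)

def survivors(firms, budget):
    """Firms that a buyer with this budget can still afford."""
    return {name for name, (price, _design) in firms.items() if price <= budget}

before = survivors(firms, budget=20)   # baseline: every firm is affordable
after = survivors(firms, budget=12)    # macroeconomic shock: budgets cut across the board
exited = before - after                # firms with unfavorable price/design exit the model
print(sorted(exited))
```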
Figure 3: Modeling a Market Shift
5.0 Conclusion and Areas for Future Research
Though still at the early stages of model development, a number of simulations were constructed and tested that exercised different features of the software to model various buyer adoption and supplier response conditions. Various base model tests indicated the importance of the design space as the primary interaction point, and additional coefficients increased our ability to adjust for alternative market scenarios. Tests manipulating the key variables of price, design and budget revealed model sensitivities and suggested new ways to increment and optimize the model for future extensions. The current model explores the impacts of price, design, and budget manipulation on technology adoption within enterprise software markets. Interactions within markets can vary market structure and the rate of technology adoption. Macroeconomic factors also redirect budgets either due to uncertainty or to the dictates of business infrastructure. Future models are anticipated which will introduce the impact of these elements on new technology successions.
Bibliography
[1] Arthur, W. Brian, "Complexity and the Economy". Science, 2 April 1999, 284, 107-109.
[2] Arthur, W. Brian, "Competing Technologies, Increasing Returns, and Lock-in by Historical Events". The Economic Journal, 99 (March 1989), pp. 116-131.
[3] Arthur, W. Brian & Wolfgang Polak, "The Evolution of Technology within a Simple Computer Model". Santa Fe: Santa Fe Institute, December 17, 2004.
[4] Bonabeau, Eric, "Don't Trust Your Gut". Cambridge: Harvard Business Review, May 2003.
[5] Bonabeau, Eric, "Agent-based modeling: methods and techniques for simulating human systems". Proceedings of the National Academy of Sciences, vol. 99, suppl. 3, May 14, 2002.
[6] Correia, Joanne, and Mertz, Sharon, "CRM Market Trends". Gartner, Inc.: Gartner Customer Relationship Management Summit, San Diego, October 2005.
[7] David, Paul A., "Clio and the Economics of QWERTY". Economic History, Vol. 75, No. 2, pp. 332-337, May 1985.
[8] Farrell, Joseph, and Garth Saloner, "Standardization, Compatibility, and Innovation". M.I.T. Working Paper #345, April 1984.
[9] Feiman, J., Kirwin, B., Morello, D. and Redman, P., "Enterprise Personality Profile: Dimensions and Descriptors". Gartner, Inc., ID Number: COM-22-3417, March 16, 2004.
[10] Haines, Michael, "The Enterprise Personality Profile Builds Sales Insight". Gartner, Inc., ID Number: G00122510, September 2, 2004.
[11] Katz, Michael L., and Carl Shapiro, "Technology Adoption in the Presence of Network Externalities". The University of Chicago: Journal of Political Economy, 1986, vol. 94, no. 4, pp. 822-841.
[12] Kirwin, B., Feiman, J., Morello, D. and Redman, P., "Enterprise Personality Profile: How Did We Get There?". Gartner, Inc., ID Number: COM-22-3093, March 16, 2004.
[13] Windrum, Paul, "Unlocking a lock-in: towards a model of technological succession". Maastricht: Maastricht Economic Research Institute on Innovation and Technology.
[14] Windrum, Paul & Chris Birchenhall, "Technological diffusion, welfare and growth: technological succession in the presence of network externalities". Maastricht: Maastricht Economic Research Institute on Innovation and Technology, MERIT Infonomics Research Memorandum Series, 2002.
Chapter 28
Hypercompetitive Environments: An Agent-based Model Approach
Manuel Dias
Instituto Superior de Economia e Gestão, Universidade Técnica de Lisboa
[email protected]
Tanya Araujo
Instituto Superior de Economia e Gestão, Universidade Técnica de Lisboa
[email protected]

1. Introduction
Information technology (IT) environments are characterized by complex changes and rapid evolution. Globalization and the spread of technological innovation have increased the need for new strategic information resources, both from individual firms and management environments. Improvements in multidisciplinary methods and, particularly, the availability of powerful computational tools are giving researchers an increasing opportunity to investigate management environments in their true complex nature. The adoption of a complex systems approach allows for modeling business strategies from a bottom-up perspective - understood as resulting from repeated and local interaction of economic agents - without disregarding the consequences of the business strategies themselves for the individual behavior of enterprises and the emergence of interaction patterns between firms and management environments. Agent-based models are the leading approach of this attempt. Agent-based models are increasingly used in different fields of economics and management. Many of these models fall into the field of Finance (J. D. Farmer, 2005; R. Cont, 2001; Stanley, 2002) and a very important part of them deals with Innovation and Diffusion processes (Dawid et al., 2001). Among the latter, some of the models
encompass the study of strategic behavior. However, those models do not usually account for both strategic behavior and the dynamics of the interactions among agents, in what concerns their interplay in unstable and hypercompetitive domains. Hypercompetitive domains are characterized as complex and adaptive environments, with non-linear behaviors, which emerge from an interdependent set of strategies (Stacey, 1995). Some authors (Eisenhardt and Sull, 2001) advocate strategizing by simple rules in high-velocity environments, in order to retain sufficient flexibility to make rapid decisions. Drawing on the biological evolution of ecosystems, research on hypercompetitive environments emphasizes the interdependence of different actors within a system (Eisenhardt and Galunic, 2000), where evolution takes place in a continual process of contradictory forces with positive and negative feedbacks. In those self-organizing environments, the currently most valuable strategy may become a losing one. In order to deal with such a complex subject, we adopt an agent-based approach and depart from the usual solutions in this type of models: each agent occupies a given position in the two-dimensional space, being represented by a set of weighted strategies. Strategies are continuously updated and depend on the strategies of the agent's nearest neighbors. At each time step, the agent performance is computed through a payoff function. As often happens, improvement of one agent's payoff is generally made at the expense of other agents. Typical examples of strategies are differentiation and low cost, innovation and market segmentation. One may think of the abstract representation of strategies we have adopted in our model as that of a highly adaptive system where specific tasks are performed without strongly committed configurations.
Our model is, therefore, a contribution to interweaving two lines of research that have progressed separately: business strategy models for complex environments and the agent-based economic literature, with a strong emphasis on spatial interactions. The rest of this paper is structured as follows. In section two we present the main features of the model. Section three comprises the simulation results and their analysis according to relevant scenarios. Section four concludes and outlines future work.
2. An agent-based model of business strategies - the features
2.1. Enterprise and Environment Characteristics
The model comprises a set of N agents which are randomly placed in the two-dimensional space (with periodic boundary conditions). Each agent has K strategies representing its available resources. To ensure the existence of a limited amount of resources, the vector of the strategies (E_i) is convex coupled. At each time step and based on their neighborhood, the agents are allowed to update their strategies according to the rules presented in section 2.3. The amount of influence of the neighborhood on the agent strategies depends on the value of a local parameter (D), whose specification follows a sigmoid function, as in (1):

V_{ik}(t) = \sum_{j=1,\, j \neq i}^{N} \frac{1}{1 + e^{-0.15\,(d_{ij} - D)}} \left( E_{jk}(t) - E_{ik}(t) \right)   (1)
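Under one reading of Eq. (1), each neighbour j contributes the strategy difference E_jk - E_ik, weighted by a sigmoid of the distance d_ij relative to the range parameter D. A minimal sketch (the 0.15 slope and the sign convention in the exponent follow the printed formula and may differ in the original):

```python
import math

# Sketch of one reading of Eq. (1): neighbour j pulls agent i's k-th strategy
# weight by the difference E_jk - E_ik, scaled by a sigmoid of the distance
# d_ij relative to the neighborhood-range parameter D.
def influence(E, d, i, k, D=30.0, slope=0.15):
    """V_ik(t): sigmoid-weighted sum of neighbours' strategy differences."""
    total = 0.0
    for j in range(len(E)):
        if j == i:
            continue
        w = 1.0 / (1.0 + math.exp(-slope * (d[i][j] - D)))   # sigmoid weight
        total += w * (E[j][k] - E[i][k])
    return total

# Two agents exactly at distance D: the sigmoid weight is 0.5.
E = [[1.0], [0.0]]
d = [[0.0, 30.0], [30.0, 0.0]]
print(influence(E, d, 0, 0))   # → -0.5
```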
2.2. Parameters
The model parameters allow for representing the following properties:
• Agent susceptibility (F_i) represents the agent's propensity to redistribute its strategic resources (a null value represents inertia and the corresponding impossibility of strategic change).
• Agent myopia (M_i) allows for weighting local and global competition asymmetrically.
• Agent mobility (S_i) specifies the agent's capacity to change its position in the two-dimensional space (null values represent static agents and unitary values represent agents with high mobility).
• Agent concentration (C_i) allows for the specification of a biased distribution of strategy weights (setting this parameter to one means that the agent invests all its strategic resources in a single strategy).
2.3. Agents Performance
The algorithm for updating strategies aims to maximize individual payoffs. The payoff specification has two components: the first accounts for the results of local interactions (between the agent and its nearest neighbors), the second accounts for the global effects. The computation of the local component (PL) is based on Porter's generic strategies theory, where differentiated products and services are intended to lead to higher profits. In this context, the global component (PG) is based on the environmental conditions and depends on the alignment between local and global strategies: the closer these strategies are, the higher the global agent performance. The agent payoff is computed as the sum of expressions (2) and (3):

PL_i(t) = \frac{1}{K} \sum_{k=1}^{K} \left| E_{ik}(t) - V_{ik}(t) \right|   (2)

PG_i(t) = 1 - \frac{1}{K} \sum_{k=1}^{K} \left| E_{ik}(t) - G_k(t) \right|   (3)
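Under this reading of Eqs. (2)-(3) - the local term rewards distance from the neighbourhood profile V_i (differentiation), the global term rewards closeness to the global strategy G (alignment) - the two components can be sketched as:

```python
# The two payoff components under this reading of Eqs. (2)-(3): the local term
# rewards differentiation (mean absolute distance from the neighbourhood term
# V_i), the global term rewards alignment with the global strategy G.
def local_payoff(E_i, V_i):
    """PL_i(t) = (1/K) * sum_k |E_ik - V_ik|."""
    return sum(abs(e - v) for e, v in zip(E_i, V_i)) / len(E_i)

def global_payoff(E_i, G):
    """PG_i(t) = 1 - (1/K) * sum_k |E_ik - G_k|."""
    return 1.0 - sum(abs(e - g) for e, g in zip(E_i, G)) / len(E_i)

# Total payoff is the sum of expressions (2) and (3):
print(local_payoff([1.0, 0.0], [0.0, 0.0]) + global_payoff([1.0, 0.0], [1.0, 0.0]))  # → 1.5
```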
2.4. Local and Global Dynamics
Usually, the dynamics of a model represents the conditions that influence its temporal evolution. Its description should consider structural properties and their evolution. As the agent payoff depends on both local and global components, updating the agent strategies must account for both local interaction and global environment conditions. From the local interaction point of view, an agent may choose its worst strategy (L_ik) and set it to the highest possible weight (one):

L_{ik} = \begin{cases} 1, & k = k_1 : V_{ik_1} = \min_k V_{ik} \\ 0, & k \neq k_1 \end{cases}   (4)

At the next time step, the agent strategies are computed as in (5):

E^L_{ik}(t+1) = (1 - F_i)\, E_{ik}(t) + F_i\, L_{ik}(t)   (5)

From the global point of view, the condition that determines the influence of the global environment on the agent strategies aims to maximize the alignment between them and the global strategies (G_k). Updating strategies also depends on the agent susceptibility (F_i), as defined in expressions (6) and (7):

E^G_{ik}(t+1) = (1 - F_i)\, E_{ik}(t) + F_i\, G_k(t)   (6)

E_{ik}(t+1) = (1 - F_i)\, E_{ik}(t) + F_i\, \frac{E^L_{ik}(t) + E^G_{ik}(t)}{2}   (7)
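One possible reading of the update rules (4)-(7) - put full weight on the strategy with minimal V_ik, blend toward the global strategy, then average the two provisional moves with susceptibility F_i - in code (the exact combination in Eq. (7) is a reconstruction):

```python
# One possible reading of update rules (4)-(7): put full weight on the
# strategy k1 with minimal V_ik (Eq. 4), form local and global provisional
# updates (Eqs. 5-6), then blend both with susceptibility F_i (Eq. 7).
def update_strategies(E_i, V_i, G, F):
    K = len(E_i)
    k1 = min(range(K), key=lambda k: V_i[k])               # Eq. (4): "worst" strategy
    L = [1.0 if k == k1 else 0.0 for k in range(K)]
    EL = [(1 - F) * E_i[k] + F * L[k] for k in range(K)]   # Eq. (5): local move
    EG = [(1 - F) * E_i[k] + F * G[k] for k in range(K)]   # Eq. (6): global move
    return [(1 - F) * E_i[k] + F * (EL[k] + EG[k]) / 2 for k in range(K)]  # Eq. (7)

print(update_strategies([0.5, 0.5], [1.0, 0.0], [0.0, 1.0], 0.0))  # F = 0: pure inertia
```

Note that with F_i = 0 the strategies are frozen (inertia), matching the description of the susceptibility parameter in section 2.2.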
Since our model relies on strategic differentiation, when an agent changes its position in space the Euclidean distance to its major competitor is intended to be enlarged. Having found that competitor, the agent must define an inverse rectilinear trajectory, whose length depends on the agent mobility coefficient (S_i). Before each move, the agent must check the boundary conditions to see if it is a valid move. In that case, the new position x_i' is computed as

x_i' = x_i + S_i \, d_{ij}   (8)
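Eq. (8) can be sketched as a step of length S_i * d_ij directly away from the chosen competitor; the periodic box width and the competitor-selection step are assumptions here.

```python
import math

# Sketch of the mobility rule in Eq. (8): the agent steps away from its major
# competitor along the inverse rectilinear trajectory, with step length
# S_i * d_ij. Competitor selection and the periodic box width are assumptions.
def move_away(x_i, x_comp, S, width=100.0):
    dx, dy = x_i[0] - x_comp[0], x_i[1] - x_comp[1]
    d = math.hypot(dx, dy)                 # Euclidean distance d_ij
    if d == 0.0:
        return x_i                         # degenerate case: same position
    # x_i' = x_i + S_i * d_ij in the direction away from the competitor,
    # wrapped by the periodic boundary conditions of section 2.1
    return ((x_i[0] + S * dx) % width, (x_i[1] + S * dy) % width)

print(move_away((10.0, 10.0), (5.0, 10.0), 0.5))  # → (12.5, 10.0)
```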
Finally, there is a population renewal rate (E) representing the lack of economic sustainability of agents operating under unpredictable markets. This parameter specifies the percentage of agents that, at each time step, will be replaced by new ones.
3. Simulation Results
The model validation was based on experimental results obtained from the simulations of a baseline and four specific scenarios. The baseline fixes the start-up referential, with a set of typical values for each parameter. The remaining four scenarios represent real enterprise environments, and are created by individual variations of each parameter (according to Table 1). Scenario 1 corresponds to highly unstable enterprise domains, with the global environment strategies continuously changing. The asymmetric influence of local and global competition is evaluated in scenario 2, which uses the agent myopia coefficient (M_i) to model this behavior. Scenario 3 allows for investigating the influence of strategic concentration on agents' performance, through the variation of the concentration coefficient (C_i). Finally, scenario 4 is tailored to represent geographic mobility. The results obtained rely on four main aspects: i) agent performance distribution, ii) influence of agent susceptibility to strategic change, iii) local versus global competition and iv) resource concentration during strategy definition.

Table 1. Scenario configuration (baseline values apply to every scenario except where noted)
Number of steps, R: 100
Number of agents, N: 1000
Number of strategies, K: 3
Mobility, S_i: 0 (Scenario 4: random)
Renewal rate, E: 0.1
Neighborhood range, D: 30
Susceptibility, F_i: random
External environment, G_k(t): random (Scenario 1: variable)
Agent myopia, M_i: random (Scenario 2: M_i = 0.5)
Strategic concentration, C_i: 0 (Scenario 3: C_i = 1 for agents 1-100, C_i = 0 for agents 101-1000)
3.1. Payoff Distribution
An important goal of this work is to characterize self-organizing properties emerging from the agents' behavior. To this end, the distribution of the performance of the agents
is evaluated in two different situations: at the beginning of the simulations and after 1000 iterations have been performed. The results show that in the latter situation, the distribution of the agents' payoff displays a power-law signature, as often happens in several real societies (Pareto, 1897). Moreover, the results also show that the values of the payoff function are significantly higher in stable environments than in unstable ones (scenario 1). When the role of an asymmetric influence is analysed (scenario 2), one verifies that it leads to higher payoffs, allowing us to conclude for the advantage of asymmetric valorization of local and global strategies. Finally, the agents with strategic concentration (scenario 3) are likely to obtain higher payoff values, as advocated by Porter's theory.
Figure 1. Payoff distribution: baseline and scenario 3
3.2. Susceptibility to Strategic Change
The ability of the agents to change their strategic resources in order to maximize their profits has assumed a relevant role in strategy theory. Our simulation results show a non-linear relation between the susceptibility (F_i) of the agents and their payoff values, in the baseline and in scenarios 1 and 2. In both cases, high inertia leads to high payoff values, while higher susceptibility leads to lower payoff values. One possible explanation is that continuous strategic redistribution implies increasing costs. In the opposite case, agents with reduced susceptibility may reach the best performances, since they waste few resources trying to align with the global dynamics.
Figure 2. Correlation between susceptibility and payoff: baseline and scenario 1
3.3. Strategic Concentration
According to Porter's theory, concentrating investment in one of the four generic strategies is fundamental to enlarging performance. As Figure 1 and Figure 3 show, the agents with concentrated behavior obtain higher payoffs. We can see in Figure 3 that those agents have their payoff around three main values: Pi = 1.42, Pi = 0.88 and Pi = 0.71. Although we do not find any reason for these three different types of distribution, we must highlight the low variance of the payoff function of such agents.
Figure 3. Payoff versus susceptibility, in the strategic concentration scenario
3.4. Local Competition versus Global Influence
The valorization of local competition versus global influence is evaluated from the values of the myopia coefficient. This coefficient allows the agents to asymmetrically weight local and global components. Figure 4 shows the correlation between payoff values and the agent myopia coefficient (M_i), in the baseline and in the instability scenario. Baseline results show a non-linear correlation which tends to benefit extreme values of M_i. In the instability scenario the results are interesting and unexpected: there is no apparent correlation. The specification of a major influence factor (local competition or global influence) seems to depend on the environmental conditions: under highly uncertain environments, the agents should pay more attention to local competition, since competitive advantages come mainly from differentiation from direct competitors. Otherwise, both factors should be equally considered.
Figure 4. Correlation between payoff and agent myopia: baseline and scenario 1
3.5. Agents Mobility
The final results concern the spatial (and self-organizing) patterns, such as those observed in Figure 5. Departing from a random disposition, we come to a final distribution formed by three main clusters. Based on the agent properties, such as susceptibility or mobility, and on performance indicators such as the payoff function, we try to identify possible cause-effect relations that could help to understand the emerging patterns. However, there seems to be no apparent correlation between those properties and the agent coordinates.
Figure 5. Final spatial disposition depending on agent payoff and agent mobility
4. Conclusions
Complexity sciences and agent-based modeling have proved to deal efficiently with the interdependent, non-linear and emergent factors of hypercompetitive environments. The results obtained from simulations confirm the emergence of a power-law distribution of the payoff of the agents. As the deterministic component of the model is reduced, such a distribution is of particular interest since it follows the typical rule of distribution of wealth in real economies. Moreover, this characteristic is absent at the initial setting, depending only on the dynamics of the agents. Regarding the asymmetric valorization of local and global competition, the results advise a high weight of local interaction under highly uncertain and random environments, i.e., major competitive advantages result from the capability to differentiate from the closest competitors. The third main finding concerns the non-linear relation between susceptibility and individual payoffs, namely the better performance of lower-susceptibility enterprises, suggesting an organizational penalization caused by continuous strategic adaptation. Finally, concentration of strategic investments on a single strategy appeared to be a crucial factor to maximize performance, in line with Porter's theory. The abstract model herein presented provides enough freedom to interpret and recognize strategies as specific properties of the model. The availability of experimental (raw) data on real enterprise information may provide less abstract instantiations focused on a specific and concrete reality. Another possible improvement is related to enterprise operations such as mergers, partnerships or direct acquisitions, since all these phenomena represent strategic actions in hypercompetitive environments. In this context, complexity sciences issues, plus their increasing application in social and economic areas, are envisioned to improve the present approach.
References
Araujo, T. and Mendes, R. (2001), Function and form in networks of interacting agents, arXiv:nlin.AO/0009018.
D'Aveni, R. (1994), Hyper-Competition, New York: The Free Press.
Dawid, Reimann, and Bullnheimer (2001), To Innovate or Not to Innovate?, IEEE Transactions on Evolutionary Computation, v. 5, n. 5.
Eisenhardt, K. and Martin, J. (2000), Dynamic capabilities: What are they?, Strategic Management Journal, 21, pp. 1105-1121.
Farmer, J. and Geanakoplos, J. (2005), Beyond Equilibrium and Efficiency, 352 pp., Oxford University Press. ISBN: 0-195-15094-5.
Mendes, R. (1998), Conditional exponents, entropies and a measure of dynamical self-organization, Phys. Lett. A248, pp. 167-171.
Mendes, R. (1998), Medidas de Complexidade e Auto-organização, Colóquio Ciências, 22, pp. 314.
Mendes, R. (2000), Structure-generating mechanisms in agent-based models, arXiv:nlin.AO/0009042.
Porter, M. (1980), Competitive Strategy: Techniques for Analyzing Industries and Competitors, New York: Free Press.
Prahalad, C. and Hamel, G. (1990), The core competence of the organization, Harvard Business Review, 68, pp. 79-91.
Stacey, R. (1995), The science of complexity: An alternative perspective for strategic change processes, Strategic Management Journal, 16, 6, pp. 477-495.
Stanley, H. et al. (2002), Self-organized Complexity in Economics and Finance, Proc. Natl Acad. Sci., 99-Suppl, pp. 2561-2565.
Chapter 29

Precursors of a phase transition in a simple model system

V. Halpern
Department of Physics, Bar-Ilan University, Ramat-Gan 52900, Israel
[email protected]

Abstract

Most theoretical and numerical studies of phase transitions are concerned with the thermodynamics and with critical phenomena such as correlation lengths and critical slowing down as the temperature T is lowered towards the critical temperature Tc of the phase transition. However, an understanding of the microscopic properties of the system is required in order to find simple precursors of the phase transition which indicate that it is imminent. Our studies show that the ferromagnetic 3-spin Potts model is a simple system in which these properties can readily be studied, and we have shown elsewhere that changes in the environments of the states with time and temperature are responsible for its relaxation properties. Our new calculations indicate that the most reliable precursor of the phase transition is a dramatic increase in the tendency of the spins at most sites, after changing, to return to their original values (which corresponds to the disappearance of diffusion of molecules in liquids as they freeze) as the temperature is lowered towards Tc. The relevance of these results to the behavior of other complex systems is discussed.
1. Introduction

Phase transitions in physical systems are typical of many types of transition that occur in complex systems. Other transitions that are potentially of a similar nature include the formation and disintegration of a strongly bound community or of an economic conglomerate. While such transitions often appear to occur quite suddenly when the appropriate conditions arise, there are usually some indications before they happen, which we call precursors. In the theory of phase transitions, these precursors are usually examined in terms of the thermodynamic properties of the system as the critical temperature Tc of the phase transition is approached. The reason for this is that these properties are very general and occur in a large range of systems, without any consideration of the microscopic properties. However, it is also interesting
to look for precursors in the microscopic properties of specific systems (or types of system), in order to be able to predict from them whether a phase transition is imminent. An understanding of these can lead to some ideas about when to expect such transitions, and also how to prevent or encourage them, both in similar physical systems and in more general complex systems. As a well-known example of a phase transition, which will guide us in the analysis of our model system, we consider first the transition between a liquid and a solid. One obvious apparent difference between these two states of matter is that the atoms or molecules in a solid make transitions much more slowly than in a liquid. A second difference is in the nature of the transitions. In a crystal the atoms mostly make transitions either by jumping into interstitial positions, which requires a high activation energy, or by moving across the grain boundary from one crystallite to another, which is also a rare process. Similar processes can also be defined for non-crystalline solids or glasses [Granato, 1992; Granato and Khonik, 2004]. In a liquid near the freezing temperature, on the other hand, molecules are much more mobile, and are continuously leaving and joining different sub-critical clusters. Thus, one possible precursor of a phase transition is a dramatic slowing down in the rate at which transitions occur, and another is a change in the predominant type of transition that occurs. In addition, the molecules in a liquid are free to diffuse over large distances, while in a solid there is very little or no diffusion. This implies that in the solid there is a strong correlation between the positions of the atoms at different times, which is not present in the liquid, so that a third possible precursor of the phase transition is a change in this correlation. As we discuss at the end of this paper, analogous processes can occur in other complex systems.
In this paper, we examine the approach to a phase transition in a very simple model system, namely the q-state ferromagnetic Potts model on a square lattice [Baxter, 1973]. In this system, there is associated with each site of the lattice a spin that can take any one of q distinct values, while there is an attractive interaction between spins having the same value which encourages the formation of blocks or clusters of identical spins as the temperature is lowered towards the critical temperature of the phase transition. This transition is second order for q ≤ 4 and first order for q > 4 [Baxter, 1973]. We have recently shown [Halpern, 2006] that the dependence on time and temperature of the relaxation properties of this system can readily be understood in terms of the changes in the environments of the sites, and that the system's behavior has some similarities to that of supercooled liquids. In that paper we only considered temperatures T above the critical temperature Tc(∞) of the infinite system, and were not particularly interested in the phase transition. By contrast, in this paper we examine the system's properties as the temperature passes through the critical temperature of the phase transition. For finite systems, with sides of length L, it is well known [Privman, 1990] that the phase transition temperature Tc(L) may differ from that of the infinite system, so that we do not know in advance the exact critical temperature for the finite systems examined in our simulations. However, this is not of crucial importance for examining the precursors of an impending phase transition. In section 2 we describe briefly the system, as well as our motivation for studying it, and present the results of our calculations. These results and their significance are discussed in section 3, at the end of which we consider their relevance to other complex systems. Our conclusions are summarized in section 4.
2. The Potts model

The Hamiltonian for the ordered ferromagnetic q-spin Potts model with interactions only between the spins at adjacent sites can be written as [Wu, 1982]

H = -J Σ_i Σ_{j(i)} δ(σ_i, σ_j),   (1)

where J > 0, the first sum is over all the sites i in the system and the second one over all the sites j(i) that are nearest neighbors of the site i, the spins σ_i can take any integer value between 1 and q, and δ(a,b) = 1 if a = b and 0 if a ≠ b. Hence, the local energy associated with a spin having z adjacent sites with the same spin is just -zJ. The probability w of a change in the spin at a site which involves an increase of energy ΔE at temperature T was taken to have the standard form

w = w0 exp(-ΔE/k_B T) for ΔE > 0, and w = w0 for ΔE ≤ 0.   (2)

Before describing the properties of this system, we will discuss briefly the reason why we were interested in it, since this affects the way that we analyze its results. The Potts model can be regarded as a schematic model for a plastic crystal, in which the molecules are often treated as thin rods with their centres fixed on the sites of a lattice [Bermejo et al, 2000]. If the possible positions of these rods are restricted to a finite number q of orientations, then each orientation can be represented by a different value of the spin in the q-spin Potts model, and the Hamiltonian of equation (1) favors all the rods being parallel. These plastic crystals have properties very similar to those of supercooled liquids [Benkhof et al, 1998], with random molecular orientations corresponding to the liquid state and a state with the orientations of the molecules frozen corresponding to the glassy (or crystalline) state. Hence, the behavior of the Potts model can be expected to shed some light on the properties of supercooled liquids, as we have shown elsewhere [Halpern, 2006]. For our simulations we considered a square lattice of 200 x 200 sites, which was large enough to give reproducible results for different runs, and we chose w0 = 0.5.
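As a concrete illustration of this update rule, the following sketch implements one Monte Carlo sweep of the q-state Potts model with the acceptance probability of equation (2). This is our own illustrative code, not the author's: the helper name potts_sweep, the lattice size and the random-number handling are all our choices; only the model and the rule w = w0 exp(-ΔE/k_B T) for ΔE > 0, w = w0 otherwise, come from the text (with k_B = 1 here).

```python
import numpy as np

rng = np.random.default_rng(0)

def potts_sweep(spins, T, q=3, J=1.0, w0=0.5):
    """One Monte Carlo sweep (L*L attempted spin changes) of the q-state
    ferromagnetic Potts model on a periodic square lattice, using the
    acceptance rule of Eq. (2): w = w0*exp(-dE/T) if dE > 0, else w0."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L, size=2)
        old = spins[i, j]
        new = (old + rng.integers(1, q)) % q      # propose a different spin value
        nbrs = [spins[(i + 1) % L, j], spins[(i - 1) % L, j],
                spins[i, (j + 1) % L], spins[i, (j - 1) % L]]
        # local energy is -J * (number of equal neighbors), so:
        dE = -J * (sum(n == new for n in nbrs) - sum(n == old for n in nbrs))
        w = w0 if dE <= 0 else w0 * np.exp(-dE / T)
        if rng.random() < w:
            spins[i, j] = new
    return spins

# anneal from a state of all identical spins, as in the paper
spins = np.zeros((16, 16), dtype=int)
potts_sweep(spins, T=1.2)
```

In a full simulation one would repeat such sweeps until a steady state is reached before collecting statistics.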
In order to save CPU time, we only performed extensive calculations for the case of 3 spins, q = 3, for which Tc(∞) = 1.005 J/k_B, where k_B is the Boltzmann constant [Baxter, 1973]. The simulation techniques used were the same as in our previous paper [Halpern, 2006], apart from starting the anneals from an initial state of all identical spins instead of one with random spins. Once a steady state was reached, a set of five successive simulation runs was performed on the system, with each run proceeding until the spin had changed at least once at 99% of the sites. In order to study the temperature dependence of the structure of the system and the nature of the transitions in it, we examined the fraction of sites cl4 within clusters of identical spins, i.e. with z = 4, and the fraction of sites ch4 at which the first change of spin occurs from within such a cluster. The reason for considering the first change of spin at a site rather than all the changes of spin is that the latter will be swamped by the sites at which the spin can change freely, even if their number is small. The other property that we study is the normalized fraction of sites at which the spin is unchanged, or correlation function, at the end of one run, fr1, and at the end of five runs, fr5. Since for two completely random systems the probability that the spins at any given site are equal is 1/q, this correlation function is related to the actual fraction s of sites at which the spin is unchanged by

fr = (s - 1/q)/(1 - 1/q) = (3s - 1)/2,   (3)
which is zero if there is no correlation between the initial and final states of the system and unity if these states are identical. In order to provide an indication of what our system looks like at various temperatures, the results that we show first, in figure 1, are maps of the system, where the three different colors correspond to the three different values of the spin. The color green tends to dominate the figures because the initial state from which the systems were obtained by annealing (or thermalizing) was one with green at all the sites.
Figure 1: Distribution of states for Tc(∞)/T = 0.95 (top), 1.00 (middle), and 1.01 (bottom).
From these maps, we can see clearly how the system consists of small clusters at high temperatures, and tends towards a single phase as the temperature is lowered. The big difference between the figures for Tc(∞)/T = 1.00 and for Tc(∞)/T = 1.01 suggests that the latter is close to the phase transition temperature. However, even for Tc(∞)/T = 1.05 we find that there are small pockets of spins that differ from the majority. The presence of these pockets is associated with the fact that for our system there is always a finite probability of a spin changing within a cluster, and any such change increases the probability that an adjacent spin changes.
Figure 2: The fraction of sites in clusters cl4 (black squares), the fraction of sites from which the first change of spin is made from within a cluster ch4 (red circles), and their ratio r = ch4/cl4 (blue stars), as functions of the reduced inverse temperature Tc(∞)/T.
We now turn to the possible precursors of a phase transition listed above. Our previous results [Halpern, 2006], together with our subsequent extensions of them, do not show any dramatic slowing down in the rates at which spins change as the temperature is lowered towards and beyond Tc. Accordingly we now turn to the second possible precursor, and examine whether there is a change in the predominant type of transition. For our system, this corresponds to a change from transitions of spins taking place at sites on the edge of clusters at temperatures well above Tc to transitions at sites within clusters below Tc. In figure 2, we show the fraction of sites within clusters cl4 and the fraction of sites ch4 at which the first change of spin occurs from within such a cluster, as functions of the reduced inverse temperature Tc(∞)/T, and also their ratio r = ch4/cl4. For an ideal single-phase system, all three of these quantities would be unity. The difference between ch4 and cl4 is just the fraction of sites initially inside clusters that have changed their environments before the spins on them first change. As the temperature is lowered, so that Tc(∞)/T increases, cl4 increases, as is to be expected from the thermodynamics of the system [Halpern, 2006]. There is also a more rapid increase in ch4, so that the ratio r of the fraction of sites at which the first transition occurs within clusters to the total fraction of sites within clusters increases. A detailed discussion of the source of these increases is beyond the scope of this paper. However, we do not find any dramatic increase in these quantities for values of Tc(∞)/T up to
1.05, which according to figure 1 should certainly include the phase transition temperature. This result is associated with the small pockets of different spins which are present in our systems at all temperatures, and which reduce considerably the fraction of sites cl4 inside clusters. The changes of spin at sites on the edges of these pockets require less energy than at sites inside clusters, and so the presence of these pockets also reduces appreciably the fraction of sites ch4 at which the first change of spin occurs from inside a cluster. Incidentally, we see from figure 2 that the rate of change with temperature of all these quantities does have a maximum (an inflexion point in the curves) close to Tc, but its position is not easy to measure accurately.
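The cluster fraction cl4 defined above (sites whose four nearest neighbors carry the same spin, i.e. z = 4) can be estimated from a spin configuration as in this sketch. This is our own illustrative helper, not the paper's code; periodic boundaries are assumed, as in the simulations.

```python
import numpy as np

def cluster_fraction(spins):
    """Fraction cl4 of sites whose four nearest neighbors (with periodic
    boundary conditions) all carry the same spin as the site itself."""
    same = np.ones_like(spins, dtype=bool)
    for axis in (0, 1):
        for shift in (1, -1):
            # compare every site with its neighbor shifted along this axis
            same &= spins == np.roll(spins, shift, axis=axis)
    return same.mean()

uniform = np.zeros((8, 8), dtype=int)
assert cluster_fraction(uniform) == 1.0   # ideal single phase: cl4 = 1
```

The fraction ch4 requires, in addition, tracking the environment of each site at the moment its spin first changes during a run, so it is a property of the dynamics rather than of a single configuration.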
Figure 3: The average correlation function (the reduced fraction of sites at which the spin has not changed) after a single run fr1 (red circles) and after five successive runs fr5 (black squares), and their ratio rf = fr5/fr1 (blue stars), as functions of the reduced inverse temperature Tc(∞)/T.
Finally, we consider the changes in the average values of the correlation function fr_n between the original values of the spins and their values after n successive runs, where in each run the spins at 99% of the sites have changed at least once. In figure 3, we show, as functions of the reduced inverse temperature Tc(∞)/T, the mean value of the correlation function fr1 for the five successive runs and its value fr5 after these five successive runs. The increase in both these quantities as the temperature is lowered is associated with the increase in cl4, since it can be shown that the return of a spin to its original value is more probable for sites inside clusters than for sites outside them. One expects that after five runs the correlation function will be less than after a single run unless most of the system (apart from the pockets of different spins) is in a frozen state, in which case they will tend to be equal. Thus the increase in these correlations and their tendency to equalize could be a precursor of the phase transition. As we see from figure 3, the ratio rf = fr5/fr1 increases rapidly with decreasing T, and is close to unity for Tc(∞)/T ≥ 1.02, i.e. below the critical temperature for the finite system, in accordance with the conclusions about its value that we deduced from figure 1.
3. Discussion

In the results presented in the last section, we found that for the Potts model there are three significant properties of the system that change as the temperature is reduced towards that of the phase transition. Firstly, the size of the clusters of sites having the same spin increases, as shown in figure 1, and there is a corresponding increase in the fraction of sites cl4 inside such clusters, as shown in figure 2. Secondly, the fraction of the sites initially inside a cluster at which the first transition occurs before any of the spins on adjacent sites have changed, the ratio r in figure 2, also increases. Finally, as shown in figure 3, the correlation function (the reduced fraction fr of sites at which a spin returns to its original value rather than to a random value) increases as the temperature is lowered, and becomes independent of the total number of changes of spin (rf = fr5/fr1 = 1) at temperatures below the critical temperature. Of these three precursors, only the last one shows a dramatic increase as T approaches Tc, and so provides a clear indication of the actual phase transition temperature. In order to interpret and appreciate the significance of these precursors, we return to the analogies of the glass transition in supercooled liquids and of the freezing of a liquid and melting of a solid. A change in the spin at a site in our model corresponds to the motion of an atom or of a molecule, and a cluster of sites with identical spins corresponds to a droplet of liquid or to a solid-like region, depending on how long its size and shape remain unchanged. In that case, for the freezing transition the first of our three precursors corresponds to an increase in the fraction of molecules in droplets and in the size of the droplets, which is rather obvious and does not provide a sensitive test of the location of the critical temperature.
The second of our precursors corresponds to an increasing fraction of molecules starting to diffuse via interstitial sites within a droplet, as proposed by Granato [Granato, 1992; Granato and Khonik, 2004], rather than freely between droplets. However, this does not show any dramatic change as the temperature is lowered, a result that corresponds to some recent experimental results on transitions in supercooled liquids as the glass temperature is approached [Huang and Richert, 2006]. By contrast, our third precursor corresponds to molecules that have left their original position (or orientation, as represented by the spin in our model) tending to return to it rather than diffusing freely away. This property is measured by the average value of the correlation function after different time intervals, and their ratio (corresponding to rf) is not affected by the presence of small pockets of unfrozen regions (provided that the first time interval is long enough for the molecules in these regions to diffuse away). Thus, just as for the Potts model, a rapid increase in this ratio is a clear precursor of the transition from a liquid to a frozen state, with the ratio reaching unity when the system freezes. We now turn to the implications of our results for other complex systems, and consider as a typical example a closely coordinated community of individuals. For such a community, the energy J in our model corresponds to the attractiveness (or reduction in unpleasantness) of belonging to the community, and the temperature T (which entered our transition probabilities in the ratio exp[-ΔE/k_B T]) to the stimulus to leave it. Just as our system is only in a frozen state for T < Tc, where Tc(∞) = 1.005 J/k_B, so for a community, even if the stimulus to leave it is very weak (low T), it will not be stable (T > Tc) and will eventually disintegrate if there are insufficient attractive forces to bind it together (very low J).
Similarly, in order for a group of individuals to unite to
form a stable community (or political party) it is essential that the advantages of belonging to that community outweigh those of being completely free. While these statements are fairly obvious (although not always remembered by dictators), our model shows that the existence of small pockets of dissenters does not necessarily lead to the dissolution of the community. In fact, it follows from our results for fr that such a community can be stable even if most of the members leave it temporarily, provided that most of those that leave return to it rapidly. Our model can easily be extended to allow for different values of J and T at different sites or in different regions, and an examination of the stability of a "frozen" state (fr = 1) under these conditions can shed light on the stability of a community in which the attractiveness of belonging and the temptation to leave are not uniform among all its members.
4. Conclusions

Our results show that the clearest precursor of the phase transition in the 3-state ferromagnetic Potts model is provided by the correlation function between the spins, which increases rapidly as the critical temperature is approached and becomes independent of the number of transitions made by the spins below the critical temperature. This result suggests that in other complex systems the correlation function between the states of the system at different times can also be used as a reliable test of whether the system is approaching a dramatic change in its properties.
References

Baxter, R. J., 1973, "Potts Model at the Critical Temperature", J. Phys. C: Solid State Phys. 6, L445.
Benkhof, S., Kudlik, A., Blochowicz, T. and Rossler, E., 1998, "Two glass transitions in ethanol: a comparative dielectric relaxation study of the supercooled liquid and the plastic crystal", J. Phys.: Condens. Matter 10, 8155.
Bermejo, F. J., Jimenez-Ruiz, M., Criado, A., Cuello, G. J., Cabrillo, C., Trouw, F. R., Fernandez-Perea, R., Lowen, H. and Fischer, H. E., 2000, "Rotational freezing in plastic crystals: a model system for investigating the dynamics of the glass transition", J. Phys.: Condens. Matter 12, A391.
Granato, A. V., 1992, "Interstitialcy model for condensed matter states of face-centered-cubic metals", Phys. Rev. Lett. 68, 974.
Granato, A. V. and Khonik, V. A., 2004, "An Interstitialcy Theory of Structural Relaxation and Related Viscous Flow of Glasses", Phys. Rev. Lett. 93, 155502.
Halpern, V., 2006, "Non-exponential relaxation and fragility in a model system and in supercooled liquids", J. Chem. Phys. 124, 214508.
Huang, W. and Richert, R., 2006, "Dynamics of glass-forming liquids. XI. Fluctuating environments by dielectric spectroscopy", J. Chem. Phys. 124, 164510.
Privman, V. (ed), 1990, Finite Size Scaling and Numerical Simulation of Statistical Systems, World Scientific (Singapore).
Wu, F. Y., 1982, "The Potts Model", Rev. Mod. Phys. 54, 235.
Chapter 30
Universality away from critical points in a thermostatistical model

C. M. Lapilli, C. Wexler, P. Pfeifer
Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, MO 65211
Nature uses phase transitions as powerful regulators of processes ranging from climate to the alteration of the phase behavior of cell membranes to protect cells from cold, building on the fact that the thermodynamic properties of a solid, liquid, or gas are sensitive fingerprints of intermolecular interactions. The only known exceptions from this sensitivity are critical points. At a critical point, two phases become indistinguishable and thermodynamic properties exhibit universal behavior: systems with widely different intermolecular interactions behave identically. Here we report a major counterexample. We show that different members of a family of two-dimensional systems - the discrete p-state clock model - with different Hamiltonians describing different microscopic interactions between molecules or spins may exhibit identical thermodynamic behavior over a wide range of temperatures. The results generate a comprehensive map of the phase diagram of the model and, by virtue of the discrete rotors behaving like continuous rotors, an emergent symmetry not present in the Hamiltonian. This symmetry, or many-to-one map of intermolecular interactions onto thermodynamic states, demonstrates previously unknown limits for the macroscopic distinguishability of different microscopic interactions.
1 Introduction
A far-reaching result in the study of phase transitions is the concept of universality, stating that entire families of systems behave identically in the neighborhood of a critical point, such as the liquid-gas critical point in a fluid, or the Curie point in a ferromagnet, at which two phases become indistinguishable. Near the
critical point, thermodynamic observables, such as magnetization or susceptibility, do not depend on the detailed form of the interactions between individual particles, and the critical exponents, which describe how observables go to zero or infinity at the transition, depend only on the range of interactions, symmetries of the Hamiltonian, and dimensionality of the system. The origin of this universality is that the system exhibits long-range correlated fluctuations near the critical point, which wash out the microscopic details of the interactions [1,2,3,4].
In this paper, we report a different type of strong universality. We present the surprising result that, in a specific family of systems, different members behave identically both near and away from critical points - we refer to this as extended universality - provided the temperature and a parameter p, describing the interaction between neighboring molecules, exceed certain values. In this regime, the thermodynamic observables collapse, in the sense that their values are identical for different values of p. No thermodynamic measurement in this regime reveals the details of the microscopic interaction in the Hamiltonian. This demonstrates intrinsic limits to how much information about the microscopic structure of matter can be obtained from macroscopic measurements. As the collapse maps Hamiltonians with different symmetries onto one and the same thermodynamic state, the system exhibits a symmetry not present in the microscopic Hamiltonian. The added symmetry at high temperature is the counterpart of broken symmetry at low temperature. To the best of our knowledge, no such collapse of thermodynamic observables and added symmetry have been observed before. The family under consideration is the p-state clock model, also known as the p-state vector Potts model or Zp model [5], in two dimensions, with Hamiltonian
H_p = -J_0 Σ_{⟨i,j⟩} s_i · s_j = -J_0 Σ_{⟨i,j⟩} cos(θ_i - θ_j),   (1.1)
where the spins, s_i, can make any of p angles θ_i = 2πn_i/p (n_i = 1, ..., p) with respect to a reference direction; the sum is over nearest neighbors of a square lattice; and the interaction is ferromagnetic, J_0 > 0. (In what follows we set J_0 = k_B = 1.) The number of directions, p, may be thought of as discrete orientations imposed on each spin by an underlying crystallographic lattice. The model interpolates between the binary spin up/down of the Ising model [6] and the continuum of directions in the planar rotor, or XY, model [7,8]. The model is of interest to study how the ferromagnetic phase transition in the Ising model, with spontaneously broken symmetry in the ferromagnetic phase, gives way to the Berezinskii-Kosterlitz-Thouless (BKT) transition, without broken symmetry, in the rotor model. For any p, neighboring spins in low- and high-energy configurations are parallel, s_i · s_j ≈ 1, and antiparallel, s_i · s_j ≈ -1, respectively. This model has been extensively studied since its conception [5]. Elitzur et al. [9] showed that it presents a rich phase diagram with varying critical properties: for p ≤ 4 it belongs to the Ising universality class, with a low-temperature ferromagnetic phase and a high-temperature paramagnetic phase; for p > 4 three phases exist: a low-temperature ordered and a high-temperature disordered phase, like in the Ising model, and a quasi-liquid intermediate phase. Duality transformations [9,10] and RG theory gave much insight into the phases in terms of a closely related, self-dual model, the generalized Villain model [11],

H_V = Σ_{⟨i,j⟩} [1 - cos(θ_i - θ_j)] + Σ_i Σ_p h_p cos(p θ_i),   (1.2)

where the θ_i's are now continuous and the h_p's are p-fold symmetry-breaking fields, similar to the crystal fields that limit the spins to p directions in the clock model. Jose et al. [12] have shown, via RG analysis, that for p < 4 the fields are relevant and the low-temperature phase is ordered, and that the fields are irrelevant for p > 4. While the p-state clock model is obtained in the limit of an infinite symmetry-breaking field, h_p → ∞, some properties of Eq. (1.2) for finite fields are still valid for its discrete counterpart, Eq. (1.1) [12,9]. But (1.1) is no longer self-dual for p > 4, and RG approximations regarding the influence of the discreteness of the angular variables are delicate near p = 6. As a result, the transition points of (1.1) in the three-phase region are not precisely known. The collapse of thermodynamic observables, or extended universality, sets in at a temperature Teu at which the system switches from a discrete-symmetry, p-sensitive state to a continuous-symmetry, p-insensitive state, indistinguishable from p = ∞, as the temperature increases and crosses Teu, for p > 4. For p ≤ 4, there is no collapse and the system retains its discreteness at arbitrarily high temperatures. The collapse (non-collapse) is responsible for the BKT (non-BKT) behavior of the transitions that lie above (below) Teu. In what follows, we focus on the determination of the phase diagram, including the curve Teu(p), the characterization of each phase, and the critical properties of the two transitions present at p > 4.
2 The Phase Diagram
We performed MC simulations on a square 2D lattice of size N = L x L with periodic boundary conditions. Lattice sizes ranged from L = 8 to 72, and averages for the computed quantities involved sampling of 10^5-10^6 independent configurations, with equilibration runs varying from p x (1,000-5,000) MC steps (a MC step is one attempt to change, on average, every lattice element). Figure 1 shows a summary of our results. The Ising model (p = 2) shows a single second-order phase transition at T_Ising = 2/ln(1 + √2) ≈ 2.27, in units of J_0/k_B. The p = 4 case also shows a single transition (in the Ising universality class) at T_c = T_Ising/2 ≈ 1.13. Most interesting is the case p > 4, which exhibits a low-temperature ordered phase (Ising-like), which turns into a phase with quasi-long-range order at T_1, and finally disorders at T_2. For T > Teu the identity of the original symmetry of the problem is lost, and all systems behave strictly like the planar rotor model (p = ∞), with a BKT transition at T_BKT ≈ 0.89 [8].
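The clock-model energy of Eq. (1.1) and the transition temperatures quoted above can be checked with a short sketch. This is our own illustrative code (the helper clock_energy and the lattice size are our choices, not the authors'); it only encodes Eq. (1.1) on a periodic square lattice and the exact Ising result T_Ising = 2/ln(1 + √2).

```python
import math
import numpy as np

def clock_energy(theta_idx, p, J0=1.0):
    """Energy of a p-state clock configuration per Eq. (1.1):
    H = -J0 * sum over nearest-neighbor pairs of cos(theta_i - theta_j),
    on a periodic square lattice; theta_idx holds the integers n_i."""
    theta = 2.0 * math.pi * theta_idx / p
    E = 0.0
    for axis in (0, 1):
        # each axis shift counts every nearest-neighbor bond along that axis once
        E += -J0 * np.cos(theta - np.roll(theta, 1, axis=axis)).sum()
    return E

# transition temperatures quoted in the text (units of J0/kB)
T_ising = 2.0 / math.log(1.0 + math.sqrt(2.0))
assert abs(T_ising - 2.27) < 0.01          # p = 2
assert abs(T_ising / 2.0 - 1.13) < 0.01    # p = 4

ground = np.zeros((4, 4), dtype=int)       # all spins aligned
assert abs(clock_energy(ground, p=6) + 2.0 * 16) < 1e-9   # -2*J0 per site
```

The aligned configuration gives -2 J_0 per site, since each site owns two of the four bonds attached to it.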
The transition temperatures are determined from our MC simulations as follows (for details see Ref. [13]). For the high-temperature transition T_2 we use
Figure 1: Phase diagram of the p-state clock model. The Ising model, p = 2, exhibits a single second-order phase transition, as does the p = 4 case, which is also in the Ising universality class. For p ≥ 6 a quasi-liquid phase appears, and the transitions at T_1 and T_2 are second-order. The line Teu separates the phase diagram into a region where the thermodynamic observables do depend on p, below Teu, and a region where their values are p-independent, above Teu (collapse of observables, extended universality). Thus for p ≥ 8, we observe Teu < T_2 = T_BKT.
Binder's fourth-order cumulants [14] in magnetization, U_L ≡ 1 - ⟨m⁴⟩/(3⟨m²⟩²), and energy, V_L ≡ 1 - ⟨e⁴⟩/(3⟨e²⟩²). The fixed point for U_L is used to determine the critical temperature T_2, whereas the latent heat is proportional to lim_{L→∞} [(2/3) - min_T V_L] (in all cases min_T V_L → 2/3, signaling a second-order phase transition). The transition between the ordered and the quasi-liquid phases, T_1, is analyzed via the temperature derivatives of the magnetization, ∂⟨|m|⟩/∂T, and of the cumulant, ∂U_L/∂T, both of which diverge in the thermodynamic limit [13]. Finite-size scaling (FSS) applied to the location of the minima of these quantities yields T_1 = lim_{L→∞} T_1(L). We find T_1 = 4π²/(γ_2 p²), with γ_2 ≈ 1.67 ± 0.02, whereas for the Villain model, in the limit of an infinite h_p and large p, Jose et al. [12] found T_1^(JKKN) ≈ 4π²/(1.7 p²). The ordered phase vanishes rapidly as p → ∞, see Fig. 1.
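The Binder cumulant used here can be computed from a series of samples as in this sketch (an illustrative estimator of ours, not the authors' code). U → 2/3 for a sharply ordered phase (constant |m|), and U → 0 for Gaussian disordered fluctuations, since then ⟨x⁴⟩ = 3⟨x²⟩².

```python
import numpy as np

def binder_cumulant(samples):
    """Fourth-order Binder cumulant U = 1 - <x^4>/(3 <x^2>^2) for a
    series of magnetization (or energy) samples x."""
    x = np.asarray(samples, dtype=float)
    return 1.0 - np.mean(x**4) / (3.0 * np.mean(x**2) ** 2)

# perfectly ordered phase (constant magnitude) -> U = 2/3
assert abs(binder_cumulant(np.ones(1000)) - 2.0 / 3.0) < 1e-12

# Gaussian (disordered) fluctuations -> U ~ 0, since <x^4> = 3<x^2>^2
rng = np.random.default_rng(1)
assert abs(binder_cumulant(rng.normal(size=200_000))) < 0.05
```

At the critical point, curves of U_L for different lattice sizes L cross at a nearly L-independent fixed-point value, which is how T_2 is located in the text.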
Figure 2: Heat capacity (left panel), magnetization (center panel), and difference of internal energy per spin relative to the XY model (right panel). The data correspond to a system size L = 72 (N = 5,184 spins). Note that all curves approach each other above some T_eu (for p ≥ 5). Figure 2 shows thermodynamic properties of the clock model as obtained from our MC simulations: the heat capacity per spin at zero external field, C_F ≡ (⟨H^2⟩ - ⟨H⟩^2)/(L^2 T^2), and the magnetization per spin, defined as ⟨m⟩ ≡ ⟨M⟩/L^2 = ⟨|(Σ_i cos θ_i, Σ_i sin θ_i)|⟩/L^2. The Ising-like behavior for p = 4 and the three-phase behavior for p ≥ 6 are evident. Figure 2 proves the collapse of the thermodynamic observables: C_F and ⟨|m|⟩ are manifestly p-independent for p > 4 and T > T_eu, with
T_eu(p) ≈ 4π^2/(p^2 T_BKT),    (1.3)

where T_BKT ≈ 0.89; and the internal energy difference sharply drops to zero at T = T_eu.
The collapse of thermodynamic variables and the specific form of T_eu(p) can be understood as follows. (i) The large-p, small-(θ_i - θ_j) expansion of Eq. (1.1) yields a characteristic temperature (2π/p)^2 above which the thermodynamic observables become p-independent, as T/(2π/p)^2 ≫ 1 [13]. This implies an asymptotic collapse of thermodynamic observables. (ii) Elitzur et al. [9] noted that the discreteness of the angles θ_i in Eq. (1.1) becomes irrelevant for the critical properties of the system for sufficiently large p. This implies a collapse of thermodynamic variables at the critical points T2. A similar irrelevance of the discreteness of the angles, imposed by h_p → ∞, had been previously observed in the generalized Villain model, Eq. (1.2) [12]. There it was shown that a necessary condition for the discreteness to become irrelevant is T > 4π^2/(p^2 T_K), where T_K ≈ 1.35 is the BKT point of model (1.2). This suggests that a necessary condition for the collapse of thermodynamic observables in the p-state clock model is T > 4π^2/(p^2 T_BKT). The fit of our numerical data for T_eu(p) [13], yielding Eq. (1.3), shows that the condition is satisfied as an equality and is both necessary and sufficient. The implications of the collapse above T_eu are crucial for the understanding of critical properties at the transition point T2. We observe that T2(p ≥ 8) > T_eu, which implies that the transition at T2 for p ≥ 8 must be BKT. Previous work focused only on the plausibility of such behavior. For the purposes of this discussion, we take BKT behavior to mean the following set of properties of the planar rotor model [8]: (i) a universal discontinuous jump to zero of the helicity modulus, Υ|_{T_BKT} = 2T_BKT/π; (ii) the exponential divergence of the correlation length, ξ ~ exp[c/|T - T_BKT|^{1/2}]; (iii) the temperature-dependent power-law decay of two-point correlation functions, with a critical exponent at T2 given by η(T_BKT) = 1/4; and (iv) an exponent related to the decay of the magnetization given by 3π^2/128 [15]. In fact, our numerical simulations show these features being satisfied, giving a robust confirmation of the nature of the high-temperature phase transition for p ≥ 8. It is also evident from numerical data [13] that the high-temperature phase transition T2(p < 8) is not completely BKT-like. These facts highlight the importance of where the extended universality (T > T_eu) occurs (see Fig. 1).
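The extended-universality condition can be checked numerically; a small sketch (our own) evaluating Eq. (1.3), T_eu(p) = 4π^2/(p^2 T_BKT), shows that T_eu falls below T_BKT precisely for p ≥ 8, consistent with T2(p ≥ 8) = T_BKT > T_eu:

```python
import math

T_BKT = 0.89  # BKT temperature of the planar rotor model, in units of J0/kB

def T_eu(p):
    """Collapse temperature of Eq. (1.3): T_eu(p) = 4*pi^2 / (p^2 * T_BKT)."""
    return 4 * math.pi**2 / (p**2 * T_BKT)

for p in (6, 7, 8, 16, 32):
    # T_eu drops below T_BKT only from p = 8 onward
    print(p, round(T_eu(p), 3), T_eu(p) < T_BKT)
```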
3
Summary, Implications, and Open Questions
In our study we obtained valuable evidence from the macroscopic properties of the Zp model, a model that, although completely discrete, shows regimes with continuous-like thermodynamic behavior. We have presented the phase diagram for the Zp model, analyzing its critical properties. The three-phase regime was observed for p > 4, in agreement with earlier predictions. Of particular interest is the surprising extended universal behavior above some temperature T_eu, where the identity of the Zp model is completely lost as all observables become indistinguishable from those of the XY model. In fact, the presence of an "exact" BKT transition at the point T2 for p ≥ 8 is now firmly established as a consequence of the existence of this temperature line T_eu, which divides the phase diagram into two regions: with and without a collapse of the thermodynamic observables. This extended universal behavior is not present at T2(p < 8), since T2(p < 8) < T_eu. These conclusions were confirmed by studying the critical properties at T2 (indeed, for p < 8 critical exponents and the helicity do not behave as expected from the BKT RG equations). Our studies raise important questions: If observables below T1 show ferromagnetic ordering with a significant p-dependence, and for T > T_eu all information about p is lost, what is the nature of the region T1 < T < T_eu? What are the collective excitations that make the systems thermodynamically indistinguishable above T_eu? Why is the extended universal behavior approached so rapidly (for p > 4), and what is qualitatively different for smaller p, since no temperature exists at which this degeneracy is achieved? A very broad variety of systems, from confined turbulent flows to statistical ecology models, show collapsing probability distribution functions in finite-sized systems, suggesting that scaling is independent of various system attributes such as symmetry, state (equilibrium/non-equilibrium), etc. [16].
We present a stronger result in the sense that all observables become identical for T > T_eu: the critical properties' collapse is a consequence of the extended universality. The existence of this collapse of the thermodynamic observables implies that,
experimentally, any observable ⟨O⟩ of the system measured at temperatures above T_eu will fail to show any signature of the underlying discreteness, i.e. ⟨O⟩_p = ⟨O⟩_∞. The corollary is that in the presence of this extended universality, lower-temperature measurements are necessary if a complete characterization of the symmetry of a system is desired, as may be expected in a wide range of experimental situations where XY-like behavior is observed [17]. Experiments in a wide variety of physical systems, from ultra-thin magnetic films to linear polymers adsorbed on a substrate, may show signatures of these effects. It may imply, e.g., that the critical properties at the melting transition of certain adsorbed polymer films may be unaffected by the symmetry of the substrate. Further details about the results presented in this letter and additional properties, plus some ideas on how to address the questions posed above, will be published elsewhere [13]. We would like to thank H. Fertig, G. Vignale, and H. Taub for useful discussions. Acknowledgment is made to the University of Missouri Research Board and Council, and to the Donors of the Petroleum Research Fund, administered by the American Chemical Society, for support of this research.
don). [2] Kadanoff, L.P., et al., 1967, Static Phenomena Near Critical Points: Theory and Experiment, Rev. Mod. Phys. 39, 395; Migdal, A.A., 1976, Phase transitions in gauge and spin lattice systems, Sov. Phys. JETP 42, 743. [3] Kadanoff, L.P., in Green, M.S. (ed.), 1970, Proceedings of the 1970 Varenna Summer School on Critical Phenomena, Academic Press (New York); Griffiths, R.B., 1970, Dependence of Critical Indices on a Parameter, Phys. Rev. Lett. 24, 1479. [4] See, e.g., Wigner, E., 1964, Symmetry and conservation laws, Physics Today, March, p. 34. [5] Potts, R., 1952, Proc. Camb. Phil. Soc. 48, 106. [6] Onsager, L., 1944, Crystal Statistics. I. A Two-Dimensional Model with an Order-Disorder Transition, Phys. Rev. 65, 117. [7] Mermin, N.D., & Wagner, H., 1966, Absence of ferromagnetism or antiferromagnetism in one- or two-dimensional isotropic Heisenberg models, Phys. Rev. Lett. 17, 1133; Hohenberg, P.C., 1967, Existence of long-range order in one and two dimensions, Phys. Rev. 158, 383. [8] Berezinsky, V.L., 1970, Destruction of long-range order in one-dimensional and two-dimensional systems having a continuous symmetry group I. Classical systems, Sov. Phys. JETP 32, 493; Kosterlitz, J.M., & Thouless, D.J., 1973, Ordering, metastability and phase transitions in two-dimensional systems, J. Phys. C 6, 1181; Kosterlitz, J.M., 1974, The critical properties of the two-dimensional xy model, J. Phys. C 7, 1046; Nelson, D.R., & Kosterlitz, J.M., 1977, Universal jump in the superfluid density of two-dimensional superfluids, Phys. Rev. Lett. 39, 1201.
[9] Elitzur, S., Pearson, R.B., & Shigemitsu, J., 1979, Phase structure of discrete Abelian spin and gauge systems, Phys. Rev. D 19, 3698. [10] Savit, R., 1980, Duality in field theory and statistical systems, Rev. Mod. Phys. 52, 453. [11] Villain, J., 1975, Theory of one- and two-dimensional magnets with an easy magnetisation plane. II. The planar, classical, two-dimensional magnet, J. Physique 36, 581. [12] José, J.V., Kadanoff, L.P., Kirkpatrick, S., & Nelson, D.R., 1977, Renormalization, vortices, and symmetry-breaking perturbations in the two-dimensional planar model, Phys. Rev. B 16, 1217. [13] Lapilli, C.M., Pfeifer, P., & Wexler, C., in preparation. [14] Binder, K., & Heermann, D.W., 2002, Monte Carlo Simulation in Statistical Physics, Springer (Berlin), 4th ed. [15] Bramwell, S.T., & Holdsworth, P.C.W., 1993, J. Phys.: Condens. Matter 5, L53. [16] See, e.g., Aji, V., & Goldenfeld, N., 2001, Fluctuations in Finite Critical and Turbulent Systems, Phys. Rev. Lett. 86, 1007, and references therein. [17] E.g., Faßbender, S., et al., 2002, Evidence for Kosterlitz-Thouless-type orientational ordering of CF3Br monolayers physisorbed on graphite, Phys. Rev. B 65, 165411; observed a BKT-like transition for CF3Br adsorbed on graphite.
Chapter 31
Quantum Nash Equilibria and Quantum Computing

Philip Vos Fellman
Southern New Hampshire University

Jonathan Vos Post
Computer Futures
In 2004, at the Fifth International Conference on Complex Systems, we drew attention to some remarkable findings by researchers at the Santa Fe Institute (Sato, Farmer and Akiyama, 2001) about hitherto unsuspected complexity in the Nash equilibrium. As we progressed from these findings about heteroclinic Hamiltonians and chaotic transients hidden within the learning patterns of the simple rock-paper-scissors game to some related findings on the theory of quantum computing, one of the arguments we put forward was that, just as in the late 1990s a number of new Nash equilibria were discovered in simple bi-matrix games (Shubik and Quint, 1996; Von Stengel, 1997, 2000; and McLennan and Park, 1999), we would begin to see new Nash equilibria discovered as the result of quantum computation. While actual quantum computers remain rather primitive (Toibman, 2004), and the theory of quantum computation seems to be advancing perhaps a bit more slowly than originally expected, there have nonetheless been a number of advances in computation, and some more radical advances in an allied field, quantum game theory (Huberman and Hogg, 2004), which are quite significant. In the course of this paper we will review a few of these discoveries and illustrate some of the characteristics of these new "Quantum Nash Equilibria". The full text of this research can be found at http://necsi.org/events/iccs6/viewpaper.php?id=234
1.0 Meyer's Quantum Strategies: Picard Was Right In 1999, David Meyer, already well known for his work on quantum computation and modeling (Meyer, 1997), published an article in Physical Review Letters entitled "Quantum Strategies", which has since become something of a classic in the field (Meyer, 1999). In this paper, in a well known, if fictional, setting, Meyer analyzed the results of a peculiar game of coin toss played between Captain Jean-Luc Picard of the Starship Enterprise and his nemesis "Q". In this game, which Meyer explains as a two-person, zero-sum (noncooperative) strategic game, the payoffs to Picard and "Q" (P and Q hereafter) are represented by the following matrix showing the possible outcomes after an initial state (heads or tails) and two flips of the coin (Meyer, 1999):
        NN    NF    FN    FF
  N     -1     1     1    -1
  F      1    -1    -1     1
... The rows and columns are labeled by P's and Q's pure strategies respectively; F denotes a flipover and N denotes no flipover, and the numbers in the matrix are P's payoffs, 1 indicating a win and -1 indicating a loss (p. 1052). Meyer notes that this game has no deterministic solution and that there is no deterministic Nash equilibrium. However, he also notes (following von Neumann) that since this is a two-person, zero-sum game with a finite number of strategies, there must be a probabilistic Nash equilibrium, which consists of Picard randomly flipping the penny over half of the time and Q randomly alternating between his four possible strategies. The game unfolds in a series of ten moves, all of which are won by Q. Picard suspects Q of cheating. Meyer's analysis proceeds to examine whether this is or is not the case. There follows, then, an analysis, using standard Dirac notation, of the quantum vector space and the series of unitary transformations on the density space which have the effect of taking Picard's moves (now defined not as a stochastic matrix on a probabilistic state, but rather as a convex linear combination of unitary, deterministic transformations on density matrices by conjugation) and transforming them by conjugation (Q's moves). This puts the penny into a simultaneous eigenvalue 1 eigenstate of both F and N (invariant under any mixed strategy), or in other words, it causes the penny to "finish" heads up no matter what ("All of the pairs ([pF + (1-p)N], [U(1/sqrt2, 1/sqrt2), U(1/sqrt2, 1/sqrt2)]) are (mixed, quantum) equilibria for PQ penny flipover, with value -1 to P; this is why he loses every game." (p. 1054)). A more detailed treatment of this game can be found in InterJournal Complex Systems, 1846, 2006.
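Meyer's result is easy to reproduce with 2x2 matrices. In this minimal sketch (our own notation, not Meyer's code), Q's move U(1/sqrt2, 1/sqrt2) is the Hadamard transform; whatever P does in between, classical flip (F) or no flip (N), the penny returns to heads:

```python
import numpy as np

H2 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Q's move U(1/sqrt2, 1/sqrt2)
N = np.eye(2)                                   # P: no flipover
F = np.array([[0, 1], [1, 0]])                  # P: flipover
heads = np.array([1.0, 0.0])                    # penny starts heads up

for P_move in (N, F):
    final = H2 @ P_move @ H2 @ heads            # Q, then P, then Q
    print(abs(final[0])**2)                     # probability of heads is 1 either way
```

Since both of P's pure moves leave the penny heads up, any mixture pF + (1-p)N does as well, which is exactly why P loses every game.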
2.0 Superpositioning The PQ coin flip does not, however, explore the properties of quantum entanglement. Landsberg (2004) credits the first complete quantum game to Eisert, Wilkens and Lewenstein (1999), whose game provides for a single, entangled state space for a pair of coins. Here, each player is given a separate quantum coin which can then either be flipped or not flipped. The coins start in the maximally entangled state H ⊗ H + T ⊗ T, which in a two-point strategy space allows four possible states (Landsberg, 2004):¹

NN = H ⊗ H + T ⊗ T
NF = H ⊗ T + T ⊗ H
FN = H ⊗ T - T ⊗ H
FF = H ⊗ H - T ⊗ T

The strategies for this game can then be described in terms of strategy spaces which can be mapped into a series of quaternions. The Nash equilibria which occur in both the finite and (typically) non-finite strategy sets of these games can then be mapped into Hilbert spaces where, indeed, new Nash equilibria do emerge, as shown in the diagram from Cheon and Tsutsui below (Cheon and Tsutsui, 2006). As in the case of the quantum Nash equilibria for the PQ game, unique quantum Nash equilibria are a result of the probability densities arising from the action of self-adjoint quantum operators on the vector matrices which represent the strategic decision spaces of their respective games.
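These four (unnormalized) states are just the Bell basis of the two-coin space; a quick numerical check (our own sketch) confirms they are mutually orthogonal:

```python
import numpy as np

H = np.array([1.0, 0.0])   # |H>
T = np.array([0.0, 1.0])   # |T>

NN = np.kron(H, H) + np.kron(T, T)
NF = np.kron(H, T) + np.kron(T, H)
FN = np.kron(H, T) - np.kron(T, H)
FF = np.kron(H, H) - np.kron(T, T)

# Normalized, the four states form an orthonormal basis (the Bell basis):
B = np.stack([NN, NF, FN, FF]) / np.sqrt(2)
print(np.allclose(B @ B.T, np.eye(4)))  # True
```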
Above: Solvable quantum Nash equilibria on Hilbert spaces (Cheon and Tsutsui). Note the positioning of new Nash equilibria on a projective plane from the classical solution. A more detailed description of this solution can be found at http://arxiv.org/PS_cache/quant-ph/pdf/0503/0503233.pdf
3.0 Realizable Quantum Nash Equilibria Perhaps the most interesting area for the study of quantum Nash equilibria is coordination games. Drawing on the quantum properties of entangled systems, quantum coordination games generate a number of novel Nash equilibria. Iqbal and Weigert (2004) have produced a detailed study of the properties of quantum correlation games, mapping both invertible and discontinuous g-functions and non-invertible and discontinuous g-functions (as well as simpler mappings) arising purely from the quantum coordination game and not reproducible from the classical games.
1 We have slightly altered Landsberg's and Eisert, Wilkens and Lewenstein's notation to reflect that of Meyer's original game for purposes of clarity.
Of possibly more practical interest is Huberman and Hogg's study (2004) of coordination games, which employs a variant of non-locality familiar from the EPR paradox and Bell's theorem (also treated in detail by Iqbal and Weigert) to allow players to coordinate their behavior across classical barriers of time and space (see schematic below). Once again, the mathematics of entangled coordination are similar to those of the PQ quantum coin toss game and use the same kind of matrix which is fully elaborated in the expression of quantum equilibria in Hilbert space.
Decisions are coordinated even if made at different times. (Above) Huberman and Hogg's entanglement mechanism. In a manner similar to the experimental devices used to test Bell's theorem, two entangled quanta are created and sent to different players (who may receive and measure them at different times), who then use their measurements to coordinate game behavior. (Huberman and Hogg, 2004).
4.0 Quantum Entanglement and Coordination Games In a more readily understandable practical sense, the coordination allowed by quantum entanglement creates the possibility of significantly better payoffs than classical equilibria. A quantum coordinated version of rock-paper-scissors, for example, where two players coordinate against a third produces a payoff asymptotic to 1/3 rather than 1/9. Moreover, this effect is not achievable through any classical mechanism since such a mechanism would involve the kind of prearrangement which would then be detectable through heuristics such as pattern matching or event history (Egnor, 2001). This kind of quantum Nash equilibrium is both realizable through existing computational mechanisms and offers significant promise for applications to cryptography as well as to strategy.
4.1 The Minority Game and Quantum Decoherence Two other areas which we have previously discussed are the Minority Game, developed at the Santa Fe Institute by Challet and Zhang, and the problem of decoherence in quantum computing. J. Doyne Farmer (Farmer, 1999) uses the Minority Game to explain learning trajectories in complex, non-equilibrium strategy spaces as well as to lay the foundation for the examination of complexity in learning the Nash equilibrium in the rock-paper-scissors game (Sato, Akiyama and Farmer, 2001). Adrian Flitney (Flitney and Abbott, 2005; Flitney and Hollenberg, 2005), who has done extensive work in quantum game theory, combines both of these areas in a recent paper examining the effects of quantum decoherence on superior new quantum Nash equilibria.
4.2 Flitney and Abbott's Quantum Minority Game Flitney and Abbott then proceed through a brief literature review, explaining the standard protocol for quantizing games, by noting that "If an agent has a choice between two strategies, the selection can be encoded in the classical case by a bit," and that "to translate this into the quantum realm the bit is altered to a qubit, with the computational basis states |0⟩ and |1⟩ representing the original classical strategies." (p. 3) They then proceed to lay out the quantum Minority Game, essentially following the methodology used by Eisert, Wilkens and Lewenstein for the quantum prisoner's dilemma, specifying that (p. 3): The initial game state consists of one qubit for each player, prepared in an entangled GHZ state by an entangling operator Ĵ acting on |00...0⟩. Pure quantum strategies are local unitary operators acting on a player's qubit. After all players have executed their moves the game state undergoes a positive operator valued measurement and the payoffs are determined from the classical payoff matrix. In the Eisert protocol this is achieved by applying Ĵ† to the game state and then making a measurement in the computational basis. That is, the state prior to the measurement in the N-player case can be computed by:
|ψ0⟩ = |00...0⟩,
|ψ1⟩ = Ĵ |ψ0⟩,
|ψ2⟩ = (M̂1 ⊗ M̂2 ⊗ ... ⊗ M̂N) |ψ1⟩,
|ψf⟩ = Ĵ† |ψ2⟩,
where |ψ0⟩ is the initial state of the N qubits, and M̂k, k = 1, ..., N, is a unitary operator representing the move of player k. The classical pure strategies are represented by the identity and the flip operator. The entangling operator Ĵ commutes with any direct product of classical moves, so the classical game is simply reproduced if all players select a classical move.
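The protocol above can be sketched numerically. In this illustrative construction (ours, not Flitney and Abbott's code) we take the entangler to be Ĵ = (I^⊗N + i X^⊗N)/sqrt(2), a common choice in the Eisert scheme that produces a GHZ-type state from |00...0⟩; the function names are our own.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])          # the classical flip move

def kron_all(ops):
    return reduce(np.kron, ops)

def eisert_final_state(moves):
    """|psi_f> = J^dag (M_1 x ... x M_N) J |00...0> with the entangler
    J = (I^{xN} + i X^{xN}) / sqrt(2), which prepares a GHZ-type state."""
    n = len(moves)
    J = (kron_all([I2] * n) + 1j * kron_all([X] * n)) / np.sqrt(2)
    psi0 = np.zeros(2**n, dtype=complex)
    psi0[0] = 1.0                                # |00...0>
    return J.conj().T @ kron_all(moves) @ J @ psi0

# J commutes with any direct product of classical moves, so purely classical
# play is reproduced: players 1 and 3 flipping yields |1010> with certainty.
probs = np.abs(eisert_final_state([X, I2, X, I2]))**2
print(int(np.argmax(probs)))   # 10, the index of basis state |1010>
```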
4.3 Decoherence Flitney and Hollenberg explain the choice of density matrix notation for decoherence, and the phenomena which they are modeling, describing dephasing in terms of exponential decay over time of the off-diagonal elements of the density matrix, and dissipation by way of amplitude damping. Decoherence is then represented and generalized accordingly. The new equilibria are distinguished by an integer parameter η; the NE that arises from selecting η = 0 may serve as a focal point for the players and be selected in preference to the other equilibria.
However, if the players select strategies corresponding to different values of η, the result may not be a NE. For example, in the four-player MG, if the players select η_A, η_B, η_C, and η_D respectively, the resulting payoff depends on (η_A + η_B + η_C + η_D). If the value is zero, all players receive the quantum NE payoff of 1/4; if it is one or three, the expected payoff is reduced to the classical NE value of 1/8; while if it is two, the expected payoff vanishes. As a result, if all the players choose a random value of η, the expected payoff is the same as that for the classical game (1/8) where all the players selected 0 or 1 with equal probability. Analogous results hold for the quantum MG with larger numbers of players. When N is odd the situation is changed. The Pareto optimal situation would be for (N - 1)/2 players to select one alternative and the remainder to select the other. In this way the number of players that receive a reward is maximized. In the entangled quantum game there is no way to achieve this with a symmetric strategy profile. Indeed, all quantum strategies reduce to classical ones and the players can achieve no improvement in their expected payoffs. The NE payoff for the N even quantum game is precisely that of the N - 1 player classical game where each player selects 0 or 1 with equal probability. The effect of the entanglement and the appropriate choice of strategy is to eliminate some of the least desired final states, those with equal numbers of zeros and ones. The difference in behaviour between odd and even N arises since, although in both cases the players can arrange for the final state to consist of a superposition with only even (or only odd) numbers of zeros, only in the case when N is even is this an advantage to the players.
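Taking the payoff assignments quoted above at face value, and assuming each η is drawn uniformly from {0, 1, 2, 3} (our assumption for this sketch), the fallback to the classical expected payoff of 1/8 under random η can be checked by enumeration:

```python
from itertools import product

# Payoff per player in the 4-player quantum MG as a function of
# (eta_A + eta_B + eta_C + eta_D) mod 4, as quoted in the text:
payoff = {0: 1/4, 1: 1/8, 2: 0.0, 3: 1/8}

# With each eta drawn independently and uniformly from {0, 1, 2, 3},
# the sum mod 4 is uniform, so the expectation is (1/4 + 1/8 + 0 + 1/8)/4:
mean = sum(payoff[sum(etas) % 4] for etas in product(range(4), repeat=4)) / 4**4
print(mean)   # 0.125, the classical NE value of 1/8
```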
5.0 The Many Games Interpretation of Quantum Worlds So, after a rather roundabout journey from the bridge of the Enterprise, we now have a number of quantum games with quantum Nash equilibria which are both uniquely distinguishable from the classical games and classical equilibria (Iqbal and Weigert; Cheon and Tsutsui; Flitney and Abbott; Flitney and Hollenberg; Landsberg),
but we also have an interesting question with respect to quantum computing, which is what happens under conditions of decoherence. Not unexpectedly, the general result of decoherence is to reduce the quantum Nash equilibrium to the classical Nash equilibrium; however, this does not happen in a uniform fashion. As Flitney and Hollenberg explain: The addition of decoherence by dephasing (or measurement) to the four-player quantum MG results in a gradual diminution of the NE payoff, ultimately to the classical value of 1/8 when the decoherence probability p is maximized, as indicated in figure 6. However, the strategy given by Eq. (12) remains a NE for all p < 1. This is in contrast with the results of Johnson for the three-player "El Farol bar problem" and of Özdemir et al. for various two-player games in the Eisert scheme, who showed that the quantum optimization did not survive above a certain noise threshold in the quantum games they considered. Bit, phase, and bit-phase flip errors result in a more rapid relaxation of the expected payoff to the classical value, as does depolarization, with similar behaviour for these error types for p < 0.5.
6.0 Conclusion Quantum computing does, indeed, give rise to new Nash equilibria, which belong to several different classes. Classical or apparently classical games assume new dimensions, generating a new strategy continuum, and new optima within and tangential to the strategy spaces as a function of quantum mechanics. A number of quantum games can also be mathematically distinguished from their classical counterparts and have Nash equilibria different from those arising in the classical games. The introduction of decoherence, both as a theoretical measure and, perhaps more importantly, as a performance measure of quantum information processing systems, illustrates the ways in which quantum Nash equilibria are subject to conditions of "noise" and system performance limitations. The decay of higher-optimality quantum Nash equilibria to classical equilibria is itself a complex and nonlinear process following different dynamics for different species of errors. Finally, non-locality of the EPR type, bearing an as yet incompletely understood relationship to Bell's Theorem, offers a way in which quantum communication can be introduced into a variety of game theoretic settings, including both strategy and cryptography, in ways which profoundly modify attainable Nash equilibria. While the field has been slow to develop and most of the foundational research has come from a relatively small number of advances, the insights offered by these advances are profound and suggest that quantum computing will radically impact the fields of decision-making and communications in the near future.
Bibliography
[1] Benjamin, S.C., and Hayden, P.M. (2001) "Multiplayer Quantum Games", Physical Review A, 64, 030301(R)
[2] Cheon, T. and Tsutsui, I. (2006) "Classical and Quantum Contents of Solvable Game Theory on Hilbert Space", Physics Letters A 346, http://arxiv.org/PS_cache/quant-ph/pdf/0503/0503233.pdf
[3] Eisert, J., Wilkens, M., and Lewenstein, M. (1999) "Quantum Games and Quantum Strategies", Physical Review Letters 83, 1999, 3077.
[4] Flitney, A. and Hollenberg, L.C.L. (2005) "Multiplayer Quantum Minority Game with Decoherence", quant-ph/0510108, http://aslitney.customer.netspace.net.au/minority game qic.pdf
[5] Flitney, A. and Abbott, D. (2005) "Quantum Games with Decoherence", Journal of Physics A 38 (2005) 449-59
[6] Huberman, B. and Hogg, T. (2004) "Quantum Solution of Coordination Problems", Quantum Information Processing, Vol. 2, No. 6, December, 2004
[7] Iqbal, A. and Weigert, S. (2004) "Quantum Correlation Games", Journal of Physics A, 37, 5873, May, 2004
[8] Landsberg, S.E. (2004) "Quantum Game Theory", Notices of the American Mathematical Society, Vol. 51, No. 4, April, 2004, pp. 394-399.
[9] Landsberg, S.E. (2006) "Nash Equilibria in Quantum Games", Rochester Center for Economic Research, Working Paper No. 524, February, 2006
[10] McLennan, A. and Park, I. (1999) "Generic 4x4 Two Person Games Have at Most 15 Nash Equilibria", Games and Economic Behavior, 26-1 (January, 1999), 111-130.
[11] Meyer, David A. (1999) "Quantum strategies", Physical Review Letters 82 (1999) 1052-1055.
[12] Quint, T. and Shubik, M. (1997) "A Bound on the Number of Nash Equilibria in a Coordination Game", Cowles Foundation, Yale University
[13] Toibman, H. (2005) "A Painless Survey of Quantum Computation", http://www.sci.brooklyn.cuny.edu/~metis/papers/htoibman1.pdf
[14] Von Stengel, Bernhard (2000) "Improved equilibrium computation for extensive two-person games", First World Congress of the Game Theory Society, Bilbao, Spain, 2000
[15] Yuzuru, S., Akiyama, E., and Farmer, J.D. (2001) "Chaos in Learning a Simple Two Person Game", Santa Fe Institute Working Papers, 01-09-049.
Part III: Applications
Chapter 1
Teaching emergence and evolution simultaneously through simulated breeding of artificial swarm behaviors

Hiroki Sayama
Department of Bioengineering
Binghamton University, State University of New York
P.O. Box 6000, Binghamton, NY 13902-6000
[email protected]
We developed a simple interactive simulation tool that applies the simulated breeding method to evolve populations of Reynolds' Boids system. A user manually evolves swarm behaviors of artificial agents by repeatedly selecting his/her preferred behavior as a parent of the next generation. We used this tool as part of the teaching materials of the course "Mathematical Modeling and Simulation" offered to engineering-major junior students in the Department of Human Communication at the University of Electro-Communications, Japan, during the Spring semester 2005. Students actively engaged in the simulated breeding processes in the classes and voluntarily evolved a rich variety of swarm behaviors that were not initially anticipated.
1
Introduction
Emergence and evolution are the two most important concepts that account for how complex systems may become organized. They are deeply intertwined with each other at a wide range of scales in real complex systems. Typical examples of such connections include the evolution of swarm behaviors of insects and animals [1] and the formation of large-scale patterns in spatially extended evolutionary systems [2]. These two concepts are often treated, however, as somewhat distant subjects in typical educational settings of complex systems related programs, being taught using different examples, models and/or tools. Sometimes they are even considered as antagonistic concepts, as seen in the "natural selection vs. self-organization" controversy in evolutionary biology. There is a need in complex systems education for a tool with which students can acquire a more intuitive and integrated understanding of these concepts and their linkages. To meet the above need, we developed a simple interactive simulation tool, BoidsSB, which applies the simulated breeding method [3] to evolve populations of Reynolds' Boids system [4]. In BoidsSB, each set of parameter settings that describes local interaction rules among individual agents in a swarm is considered as a higher-level individual. A user manually evolves swarm behaviors of artificial agents by repeatedly selecting his/her preferred behavior as a parent of the next generation. While some earlier work implemented Boids with interactivity [5], our system is unique in that it dynamically evolves parameter settings that determine the characteristics of swarm behaviors. In this article we introduce the model and briefly report our preliminary findings about a wide variety of dynamics of the model and its potential educational effects indicated by the responses of participating students who played with it.
2
Model
Our model BoidsSB simulates a swarm behavior of artificial agents in a continuous two-dimensional square space using local interaction rules similar to those of Reynolds' Boids system [4]. Each individual agent in BoidsSB, represented by a directionless circle, perceives relative positions and velocities of other agents within its local perception range and changes its velocity in discrete time steps according to the following rules:

• If there are no local agents within its perception range, steer randomly (Straying).
• Otherwise:
  - Steer to move toward the average position of local agents (Cohesion).
  - Steer towards the average velocity of local agents (Alignment).
  - Steer to avoid collision with local agents (Separation).
  - Steer randomly with a given probability (Whim).
• Approximate its speed to its normal speed (Pace keeping).

The size of the space is assumed to be 600 x 600. Parameters used in simulation are listed in Table 1. This system can produce a natural-looking swarm behavior if these parameters are appropriately set (Fig. 1).
Table 1: Parameters used in each simulation run.

Name | Min | Max | Meaning
N    | 1   | 500 | Number of agents
R    | 0   | 300 | Radius of perception range
Vn   | 0   | 20  | Normal speed
Vm   | 0   | 40  | Maximum speed
C1   | 0   | 1   | Strength of the cohesive force
C2   | 0   | 1   | Strength of the aligning force
C3   | 0   | 100 | Strength of the separating force
C4   | 0   | 1   | Strength of the pace keeping force
C5   | 0   | 0.5 | Probability of the random steering
Figure 1: Example of swarm behavior that looks like fish schooling. Parameter values used are (N, R, Vn, Vm, C1, C2, C3, C4, C5) = (200, 30, 3, 5, 0.06, 0.5, 10, 0.45, 0.05).
BoidsSB was developed using Java 2 SDK Standard Edition 1.4.2. It runs as a stand-alone application on any computer platform equipped with the Java 2 Runtime Environment. Its source code is freely available from the ICCS website¹. In BoidsSB, the simulated breeding method [3] is used to better engage students in the simulated phenomena: students interact with the system and actively participate in the evolutionary process by subjectively selecting their preferred swarm behaviors. Each set of parameter values is considered a higher-level individual subject to selection and mutation. The simulated evolutionary process is an interactive search within a nine-dimensional parameter space. Figure 2 shows a screen shot of BoidsSB. The number of candidate populations used for simulated breeding is fixed to six due to the limitations in computational capacity for simulation and display space for visualization. A student can left-click on any of the frames to select his/her preferred swarm behavior. The parameter settings used in a frame can be output by right-clicking on it, which helps students consider how the observed swarm behavior depends on the parameter values. Once one of the six populations is selected, a new generation of six offspring is produced by randomly mutating (i.e., adding random noise to) each parameter value in the selected parent, among which one inherits exactly the same parameter settings from the parent with no mutation.
¹ http://necsi.org/community/wiki/index.php/ICCS06/191
Figure 2: Screen shot of BoidsSB. Swarm behaviors of six different populations are simultaneously simulated and demonstrated in six frames. A user can click on one of the frames to select a population that will be the parent of the next six populations.

To enhance the speed of exploration, the mutation rate is set rather high, to 50%, i.e., noise is added to about half the parameters in every reproduction. This selection and reproduction cycle continues indefinitely until the student manually quits the application.
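One generation of this selection-and-reproduction cycle can be sketched as follows (a hedged reconstruction from the description above, not the BoidsSB source; the Gaussian noise scale and the clamping to the ranges of Table 1 are our assumptions):

```python
import random

# (min, max) per Table 1 of the paper
PARAM_RANGES = {
    "N": (1, 500), "R": (0, 300), "Vn": (0, 20), "Vm": (0, 40),
    "C1": (0, 1), "C2": (0, 1), "C3": (0, 100), "C4": (0, 1), "C5": (0, 0.5),
}

def next_generation(parent, n_offspring=6, mut_rate=0.5, rng=random.Random()):
    """Return n_offspring parameter sets; offspring[0] is an exact copy.

    Each remaining child mutates each parameter with probability mut_rate
    (50% in the paper) by adding noise, clamped to the Table 1 range.
    """
    offspring = [dict(parent)]            # one child inherits with no mutation
    for _ in range(n_offspring - 1):
        child = dict(parent)
        for key, (lo, hi) in PARAM_RANGES.items():
            if rng.random() < mut_rate:
                noise = rng.gauss(0, 0.1 * (hi - lo))   # assumed noise scale
                child[key] = min(hi, max(lo, child[key] + noise))
        offspring.append(child)
    return offspring
```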
3
Implementation in classes
We used BoidsSB as part of the teaching materials of the course "Mathematical Modeling and Simulation" offered to engineering-major junior students in the Department of Human Communication at the University of Electro-Communications, Japan, during the Spring semester of 2005. This course aimed to introduce various examples of self-organization to students and help them understand these phenomena experientially and constructively by actively playing with simulations and modifying simulator codes. It also aimed to help students acquire fundamental skills of object-oriented programming in Java. Class meetings were held in a computer lab once a week for fourteen weeks. Each class lasted 90 minutes, starting with a brief explanation of the topical models and simulator codes followed by supervised lab work. The actual class schedule is shown in Table 2. BoidsSB was used as one of the materials for Week 10, "Evolution and Adaptation", where it was discussed how dynamic emergent patterns (swarm behaviors) could evolve using explicit fitness criteria, either hard-coded or interactively selected. By this time students were already acquainted with the concept of swarm behaviors of Boids and other aggregate systems.
Table 2: Class schedule of the course "Mathematical Modeling and Simulation" that was offered to juniors in the Department of Human Communication at UEC during Spring 2005. The proposed BoidsSB was used as a teaching material for Week 10: Evolution and Adaptation.

Week | Subject | Models | Examples
1  | Course introduction | |
2  | Intro to Java Programming | |
3  | Intro to Java Programming | |
4  | Spatial Pattern Formation (1) | Cellular automata | Droplet formation and phase transition in its dynamics
5  | Spatial Pattern Formation (2) | Cellular automata | Majority rule, Turing pattern formation, dynamic clusters in host-parasite systems, spiral formation
6  | Networks (1) | Network growth models | Random network, random and preferential network growth
7  | Networks (2) | Network growth models | Tree growth, simple self-replication, cell division
8  | Collective behaviors (1) | Individual-based models in discrete space | Random walk, diffusion-limited aggregation, garbage collection by ants
9  | Collective behaviors (2) | Individual-based models in continuous space | Swarm behaviors of insects, Boids, traffic jams
10 | Evolution and Adaptation (1) | Evolutionary models with explicit fitness | Evolution of swarm behaviors by genetic algorithm and simulated breeding
11 | Evolution and Adaptation (2) | Evolutionary models with implicit fitness | Spontaneous evolution in closed systems (Boids ecosystem, Evoloop)
12 | Team Projects | |
13 | Team Projects | |
14 | Final Presentations | |
4
Results
In the Week 10 class, students first learned the concept of genetic algorithms and worked on an exercise to write code for an explicit fitness function to automatically evolve vortex behavior of Boids. A majority of the students were unsuccessful in this attempt, especially because of the technical difficulty of evaluating the "vortex-ness" of populations by a measurable quantity. Then the concept of simulated breeding was introduced, and students used BoidsSB to achieve the same goal. They were quite surprised when a nicely swirling swarm behavior came up after just several clicks of artificial selection. This experience helped them intuitively understand the strength of evolutionary principles, especially in the context of the emergence of nontrivial dynamic swarm behaviors. They also appreciated the diversity of swarm behaviors that appeared during the processes of evolution. Throughout the class time, students actively engaged in the simulated breeding processes and voluntarily evolved a rich variety of swarm behaviors that were not initially anticipated. Figure 3 showcases several examples found through the students' exploration. Some of them were totally unexpected results even to the instructor (the author) who created the system. No systematic evaluation has yet been conducted to measure the educational impact of this model. Here we limit ourselves to probing its potential by looking at the comments given by the participating students. Below are some exemplar statements taken from their homework reports (translated from Japanese to English):
• "It was a fairly pleasing simulation. It was as if I was watching the behavior of water drops in a weightless space." • "The material of this week was very interesting. I have been vaguely thinking of doing some simulations like that, and now I feel I have found what I really wanted to do." • "Before playing with BoidsSB I was not successful in evolving Boids using GA, so I was fascinated by the process of the appearance of (vortex) behavior through evolution." • "In BoidsSB a human user serves as a fitness function, so its possibility should be virtually infinite. Results should therefore be infinite too. However, I found that the actual results of simulated breeding had some trend to fall into several typical classes of behaviors. Can we reduce this trend by either improving the selection criteria or diversifying parameters?" • "During the selection process I saw several typical behaviors appearing many times, but what I felt more exciting varied from time to time. I sometimes selected by comparing the six candidates. At some other times I had a concrete image of specific behavior in mind and kept selecting those that were similar to that. These are all subjective to each user and
Figure 3: Examples of swarm behaviors evolved by participating students using BoidsSB. Parameter settings are shown beneath each example in the (N, R, Vn, Vm, C1, C2, C3, C4, C5) format. Some of these may even appear to exploit weaknesses in the simulation algorithm.
therefore hard to predict. I thought that it would produce more interesting results than purely computational algorithms." In addition to the above, it was also notable that quite a few students developed detailed discussions on how each parameter contributed to the emergence of the observed swarm behaviors (omitted due to the lack of space). These responses may indicate that BoidsSB had an impact on students that led them to both intuitive experience and analytic reflection on the emergence and evolution of swarm behaviors. A more systematic and objective evaluation would be necessary to ascertain the effectiveness of the model, which is beyond the scope of our attempt at this time.
5
Conclusion
We proposed BoidsSB, a simple interactive simulation tool that applies the simulated breeding method to the evolution of swarm behaviors of artificial agents. We used this tool as a teaching material in mathematical modeling and simulation classes to provide students with opportunities to intuitively and constructively understand emergence and evolution in an integrated manner. Among the responses from the participating students were several noticeable comments on their evoked interest in the emergence of global collective dynamics out of local interaction rules, and on their admiration for the power of evolution that gives rise to nontrivial outcomes through the repetition of simple processes. The fact that these positive responses were obtained from a single simulation tool may indicate the effectiveness of our approach.
Bibliography

[1] Camazine, S. et al. (2001) Self-Organization in Biological Systems. Princeton University Press.
[2] Bar-Yam, Y. et al. NECSI Evolution and Ecology Research Project. http://www.necsi.org/research/evoeco/.
[3] Unemi, T. (2003) Simulated breeding - a framework of breeding artifacts on the computer. Kybernetes 32:203-220.
[4] Reynolds, C. W. (1987) Flocks, herds, and schools: A distributed behavioral model. Computer Graphics 21(4):25-34.
[5] Unemi, T. & Bisig, D. (2004) Playing music by conducting BOID agents - A style of interaction in the life with A-Life. In Artificial Life IX: Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems, MIT Press, pp. 546-550.
Chapter 2
An Exploration into the Uses of Agent-Based Modeling to Improve Quality of Healthcare

Ashok Kay Kanagarajah, Peter Lindsay, Anne Miller, David Parker
ARC Center for Complex Systems, University of Queensland, St Lucia, Queensland, Australia

Abstract. Healthcare is a complex adaptive system. This paper discusses healthcare in the context of complex systems architecture and an agent-based modeling framework. The paper demonstrates the complications of healthcare system improvement and its impact on patient safety, economics and workloads. Further, an application of the safety dynamics model proposed by Cook and Rasmussen [4] is explored using a hypothetical simulation of an emergency department. By means of simulation, this paper demonstrates the nonlinear behaviors of a health service unit and its complexities, and how the safety dynamics model may be used to evaluate various aspects of healthcare. Further work is required to apply this concept in a 'real life environment' and to assess its consequences at the societal, organizational and operational levels of healthcare.
1.
Introduction
Patient safety, acceptable workloads and economic imperatives are interdependent dynamic forces that often come into conflict in healthcare environments. For example, reducing the total number of available doctors could lead to increased workload for doctors, causing fatigue. This could lead to oversight of tasks and eventually affect patient safety. Balancing these competing dynamic forces is increasingly a major preoccupation of government regulators and senior hospital managers. Cook and Rasmussen [4] have developed a 'safe operating envelope model' as a useful tool for analyzing the balance between patient safety, economics and workload in a healthcare system (Figure 1). In this model patient safety is a dynamic property of the system. We have chosen Cook and Rasmussen's model because it can be applied at any scale of a healthcare system to explain the consequences of different configurations of workload and economic forces and their implications for healthcare and patient safety.
The safe operating envelope model also allows the use of agent-based modeling (ABM) simulations to explore complex adaptive behaviors in environments such as healthcare. This paper uses an Emergency Department (ED) as a representative scenario to computationally demonstrate that the effect on safety depends on short-time-scale fluctuations in workload and the loose coupling of 'resource buffers'. It further demonstrates the effect of efficiency improvements within the ED driven by administrative and technical changes, and of work saturation, such as bed gridlock in an ED. 'Bed gridlock' or 'bed block' in an ED is a situation within hospitals where patients stay in the waiting room awaiting treatment because no beds are available within the ED.

[Figure 1: Cook and Rasmussen's modified safety dynamics model, showing the pressure for improvements and the boundaries of acceptable workload, economic failure and acceptable performance.]
2.
Healthcare Delivery System as a Complex Adaptive System
Healthcare delivery systems have been defined as Complex Adaptive Systems (CAS) [1, 2, 10, 11]. Healthcare is an open system that demonstrates non-linear dynamics. Due to the socio-technical nature of healthcare, its boundaries are difficult to determine and decisions are ultimately made by 'observers'. The observers of health systems show adaptive behavior; however, there are not always immediately apparent prescribed control mechanisms to support this adaptive nature. A healthcare system can also be considered a highly connected network of formal and informal nodes that adapt by learning through various 'experiments in progress'. The interdependent and networked nature of healthcare means that the activities undertaken in one node have the potential to affect behavior in other nodes and across the network overall. Given the complex nature of a healthcare system, it may be difficult for managers to accurately predict the effects of actions, such as performance improvements, and their system-wide consequences. For example, an Emergency Department (ED) is a unit node within a healthcare network. It is relatively self-contained, non-trivial and of sufficient complexity as a setting for developing a dynamic model of healthcare as a complex adaptive system. Like all complex systems, the ED can appear simplistic at an appropriate scale, both within a time period and as a complete system. Patients come into the ED and undergo a diagnostic process and a range of possible treatment modalities before they are discharged back into the community or admitted and transferred to another specialist ward. The primary objective of the ED is to meet the needs of patients as and when they arrive, categorizing patient arrivals based on severity (triage).
A major challenge for ED managers is to ensure that resources are kept available for unexpected increases in patient demand that may result from a range of environmental and socially mediated factors, for instance a flu epidemic, or a
major accident resulting in multiple casualties. In these situations ED managers need crisis management plans or must be able to adapt by calling for assistance within the healthcare network.
3.
Choice of Modeling Approach
There are numerous research and commercial articles within the literature on modeling and simulation of healthcare systems. Jun [7] and Fone [5] provide good overviews of healthcare simulations. Koelling [8] classified the available simulations under several categories. While the boundaries between these categories are blurred, there is a need to make assumptions that will make the modeling realistic. If these assumptions do not hold true, then the models that are generated may provide unrealistic answers. The implications of this situation are: 1) the simulation work carried out is very specific to particular situations; 2) extrapolation beyond the modeling assumptions leads to unrealistic results; 3) the time, money and resources invested are only useful for the specific purpose. The purpose of our overall research is to develop a modeling framework for simulation applications that allows clinicians, managers and policy-makers to test the effects of potential actions given the dynamic relationships defined within the safe operating envelope model. Such an application should be based on contemporary theory about complex adaptive systems and use adaptable simulation tools such as ABM simulations.
4.
Model Development - Agent-Based Model of an Emergency Department and Application of the Safety Dynamics Model
For simplicity of modeling, the ED is considered in isolation. Agents within the ED simulation are the resources available within the ED and consist of patients, doctors, nurses, technicians, treatment rooms and managers. An overall healthcare delivery system would contain many more agents than are used in this simulation, but the dynamic interactions would be similar. We illustrate this with the following two examples: 1) Patients arrive at the ED. This is a simplified stage of an agent; within the larger scope of a whole-of-healthcare model, patient arrival would be a function of what happens within society and would depend on the number of options available for those patients to receive treatment. 2) Transfer out to other wards depends on the availability of appropriate beds in other areas. We demonstrate this by introducing a stochastic function to generate the bed availability for transfer. Within the ABM, agents execute behaviors at each time interval. The goal of the agents in the model is to manage patient outcomes while minimizing preventable adverse patient events, such as delays that increase the risk of secondary complications with subsequent increases in length of hospital stay. It is assumed that any patient who stays longer than 4 hours in the ED before being diagnosed and treated is counted as a potentially adverse event. This situation could occur for several reasons, such as a lack of available treatment rooms or a lack of availability of any single agent or combination of agents such as doctors, nurses and technicians.
Agents have elementary rules, for example:
• Patients are attended to based on the criticality of their condition.
• Agents are self-directed, i.e., doctors and nurses work as required, although the time they spend with patients may vary depending on demand pressures such as the number of patients and the severity of patient illness.
• Agents reflect adaptive behavior based on the state of other agents. For example, doctors may work faster or work over lunch periods in order to reduce excessive queues in the waiting room. The opposite may also be true, however.
Work process

Arriving patients are categorized into different severity levels by the Triage Nurse, with Level 3 patients being more critical than Levels 1 and 2. Level 3 patients are taken to an ED major room immediately upon arrival. Once in the room, they undergo diagnosis and treatment. Finally, they complete the registration process before being either released or admitted into the hospital for further treatment. Level 1 and Level 2 patients first sign in with a registration clerk and their condition is further assessed by a Triage Nurse before they are taken to an ED room. Depending on their criticality, this may be a major room or a minor room. Once in the room, Level 1 and 2 patients complete their registration before receiving treatment. Finally, they are either released or admitted into the hospital for further treatment. The treatment process consists of a secondary assessment performed by a nurse and a physician, with the appropriate tests performed by specialized technicians. Further treatment may be performed by a nurse or physician. For Level 1 and 2 patients, the registration process is performed by a clerk, with activities such as data collection, gathering payment-related information and entering the basic details of patients into a medical chart for future reference. The simulation model under development is built on the Microsaint Sharp® platform. Microsaint Sharp® allows the development of agent-based simulation models where multiple entities can be made to stochastically respond to conditions in their local environments, mimicking complex large-scale system behavior. Within the ED simulation, we have nominated critical agents such as patients, doctors, nurses, clerks and technicians, and treatment rooms for acute patients (major rooms) and others (minor rooms).
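The routing just described can be summarized in a small sketch (the stage names and the function itself are hypothetical, introduced only to make the pathway explicit):

```python
def patient_pathway(severity_level, critical=False):
    """Return the ordered list of stages a patient passes through.

    Level 3 patients go straight to a major room and register afterwards;
    Level 1 and 2 patients sign in and are triaged first, then routed to a
    major or minor room depending on criticality.
    """
    if severity_level == 3:
        return ["major_room", "diagnosis", "treatment",
                "registration", "release_or_admit"]
    room = "major_room" if critical else "minor_room"
    return ["sign_in", "triage_assessment", room,
            "registration", "treatment", "release_or_admit"]
```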
Although these resources are not comprehensive, they are aware of (and interact with) their local environment through elementary internal rules for decision-making, movement and action, and allow study of the complex behavior at a conceivable scale.
Hypothesis testing

Cook and Rasmussen's model explores how operating pressure gradients push the system's measures away from the boundaries of economic failure and work overload, towards the unacceptable performance (accident) boundary. Stable low-risk systems operate far from this boundary, and stable high-risk systems operate nearer the acceptable margin, but the operating point moves in small increments and remains largely inside the marginal boundary. Unstable systems (otherwise known as chaotic systems) have large, rapid shifts in the operating point, which often moves outside the boundaries. We have defined three boundaries to mimic Rasmussen's limits and test our model:
1. The acceptable performance boundary is defined as the time a patient spends within the ED, measured in minutes. The acceptable performance target for accident and emergency departments laid down by the UK government is 240 minutes (4 hours). In addition, we also measured the total number of patients within the system. This gives an indication of 'bed gridlock', i.e., a lack of available treatment rooms to meet the demand.
2. The economic failure boundary is crossed when the hospital uses more than the budgeted funds. Here we use on-call doctors to fill in when there is a surge in patient demand. The maximum allowable number of on-call doctors for unanticipated situations is considered to be 2 (this is the marginal boundary). Anything above two on-call doctors per hour is considered economic failure. In order to keep the 'experiments' manageable we have concentrated on only a single resource type, whilst acknowledging that other roles are critical too.
3. The workload boundary is defined by the utilization of the major room, minor room and nurses. Utilization is defined as the effective hours the agents spend with patients over the total scheduled work hours. In addition we consider the time that a doctor spends in initial consultation with a patient. The nominated time for initial consultation is 10 minutes when the patient queue is at a normal level (fewer than 5 in the queue). However, a doctor may work faster, meaning the doctor spends less time with the patient, to clear the queue through the ED. The effective intended mean time of consultation that doctors spend with the patient has a minimum level. This is an adaptive behavior of the doctor as an individual agent.

Base case simulation and output measures of boundaries
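The three boundary measures can be expressed as simple checks (a sketch using the thresholds stated above; the function names are ours):

```python
# Thresholds taken from the text: the UK 4-hour performance target and the
# two-on-call-doctors-per-hour marginal economic boundary.
PERFORMANCE_LIMIT_MIN = 240   # minutes a patient may spend in the ED
MAX_ON_CALL_DOCTORS = 2       # per hour; more counts as economic failure

def performance_violated(minutes_in_ed):
    """Acceptable performance boundary: time in ED above 240 minutes."""
    return minutes_in_ed > PERFORMANCE_LIMIT_MIN

def economic_failure(on_call_doctors_this_hour):
    """Economic failure boundary: more than two on-call doctors in an hour."""
    return on_call_doctors_this_hour > MAX_ON_CALL_DOCTORS

def utilization(effective_hours, scheduled_hours):
    """Workload measure: effective patient-facing hours / scheduled hours."""
    return effective_hours / scheduled_hours
```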
Resource | No. of Budgeted Agents (Base Case)
Doctors | 3
Nurses | 5
Major Treatment Room | 2
Minor Treatment Room | 8
Registration Clerk | 2
Laboratory Clerk | 2
Discharge Clerk | 2
Triage Nurse | 2
Phlebotomist | 2
ECG Technician | 2
Laboratory Technician | 2

Figure 2: Stable conditions, the operating regions of the resources and their observed distributions.
Figure 3: Average number of ED patients arriving in hourly intervals.
The numbers of critical agents used within the base case are shown in Figure 2. The patient arrival pattern for the simulation was generated randomly; the behavior of the base-case arrival and demand pattern is shown in Figure 3. The simulation results presented here cover 3 days: we simulated a 4-day run and excluded the day-one results to remove initialization transients. Next we present the results of the measures and the boundaries. Figure 4 shows the measures of the simulation results and the boundary conditions. Here, adequate buffers of resource agents allow the system to cope with the fluctuations seen in the demand pattern shown in Figure 3.
The agents, as in real life, change behaviors and rules over time as they gain experience through encounters with other agents. As agents interact, their rules evolve. Within this ABM we have implemented a simple rule by which doctors change their intended consultation time with patients based on the ED waiting room queue. This is illustrated in the plot of initial consultation in Figure 4, which shows an increase in the waiting room queue due to a short-term increase in the number of patients during hours 57 to 60.
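A minimal sketch of this adaptive rule, assuming a linear speed-up and a 4-minute floor (both our assumptions; the 10-minute nominal time and the queue threshold of 5 come from the text above):

```python
NOMINAL_CONSULT_MIN = 10   # nominated initial consultation time (minutes)
NORMAL_QUEUE = 5           # queue below this is considered a normal level
MIN_CONSULT_MIN = 4        # assumed floor on effective consultation time

def intended_consultation_minutes(queue_length):
    """Doctor agent's intended initial consultation time.

    Doctors work at the nominal pace while the queue is normal, and work
    faster (one minute less per extra patient, an assumed scaling) as the
    queue grows, but never below the assumed minimum.
    """
    if queue_length < NORMAL_QUEUE:
        return NOMINAL_CONSULT_MIN
    speedup = queue_length - NORMAL_QUEUE + 1
    return max(MIN_CONSULT_MIN, NOMINAL_CONSULT_MIN - speedup)
```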
[Figure 4 panels: Plot A, time patients spent within the ED against the performance boundary; Plot B, number of on-call doctors per hour against the economic boundary; Plot C, agents' utilisation; and the initial consultation time, each over the hourly time series.]
Figure 4: Base case, showing a stable 'low-risk' system which is operating under stable conditions and would be able to cope with some unanticipated demand increases or resource constraints.
5.
Scenario analysis - system dynamics and consequence
This section provides brief results and their implications. The patient arrival rate depends on various factors, such as weather, the age of the population, town center activities (e.g., traffic accidents), bed occupancy rates in neighboring hospitals, etc. Consider a scenario where there is a short-term increase in patients between hours 32 and 36, as shown in the arrival rate plot in Figure 5. If the other agents remain as in the base-case scenario, the sudden increase in workload pushes the operating region of the safe operating envelope beyond the boundaries of performance and economics. Within this simulation, we constrain all agents to a fixed capacity except for the number of doctors. The simulation increases the doctors' capacity to up to nine when the waiting queue rises above 9, by calling in additional doctors to assist in clearing the backlog. The ability to increase the number of doctor agents, together with doctors individually working faster, was not enough to absorb the effect of the increase in workload depicted by the demand spike. Thus, in this scenario the operating point is pushed further into the unacceptable performance boundary, with an increase in the total time patients spent waiting in the ED (over 240 minutes). In real life this situation has the potential to cause major safety concerns. As shown in Figure 4, in the base-case scenario there is no violation of the boundary conditions, due to ample excess buffer capacities which made the interactions and dependencies between the agents loosely coupled. In order to test the overall impact of coupling between buffers, we reduced the buffer of the minor room capacity. This situation is analogous to the 'bed block' or 'gridlock' situations within hospitals where patients still wait in major or minor rooms despite having completed their ED treatment.
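The surge-staffing rule used in this scenario can be sketched as follows (the one-doctor-per-step call-in is our assumption; the queue threshold of 9, the base staffing of 3 doctors from Figure 2, and the cap of nine doctors come from the text):

```python
BASE_DOCTORS = 3   # base-case staffing (Figure 2)
MAX_DOCTORS = 9    # upper limit on doctor capacity in this scenario

def doctors_on_duty(queue_length, current=BASE_DOCTORS):
    """Call in one additional on-call doctor while the queue exceeds nine.

    Returns the doctor count for the next time step, capped at MAX_DOCTORS.
    """
    if queue_length > 9 and current < MAX_DOCTORS:
        return current + 1
    return current
```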
[Figure 5 panels: Plot 5A, short-term fluctuation in patient arrivals to the ED; Plot 5B, time patients spent within the ED, each over the hourly time series.]
Figure 5: Plots showing the short-term fluctuations (increase) in the demand pattern relative to the base case. In this scenario a particular spike is notable during hours 32-36.

Figure 6 illustrates the impact of reducing the minor room capacity from 8 to 3 patients. This obviously increases the overall utilization of the minor room, as shown in plot 6C. As the number of rooms decreases, the rate at which patients are cleared decreases, increasing the waiting queue whilst also increasing the time that patients spend within the ED. Since the patient queue drives the adaptive behavior of the doctor agent within the simulation, the simulation increases the total number of doctors by drawing on available on-call doctors. The increase in on-call doctors helps stabilize the system until the minor room capacity reduces to 5. Below a minor room capacity of 5, increasing the number of doctors seems to have no impact on the time a patient spends within the ED, and the system is not able to cope with the inflow of patients, as shown by the plot of patients accumulating within the ED. This situation is very similar to work saturation within the hospital system. It leads to ED 'bed block', 'bed crunch' or 'bedlock'. In these situations beds are occupied by patients who should be discharged or transferred elsewhere but, due to other circumstances, are unable to do so.
6.
Conclusions and Future work
The basis for this study is to develop a tool that will be flexible and able to evolve with the experience of healthcare systems. We discussed the complex adaptive behavior of a healthcare system and demonstrated the non-linear and adaptive behavior of the healthcare system's operating parameters and its interdependencies using an elementary ABM. We argue that ABM has the potential to be a useful tool to study the quality of care and to address the lack of integrative simulation techniques. The safety dynamics model provided by Cook and Rasmussen is useful for qualitatively visualizing the complex behavior of healthcare. ABM provides an analytical tool to study the scale and granularity of system behavior, and is able to adapt and learn as the knowledge base and experience evolve. This research is in progress and we will present more details and field study results in future publications and presentations. Clearly, agent-based simulation is useful in analyzing the complex adaptive behavior of a healthcare system. With the use of simulation we outlined how Cook and Rasmussen's safe operating envelope can assist in understanding the behavior of a complex system such as an ED. Moreover, it highlights the implications for economics, acceptable workload and system performance. Unanticipated situations can be dealt with better by modeling and analyzing many other potential scenarios, including those involving resource allocations. This ABM simulation suggests a feasible modeling framework that can be used to represent the overall healthcare system and its internal network complexities in a realistic and problem-solving manner.
Figure 6: The plot of measures for changing the Minor room capacity from 8 to 3.
7.
References:
1) Anderson, RA & Jr, RRM 2000, 'Managing health care organizations: Where professionalism meets complexity science', Health Care Management Review, vol. 25, no. 1, p. 83.
2) Bar-Yam, Y, Ramalingam, C, Burlingame, L & Ogata, C 2004, Making Things Work: Solving Complex Problems in a Complex World, NECSI, Knowledge Press, Cambridge, MA.
3) Brailsford, SC, Churilov, L & Liew, SK 2003, 'Treating Ailing Emergency Departments with Simulation: An Integrated Perspectives', paper presented to the 2003 Western Multiconference, Orlando, Florida, USA.
4) Cook, R & Rasmussen, J 2005, '"Going solid": a model of system dynamics and consequences for patient safety', Quality & Safety in Health Care, vol. 14, no. 2, p. 130.
5) Fone, D, Hollinghurst, S, Temple, M, Round, A, Lester, N, Weightman, A, Roberts, K, Coyle, E, Bevan, G & Palmer, S 2003, 'Systematic review of the use and value of computer simulation modelling in population health and health care delivery', Journal of Public Health Medicine, vol. 25, no. 4, p. 325.
6) Gharajedaghi, J 1999, Systems Thinking: Managing Chaos and Complexity: A Platform for Designing Business Architecture, Butterworth-Heinemann.
7) Jun, JB, Jacobson, SH & Swisher, JR 1999, 'Application of discrete-event simulation in health care clinics: A survey', The Journal of the Operational Research Society, vol. 50, no. 2, p. 109.
8) Koelling, P & Schwandt, MJ 2005, 'Health Systems: A Dynamic System - Benefits from System Dynamics', paper presented to the Proceedings of the 2005 Winter Simulation Conference, Orlando, Florida, USA.
9) Morecroft, J & Stewart, R 2005, 'Explaining Puzzling Dynamics: Comparing the Use of System Dynamics and Discrete-Event Simulation', paper presented to the 23rd International Conference of the System Dynamics Society, Boston, USA.
10) Smith, M & Feied, C 1999, The Emergency Department as a Complex System, http://www.necsi.org:16080/research/management/health/index.html.
11) Tan, J, Wen, HJ & Awad, N 2005, 'Health care and services delivery systems as complex adaptive systems', Commun. ACM, vol. 48, no. 5, pp. 36-44.
Chapter 3
Self-Organized Inference of Spatial Structure in Randomly Deployed Sensor Networks
Neena A. George, Ali A. Minai
ECECS Department, University of Cincinnati, Cincinnati, OH
[email protected], aminai@ececs.uc.edu
Simona Doboli
Department of Computer Science, Hofstra University, Hempstead, NY
[email protected]
Randomly deployed wireless sensor networks are becoming increasingly viable for many applications. Such networks can comprise anywhere from a few hundred to thousands of sensor nodes, and these sizes are likely to grow with advancing technology, making scalability a primary concern. Each node in these sensor networks is a small unit with limited resources and localized sensing and communication. Thus, all global tasks must be accomplished through self-organized distributed algorithms, which also leads to improved scalability, robustness and flexibility. In this paper, we examine the use of distributed algorithms to infer the spatial structure of an extended environment monitored by a self-organizing sensor network. Based on its sensing, the network segments the environment into regions with distinct characteristics, thereby inferring a cognitive map of the environment. This, in turn, can be used to answer global queries about the environment efficiently and accurately. The main challenge to the network arises from the necessarily irregular spatial sampling and the need for totally distributed computation. We consider distributed machine learning techniques for segmentation and study the variation of segmentation quality with reconstruction at different node densities and in environments of varying complexity.
1 Introduction
Wireless Sensor Networks (WSNs) are interconnected groups of spatially distributed sensor nodes used to gather and process large amounts of data. Each sensor node is a miniature electronic device, characterized by limited power, sensing, communication and computation capabilities [Pottie 2000]. Advances in hardware and communication technologies have led to the deployment of these nodes in large numbers for applications like habitat monitoring, border surveillance, contaminant tracking, and health monitoring [Chintalapudi 2003][Nowak 2003], giving rise to concerns such as scalability of processing algorithms and robustness of the network [Estrin 1999]. Moreover, the traditional method of processing, where all data is transmitted to a central base station and processed there using powerful centralized algorithms, is non-robust and inefficient, with high communication overhead and poor response time. This has led to the development of decentralized algorithms, and such sensor networks can be regarded as self-organizing complex systems in which localized interactions between neighboring nodes help to achieve a global objective. Most environments monitored by sensor networks have spatial structure (e.g., tracks, or regions with different terrain) that significantly affects the observations made by the network. If the network can infer this spatial structure from initial observations, it can predict events in the environment much more intelligently than a naive, uniformly organized network. For example, network performance in tracking applications is much better if the network can infer the layout of tracks (if any) or distinguish between hazardous and navigable regions. Since every node surveys only a small part of the environment, inference of the environment's structure by the network requires intelligent collaboration between the nodes. Segmentation can be defined as the identification of regions or clusters which have similar properties.
This similarity could be similarity of attributes such as moisture, texture, composition, etc., or similarity in events occurring in these regions, e.g., regions with high/low traffic or regions with higher/lower event density. The goal of segmentation in a self-organizing network is to use the information obtained from local sampling to identify these regions, thereby inferring the spatial semantic structure of the environment. Segmentation of the sensed environment can be seen as a powerful generic task that can serve several purposes including, but not limited to, the following: 1) Providing a cognitive map to interpret observed events; 2) Allowing better prediction of events (e.g., the direction of a moving entity); 3) Improving the quality of sensing by customizing sensing parameters to each region; 4) Improving network efficiency by allowing better sensor scheduling; and 5) Facilitating task allocation among nodes, leading to better resource utilization. Segmentation is widely studied in the fields of computer vision and image processing, and some of the methodologies used there have been applied in the context of sensor networks by considering sensor nodes as equivalent to pixels. However, significant differences exist between the two [Devaguptapu 2003]. In image processing, pixels are regularly spaced and each pixel has eight neighbors, but nodes in sensor networks are randomly deployed. The number of a node's neighbors is determined by how many nodes fall within its transmission range, and hence is variable in a random network. Also, due to random deployment, node density may not be completely uniform, leaving significant areas of the environment poorly monitored. This irregular sampling of the environment by the sensor nodes, coupled with the lower density of deployment, makes it difficult to apply image processing techniques with equivalent accuracy in sensor networks. In this paper, we look at the problem of segmentation in environments of different complexities, propose to use machine learning techniques for the intelligent approximation of areas not covered by sensor nodes, and study the quality of segmentation under different node densities.
2 Related Work
Most of the work done in the area of spatial analysis of data in sensor networks has been on edge/boundary detection and has relied on methods used in image processing. In [Devaguptapu 2003] and [Chintalapudi 2003], the authors tackle the problem of distributed edge detection by applying filter-based schemes. A linear classifier-based approach and a statistical thresholding scheme are also used in [Chintalapudi 2003], and the three methods are compared, with the linear classifier approach found to perform the best. Nowak and Mitra examine a hierarchical method for boundary detection in sensor networks [Nowak 2003]. Edge reconstruction using platelets is studied in [Willet 2003]. While boundary detection is important for determining the extent of phenomena, a method is also needed to identify semantically significant regions in the environment. Though some work has been done on the estimation of homogeneous fields using interpolation methods [Nowak 2003], scenarios involving abrupt spatial changes are likelier to occur in a variety of applications, and distributed, energy-efficient methods are needed to address the problem of estimating these inhomogeneous fields [Nowak 2003]. As segments are comprised of regions with similar attributes, there is a natural analogy between finding segments and finding clusters. We use a method of region growing proposed by Panangadan and Sukhatme for region tracking [Panangadan 2005]. The basic idea behind this algorithm, as used in image processing, is that if two adjacent pixels are similar in intensity, they are merged into a single region. While they use it purely in the context of tracking a specific region and in a single-attribute case, we investigate the use of the algorithm in finding all regions in environments of different complexities, with multi-attribute specifications.
We examine the use of non-hierarchical, distributed methods to estimate inhomogeneous fields, and use them in conjunction with the region growing algorithm to get a comprehensive global estimate of the segments. We also use two machine learning algorithms, namely the inverse distance weighted (IDW) algorithm and the k-nearest neighbors (KNN) algorithm, for data imputation, making intelligent guesses about un-sensed regions. We also evaluate the performance of these methods with different node densities. For simulation purposes, we use cellular environments, but our methods can be used in a continuous framework too.
3 System Design
Environments of different types and complexities are used to test the performance of the algorithms. Each environment is modeled as an n x n cellular region, which is divided into segments of different types. Regions or segments can be defined as connected subsets of the environment which have similar values for all or some attributes. Every cell location (x, y) is characterized by N_a attributes, a_i(x, y), where i = 1, 2, ..., N_a. There are N_s segments in the environment, drawn from a fixed number of segment types. Attribute a_j is distributed over segment S_k in a range [a_j^min, a_j^max], with mean m_jk and variance σ_jk. Each segment type has a characteristic mean and variance for each attribute, and the means for different segment types are well-separated on at least one attribute.
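To make this model concrete, such a cellular environment can be generated by drawing each cell's attribute values from the Gaussian parameters of its segment type. The sketch below is illustrative only; the function and parameter names are our own, not taken from the paper:

```python
import random

def make_environment(n, segment_map, seg_params):
    """Generate an n x n cellular environment.

    segment_map[(x, y)] gives the segment type of each cell;
    seg_params[seg_type] is a list of (mean, std) pairs, one per attribute.
    Returns attrs[(x, y)] = list of attribute values a_i(x, y).
    """
    attrs = {}
    for x in range(n):
        for y in range(n):
            seg = segment_map[(x, y)]
            attrs[(x, y)] = [random.gauss(m, s) for (m, s) in seg_params[seg]]
    return attrs

# Example: a 4x4 environment with a background segment and one patch,
# two attributes, means well-separated on attribute 0 only.
seg_map = {(x, y): ("patch" if x < 2 and y < 2 else "bg")
           for x in range(4) for y in range(4)}
params = {"bg": [(0.0, 0.1), (5.0, 0.1)],
          "patch": [(10.0, 0.1), (5.0, 0.1)]}
env = make_environment(4, seg_map, params)
```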
3.1 Environment Models
In this paper, we use the following specific types of environments:
1. Track Environment: The environment has two different piecewise-linear tracks of different widths and is characterized by two attributes. The tracks are similar to each other on one attribute, but differ on both attributes from the background segment.
2. Patch Environments: Two different types of patch-like segments are distributed over the background, and are characterized by two attributes. The segments differ from each other on one attribute, and from the rest of the environment on both attributes. Note that the patch segments imply a non-convex background segment by default.
3. Spiral Environments: These environments consist of two intertwined spiral segments of different types of cells. The segments differ from each other on both attributes. This is a very complex segmentation problem, and is used primarily as a benchmark because its complexity can be controlled systematically by varying the width of each spiral. It has been widely used in the classification literature.
3.2 Segmentation Algorithm
A total of N_n sensor nodes are randomly distributed over the environment. Each node is equipped with some sensing, communication, and computation capabilities. It also has some limited memory associated with it. Each node has a unique randomly generated identifier and is capable of sensing all attributes of interest. Every node is assumed to sense the attributes within the cell in which it falls. Nodes also have a transmission radius within which they can broadcast and receive messages. Nodes that fall within this radius around a specific node are called its 1-hop neighbors, or just neighbors. Every sensor uses information from its neighbors to infer whether it is part of a larger region. Segmentation is based on the detection of similarity in sensed attributes for neighboring nodes. Using H(R) to denote homogeneity for region R, the basic property defining a region R_i is [Petrou 1999] H(R_i) = TRUE, i.e., every region is homogeneous within itself, and H(R_i ∪ R_j) = FALSE for all i ≠ j, where R_i and R_j are adjacent regions. Homogeneity can be found by comparing all attributes or a subset of attributes. Each node, v_i, senses the attributes within its sensing range and stores the data in a measurement vector μ_i. It also transmits to its neighbors a message packet with its identifier and measurement vector. Communication is assumed to be omni-directional and instantaneous over a 1-hop neighborhood. Every node checks the similarity of the values that it senses with those received from each of its neighbors. Similarity is said to obtain if the two nodes sense values which fall within a certain percentage of each other. If a match is found, the node marks this in a match table, and the process continues over several sensing cycles. Once the number of times the node finds a match with a neighbor (as a fraction of the number of trials) exceeds a pre-specified threshold, it deems its neighbor as similar to itself and hence lying in the same region. The requirement of multiple matches even in static or slowly-varying environments ensures robust segmentation in the presence of measurement noise. The node then sets its identifier to the minimum identifier among all its similar neighbors.
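A minimal, centralized simulation of this matching and identifier-propagation step might look as follows. The tolerance, threshold value, and all names below are our own assumptions; the actual protocol runs asynchronously and locally on the nodes:

```python
def similar(mu_a, mu_b, tol=0.1):
    """Two measurement vectors match if every attribute pair lies
    within a fraction `tol` of each other (hypothetical criterion)."""
    return all(abs(a - b) <= tol * max(abs(a), abs(b), 1e-9)
               for a, b in zip(mu_a, mu_b))

def segment_labels(readings, neighbors, cycles, match_threshold=0.8):
    """Count matches over several sensing cycles, then repeatedly adopt
    the minimum identifier among similar neighbors until convergence.
    readings[i] is a list of measurement vectors, one per cycle."""
    labels = {i: i for i in readings}
    changed = True
    while changed:
        changed = False
        for i in readings:
            candidate_ids = [labels[i]]
            for j in neighbors[i]:
                matches = sum(similar(a, b)
                              for a, b in zip(readings[i], readings[j]))
                if matches / cycles >= match_threshold:
                    candidate_ids.append(labels[j])
            new = min(candidate_ids)
            if new != labels[i]:
                labels[i] = new
                changed = True
    return labels
```

After convergence, all nodes in a homogeneous region share the minimum identifier present in that region.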
Every node continues to compare itself to its neighboring nodes until no further similarities are found over a sufficiently long interval. At this point, all the nodes in a region with similar readings have converged to the same identifier value. Even at high densities, sensor nodes do not sample every part of the environment, and complete segmentation requires assignment of segment identifiers to un-sensed cells through inference. For this, we use two very similar algorithms, the KNN algorithm and the IDW algorithm, both implemented by the nodes in a distributed manner. Each node has a cell map of a pre-specified size m x m, which represents the node's subjective estimate of a limited region beyond its area of sensing. In the version of the KNN algorithm implemented, every node finds the k nearest neighbor nodes for each cell within its cell map and decides the label or identifier of the cell based on the plurality of these. The algorithm is tested using different sizes for the cell map and by varying k. By increasing the size of the cell map, a more comprehensive approximation of the environment can be made even at lower densities. A slightly modified version of the KNN algorithm is used in the inverse distance weighting method, in which each of the k neighbors is weighted inversely with its distance to the cell in deciding the label. The weight w_j of a node's label is calculated as w_j = d^-(k+1), where d is the distance of the node to the un-sensed cell and k is the number of neighbors. The label with the greatest cumulative weight is assigned to the cell.
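Both imputation rules for a single un-sensed cell can be sketched as below. The paper gives no pseudocode, so the names are our own; the IDW branch follows the w_j = d^-(k+1) weighting described above, with a small epsilon guarding against zero distance:

```python
from collections import defaultdict
import math

def impute_label(cell, node_positions, node_labels, k=3, idw=False):
    """Assign a label to an un-sensed cell from its k nearest sensor nodes.
    KNN: plurality vote among the k neighbors.
    IDW: each neighbor's vote is weighted by w = d**-(k+1)."""
    ranked = sorted(node_positions,
                    key=lambda n: math.dist(cell, node_positions[n]))[:k]
    votes = defaultdict(float)
    for n in ranked:
        d = math.dist(cell, node_positions[n])
        w = (d + 1e-9) ** -(k + 1) if idw else 1.0
        votes[node_labels[n]] += w
    return max(votes, key=votes.get)
```

For example, a cell with two nearby nodes labelled A and one distant node labelled B is labelled A under both schemes; IDW additionally lets a single very close node dominate.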
4 Results
The quality of segmentation is studied for each of the environment types. We define segmentation metrics based on the number of correctly classified cells found after running the algorithm.
1. Track Environment: In this environment configuration, the number of track cells that are classified as background segment cells and the total number of correctly classified track segment cells (including type 1 and type 2) are found for 3 runs of the algorithm at a specified density. The average accuracy of classification is then calculated and the procedure repeated for different node densities. A comparative plot of the KNN and IDW algorithms at different node densities is shown in Figure 3 for the track environments. Accuracy is defined as the percentage of track 1 and track 2 cells classified correctly. Figures 5-7 depict an example of the reconstruction.
Figure 3: Accuracy plot for tracks
Figure 4: Accuracy plot for patches
2. Patch Environment: The numbers of incorrectly and correctly classified convex segment cells are found, averaged over 3 different runs, and plotted for different node densities. Figure 4 shows the error plot for one of the patch segments. Similar results are obtained for the second type of patch segment. Accuracy is defined as the percentage of patch cells classified correctly.
3. Spiral Environment: Here the number of spiral cells of each type that are classified correctly and misclassified are calculated, averaged over 3 runs, and plotted at different node densities (Figures 8-11). For each environment (3-cell width, 6-cell width, 9-cell width), the percentage accuracy of each spiral type is plotted, where accuracy is defined as the percentage of spiral cells classified correctly. The plots for the environment types with spirals of width 3 cells (Figures 8 and 9) and 9 cells (Figures 10 and 11) are shown.
5 Analysis of Results
Both the KNN algorithm and the inverse distance weighted algorithm show comparable results, with high accuracy even at lower node densities. As expected, the percentage accuracy of classification improves at higher node densities for all environment types, with minor variations, which can be attributed to the variation in node distribution over the segments. Some configurations may show better accuracy levels at low densities that are very close to each other, but a significant increase in accuracy is seen at higher densities.
Figure 5: Actual track environment
Figure 6: Tracks before reconstruction
Figure 7: Tracks after reconstruction at 0.5 node density
Figure 8: Type 1 cells, 3-cell wide spiral
Figure 9: Type 2 cells, 3-cell wide spiral
The IDW algorithm is found to perform much better than the KNN algorithm at low node densities (typically < 0.5), though at higher densities the two algorithms have very similar performance but with greater variation. This might be because at lower densities, nodes that sense differently are more likely to lie further from the segment edges than at higher densities. At higher densities, though the overall accuracy would increase, the performance of the two algorithms would depend on the node distribution resulting from random deployment. Clearly, both the KNN and IDW algorithms can be used with sufficient accuracy for environment inference in un-sensed regions, over a wide range of complexities, with the IDW algorithm showing better performance at lower densities of deployment.
6 Conclusion We have shown that distributed classification of a monitored environment into comprehensive segments can be achieved. The next logical step would be to enable the network to answer global queries about the environment. This work is in progress.
Bibliography
[1] Chintalapudi, K.K., & Govindan, R., 2003, Localized Edge Detection in Sensor Fields, IEEE Workshop on Sensor Networks Protocols and Applications.
[2] Devaguptapu, D., & Krishnamachari, B., April 2003, Applications of localized image processing techniques in wireless sensor networks, SPIE's 17th Annual International Symposium on Aerospace/Defense Sensing, Simulation, and Controls (Aerosense '03), Orlando, Florida.
[3] Estrin, D., Govindan, R., Heidemann, J., & Kumar, S., 1999, Next Century Challenges: Scalable Coordination in Sensor Networks, Proc. MOBICOM, Seattle, 263-270.
[4] Nowak, R., & Mitra, U., 2003, Boundary Estimation in Sensor Networks: Theory and Methods, Proc. IPSN'03, 80-95.
[5] Nowak, R., Mitra, U., & Willet, R., 2004, Estimating Inhomogeneous Fields Using Wireless Sensor Networks, IEEE Journal on Selected Areas in Communications, Vol. 22, No. 6 (Aug.), pp. 999-1006.
[6] Panangadan, A., & Sukhatme, G.S., 2005, Data Segmentation for Region Detection in a Sensor Network, CRES Technical Report 05-005.
[7] Petrou, M., & Bosdogianni, P., 1999, Image Processing: The Fundamentals, John Wiley and Sons.
[8] Pottie, G., Kaiser, W., Clare, L., & Marcy, H., 2000, "Wireless Integrated Network Sensors", Communications of the ACM, 43: 51-58.
[9] Willet, R., & Nowak, R., 2003, Platelets: A multiscale approach to recovering edges and surfaces in photon-limited imaging, IEEE Trans. Med. Imaging, 22: 332-350.
Chapter 4
Obtaining Robust Wireless Sensor Networks through Self-Organization of Heterogeneous Connectivity
Abhinay Venuturumilli
University of Cincinnati
[email protected]
Ali Minai
University of Cincinnati
[email protected]
1. Introduction
A Wireless Sensor Network (WSN) is a set of sensor nodes that can communicate wirelessly with each other across an extended environment. Sensor networks are being used for various military, environmental, human-centric and robotic applications [Arampatzis 2005]. Most of the research on WSNs is focused on networks with identical nodes that have the same transmission range. This creates a homogeneous network whose connectivity can be modeled as an undirected graph. Homogeneous networks are simple to analyze, but are well known to be suboptimal with regard to efficiency, longevity and robustness [Yarvis 2005]. The random deployment of homogeneous nodes results in uneven connectivity with critical nodes, making the network non-robust to node failure. A simple solution to this problem would be to increase the transmission range of all nodes, but this creates undue congestion in other parts of the network. In a heterogeneous network, in contrast, nodes can individually select their transmission range and tune their
connectivity locally without creating congestion. This effectively reduces the number of hops between nodes without increasing bandwidth needs and energy use. Though the resulting networks are more efficient and robust than homogeneous ones, they are difficult to analyze (see Duarte-Melo et al. [Duarte-Melo 2002] for some analysis). Motivated by the considerations discussed above, several algorithms have been proposed for obtaining efficient heterogeneous networks [Ramanathan 2000; Borbash 2002; Liu 2003; Li 2004]. However, these networks are not generally robust to random node failure. Some of the algorithms listed above have been compared and analyzed by Srivastava et al. [Srivastava 2003]. Recently, our laboratory has developed an algorithm for obtaining heterogeneous networks based on reverse engineering [Ranganathan 2006]. In this paper, we propose a distributed algorithm to design a robust and energy-efficient network. The nodes choose different transmission ranges such that the congestion and mean path length between the nodes are minimized. The simulation results reaffirm that the networks obtained by our heuristic outdo homogeneous networks on all performance measures, while still being robust to random node failure.
2. Network Model
The sensor network is modeled as a graph whose vertices (nodes) represent the sensors and edges indicate direct communication between the nodes. The nodes are deployed in a uniform random distribution with density λ in a 2-D unit square area, so that the edges between nodes are undirected in homogeneous networks and directed in heterogeneous ones. We assume that the environment is obstruction-free and that each sensor is aware of its geometric coordinates. The network model used in this paper assumes that every sensor node can choose from two or three transmission ranges. The low-power transmission radius is termed the whisperer radius (r_w), the high-power transmission radius the shouter radius (r_s), while the medium-power transmission radius - for the three-level model - is called the speaker radius (r_t). The whisperer radius is chosen as r_w = α r_p, where α is a constant factor < 1 and r_p is the percolation radius for the network [Stauffer 1994]. The shouter radius is chosen as r_s = β r_p, where β is a constant > 1, and the speaker radius as r_t = γ r_p, where α < γ < β. The nodes that are present within whisperer range of a node are listed in its adjacency list A_w. The nodes between the whisperer and speaker radii are stored in the inner-ring adjacency list A_i, and the nodes between the speaker and shouter radii are stored in the outer-ring adjacency list A_o. Thus, the set of nodes present between the whisperer and shouter radii of a given node comprises the set A_r = A_o ∪ A_i, where A_r is termed the node's ring adjacency list. The current adjacency list A_c for a node depends on whether the node is a whisperer, shouter or speaker.
489
3. Basic Radius Adaptation Algorithm
The fundamental principle underlying the radius adaptation algorithm is for each node to choose the smallest possible radius while maintaining the connectivity achievable by choosing the maximal radius. The basic 2-level case is as follows:
3.1 Initial Setup
Once the nodes are deployed, each sensor node follows a sequence of instructions to gather information about its neighboring nodes. First, every node boosts its transmission range to r_s and broadcasts its coordinate values along with its randomly generated unique identification number, and then cuts back to r_w. Upon receiving coordinate information from each of its neighbors in shouter range, the node calculates the Euclidean distance from each of these nodes to itself. If the distance to a node is less than r_w, it adds that node to its whisperer adjacency list A_w; if the distance is more than r_w, it appends that node to its ring adjacency list A_r. Apart from A_w and A_r, the node stores another list, A_c, which keeps track of its current adjacent nodes depending on the transmission range it is currently using. Since all nodes are initially at whisperer range, A_c equals A_w. The initialization process concludes with the transmission of the current adjacency list A_c by each node to all its whispering neighbors A_w. By the end of the setup process, each node knows which nodes it can transmit to at whisperer and shouter ranges and also the current adjacent nodes of its whispering neighbors.
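The setup step amounts to a per-node computation over the received coordinate broadcasts. A sketch under assumed names (the paper specifies no code):

```python
import math

def build_adjacency(node_id, coords, r_w, r_s):
    """Classify each shouter-range neighbor into the whisperer
    adjacency list A_w or the ring adjacency list A_r, based on
    Euclidean distance to this node."""
    A_w, A_r = [], []
    own = coords[node_id]
    for other, pos in coords.items():
        if other == node_id:
            continue
        d = math.dist(own, pos)
        if d <= r_w:
            A_w.append(other)
        elif d <= r_s:
            A_r.append(other)
    return A_w, A_r
```

For instance, with r_w = 1 and r_s = 2, a neighbor at distance 0.5 lands in A_w, one at distance 1.5 in A_r, and one at distance 5 in neither.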
3.2 Basic Radius Update Procedure
Following the completion of the setup process, each node k traverses its list of ring-adjacent nodes and checks if they are reachable through one of its whispering neighbors. If all the ring nodes are reachable through its whispering neighbors, becoming a shouter would not result in any additional connectivity for this node, and it remains a whisperer. Otherwise, it becomes a shouter, updates its current adjacency list A_c, and then sends the updated A_c to its whispering neighbors. Thus, if one of these neighbors later runs the radius update algorithm, it will use the correct adjacency list for node k: if that neighbor has no ring-adjacent nodes that cannot be reached through k or through one of its other adjacent nodes, it can remain a whisperer. Once this algorithm is run by each node independently and asynchronously, each node is guaranteed to be connected to all its shouter-range neighbors either directly or through a whisperer-range neighbor.
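The basic decision rule can be sketched as follows, assuming each node holds its whispering neighbors' most recently reported current adjacency lists (all names here are our own):

```python
def choose_radius(node, A_w, A_r, current_adj):
    """Basic update rule: remain a whisperer only if every ring-adjacent
    node already appears in the current adjacency list of some whispering
    neighbor; otherwise become a shouter.
    current_adj[n] is the A_c list last reported by node n."""
    reachable = set()
    for w in A_w[node]:
        reachable.add(w)                    # the whispering neighbor itself
        reachable.update(current_adj[w])    # and everything it reaches
    return "whisperer" if set(A_r[node]) <= reachable else "shouter"
```

If a whispering neighbor's reported A_c already covers every ring node, shouting adds no connectivity, which is exactly the criterion in the text.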
3.3 Enhanced Radius Update Procedure
In the above procedure, each node checks only the adjacency lists of its immediate neighbors and, if it finds any ring-adjacent node to be unreachable, it increases its transmission range to r_s without any further search. However, this can lead to an unnecessary increase of transmission radius, since a node can be connected to its ring-adjacent nodes through an alternative path of more than 2 hops. Therefore, an enhanced version of the algorithm implements a broader search by each node to find paths to all unreachable ring nodes through its reachable nodes. Each node k is aware that all other nodes also have at least transmission range r_w, so if it finds two nodes in its ring adjacency list which are within Euclidean distance r_w of each other, it knows that if it can reach at least one of the two nodes, it can also reach the other one through the first. At first, node k implements the basic update algorithm to determine which ring nodes it can reach through its adjacent nodes. It then checks if the remaining nodes in the ring are reachable through one of the reachable ring nodes by calculating the distance between ring nodes. If any new node is added to the list of reachable nodes after this verification, the remaining unreachable nodes' distances to this new reachable node are determined. This process continues until there is either no change in the reachable node list or all the ring nodes are reachable. If all the ring nodes are reachable, the node decides to be a whisperer, but if it has one or more unreachable nodes in the ring, it decides to boost its transmission range to become a shouter.
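The enhanced search amounts to growing the reachable set to a fixed point using pairwise ring-node distances. A sketch under assumed names:

```python
import math

def enhanced_reachable(ring, positions, seed_reachable, r_w):
    """Grow the reachable set transitively: any ring node within r_w of an
    already-reachable node is itself reachable, since every node transmits
    with at least radius r_w. Iterate until no change (fixed point)."""
    reachable = set(seed_reachable)
    changed = True
    while changed:
        changed = False
        for a in ring:
            if a in reachable:
                continue
            if any(math.dist(positions[a], positions[b]) <= r_w
                   for b in reachable):
                reachable.add(a)
                changed = True
    return reachable

def enhanced_choice(ring, positions, seed_reachable, r_w):
    """Whisperer if every ring node ends up reachable, else shouter."""
    full = enhanced_reachable(ring, positions, seed_reachable, r_w)
    return "whisperer" if set(ring) <= full else "shouter"
```

Here `seed_reachable` is the set of ring nodes found reachable by the basic procedure; the loop then adds ring nodes chained together within distance r_w.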
3.4 Three-Level Algorithm
In the algorithm described above, a node decides either to keep a whisperer transmission range or to change to a shouter transmission range depending on ring connectivity. An improved solution aims to achieve better congestion control by extending the transmission range to three levels, viz., whisperer, speaker and shouter. In this algorithm, each node initially runs the original 2-level algorithm for the speaker and shouter transmission ranges. The node thus decides either to remain at speaker range or to change to shouter range. If the node changes its transmission range to shouter, the decision is final, but if the node remains at speaker range, it goes through another decision cycle to decide if it can reduce its transmission range further to whisperer range. In the second cycle, it again runs the 2-level algorithm, but this time for the whisperer and speaker radii. If the node can ensure that it can reach its inner-ring adjacent nodes by transmitting at whisperer range, it reduces its transmission range to r_w. Thus, after the algorithm is run, there are three different transmission ranges in the network. The networks obtained by the 2-level and 3-level algorithms are compared against each other and with equivalent homogeneous networks. The results are discussed in the next section.
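Structurally, the 3-level procedure is just the 2-level decision applied twice. A schematic sketch, where `decide` stands for any 2-level procedure returning the chosen (smaller or larger) radius for the node; the wrapper itself is our illustration, not code from the paper:

```python
def three_level(decide, node, r_w, r_t, r_s):
    """Run the 2-level decision for (speaker, shouter) first; a shouter
    decision is final, otherwise run it again for (whisperer, speaker)."""
    first = decide(node, r_t, r_s)   # speaker vs shouter
    if first == r_s:
        return r_s                   # shouter: decision is final
    return decide(node, r_w, r_t)    # whisperer vs speaker
```

With a stub decision procedure, a node that needs the larger radius ends at r_s, while a well-connected node falls all the way down to r_w.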
4. Results and Discussion
In the simulations, nodes are deployed in a unit square area with a uniform random distribution. For the homogeneous case, all nodes are assigned the percolation radius as the transmission radius. For heterogeneous networks, the coefficients α and β are taken as 0.8 and 1.25 respectively, and γ is assigned a value of 1. For heterogeneous networks, the nodes are allowed to self-organize with the radius update algorithm. Once the algorithm is run by the nodes in the network, the self-organized heterogeneous networks (both 2-level and 3-level) and homogeneous networks are compared on the following performance measures: 1) Maximum size of the strongly connected component (SCC); 2) Congestion (mean in-degree) of the network; 3) Average inverse shortest path length (AISPL) among all node pairs; and 4) Mean transmission radius. Also, these homogeneous and heterogeneous networks are subjected to different levels of random node failure, and the effect on the maximum connected component size is compared to evaluate robustness. The simulations were performed on networks ranging from 200 to 1000 nodes. For each of these network sizes, the performance measures were averaged over 100 different network configurations. The comparison of strongly connected component sizes for intact networks is shown in Figure 1, with the values normalized by the network size. Clearly, the size of the strongly connected component is consistently larger in heterogeneous networks when compared to homogeneous ones. By construction of the algorithm, the size of the SCC should be the same as what it would be if all the nodes in the network were shouters. Thus, it is observed that both the 2- and 3-level algorithms have the same connected component size. The difference between the two algorithms can be seen in the other performance measures discussed next. In Figure 2, the mean transmission radius for the different cases is compared for networks of different sizes. It can be seen that the mean radius is consistently lowest
Figure 1. Component Size with Error Bars
Figure 2. Mean Transmission Radius
for the 3-level model and highest for the homogeneous case. Since the speakers in the 3-level case would actually be shouters in the 2-level algorithm, the former leads to a lower mean radius than the latter. However, it is interesting to see that the mean transmission radii of both heterogeneous algorithms are less than the percolation radius. At the same time, they also maintain very high connectivity compared to a homogeneous network, demonstrating the advantage of optimized heterogeneity.
The average congestion for networks of various sizes is shown in Figure 3. It is clear that, like mean radius, congestion is lowest for the 3-level model, while the homogeneous case has the highest congestion. The congestion in a homogeneous network is high because homogeneity forces nodes in dense areas to have needlessly large radii and, therefore, high degree. The reason for higher congestion in the 2-level model compared to the 3-level model is that the nodes which choose to become speakers in the latter case have to become shouters in the 2-level model. One concern in optimizing transmission radii to minimize congestion without sacrificing connectivity is that it might lead to longer paths between nodes. To check this, we evaluated the networks for the average inverse shortest path length between all node pairs. We used the inverse rather than the direct length to conveniently account for disconnected node pairs [Beygelzimer 2005]. Interestingly, though the heterogeneous networks have low mean radius and less congestion compared to the homogeneous networks, they still have high AISPL. From Figure 4, it can be seen that the AISPL value is lowest (i.e., worst) for homogeneous networks and highest for the 2-level model. The reason for the high AISPL in the 2-level model is the greater fraction of shouters, some of which become speakers in the 3-level case. The percentage of shouters in the networks is shown in Figure 5. The fraction of shouters in the 2- and 3-level models is approximately 25% and 10%, respectively, and is independent of the network size, while the speaker fraction in the 3-level model is approximately 15%. Thus, the sum of the shouter and speaker fractions in the 3-level model is the same as the shouter fraction in the 2-level model.
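The inverse-length trick handles disconnection naturally: a disconnected pair contributes an inverse distance of zero instead of breaking the average. A minimal sketch of the AISPL computation is given below. This is our stdlib-only illustration (the adjacency-dict graph representation is an assumption), not the authors' code:

```python
from collections import deque

def aispl(adj):
    """Average inverse shortest path length over all ordered node pairs.

    adj: dict mapping node -> list of successors (directed graph).
    Disconnected pairs contribute 0 (inverse of an infinite distance),
    which is why the inverse form accounts for fragmentation gracefully.
    """
    nodes = list(adj)
    n = len(nodes)
    total = 0.0
    for src in nodes:
        # BFS from src gives hop distances to all reachable nodes.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != src)
    return total / (n * (n - 1)) if n > 1 else 0.0
```

On a directed 3-cycle every node reaches the others at distances 1 and 2, so the AISPL is (1 + 1/2)/2 = 0.75; a fully disconnected graph scores 0.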
Another important concern in highly optimized heterogeneous systems is robustness to random node failure [Dekker 2004, Crucitti 2004, Paul 2004], since optimization tends to "squeeze out" all the redundancy in the system. To evaluate robustness, we subjected both homogeneous and heterogeneous networks to random node failure. Node removal was sampled over 10 independent instances for each network layout. Robustness, measured by the size of the largest strongly connected component in the damaged network, was evaluated for 200, 300 and 400 node networks, though results are shown for 300 nodes only. From Figure 6, it can be observed that heterogeneous networks are more robust than homogeneous ones even at 35% node failure. It can also be observed that robustness for 3-level networks drops at a faster rate than for the 2-level case, which again reflects the larger shouter population in the latter. In conclusion, it has been shown that heuristically obtained heterogeneous networks surpass homogeneous networks on several performance metrics. It has also been shown through simulations that the heterogeneous networks are more robust than the homogeneous networks in the presence of random node failures. Thus, the proposed heuristics represent a very useful approach that enhances network efficiency while maintaining, and in fact greatly improving, other critical performance metrics.
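The robustness experiment described above can be sketched as follows. This stdlib-only Python illustration is ours (the graph representation and sampling details are assumptions, not the paper's code): it removes a random fraction of nodes and measures the surviving largest strongly connected component using Kosaraju's algorithm:

```python
import random

def largest_scc(adj):
    """Size of the largest strongly connected component (Kosaraju's algorithm)."""
    nodes = list(adj)
    # First pass: record finish order with an iterative DFS.
    seen, order = set(), []
    for s in nodes:
        if s in seen:
            continue
        stack = [(s, iter(adj[s]))]
        seen.add(s)
        while stack:
            u, it = stack[-1]
            for v in it:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, iter(adj[v])))
                    break
            else:
                order.append(u)
                stack.pop()
    # Second pass: DFS on the transposed graph in reverse finish order.
    radj = {u: [] for u in nodes}
    for u in nodes:
        for v in adj[u]:
            radj[v].append(u)
    seen, best = set(), 0
    for s in reversed(order):
        if s in seen:
            continue
        size, stack = 0, [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            size += 1
            for v in radj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def damaged_scc(adj, fail_fraction, seed=0):
    """Remove a random fraction of nodes; return the surviving largest SCC size."""
    rng = random.Random(seed)
    survivors = set(rng.sample(list(adj), int(len(adj) * (1 - fail_fraction))))
    sub = {u: [v for v in adj[u] if v in survivors] for u in survivors}
    return largest_scc(sub)
```

Averaging `damaged_scc` over several seeds at increasing `fail_fraction` values reproduces the kind of robustness curves compared in Figure 6.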
References

Arampatzis, T., Lygeros, J., & Manesis, S., 2005, A survey of applications of wireless sensors and wireless sensor networks, Proceedings of the 13th Mediterranean Conference on Control and Automation (Limassol, Cyprus).

Paul, G., Tanizawa, T., Havlin, S., & Stanley, H.E., 2004, Optimization of robustness of complex networks, The European Physical Journal B, 38, pp. 187-191.

Li, N., & Hou, J.C., 2004, Topology control in heterogeneous wireless networks: problems and solutions, Proceedings of IEEE INFOCOM 2004 (Hong Kong).

Yarvis, M., Kushalnagar, N., Singh, H., Rangarajan, A., Liu, Y., & Singh, S., 2005, Exploiting heterogeneity in sensor networks, Proceedings of the 24th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM) (Miami).

Duarte-Melo, E., & Liu, M., 2002, Analysis of energy consumption and lifetime of heterogeneous wireless sensor networks, Proceedings of IEEE Globecom '02 (Taiwan).

Stauffer, D., & Aharony, A., 1994, Introduction to Percolation Theory. London: Taylor & Francis.

Ramanathan, R., & Hain, R., 2000, Topology control in multihop wireless networks using transmit power adjustment, Proceedings of IEEE INFOCOM 2000 (Israel), pp. 404-413.

Ranganathan, P., Ranganathan, A., Berman, K., & Minai, A.A., 2006, Discovering adaptive heuristics for ad-hoc sensor networks by mining evolved optimal configurations, Proceedings of the 2006 World Congress on Computational Intelligence (Vancouver, Canada).

Borbash, S.A., & Jennings, E., 2002, Distributed topology control algorithm for multihop wireless networks, Proceedings of the 2002 World Congress on Computational Intelligence (Honolulu, HI).

Srivastava, G., Boustead, P., & Chicharo, J.F., 2003, A comparison of topology control algorithms for ad-hoc networks, Proceedings of the 2003 Australian Telecommunications, Networks and Applications Conference (Melbourne).

Liu, J., & Li, B., 2003, Distributed topology control in wireless sensor networks with asymmetric links, Proceedings of IEEE Globecom '03, Wireless Communications Symposium (San Francisco), pp. 1257-1262.

Beygelzimer, A., Grinstein, G.E., Linsker, R., & Rish, I., 2005, Improving network robustness by edge modification, Physica A 357, pp. 593-612.

Dekker, A.H., & Colbert, B.D., 2004, Network robustness and graph topology, Proceedings of the 27th Australasian Computer Science Conference (New Zealand), Estivill-Castro, V., ed., Conferences in Research and Practice in Information Technology, 26, pp. 359-368.

Crucitti, P., Latora, V., Marchiori, M., & Rapisarda, A., 2004, Error and attack tolerance of complex networks, Physica A 340, pp. 388-394.
Chapter 5 Self-organizing Text in an Amorphous Environment Orrett Gayle The University of the West Indies, Mona [email protected] Daniel Coore The University of the West Indies, Mona [email protected]
In an amorphous computing environment, myriad irregularly located computing elements asynchronously execute a common program and communicate locally to produce some pre-specified emergent behaviour. We have implemented a mechanism for robustly generating patterns of self-organising text in an amorphous computing environment. Our method uses the Growing Point Language (GPL), presented in [3], and builds on ideas put forward there for constructing text-like patterns. One shortcoming of that technique was that the patterns for producing a text character were sensitive to signals that were localised to the origin of the text. As a result, the methods for generating these patterns did not scale well enough to produce arbitrarily many of them. We have found a way to allow the conditions that govern the formation of a character to propagate arbitrarily far from the starting point, thereby allowing us, in principle, to produce arbitrarily long concatenations of text characters. We used a pen metaphor, so that the description of our self-organising text is similar to text being drawn on paper, namely, by a series of interconnected strokes. Using this pen metaphor, characters may be combined via the GPL network abstraction in a perfectly natural way to produce words. We also implemented a general synchronizing mechanism which ensures that the information used to produce one character does not interfere negatively with the production of the subsequent character. This mechanism permits long strings to be drawn reliably, in a manner that scales well with the number of text characters to be drawn. Our simulations show that we can produce strings that are longer than we were previously capable of producing, and that short strings are produced more reliably.
1 Introduction
In an amorphous computing environment, myriads of simple computing elements interact locally, under the control of a common program, to produce some pre-specified coherent behavior. The goal of amorphous computing is to find programming paradigms that allow us to engineer the emergence of coherent behaviors so that they can be used to produce self-organization on a massive scale, hitherto observed only in nature [1]. Inducing a self-organised pattern on an amorphous computer is particularly difficult because amorphous computing elements a priori have no information about their position, their connectivity, nor any notions of orientation such as left, right, up or down. Several approaches to controlling the complexity of an Amorphous Computer have been developed [3, 7, 6], and more recently [4, 2], each with a specific goal of producing a particular type of emergent behaviour. The Growing Point Language (GPL), developed by Coore [3], is especially suited for producing self-organizing patterns that are primarily topological rather than geometric. As an application of GPL, Coore [3] showed how simple text characters could be made to self-organise. This was done merely as a demonstration of the possibilities with GPL. This implementation had a few problems, the most important of which was that it had difficulty producing strings longer than three characters.
Figure 1: A self-organizing "SYSTEM". Each dot represents a processor. Its colour represents the state that it has assumed after running a common program. The initial conditions included assigning the state of the bottom line, and special status to three other processors located near the bottom left corner of the text.

We present here an extension of this method that scales to arbitrarily long strings (provided there are resources available within the environment). Figure 1 illustrates that our enhanced version is capable of producing much longer strings. One important by-product of this work is a synchronizing mechanism to coordinate repeated, inter-dependent, distributed "spatial processes".
2 Background
The Growing Point Language (GPL) is a programming language for specifying interconnect topologies in a coordinate-free way. The behaviour of a GPL program is determined by: its instructions, the collection of processing elements that execute it (called the domain) and a set of initial conditions. The principal concept in GPL is the growing point, which describes a path in the GPL domain. At any given time, an instance of a growing point resides at a single location in the domain, called its active site. A growing point has a tropism, which is expressed as an affinity for either increasing, decreasing, or constant pheromone concentrations in the vicinity of the growing point. As a growing point's active site moves from one location to a neighbouring one (according to its tropism), we say that the growing point propagates. An active site may secrete a pheromone, which initiates a diffusion process (as described in [5]) that is centred at the active site. When the active site ceases execution at a location, and it does not propagate to a neighbouring location, then we say the growing point has terminated (at that location). The trajectory of the growing point is the sequence of locations that its active site visits. The active site may also deposit some material at its current location. The material may be detected by other active sites, and may be used to influence their decisions. For example, we can determine where the trajectories of two growing points intersect by having each one sense the material deposited by the other. In pattern formation problems, we use materials to colour-code the trajectories, and compare the collection of all trajectories with a material against the desired pattern. Logical groupings of growing points in a GPL program are called networks. A growing point can be viewed as a relation between the location of its invocation and the location(s) of its termination.
The network abstraction is an extension of this idea: it defines a relation between its inputs, the locations from where growing points may be invoked, and its outputs, a subset of the termination locations of those growing points. Compound networks can be defined by allowing the outputs of one network to act as the inputs of another; such a connection is called a cascade of networks.
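Outside of GPL, the propagation rule can be mimicked by a greedy walk on a pheromone field. The sketch below is our illustration (not GPL itself, and the grid representation is an assumption): an active site with an affinity for increasing concentrations moves to the best neighbour, and the growing point terminates when no neighbour improves the concentration:

```python
def propagate(field, start):
    """Trace a growing point with affinity for increasing pheromone levels.

    field: dict mapping (x, y) -> pheromone concentration.
    start: the initial active site.
    Returns the trajectory, i.e. the sequence of locations visited.
    """
    trajectory = [start]
    site = start
    while True:
        x, y = site
        # Candidate neighbouring locations (4-neighbourhood).
        neighbours = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        best = max((p for p in neighbours if p in field),
                   key=lambda p: field[p], default=None)
        if best is None or field[best] <= field[site]:
            return trajectory  # the growing point terminates here
        site = best
        trajectory.append(site)
```

With a field that increases along a line, the trajectory simply follows the gradient to its peak; a tropism for decreasing or constant concentrations would only change the comparison in the termination test.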
3 Implementation
The program directives for generating text are arranged in three layers of abstraction. At the lowest level are the most primitive growing points required for drawing line segments and rays in various directions. Recall that there is no a priori knowledge of vertical and horizontal directions; they are relative to a reference line which is established from the initial conditions of the environment. We use a metaphor of a pen making strokes to define growing points that produce line segments relative to the reference line. The second layer of abstraction is the collection of character-shape descriptions. These are implemented as GPL networks, which are composed from the pen stroke growing points of the lower layer. The third (and highest) layer is the word pattern itself: it is implemented as a GPL network that is formed by cascading letter networks. When the programmer wishes to construct a self-organising word, she defines a cascade of letter networks to spell the word she wants, configures a domain
to indicate where the bottom left corner of the network and the reference line should be, and executes the program on the domain. We now describe how each abstraction layer is implemented, starting from the bottom layer.
3.1 Determining up, down, left and right
Two pheromones, base-line-long and dir-pheromone, are responsible for establishing the vertical and horizontal directions, respectively. We construct a reference line that secretes base-line-long. The growing points for drawing vertical lines have tropisms that are sensitive to base-line-long: away from the line is up and towards the line is down. The initial secretion of dir-pheromone is produced from a point specified in the initial conditions, which should be to the left of the reference line. So, a constant concentration of base-line-long pheromone indicates a direction parallel to the reference line; the direction of increasing concentrations of dir-pheromone is left and the direction of decreasing concentrations is right. We defined a library of growing points based on these derived directions to move the pen: some that leave a mark (e.g. up-pen, right-pen) and some that do not (e.g. lift-pen :right). For example, the essential part of the up-pen definition is:

    (define-growing-point (up-pen length)
      (material ink)
      (tropism (ortho- base-line-long))  ; away from reference line
      ;; other characteristics omitted
      (actions  ; maintain conditions for nearby pen growing points
        (when ((< length 1) (terminate))
              (default (propagate (- length 1))))))
3.2 Character Networks
We defined each character as a GPL network of two inputs and two outputs. In order to allow arbitrary length strings of characters, character definitions must include growing points that not only define their shape, but also extend the reference line¹. One input point marks the start of the character shape, the other marks the start of the reference line. The corresponding outputs are produced when the growing points initiated from those inputs terminate. These outputs are used as the input terminals for a subsequent character when character networks are combined to form a network for the whole string. For example, here is the definition of the character 'L':

¹So, in fact, when a character shape is drawn, it simultaneously draws the portion of the reference line that will support the next character.
    (define-network (L (txt-in line-in) (txt-out line-out))
      (at txt-in
          (start-gp up-pen CHAR-HEIGHT)                         ; upright of L
          (--> (start-gp right-pen (/ CHAR-WIDTH 2))            ; base of L
               (--> (start-gp lift-pen :right (/ CHAR-WIDTH 2)) ; invisible
                    (->output txt-out))))                       ; use endpoint as output point
      (at line-in
          (--> (start-gp base-line CHAR-WIDTH)                  ; propagate base line
               (->output line-out))))

The at command allows us to initiate growing points from a network's input. The --> symbol is an abbreviation for the connect command, which allows the
termination point of the first growing point started in the expression to serve as the initial point of the growing point(s) that follows. The ->output command causes a point to behave as the output of a network. The expressions CHAR-HEIGHT and CHAR-WIDTH are constants that determine the dimensions of a character in terms of neighbourhood hops. Figure 2 illustrates the connection between the implementation and the emergent pattern caused by the L network.
Figure 2: Trajectories of growing points used by the L network are shown, along with the direction of growth, and a sequence number indicating when each growing point starts.
3.3 Word Organization
A network for a string is defined as a cascade of the individual networks for the characters in the string. The network definition below shows how the word "LIFE" as a self-organising piece of text was constructed using our GPL libraries. Observe how intuitive the definition of the text appears at this highest level of abstraction.

    (define-network (LIFE (txt-in line-in) (txt-out line-out))
      (==> (txt-in line-in)
           L txt-director I txt-director F txt-director E
           (txt-out line-out)))
Figure 3 illustrates the execution of this program through a sequence of snapshots of the domain. The colour of a processor is an indication of the material that has been deposited (by growing points) at that processor's location. The symbol ==> is an abbreviation for cascade. The name of the network is LIFE; it has two inputs (txt-in and line-in) and two outputs (txt-out and line-out). In order to start the program, we will need to supply two points
Figure 3: The evolution of a self-organizing text pattern. Colours represent materials that have been deposited at each point (here, there are only three materials: none, ink, line).
of the domain that will behave as these two input locations. The expression starting with ==> says to use the two inputs to the LIFE network as inputs to the L network (defined previously), then use its outputs as inputs to the txt-director network, whose outputs are then used as inputs to the I network, and so on, until the outputs of the E network are supplied as the outputs of the overall LIFE network. The purpose of the txt-director network is to restore the conditions necessary to draw the next character; it will be explained in more detail shortly. Note that the cascade combinator allows us to build even bigger networks by combining word networks in the same way that letters are combined to make words.
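In a conventional language, the cascade combinator amounts to left-to-right function composition. The Python sketch below is our illustration only (the stand-in letter networks are hypothetical; real GPL networks relate spatial locations in the domain, not numbers):

```python
def cascade(*networks):
    """Compose networks left to right: the outputs of each network
    become the inputs of the next, mirroring GPL's ==> combinator."""
    def composed(inputs):
        for net in networks:
            inputs = net(inputs)
        return inputs
    return composed

# Hypothetical stand-ins for letter networks: each one simply shifts its
# (txt, line) input points one character-width to the right.
def letter(name, width=4):
    def net(points):
        txt_in, line_in = points
        return (txt_in + width, line_in + width)
    return net

# A network for "LIFE" as a cascade of letter networks.
LIFE = cascade(letter('L'), letter('I'), letter('F'), letter('E'))
```

Because `cascade` returns another network-like function, word networks can themselves be cascaded into larger ones, just as the text describes.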
3.4 Propagating direction information
Given a process of constructing individual characters, the problem of making the process scale to producing arbitrarily long strings is reducible to replenishing the source of dir-pheromone after each character is drawn. In other words, we desire a kind of inductive property on our text strings: if the current character has sufficient information to determine left/right and up/down, then the subsequent character will also. Then, if we ensure that our initial conditions provide enough information for the first character to be formed properly, the remainder of our text will be formed properly as well. The earliest attempt at generating self-organising text with GPL did not attempt to establish this inductive property: it used a one-time initial secretion of dir-pheromone over a range far enough to cover the distance of all of the characters to be drawn. The initial attempts at achieving this inductive property,
described in [3], embedded replenishing secretions of dir-pheromone within the definitions of the character networks, usually defined to occur from somewhere near the centre of the character. This had two problems: the definition of a character was now more complex, and the replenishing secretion interfered with the construction of the current character. In our current approach, we defined the txt-director network, to be cascaded with character networks, whose sole function is to secrete dir-pheromone far enough to provide guidance to one character. We then cascade an instance of the txt-director network in between successive character networks. This takes care of the replenishment of dir-pheromone, but not of its interference with the current character.
Figure 4: The Synchronizing Mechanism. Top row: the synchronizing growing point within the 'L' network. Bottom row: the role of the txt-director network (shown with a dotted line) between consecutive characters. The circles show the extents of homing pheromone (smaller) and dir-pheromone (larger).
Our solution to this second problem was to use a pair of growing points to implement a synchronization mechanism that ensured that a character's network would be completely drawn before the subsequent network started. Figure 4 illustrates the mechanism in action. Specifically, the point to become the output of the character network secretes a homing pheromone. The final segment of the character's pattern is then sequenced (using connect) with a growing point that seeks out the homing pheromone. Upon finding the source of the homing pheromone, the growing point terminates and yields the output point of the network. Since this network is cascaded onto a txt-director network, the overall effect is that the secretion of dir-pheromone for the subsequent character does not take place until the current character has been completely drawn.
4 Discussion
We have presented an improvement to a GPL implementation of self-organising text. Our implementation is modular, in that both the production of characters
as well as the maintenance of the necessary signals have been captured by GPL networks. This gives us the power to combine them freely and robustly, in much the same way that a digital circuit designer may work with logic gates over transistors. The potential for interference between networks will always be present because there are a finite number of pheromones and an unbounded number of uses of them in generating a pattern that has unbounded extent. In this implementation, we resolved this interference by synchronization, but there are alternative approaches, which we intend to explore. Qualitatively, we have shown that non-trivial patterns can be engineered to emerge in a complex system from relatively simple interactions between the system's elements. We believe that the techniques we have used can be generalised to solve other types of pattern formation problems. At the very least, it is clear that with the appropriate abstractions, we can apply traditional engineering techniques to controlling complex systems.
Bibliography

[1] ABELSON, Harold, Don ALLEN, Daniel COORE, Chris HANSON, George HOMSY, Thomas F. KNIGHT Jr., Radhika NAGPAL, Erik RAUCH, Gerald Jay SUSSMAN, and Ron WEISS, "Amorphous computing", CACM 43, 5 (2000), 74-82.

[2] BEAL, Jacob, "Programming an amorphous computational medium", Unconventional Programming Paradigms (J.-P. BANATRE, P. FRADET, J.-L. GIAVITTO, and O. MICHEL eds.), (2004), 121-136.

[3] COORE, Daniel, Botanical Computing: A Developmental Approach to Generating Interconnect Topologies on an Amorphous Computer, PhD thesis, MIT (January 1999).

[4] COORE, Daniel, "Abstractions for directing self-organising patterns", Unconventional Programming Paradigms (J.-P. BANATRE, P. FRADET, J.-L. GIAVITTO, and O. MICHEL eds.), (2004), 110-120.

[5] COORE, Daniel, and Radhika NAGPAL, "Implementing reaction-diffusion on an amorphous computer", 1998 MIT Student Workshop on High-Performance Computing in Science and Engineering, (1998), 6-1 - 6-2.

[6] KONDACS, Attila, "Biologically-inspired self-assembly of two-dimensional shapes using global-to-local compilation", International Joint Conference on Artificial Intelligence (2003).

[7] NAGPAL, Radhika, Programmable Self-Assembly: Constructing Global Shape using Biologically-inspired Local Interactions and Origami Mathematics, PhD thesis, MIT (June 2001).
Chapter 6
SELF-LEARNING INTELLIGENT AGENTS FOR DYNAMIC TRAFFIC ROUTING ON TRANSPORTATION NETWORKS Adel Sadek Dept. of Civil and Environmental Engineering and Dept. of Computer Science University of Vermont [email protected] Nagi Basha Dept. of Computer Science University of Vermont [email protected]
Abstract Intelligent Transportation Systems (ITS) are designed to take advantage of recent advances in communications, electronics, and Information Technology to improve the efficiency and safety of transportation systems. Among the several ITS applications is the notion of Dynamic Traffic Routing (DTR), which involves generating "optimal" routing recommendations to drivers with the aim of maximizing network utilization. In this paper, we demonstrate the feasibility of using a self-learning intelligent agent to solve the DTR problem to achieve traffic user equilibrium in a transportation network. The core idea is to deploy an agent to a simulation model of a highway. The agent then
learns by itself by interacting with the simulation model. Once the agent reaches a satisfactory level of performance, it can then be deployed to the real world, where it would continue to learn how to refine its control policies over time. To test this concept, the Cell Transmission Model (CTM) developed by Carlos Daganzo of the University of California at Berkeley is used to simulate a simple highway with two main alternative routes. With the model developed, a Reinforcement Learning Agent (RLA) is developed to learn how to best dynamically route traffic, so as to maximize the utilization of existing capacity. Preliminary results obtained from our experiments are promising. RL, being an adaptive online learning technique, appears to have great potential for controlling a stochastic dynamic system such as a transportation system. Furthermore, the approach is highly scalable and applicable to a variety of networks and roadways.
1.0 Introduction

In recent years, there has been a concerted effort aimed at taking advantage of the advances in communications, electronics, and Information Technology in order to improve the efficiency and safety of transportation systems. Within the transportation community, this effort is generally referred to as the Intelligent Transportation Systems (ITS) program. Among the primary ITS applications is the notion of Dynamic Traffic Routing (DTR), which involves routing traffic in real-time so as to maximize the utilization of existing capacity. The solution to the DTR problem involves determining the time-varying traffic splits at the different diversion points of the transportation network. These splits could then be communicated to drivers via Dynamic Message Signs or in-vehicle display devices. Existing approaches to solving the DTR problem have their limitations. This paper proposes a solution for highway dynamic traffic routing based on a self-learning intelligent agent. The core idea is to deploy an agent to a simulation model of a highway. The agent will then learn by itself through interacting with the simulation model. Once the agent reaches a satisfactory level of performance, it could then be deployed to the real world, where it would continue to learn how to refine its control policies over time. The advantages of such an approach are quite obvious given the fact that real-world transportation systems are stochastic and ever-changing, and hence are in need of on-line, adaptive agents for their management and control.
1.1. Reinforcement Learning

Among the different paradigms of soft computing and intelligent agents, Reinforcement Learning (RL) appears to be particularly suited to address a number of the challenges of the on-line DTR problem. RL involves learning what to do and how to map situations to actions to maximize a numerical reward signal (Kaelbling, 1996; Kretchmar, 2000; Abdulhai and Kattan, 2003; Russell and Norvig, 2003). A Reinforcement Learning Agent (RLA) must discover on its own which actions to take to get the most reward. The RLA learns this by trial and error: the agent learns from its mistakes and comes up with a policy, based on its experience, to maximize the attained reward (Sutton and Barto, 2000). The field of applying RL to transportation management and control applications is still in its infancy. A very small number of studies could be identified
from the literature. A Q-learning algorithm (a specific implementation of reinforcement learning) is introduced in Abdulhai et al. (2003) to study the effect of deploying a learning agent using Q-learning to control an isolated traffic signal in real-time on a two-dimensional road network. Abdulhai and Pringle (2003) extended this work to study the application of Q-learning in a multi-agent context to manage a linear system of traffic signals. The advantage of having a multi-agent control system is to achieve robustness by distributing the control rather than centralizing it, even in the event of communication problems. Finally, Choy et al. (2003) develop an intelligent agent architecture for coordinated signal control and use RL to cope with the changing dynamics of the complex traffic processes within the network.
2.0 Purpose and Scope

The main purpose of this study is to show the feasibility of using RL for solving the problem of providing online Dynamic Route Guidance for motorists, through a set of experiments that show how an RL-based agent can provide reasonable guidance for a simple network that has two main routes. Figure 1 shows the simple network used in this study. It should be noted that this network is largely similar to the test network used by Wang et al. (2003) in evaluating predictive feedback control strategies for freeway network traffic.
Figure 1: Network Topology

The network has three origins, O1, O2, and O3, and three destinations, D1, D2, and D3. Each origin generates a steady flow of traffic. Traffic disappears when it reaches any of the three destinations. The length in miles of each link is indicated on the graph; for example, L0 has a length of 2 miles. The capacity of all links is 4000 veh/h, except for L0, which has a capacity of 8000 veh/h. All links have two lanes except L0, which has 4 lanes. As can be seen from Figure 1, there are two alternate routes connecting origin O1 to destination D1. The primary route (route A) has a total length of 6.50 miles, whereas the secondary route (route B) is 8.50 miles long, and is therefore longer than route A. The intelligent RL agent is deployed at the J1 junction. The goal of the agent is to determine an appropriate diversion rate at J1 so as to achieve traffic user equilibrium between the two routes connecting zones O1 and D1 (i.e. so that travel times along routes A and B
are as close to each other as possible), taking into consideration the current state of the system.
3.0 Methodology

3.1 Cell Transmission Model

In this study, we selected the Cell Transmission Model (CTM) to build the simulation model, with which the RL agent would interact to learn for itself the best routing strategies. The CTM was developed by Daganzo to provide a simple representation of traffic flow capable of capturing transient phenomena such as the propagation and dissipation of queues (Daganzo, 1994; 1995). The model is macroscopic in nature, and works by dividing each link of the roadway network into smaller, discrete, homogeneous cells. Each cell is appropriately sized to permit a simulated vehicle to traverse the cell in a single time step at free-flow traffic conditions. The state of the system at time t is given by the number of vehicles contained in each cell, ni(t). Daganzo showed that by using an appropriate model for the relationship between flow and density, the cell-transmission model can be used to approximate the kinematic wave model of Lighthill and Whitham (1955). For this study, a C++ implementation of the CTM was developed and used to simulate the test network.

3.2. The Intelligent Agent

As previously mentioned, RL was the paradigm chosen to develop the intelligent, learning agent that will be used for dynamic traffic routing. Specifically, the learning algorithm implemented in the agent is based on the SARSA algorithm, which is an on-policy Temporal Difference (TD) learning implementation of reinforcement learning (Sutton and Barto, 2000). SARSA is a temporal difference algorithm because, like Monte Carlo methods, it can learn directly from experience without requiring a model of the dynamics of the environment. Like Dynamic Programming methods (Bertsekas, 2000), SARSA updates its estimates of the desirability of state-action pairs based on earlier estimates; i.e., SARSA does bootstrapping.
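Returning to the cell dynamics of Section 3.1, the CTM update can be sketched for a single homogeneous link. The Python illustration below is ours, not the paper's C++ implementation; it assumes a unit backward-wave-speed ratio and an unobstructed exit at the downstream end of the link:

```python
def ctm_step(n, inflow, Q, N):
    """One time step of a cell-transmission-style update on a single link.

    n: vehicles currently in each cell; inflow: demand entering cell 0;
    Q: maximum flow per step across any boundary; N: holding capacity per cell.
    The flow into cell i is limited by upstream supply, the flow cap Q,
    and the remaining space in cell i.
    """
    # y[i] is the flow crossing the boundary into cell i; y[len(n)] exits the link.
    y = [min(inflow, Q, N - n[0])]
    for i in range(1, len(n)):
        y.append(min(n[i - 1], Q, N - n[i]))
    y.append(min(n[-1], Q))  # free discharge downstream
    # Conservation: each cell gains its inflow and loses its outflow.
    return [n[i] + y[i] - y[i + 1] for i in range(len(n))]
```

Iterating `ctm_step` propagates a platoon of vehicles down the link at one cell per step under free flow, while a small `Q` or `N` creates the growing and dissipating queues the text mentions.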
For a complex, unpredictable, and stochastic system like a transportation system, SARSA seemed very suitable for adapting to the nature of an ever-changing system. The implementation of the SARSA algorithm is quite simple. Each state-action pair (s,a) is assigned an estimate of the desirability of being in state s and taking action a. The desirability of each state-action pair can be represented by a function Q(s,a). The idea of SARSA is to keep updating the estimates of Q(s,a) based on earlier estimates of Q(s,a), for all possible states and all possible actions that can be taken in each state. Equation (1) shows how the Q(s,a) values are updated:

Q(s_t, a_t) ← Q(s_t, a_t) + α[r_{t+1} + γ Q(s_{t+1}, a_{t+1}) − Q(s_t, a_t)]    (1)

where α is the step-size parameter, or learning rate, and γ is the discount factor. According to Equation (1), at time t the system was in state s_t, and the agent decided to take action a_t. This resulted in moving the system to state s_{t+1} and obtaining a reward of r_{t+1}. Equation (1) is thus used to find new estimates of the Q-values for a new iteration as a function of the
values from the previous iteration. The algorithm typically goes through several iterations until it converges to the optimal values of the Q-estimates.
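The tabular SARSA update of Equation (1) can be sketched as below. This is a generic illustration, not the authors' implementation; the learning parameters, the lazy zero-initialized table, and the ε-greedy action selection are standard choices assumed here, not details given in the paper.

```python
import random

# Generic tabular SARSA sketch (illustrative; not the authors' code).
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
Q = {}  # Q[(state, action)] -> desirability estimate, lazily initialized to 0


def q(s, a):
    return Q.get((s, a), 0.0)


def choose_action(s, actions):
    """Epsilon-greedy selection over current Q estimates (assumed policy)."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: q(s, a))


def sarsa_update(s, a, reward, s_next, a_next):
    """Q(s,a) <- Q(s,a) + alpha * [r + gamma * Q(s',a') - Q(s,a)]  -- Eq. (1)."""
    td_target = reward + GAMMA * q(s_next, a_next)
    Q[(s, a)] = q(s, a) + ALPHA * (td_target - q(s, a))


# One hypothetical transition: state 0, action 3, reward -5, next pair (-2, 4).
sarsa_update(0, 3, -5.0, -2, 4)
```

Because the update uses the action actually selected in the next state, a_{t+1}, the estimates track the policy being followed, which is what makes SARSA on-policy.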
3.3. Experiment Setup
As previously mentioned, the objective of the experiment presented in this paper is to have an agent that is capable of recognizing the state of the system and deciding upon a diversion rate at junction J1. If this diversion rate is followed, the system should eventually move to a state of equilibrium where taking either of the two routes will result in the same travel time. For example, if route A (L1) is totally blocked because of an accident, the agent should guide all motorists to take route B (L2).
3.3.1 State, Action, Reward Definition
State representation is quite important for the agent to learn properly. Ideally, using the CTM, each state would be represented by the number of vehicles in each cell. Computationally, it is impossible to use this representation of the states for an RL agent, since it would make the problem's state space very large. In this experiment, the state of the system is represented by the difference in instantaneous travel time between taking the short route through L1 (route A) and the longer route through L2 (route B). Based on some empirical experiments, the state space was discretized to a finite number of states. Table 1 shows how the state was discretized based upon the difference in instantaneous travel time between the longer route, route B, and the shorter route, route A.

Table 1: State discretization. Ranges of the travel-time difference (dif, in minutes) between routes B and A are mapped to ten state codes; the range 0 < dif < 2 corresponds to state code 0.
As can be seen, state 0, for example, refers to the case when both routes are running at free-flow speed. For this case, the difference in travel time between routes B and A is in the range of +2 minutes, since route B is 2.0 miles longer than route A. On the other hand, state -9 refers to the case when route A (the shorter route) is extremely congested (totally closed), while the longer route is doing fine. In our experiments, the instantaneous time is determined from speed sensor readings along the two routes. For actions, ideally the diversion rate is a real number between 0 and 100%, which implies an infinite set of actions. In this experiment, the set of actions was reduced to only six actions. Table 2 lists the six actions used in this experiment.

Table 2: Action Set
Action 0: divert 100% of the flow to L1 and 0% to L2
Action 1: divert 80% of the flow to L1 and 20% to L2
Action 2: divert 60% of the flow to L1 and 40% to L2
Action 3: divert 40% of the flow to L1 and 60% to L2
Action 4: divert 20% of the flow to L1 and 80% to L2
Action 5: divert 0% of the flow to L1 and 100% to L2
For the SARSA algorithm, all the values of Q(s,a) are initialized to zero. As the agent experiments with different actions for the states encountered, the environment responds with a reward, which in our case is equal to the negative of the difference in instantaneous travel time between the two routes. In other words, the goal of the agent is to take the proper action to ensure that the instantaneous difference in travel time between the two routes does not exceed 2 minutes, i.e. to reach the state of equilibrium. In the experiment, the authors simulated running the system for around 90 hours of operation. Every five and a half hours, an accident is introduced on link L1. The accident lasts for half an hour, reducing L1's capacity to half its original value on the assumption that one of the two lanes on L1 is blocked by the accident. The purpose of introducing the accident repeatedly is to make sure that the agent encounters many different states and remains in each state long enough to learn the best action for it.
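The accident schedule described above can be sketched as follows. Only the time constants (a 5.5-hour cycle, a 0.5-hour accident, halved capacity) come from the text; the function name, the assumption that an accident begins at the start of each cycle, and the sampling instants are illustrative.

```python
# Sketch of the ~90-hour training schedule's recurring accident on link L1.
# Time constants come from the text; everything else is an assumption.

HOURS = 90              # total simulated operation
ACCIDENT_PERIOD = 5.5   # hours between accident onsets
ACCIDENT_LENGTH = 0.5   # hours each accident lasts
L1_CAPACITY = 4000.0    # veh/h, from the network description


def l1_capacity_at(t_hours):
    """Return L1 capacity at time t: halved while an accident blocks one lane."""
    time_in_cycle = t_hours % ACCIDENT_PERIOD
    if time_in_cycle < ACCIDENT_LENGTH:
        return L1_CAPACITY / 2.0
    return L1_CAPACITY


# Example: capacity sampled at a few instants of the simulated horizon.
samples = [l1_capacity_at(t) for t in (0.25, 1.0, 5.6, 11.2)]
```

Repeating the disturbance on a fixed cycle guarantees that congested states recur often enough for the Q-estimates of those states to converge.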
4.0 Results and Discussions
The results of the experiment show that the agent managed to learn the right actions for all the states it encountered. For example, for the state of complete congestion on link L1 (state -9), the agent learned that the proper action is to divert 100% of the traffic to the longer route (action 5). On the other hand, for the free-flow state (state 0), the agent learned that the best action is to divert all the traffic to the shorter path (action 0). The agent also learned the correct actions for the intermediate states. For example, when the travel time on the longer route was 10 minutes less than on the shorter route, the agent learned to divert only 20% of the traffic to the shorter route and 80% to the longer one. Once the situation improved and the difference was only 7 minutes instead of 10, the agent adjusted the diversion rate to 40% to the shorter route and 60% to the longer one. Figure 2 shows the convergence to the right action for state -9. The time is represented in seconds × 10. Notice that the system got into state -9 after almost six hours of the simulation, 30 minutes after the accident. The agent chose action 0 (diverting all the traffic to the shorter route) the first time it encountered state -9. This decision is actually the worst decision that can be made in this situation. But as time passed, the agent decided to switch to action 1, which is still not the proper action to take. By the end of the simulation run, the agent converged to the right action, which is diverting all traffic to the long route that exhibits the shorter travel time.
Figure 2: Convergence to the correct action for state -9 (action code vs. time in seconds × 10)
5. Conclusions
The results obtained from this preliminary work are very promising. For our future research, we are planning to use a neural network to augment the state representation of the system. Using a neural network will allow us to deal with a bigger set of states as well as achieve a smoother, continuous representation of the state space. Similarly for actions, a neural network would be much more appropriate than the set of 6 discrete actions used in this experiment. In the future we are also planning to experiment with a more complex transportation network, and to utilize a community of cooperating intelligent agents for traffic management and control.
6. Acknowledgement This research is being funded by grant number CMS-0133386 from the National Science Foundation (NSF). The authors would like to thank NSF for their support.
References
Abdulhai, B. and Kattan, L. (2003) "Reinforcement learning: Introduction to theory and potential for transport applications." Canadian Journal of Civil Engineering, Volume 30, Number 6, 981-991.
Abdulhai, B., Pringle, P., and Karakoulas, G.J. (2003) "Reinforcement Learning for True Adaptive Traffic Signal Control." ASCE Journal of Transportation Engineering, Vol. 129(3), pp. 278-284.
Abdulhai, B. and Pringle, P. (2003). "Autonomous Multiagent Reinforcement Learning - 5 GC Urban Traffic Control." A paper presented at the 2003 Annual Transportation Research Board Meeting, Washington, D.C.
Bertsekas, Dimitri P. (2000). Dynamic Programming and Optimal Control. Athena Scientific, Massachusetts, USA, 502 pp.
Choy, M.C., Cheu, R.L., Srinivasan, D., and Logi, F. (2003). "Real-Time Coordinated Signal Control Using Agents with Online Reinforcement Learning." A paper presented at the 2003 Annual Transportation Research Board Meeting, Washington, D.C.
Daganzo, C.F. (1994). "The Cell Transmission Model: A Dynamic Representation of Highway Traffic Consistent with the Hydrodynamic Theory." Transportation Research, Vol. 28B, No. 4, 269-287.
Daganzo, C.F. (1995). "The Cell Transmission Model, Part II: Network Traffic." Transportation Research, Vol. 29B, No. 2, 79-93.
Kaelbling, Leslie P., Littman, Michael L., and Moore, Andrew W. (1996) "Reinforcement Learning: A Survey." Journal of Artificial Intelligence Research, (4) 237-285.
Kretchmar, R. Matthew (2000). A Synthesis of Reinforcement Learning and Robust Control Theory. Ph.D. Diss., Computer Science, Colorado State University, Fort Collins, Colorado, 139 pp.
Lighthill, M.J. and Whitham, G.B. (1955) "On Kinematic Waves. I: Flood movement in long rivers; II: A theory of traffic flow on long crowded roads." Proc. Royal Society A, 229, pp. 281-345.
Russell, S. and Norvig, P. (2003) Artificial Intelligence: A Modern Approach. Second Edition, Prentice Hall.
Sutton, Richard S. and Barto, Andrew G. (2000). Reinforcement Learning. MIT Press, Massachusetts, USA, 322 pp.
Wang, Y., Papageorgiou, M. and Messmer, A. (2003) "A Predictive Feedback Routing Control Strategy for Freeway Network Traffic." Transportation Research Record 1312, TRB, National Research Council, Washington, D.C., pp. 21-43.
Chapter 7
Distributed Resource Exploitation for Autonomous Mobile Sensor Agents in Dynamic Environments
Sarjoun Doumit and Ali Minai
Complex Adaptive Systems Laboratory (C.A.S.L.), University of Cincinnati, Ohio, U.S.A.
sdoumit@ececs.uc.edu, [email protected]
This paper studies the distributed resource exploitation problem (DREP), where many resources are distributed across an unknown environment and several agents move around in it with the goal of exploiting/visiting the resources. A resource may be anything that can be harvested/sensed/acted upon by an agent when the agent visits that resource's physical location. A sensory agent (SA) is a mobile and autonomous sensory entity that has the capability of sensing a resource's attributes and therefore determining the exploitatory gain factor, or profitability, when this resource is visited. This type of problem can be seen as a combination of two well-known problems: the Dynamic Traveling Salesman Problem (DTSP) [8] and the Vehicle Routing Problem (VRP) [1]. But the DREP differs significantly from these two. In the DTSP we have a single agent that needs to visit many fixed cities with costs associated with their pairwise links, so it is an optimization of paths on a static graph with time-varying costs. In the VRP, on the other hand, we have a number of vehicles with uniform capacity, a common depot, and several stationary customers scattered around an environment, and the goal is to find the set of routes with overall minimum route cost to service all the customers. In our problem, we have multiple SAs deployed in an unknown environment with multiple dynamic resources, each with a dynamically varying value. The goal of the SAs is to adapt their paths collaboratively to the dynamics of the resources in order
to maximize the general profitability of the system. Applications of this model range from exploratory missions such as those of rovers on planets [7, 4, 5] to surveillance, monitoring, resupply and survival in hazardous environments.
1 Introduction
The special case of the distributed resource exploitation problem (DREP) we consider can be seen as a scenario where a group of agents, deployed in an (unexplored) area to fulfill a certain task (including exploring the area), partition the discovered targets or "resources" into different classes in order to come up with the best path planning scheme for their application. In the existing related literature, this challenge falls into the category of multiobjective optimization, i.e. the many parameters the agents have to consider in order to design the optimal paths that will allow them to exploit the largest amount of resources while considering the constraints of every problem type. The constraints we consider include time, energy of the agent, communication and cost of mobility, but others could be considered. In brief, the task or problem becomes finding the vector of decision variables X_var that satisfies all the system constraints (agents, resources and environment) and still optimizes a vector function VF which represents the objective function F. The functions that are members of VF are usually conflicting due to opposing goals, such as minimizing energy expenditures (in order to increase the longevity of the agent) and continuing to visit resources (which costs energy and decreases longevity). An important tradeoff is exploration vs. exploitation: should the agent explore unexplored parts of the terrain in the (possibly vain) hope of discovering new resources, or visit already discovered resources? In the next section we discuss the system description and related constraints in more detail.
2 System Description
The system consists of the SAs and an "unexplored" terrain which includes the hotSpots. The SA is a relatively small sensing unit equipped with a battery for energy and has mobility and wireless communication capabilities. The SA's mobility resembles the locomotion of a fluid particle in a field of potentials (where the potentials are a result of hotSpots and unexplored terrain areas). The agent's "motor system" generates a constant moving force that is guided by a "directional steering" capable of heading in any direction (0° to 360°) in 3D space. The agents incur mobility and communication energy costs. Since the terrain we are addressing is a 3D non-homogeneous environment, the mobility cost differs when traversing different types of terrain, going up an incline, or a combination of these elements. Each SA records the energy cost of going between specific points on the terrain, so it can create its own virtual asymmetric weighted-edge graph, which is its view of the terrain. The SA earns different
"rewards" every time it "discovers new areas", "discovers a new hotSpot" or "visits a previously discovered hotSpot". Every agent is equipped with an algorithmic strategy module that allows it to plan its next step, and even much further into the future, based on the information it constantly collects from the terrain (or from other SAs). The decision mechanism of the SA, which we will not describe in detail, is based on a subsumption architecture. Every agent has a 2-stage path-planning module with two functions:
• Long-term planning: draws up strategies and a list of future actions to be taken by the agent and the possibility/probability of certain events that could occur. Decides the value of the short-term planning's time-steps (or Horizon value) and informs the short-term planning of the current destination point to go to.
• Short-term planning: uses its specific Horizon time value to calculate the specific steps needed to go to a specified target and regularly updates the bearings and angle of the agent to compensate for any path irregularities that might arise due to obstacles and different terrain textures. Note that information updates from neighboring agents cannot prompt it to alter its course until it has finished.
The potential fields represent the discovered hotSpots' status and the unexplored areas of the field, or the areas that are explored but have not been visited in some time. The hotSpots are dynamic in nature and decay away after some time, each at a different rate.
2.1 Types of Solution Approaches
We classify the approaches for distributed management and control of the agents into three classes: aware agents, semi-aware agents and unaware agents. Each of these classes defines how informed a single agent is about the rest of the agents in the network. We describe these classes below, and also show the results of the same experiments run with these three classes of agents.
• Aware agents: In the aware case, every agent knows the state of every other agent on the field. Every single agent can learn from and take advantage of all the decisions, discoveries and feedback of all other agents on the field. This would represent the perfect scenario as long as the cost of information (whether acquired by wireless or some other medium) is negligible. The decisions taken by these agents reflect the behavior of the entire system and follow the embedded algorithm as a monolithic system. It is, therefore, very predictable.
• Semi-aware agents: In the semi-aware case, the actual distance separating the nodes and limitations on information transfer are taken into consideration. Therefore, only agents within physical proximity of each other can communicate, and only part of their knowledge is transmitted and shared.
The cost of transmission is factored in when determining the efficiency of the total system. Due to the limited information sharing, the collective behavior of the agents is not very predictable and could result in surprising solutions and many local semi-optimal solutions.
• Unaware agents: The last class of agents represents true classic complex systems, similar to very primitive cellular organisms, where every agent interacts with its immediate environment and chooses its actions independently of other agents in the network. This class embodies true emergence, where the system is completely unpredictable and no optimal solution is guaranteed.
All the known hotSpots exert specific "gravitational" and "anti-gravitational" forces on the agents in their vicinity. When an agent discovers a hotSpot, it assigns it a specific "profitability" value p(t) which corresponds to the intensity with which its sensed phenomenon was recorded. A unique timer function f(t) is also associated with every discovered hotSpot to determine the value of the "attractiveness/repulsion" of the hotSpot, which is proportional to the time value and other variables. Thus, an SA that has recently been "attracted" to an unvisited hotSpot would immediately afterwards be "repulsed" by that hotSpot and, based upon the different rewards it has accumulated and on its specific exploration/exploitation probability ratio, it would either gravitate towards an unexplored section of the terrain or towards another hotSpot. When an exchange of information occurs amongst the SAs, the network's values for every hotSpot's attractiveness, etc. become the deciding factor for assigning SAs to targets.
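One plausible shape for the per-hotSpot profitability/timer behavior can be sketched as below. The paper does not specify the form of f(t); the exponential decay, the repulsion window, and all parameter values here are purely illustrative assumptions.

```python
import math

# Hypothetical attractiveness model for a discovered hotSpot. The paper leaves
# f(t) unspecified; exponential decay and the repulsion window are assumptions.


def attractiveness(p0, decay_rate, t_since_visit, repulse_window=5.0):
    """Profitability decays over time; a just-visited hotSpot briefly repels.
    p0: recorded profitability p(t) at discovery; decay_rate: per-hotSpot rate.
    """
    value = p0 * math.exp(-decay_rate * t_since_visit)
    if t_since_visit < repulse_window:
        return -value          # recently visited: repulsive force
    return value               # otherwise: attractive, fading with time


recent = attractiveness(10.0, 0.1, 1.0)    # negative: repulsion just after a visit
later = attractiveness(10.0, 0.1, 20.0)    # positive but decayed attraction
```

The sign flip mimics the attract-then-repulse behavior described in the text, pushing an agent onward after a visit; the per-hotSpot decay rate reflects that each hotSpot decays at a different rate.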
The proximity of the SA to other hotSpots and unexplored parts of the terrain, in addition to the cost of the virtual edges, plays a role in an "optimized preferential matching" algorithm where the agent with the lowest rewards for hotSpot visitations (together with the previously mentioned factors) gets priority for visiting the most "needy" hotSpot (the one with the fewest visits).
3 Formulation
The challenge discussed in this paper is to allow multiple fully autonomous agents to move around in an unknown terrain in order to explore and optimize their objective function F. The overall objective is to coordinate paths amongst the agents in order to re-visit the discovered hotSpots, capturing the nonlinear constraints of: 1) the intensity of every hotSpot with respect to other hotSpots; 2) the energy remaining in the agent(s); and 3) the exploration-exploitation variable factor that determines the size of the planning cycle's step size, which, in turn, defines the exploratory pattern sizes.
3.1 Mathematical formulation of the objective function
We now describe the different parts that are used in delineating the mathematical formulation of the objective function.
System Requirements and Constraints
• The agent's sensing radius is restricted to 2 space units around it.
• Different terrain textures and elevations incur different mobility costs.
• hotSpots should be visited proportionally to their average intensity value.
• SAs must balance between their exploration and exploitation ratios.
• Q_t = summation of all the system requirements and constraints.

Decision Variables
• N_hotSpot = number of discovered hotSpots.
• N_Terrain = number of discovered unit areas on the terrain.
• N_SA = number of active (alive) agents.
• K_hotSpot = attractivity ratio of discovered hotSpots.
• L_SA = exploration/exploitation ratio of the agents.
• R_SA = rewards value of the agents.
• c is the weight of a given virtual edge.
• p becomes 1 if we want to use the edge and 0 if we do not.

Objective Function

Max. Σ_(N_hotSpot) Σ_(N_Terrain) Σ_(N_SA) (K_hotSpot + L_SA + R_SA) × Q_t + Min. Σ (p × c) × Q_t    (1.1)

Note that if all the terrain becomes explored, the objective function becomes

Min. Σ (p × c) × Q_t.    (1.2)

3.2 State Vector Representation
The agents are described by a state space model (SSM) [6, 3, 2] which is a discrete time-invariant system. The SSM best models our agents simply because the evolution (values assumed) of the agent's n real state variables, denoted by vector x, depends on the evolution of the terrain's variables and the other agents' state variables, denoted by vector u. In system theory, reference is usually made to a black-box representation: u → Agent → y. The agent's state space model is described by the equation system:

x(t + 1) = A x(t) + B u(t),    (1.3)
y(t) = C x(t) + D u(t)    (1.4)
Equation (1.3) represents the agent's state vector and how it changes according to the input, where x(t) is the state vector and u(t) the action vector. Equation (1.4) determines the output (or observations) as a function of the state of the system, where y(t) represents the output vector. This system can be seen as a linear mapping of the space of m-dimensional sequences into a space of n-dimensional sequences.
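Equations (1.3)-(1.4) can be exercised on a toy instance. The matrices below are illustrative placeholders, not values from the paper, and reading the feed-through term of (1.4) as D u(t) is an assumption.

```python
# Toy discrete state space model: x(t+1) = A x(t) + B u(t), y(t) = C x(t) + D u(t).
# Matrices are illustrative placeholders, not values from the paper.


def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]


def ssm_step(A, B, C, D, x, u):
    """One step of the linear SSM: returns (next state, current output)."""
    y = vec_add(mat_vec(C, x), mat_vec(D, u))
    x_next = vec_add(mat_vec(A, x), mat_vec(B, u))
    return x_next, y


A = [[0.5, 0.0], [0.0, 0.5]]   # stable diagonal dynamics (assumed)
B = [[1.0], [0.0]]
C = [[1.0, 1.0]]
D = [[0.0]]

x = [1.0, 1.0]
x, y = ssm_step(A, B, C, D, x, [0.0])
```

With zero input the state simply contracts under A, while the output y aggregates the state through C, which is the linear sequence-to-sequence mapping described above.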
4 Simulation Results and Conclusion
We conducted many simulation runs of our system on a (100 × 100 × 5)-unit 3D terrain. We tested many types of terrain elevations and textures and varied the number of hotSpots and SAs. We used 3 types of SAs that were given identical physical values (energy, sensing radius, ...). The first type of SAs used a completely random (and selfish) algorithm, the second was semi-aware (with limited communication radius) and the third was the fully aware kind. Several conclusions can be drawn from the results we collected. First, on the coverage (exploration) issue, the third type outperformed all the other types, which was expected, and the results can be seen in Figure 1. The random SAs outperformed the other 2 types at the beginning, but soon flattened out. On the other hand, we see a repetitive "pattern" in the semi-aware case, while we only see a similar pattern half-way through the simulations for the fully aware case. Finally, we see that the fully aware agents were able to discover about 50 percent of the terrain. In the next figure, Figure 2, we study one facet of exploitation, which
Figure 1: Terrain coverage (coverage percentage vs. time; legend: random agents, semi-aware agents, aware agents)
is the total number of visits all hotSpots received. The results show that the semi-aware SAs outperformed the random and aware SAs. But this does not
mean that the constraints of the system on the hotSpots and on the SAs were optimally met, as will be revealed in the following two figures, 3 and 4. Although it does show a great improvement over the random case, the localized agents have a tendency of "over-doing" the exploitation, as exploitation of the "same" hotSpots occurs frequently due to the restrictions on communication. In Figure 3 we see a big discrepancy between the most visited hotSpot and the least visited one for the semi-aware agents, something that is less acute for the fully aware SAs. This fact shows that the exploration/exploitation ratio was not as balanced in the semi-aware case as it was in the fully aware case. This is shown once again in Figure 4, where only the semi-aware and aware results from Figure 3 are ordered and analyzed.
Figure 2: Total number of hotSpot visitations for 5 agents and 20 hotSpots

Figure 3: Individual hotSpot visitation numbers for 5 agents and 20 hotSpots
Bibliography
[1] BIANCHI, Leonora, "Notes on dynamic vehicle routing - the state of the art", Tech. Rep. no. 1, IDSIA, Switzerland, (2000).
[2] DURBIN, J., and S.J. KOOPMAN, Time Series Analysis by State Space Methods, No. 24, Oxford University Press (2001).
[3] (ESO), European Southern Observatory, "Mathematical background of the linear state space model", Astronomy & Astrophysics, An International Weekly Journal, 12 (1997).
[4] JPL, "Jupiter's Europa harbors possible warm ice or liquid water", Tech. Rep. no. 1, JPL, (2001).
[5] JPL, "Mars exploration rover mission", Tech. Rep. no. 1, JPL, (2003).
[6] MOULINES, Eric, "Linear state space models", Tech. Rep. no. 1, Ecole Nationale Superieure des Telecommunications, Paris, (2002).
Figure 4: Exploration/exploitation analysis for different agent types: average and linear regression for the semi-aware and aware SA results depicted in Figure 3 (number of visits per hotSpot; legend: semi-aware agents, aware agents, semi-aware avg, aware avg)

[7] NASA, "Listening for an ocean on Europa", Tech. Rep. no. 1, NASA, (2001).
[8] SIMONETTI, Neil, "Applications of a dynamic programming approach to the traveling salesman problem", Tech. Rep. no. 1, Carnegie Mellon, (1998).
Chapter 8 Interconnecting Robotic Subsystems in a Network Javier A. Alcazar and Ephrahim Garcia Laboratory for Intelligent Machine Systems Mechanical and Aerospace Engineering Cornell University [email protected] We analyze how the interconnection of subsystems affects the network dynamics of the overall system's connective stability. Closed form equations for the magnitude of the interaction between subsystems are obtained to aid in the design of nearest neighbor interconnected subsystems and fully interconnected subsystems.
1. Introduction
Multi-robot algorithms have been of considerable importance in the control and robotics communities due to their applications in search and rescue missions, air traffic control, automatic highways and military operations such as reconnaissance, surveillance and target acquisition, to mention a few. The analysis of connective stability for different subsystem interconnections has not been examined, and so this paper is intended to fill that gap. In previous related multi-robot work, formations have been accomplished in several different ways. Behavior-based approaches have been used for robot formations [Balch et al. 1998], [Arkin 1992] and [Kube et al. 1993], but do not include a formal development of the system controls from a stability point of view. In other studies, algorithms for formations have been proposed such as in [Fredslund et al. 2001] and [Fredslund et al. 2002], where each robot keeps a single friend at a desired angle and the goal becomes to center the friend in the robot sensor's field of view. More recent work has been taking a system control perspective. Feddema [Feddema et al. 2002] uses decentralized control theory to control the motion of multiple robotic vehicles in formation along a line. Desai [Desai et al. 1998], [Desai et al. 2001] modeled a formation of nonholonomic mobile robots and developed a framework for transitioning from one formation to another. Chen and Luh [Chen et al. 1994] used distributed control in large groups of robots to cooperatively move in various geometric formations.
In this paper, the preliminaries for decentralized control of large-scale systems [Siljak 1990, 1991] are introduced in part 2; the connective stability analysis of multiple subsystems using vector Lyapunov functions is addressed in part 3. Part 4 defines three types of network interconnections. Part 5 describes a possible linear state space representation for holonomic robot dynamics, which then proceeds to a complete definition of the parameters for two subsystems of interest: a diagonal subsystem and the robot subsystem, given in part 6. Part 8 shows the computed results for connective stability, along with some closed form equations that are useful when designing nearest neighbor and fully interconnected subsystems.
2. Preliminaries
Let us consider a dynamic system S described by the differential equation,

S: \dot{x} = f(t, x, u)
   y = h(t, x)    (1)

where x(t) ∈ R^n is the state of S at time t ∈ T, u(t) ∈ R^p are the inputs and y(t) ∈ R^q are the outputs. Assume the function f : T × R^n × R^p → R^n (which describes the dynamics of S) to be defined and continuous on the domain T × R^n × R^p. Assume the function h : T × R^n → R^q (which describes the observations of S) to be defined and continuous on the domain T × R^n, so that solutions of (1) exist for all initial conditions. The system S described by (1) can be decomposed into N interconnected subsystems S_i described by the equations,
S_i: \dot{x}_i = f_i(t, x_i, u_i) + \tilde{f}_i(t, x, u)
     y_i = h_i(t, x_i) + \tilde{h}_i(t, x)    (2)

In this new description of the system S, the functions f_i : T × R^{n_i} × R^{p_i} → R^{n_i} represent the dynamics of each isolated subsystem S_i, and \tilde{f}_i : T × R^n × R^p → R^{n_i} describes the dynamic interaction of S_i with the rest of the system S. The function h_i : T × R^{n_i} → R^{q_i} represents the observations at S_i from its local state variables, and the function \tilde{h}_i : T × R^n → R^{q_i} represents the observations at S_i from the rest of the system S. Feedback may be added to the system S as follows,

u_i = k_i(t, y_i) + \tilde{k}_i(t, y),    i ∈ {1, ..., N},    (3)

where k_i : T × R^{q_i} → R^{p_i} represents the control law applied at S_i from its local state variables, and the function \tilde{k}_i : T × R^q → R^{p_i} represents the observations at S_i from the rest of the system S. For linear time-invariant lumped systems, (1) can be described by a set of equations of the form,

S: \dot{x} = A x + B u
   y = C x    (4)

where A, B and C are, respectively, n × n, n × p, and q × n constant matrices.
521 The system S in (4) has P inputs,
q
outputs and
n
state variables . In a similar way
as in (2), the system S described by (4) can be decomposed into N interconnected subsystems S; described by the equations,
i
s.
j
= Aixi +
Yi = c.s, +
N
N
j" 1 N
j"1
L eijAijx j + BJii + L b;jB yii j,
i E {I,..., N},
L CijCijXj'
0)
o
B, and C, are, respectively, n, x n., ni x Pi' and qi x n., constant matrices. Each subsystem S, in (5) has Pi inputs, qi outputs and ni state variables
where Aj
,
N
such that P
= L Pi'
q=
i=1
N
L qi
and n =
N
L », . The matrices
Aij' Bij and Cij
i=1
i= 1
represent the interaction among the subsystems . The matrices Aj
,
B, and C, represent
the "self-interaction" within the same subsystem S, . The elements "iii} are defined as _ {I, eij = 0, The matrix E
= (eij)
1990]. The elements b;j
=
{I~,
o, c,
s, Sj
s,
(6)
can act on cannot act on
S;.
is called the fundamental interconnection matrix [Siljak
by and
Gi} are defined respectively as,
can act on Xi' and cannot act on Xi'
i, =
{Id, xx j j
can be observed by Yi' (7) cannot be observed by Yi'
A decentralized control can be implemented on the partitioned dynamic system using local control based on local observations. Formally, for each subsystem S_i a linear local control of the form,

    u_i = -K_i^T x_i + r_i,    (8)

is applied based on local observations, i.e. each element of the matrices B_ij = 0 and C_ij = 0 for all i, j ∈ {1, ..., N}. r_i is the reference signal or command applied to subsystem S_i. Substitution of the decentralized control given by (8) into (5) reconstructs the partitioned closed-loop dynamics as,

    S: ẋ_i = (A_CL)_i x_i + Σ_{j=1}^{N} e_ij A_ij x_j + (B_CL)_i r_i,    i ∈ {1, ..., N},
       y_i = (C_CL)_i x_i,    (9)

where (A_CL)_i = A_i - B_i K_i^T, (B_CL)_i = B_i and (C_CL)_i = C_i are the closed-loop matrices for S_i.
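The closed-loop construction in (9) can be checked numerically. The sketch below uses hypothetical 2 × 2 subsystem matrices (A_i, B_i, C_i and the gain K_i are illustrative values, not taken from the paper):

```python
import numpy as np

# Hypothetical subsystem data (A_i, B_i, C_i) for illustration only.
A_i = np.array([[0.0, 1.0],
                [-1.0, -0.5]])
B_i = np.eye(2)
C_i = np.eye(2)
K_i = np.array([[2.0, 0.0],
                [0.0, 2.0]])  # local feedback gain K_i

# Closed-loop matrices of eq. (9):
# (A_CL)_i = A_i - B_i K_i^T, (B_CL)_i = B_i, (C_CL)_i = C_i.
A_cl = A_i - B_i @ K_i.T
B_cl, C_cl = B_i, C_i

print(np.linalg.eigvals(A_cl).real)  # all negative: S_i is locally stable
```

The local gain only shapes (A_CL)_i; the interconnection terms e_ij A_ij x_j in (9) are untouched, which is why a separate connective stability test is still needed.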
3. Connective Stability
Analysis of connective stability is based upon the concept of vector Lyapunov functions. It should be mentioned that there exists no general systematic procedure for choosing vector Lyapunov functions. We outline a method for stability analysis, which first assumes that the complete system S has been decomposed into N interconnected subsystems S_i. To have stability of each subsystem S_i, the scalar function

    v_i(x_i) = (x_i^T H_i x_i)^(1/2)

is proposed as a candidate Lyapunov function for each S_i, and we require that for any choice of the positive definite matrices G_i, there exist positive definite matrices H_i as solutions of the Lyapunov matrix equations

    A_i^T H_i + H_i A_i = -G_i.

Now the vector Lyapunov function is defined as v = (v_1, v_2, ..., v_N)^T, which must satisfy the inequality v̇ ≤ -Wv to guarantee connective stability of the overall system S. The elements of the "aggregation matrix" W = (w_ij) are defined by,

    w_ij = { λ_m(G_i) / (2 λ_M(H_i)) - e_ii λ_M^(1/2)(A_ii^T A_ii),    i = j,
           { -e_ij λ_M^(1/2)(A_ij^T A_ij),                             i ≠ j,    (10)

where λ_m(·) and λ_M(·) are the minimum and maximum eigenvalues of the corresponding matrices [Siljak 1991]. The matrix W must have all leading principal minors positive to guarantee stability of the overall system S. This method guarantees that the system S will be stable even if an interconnection becomes decoupled, i.e. e_ij = 0, or if the interconnection parameters are perturbed, i.e. 0 < e_ij < 1 [Siljak 1990].
4. Network Interconnection Types
The following interconnections of independent subsystems are considered (Fig. 1): 1) Nearest neighbor interconnection (NN): each subsystem interacts only with its nearest neighbors. 2) Nearest neighbor interconnection with a central unit (NC): each subsystem interacts with its nearest neighbors and with a central unit (subsystem 1). 3) Full interconnection (FI): every subsystem interacts with every other subsystem.
Figure 1. Interconnections of N = 4 subsystems (robots): (a) nearest neighbor interconnection, (b) nearest neighbor interconnection with a central unit (1), and (c) full interconnection. The dynamics of the overall system S for the case of nearest neighbor interconnections with N subsystems can be described by (11),
    ẋ_1 = (A_1 - B_1 K_1^T) x_1 + B_1 Q (C_2 x_2 - C_1 x_1) + B_1 r_1,
    ẋ_2 = (A_2 - B_2 K_2^T) x_2 + B_2 Q (C_1 x_1 - C_2 x_2) + B_2 Q (C_3 x_3 - C_2 x_2) + B_2 r_2,
    ẋ_3 = (A_3 - B_3 K_3^T) x_3 + B_3 Q (C_2 x_2 - C_3 x_3) + B_3 Q (C_4 x_4 - C_3 x_3) + B_3 r_3,
    ...    (11)

where Q is the interaction gain matrix between subsystems. Note that the interaction between subsystems is given by: A_11 = -B_1 Q C_1, A_12 = B_1 Q C_2, etc.
The fundamental interconnection matrix that describes the interactions of (11) is given by the N × N constant matrix,

    E = [ 1 1 0 0
          1 1 1 0
          0 1 1 1
          0 0 1 1 ].    (12)
In a similar way, the fundamental interconnection matrices for the nearest neighbor interconnection with subsystem 1 as the central unit, and for the full interconnection, can be computed respectively as the following N × N constant matrices,

    E = [ 1 1 1 1              E = [ 1 1 1 1
          1 1 1 0                    1 1 1 1
          1 1 1 1      and           1 1 1 1
          1 0 1 1 ]                  1 1 1 1 ].    (13)
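The three fundamental interconnection matrices can be generated for any N. A sketch (the function names are shorthand for the NN, NC and FI topologies, chosen here for illustration):

```python
import numpy as np

def e_nn(n):
    """Nearest neighbor: each S_i acts on itself and its neighbors."""
    E = np.eye(n, dtype=int)
    E += np.eye(n, k=1, dtype=int) + np.eye(n, k=-1, dtype=int)
    return E

def e_nc(n):
    """Nearest neighbor plus a central unit (subsystem 1)."""
    E = e_nn(n)
    E[0, :] = 1  # the central unit acts on, and is acted on by, everyone
    E[:, 0] = 1
    return E

def e_fi(n):
    """Full interconnection: every subsystem acts on every other."""
    return np.ones((n, n), dtype=int)

print(e_nn(4))  # matches eq. (12) for N = 4
```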
5. Robot Dynamics
In a group of mobile robots a natural way to select the subsystems is to treat each robot as a subsystem. Considering four-wheel omnidirectional drive robots, the equations of motion for each robot, each a holonomic subsystem, can be based on simple Newtonian point physics [Cliff 1996], together with a model that captures the torque produced by a direct current (DC) motor, given by [Kalmar 2004],

    τ = αV - μω,    (14)

where V is the voltage applied to the DC motor, ω is the angular velocity of the motor shaft and (α, μ) are constant properties of the DC motor given by α = k_t / R_a and μ = k_f k_t / R_a, where k_t is the motor torque constant, k_f is the back-emf constant of the motor and R_a is the armature resistance. For a four-wheeled omnidirectional drive robot (Fig. 2), the equation of motion that describes its translation x_r is given by,

    m (d²x_r/dt²) + b_eff (dx_r/dt) = α_R V_R + α_L V_L,    (15)

where m is the mass of the robot, b_eff is the effective viscous damping coefficient, V_R is the effective voltage applied to the motors to make the robot rotate counterclockwise, and in the same way V_L is the effective voltage applied to make the robot rotate clockwise. α_R and α_L are the effective counterclockwise and clockwise motor constants, respectively. It can be shown that b_eff = b + (μ_R + μ_L)/r_ω², where r_ω is the radius of the wheel, and b is the friction coefficient of the vehicle in the media. The equation of motion that describes its changes in orientation θ is given by,
Figure 2. Four-wheeled omnidirectional drive robot; subsystem dynamics for each robot.

    J (d²θ/dt²) + b_θeff (dθ/dt) = (L/r_ω) α_R V_R - (L/r_ω) α_L V_L,    (16)

where J is the moment of inertia, b_θeff is the effective angular viscous damping, and L is the distance from the wheel to the center of the robot (Fig. 2). It can be shown that b_θeff = b_θ + L²(μ_R + μ_L)/r_ω², where b_θ is the angular friction coefficient. Defining the state vector as
x = (x_1, x_2) = (ẋ_r, θ̇), (15) and (16) form a two-input two-output system that can be written in state space as,

    S_R: [ẋ_1]   [ -b_eff/m       0      ] [x_1]   [ α_R/m            α_L/m          ] [V_R]
         [ẋ_2] = [    0      -b_θeff/J   ] [x_2] + [ L α_R/(J r_ω)   -L α_L/(J r_ω)  ] [V_L],

         [y_1]   [ 1  0 ] [x_1]
         [y_2] = [ 0  1 ] [x_2].    (17)
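Equation (17) can be assembled directly from the physical parameters. A sketch, assuming the state ordering (ẋ_r, θ̇) and input ordering (V_R, V_L) used above:

```python
import numpy as np

def robot_state_space(m, J, b_eff, b_eff_theta, alpha_R, alpha_L, L, r_w):
    """State-space matrices of eq. (17): state x = (dx_r/dt, dtheta/dt),
    inputs (V_R, V_L). A sketch of the model, not validated hardware code."""
    A = np.array([[-b_eff / m, 0.0],
                  [0.0, -b_eff_theta / J]])
    B = np.array([[alpha_R / m,              alpha_L / m],
                  [L * alpha_R / (J * r_w), -L * alpha_L / (J * r_w)]])
    C = np.eye(2)
    return A, B, C
```

With symmetric motors (α_L = α_R, μ_L = μ_R, as assumed in Section 6) equal voltages produce pure translation and opposite voltages produce pure rotation, which is visible in the sign pattern of B.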
6. Subsystems
The following two subsystems are of particular interest.

Diagonal subsystem S_D
It is assumed that every single subsystem S_i is represented by an n_i × n_i state space with n_i inputs and n_i outputs. Stability of each subsystem is accomplished by using pole placement such that A_i - B_i K_i^T = -β I_i, where β > 0 and I_i is the n_i × n_i identity matrix. Assume each subsystem to be fully controllable and observable by setting B_i = I_i and C_i = I_i for all i ∈ {1, ..., N}. The interaction gain matrix between subsystems is given by Q = βγ I_i, where γ > 0. For the nearest neighbor interconnection, the dynamics of the overall system S for N subsystems described by (11) would be given as,

    S_D(NN): ẋ_1 = -β(1 + γ) x_1 + βγ x_2 + r_1,
             ẋ_2 = -β(1 + 2γ) x_2 + βγ (x_1 + x_3) + r_2,
             ...
             ẋ_{N-1} = -β(1 + 2γ) x_{N-1} + βγ (x_{N-2} + x_N) + r_{N-1},
             ẋ_N = -β(1 + γ) x_N + βγ x_{N-1} + r_N.    (18)
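The closed-loop system matrix of S_D(NN) in (18) is tridiagonal and can be built directly (shown here for scalar subsystems, n_i = 1):

```python
import numpy as np

def a_diag_nn(N, beta, gamma):
    """Closed-loop A matrix of S_D(NN) in eq. (18) for scalar subsystems."""
    A = np.diag([-beta * (1 + 2 * gamma)] * N).astype(float)
    # End subsystems have a single neighbor.
    A[0, 0] = A[N - 1, N - 1] = -beta * (1 + gamma)
    # Neighbor coupling beta*gamma on the off-diagonals.
    A += beta * gamma * (np.eye(N, k=1) + np.eye(N, k=-1))
    return A

A = a_diag_nn(6, beta=1.0, gamma=0.2)
print(np.all(np.linalg.eigvals(A).real < 0))  # True
```

By Gershgorin's theorem every eigenvalue of this matrix lies in the left half-plane for any γ > 0; the stricter bounds of Section 7 come from the connective stability requirement, which must also tolerate perturbed or decoupled interconnections.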
Robotic subsystem S_R
Each subsystem S_i has the state space representation given by (17). The dynamics of the overall system S, for the nearest neighbor interconnection with N subsystems, is described by (11). The following parameters have been used in this robotic model: b = 1 (N·s/m), b_θ = 1 (N·m·s/rad), k_f = 0.055 (V·s/rad), R_a = 0.2 (Ω), k_t = 2.5283 × 10⁻³ (N·m/A), m = 1 (kg), r_ω = 0.0325 (m), L = 0.055 (m), J = 1 (kg·m²), interaction gain matrix Q = [γ 0; 0 -γ], full state feedback K = [1 0; 0 1], α_L = α_R, and μ_L = μ_R.
7. Results
Computation of connective stability is done for the robotic subsystem and the diagonal subsystem using β = 1. The "aggregation matrix" defined by (10) is used to set an upper bound on the magnitude of interactions between subsystems. Tables I and II show the computed maximum magnitude, γ, which guarantees connective stability for the selected subsystem interconnections described in Section 4. Define γ_2 as the upper bound on the magnitude of the interactions between 2 subsystems (e.g. γ_2 = 1/2 in Table I), and define γ_N as the upper bound on the magnitude of the interactions for N subsystems (e.g. γ < γ_6 for 6 subsystems). As the number of subsystems is increased, γ_N approaches zero for the NC and FI interconnections. Only the nearest neighbor interconnection reaches a nonzero limit. The limit can be shown to be,

    lim_{N→∞} γ_N = γ_2 / 2.    (19)

Table I. Connective stability bounds for the diagonal subsystem S_D.

    N      NN          NC             FI
    2      γ < 1/2     γ < 1/2        γ < 1/2
    3                  γ < 1/4        γ < 1/4
    4                  γ < 0.1909     γ < 1/6
    5                  γ < 0.1632     γ < 1/8
    6                  γ < 0.1437     γ < 1/10
    7                  γ < 0.1280     γ < 1/12
    8                  γ < 0.1150     γ < 1/14
    9                  γ < 0.1041     γ < 1/16
    10                 γ < 0.0949     γ < 1/18
    N→∞    γ < 1/4     NCFK, γ → 0    γ < 1/(2(N-1)), γ → 0

Table II. Connective stability bounds for the robotic subsystem S_R.

    N      NN             NC             FI
    2      γ < 16525      γ < 16525      γ < 16525
    3      γ < 11017      γ < 8262       γ < 8262
    4      γ < 9681       γ < 6312       γ < 5508
    5      γ < 9135       γ < 5396       γ < 4131
    6      γ < 8855       γ < 4750       γ < 3305
    7      γ < 8693       γ < 4241       γ < 2754
    8      γ < 8589       γ < 3802       γ < 2360
    9      γ < 8519       γ < 3442       γ < 2065
    10     γ < 8470       γ < 3139       γ < 1836
    N→∞    γ < 16525/2    NCFK, γ → 0    γ < 16525/(N-1), γ → 0

    NCFK = no closed form known.

Equation (19) can be used when designing nearest neighbor interconnected subsystems to guarantee connective stability. Another equation can be found for fully interconnected subsystems, by induction:

    γ_N = γ_2 / (N - 1)    (20)

gives the upper bound for the interactions when fully connecting N subsystems.
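Equations (19) and (20) give quick design-time bounds. A sketch (the NC case and finite-N NN case simply raise, since no closed form is known for them):

```python
def gamma_bound(N, gamma2, topology):
    """Upper interaction bounds of eqs. (19)-(20); gamma2 is the
    two-subsystem bound (1/2 for S_D, 16525 for S_R)."""
    if topology == "FI":                        # eq. (20), fully interconnected
        return gamma2 / (N - 1)
    if topology == "NN" and N == float("inf"):  # eq. (19), limit only
        return gamma2 / 2
    raise ValueError("no closed form known (NCFK)")

print(gamma_bound(4, 16525, "FI"))  # about 5508, the Table II FI bound for N = 4
```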
8. Conclusion
Some design equations were presented that can assist in the design of nearest neighbor interconnected subsystems or fully interconnected subsystems to guarantee connective stability. The more interconnections in the network, the less connectively stable the overall system becomes.
References
Alcazar, J. A. and Garcia, E., 2006, Parametric Analysis for Modeling and Simulation of Stochastic Behavior in the Predator-Prey Pursuit Domain, Transactions of the Society for Modeling and Simulation International, vol. 82, no. 12, 827-840.
Arkin, R. C., 1992, Cooperation without communication: Multiagent schema based robot navigation, J. Robot. Syst., vol. 9, no. 3, 351-364.
Balch, T. and Arkin, R. C., 1998, Behavior-based formation control for multirobot teams, IEEE Trans. Robot. Automat., vol. 14, 926-939.
Chen, Q. and Luh, J. Y. S., 1994, Coordination and control of a group of small mobile robots, in Proc. 1994 IEEE Int. Conf. Robot. Automat., San Diego, CA, 2315-2320.
Chen, C. T., 1999, Linear System Theory and Design, New York: Oxford University Press.
Cliff, D. and Miller, G. F., 1996, Co-evolution of pursuit and evasion II: simulation methods and results, Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior.
Desai, J. P., Ostrowski, J. and Kumar, V., 1998, Controlling formations of multiple mobile robots, in Proc. Conf. Robotics and Automation, Leuven, Belgium, 2864-2869.
Desai, J. P., Kumar, V. and Ostrowski, J. P., 2001, Modeling and control of formations of nonholonomic mobile robots, IEEE Trans. Robot. Automat., vol. 17, 905-908.
Feddema, J. T., Lewis, C. and Schoenwald, D. A., 2002, Decentralized control of cooperative robotic vehicles: theory and application, IEEE Trans. Robot. Automat., vol. 18, 852-864.
Fredslund, J. and Mataric, M. J., 2001, Robot formations using only local sensing and control, in Proc. IEEE Int. Symposium on Computational Intelligence for Robotics and Automation (CIRA-01), Banff, Canada.
Fredslund, J. and Mataric, M. J., 2002, A general algorithm for robot formations using local sensing and minimal communications, IEEE Trans. Robot. Automat., vol. 18, no. 5, 837-846.
Kalmar-Nagy, T., D'Andrea, R. and Ganguly, P., 2004, Near-optimal dynamic trajectory generation and control of an omnidirectional vehicle, Robotics and Autonomous Systems, vol. 46, 47-64.
Kube, R. C. and Zhang, H., 1993, Collective robotics: From social insects to robots, Adaptive Behavior, vol. 2, no. 2, 189-218.
Siljak, D. D., 1990, Large Scale Dynamic Systems, New York: Academic.
Siljak, D. D., 1991, Decentralized Control of Complex Systems, New York: Academic.
Chapter 9
Estimating Complex System Robustness from Dual System Architectures Chad Foster Massachusetts Institute of Technology, Cambridge, MA [email protected]
This paper investigates differences in robustness and reliability between hardware-only and hardware-software products presented in the patent literature. Simplified models explore the reliability and robustness of these products. In general, more complex hardware-software systems provided higher robustness but lower reliability. Additional product complexity may create a tradeoff between increased robustness and decreased reliability. Systems that require both high complexity and reliability have to overcome this hurdle, and while simple answers are not presented here, some industry-specific case studies are investigated. Examples of this tradeoff phenomenon are presented, as are directions for future research.
1 Introduction
Patent examples from a variety of industries are used to explore the issues of robustness and reliability. A short discussion is included about other aspects of the differences: reliability growth, the cost of changes, and development time frames. The following discussion presents eight patents and the reasoning behind the mechanical or software aspects of each solution. Using information from the literature[9], the generic reliability was compared for each solution. A simplified
Figure 1: 6,991,280
Figure 2: 6,926,346
Figure 3: 4,086,022
robustness comparison was also made between the patent pairs. For the purposes of this paper, robustness is defined as the ability to deal with variance in the system. Reliability is defined as the time between failures for the system, and is assumed constant. All components in the system are considered to be of average reliability. A general discussion of the results follows, with some conclusions and directions for future work.
2 Patents
Patent number 6,991,280, shown in Figure 1, describes "Airflow Control Devices Based on Active Materials." This device is based on an active material such as a piezoelectric material or a shape memory alloy. Included in this valve is a sensor and a controller to provide closed-loop position control. Specific robustness and reliability claims include a reduction in maintenance and a reduction in the number of failure modes. The comparison patent, 6,926,346, shown in Figure 2, describes an Adjustable Vehicular Airflow Control Device. This device is more conventional, using a belt or gear drive to control a deflector. The implementation presented uses feedforward control that utilizes the vehicle speed sensor or a manual selection. The major innovation is adding a controllable airflow device where it has historically been fixed. Patent 4,086,022, shown in Figure 3, describes an Improved Compressor Casing. This improvement is described in other papers [3] and includes the addition of a number of slots in the casing to delay the onset of stall or surge. This solution uses a mechanical change to increase the operating range of the system. A comparison patent, 6,354,602, uses measurement and active operating limit
Figure 4: 6,889,803
Figure 5: 6,371,857
Figure 6: 2,859,641
Figure 7: 6,637,572
line management for surge avoidance. This is done completely in software, using sensor measurements and transfer functions to actively control the air and fuel flows. The major improvement, in addition to surge avoidance, is the ability to control for compressor fouling. The next example adds electronics to a current system. Patent 6,889,803, shown in Figure 4, creates a "torsional active vibration control system." This device creates an adjustable torsional damper with at least one actuator that can adjust the absorption characteristics of the system. A comparative example without electronics is given in patent 6,371,857, shown in Figure 5, which describes a torsional vibration damper with increased stability. The base structure of the two inventions is similar. The mechanical design adds tuned dampers to improve the response frequency bandwidth. An example from the automotive literature exists in a locking differential. One traditional locking differential is called the Torsen differential and is explained in patent 2,859,641, shown in Figure 6. This differential is completely mechanical, with a number of gears that 'lock' when the wheels begin to slip. Although this patent is 48 years old it is still used extensively in vehicles manufactured today (Audi, GM). A comparative example is a relatively new patent that uses electromagnetic clutches and a number of sensors to simulate a locking differential, although the actual differential is standard. This patent, 6,637,572, is shown in Figure 7. There are numerous other patent examples that demonstrate this architecture comparison. The products and comparisons are frequently obscured between mechanical and electrical hardware and the proprietary software. The patent literature is often incomplete, and many companies choose to maintain ownership through trade secrets rather than patents. Many of these
examples are incomplete and so were excluded from this study.
3 Reliability
Patents were drawn from literature that were well known in the marketplace and offered typical comparisons. It is the author's belief that these are typical representations of dual patent architectures and not unique cases. The reliability for each of these systems was calculated by using the failure data in published literature[9]. Two designs were designated as base designs and only the change in reliability was calculated. The expected reliability of each design did not differ by more than 75 failures per million operating hours. An attempt was not made to add modifications to the life equations, because similar in-use conditions were assumed. The tabulated comparison between the patents of the two differing designs covered four devices, listing patent number and expected life for each design: airflow control, vibration damper, surge avoidance, and transmission.
It is noted that in every comparison the reliability of the system decreases with the addition of electronics and software. The mechanical components were replaced by a large number of electrical systems, thus lowering the total reliability. Although this holds as a general trend here, there are specific examples of overly complex mechanical systems, or cases where the software system has improved reliability by removing failure-prone components. In addition to this high-level calculation of reliability it should also be noted that the reliability growth differs between these two systems (and is not studied here). It can be feasible to improve the reliability of the software in-situ, through software upgrades or other bug-fix releases. Mechanical systems are more difficult to upgrade and are very expensive to replace in the field. The most frequent upgrade to non-safety related items is to fix-as-failed, thus having only a slow effect on improving reliability. This practice replaces the poor previous design with a better design only when the product is brought in for a repair (reactive maintenance). The models of improving reliability in the field show a lower dollar requirement for the software system to repair after release. Taking the rule of thumb that the cost of changing mechanical systems increases by an order of magnitude for every development stage, and taking some field data on software development, the lesser cost of software is shown in Figure 8[6]. The comparisons of reliability used here do not include the actual software, only the hardware components supporting that software. The reasoning behind this is that there is no accepted method for measuring software reliability[11].
Figure 8: Cost of Making changes in Mechanical and Software Systems
The best estimate used by researchers is based on the number of lines of code[6]. The addition of software only reduces the overall system reliability, and can reduce it quite severely and in unexpected ways. Software can be improved with additional redundancy, both in hardware and software, reliable software practices, simpler software blocks and more reliability testing. And although counter-intuitive, the most reliable systems avoid off-the-shelf software components. These custom systems are comprised of small, well-tested, redundant sub-systems of code and hardware. This creates a system where there are no unknown segments of code nor hidden interfaces[7]. Reused mechanical components have the opposite effect and are usually preferred due to their demonstrated robustness and lack of hidden features. There are specific reliability cases that could reveal both extremes: reliable software with unreliable hardware, or unreliable software with reliable hardware. The general case shown here is that adding sufficiently more components, even if they are highly reliable, reduces the overall predicted reliability of the system.
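The claim that adding components lowers predicted reliability follows from the series-reliability model with constant, independent failure rates. A sketch with hypothetical failure rates (the component values are invented for illustration):

```python
import math

def series_reliability(failure_rates_per_mh, hours):
    """Probability a series system survives `hours`, assuming constant
    failure rates (per million operating hours) and independent failures."""
    lam = sum(failure_rates_per_mh) / 1e6  # total failure rate per hour
    return math.exp(-lam * hours)

mech = [5.0, 3.0]                # hypothetical mechanical components
hw_sw = mech + [2.0, 2.0, 1.5]   # same system plus sensor, controller, wiring

print(series_reliability(mech, 10_000) > series_reliability(hw_sw, 10_000))  # True
```

Because the rates add, every extra component in the series path, however reliable, strictly lowers the predicted survival probability.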
3.1 Failure Severity
Failure severity is defined as the number of failures that would completely prevent the system from completing its function. Ideally this is zero, and any failure would still allow the system to perform its function (even with limited performance). This is interpreted in the software domain as fault tolerance. The primary catastrophic mechanical failure is fatigue. These failures are normally abrupt, without obvious warning. The primary prevention technique is expensive on-site monitoring, or adequate margin in the mechanical design. This failure can be correlated to the loading, endurance limit and yield stress. There exists a well-known, straightforward, second order relationship[10]. In software there are often numerous catastrophic failure modes with nonlinear relationships. The complexity of software systems stems from interdependencies, assumptions, and recycled code that are difficult to test. Replacing mechanical systems with software systems requires changing the development
process to account for this complexity and testing difficulty.
4 Robustness
The robustness definition used in this paper is the ability for a system to react to noisy input parameters with little performance degradation. Simple robustness comparisons are used that compare the different embodiments. For each of the previous patents, a relative ± comparison is made about the robustness. The airflow devices' functional requirements concerning robustness, and the expectation for those requirements, are laid out in the following chart. Note the comparisons were only made '+' and '-' because of the high-level nature of this comparison; further comparisons would require the approximate distributions of these inputs and mathematical or simulation models of the outputs.

    Requirement           Mech (6,991,280)    Soft (6,926,346)
    Variance in Air       0                   0
    Unstable Currents     0                   +
    Temperature Change    -                   +
The difficulty of controlling the active materials severely limits their use. Temperature, current, and other effects need to be actively controlled, which decreases the base robustness of this solution. The two methods to avoid surge were first to change the casing, and second to use sensors and a control system. Here is the comparison of their noise factors.

    Requirement       Mech (4,086,022)    Soft (6,354,602)
    Pressure Ratio    +                   0
    Soot Loading      0                   +
    Temperature       0                   +
The added software gives the syste m tr emendous flexibility to operate closer to t he surge limit and thus improve operating efficiency. The addit ion of mechanical slots adds manufacturing complexity and provides some losses during normal operation. The torsional vibr ation damper added an electronic cont roller with the intent of increasing th e damped frequency range. Requirem ent Freque ncy Va rian ce Wear
Mech(6,371,857)
o o
Comparing the locking transmissions also shows greater robustness in the software-based solution. The software system is able to respond to a greater range of wheel speed differences, and react faster. The software also allows the flexibility for the system to respond when the wheels 'may' slip, rather than waiting for them to be slipping.

    Requirement      Mech (2,859,641)    Soft (6,637,572)
    Friction Loss
    Driving Speed
The robustness of these functional requirements was improved when performed with the software-based system over the mechanical-based system. In general, the addition of the software addressed a failure and gave a more consistent output for the range of inputs.
5 Complexity
The systems so far have been assumed to follow a linear law of requisite variety [1]. The amount of control that is generated by the system is directly dependent on the number of independent variables. The question arises about the coordination between the variables. It would be expected that the percentage of problems due to the software would increase by a power law as the amount of software in vehicles increases. This is not observed in the recall data available through the National Highway Transportation Safety Administration (NHTSA). Motor vehicles seem to have the same percentage and severity of recalls with software as with hardware. Although there are numerous examples of poor software, and newsworthy failures (Ariane 5 rocket, AT&T switching system, Therac-25, Osprey helicopter) [5, 6], the added complexity does not create disproportionate numbers of failures. This does not indicate that the systems are not more complex (they are); there is just no evidence of a power law relationship. The decrease in reliability for the software products can be seen from consumer reports data[2]. One of the big software development differences is the cost to find and fix any errors; it is reported that 80% of software development costs are in finding and fixing defects[6]. During the development of the Space Shuttle, the independent validation and verification group was almost as large as the development group. The Space Shuttle software is considered the state of the art for reliable code at 0.1 faults per 1000 executable lines of code (KXLOC). (Windows 2000 is rated at 2-4 per KXLOC.)[6] There are also variations in software quality not seen in the more standardized, and simpler, mechanical design. For example, in an experiment[8], 27 versions of the same algorithm were developed in Pascal; the most reliable did not fail in a million trials and the least reliable failed 10,000 in a million trials.
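The faults-per-KXLOC figure above is a simple proportionality, which is how researchers use it to estimate total fault counts. A sketch (the 420 KXLOC size used for the Shuttle flight software is an assumed illustrative value, not from the text):

```python
def expected_faults(kxloc, faults_per_kxloc):
    """Rule-of-thumb estimate: faults scale with executable code size,
    measured in faults per 1000 executable lines (KXLOC)."""
    return kxloc * faults_per_kxloc

# Shuttle-grade rate (0.1/KXLOC) vs. a Windows 2000-grade rate (~3/KXLOC),
# both rates taken from the text; the 420 KXLOC size is an assumption.
shuttle = expected_faults(420, 0.1)
commodity = expected_faults(420, 3.0)
print(shuttle < commodity)  # True
```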
There are unique requirements in both mechanical and software systems. Making the decision between the two systems requires complex tradeoffs in development resources, testing, and final end-use reliability.
6 Conclusion
The move to more complex systems, and specifically software systems, is not without tradeoffs. The addition of components of greater complexity may reduce reliability for the benefit of increased robustness. It is challenging to make the tradeoff between reliability and robustness, but one useful tool is in the transition to more complex systems. This relationship does not appear nonlinear in the literature nor in industrial practice; as the complexity increases so does robustness, at the expense of reliability. There is still potential for further cascading, catastrophic failures with this increased complexity. This phenomenon was not seen in the aircraft industry [4] and is not expected in the automotive industry. Future work is focused on aiding the designer in making the decision between software-based and mechanical-based solutions. This considers items such as budget, schedule, testing, resource loading, and product reparability. The goal is to facilitate a more knowledgeable choice between system options and managing complexity.
[3] Clausing, Don and Daniel D. Frey, "Improving system reliability by failure-mode avoidance including four concept design strategies", Systems Engineering 8, no. 3 (2005), 245-261.
[4] Frey, Daniel D., John Sullivan, Joseph Palladino, and Malvern Atherton, "Part count and design of robust systems", Systems Engineering (INCOSE) (2006).
[5] Gage, Deborah, and John McCormick, "We did nothing wrong: Why software quality matters", Baseline (2004).
[6] Hatton, Les, "Software failures, follies, and fallacies", IEEE Review (1997).
[7] Keller, Ted, and Norman F. Schneidewind, "Successful application of software reliability engineering for the NASA Space Shuttle", Proceedings of the 1997 8th International Symposium on Software Reliability Engineering, ISSRE (1997).
[8] Knight, J. C. and Leveson, N. G., "An experimental evaluation of the assumption of independence in multi-version programming", IEEE Transactions on Software Engineering 12 (1986), 96-109.
[9] Moss, T. R., The Reliability Data Handbook, ASME Press (2005).
[10] Shigley, Joseph Edward, and Charles R. Mischke, Mechanical Engineering Design, 6th ed., McGraw-Hill (2001).
[11] Smith, David J., Reliability, Maintainability and Risk: Practical Methods for Engineers, Elsevier (2005).
Chapter 10
Inquiry and Enterprise Transformation Dean J. Bonney Graduate Student Johns Hopkins University dean. [email protected]
1.1. Introduction
What really constitutes an Enterprise from an Enterprise Systems Engineering perspective? My effort to create a workable definition follows: An Enterprise is a complex system of community, individual, and semiotic systems defined by the way its authentic relationships are constructed and dependent on the ways the parties to these relationships bind to one another. The imprint of an Enterprise is projected through its spatial identity, an identity that builds and maintains its currency through the positive images and authentic relationships it embraces. As we inquire about the state of the Enterprise, how will we collect the data that will enable us to influence the desired transformation? There exists a hierarchy of inquiry that can move an Enterprise Systems Engineer's line of inquiry from focusing on the best solution to solving the problem (e.g. transformation) to focusing on visualizing a future that will enable the Enterprise to shift from where it is to where it desires to be. To be proficient in this line of inquiry, one must become familiar with the architecture of transformational questions, the process for creating those questions that will acquire the data necessary for a transformational shift, and when and how to execute the inquiry.
This paper will introduce the Enterprise Systems Engineer to Enterprise properties that should be understood before a transformation initiative can begin, and will provide an inquiry process for beginning the transformation.
1.2. The Art and Science of Inquiry and Transformation
How can we transform an Enterprise without first knowing of, and then understanding, its spatial identity? Enterprises come with no automatic markers for territorial or systemic integrity. An Enterprise's spatial identity is defined by the authentic relationships and positive images that community, individual and semiotic systems share with each other. Systems thinking, both technical and social, is required to understand the dynamics of the Enterprise as systems, relationships, and images bind and unbind within the Enterprise domain. The domain of an Enterprise is bound by its spatial identity, and resistance from other Enterprises influences this identity. Using technology as an enabler can both transform and reform the relationships and images that compose an Enterprise's spatial identity.
1.2.1. Transformational properties of an Enterprise
The following paragraphs describe the Enterprise properties that play a role in the transformation process.
1.2.1.1. Enterprise Systems
The author has selected community, individual, and semiotic to categorize the systems that, together, form an Enterprise. The spatial identities of these system categories should have a high degree of affinity and form the Enterprise spatial identity. Community systems are composed of community or subject matter experts and peer groups that organize, manage, and disseminate community data in support of both community and Enterprise positive images and their authentic relationships. Information exchange packages are distributed between other community and individual systems within the Enterprise. Examples of community systems can be found in the business operations of an Enterprise. Individual systems consist of a single individual or group of individuals dedicated to promoting an authentic relationship or positive image within the Enterprise. An example of an individual system would be a set of processes whose outcomes consist of widgets that can be identified as an Enterprise product by external consumers. A semiotic system is one that is symbolic to the Enterprise. There is only one semiotic system in an Enterprise and it embraces all positive images and authentic relationships that the Enterprise accepts. An Enterprise cannot exist without the presence of a semiotic system. An example of a semiotic system would be the Cross in the Christian religion. Any use of the Cross by Christians transmits a message to other members of the Christian Enterprise as well as to external consumers of that message.
1.2.1.2. Spatial Identity Space and identity are products of positive images and authentic relationships. The spatial identity of an Enterprise is bounded by the differences of its identity in relation to other Enterprises. For the Enterprise, that boundary exists at the point where other Enterprises resist the influence of its positive images and authentic relationships.
1.2.1.3. Authentic Relationships An authentic relationship exists for an Enterprise when it can be established that its systems are exploring and inventing together based on the relationship. Authentic relationships arise from sharing and acknowledgement, from regular expressions of commitments and values, and from trust and honesty [Harris 1999].
1.2.1.4. Positive Images Positive images exist as a result of the semiotic system that forms the core of an Enterprise. Positive images are not necessarily viewed as positive by other Enterprises. Examples of this include the 20th century struggle between communism and democracy. Each of these Enterprises embraced positive images that were anathema to the other. Most Enterprise positive images are expressions of its semiotic system. For example, the images of redemption and afterlife universally held by Christians are expressions of the semiotic system represented by the Cross.
1.2.2. Inquiry: Architecture and Process Inquiry is an acquired skill that assists the Enterprise Systems Engineer in identifying an Enterprise's spatial identity and the properties of that identity (systems, images, and relationships). Once this data set is acknowledged and accepted through conscious awareness by the Enterprise, it becomes possible to transform the Enterprise. It is the author's belief that it is the responsibility of the Enterprise Systems Engineer to facilitate, record, and influence activities that lead to conscious awareness.
1.2.2.1. Architecture The architecture of inquiry is based on its construction, scope, and assumptions. The linguistic construction of a question can make a critical difference in the success of leading an Enterprise to conscious awareness of its spatial identity. Think of the construction as a continuum that moves from less powerful to more powerful questions. At the less powerful end, questions elicit yes/no responses. The more powerful questions begin with how, what, and why. Questions that begin with these words stimulate more reflective thinking and a deeper level of conversation. A deeper level of conversation is required to unlock the unconscious understanding of an Enterprise's images, relationships, and semiotic purpose [Vogt 2003]. It is important not only to be aware of how words influence the effectiveness of inquiry, but also to match the scope of a question to its necessity.
'Take a look at the following three questions:
• How can we best manage our work group?
• How can we best manage our company?
• How can we best manage our supply chain?
In this example, the questions progressively broaden the domain of inquiry as they consider larger and larger aspects of the system; that is, they expand in scope [Vogt 2003].' The author will continue with Vogt's example of assumptions. 'Because of the nature of language, almost all of the questions we pose have assumptions built into them, either explicit or implicit. These assumptions may or may not be shared by the group involved in the exploration; for instance, the question, "How should we create a bilingual education system in California?" assumes that those involved in the exploration have agreed that being bilingual is an important capacity for the state's students. However, some powerful questions challenge everyone's existing assumptions. For example, ask yourself what assumptions the following question might challenge: "How might we eliminate the border between the U.S. and Mexico?" To formulate powerful questions, it's important to become aware of assumptions and use them appropriately [Vogt 2003].' In the citations above, Vogt demonstrates the importance of being deliberate and premeditated in approaching and developing a line of inquiry that will unlock an Enterprise's unconscious awareness.
1.2.2.2. Process Vogt provides the following 'game plan process' for unlocking unconscious awareness:
• Assess the current situation - Conduct a situation analysis that includes some or all of the following:
  o Assessment or gap analysis of current and desired outcomes-based results
  o Meetings with key stakeholders to unlock and discover images and relationships
  o Mapping of resistance points that could influence the future spatial identity of the Enterprise
• Discover the "big questions" - Seek the core questions, usually 3 to 5, that, if answered, would make the most difference to the future of the Enterprise. This can be accomplished by clustering related questions and considering the relationships between them. 'Clarify the "big questions" that the clusters reveal and frame these as clear and concise queries, not as problems. Something fundamental changes when people begin to ask questions together - they go beyond the normal stale debate about problems that passes for strategy in many Enterprises [Vogt 2003].'
• Create images of possibility - Creating vivid images of possibility begins the process of transforming to new positive images. These images can be visualized through the use of the 'art of the long view' process, where different stories of the future are created from the 3 to 5 questions posed.
• Evolve workable strategies - Based on the new images, form relationships that will transform and reform the Enterprise.
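The "cluster related questions" step above can be sketched in code. The sketch below is a hypothetical illustration, not part of Vogt's method: it groups candidate inquiry questions that share a keyword, so a facilitator can see how many distinct "big question" candidates remain. The questions and stopword list are invented examples.

```python
# Hypothetical sketch: group candidate questions by shared keywords to
# surface the few "big questions." Everything here is illustrative.

STOPWORDS = {"how", "what", "why", "can", "we", "our", "the", "best", "do", "in"}

def keywords(question):
    """Content words of a question, ignoring common question words."""
    words = question.lower().replace("?", "").split()
    return frozenset(w for w in words if w not in STOPWORDS)

def cluster(questions):
    """Group questions that share at least one keyword with a cluster."""
    clusters = []
    for q in questions:
        ks = keywords(q)
        for c in clusters:
            if c["keys"] & ks:          # overlaps an existing cluster
                c["questions"].append(q)
                c["keys"] |= ks
                break
        else:                            # no overlap: start a new cluster
            clusters.append({"keys": set(ks), "questions": [q]})
    return [c["questions"] for c in clusters]

qs = ["How can we best manage our work group?",
      "What do customers value in our work?",
      "Why do suppliers leave?"]
print(len(cluster(qs)))  # → 2 (the first two questions share "work")
```

A real facilitation session would of course cluster by meaning rather than keyword overlap; the sketch only shows the shape of the step.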
By moving from a problem/solution paradigm toward a process focused on essential inquiry, the Enterprise Systems Engineer will be able to slowly transform the Enterprise from a future that reacts to stimuli to a future that achieves the possible.
1.3. Conclusion There is still much to be explored in the emerging field of Enterprise Systems Engineering. This paper provides a theoretical approach to Enterprise transformation that is based mostly on observation and experience. The process for using inquiry as a catalyst for transformation has been used successfully at the organizational level. It is the author's hope that the definition and properties of an Enterprise described in this paper provoke debate about the essence of an Enterprise.
References
Harris, D. L., 1999, Transforming a Social Movement, The School of Cooperative Individualism (Chicago).
Vogt, E. E., Brown, J., & Isaacs, D., 2003, The Art of Powerful Questions, Pegasus Communications, Inc. (Waltham).
The author's affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended to convey or imply MITRE's concurrence with, or support for, the positions, opinions or viewpoints expressed by the author. MITRE Case No. 08-0362 is approved for public release; distribution unlimited.
Chapter 11 Capabilities-Based Engineering Analysis (CBEA) Mike Webb The MITRE Corporation [email protected] Abstract This paper describes capabilities-based engineering analysis (CBEA) as a new analytical approach to support enterprise systems engineering (ESE). CBEA provides a framework for capabilities-based planning, programming, and acquisition analysis in a systemic approach to the purposeful evolution of a complex enterprise. This paper outlines the basic approach, guiding principles, and challenges of CBEA for researchers and practitioners of ESE.
2. Capabilities View of the Enterprise Economic theory has increasingly focused on the concept of capabilities as the defining characteristic of an enterprise, and this view has propagated through the fields of business and management. Indeed, in its firm-oriented incarnation, evolutionary economics [Nelson & Winter 1982] is often referred to as the capabilities view [Foss 1993]. In this view, an enterprise is conceptualized and distinguished in terms of its unique capabilities, and these capabilities are considered to be the most potent source of competitive advantage, growth, and success of an enterprise [Collis 1994].
2.1. The Complex Context Central to the capability view is a primal focus on deploying enterprise resources to achieve end-user effects or outcomes [Haeckel 1999]. However, the focus on purposeful effects is typically complicated by the complexity of the enterprise and the environment in which it operates. Figure 1 illustrates a high-level view of an enterprise and the issues attendant to its complex context.
Figure 1. The Capabilities-Based View of the Enterprise

The complex environmental context shapes the efficacy of the enterprise. For example, in evolutionary economic theory [Nelson and Winter 1982], organizational capabilities and decision rules evolve over time through deliberate problem-solving and random events. An economic analogue of natural selection operates as the market determines which firms are and are not viable amid the real-world difficulties of complexity, uncertainty, and bounded rationality. Interestingly, the authors conclude that attempts at long-range optimization and control of technological advances will lead not to efficiency but to inefficiency. This capabilities perspective is not limited to commercial enterprises; the United States Department of Defense (DoD) has adopted such a view to operate in today's
complex, uncertain world. The 2001 Quadrennial Defense Review [DoD 2001] promulgated a capabilities-based approach to planning that was reaffirmed and detailed in the recent Joint Defense Capabilities Study [Aldridge 2004]. Capabilities-based planning has been defined as "... planning, under uncertainty, to provide capabilities suitable for a wide range of modern-day challenges and circumstances, while working within an economic framework" [Davis 2002].
2.2. The Focus of Enterprise Modernization Researchers emphasize the need for an enterprise to invest in capabilities rather than functions or business units [Stalk, Evans & Schulman 1992]. Chandler [1990] views the integration, coordination, and maintenance of organizational capabilities to be as difficult as their creation, as changing technologies and markets constantly pressure capabilities toward obsolescence. The survival and growth of the enterprise depends on a continuing modernization of organizational capabilities. As the pace of change continues to increase and the nature of that change becomes increasingly discontinuous, the concept of dynamic capabilities becomes a key concern of the enterprise. Dynamic capabilities are typically viewed as the ability to integrate, build, and reconfigure competencies to address rapidly-changing environments [Adner & Helfat 2003; Eisenhardt 2000; Teece et al. 1997; Winter 2003]. CBEA is intended to serve as an adaptable approach to achieving dynamic capabilities.
3. CBEA - Embodying Complex System Strategies The complex systems paradigm has strongly influenced modern enterprise management theory, and numerous researchers have studied the implications of complexity for enterprise strategies [e.g. Beinhocker 1999; Brown & Eisenhardt 1998; Cohen & Axelrod 1999; Connor 1998; Haeckel 1999; Kelly & Allison 1999; Macintosh & MacLean 1999; Sanders 1998]. In the context of capabilities analysis, complexity results from the interconnectedness of capabilities, from the social relationships within the enterprise [Barney 1991], and from co-specialized assets, that is, assets which must be used in conjunction with one another [Teece 1986]. To address these issues, CBEA is founded on a few key strategies from complex systems theory.
3.1. Modularity Inspired by "The Architecture of Complexity" [Simon 1962], many researchers advocate the adoption of modularity as a design principle to manage complexity [Baldwin & Clark 2000; Sanchez & Mahoney 1996; Schilling 2000]. The adoption of modular design principles for all aspects of the enterprise (organization, resources, processes, and effects) entails the creation of semi-autonomous modules with stable and visible rules for communication and interaction. As long as the integrity of intermodule interaction is preserved, module designers are free to engage in local (i.e., within a module) adaptation or innovation. Partitioning the enterprise into capability categories, with attendant capability portfolios focused on critical end-user outcomes, is a core strategy of CBEA to manage complexity. However, as noted above, many interactions and complications are likely to remain to some degree across the enterprise. For example, while two systems may be weakly coupled structurally, they may be highly coupled functionally. Schaefer
concludes that, "it would seem unlikely that a firm could ever hope to uncover an optimal modular design partition ..." [Schaefer 1999, 325].
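One way to make the modularity idea concrete is to measure how much interaction a proposed capability partition leaves crossing module boundaries. The sketch below is an illustrative assumption, not a CBEA artifact: the element names, modules, and dependency weights are invented, and real enterprise coupling is far harder to quantify.

```python
# Hypothetical sketch: score a candidate capability-module partition by the
# fraction of interaction weight that crosses module boundaries (lower is
# more modular). All names and numbers are invented for illustration.

def coupling_ratio(dependencies, partition):
    """Fraction of total dependency weight that crosses module boundaries.

    dependencies: dict mapping (element_a, element_b) -> interaction weight
    partition:    dict mapping element -> module name
    """
    total = cross = 0.0
    for (a, b), weight in dependencies.items():
        total += weight
        if partition[a] != partition[b]:
            cross += weight
    return cross / total if total else 0.0

# Illustrative enterprise elements grouped into two capability modules.
partition = {"billing": "finance", "ledger": "finance",
             "orders": "operations", "shipping": "operations"}
dependencies = {("billing", "ledger"): 5.0,   # intra-module interaction
                ("orders", "shipping"): 4.0,  # intra-module interaction
                ("billing", "orders"): 1.0}   # cross-module interaction

print(coupling_ratio(dependencies, partition))  # → 0.1
```

A low ratio indicates that most interaction stays inside modules, which is the property modular design tries to preserve; Schaefer's caution above is that no real partition will drive this to an optimum.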
3.2. Exploratory Modeling & Analysis Enterprise planning problems are typically characterized by enormous uncertainties that should be central considerations in the design and evaluation of alternative courses of action. A key approach to addressing pervasive uncertainty is exploratory modeling & analysis [Bankes 1993; Davis and Hillestad 2001; Lempert, Schlesinger & Bankes 1996]. The objective of exploratory analysis is to understand the implications of highly uncertain problem and solution spaces to inform strategy and design choices. In particular, exploratory analysis is intended to identify strategies that are flexible, adaptive, and robust. CBEA uses exploratory modeling and analysis to examine enterprise capability issues in the broadest possible context of scenarios, conditions, and assumptions, readily complementing the technique of planning with multiple scenarios [Beinhocker 1999; Courtney 1997; Epstein 1998; Schoemaker 1997]. Relevant models and analysis methodologies should be able to reflect hierarchical decomposition through multiple levels of resolution and from alternative perspectives representing different aspects of an enterprise. Ideally, the analytical agenda should be able to examine the relative fitness of enterprise responses to a great variety of possible futures.
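The core move of exploratory analysis can be illustrated with a toy maximin comparison: score each candidate strategy across many scenarios and prefer the one whose worst case is best. The strategies, scenarios, and payoff numbers below are invented for illustration; this is a minimal sketch of the "robust" criterion, not the exploratory-modeling methodology of Bankes or Davis.

```python
# Illustrative sketch: choose the robust (best worst-case) strategy across
# uncertain scenarios. The payoff table is an invented example.

def robust_choice(payoffs):
    """payoffs: dict mapping strategy -> {scenario: outcome score}.
    Returns the strategy with the best worst-case score (maximin rule)."""
    return max(payoffs, key=lambda s: min(payoffs[s].values()))

payoffs = {
    "optimize_for_baseline": {"baseline": 9, "surge": 2, "disruption": 1},
    "flexible_portfolio":    {"baseline": 7, "surge": 6, "disruption": 5},
    "hedge_everything":      {"baseline": 5, "surge": 5, "disruption": 5},
}

print(robust_choice(payoffs))  # → flexible_portfolio
```

The point of the toy: the strategy optimized for the expected scenario wins only in that scenario, while the flexible portfolio holds up across all three, which is the kind of robustness exploratory analysis is meant to surface.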
3.3. Adaptive Evolutionary Planning The complex dynamics of enterprise relationships suggest that enterprise evolution depends critically on factors other than global intention and design. Complex systems have the internal capacity to change in unpredictable ways that cannot be described by optimization planning approaches. Researchers point to the limits of predictability and conclude that reliance on a single evolutionary strategy is inappropriate [Beinhocker 1999; Pascale 1999]. Instead, the goal should be to develop a collection of strategies to facilitate ready adaptation to future changes. As an extension of exploratory modeling and analysis, CBEA considers a broad set of risk factors (e.g. cost, schedule, performance, technologies) to envision alternative evolutionary paths. This implies planning for multiple options and adapting strategies as scenarios unfold. Preparing for multiple contingencies is a key element of an adaptive planning process. Haeckel [1999] emphasizes that creating an enterprise culture of adapting and responding is paramount to survival and success.
4. CBEA - Shaping the Evolution of Capabilities CBEA is founded on the premise that an enterprise and its capabilities must be viewed as complex adaptive systems. This perspective fosters a change in focus from specific programs or functions to a capability-centric design that facilitates horizontal integration of a distributed, mission-oriented enterprise. CBEA supports this perspective by analyzing enterprise capabilities, linking capabilities to portfolios of processes and resources, and identifying reconfigurations for effective adaptation. The analytical approach is based on iterative analysis and design focusing on structure, behavior, and effects. Structure includes all system inputs and their
relationships (e.g. policy, organization, training, materiel, leadership, personnel, and facilities) [Gharajedaghi 1999]. Behavior entails the action and performance required to produce outcomes, and effects are the resulting operational outcomes. These three dimensions, their interdependencies, and their interactions with the environment provide the foundation for a holistic enterprise analysis methodology.
4.1. Analytical Principles Based on the foregoing context and motivations, CBEA is founded on a key set of analytical principles:
1) Focus on outcomes (desired operational effects) of the enterprise end-user, vice inputs such as particular programs, platforms, or functions.
2) Frame a portfolio perspective as a means of partitioning the problem and solution spaces in terms of capabilities (capability partitions or modules).
3) Approach issues holistically, examining all aspects of structure, behavior, and effects; consider a full range of alternative solutions to provide a capability.
4) Examine the complex networks of interdependencies. Such interdependencies exist across all the fundamental dimensions of analysis (structure, behavior, and effects), in multiple aspects, at different levels of hierarchical description.
5) Explicitly bound the profound uncertainties attendant to complex adaptive system problems; all assessments must be accompanied by rigorous risk analysis.
6) Pursue an adaptive evolutionary approach to planning to position the enterprise to respond effectively to changes as they occur.
7) Assess and balance the evolution of capabilities within resource constraints for a wide range of diverse and stressing operational circumstances.
4.2. Analytical Activities The scope and complexity of the capabilities-based enterprise perspective requires a disciplined and robust analytical construct capable of assessing and managing risk across very diverse and uncertain problem sets, with consideration and competition of a broad range of solution approaches constrained by cost and resources. The need for engineering analysis pervades this perspective, and CBEA addresses this need. Figure 2 identifies the basic modules and general flow of CBEA. The modularized schema is not a standardized process, but a collection of interrelated analytical activities that can be assembled as needed to support capabilities-based planning. All the modules interrelate, but there are three major phases: 1) purposeful formulation; 2) exploratory analysis; and 3) evolutionary planning.
Figure 2. Capabilities-Based Engineering Analysis Activities

Purposeful formulation establishes the framework for analysis in a participative process that engages stakeholders as purposeful systems [Ackoff & Emery 1972]. Based on a thorough review of stakeholder needs and objectives, the analytical process specifies the relevant outcome spaces, that is, the operational goals, contexts, and conditions for which solutions must be designed. A significant part of the CBEA approach is to stimulate analysts to consider a wide set of possible scenarios and conditions against which solution options are evaluated by outcome-based metrics. Based upon the outcome space descriptions, the baseline capability portfolio is bounded and described. A capability portfolio includes all the structural elements that must cooperate to provide the desired operational outcomes in a capability area. The exploratory analysis activities of CBEA assess the performance and cost of portfolio options over the broad range of formulated contexts, while generating new concepts for possible improvement. The focus of performance evaluation is an assessment of capability risks and opportunities, emphasizing the identification of critical capability drivers, capability gaps, and possible options for significant improvement. The focus on capabilities, vice existing solutions, facilitates proposals for new ways and means to accomplish missions, potentially fostering new transformational capabilities. CBEA emphasizes competition among creative alternative solution concepts, with cost considerations serving as critical constraints in defining feasible solutions (e.g. cost as an independent variable). CBEA evolutionary planning activities focus on developing flexible, robust, and adaptive approaches to the development and fielding of solutions within the context of capability portfolios and the broader enterprise.
Central to this effort is the examination and integration of alternative evolution strategies aimed at synchronizing component structures and behaviors under different possible contingencies; different time-phased
cost and performance profiles are developed for different evolution paths. Beyond a single capability portfolio, plans must be assessed for their impacts (e.g. technical, functional, and resource) on other capability portfolios and the broader enterprise. Such planning assessments are typically integrated in a capability roadmap that summarizes the results of the analysis and decisions.
5. Summary & Conclusions CBEA is a key element of enterprise systems engineering; it is the analytical framework that supports capabilities-based planning, programming, and acquisition in a systemic approach to the purposeful evolution of the enterprise. The purpose of CBEA is to help enterprise decision-makers adjudicate risks through their policy and resource decisions in a highly uncertain, dynamic, and complex environment. In this context, a focus on capabilities provides the guiding principles and raison d'être of the enterprise. This work outlines the key principles and processes of CBEA that can be adapted to provide the needed engineering analysis for a broad set of enterprise management issues. CBEA principles and processes, derived from basic strategies to deal with complexity, are relatively new and will continue to evolve. The need to focus on end-user capabilities in a complex, dynamic and uncertain world will continue to motivate the development of CBEA.
References
Ackoff, R. L. & Emery, F. E., 1972, On Purposeful Systems, Aldine-Atherton (Chicago).
Adner, R. & Helfat, C. E., 2003, "Corporate effects and dynamic managerial capabilities", Strategic Management Journal, 24, 1011-25.
Aldridge, P., 2004, Joint Defense Capabilities Study Final Report, Joint Defense Capabilities Study Team, January 2004.
Baldwin, C. Y. & Clark, K. B., 2000, Design Rules: The Power of Modularity, MIT Press (Cambridge, MA).
Bankes, S. C., 1993, "Exploratory modeling for policy analysis", Operations Research, 41(3), 435-449.
Barney, J. B., 1991, "Firm Resources and Sustained Competitive Advantage", Journal of Management, 17 (March), 99-120.
Beinhocker, E. D., 1999, "Robust adaptive strategies", Sloan Management Review, Spring, 95-106.
Brown, S. L. & Eisenhardt, K., 1998, Competing on the Edge: Strategy as Structured Chaos, Harvard Business School Press (Boston).
Chandler, A. D. Jr., 1990, Scale and Scope: The Dynamics of Industrial Capitalism, Harvard University Press (Cambridge, MA).
Cohen, M. & Axelrod, R., 1999, Harnessing Complexity: Organizational Implications of a Scientific Frontier, The Free Press (New York).
Collis, D. J., 1994, "Research Note: How Valuable Are Organizational Capabilities?", Strategic Management Journal, 15 (Winter), 143-152.
Connor, D. R., 1998, Leading at the Edge of Chaos: How to Create the Nimble Organization, John Wiley & Sons (New York).
Courtney, H., 1997, "Strategy under uncertainty", Harvard Business Review, 75(6), 67-81.
Davis, P. K., 2002, Analytic Architecture for Capabilities-Based Planning, Mission-System Analysis, and Transformation, RAND (Santa Monica, CA).
Davis, P. K. & Hillestad, R., 2001, Exploratory Analysis for Strategy Problems with Massive Uncertainty, RAND (Santa Monica, CA).
Department of Defense (DoD), 2001, Quadrennial Defense Review Report, United States Department of Defense (Washington, D.C.), September 30, 2001.
Eisenhardt, K. & Martin, J., 2000, "Dynamic capabilities: what are they?", Strategic Management Journal, 21, 1105-1121.
Epstein, J., 1998, "Scenario planning: an introduction", Futurist, 32(6), 50-52.
Foss, N. J., 1993, "Theories of the firm: contractual and competence perspectives", Journal of Evolutionary Economics, 3, 127-44.
Gharajedaghi, J., 1999, Systems Thinking, Butterworth-Heinemann (Boston).
Haeckel, S., 1999, Adaptive Enterprise, Harvard Business School Press (Boston, MA).
Kelly, S. & Allison, M. A., 1999, The Complexity Advantage: How the Science of Complexity Can Help Your Business Achieve Peak Performance, McGraw-Hill (New York).
Lempert, R., Schlesinger, M. E., & Bankes, S. C., 1996, "When we don't know the costs or the benefits: adaptive strategies for abating climate change", Climatic Change, 33(2), 235-274.
Macintosh, R. & MacLean, D., 1999, "Conditioned emergence: a dissipative structures approach to transformation", Strategic Management Journal, 20(4), 297-316.
Nelson, R. R. & Winter, S. G., 1982, An Evolutionary Theory of Economic Change, Belknap Press (Cambridge, MA).
Pascale, R. T., 1999, "Surfing the edge of chaos", Sloan Management Review, 40(3), 83-95.
Sanchez, R. & Mahoney, J. T., 1996, "Modularity, flexibility, and knowledge management in product and organization design", Strategic Management Journal, 17, Winter, 63-76.
Sanders, T. I., 1998, Strategic Thinking and the New Science: Planning in the Midst of Chaos, Complexity, and Change, The Free Press (New York).
Schaefer, S., 1999, "Product design partitions with complementary components", Journal of Economic Behavior & Organization, 38(3), 311-330.
Schilling, M., 2000, "Toward a general modular systems theory and its application to interfirm product modularity", Academy of Management Review, 25(2), 312-334.
Schoemaker, P., 1997, "Disciplined imagination: from scenarios to strategic options", International Studies of Management and Organization, 27(2), 43-70.
Simon, H. A., 1962, "The architecture of complexity", Proceedings of the American Philosophical Society, 106, 467-482; reprinted in: Simon, H. A., 1981, The Sciences of the Artificial (2nd ed.), MIT Press (Cambridge, MA).
Stalk, G. Jr., Evans, P. & Schulman, L. E., 1992, "Competing on capabilities: the new rules of corporate strategy", Harvard Business Review, 70(2), March-April, 57-70.
Teece, D. J., 1986, "Firm boundaries, technological innovation and strategic management", in The Economics of Strategic Planning, edited by L. G. Thomas, Lexington Books (Lexington, MA), 187-199.
Teece, D. J., Pisano, G. & Shuen, A., 1997, "Dynamic capabilities and strategic management", Strategic Management Journal, 18, August, 509-533.
Winter, S., 2003, "Understanding dynamic capabilities", Strategic Management Journal, 24(10), 991-995.
Chapter 12
Stakeholder Analysis To Shape the Enterprise Keith McCaughin and Joseph DeRosa
The MITRE Corporation www.mitre.org
1.1. Introduction An enterprise is a complex adaptive social system that should maximize stakeholder, not shareholder, value: value to employees, customers, shareholders, and others. We expand upon Russell Ackoff's direction to distribute value among stakeholders, proposing a schema of rules that guide the interactions among autonomous agents in the transactional environment of an enterprise. We define an enterprise as an organization and its transactional environment interacting with and adapting to each other. Enterprise behavior can only be understood in the context of this transactional environment, where everything depends on everything else and interactions cannot be controlled, but can be influenced if they are guided by an understanding of the internal rules of the autonomous agents. The schema has four complementary rules (control, autonomy, return, and value) derived from the work of Russell Ackoff and Michael Porter. The basic rules are applied in combination to eight stakeholder types derived from Richard Hopeman and Raymond McLeod (Leaders, Competitors, Customers, Public, Workers, Collaborators, Suppliers, and Regulators). An enterprise can use this schema and rules in a process of stakeholder analysis to develop and continually refine strategies to encourage behaviors that benefit the enterprise and discourage behaviors that harm the enterprise. These strategies are implemented in a relationship management program in support of enterprise strategic management to consciously and explicitly shape the environment to reduce risks and increase opportunities for success.
1.2. Stakeholder Analysis To Shape the Enterprise
1.2.1. Benefits of Stakeholder Analysis
The stakeholder analysis process creates the basis for the essential understanding necessary to influence the transactional environment of an enterprise:
• Allows an enterprise to shape its environment through interactive planning. "Interactive planning is directed at gaining control of the future. It is based on the belief that an organization's future depends at least as much on what it does between now and then as on what is done to it." [Ackoff, 1999]
• Enables conscious and rational relationship management of key stakeholders, because the enterprise can understand why different stakeholder types behave differently from one another and why they behave the way they do.
• Enables tailoring strategies for key stakeholders to take greater advantage of opportunities and to avoid or mitigate unwanted risks when they become apparent.
• Enables enterprise change management to be aligned with changes in the environment.
• Enables continual improvement and adaptation to change.

1.2.2. Definition of Enterprise
An enterprise is a complex adaptive social system [Plsek et al., 1997] comprised of an organization [Ackoff, 1999] of people, processes, and technology interacting with its transactional environment [Gharajedaghi, 1999] and adapting to rules of a schema [Holland, 1995].
Figure 1. The Organization in the transactional environment
1.2.2.1. Complex Adaptive Social System As a complex adaptive social system, an enterprise includes people with the freedom to act in not always predictable ways. Their actions are interconnected in such a way that they change the context for all. Adaptation occurs as they adjust to change. An organization is a purposeful system that contains at least two purposeful elements that have a common purpose relative to which the system has a functional division of labor; its functionally distinct subsets can respond to each other's behavior through observation or communication; and at least one subset has a system-control function [Ackoff, 1999]. The system-control function is of particular interest in stakeholder analysis. "This subset (or subsystem) compares achieved outcomes with desired outcomes and makes adjustments in the behavior of the systems which are directed toward reducing the observed deficiencies. It also determines what the desired outcomes are. The control function is normally exercised by an executive body which operates on a feed-back principle. 'Control' requires elucidation. An element or a system controls another element or system (or itself) if its behavior is either necessary or sufficient for subsequent behavior of the other element or systems (or itself), and the subsequent behavior is necessary or sufficient for the attainment of one or more of its goals." [Ackoff, 1999]
1.2.2.2. Transactional Environment
The transactional environment is made up of all parties to the transfer of information, goods, services or funds within the enterprise.
Figure 2. Organization and transactional environment with eight stakeholder types
The transactional environment has eight stakeholder types, either organizations or individuals who affect and are influenced by the enterprise. Raymond McLeod [McLeod, 1994] built on the work of John Hopeman [Hopeman, 1969] to develop the following stakeholder types: Customers, Suppliers, Stockholders or Owners, Labor Unions, Government, Financial Community, Local Community, Competitors.
1.2.2.3. Adapting to Rules of a Schema
Figure 3. Schema of four rules for stakeholders
We compared the revised list of stakeholder types to the value chain of Michael Porter [Porter, 1985] and inferred that there are basic rules governing the behavior of these eight types of stakeholders. Porter's ideas of margin and value fit neatly on the horizontal axis of the schema, but he did not explicitly define the vertical integration dimension. So we compared the vertical dimension to Ackoff's definition of an organization and inferred that leaders must be most strongly motivated by the system-control function: a desire to control the enterprise's destiny and create its future as well as generate wealth. We inferred then that workers must desire to remain autonomous, or uncontrolled to the extent possible, while gaining wages for work. We determined that the schema for stakeholders has four complementary rules:
• Control - above center
• Autonomy - below center
• Return - left of center
• Value - right of center
This schema can be used to establish a set of rules for each stakeholder type. Before we develop the rules for each stakeholder type further, we want to reinforce the need to understand the transactional environment in order to shape the enterprise, by discussing two key principles of enterprises as systems. The enterprise is an open system with an organization at the center interacting with its transactional environment, which is made up of all parties to the transfer of information, goods, services or funds within the enterprise.
1.2.3. Key Enterprise Principle: Openness
Figure 4. Circles of enterprise control, influence and appreciation
The first principle we would like to emphasize is openness. As an open system, an enterprise can only be understood in the context of its environment [Gharajedaghi, 1999]. Everything depends on everything else. The people in the organization can be controlled; the people and organizations in the transactional environment cannot. The enterprise can only be understood in the context of an organization and its transactional environment. Leadership must influence and appreciate the environment while guiding the organization by its internal "code of conduct" (system-control function).
1.2.4. Key Enterprise Principle: Purposefulness
The second principle is purposefulness. The organization of the enterprise is a purposeful system. The stakeholders are purposeful systems. An enterprise needs to understand why stakeholders do what they do before it can influence them. Understanding is different from information and knowledge [Gharajedaghi, 1999]. Understanding answers why stakeholders do what they do. Knowledge answers how stakeholders behave. Information answers what can be observed about that behavior. The schema for stakeholder analysis helps understand stakeholder types and their rules of behavior.
1.2.5. Schema for Stakeholder Analysis
Figure 5. Eight stakeholder types arrayed in the environment
The schema has four complementary rules:
• Control - The system-control function identified by Ackoff that is necessary and sufficient for subsequent behavior of the enterprise or its members.
• Autonomy - Freedom to act in ways that are not always predictable.
• Return - Return on investment or margin, the remainder of selling price less production costs.
• Value - Fitness for the intended purpose that is greater than the price of acquisition.
From this schema we infer rules for each of the eight stakeholder types:
• Leaders (McLeod's Stockholders or Owners) seek maximum control to realize their vision for the enterprise, whether fame, fortune, service, etc. If enlightened, they are concerned with balancing all other forces for the good of the enterprise and its stakeholders.
• Competitors (McLeod's Competitors) seek maximum control of and value from the enterprise by appropriating the organization's market, customers, employees and proprietary secrets for their own gain.
• Customers (McLeod's Customers) seek maximum value from purchasing products or services with the greatest usefulness at the lowest prices.
• Public (McLeod's Local Community) seek maximum value and autonomy from the enterprise to share in a better quality of life with little or no cost.
• Workers (McLeod's Labor Unions) seek maximum autonomy from the control of the enterprise to pursue their own goals as well as the goals of the enterprise.
• Collaborators (McLeod's Financial Community) seek maximum return in common interests while remaining autonomous from the enterprise to form alliances with other enterprises.
• Suppliers (McLeod's Suppliers) seek maximum return for their products or services.
• Regulators (McLeod's Government) seek maximum control to gain compliance with laws or policies while seeking maximum return through taxes and fines. This applies to government organizations but also to non-government organizations with policy and standards authority, such as industry and trade associations.
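The eight stakeholder rules amount to a small lookup structure, and encoding them that way makes the schema easy to query. The sketch below is purely illustrative (the paper defines no code); the dictionary layout and function name are our own assumptions, while the mapping of types to rules follows the bullet list above.

```python
# Illustrative sketch of the stakeholder schema as a lookup table.
# The type-to-rule mapping follows the paper's bullet list; the
# data-structure design itself is an assumption, not from the paper.
STAKEHOLDER_RULES = {
    "Leaders":       {"control"},
    "Competitors":   {"control", "value"},
    "Customers":     {"value"},
    "Public":        {"value", "autonomy"},
    "Workers":       {"autonomy"},
    "Collaborators": {"return", "autonomy"},
    "Suppliers":     {"return"},
    "Regulators":    {"control", "return"},
}

def stakeholders_driven_by(rule):
    """List the stakeholder types whose behavior a given schema rule explains."""
    return sorted(t for t, rules in STAKEHOLDER_RULES.items() if rule in rules)

print(stakeholders_driven_by("control"))  # ['Competitors', 'Leaders', 'Regulators']
```

A relationship-management exercise could start from such a table: for each rule, enumerate the stakeholder types it motivates and tailor a win/win strategy accordingly.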
These rules demonstrate that each stakeholder type operates with a different rule. Left unchecked, the stresses the eight stakeholder types exert on an enterprise can pull it apart. If an enterprise serves the interests of only one stakeholder, no matter which, it can be disastrous. For example, if customers' expectations are not managed, they will soon expect more than can be delivered. The key to success in shaping the enterprise is to balance the interests of all the stakeholders through win/win strategies that are complementary.
References
Ackoff, Russell, 1999, The Best of Ackoff, Toward a System of Concepts, John Wiley and Sons
Axelrod, Robert and Cohen, Michael D., 2000, Harnessing Complexity, Basic Books, New York
Gharajedaghi, Jamshid, 1999, Systems Thinking: Managing Chaos and Complexity, Butterworth Heinemann, Boston
Holland, John, 1995, Hidden Order: How Adaptation Builds Complexity, Perseus Books, Reading, MA
Hopeman, Richard J., 1969, Systems Analysis and Operations Management, Charles E. Merrill, Columbus, OH, 79-81
McLeod, Raymond, Jr., 1994, Systems Analysis and Design: An Organizational Approach, The Dryden Press
Plsek, Paul; Curt Lindberg; Brenda Zimmerman, 1997, Some Engineering Principles for Managing Complex Adaptive Systems (Version: November 25, 1997), http://www.plexusinstitute.com/edgeware/archive/edge-place/think/main_filing1.html
Porter, Michael E., 1985, Competitive Advantage, The Free Press, New York, NY
Chapter 13
Systems Thinking for the Enterprise: A Thought Piece
George Rebovich, Jr.
The MITRE Corporation
[email protected]
This paper suggests a way of managing the acquisition of capabilities for large-scale government enterprises that is different from traditional "specify and build" approaches commonly employed by U.S. government agencies in acquiring individual systems or systems of systems (SoS). Enterprise capabilities evolve through the emergence and convergence of information and other technologies and their integration into social, institutional and operational organizations and processes. Enterprise capabilities evolve whether or not the enterprise has processes in place to actively manage them. Thus the critical role of enterprise system engineering (ESE) processes should be to shape, enhance and accelerate the "natural" evolution of enterprise capabilities. ESE processes do not replace or add a layer to traditional system engineering (TSE) processes used in developing individual systems or SoS. ESE processes should complement TSE processes by shaping outcome spaces and stimulating interactions among enterprise participants through market-like mechanisms to reward those that create innovation which moves and accelerates the evolution of the enterprise.
1 Introduction
This paper is about a way of managing the acquisition of capabilities for large-scale enterprises that is different from the traditional "specify and build" approach commonly employed by government agencies in acquiring individual systems or systems of systems. How the United States (U.S.) Federal Reserve Board (the Fed) helps manage perhaps the most complex enterprise in the world today - the U.S. economy - motivates the direction of this paper. How complex is the U.S. economy? As measured by gross domestic product, the 2005 U.S. economy was estimated at $12.4 trillion and involved nearly 10,000 publicly traded companies and millions of consumers. All of these companies and consumers operate in their own self-interests. By U.S. law, the Fed is charged with maintaining a balance between growth and inflation in the U.S. economy. Remarkably, the Fed has basically four tools available to it to maintain this balance. It can: sell or purchase government securities, change the reserve requirements for banks, change the discount rate at which banks borrow money from the Fed, and change the short-term Fed funds rate at which banks borrow money from each other. Separately and in combination, these mechanisms serve to increase or decrease the
supply of money in the economy. Of course, great economic analysis skill is needed in deciding how many securities to sell or buy and when, and whether and how much to change reserve requirements, discount and Fed funds rates, and when. But, generally, the U.S. economy responds in the way the Federal Reserve intended. Think about that. The Fed harnesses the complexity of the myriad interconnected organizations and individuals in the U.S. economy through a handful of interventions to achieve its purpose. Companies and consumers innovate to make and change decisions in response to the Fed's interventions in a way that serves their own self-interests and - at the same time - the interests of the Fed. What a powerful model for engineering government enterprise capabilities.
2 Enterprise and Enterprise Capabilities
By "enterprise" we mean an association of interdependent organizations and people, supported by resources, which interact with each other and their environment to accomplish their own goals and objectives and those of the association [NESI 2004]. Resources include manpower, intellectual property, organizational frameworks and processes, technology, funding, and the like. Interactions include coordination of all types, sharing information, allocating funding and the like. The goals and objectives of the various organizations and individuals in the enterprise will sometimes be in conflict. In the business literature an enterprise frequently refers to an organization, such as a firm or government agency; in the computer industry it refers to any large organization that uses computers (e.g., as in Enterprise Resource Planning systems). The definition of enterprise in this paper is intended to be quite broad and to emphasize the interdependency of individual systems and systems of systems, and the emergence of new behaviors that arise from the interaction of the elements of the enterprise and its environment. The definition includes firms, government agencies, large information-enabled organizations and any network of entities coming together to collectively accomplish explicit or implicit goals. This includes the integration of previously separate units [MITRE 2007]. Examples of enterprises include:
• A chain hotel, in which independent hotel properties operate as agents of the hotel enterprise in providing lodging and related services while the company provides business service infrastructure (e.g., a reservation system), branding and the like.
• A military command and control enterprise of organizations and individuals that develop, field and operate command and control systems, including the acquisition community and the operational organizations [3] and individuals that employ the systems.
Historically, many in the SE community have focused primarily on hierarchical relationships within an enterprise and tended to isolate systems from the environment in which they are contained, often by assuming the environment is fixed or static. The systems engineering literature is replete with phrases like "system capabilities" and "SoS capabilities," so the question arises, "what is an enterprise capability and how does it differ?" An enterprise capability involves contributions from multiple elements, agents or systems of the enterprise. It is generally not knowable in advance of its appearance. Technologies and their associated standards may still be emerging, and it may not be clear yet which will achieve market dominance. There may be no identifiable antecedent capability embedded in the cultural fabric of the enterprise, and thus there is a need to develop and integrate the capability into the social, institutional and operational concepts, systems and processes of the enterprise. The personal computer emerged as a replacement for the combination of a typewriter and a hand-held calculator, both of which were firmly embedded in our social, institutional and operational concepts and work processes. The personal computer is not an enterprise capability by this definition. But the internet is an enterprise capability. Its current form could not possibly have been known in the 1980s. Its technology has been emerging and continues to do so. More fundamentally, there was no identifiable antecedent capability embedded in the cultural fabric of our society before the internet's emergence, nor were
there associated problems and solutions to issues like identity theft, computer viruses, hacking, and other information security concerns.
3 Evolution of Enterprise Capabilities
Enterprise capabilities evolve through emergence, convergence, and efficiency phases, as suggested by the stylized s-curve in figure 1. This is similar in its essentials to Rogers' diffusion of innovation curve [Rogers 2003], which later influenced Moore's technology adoption curve [Moore 2002]. Emergence is characterized by a proliferation of potential solution approaches (technical, institutional, and social). Many of these potential solutions will represent evolutionary dead-ends and be eliminated (convergence) through market-like forces. This is followed by a final period (efficiency) in which the technology is integrated and operationalized to such a degree that it becomes invisible to the humans, institutions and social systems that use it.

Figure 1: Phases of Enterprise Capability Evolution and their Characteristics (emergence: no clear single solution, a proliferation of solution approaches; convergence: multiple adequate solutions with little performance differentiation among them; efficiency: a small number of mature solutions, integrated so completely that they become "invisible")
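The stylized s-curve can be made concrete with a logistic function. The sketch below is a hedged illustration only: the paper specifies no functional form, so the logistic shape, the rate parameter, and the phase-boundary thresholds (20% and 80% of eventual maturity) are all our assumptions.

```python
import math

def maturity(t, midpoint=0.0, rate=1.0):
    """Stylized s-curve: fraction of eventual capability maturity at time t."""
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))

def phase(m, emergence_end=0.2, convergence_end=0.8):
    """Classify the evolution phase from the maturity level (thresholds assumed)."""
    if m < emergence_end:
        return "emergence"
    if m < convergence_end:
        return "convergence"
    return "efficiency"

for t in (-3, 0, 3):
    m = maturity(t)
    print(f"t={t:+d}  maturity={m:.2f}  phase={phase(m)}")
```

In these terms, an intervention changes the curve's parameters: a foreshortened emergence phase would correspond to a larger rate, and extended exploration to a smaller one.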
Enterprise capabilities evolve through emergence, convergence and efficiency phases whether or not an enterprise (or society) has intervention processes in place to actively manage them. Interventions, whether purposeful or accidental, can alter the shape of the evolutionary curve, the speed at which evolution progresses and the level of evolutionary development achieved. This is notionally depicted in figure 2. For illustration purposes assume that curve A depicts how a capability would evolve in an enterprise without explicit, purposeful interventions.
Figure 2: Shaping, Enhancing and Accelerating Enterprise Capability Evolution (A: evolution without purposeful interventions (notional); B: foreshortened exploration, quicker convergence, less optimal efficiency phase; C: extended exploration, longer convergence, more successful efficiency phase)
In curve B, enterprise engineering processes shorten the exploration phase (perhaps by early down-selecting to a small number of acceptable enterprise-wide standards for a particular technology). This provides the benefit of converging more quickly to an efficiency phase, but at the cost of a less optimal efficiency phase (perhaps because superior alternatives were never explored due to the foreshortened emergence phase). Conventional examples of this type of premature convergence include the competition between the VHS and Betamax video recording systems, and between the QWERTY and Dvorak keyboard arrangements. Curve C depicts a situation in which exploration of an enterprise capability is extended, perhaps to consider additional emerging technology alternatives. This has the effect of deferring exploitation of a preferred approach beyond either of the other two curves, but in the end it results in the most successful efficiency phase. Figure 2 is not meant to suggest that foreshortened exploration necessarily leads to a less optimal efficiency phase or that extended exploration guarantees a more successful one. There are no hard and fast rules. Too much exploration can leave an organization permanently disorganized, so that new ideas have their underpinnings swept away in subsequent change before it is known whether they will work. Aggressive exploitation risks losing variety too quickly, which can happen when fast reproduction of an initial success cuts off future exploration and possible improvement. These are not just two ways that a good concept can go wrong. The two possibilities form a fundamental trade-space. Investments in options and possibilities associated with exploration usually come at the expense of obtaining returns on what has already been learned [Rebovich 2005]. The critical role of enterprise engineering processes is to shape, enhance and accelerate the "natural" evolution of enterprise capabilities.
In the emergence phase, interventions should favor and stimulate variety and exploration of technologies, standards, strategies and solution approaches and their integration and operationalization in and across enterprise organizations, systems and operations. In shaping convergence, the goal of interventions is to narrow the solution approaches and start to balance exploitation of more robust solutions with exploration of promising, emerging alternatives. In the efficiency phase, interventions favor exploitation of that which is known to work through proliferation of a common solution approach across the enterprise. This is notionally depicted in figure 3.
Figure 3: Role of Purposeful Interventions in Shaping Enterprise Capability Evolution (emergence: interventions favor and stimulate variety and exploration of technologies, standards and implementation strategies; efficiency: interventions reward and incentivize common solutions, e.g., via collaboration)
4 Traditional Approach to Developing System Capabilities
Traditional systems engineering is a sequential, iterative development process used to produce systems and sub-systems, many of which are of unprecedented technical complication and sophistication. The INCOSE (ANSI/EIA 632) Systems Engineering process is a widely recognized representation of traditional systems engineering [INCOSE 2004].
An implicit assumption of classical systems engineering is that all relevant factors are largely under the control of, or can be well understood and accounted for by, the engineering organization, the system engineer, or the program manager, and this is normally reflected in the classical systems engineering mindset, culture, and processes. Within most government agencies, systems are developed by an acquisition community through funded programs using traditional system engineering methods and processes. The programs create a plan to develop a system and execute to the plan. The classical process works well when the system requirements are relatively well known, technologies are mature, the capabilities to be developed are those of a system per se, and there is a single individual with management and funding authority over the program. It is estimated that the United States Department of Defense manages hundreds of systems of record being developed or modernized through funded programs of record using traditional systems engineering methods and processes. There are numerous variations on this classical systems engineering approach, including build-a-little, test-a-little incremental or spiral developments, to mitigate uncertainties in long-range requirements, technology maturity, or funding of the system. The prevailing business model in most government development or modernization acquisition programs is to contract for the promise of the future delivery of a system that meets specified performance requirements, contract cost, and delivery schedule. These program parameters are set at contract award, and they form the basis for success or failure of the program and the individuals working on the program.
This model of success shapes the organization's engineering processes, management approaches and the motivations of program staff to place emphasis on tracking the progress of program parameters, uncovering deviations, and taking corrective action to get back on course to deliver according to the contract. This is normally accomplished through milestone reviews and other events that illuminate progress in the framework of the system performance requirements, cost, and delivery schedule.
5 Traditional Approach to Developing SoS Capabilities
The traditional approach to developing multi-system capabilities is through an executive oversight agency that aligns and synchronizes the development of individual systems to develop a capability that is greater than the sum of the individual systems. This approach works well for SoSs comprised of individual systems that are being developed together as a persistent, coherent, unified whole, particularly when: the identity and reason-for-being of the individual elements of these SoSs are primarily tied to the overarching mission of the SoS, the operational and technical requirements are relatively well known, the implementation technologies are mature, and there is a single program executive with comprehensive management and funding authority over the constituent systems. Examples of these types of SoS include the Atlas Intercontinental Ballistic Missile system, an air defense system, and the United States National Aeronautics and Space Administration's original Apollo Moon landing capability. The traditional approach to SoS development or modernization works for this category of SoS because it satisfies all the essential conditions and attributes of engineering at the system level, only it is larger. The community culture, organizational norms, rules of success, engineering and business processes, and best practices in this category of SoS are either essentially the same as at the system level or they scale appropriately.
6 Enterprise Engineering The traditional approaches to systems engineering described above break down for engineering enterprise capabilities. Enterprise capabilities evolve through largely unpredictable technical and cultural dimensions. Enterprise capabilities are implemented by the collective effort of self-serving organizations whose primary interests, motivations, and rewards come from successfully fielding system capabilities. The identities of the individual elements of the enterprise do not strongly derive from the resulting enterprise
capability. Enterprise engineering is an emerging mode of systems engineering that is concerned with managing and shaping forces of uncertainty to achieve results through interventions instead of controls. It is directed towards enabling and achieving enterprise-level and cross-enterprise capability outcomes by building effective, efficient networks of individual systems to meet the objectives of the enterprise. Enterprise engineering manages the inherent uncertainty and interdependence in an enterprise by coordinating, harmonizing and integrating the engineering efforts of organizations and individuals through processes informed or inspired by evolution (both natural and technological) [Axelrod 2000] and economic markets [Schelling 1978]. Enterprise engineering is a multidisciplinary approach that encompasses, balances and synthesizes the technical and non-technical (political, economic, organizational, operational, social and cultural) aspects of an enterprise capability. Enterprise engineering is based on the premise that an enterprise is a collection of agents (individual and organizational) that want to succeed and will adapt to do so [Allison 1999]. The implication of this statement is that enterprise engineering processes are focused on shaping the outcome space and incentives within which individuals and organizations develop systems, so that an agent innovating and operating to succeed in its local mission will - automatically and at the same time - innovate and operate in the interest of the enterprise. Enterprise engineering processes are focused more on shaping the environment, incentives and rules of success in which classical engineering takes place [Rebovich 2006].
7 A Framework for Developing Systems and Evolving Enterprise Capabilities
Figure 4 depicts an approach in which enterprise engineering processes shape the evolution of enterprise capabilities through emergence, convergence, and efficiency phases via evolutionary or market-like mechanisms, at the same time that individual system capabilities are being developed via the traditional system engineering approach of building to a plan. The basic notion is to stimulate innovation by, and interactions among, government programs of record to move the enterprise towards an enterprise capability at the same time the programs are developing their systems of record. Specific interventions depend on the phase or state the enterprise capability is in.

Figure 4: A Framework for Evolving Enterprise Capabilities (a prescriptive, compliance-based approach to build system capabilities alongside a shape-and-influence approach to evolve enterprise capabilities)
This approach is similar in its essentials to the Federal Reserve intervening in the United States economy, in which the collective response of organizations and individuals operating in their own self-interests to those interventions serves their needs and those of the Federal Reserve at the same time.
Some real and potential examples of stimulating innovation within a government system acquisition setting are as follows. In the Defense Advanced Research Projects Agency Grand Challenge, for a $2M prize, the U.S. government got multiple millions of dollars of innovation in critically important autonomous robotic vehicle technology by sponsoring a "race." This accelerated the application of that technology to military applications up the evolutionary curve. Another potential way of stimulating innovation is that the government could consider changing acquisition program review criteria. One example might be that for a program to pass a milestone review, it must show adequate technical progress at the milestone plus demonstrate that it collaborated with another program in a way that produced variety in an enterprise capability during the emergence phase. Another example might be to streamline program reviews for programs that can demonstrate the creation of variety in an enterprise capability during the emergence phase. Yet another market-inspired approach would be for the government to consider establishing a metadata market incentive fund to stimulate interactions among government system programs of record surrounding the value they create for, or take from, the enterprise. Picture a value exchange matrix for each system acquisition program in which rows are "value taken from the enterprise" by a program of record and columns are "value provided to the enterprise" by a program. The idea would be to reward programs with richly populated matrices and give little or no reward to programs with sparsely populated matrices.
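The value exchange matrix idea can be sketched in a few lines. Everything in the sketch is hypothetical: the program names, value items, and the density score are illustrative assumptions, not the paper's design.

```python
# Hypothetical value exchange records: for each program of record,
# the value it takes from and provides to the enterprise.
programs = {
    "ProgramA": {"taken": ["common metadata", "shared transport"],
                 "provided": ["track data", "alert feed"]},
    "ProgramB": {"taken": ["common metadata"], "provided": []},
}

def richness(entry):
    """Crude density score: credit requires BOTH populated rows and columns."""
    return len(entry["taken"]) * len(entry["provided"])

rewards = {name: richness(entry) for name, entry in programs.items()}
print(rewards)  # a program that only consumes value scores 0
```

The multiplicative score is only one way to operationalize "richly populated matrices": a program with an empty row or column earns nothing, which rewards two-way exchange with the enterprise.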
8 Monitoring the Evolution of an Enterprise Capability
An enterprise capability is a characteristic of the enterprise in its operation. The implication is that enterprise performance should be strongly tied to the behavior of operational units employing enterprise systems and capabilities in actual operations. Measures intended to monitor the evolution of enterprise capabilities should focus on capturing changes in the way operational units interact. The evolution and utilization of enterprise capabilities have strong elements of social system structure and dynamics. The implication is that the definition of enterprise measures should include sociologists as well as operational and technical experts. Formal verification of the piece-parts of an enterprise capability will still need to be done as part of system sell-offs, but it should not be viewed as the primary indicator of an enterprise capability. Even as simple a system as a wristwatch is primarily evaluated holistically (e.g., does it tell time correctly?) and not as the pair-wise interactions of its myriad mechanical and electrical parts. Table 1 suggests some examples of measures for monitoring the evolution of military interoperability at the enterprise level.
Table 1: Monitoring Enterprise Evolution
Emergence: Increase in the total number of interface control documents among programs of record. Increased volume of voice, email, chat and instant messaging among operational units. Communication emerging among previously non-interacting units.
Convergence: Decrease in the number of interface control documents. Increased use of common standards among programs of record. Less episodic, more continuous interactions among operational units.
Efficiency: Predominant use of a single standard among operational units. Predominantly continuous interactions among operational units.
9 Summary
An enterprise is a complex social system. Enterprise capabilities are not knowable in advance of their appearance. They evolve over time through evolutionary processes, and there are important technology and cultural dimensions to this evolution. The critical role of enterprise engineering is to guide and manage processes that are largely evolutionary. Enterprise engineering is largely about intervening in "technology and social system" evolutionary processes to shape and accelerate outcomes. Enterprise engineering processes complement more traditional system engineering processes by incentivizing programs developing system capabilities to behave in ways that serve both program needs and enterprise needs. Assessments of enterprise capability performance and evolution should be strongly tied to the behavior of the enterprise in live operations.
Bibliography [1] 17 December 2004, "Enterprise in the Net Centric Implementation Framework," v 1.0.0, NES\. [2] Evolving Systems Engineering at MITRE. The MITRE Corporation, Bedford, MA. August 2007. [3] This example is intended to include government organizations, non-profits and commercial companies. [4] Rogers, Everett M. Diffusion of Innovation, 5th Edition, New York, NY, Free Press. 2003. [5] Moore, Geoffrey A., Crossing the Chasm, Harper Collins, New York, NY. 2002. [6] Rebovich, G., Jr. November 2005. Enterprise Systems Engineering Theory and Practice, volume 2. Systems Thinkingfor the Enterprise: New and Emerging Perspectives. MP05B043, vol 2, The MITRE Corporation, Bedford, MA. [7] "Systems Engineering Handbook," INCaSE-TP-2003-016-02, version 2a, INCaSE, June 2004. [8] Axelrod, R. and M. D. Cohen, 2000. Harnessing Complexity: Implications ofa Scientific Frontier, Basic Books.
[9] Schelling, T. C., Micromotives and Macrobehavior, W. W. Norton, 1978.
[10] Allison, G. and P. Zelikow, 1999. Essence of Decision: Explaining the Cuban Missile Crisis, 2nd edition, Addison-Wesley.
[11] Rebovich, G., Jr., "Systems Thinking for the Enterprise: New and Emerging Perspectives," Proceedings of 2006 IEEE International Conference on Systems of Systems, April 2006.
Chapter 14
Representing the Complexity of Engineering Systems: A Multidisciplinary Perceptual Approach
Matt Motyka, Jonathan R.A. Maier and Georges M. Fadel
Clemson Research in Engineering Design and Optimization Laboratory
Department of Mechanical Engineering, Clemson University
fgeorge@clemson.edu
1. Introduction
The natural evolution of design leads to the creation of devices that are increasingly complex. From the invention of the wheel to the use of the wheel in the landing gear of a spacecraft, new technology often builds upon or further develops existing products. As devices become more complex, so does the process of designing them. In order to meet consumer demands, design requires the integration of technology from multiple disciplines. This can lead to the formation of intricate interdependencies between components and systems within the device. These interdependencies may include material, geometric, dynamic, pneumatic, vibration, acoustical, thermal, electrical, and chemical considerations. Because of their complex nature, the interdependencies within a device are not always apparent, and may easily be overlooked by inexperienced designers. Overlooking critical design aspects often leads to the need for re-design, which can be costly and time consuming. The goal, then, is the effective management of this complexity in such a way that designers and engineers are continuously aware of the crucial interdependencies that relate to their design objectives. Information must be up
to date, readily available, and its relevance made apparent. To fulfill these requirements, data representation techniques must be efficient and comprehensive. The current situation indicates that data representation is not keeping pace with the demands of design, which creates a hindrance to the design process.
Juergen Mihm provides one of the most comprehensive studies of complexity in design to date (Mihm, 2003). His efforts focused on mathematically modeling the structure of complexity in design, and then simulating generic design scenarios to gain insight into the fundamental aspects of complexity. Mihm views the modern design process as necessarily iterative: continuous refinement driven by external feedback. His model emphasizes two primary sources of complexity, the size of the design project and design team cooperation, and how these factors affect convergence to a final design solution. As a result of his study, Mihm proposes six suggestions for effectively managing complexity: limiting system size, modularization, cutting interdependencies, immediately broadcasting design updates, releasing preliminary information, and globally optimizing overarching chunks.
One focus of further research, then, is to improve communication to broadcast design updates and release information. How do we make data more communicable and presentable? Design teams must have access to relevant and current data. Furthermore, they must be able to efficiently understand the data generated by others and successfully interpret the implications it may have on their own design task. The key, then, is efficiency. One major aspect of this efficiency is data representation. This paper presents our preliminary efforts to investigate possible data representation techniques used to manage complexity in design. In Section 2, we review existing techniques for representing device behavior.
In Section 3 we review perceptual considerations for representation schemes in general, and present our new representation scheme for device interactions. The scheme is illustrated in the case study in Section 4. Summary remarks are offered in Section 5.
2. Current State of Complexity Representation
2.1. Functional Modeling
Traditionally, each engineering discipline has its own unique scheme for representing the data that is most relevant to its particular field. Because the design of a device may require input from several of these disciplines, several graphically incompatible models are often generated. Because of this disciplinary approach to design decomposition, it may be difficult for the design team to see how the information from the electrical circuit diagram affects the thermal data on the temperature-entropy diagram, because the two representation schemes are so visually disparate. This synergy of information, while not clearly represented, may have a great impact upon the design of the device.
Functional modeling is a popular method for representing interactions within mechanical devices. The method advocated by Pahl and Beitz (1996) offers several advantages for the representation of a technical system. The scheme is based upon a functional decomposition of the device, and illustrates the flow of energy, material, and information (EMI). A major advantage of the Pahl & Beitz representation scheme is that the EMI concept allows information from several disciplines to be represented in the same scheme, giving a truly functional representation of the device. In other words,
it allows for transition between a disciplinary design approach and a functional design approach. For example, the energy input may be from electrical current or it may come from a thermal source. Otto and Wood (2001) discuss further developments of functional modeling. By incorporating the modularization of functions and heuristic branching, Otto and Wood enable the scheme not only to provide functional decomposition of the system, but also to represent some elements of aspect decomposition. Aspect decomposition focuses on dividing the design project into well-established design categories. For instance, automobile design is often divided into a power train team, a ride quality team, an interior team, etc. By modularizing, or grouping, several related functions in the scheme together, their relationship to physical components begins to emerge. Functional models can unify information from several disciplines, but in doing so these models promote generalization over detail. Thus they provide a comprehensive overview of the conceptual mapping in the design, but include little information for the technical aspects of the design. Another element of design not incorporated in these approaches is geometric information. Functional models thus provide the designer with tools for representing multi-disciplinary, generalized, functional information for a device; however, the designer is still in need of a means to represent multi-disciplinary data on a detailed and relational level.
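As a toy illustration of the EMI idea (our own sketch; the function and flow names are invented, not taken from Pahl & Beitz), a functional model can be held as a small graph whose edges are typed energy, material, or information flows:

```python
# Hypothetical sketch of a Pahl & Beitz style functional model: functions are
# nodes, and each directed edge carries an energy, material, or information
# (EMI) flow. All names below are illustrative only.
EMI = {"energy", "material", "information"}

flows = [
    ("accept input", "convert energy", "information"),    # e.g., lever position
    ("convert energy", "apply braking force", "energy"),  # e.g., hydraulic pressure
]

def flows_into(func):
    """Return the (source, flow-kind) pairs entering a given function."""
    return [(src, kind) for src, dst, kind in flows if dst == func]

# Every flow must be one of the three EMI types.
assert all(kind in EMI for _src, _dst, kind in flows)
```

Keeping the flow kind on the edge is what lets one scheme carry electrical, thermal, and informational content side by side, as the text describes.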
2.2. Bond Graphs
Bond graphs also allow for the synthesis of data from different disciplines, but do so with a more technical approach. Instead of energy, material, and information, the currency in a bond graph consists of either effort or flow. Effort and flow are defined in terms of almost any discipline, such as the flow of electrical current or the flow of entropy in thermodynamics. Flow and effort are brought together at junctions, where either the flow sums to zero (p-junction) or the effort sums to zero (s-junction). These flows interact with physical elements in the model, designated as having one or two ports (Karnopp & Rosenberg, 1968). More complex elements (such as a gyrator) have two ports, and simpler elements (such as a spring) have only one port, with each port offering a connection point for an effort and a flow. This allows the bond graph to offer a unified illustration of the system under study. The reduction of all physical systems to terms of effort, flow, junctions, and elements has great capacity for synergizing data. The bond graph also allows the illustration of causality within the system, which provides an additional representational dimension to the scheme (Lacroix et al., 2001). However, once constructed, the bond graph is fairly rigid and not easily adaptable if the system is dynamic. Because of its universal structure, the integration of mechanical elements is not always straightforward (Lacroix et al., 2001). While comprehensive in a mathematical sense, the bond graph does not incorporate ways to convey the geometric information, qualitative details, and phenomena-related interactions (extraneous to the intended functions) that are needed in the design process.
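To make the effort/flow bookkeeping concrete, here is a rough numerical sketch (ours, not from Karnopp & Rosenberg) of a mass-spring-damper whose one-port elements share a common flow (velocity) at a junction where the efforts (forces) sum to zero; the damper, as the dissipative element, should drain the stored energy:

```python
# Junction effort balance for a mass-spring-damper (illustrative values):
#     F_external - m*dv/dt - b*v - k*x = 0, with F_external = 0 here.
m, b, k = 1.0, 0.5, 4.0    # I, R, and C one-port parameters (arbitrary)
x, v = 1.0, 0.0            # initial displacement and velocity
dt = 1e-4

def stored_energy(x, v):
    """Energy held in the inertia (I) and compliance (C) elements."""
    return 0.5 * m * v**2 + 0.5 * k * x**2

e_start = stored_energy(x, v)
for _ in range(100_000):            # integrate 10 s with explicit Euler
    a = (-b * v - k * x) / m        # acceleration from the effort sum
    x += v * dt
    v += a * dt
e_end = stored_energy(x, v)
# The resistive (R) element dissipates energy, so e_end << e_start.
```

The same effort-balance bookkeeping applies whether the ports are mechanical, electrical, or thermal, which is the unifying appeal of the bond-graph formalism noted above.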
3. Perceptual Considerations
In order to ensure simplicity and clarity in data representation, one must first understand what qualifies as simple and clear to the observer. Green's study of perception with regard to data representation illuminates several issues that deserve consideration (Green, 1991). With the use of the three geometric dimensions, size, shape, orientation, color, and texture, it appears that the potential exists to represent at least eight different dimensions for each data point. Then, if one considers the use of animation, data layering, and auditory mapping, the dimension representation potential grows considerably. How much information can the human mind sort, and when does a representation scheme become confounded by its own complexity? Much modern research has been done on this topic, and there appear to be three basic conclusions: 1) different representation techniques lend themselves to representing specific data types, 2) there is a limit to the number (and types) of data dimensions that humans can perceive efficiently, and 3) humans, with time and training, possess the ability to interpret data beyond the efficiency limit (Green, 1991).
3.1. Perceptual Variables
With regard to the first conclusion, representational dimensions are often separated into two primary categories, planar and retinal variables. Planar variables represent the two spatial dimensions, and retinal variables include all other variables, such as color, size, shape, depth, etc. These variables are then further categorized as associative, selective, ordered, or quantitative. Associative variables can be ignored when looking at other variables. For example, it is easy to ignore shape when you are looking at color, so shape is an associative variable. Selective variables allow you to single out that variable while ignoring others. It is easy to notice the color and ignore the shape, so color is selective. To put it succinctly, associative variables are easily ignored and selective variables are easily noticed. Ordered variables allow for representation of increasing or decreasing values. Brightness, for example, could be used as an ordered variable, whereas color could not. The fourth category, quantitative, refers to variables that can convey proportions and ratios. If one line is twice as long as the other (size), the viewer easily interprets the ratio of the values (Green, 1991). The main point of contention presented by Green is that shape can be selective, or easily distinguished from other variables. Traditional theory points to the fact that circles, triangles, squares, etc. tend to blend together when combined in a visual field. Green proposes that these shapes are simply too similar, and that the use of more disparate shapes such as "X" and "O" creates a distinct selective effect (Green, 1991).
3.2. Preattentive Limits
Now that the variables and their representative properties have been more or less defined, it is important to look at how they may be combined. The general agreement is that efficient visual data representation is limited to the two planar variables and one retinal variable. This statement has been largely supported by psycho-physiological research, which concludes that the three-variable structure is necessary for preattentive, or effortless, perception. Any further addition to this structure requires the user to actively analyze the visual image to interpret the data. Green has suggested that one of
his additional retinal variables (velocity, direction, frequency, phase, and disparity) can be added to represent a fourth dimension without disturbing preattentive perception. The two final important points with regard to perception are 1) that training and experience can increase human capacity for data interpretation, and 2) that interpretation does not necessarily become more difficult as the number of variables increases. It has been shown that, with practice, differentiation between shapes becomes easier (because shape is a selective variable). One study also indicated that interpretation of five data variables proved to be easier than interpretation of four data variables (Green, 1991).
3.3. New Representation Scheme
The task, then, is to take a complex design scenario and develop a useful and communicative scheme for representing the relevant data through the application of perceptual principles. In this representation scheme, relationships between components are depicted as the interaction of functions/phenomena. Each type of function/phenomenon was assigned an identifying color so that the types are distinguishable from one another. Table 1 summarizes all of the relationships mentioned to this point and their assigned colors.
Table 1. Function/Phenomenon Types and Their Representative Colors
acoustical    YELLOW
material      GRAY
The perceptual elements utilized in the representation scheme consist of the two dimensions of the planar variables and the retinal variables of size, color, and shape. The presence of three retinal variables exceeds the limitations for preattentive recognition in the scheme. Although the limitations are exceeded, it is by a small margin, which should permit the user to understand the scheme with little prior instruction or study. The function/phenomenon circles require a two-dimensional location with regard to being inside or outside the component circle and to their polarization. The function/phenomenon circles implement the properties of color and size. The arrows implement the properties of size and shape (line style as shape). Color was properly selected because it allows function to be either selective (singled out) or associative (ignored) while observing the diagram. Size is justifiably used to represent ordered data. Shape (in the form of line style) is used to distinguish between external and internal interactions, and conforms to Green's requirements for being
associative and selective. The representation scheme is illustrated in the following case study, which represents the interactions between components of a mechanical device.
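The mapping from diagram elements to perceptual variables can be sketched as data (a hypothetical encoding of ours; the component and function names are invented, and only the two colors that survive in Table 1 are used):

```python
from dataclasses import dataclass

FUNCTION_COLORS = {"acoustical": "yellow", "material": "gray"}  # from Table 1

@dataclass
class Function:
    name: str
    ftype: str        # retinal variable: color, keyed by function/phenomenon type
    importance: int   # retinal variable: size (1-3 relative significance)

@dataclass
class Interaction:
    source: Function
    target: Function
    strength: int     # retinal variable: size (arrow line weight)
    internal: bool    # retinal variable: shape (solid vs. dashed line style)

def render_style(i: Interaction) -> dict:
    """Resolve an interaction to the scheme's visual attributes."""
    return {"color": FUNCTION_COLORS[i.source.ftype],
            "width": i.strength,
            "style": "solid" if i.internal else "dashed"}

squeal = Function("squeal", "acoustical", importance=2)
wear = Function("pad wear", "material", importance=3)
link = Interaction(squeal, wear, strength=2, internal=True)
```

Separating the data (functions and interactions) from the style resolution mirrors the scheme's design: each perceptual variable carries exactly one attribute of the relationship.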
4. Motorcycle Front Disc Brake Case Study
With many aspects of the proposed representation scheme defined, it can now be applied to the front disc brake system. First, Figure 1 shows each component of the front disc brake and its main function. The main functions are polarized to show key influencers. Also shown in Figure 1 are the components that make physical contact. Secondary functions and related phenomena have also been added to the diagram, with different size circles to indicate the relative levels of significance. Figure 1 shows these additions. With the details of the components specified, interrelationships can be shown. In Figure 2, arrows link the functions and phenomena in order to identify key relationships. The line weight indicates the strength of the relationship and the arrow direction parallels the direction of influence. The dashed lines depict interactions with external components and the solid lines depict internal relationships.
Figure 1. Components with secondary functions and related phenomena
Figure 2. Components with interrelationships
The representation scheme is in complete form in Figure 2. Notice that it does not display any specific geometric information. Because of the existing capabilities of CAD software, and in order to reduce visual complexity for the viewer, the representation scheme will simply link the component circles to the appropriate CAD files for further geometric information. Also, with each element added to the scheme, more effort is required to discern information quickly. For this reason, the scheme features various levels, with each level illustrating one type of relationship. For example, clicking on any of the thermal elements in the diagram would link to a separate page that gives further thermal details. The possibility exists to further simplify the presentation of information by offering a level that shows only one function/phenomenon. While limited in depth of information, this basic depiction gives a very clear, uncluttered overview of the thermal impact on different components.
5. Summary Remarks
The proposed representation scheme incorporates several elements of complexity in the front disc brake system. Each component is represented, along with its primary and secondary functions and related phenomena. These functions/phenomena are qualitatively given one of three levels of relative importance to the system. Eight different types of relationships are clearly identified and illustrated, and categorized as external or internal, also with an indication of the relative interaction strength and the direction of influence. Components that physically touch are also identified. With all of the potential representative capacity of the scheme, it is also important to validate its
usability. A key benefit is the incorporation of unintended interactions, through the representation of related phenomena. Functional flows are illustrated and given causality, along with a qualitative measure of importance. This is all accomplished while respecting the perceptual considerations that allow the scheme to retain clarity and thus be of greater use to the designer.
References
Pahl, G. & W. Beitz, 1996, Engineering Design: A Systematic Approach, 2nd edition, Springer-Verlag (New York).
Bimber, O., 2004, HoloGraphics: Combining Holograms with Interactive Computer Graphics, Bauhaus University (Weimar, Germany).
Cleveland, W., 1993, Visualizing Data, AT&T Bell Laboratories (Murray Hill, NJ).
Ebert, D., Favre, J. M., & Peikert, R., 2001, Data Visualization 2001, Springer-Verlag Telos.
Edmundson, D., Johns, B., & Scharff, R., 1998, Motorcycles: Fundamentals, Service, and Repair, 3rd edition, Goodheart-Wilcox.
Maier, J. R. A. & Fadel, G. M., 2006, "Understanding the Complexity of Design", in Complex Engineering Systems, edited by Y. Bar-Yam, A. Minai & D. Braha, Springer-Verlag.
Green, M., 1991, "The visual psychophysics of data visualization", Proceedings of Data Visualization Conference 1991. A draft article entitled "Toward a Perceptual Science of Multidimensional Data Visualization: Bertin and Beyond" was posted on Green's website, but is no longer available.
Hagen, H., Mueller, H., & Nielson, G. M., 1993, Focus on Scientific Visualization, Springer-Verlag (Berlin, Germany).
Hege, H. C., & Polthier, K., 1997, Visualization and Mathematics: Experiments, Simulations, and Environments, Springer (Berlin, Germany).
Karnopp, D., & Rosenberg, R., 1968, Analysis and Simulation of Multiport Systems: The Bond Graph Approach to Physical System Dynamics, M.I.T. Press (Cambridge, MA).
Lacroix, Z., Shah, J., Summers, J., Vargas-Hernandez, N., & Zhao, Z., 2001, "Comparative Study of Representation Structures for Modeling Function and Behavior of Mechanical Devices," DETC2001/CIE-21243, Pittsburgh, PA.
Mihm, J., 2003, Complexity in New Product Development: Mastering the Dynamics of Engineering Projects, Deutscher Universitaets-Verlag (Wiesbaden, Germany).
Otto, K., & Wood, K., 2001, Product Design: Techniques in Reverse Engineering and New Product Development, Prentice Hall (Upper Saddle River, NJ).
Chapter 15
Policy Scale-free Organizational Network: Artifact or Phenomenon?
Dighton Fiddner
Political Science
Indiana University of Pennsylvania
fiddner@iup.edu
1.1 Introduction
The research examined the Federal government's information infrastructure system's (IIS's) national security policy over the decade of the 1990s to provide continuous empirical evidence of the Federal government's understanding of the IIS's vulnerabilities, the risks to national security, and the dearth of policy actions to address those vulnerabilities. The research also discovered that, when depicted as a network, the policy's organizational structure resembled a scale-free network. But is it truly a scale-free network or only an artifact?
1.2 IIS Security Policy
Vulnerabilities of the IIS as a risk to U.S. national security have been included in every annual national security strategy since 1992. President Clinton identified vulnerabilities of the IIS that posed significant risks to the national security of the nation in the National Security Strategy in 1995. The 2000 national security strategy, A National Security Strategy For a New Century, included protection of U.S. critical infrastructures, including the information infrastructure, as in our vital interest and, therefore, important to the survival, safety, and vitality of our nation. The strategy goes on to state "we will do what we must to defend these interests, including, when necessary and appropriate, using our military might unilaterally and decisively" [U.S. White House, 1999].
Given the salience of IIS security to U.S. economic well-being and national security, one would expect to find a well-reasoned, comprehensive security policy to protect the system. The National Research Council [National Academy of Sciences, 1989, 1990], the National Communications System, and the President's National Security Telecommunications Advisory Committee all alerted the nation to the vulnerabilities of the system as early as 1989 [U.S. Department of Defense, 1997]. Even the National Security Council (NSC) acknowledged in the 1990 National Security Directive (NSD) 42 that "telecommunications and information processing systems ... shall be secured by such means as are necessary to prevent compromise, denial, or exploitation." The NSD even specified what the U.S. response to such vulnerabilities should be: "A comprehensive coordinated approach must be taken to protect the government's national security telecommunications and information systems (national security systems) against current and projected threats. This approach must include mechanisms for formulating policy, overseeing systems security resources programs, and coordinating and executing technical activities" [National Security Directive (NSD) 42, National Policy for Security of National Security Telecommunications and Information Systems]. Unfortunately, at the end of 2000 the United States still had no comprehensive coordinated national IIS security policy.
1.3 IIS Security Policy Organizational Structure
Figure 1 depicts the different federal departments, agencies, and advisory panels that have a statutorily- or administratively-mandated specified responsibility for one or more of the five information assurance objectives¹ as a network instead of the traditional horizontal and vertical organizational chart. The research and development organizations are included because their research and development agendas and priorities are essential to any IIS security policy, since many of the system's inherent vulnerabilities are technical in nature.² What is striking at first glance about the organizational diagram is the sheer number of organizations involved. A total of 31 organizations (13 executive departments and independent agencies and 18 additional organizations) have some statutorily or administratively mandated responsibility for all or part of IIS security policy development, obviously too many players to execute successfully any action. With such an organizational structure, it would be difficult to develop a coherent coordinated policy even if other complexities could be resolved or somehow excluded from the policymaking process. By the early 1990s, the entire IIS security organizational
¹ The definition of Information Assurance (IA) and its objectives can be found in National Information Systems Security (INFOSEC) Glossary, NSTISSI No. 4009.
² Both the national security strategy for 2000, A National Security Strategy for A New Century, and Defending America's Cyberspace: The National Plan for Information Systems Protection recognize the importance of research and development funding for enhanced security of the information infrastructure system.
Figure 1. IIS Security Policy Organizational Network.³
structure was already too confusing for any remedy except complete overhaul. The Joint Security Commission reported in 1994 that there is a "profusion of policy formulation authorities, all of whom are addressing essentially the same issues" at the national level [United States Joint Security Commission, 1999]. "This 'everyone is in charge' arrangement means that no one has responsibility for meeting the vital needs for INFOSEC (information security) for national security" [United States Joint Security Commission, 1999]. Without a designated policy decision maker, some policy makers and organizations
• were content to do little, in a classic example of the "free-rider" phenomenon,
• stagnated the process because of bureaucratic competition between contending policy organizations, and
• failed to establish burden-sharing responsibilities.
In 1999, the reconvened Joint Security Commission found that IIS national security policy was still "in need of a clear enunciation of principles, goals, and definitions of authorities and responsibilities." As a result, "information system security policy has remained fragmented at the managerial level, with responsibilities poorly defined and spread over multiple bodies." The Commission also found that there was no clearly defined and broadly accepted institutionalized mechanism to issue national-level policy,
³ See Appendix I: Glossary for explanation of the acronyms used in the diagram.
even when endorsed and approved by the National Security Advisor [United States Joint Security Commission, 1999]. The IIS security policy structure's overwhelming confusion was exacerbated even more by the PDD 63 (Clinton Administration's Policy on Critical Infrastructure Protection) mandates [Zuckerman, 2000].
1.4 IIS Security Policy Organizational Structure
Much of the disappointing results of IIS security policymaking can be explained by standard bureaucratic rationale and the counterintuitive reasoning of organizations eschewing tasks deemed not within the core "essence" of the organization.⁵ Might there also be some compelling underlying explanation from the policymaking structure that can better explain the policy organizations' failure to develop a comprehensive national security policy for the information infrastructure system? Interestingly, as can be seen from the IIS Security Policy Network diagram (Figure 1), the policymaking network resembles a scale-free network, the very type of network the IIS's structure exhibits. If it is indeed a scale-free network, could the properties of such a network determine the policymaking behavior? To truly be a scale-free network, the structure depicted will have to exhibit the properties of such a network:
• End users
• Highly-connected nodes
• Preferential linking: as new nodes enter the network, they are more likely to link to highly-linked nodes than to low-linked nodes.
• Logarithmic distribution: the distribution of links in the network is determined by logarithmic power laws.
• Phase transition: all of the nodes undergo a phase transition and start acting as a single entity when a critical threshold (the tipping point) is crossed.
One can easily identify end users (Info Assurance TF, ISPACs, PCCP, etc.) and highly connected nodes (National Security Advisor, CIAO, DoC, NSTISS, etc.) characteristic of a scale-free network. If the physical laws of a scale-free network are present, one would expect to see a degradation of function (policymaking) as a result of the vital node not performing its role optimally. Such an effect appears to have happened with the dearth of comprehensive information infrastructure national security policy from 1990 to 2000.
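The preferential-linking property can be illustrated with a toy growth model in the spirit of Barabasi's preferential attachment (our own sketch; the parameters are arbitrary and no claim is made about the IIS network itself):

```python
import random

def grow_network(n, m=1, seed=42):
    """Grow a network where each new node attaches m links to existing nodes
    chosen with probability proportional to their current degree."""
    random.seed(seed)
    degree = {0: 1, 1: 1}      # start from a single edge between nodes 0 and 1
    urn = [0, 1]               # a node appears once per unit of degree
    for new in range(2, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(random.choice(urn))   # degree-proportional selection
        degree[new] = 0
        for t in chosen:
            degree[new] += 1
            degree[t] += 1
            urn.extend([new, t])
    return degree

deg = grow_network(500)
hub_degree = max(deg.values())   # a few highly-linked hubs emerge
```

Because early, well-connected nodes keep attracting new links, a handful of hubs accumulate most of the connections, which is exactly the "rich get richer" behavior the bullet list above describes.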
Given the scale-free network's natural intrinsic vulnerability of critical node functional degradation reducing the overall performance of a system, one could reasonably conclude that inadequate performance (functional degradation) by the National Security Advisor critical node was the primary cause of the lack of a comprehensive IIS national security policy. The National Security Advisor's failure to force, through his mandated and perceived authority,⁴ subordinate policy decision makers (DoD, DoC, USSPB, NCO(SIP&C-T), etc.) to collaborate effectively (a degradation of that highly connected
⁴ The National Security Advisor is designated administratively as the authority that reports to the President on information infrastructure system security policy by Executive Order (EO) 12472, Assignment of National Security and Emergency Preparedness Telecommunications Functions.
⁵ Halperin defines the "essence" of an organization as the dominant group's (generally career officials) view of the organization's mission and capabilities (Halperin with Clapp and Kantor, 3 and Halperin, 78).
node's function) hampered the policymaking process. The CIAO's and NCO(SIP&C-T)'s inability to adequately integrate the other critical infrastructures into the information infrastructure national security policymaking environment in a timely fashion further supports the concept of a vital node's importance and the reduction of system function correlated to degradation of a vital node.
In addition to end users and highly connected nodes, scale-free networks also exhibit logarithmic distribution, faster growth of highly-linked nodes, and phase transitions [Barabasi, 2002]. The highly-linked nodes in this environment did seem to grow faster, so that condition is met. But the condition of logarithmic distribution does not seem to obtain. The degree distributions are:

BOTH Solid & Dashed
Degree  Count
1       23
2       15
3       3
4       3
5       3
6       3
7       1
8       1
9       0
10      0
11      0
12      0
13      0
14      1

1. Degree explains 88.3% of the variance in Count.
2. The significance F is very low, 6.1E-07, so the good fit is significant, i.e., p < 0.005.
3. BUT the slope of the log-log data is the power = -1.25; for scale-free networks, the power is between -2 and -3, so something else is at work here.

JUST SOLID (the DASHED network is not a single component)
Degree  Count
1       28
2       11
3       5
4       1
5       2
6       1
7       0
8       0
9       0
10      0
11      0
12      0
13      1

1. Degree explains 86.3% of the variance in Count.
2. The significance F is very low, 4.5E-06, so the good fit is significant, i.e., p < 0.005.
3. BUT the slope of the log-log data is the power = -1.32; for scale-free networks, the power is between -2 and -3, so something else is at work here also.

It is not conclusive, but most scale-free networks also have high clustering coefficients, and the clustering in these networks is extremely low (arithmetic mean of 0.02305, where the coefficient ranges from 0 to 1, with a standard deviation of 0.04209 for BOTH, and mean = 0.0019352, stdev = 0.00877 for SOLID). There is virtually no clustering. The problematic node is the National Security Advisor. In both cases, it has a residual (difference between the measured value and the value predicted by the model) on the order of 10 times the residuals of the other Degree data points; both networks look
more like an exponential network with an outlier than a scale-free network. This is supported by the lack of any nodes with degrees 9-13 (both) and 7-12 (solid). I suspect that this is because the high degree of this node represents a different kind of relationship than those between the other nodes.⁶
The network does not seem to undergo a phase transition, at least none that is observable. The policy system never produces "national" information assurance security policy or a plan that is effective; "Securing America's Cyberspace" is based on voluntary cooperation by the proprietary information corporations. So, to answer the question posed in the title of this article, it does not seem as if information infrastructure security policy's organizational structure is a scale-free network, and it is therefore only a product of artificial character due to extraneous (my) agency: an artifact. A more elegant explanation than the standard bureaucratic organizational rules of why the IIS security policy environment did not perform its mandated task cannot be found entirely in the properties of its organizational structure. I say entirely because, of course, the critical nodes inherent in the scale-free network are also evident in the traditional vertical-horizontal organizational chart and, obviously, made a difference in the performance of the policy environment. Even if the organizational structure is not a scale-free network, it is certainly complex and may resemble something we international relations analysts call a "regime": a complex of stated and understood principles, norms, rules, processes, and organizations that together help govern behavior [Goldstein, 2005]. As can be implied from the definition, such an environment is in and of itself extremely complex, with few lines of demarcated authority, and bodes even worse for centralized, comprehensive policy.
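The log-log regression used above can be re-run on the tabulated "BOTH" degree data. This is our own sketch, and since the zero-count degrees must be dropped before taking logarithms, the fitted slope comes out close to, but not exactly, the reported -1.25; either way it falls well above the -3 to -2 range expected of a scale-free network:

```python
import math

# "BOTH Solid & Dashed" degree distribution from the table above
# (degrees with zero count omitted, since log(0) is undefined).
counts = {1: 23, 2: 15, 3: 3, 4: 3, 5: 3, 6: 3, 7: 1, 8: 1, 14: 1}

xs = [math.log(d) for d in counts]           # log degree
ys = [math.log(c) for c in counts.values()]  # log count
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
slope = sxy / sxx               # the fitted power-law exponent
r2 = sxy * sxy / (sxx * syy)    # proportion of variance explained
# slope is shallower than -2, consistent with the conclusion that
# the network is not scale-free.
```

The fit is also sensitive to the degree-14 outlier (the National Security Advisor node), which is exactly the residual anomaly discussed in the text.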
1.5 Conclusions
Even if the policy organizational structure is not a scale-free network, the sad fact is that much of this policymaking confusion could have been avoided. Morton Halperin offers an explanation of what might have been. He postulates that "despite the different interests of the participants and the different faces of an issue which they see, officials will frequently agree about what should be done." Such agreement most likely takes place when there is strong Presidential leadership [Halperin, 1971]. Unfortunately, throughout his term President Clinton was much more interested in domestic policy than national security matters and did not provide the strong leadership necessary to clarify the policy environment and resolve the organizational competition. The current Bush administration seems to have cooled to the idea of preparing a comprehensive national IIS policy. I suspect the administration has decided that other critical infrastructure risks, considered more likely to be exploited by terrorists using weapons of mass destruction with more spectacular effects, are a higher priority. That determination is understandable given the events of September 11, 2001. However, the vulnerabilities of the IIS probably pose a greater national security risk to the nation as a whole given the significance of the effects from exploitation of its
6 I wish to thank publicly Steve Abrams, doctoral candidate in Information and Computer Sciences at the University of California, Irvine, for his assistance in helping me better understand the physical properties of networks and for determining the degrees and power of the IIS Security Policy Organization Network depicted.
vulnerabilities than the relatively small-scale terrorist use of weapons of mass destruction. The seemingly overriding difference between the two risks would seem to be the public panic caused by the use of weapons of mass destruction, from both casualties and the mere threat and possible presence of menacing nuclear, biological or chemical agents in the nation itself. Use of weapons of mass destruction within the territory of the United States is a direct attack on the core of the nation itself, and the effects are much more visible than an attack on something as invisible and abstract as the IIS, but not more damaging to the nation's national security. As the events of September 11 showed, sharing information between such a large number of organizations either efficiently or in a timely manner is nearly impossible. One could also argue that the administration has continued the policy pattern established during the 1990s: increasing the complexity of the policy environment. A new executive position, the President's Special Advisor for Cybersecurity, has been created, and an entirely new Executive Branch department (the Department of Homeland Security) has been added to the picture without any statutory or administrative relief of the already existing policy structures or processes. The organizational landscape has only become even more muddled.
References
Barabasi, A.-L., 2002, Linked: The New Science of Networks, Perseus (Cambridge).
Goldstein, Joshua S., 2005, International Relations, 6th edition, Longman (New York).
Halperin, Morton, 1971, Why Bureaucrats Play Games, in Foreign Policy, volume 2 (Spring 1971), 74.
Halperin, Morton H., with the assistance of Priscilla Clapp and Arnold Kantor, 1975, "The 'X' Factor in Foreign Policy: Highlights of Bureaucratic Politics And Foreign Policy," Brookings Research Report 140, Washington, D.C.: The Brookings Institute, 3.
National Academy of Sciences, 1989, Growing Vulnerability of the Public Switched Network, National Research Council, National Academy Press (Washington, D.C.).
National Academy of Sciences, 1990, Computers at Risk: Safe Computing in the Information Age, System Study Committee, National Research Council, National Academy Press (Washington, D.C.).
National Academy of Sciences, 1996, Cryptography's Role in Securing The Information Society (CRISIS), Committee to Study National Cryptography, Computer Science and Telecommunications Board, Commission on Physical Sciences, Mathematics, and Applications, National Research Council, National Academy Press (Washington, D.C.), 620.
United States Congress, National Security Act of 1947 (PL 235 - 61 Stat. 496; 50 U.S.C. 402), as amended, 80th Congress, 1st sess., July 26, 1947.
United States Department of Defense, January 8, 1997, Report of the DSB Task Force on Information Warfare (Defense), Section 2.4, "Threat" (Washington, D.C.).
United States Joint Security Commission, August 24, 1999, Report of the Joint Security Commission II, "Organizing INFOSEC in the Government" (Washington, D.C.).
United States National Security Agency, September 2000, National Information Systems Security (INFOSEC) Glossary, NSTISSI No. 4009, Ft. Meade, MD: NSTISSC Secretariat (142).
United States White House, December 1999, A National Strategy For a New Century (Washington, D.C.).
United States White House, May 1998, Clinton Administration's Policy on Critical Infrastructure Protection: Presidential Decision Directive 63, White Paper (Washington, D.C.), http://www.info-sec.com/ciao.
United States White House, 2000, Defending America's Cyberspace: The National Plan for Information Systems Protection, Version 1.0: An Invitation to a Dialogue (Washington, D.C.).
Webster's Ninth New Collegiate Dictionary, 1987, Merriam-Webster, Inc. (Springfield, MA).
Zuckerman, M.J., March 9, 2000, Asleep at the Switch? How the Government Failed to Stop the World's Worst Internet Attack, in USA Today.
Appendix 1: Glossary
CIAO - Critical Infrastructure Assurance Office
CIOC - Chief Information Officers Council
CSSPAB - Computer System Security and Privacy Advisory Board
DOC - Department of Commerce
FCC - Federal Communications Commission
HCS WG - High Confidence Systems Working Group
IITF - Information Infrastructure Task Force
INTER-AGENCY WG ON CIP R&D - Interagency Working Group on Critical Infrastructure Protection Research and Development
ISAC - Information Sharing and Analysis Center
ISPAC - Information Security Policy Advisory Council
LSN WG/NGI - Large Scale Networking Working Group/Next Generation Internet
NCSIP&C-T - National Coordinator for Security, Information Protection and Counter-Terrorism
NEC - National Economic Council
NIAC - National Information Assurance Council
NIPC - National Infrastructure Protection Center
NIST - National Institute of Standards and Technology
NRIC - Network Reliability and Interoperability Council
NCS - National Communications System
NSC - National Security Council
NSTAC - National Security Telecommunications Advisory Committee
NSTC - National Science and Technology Council
NSTISSC - National Security Telecommunications and Information Systems Security Committee
NTIA - National Telecommunications and Information Administration
OMB - Office of Management and Budget
OPM - Office of Personnel Management
OSTP - Office of Science and Technology Policy
PACHPCCITNGI - President's Advisory Committee on High Performance Computing and Communications, Information Technology, and the Next Generation Internet
PCCIP - President's Commission on Critical Infrastructure Protection
PCAST - President's Committee of Advisors on Science and Technology Policy
PITAC - President's Information Technology Advisory Committee
USAC (NII) - United States Advisory Council on the NII
USSPB - United States Security Policy Board
Chapter 16
APPLICATION OF COMPLEX SYSTEMS RESEARCH TO EFFORTS OF INTERNATIONAL DEVELOPMENT
Hans-Peter Brunner1
Senior Economist, Asian Development Bank
6 ADB Avenue, 0980 Manila, Philippines
[email protected] (T) +63-2-6324159
Fundamental research on complex systems has shown relevance to efforts of international development. This paper canvasses some practitioner-friendly approaches to international development. Development is about interventions in a highly complex system, society. Complex systems research tells us that development interventions should not be overly planned; rather, the fundamental uncertainty of a changing social system requires a diversity of interventions and rapid learning from development success and failure. Developing economies function at a low level of effectiveness and resource use. Complex systems are change resistant, and intervention requires understanding the autocatalytic nature of a process of change. International development is about the stimulation of a society's innate autocatalytic, self-organizing processes through interventions that stimulate enough to overcome change resistance but do not overwhelm the system. Since the size of financial interventions may in some cases be a substantial fraction of the existing economic activity, disruption is a likely outcome. Crucially, one must avoid having socio-economic activity organize around the intervention itself, since then an undesirable dependency of the economy on the intervention arises. Stimulation of the innate modes of activity results in the development of socio-economic organization around energy, material and financial flows. The primary generator of effectiveness is an appropriate network structure of interactions and relationships. This paper summarizes traditional development efforts and their outcomes, as well as a plausible description of the process of complex-systems-motivated interventions. Examples are given of recent approaches which aim to appropriately stimulate international development.
1 The assessments and opinions expressed are solely those of the author and should not be attributed to any of the organizations with which he is affiliated. The author thanks the participants of the International Conference on Complex Systems for their valuable inputs.
1. Society and its economy as a complex system
As Matutinovic [2005, 873] states, "An economy is a complex system consisting of a myriad of agents that may be placed in three broad categories: firms, households, and government. Agents' interactions come under the broad umbrella of cooperation and competition while their production and consumption activities constitute the functional fabric of the economic system. Economic activities often span several hierarchical levels of functional interdependence." Complex systems are highly networked systems. They achieve their stability through the existing interdependencies of economic agents. Large-scale disruptions by intervening external agencies can destabilize a complex system and push it into an undesirable steady state or equilibrium. This also means that existing interactions are fully aligned with the way the system works, and external change agents will encounter resistance from the system to introducing change unless the intervention is designed to be sufficiently large and compatible with the existing system [Bar-Yam 2004]. Hence international development is about the stimulation of a society's and economy's innate autocatalytic, self-organizing processes through interventions at the meso level of the hierarchy (mostly the intermediary institutional level, not the individual agent or the macro level). The interventions have to be sizeable enough to overcome change resistance, but must not overwhelm the system. The paper explains the autocatalytic nature of a change process, and it describes the network structure of complex societies. With illustrations of traditional development efforts and with examples which take a complex systems view of society, the paper makes a plausible case for appropriate, practitioner-friendly development approaches.
The choice of analytical framework and concepts not only helps international agencies select the best use of funds, but also makes international development interventions more likely to lead to desirable outcomes.
2. The autocatalytic, self-organizing nature of a process of change
'Autocatalysis' refers to "any cyclical concatenation of processes wherein each member has the propensity to accelerate the activity of the succeeding link" [Ulanowicz 1999, 41-55]. Autocatalysis in an economic system presumes a variety of economic actors (the vertices or nodes of a network) interacting in a network of economic links. The network structure of interaction is detailed in a following section. Economists have used positive and negative feedback loops to model the autocatalytic nature of economic change. Brunner [1994] has formalized mechanisms underlying the creation of populations of economic actors, such as firms, leading to macroeconomic change. Productivity change is driven by fluctuating population size in an institutional setting of economic rules. The population is subjected to variations in frequencies of events and performance characteristics. The frequency of firm attributes in a population is the result of a stochastic process of innovation and imitation. A productivity increase can be detected and felt at the higher, macro level, but the change is the outcome of individual behavior that fluctuates and that is susceptible to three distinct forces: competition and cooperation in networks, technological diversity, and 'appropriability' conditions (the legal and institutional frame, for instance intellectual property rights). In positive and negative feedback loop models, autocatalysis streamlines energy, material, and financial flows towards the more efficient members of the population. Heterogeneous agents continually adjust to the macro situation they create collectively. The whole population is ratcheted up
on the macroeconomic level until the constituent members of a population touch a physical or spatial constraint and a negative feedback loop constrains growth [Matutinovic 2005, 872].
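This ratchet dynamic, positive feedback amplifying growth until a constraint activates a negative feedback loop, can be illustrated with a standard logistic-style update. The parameter values here are arbitrary assumptions chosen only for illustration:

```python
# Positive feedback grows the population of efficient firms; the
# negative feedback term (1 - n/K) binds as n approaches the
# physical or spatial constraint K (the carrying capacity).
def step(n, r=0.3, K=100.0):
    return n + r * n * (1.0 - n / K)

n = 1.0
trajectory = [n]
for _ in range(60):
    n = step(n)
    trajectory.append(n)

# The population is ratcheted up until the constraint binds.
print(f"final population ~ {trajectory[-1]:.1f} (constraint K = 100)")
```

With r = 0.3 the trajectory rises monotonically and levels off at K, which is the qualitative behavior the feedback-loop models above describe.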
3. Traditional development efforts and their outcome
Traditional equilibrium development theory is based on full information and on networks and systems so complete that disturbances at any node spread evenly through the system. Moreover, when explanation happens at the macro level, agents are assumed to be of an average quality consistent with aggregate behavior [Arthur 2006], and change impacts all agents evenly. However, analysis at the aggregate and average levels cannot capture the important underlying complex processes of change in individual agent behaviour and in the economic interactions and connections that constitute the heart of economic development. As Potts [2001] shows in detail, the traditional microeconomic development framework is based on 'field-theory',2 whereas a complex systems approach does not treat all connections between nodes or points in a field as complete. In this non-traditional way the explanatory focus becomes the change in the structure of an incomplete system into a less complete or more complete (complex) system as a result of a development intervention. What is to be explained in socio-economic systems is structural change in terms of the degree of heterogeneity of agent populations in space, the modularity and hierarchy of a system, and other aspects of composite structural existence. According to Potts [2001, 14, 19], the economic concept of a field is the extreme and unrealistic case of the geometry of economic space. If a set of interactions cannot be collapsed to a field of actions, as in traditional development theory, then the geometry of development space must be mapped by a set of specific network connections among agents. Traditional economic development theory posits an equilibrium setting, with an economy generally being in equilibrium or in a steady state which, when disturbed by random shocks, returns to a different equilibrium or steady state.
Change is not the result of the complex system's innate, self-organizing properties of populations of heterogeneous agents. Central to the approach is the clearing of supply or demand imbalances through the price or quantity adjustment mechanism. Events are explained at a macro level, usually by counterposing supply and demand, savings and investment, and so on, without recognizing the underlying complex microeconomic autocatalytic processes inducing the supply and demand, savings and investment imbalances. As a result the traditional approach relies conveniently and exclusively on tractable mathematical, numerical methods of representation, for instance on the use of difference or differential equations. The savings and investment imbalances in traditional theory arise from price changes as a result of factor substitution, when supply and demand of factors change as a result of exogenous forces such as a larger educated population or more efficient techniques that allow for more capital-intensive production. A production function is used in equilibrium models to reflect the changes in factor proportion, in terms of movement along a production function with given technology, and to reflect a production function shift toward a 'known' technology frontier as a result of the exogenous adoption of efficient techniques. In complex, evolutionary systems and development theory, investment is related to technology progress, and this relationship is expressed in a technology
2 A field is a space in which all points are connected to all other points in the space, thus conjuring full information among agents.
progress function [Metcalfe et al. 2006], which replaces the need for a production function. A technology progress function links the productivity variable to growth. Very widespread is the application of the two-gap differential equation model of economic growth, where growth is constrained by either domestic savings or foreign exchange earnings from exports. A development intervention shocks a constrained-growth equilibrium by releasing a growth constraint, thereby moving the macro-system onto a higher development equilibrium. It is a mechanistic approach based on field-theory, where all technology is known at least in principle, and where all connections exist but are subject to growth constraints. Thus the conventional model approach leads development policy to mechanistically direct resources to increase investment, or to change relative factor prices, or to increase factor endowments, or to release constraints. Conventional recommendations for trade policy, for instance to remove a foreign exchange gap, are based on models with constant or diminishing returns in technological capabilities [Arthur 1990, 98]. Such models settle on single equilibrium points (in contrast to autocatalytic or positive feedback approaches, which can lead to multiple equilibria, some of which can be undesirable). Traded goods settle at a fixed world price ("Law of One Price") as conditions of zero transaction and information costs prevail. The traditional Ricardian model of trade excludes all costs of transacting. No institutions, no transport are necessary. Information and knowledge about products, markets, and so on is transmitted instantly and costlessly. Because of these field-theory assumptions, the traditional Ricardian model of trade allows simply for the numerical analysis of the effect of changing relative factor prices on the structures of production. The venerated Heckscher-Ohlin approach is the neoclassical prototype equilibrium model.
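As a toy illustration of the two-gap logic just described, growth limited by whichever of the savings gap and the foreign-exchange gap binds more tightly, and an intervention that releases the binding constraint, consider the following sketch. All parameter names, functional forms, and numbers are invented for illustration, not taken from the models cited above:

```python
# Two-gap model sketch: growth is the minimum of the growth rate
# financeable from domestic savings (savings_rate / ICOR) and the
# growth rate supportable by foreign exchange earnings (here an
# illustrative linear form fx_earnings / import_intensity).
def growth(savings_rate, icor, fx_earnings, import_intensity):
    g_savings = savings_rate / icor
    g_forex = fx_earnings / import_intensity
    return min(g_savings, g_forex)

# Releasing the binding constraint (e.g. an inflow easing the
# foreign-exchange gap) moves the economy to a higher growth path.
before = growth(savings_rate=0.15, icor=3.0, fx_earnings=0.06, import_intensity=2.0)
after = growth(savings_rate=0.15, icor=3.0, fx_earnings=0.12, import_intensity=2.0)
print(before, after)  # forex-constrained 0.03 -> savings-constrained 0.05
```

The mechanistic character the text criticizes is visible in the sketch: the intervention simply slides the min() to the next constraint, with no role for agent heterogeneity or network structure.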
In all such trade models the location of production is determined by relative (comparative) advantages based on factor endowments at a macro level. When, however, resources are created by different, heterogeneous market participants in different locations, and by a dynamic process of competition and cooperation leading to innovation and imitation, then it is far from obvious that patterns of trade are determined by changing relative prices, or factor endowments, or by releasing constraints and barriers to trade. Induced micro-dynamics in an adequate institutional setting, where policy interventions do not overwhelm the system but stimulate innate autocatalytic processes of productivity improvement, lead to new trade patterns. When trade patterns shift, we encounter the complexity of an out-of-equilibrium economy, and international development design needs to be based on a model different from the mechanistic and purely numerical one. Only a complex systems approach to international development will help us calibrate interventions to a complex system such that an intervention stimulates enough for significant and positive change, without creating system dependency on it. International development thus leads to more desirable outcomes with the complex systems approach.
4. The 'small-world network' structure of economy interactions and relationships One example of autocatalytic, self-organizing structure formation in an economy is the formation of cities and economic hubs of specialized industries and services in a network of supply and value chains. According to Barabasi [2002] the economic hubs play a special role in the stability of an evolving economic network. Structure formation and international development result from a change
of the economic interactions and of the functional relationships in the trade connections of economic actors. Such trade connections can be represented as a directed network (a 'small-world network'), where incoming links k_a refer to supply flows and outgoing links k_b to selling flows, so that the degree of an actor or vertex in network terms is the total number of its connections, k = k_a + k_b (see Matutinovic [2005, 873-77] for a detailed description of an economy as a 'small-world network'). A small-world network is characterized by short average paths connecting any two agents in the economy and by a small number of hubs, populated with few firms, that have a large total number of connections in the network. Firms are connected to their suppliers, buyers, and other firms in cooperation; households are connected to consumer product markets and service categories; government provides services both to firms and households and is linked to a large total number of supplies incoming from firms. In a small-world network, economic flows among firms are highly skewed: a few relationships account for most of the interactions. Networks are weighted and directed, such that some links are very important for an economy because they have high carrying capacity and are the crucial links of many lesser vertices to central actors in the economic system. This is the well-known 80/20 pattern, where 80 percent of business is conducted with 20 percent of suppliers and customers. Product markets and service categories are defined by a dense topology of connections which represent supply and value chains.
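The degree bookkeeping for such a directed trade network (k = k_a + k_b) can be sketched with a plain adjacency list. The firms and links below are invented for illustration; in practice a network library would be used on real supply-chain data:

```python
from collections import defaultdict

# Directed trade links: (supplier -> buyer). All names are invented.
links = [("farm", "mill"), ("mill", "bakery"), ("mill", "brewery"),
         ("bakery", "retail_hub"), ("brewery", "retail_hub"),
         ("farm", "retail_hub"), ("importer", "retail_hub")]

k_in = defaultdict(int)   # k_a: incoming supply links
k_out = defaultdict(int)  # k_b: outgoing selling links
for supplier, buyer in links:
    k_out[supplier] += 1
    k_in[buyer] += 1

nodes = set(k_in) | set(k_out)
degree = {n: k_in[n] + k_out[n] for n in nodes}  # k = k_a + k_b

# A hub concentrates a disproportionate share of the links,
# the skewed 80/20 pattern described in the text.
hub = max(degree, key=degree.get)
print(hub, degree[hub])  # retail_hub has the highest total degree
```

Even in this tiny invented example the degree distribution is skewed: one vertex carries four of the seven links, while most carry one or two.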
5. Examples of recent approaches
Recent approaches to international development take a complex systems and network view. Economic agents, constrained by institutions and expectations, interact with each other to produce macroeconomic outcomes. Firms are economic actors, and they make constrained rational decisions about production, investment, and marketing strategy, among others. Firms operate in an environment of procedural and substantive uncertainty; their behavior is dominated by trial and error, and learning is a continuing process [Matutinovic 2005, Nelson and Winter 1982, Brunner 1994]. New goals, skills, and technology result in new behavior. Novelty arises spontaneously in response to problems, and novelty diffuses among a population of firms and through networks of interaction. New solutions lead to new problems. Fluctuations and novelty at the macroeconomic level arise from a nonlinear dynamic process among market actors. Brunner and Allen [2005] in their book present development policy experiments with the help of complex systems, evolutionary trade models. Such models combine the mathematical, numerical approach, where differential equations determine macroeconomic outcomes, with a logic (time-indexed) sequence model which defines the trade network interaction of heterogeneous economic agents at the micro level. Development intervention is enacted at the meso (institutional) level of a hierarchically structured economic system. In those experiments trade is foremost influenced by the policy-induced change in the capability of economic agents to engage in trade. Economic agents use their trading power to buy further technological capability. A technological progress function is used. Trade is productivity driven, and the evolutionary trade models in this book link productivity change to structural differences occurring in terms of export product variety and quality. Structural changes in trade are linked to
increases in employment, incomes, and growth rates. In the models, export success feeds back into a positive loop, or (non-linear) autocatalytic process, of increased productivity leading to increasing economies of scale and agglomeration effects. Economic agents are approximated as actors in networked geographic space. Each geographic cell in a cellular automaton (CA) establishes trade with the neighboring cells, mainly based on the productive capability of an export-oriented firm within the cell. Put into an agent space of the CA, the economic model combines a numerical mathematical model with a logic time-indexed sequence of agent states. Each geographic space thus produces output and consumes at the same time [there is no explicit demand function, though; this could be added in a more complete model]. Producers earn rent for their efforts. Consumers earn wages. Production and consumption can temporarily diverge in a particular location, thus leading to trade with neighbors. Firms compete with other firms in the same sector and are selected for their success. Firms cooperate in networks of suppliers of inputs, knowledge providers, consultants, marketing, industry and service cooperatives and associations. Trade networks emerge. The whole combination of factors in a variety of trade services is characterized by a combined "transaction technology", which is incorporated in export unit values of South Asian exports to OECD countries (import data of the OECD countries). Emerging trade networks encapsulate knowledge. In evolutionary trade theory, variation and selection among agents re-coordinates knowledge. Development intervention is directed at structural change. The next, non-equilibrium model in chapter 4 [Brunner and Allen 2005] extends the previous model with a population and labor force sub-model as well as a natural resource sub-model.
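A minimal sketch of the cell-level mechanism just described, cells that both produce and consume, with temporary surpluses traded to neighbors and export success feeding back into capability. The grid, trading rule, and numbers are illustrative assumptions, not the implementation in Brunner and Allen [2005]:

```python
import random

random.seed(1)
SIZE = 5  # a one-dimensional ring of geographic cells, for simplicity

# Each cell's productive capability drives its output.
capability = [random.uniform(0.8, 1.2) for _ in range(SIZE)]

def step(capability):
    output = list(capability)
    consumption = [random.uniform(0.8, 1.2) for _ in range(SIZE)]
    surplus = [o - d for o, d in zip(output, consumption)]
    trade = [0.0] * SIZE
    for i, s in enumerate(surplus):
        if s > 0:
            trade[i] -= s                 # the cell exports its surplus
            trade[(i + 1) % SIZE] += s    # the neighboring cell imports it
            capability[i] *= 1.01         # export success feeds back
    return trade                          # (autocatalytic loop)

flows = [0.0] * SIZE
for _ in range(10):
    flows = step(capability)

print("capabilities after 10 steps:", [round(c, 3) for c in capability])
```

Because every export is booked as a matching import at a neighbor, trade flows sum to zero across the ring in each step, while successful exporters' capabilities ratchet upward.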
From the view of spatial structure, the demographic, economic, and environment variables in the modeling framework are disaggregated into those of different regions and sectors, similar to the CA structure in the previous model. The extended model introduces three dynamic drivers that are represented through the use of three 'attractivities': demand attractiveness, migration attractiveness and investment attractiveness. These attractivities affect the decision making of the actors (individuals, firms, investors, etc.) within the model and capture their responses to differences in price, reward and productivity. As a result, initially small differences can be amplified through autocatalytic processes and eventually lead to a spatial re-structuring of the system and a change in the trading pattern across zones. Economic incentives emanating from the development interventions and investment attractivities create the opportunity of employment, increase vacancies and raise labor productivity, which causes an increase in jobs and wages [note again the use of a technology progress function here]. Improved or decreased income-earning opportunities will alter relative economic conditions among the regions and sectors, changing the pattern of attraction and flow of the workforce between regions. This will in turn alter patterns of demand for services and the relative investment behavior of sectors. On the one hand, as a result, there will be further incentives favoring further investment in the region, which will in turn lead to further migration from other regions. On the other hand, the further growth of population could lead to the degradation of the environment, particularly of water supply and waste disposal, and to congestion and increased utility costs in this region. The potential degradation could lead to a decrease in comparative advantages with regard to further increases in economic activity and population. Instead of a model of inter-regional trade
that moves towards a single, predictable outcome of equity or structured activity, in our model there exists a combination of positive and negative feedback loops that can lead to different spatial structures of economic activity and population, depending on the order and rapidity of the responses of the different agents and their abilities to innovate and to trade successfully. The balance of advantages can potentially evolve in different directions, and change the spatial pattern of economic activity and population migration. Let us present a practical application in order to illustrate how the new ideas can be used to illuminate the probable consequences of possible infrastructural projects, with a view to ascertaining how worthy they are of support, and what the impacts on poverty and society are likely to be.
6. A Spatial Example of West Bengal
The example is taken from a study commissioned by the ADB concerning some possible infrastructural projects in West Bengal. A dynamic, spatial model was developed using 33 different spatial zones of West Bengal that were chosen by the main consultants on the project,3 Halcrow-Fox, for their transportation study of the region. In addition, each of these zones was linked to activities in Orissa, the Indian South and Central States, Bihar (Ranchi) and Bihar (Patna), the Indian North and North Western States, Assam, Sikkim, the Indian Northeastern States, Bangladesh North, Bangladesh Central and Bangladesh South, Nepal and Bhutan. There were 46 zones in all. The transportation study provided information concerning the flows of goods and commodities between these zones along the transport infrastructure of West Bengal. This information concerned volume and value, by 24 different sectors, and the origins and destinations. Four different types of flow can be distinguished. All details of the example are given in chapters 5.2 to 5.6 of Brunner and Allen [2005]. One of the notable characteristics of the complex systems approach is that the economic models are built upon an interregional matrix. The economic variables, such as demand and supply, import and export, productivity accumulation and consumption, costs, wages and profits, link directly with the interregional matrix or input-output system. The flow equations are then calculated in the simulation models, thus enabling adequately calibrated development intervention and effective use of economic resources. Since the economic sectors of different regions are differentiated, there is a constant tendency to instability, mediated by market mechanisms, mainly prices.
Thus the interregional matrix suggested here establishes the relationship between the whole economy, the regional/sector economy, and the local decision-making process, that is, the relationship between 'macro' economic behaviour and 'micro' economic structure.
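The bookkeeping behind such an interregional matrix can be sketched in a few lines: entry F[i, j] holds the flow of goods from zone i to zone j, and row and column sums give each zone's exports and imports. The three-zone figures below are invented placeholders, not data from the West Bengal study:

```python
import numpy as np

# F[i, j]: value of goods flowing from zone i to zone j.
# Three illustrative zones stand in for the study's 46 zones.
F = np.array([[0.0, 12.0, 3.0],
              [5.0, 0.0, 9.0],
              [2.0, 4.0, 0.0]])

exports = F.sum(axis=1)   # row sums: each zone's total outflow
imports = F.sum(axis=0)   # column sums: each zone's total inflow
balance = exports - imports

# Surpluses and deficits net to zero across zones, since the matrix
# books every flow once as an export and once as an import.
print("trade balances by zone:", balance, "| net:", balance.sum())
```

This is the sense in which the matrix links 'macro' behaviour to 'micro' structure: aggregate balances are nothing but sums over the zone-to-zone flows that the simulation updates.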
3 Halcrow Fox, Vineyard House, 44 Brook Green, London W6 7BY. "North-South Corridor Development Project in West Bengal".

7. Summary
This paper has canvassed some practitioner-friendly approaches to international development. Development is about interventions in a highly complex system, society. Complex systems are change resistant, and intervention requires understanding the autocatalytic, self-organizing processes that shape an economy
across micro, meso, and macro scales. In positive and negative feedback loop models, autocatalysis streamlines energy, material, and financial flows towards the more efficient economic actors and agents. Heterogeneous agents are located and linked in space and geography. What makes for structural change, and thus development, is the reconfiguration and filling-in of economic network connections among agents. In the complex systems view the explanatory focus becomes the change in the structure of an incomplete system. In traditional equilibrium theory, a priori complete networks ('field-theory') spread change equally across a system composed of homogeneous 'average' actors. In a complex systems approach, microeconomic, autocatalytic processes lead to the macroeconomic, aggregate behaviour of the system, and the evolving macro scale of the system in turn constrains the behaviour of heterogeneous actors. Traditional equilibrium theory does not link micro processes to the macro scale of an economy. In evolutionary complex systems models the scales are linked, because numeric models and logic algorithms are combined; not so in traditional theory. Moreover, the complex systems models canvassed in this paper do not employ traditional production functions as part of the numeric model; instead, technology progress functions are incorporated into the models.
Bibliography
[1] Arthur, Brian, 2006. "Out-of-Equilibrium Economics and Agent-Based Modeling." In Handbook of Computational Economics, Vol. 2: Agent-Based Computational Economics, K. Judd and L. Tesfatsion (eds.), Elsevier/North-Holland.
[2] Arthur, Brian, 1990. "Positive Feedbacks in the Economy." Scientific American, February: 92-99.
[3] Barabasi, Albert, 2002. Linked: The New Science of Networks. Cambridge, MA: Perseus Publishing.
[4] Bar-Yam, Yaneer, 2004. Making Things Work. Cambridge, MA: NECSI Knowledge Press.
[5] Brunner, Hans-Peter and Peter Allen, 2005. Productivity, Competitiveness and Incomes in Asia: An Evolutionary Theory of International Trade. Cheltenham, UK and Northampton, MA: Edward Elgar Publishing.
[6] Brunner, Hans-Peter, 1994. "Technological diversity, random selection in a population of firms, and technological institutions of government." In Evolutionary Economics and Chaos Theory, L. Leydesdorff and P. van den Besselaar (eds.), London: Pinter.
[7] Matutinovic, Igor, 2005. "The Microeconomic Foundations of Business Cycles: From Institutions to Autocatalytic Networks." J. of Economic Issues, vol. 39, 4: 867-98.
[8] Metcalfe, Stan, John Foster and Ronnie Ramlogan, 2006. "Adaptive economic growth." Cambridge J. of Economics, vol. 30, 1: 7-32.
[9] Nelson, Richard and Sidney Winter, 1982. An Evolutionary Theory of Economic Change. Cambridge, MA: Harvard Univ. Press.
[10] Potts, Jason, 2001. The New Microeconomics. Cheltenham, UK and Northampton, MA: Edward Elgar Publishing.
[11] Ulanowicz, Robert, 1999. "Life after Newton: An Ecological Metaphysic." BioSystems, vol. 50: 127-42.
Chapter 17
About the bears and the bees: Adaptive responses to asymmetric warfare
Alex Ryan
DSTO, Australia
[email protected]
Conventional military forces are organised to generate large scale effects against similarly structured adversaries. Asymmetric warfare is a 'game' between a conventional military force and a weaker adversary that is unable to match the scale of effects of the conventional force. In asymmetric warfare, an insurgent's strategy can be understood using a multi-scale perspective: by generating and exploiting fine scale complexity, insurgents prevent the conventional force from acting at the scale they are designed for. This paper presents a complex systems approach to the problem of asymmetric warfare, which shows how future force structures can be designed to adapt to environmental complexity at multiple scales and achieve full spectrum dominance.
1 Introduction
The cold war was a story about two giant bears, Uncle Sam and Mother Russia, armed to the teeth and locked in the dynamic stability of mutually assured destruction. The nuclear arms race saw both bears grow stronger and more dangerous. The only non-catastrophic resolution could be voluntary withdrawal by one side, which occurred when Mother Russia backed down, exhausted and weak from keeping pace with Uncle Sam. Now the undisputed power in the woods, Uncle Sam felt lost at first without the focus that a life-threatening adversary demands. Searching for animals that might one day become powerful bears, Uncle Sam encountered many mystical animals: dragons, elephant gods and a reptilian Godzilla, although none seemed interested in confronting the
dominant bear on his terms. So Uncle Sam went into hibernation, until one September several sharp stings rudely awoke him. Enraged, the bear found the small bodies of the bees that had released their venomous barbed stingers in his flesh¹. Uncle Sam tracked the honey trail to a hive in the mountains, which he smashed with his giant paws. Unsatisfied, the bear continued over the mountains and into the desert, where the bear knew of another beehive, rumoured to have even nastier stings. This hive too was quickly smashed, but Uncle Sam was in for a surprise. For this species of desert bee was more aggressive and easily agitated. Every time they stung the intruder, an alarm pheromone triggered more bees to attack. The swarming enemy was small, mobile and dissolved into the desert under direct attack, which rendered the bear's keen eyesight and large muscles ineffective and impotent. Crushing individual bees could not destroy the swarm, and even when the bear had limited success, a Goliath triumph over David could do little to win the hearts and minds of the neighbouring animals. Truth be told, the locals rather thought the bear was stirring up a hornet's nest. This story illustrates just a few of the issues associated with the transition from the cold war era to the so-called Global War on Terror. It is a problem that is relevant to every nation state in every region. Terrorism is asymmetric, enduring, ancient, unencapsulated and continually co-evolving: it is the kind of problem that has most stubbornly resisted conventional scientific reduction, the kind of problem that has helped motivate the rise of systems thinking. Section 2 provides insights from complex systems theory, explaining in terms of organisation and scale why the bear is ineffective against the bees, and introduces adaptation as a possible response.
The conjunction of systems engineering and complex systems (beginning to be recognised as a separate field, complex systems engineering) is developed in Section 3 to provide a new approach to developing capabilities for complex environments. Section 4 explains the attrition warfare paradigm that conventional forces are organised to fight, which is contrasted with asymmetric warfare in Section 5. Adaptive responses to asymmetric warfare are discussed in Section 6.
2 Complex systems
Two complex systems ideas are useful for understanding asymmetric warfare: multiscale variety and adaptation. The law of multiscale variety [4, 5] is an extension of Ashby's law of requisite variety [1]. Assume that a system has N parts that must be coordinated to respond to external contexts, and that the scale of a response is given by the number of parts that participate in the coordinated response. Secondly, assume that under (complete) coordination, the variety of the coordinated parts equals the variety of a single part. Then, coordination increases the scale of response, but decreases its variety: there is a tradeoff between large scale behaviour and fine scale complexity². The generalised law of multiscale variety states that at every scale, the variety available to the system must be larger than the variety required by the tasks at that scale [5].

¹ I thank Martin Burke for suggesting this metaphor for asymmetric warfare. Swarms of bees are also discussed as a metaphor for effects-based operations in [9].
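The scale-variety tradeoff can be illustrated with a toy calculation (the force size and per-unit action count below are hypothetical, not taken from the paper): N agents, each choosing among v actions, partitioned into coordinated groups of size k, so that each group acts as a single unit of scale k.

```python
from math import log2

# Toy illustration of the multiscale variety tradeoff: a hypothetical
# force of N = 64 units, each with v = 4 possible actions. Grouping the
# force into coordinated units of size k increases the scale of the
# largest effect with k, but the variety (in bits) falls as 1/k.

N, v = 64, 4  # assumed numbers, chosen only for illustration

# variety in bits when acting as N//k coordinated groups of size k
profile = {k: (N // k) * log2(v) for k in (1, 4, 16, 64)}

for k, bits in profile.items():
    print(f"scale {k:3d}: {N // k:3d} coordinated groups, {bits:6.1f} bits of variety")
```

The fully dispersed force (k = 1) has maximal variety but minimal scale of effect; the fully synchronised force (k = 64) has maximal scale but the variety of a single part, matching the tradeoff stated above.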
Adaptation is a generic model for learning successful interaction strategies between a system and a complex and potentially non-stationary environment. The environment is treated as a black box, and stimulus-response interactions provide feedback that modifies an internal model or representation of the environment, which affects the probability of the system taking future actions. The three essential functions for an adaptive mechanism are generating variety, observing feedback from interactions with the environment, and selection to reinforce some interactions and inhibit others. Without variation, the system cannot change its behaviour. Without feedback, there is no way for changes in the system to be coupled to the structure of the environment. Without preferential selection for some interactions, changes in behaviour will not be statistically different from a random walk.
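A minimal sketch of these three functions, assuming a hypothetical black-box environment with made-up payoff probabilities, is an epsilon-greedy reinforcement loop: random exploration supplies variation, observed rewards supply feedback, and exploiting the best estimate supplies selection.

```python
import random

random.seed(0)

def environment(action):
    # Black-box stimulus-response: hypothetical payoff probabilities,
    # with action 2 quietly the best choice.
    payoffs = {0: 0.2, 1: 0.5, 2: 0.8}
    return 1.0 if random.random() < payoffs[action] else 0.0

estimates = {a: 0.0 for a in range(3)}  # internal model of the environment
counts = {a: 0 for a in range(3)}

for step in range(5000):
    if random.random() < 0.1:                    # variation: explore at random
        action = random.choice(list(estimates))
    else:                                        # selection: exploit best estimate
        action = max(estimates, key=estimates.get)
    reward = environment(action)                 # feedback from the environment
    counts[action] += 1
    # update the internal model with a running average of observed rewards
    estimates[action] += (reward - estimates[action]) / counts[action]

best = max(estimates, key=estimates.get)
```

Removing any one of the three ingredients breaks the loop exactly as the text describes: with no exploration the agent can lock in to an inferior action, with no reward signal the estimates never couple to the environment, and with no greedy selection the behaviour is a random walk over actions.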
3 Complex Systems Engineering
Complex Systems Engineering, when taken literally, is a contradiction in terms. If conventional approaches to modelling do not capture the dynamics of a complex system, we cannot expect to predict the effects of design choices on behaviour, an essential precondition for engineering. Therefore, either the systems we can engineer will never actually be behaviourally complex (and certainly no more complex than the designer), or else we need to develop a new understanding of what it means to design and engineer a system. And since there appears to be no bound to the complexity of problems we can imagine engineering solutions to, the latter course of action seems inevitable. In order to increase the complexity of engineering designs, we must acknowledge real limits on the certainty and predictability of system behaviour, and that complete test and evaluation is not feasible [3, Thm A]. Rather than planning a design centrally, the design occurs distributed across a team with managerial independence, relying more on market forces and end user feedback than on the ability of the lead systems engineer to understand and predict the behaviour of the whole system. Each designer would be responsible for developing a functional building block for the system, capable of operating independently. The way in which different functions interface would not be pre-specified. Interfaces would be negotiated locally as the building blocks are developed, creating large numbers of possible interaction networks. Whereas traditional systems engineering takes a glueware approach to integration that tends to minimise the number of possible interaction patterns [8], harnessing self-organising mechanisms to produce global structure provides an important source of variety. Even when the components of a self-organising system are fixed, the multiple possible organisations of components provide the flexibility to respond to unexpected environments and exploit unplanned functionality. Traditional systems engineering assumes fixed epochs, where one system is replaced by a new system that is essentially designed from scratch. However, complex problems will not have fixed solutions, so the design, develop, operate, and dispose life cycle is artificial. Rather than one-off replacement costs, complex systems engineering projects could be given a continual flow of funding that incrementally changes the system. The system is always operational, and the distinction between legacy and current systems is dissolved. Because the interaction patterns are not fixed, as new components are added the system will self-organise to include them. Some of the new components will directly compete with legacy components for links. As the new components demonstrate better performance, their interactions will be reinforced, while old components that fall into disuse can be removed. The coexistence of generations in the one system provides redundancy [2], which alleviates the need for exhaustive test and evaluation and allows greater risk-taking and innovation when designing new components.

² In this paper complexity is defined as the (log of the) number of possible configurations, and is therefore a measure of variety.
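The incremental integration model, in which new components compete with legacy components for links and disused components are retired while the system stays operational, can be sketched as a toy simulation. The component names, performance scores, and the retirement rule below are all hypothetical choices of this sketch, not taken from the paper.

```python
# Toy sketch of incremental complex systems engineering: components
# attract interaction links in proportion to demonstrated performance,
# and components whose share of interactions falls too low are retired.
# Generations coexist: the system is never replaced wholesale.

# hypothetical (component, demonstrated performance) pairs; higher is better
system = {"legacy_sensor": 0.40, "legacy_comms": 0.60}

def integrate(system, name, performance, retire_below=0.5):
    """Add a component, re-weight links by relative performance,
    and retire components well below the best performer's share."""
    system = dict(system, **{name: performance})
    total = sum(system.values())
    # share of interactions each component attracts (self-organised links)
    shares = {n: p / total for n, p in system.items()}
    cutoff = retire_below * max(shares.values())
    return {n: p for n, p in system.items() if shares[n] >= cutoff}

system = integrate(system, "new_sensor", 0.90)  # new generation joins the
system = integrate(system, "new_comms", 0.85)   # system; weak legacy retires
```

In this run the weakest legacy component is retired once a stronger competitor arrives, while the stronger legacy component continues to coexist with the new generation, illustrating the redundancy-through-coexistence point above.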
4 Attrition warfare
Conventional military forces are organised to generate large scale effects against similarly structured adversaries. Industrial age mass production allowed nation states to re-equip and reorganise rapidly following defeats in battle that in the Napoleonic era would have been decisive [9]. Therefore, industrial age armies had to be prepared to fight wars of attrition, where the primary aim is the physical destruction of the adversary's armed forces and its support base. Because of this, the attrition 'game' is dominated by the physical resources or mass each player can bring to bear. Each side competes to produce effects on a larger scale in order to overwhelm their opponent's defences. The Lanchester differential equations are the classical model for attrition warfare because they capture the importance of mass in achieving the physical destruction of an opponent. The Lanchester equations [7] relate the mass and effective firing rate of one side to the rate of attrition of the opponent. The solution is known as Lanchester's square law, which shows that increases in initial mass produce quadratic improvements in battlefield superiority compared to improvements in weapon efficiency. So how is a conventional force organised? We will consider two organisational drivers for conventional military structures. Firstly, there has been a trend throughout the history of warfare towards greater lethality at all levels of warfare, in order to mass a large scale effect in the shortest possible time frame. As is expected from the multiscale law of requisite variety, this emphasises greater coordination of the activities of units at all levels. Synchronisation enables the exploitation of synergies and compensation for weaknesses of different units within combined arms teams, and at a higher level of organisation by joint warfighting. Centralised control, detailed planning and collective training exercises are the dominant mechanisms for producing synchronised effects. A continual focus on synchronisation to produce large scale effects has encouraged specialisation. Specialisation can increase efficiency by avoiding redundancy, but it also locks in dependencies between units, reducing the ability of units to operate in isolation at finer scales. In order to focus effects, conventional forces are organised to be spatially oriented towards the forward edge of the battle area, with most protection for the front-line manoeuvre units, less protection for stand-off indirect fire units and little protection for the rear logistics supply chain. Secondly, with increasing lethality comes increasing responsibility and the potential for the misuse of force to adversely affect the strategic goals of conflict. This has reinforced the use of centralised control and codified procedures to limit flexibility and prevent strategically detrimental application of force. Highly centralised hierarchical control reliably amplifies the scale at which the commander can control battlespace effects, at the expense of the fine scale variety (complexity) that the conventional force can cope with.
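Lanchester's square law can be checked numerically. The force sizes and firing rates below are hypothetical, chosen only to illustrate the quadratic advantage of mass over the linear advantage of weapon efficiency.

```python
# Lanchester aimed-fire model: dR/dt = -b*B, dB/dt = -a*R, where a and b
# are effective firing rates. The quantity a*R^2 - b*B^2 is conserved
# along trajectories, which is the square law: mass enters quadratically,
# efficiency only linearly.

def simulate(r0, b0, a, b, dt=1e-3, t_max=500.0):
    """Forward-Euler integration until one side is destroyed (or t_max)."""
    red, blue, t = float(r0), float(b0), 0.0
    while red > 0 and blue > 0 and t < t_max:
        red, blue = red - b * blue * dt, blue - a * red * dt
        t += dt
    return max(red, 0.0), max(blue, 0.0)

# Equal efficiency, Red twice as numerous: the square law predicts Red
# wins with sqrt(R0^2 - B0^2) = sqrt(1000^2 - 500^2) ~ 866 survivors.
rf, bf = simulate(r0=1000, b0=500, a=0.01, b=0.01)

# To offset Blue's 2x numbers, Red needs 4x efficiency (a*R0^2 = b*B0^2),
# which yields near-mutual annihilation rather than a Red victory.
rf2, bf2 = simulate(r0=500, b0=1000, a=0.04, b=0.01)
```

The second run makes the asymmetry concrete: doubling mass must be answered by quadrupling efficiency just to reach a draw, which is why the attrition 'game' rewards bringing more physical resources to bear.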
5 Asymmetric warfare
In contrast to attrition warfare, asymmetric warfare is a 'game' between a conventional military force and a weaker adversary that is unable to match the scale of effects of the conventional force. To compensate, the weaker adversary must believe their will to accept the costs of conflict is greater than their opponent's. What began as Operation Iraqi Freedom, and is now referred to by the Pentagon as the Long War, is a paradigmatic example. The US-led Multinational force is opposed by an Iraqi insurgency composed of approximately 14 guerilla organisations, each with distinct aims and relatively independent operations. Although the insurgents have limited material means, their religious and ideological motivations provide strong will. A history of aversion to casualties in democracies during conflicts, especially when the conflict is peripheral to the nation's core interests, provides a basis for the insurgents' belief that the Multinational force's strategic will can be weakened if domestic support for a continued presence in Iraq is sufficiently eroded. At the tactical level, suicide bombings represent an asymmetry in will, where suicide bombers have sufficient will to utilise methods outside those available to the Multinational force. For insurgents to exploit their asymmetries, they must also negate the asymmetries that favour the conventional force. In particular, they must avoid direct, large scale confrontation against the better equipped, trained and synchronised conventional force. This can be understood using a multi-scale perspective: by generating and exploiting fine scale complexity, insurgents prevent the conventional force from acting at the scale they are organised for: large scale but limited complexity environments. By dispersing into largely independent cells, insurgents can limit the amount
of damage any single attack from the conventional force can inflict. This significantly reduces the threat of retaliation from acting as a deterrent, since the insurgents have negligible physical resources exposed to retaliatory attack [9]. Insurgents that do not wear uniforms and blend into a civilian population cannot be readily identified or targeted until they attack, in a situation of their choice. There is no longer a forward edge of the battle line, meaning softer support units are vulnerable. The number of possible locations, times and directions of attack increases significantly compared to attrition warfare, increasing fine scale complexity. The heightened potential for collateral damage from mixing with civilian populations dramatically increases the task complexity for a conventional force that must minimise the deaths of innocent civilians for any hope of strategic victory. By moving into complex terrain, such as urban or high density vegetation, the Multinational force's most expensive sensors (such as satellites and radar), designed to detect large scale movements, are ineffective due to the fine scale of physical activity, which provides a very low signal to noise ratio. In this situation, insurgents control the tempo and intensity of the conflict, which enables them to exploit niches in the fuzzy spaces near artificial boundaries, such as the traditional conceptual dichotomies of war/peace, combatant/non-combatant, state/non-state actors, and tactical/strategic operations, adding further to complexity. The ability to mass synchronised battlespace effects is of little use in such a complex situation.
6 Adaptive responses to asymmetric warfare
Clearly the conventional force requires a different organisation to respond to asymmetric powers, while still maintaining the ability to generate large scale effects when required. The force must be able to generate sufficient variety in effects at every scale, from peace-keeping and peace enforcement crises to high intensity warfighting, to achieve full spectrum dominance. Because the biggest capability gap is currently in asymmetric warfare, we will focus on this context. Currently within the Australian Army, units form Platoons or Troops, each with a specialist function, such as to detect, respond or sustain. The Platoons in a Battlegroup can then be combined to form Company level combat teams, and integration at lower levels is uncommon. In contrast, the Australian Complex Warfighting concept [6] outlines a new type of organisation to cope with increased complexity. Smaller, austere, semi-autonomous teams with modular organisation are envisaged. These teams use swarming tactics and devolved situation awareness to operate as self-reliant teams that aggregate to achieve larger scale effects through local coordination rather than central control. Teams would not be as specialised as current units, since each team must be largely self-reliant for logistics, sensing, decision-making and responding to threats. The difference between a logistics and a reconnaissance team for complex warfighting would be a matter of emphasis, since both would have a base capability for mobility, survivability, detection and response. Complex warfighting tasks such
as asymmetric warfare require a shift in conventional force structure towards special forces structures. Each of these changes is consistent with complex systems insights. However, the new organisation presents additional challenges. Whereas a centralised system promotes standardisation of equipment and process, semi-autonomous teams will deliberately promote variety. As well as reflecting differences in local context, the variety will exist because some teams will discover successful strategies that are unknown to other teams. Therefore, it is necessary to promote the spread of successful strategies between autonomous teams to improve overall force effectiveness. However, if successful variations are adopted too readily, the reduction in variation between teams will diminish the force's ability to adapt. There exists a tradeoff between being adapted (specialised) to the current environment, and adaptability for future contexts. Another challenge associated with semi-autonomous teams is ensuring that the goals of the autonomous teams are aligned with force level goals. Abu Ghraib is an example where the pursuit of local goals (extracting intelligence) produced extremely damaging strategic effects. The impact of socio-cultural issues on strategic success is clear in this example. The interplay between increasingly ubiquitous mass media and populations that value surprise in news (which Shannon's theory accounts for) acts to reinforce the perception of anomalies, and can also reinforce the public response. For armed forces, the more independent teams become, the more likely it is that at least one team will develop a culture that reinforces strategically counterproductive behaviour. The role of higher level headquarters that manage semi-autonomous teams will be to clearly communicate and then police the boundaries of acceptable behaviour, within which autonomous teams have freedom to innovate.
This kind of force structure places very different demands on the capability development process. Whereas a conventional force demands standardisation so that effects can be synchronised and the output of units is predictable, complex warfighting teams will be heterogeneous and have varied demands for materiel. An industrial age approach to capability development tailored to large scale production of standardised materiel is unsuited to meeting the fine scale but complex demands of the new force structure. In contrast, the complex systems engineering model introduced in Section 3 enables a much more responsive development process. Individual autonomous teams could test new ways to sense and respond to asymmetric threats, enabling them to adapt at the secondary level. From the capability development perspective, semi-autonomous teams provide an ideal entry point and realistic test bed for new systems, and allow the coexistence of legacy and experimental systems discussed in Section 3. If the system provides a significant benefit, demand for it will quickly spread through the network of teams. If a new system has a negative effect on one team's performance, at least it will have a negligible effect on overall force performance due to the autonomy of the teams. In order to be effective, the adaptive response must occur at all levels. At the strategic level, the use of complex warfighting teams to deny targeting success
is only one available strategy in a space that includes economic, political and information operations. In this space, the effects of taking different sets of actions are typically far less certain, and first and second order adaptive cycles (adaptation applied to the process of adaptation) can play a crucial role in identifying and exploiting useful sets of actions.
7 Conclusion
Asymmetric warfare presents a challenge of increased fine scale complexity. Current force structures are monolithic bears designed for large scale effects, and must be reorganised by devolving autonomy and increasing independence to provide sufficient fine scale variety. Once variety is available, the best way to cope with complexity is using adaptation, which can improve system performance over time and track changes in the environment. Complex systems theory has the potential to improve the way military capability is acquired, organised and managed, enabling adaptive responses to asymmetric warfare.
Bibliography
[1] ASHBY, W. R., An Introduction to Cybernetics, Chapman & Hall, London, UK (1956).
[2] BAR-YAM, Y., "Enlightened Evolutionary Engineering/Implementation of Innovation in FORCEnet", Tech. Rep., Office of the Chief of Naval Operations Strategic Studies Group (2002).
[3] BAR-YAM, Y., "Unifying Principles in Complex Systems", Converging Technology (NBIC) for Improving Human Performance (M. C. ROCO and W. S. BAINBRIDGE eds.), Kluwer, Dordrecht, The Netherlands (2003).
[4] BAR-YAM, Y., "Multiscale Complexity/Entropy", Advances in Complex Systems 7 (2004), 47-63.
[5] BAR-YAM, Y., "Multiscale Variety in Complex Systems", Complexity 9, 4 (2004), 37-45.
[6] KILCULLEN, LTCOL D., Complex Warfighting, Commonwealth of Australia (2004).
[7] LANCHESTER, F. W., Aircraft in Warfare: The Dawn of the Fourth Arm, Constable and Co., London, UK (1916).
[8] NORMAN, D. O., and M. L. KURAS, "Engineering Complex Systems", Technical report, MITRE (2004).
[9] SMITH, E. A., Effects Based Operations: Applying Network Centric Warfare in Peace, Crisis, and War, CCRP Publication Series (2002).
Chapter 18
Improving Decision Making in the Area of National and International Security: the Future Map methodology
Donald Heathfield
FutureMap [email protected] 1. Summary This article proposes an approach to improving decision making in the area of national and international security by building a comprehensive map of the organization's future environment. The Future Map provides a platform for leveraging numerous internal and external contributors, drives a "strategic conversation" across the organization, and links strategy, intelligence and learning. The Future Map methodology enables organizations to better address the challenges of decision making in the complex and continuously changing global political and security environments.
2. Introduction The capacity of organizations working in the national and international security area to make good decisions depends to a large degree on their ability to anticipate the behaviors of complex social and political systems. Our far from perfect understanding of these systems, the uncertainty and ambiguity associated with their future developments, and the conflicting views and experiences of multiple stakeholders further complicate this work. To be successful in dealing with global security challenges, a proactive, anticipatory approach, or what Leon Fuerth, former National Security Adviser to Vice President Gore, calls "forward engagement," is critical. Such engagement is difficult to imagine
without the organization's commitment to "future preparedness," which involves a systematic exploration of its future environment and the building of capacities to meet coming challenges. From a strategic perspective, it may be more important whether organizations have processes and tools for continuously improving their understanding of their environment than how well they know it today. The lack of a systematic, organization-wide approach to "future preparedness" is one of the greatest hurdles that prevent organizations from adopting a more proactive strategic posture. How can we help organizations build a holistic, synthetic view of their future environment? How can we encourage decision makers to continuously question their assumptions and strategies in response to the rapid changes and uncertainty typical of social and political systems? How can we mobilize and leverage the knowledge that already exists within an organization or is available to its global networks?
3. Challenges of decision making in a complex environment The challenges of decision making in the international security area arise from the complex nature of social and political systems as well as from the internal complexity of governments. The relations between the factors that drive changes in the security environment are often difficult to understand and quantify. The "human" factor introduces additional elements of surprise. While many decisions carry potentially high political and economic costs, they have to be made fast and on the basis of information that is rarely complete and often ambiguous. Decision makers need to reconcile multiple, frequently conflicting internal perspectives. They have to consider options and strategies derived from organizational processes that are based on different methodologies and information. Short-term operational requirements may not be entirely consistent with long-term strategic goals. Personal agendas, experiences and value judgments play an important role, while distance, cultural backgrounds and organizational procedures create potential for "blind spots." The nature of the international security environment defines the need for tools that have the potential to offer the most value for decision makers in managing uncertainties and risks [Van der Steen, 2005.] If we draw a chart that plots the degree of predictability of the future environment against the rate of change, the following three situations appear (see Figure 1.) Sudden changes whose nature we do not understand may be treated as random, as they cannot be anticipated. In such cases decision makers should focus their attention on developing the capacity of their organizations to withstand the impact of the changes and to recover from their consequences. Building organizational resilience is the only strategy.
Predictable and gradual changes present a relatively simple challenge that organizations can handle using traditional strategic planning approaches. The
development of robust models of the environment will go a long way towards improving their preparedness.
[Figure 1: a chart plotting Understanding and Predictability (from High to Low) against Rate of Change (from Gradual to Rapid), with three regions along the diagonal: Simple, Complex, and Random.]
Figure 1. The space between the simple and random is by definition a domain of complexity [Wikipedia.]

Dealing with relatively predictable but rapid changes requires strong implementation and coordination capabilities, while relatively slow but ambiguous developments will challenge the organization's capacity to make sense and adapt. Whether the nature of change is well understood or not, the ability of organizations to shorten the time between the recognition of first warning signals and the implementation of decisions is critical. Success depends on how fast the whole organization can see the "big picture" and change its ways of working. Organizations need a rigorous "future preparedness" process that incorporates the following strategies:

• Develop, in the absence of reliable models, the ability to learn about the behavior of the "target" systems by using a "trial and error" method: collectively map the anticipated future environment, comparing as many perspectives as possible; question the underlying assumptions against the unfolding reality; adjust the picture of the expected future based on new insights.

• Improve the capacity to meet unanticipated challenges more effectively: use collective intelligence to make sense of new developments; mobilize people by involving them in action planning and tracking the progress towards the objectives.
Both strategies require a shared "working picture" of the future that can be built, tracked, and debated by many people across the organization.
4. Requirements for the Picture of the Future The shared picture of the future must fulfill the following fundamental requirements:

1. Bridge the methodological gap between future-related techniques and processes and established "core" organizational tools and processes, thus making future preparedness a daily activity linked to established reporting and measurement tools such as organizational scorecards. [Kaplan and Norton, 1996.] Organizations need to track not only the external environment, but also their internal preparedness.

2. Connect the four elements - future exploration, early warning, communication, and use of information about the future - as parts of one comprehensive system, a unique depository of all future-related knowledge, as well as a shared space for exchanges, analyses, debates and learning.

3. Integrate both long- and short-term preparedness capacities - be reactive enough to offer real time intelligence along with strategic foresight, to avoid the competition for executive attention between their daily needs and the "big picture," recognizing that both are essential to strategic success.

4. Ensure a highly intuitive visual interface, in order to be able to communicate the situation and key messages quickly, and speak the language of decision makers in a way that conveys highly strategic analysis along with pure facts.
5. The Event Field

How can we assemble such a picture from a multitude of facts and opinions? How can we present the future in a way that facilitates decision-making? How can we integrate different types of information coming from diverse sources into one picture? The picture of the future needs to be based on a common denominator that does not depend on the methodology, source or nature of the information used. All future scenarios, objectives, decisions, and warning signals are events that we anticipate will happen. Using the anticipated event as the "common denominator," we can recreate our future environment, both long-term and short-term. We believe that we need to put in front of decision makers the field of anticipated events with which the organization will have to deal in the future. This Event Field is the product of our collective anticipation; it is the picture that appears on our collective future "radar screen." Our goal is to create the organization's Event Field and then continuously question and clarify this picture using all means available. Our purpose is to create a learning process for decision makers that focuses them on future preparedness. By introducing the Event Field, we are shifting the focus of discussion, as Arie de Geus proposed, from "whether something will happen" to "what would we do, if it happened" [de Geus, 1999]. The process of constructing and assessing the organization's future Event Field invites the participants to "live in the future," going back and forth in time, "trying on" various
alternative futures. It forces decision makers to take an important step from "future awareness" to "future appropriation." This process is consistent with observations about how people unconsciously prepare for the future made by the Swedish neurobiologist David Ingvar and described in his article "The Memory of the Future," published in 1985. According to Ingvar, a part of the human brain "is constantly occupied with making up action plans and programs for the future," making "alternative time paths into the future," and "storing these alternative time paths." This "memory of the future" helps us to establish a "correspondence between incoming information and one of the stored alternative time paths," perceiving its "meaning." It also allows us to filter out irrelevant information that has no meaning for any of the "options for the future which we worked out" [quoted from de Geus, 1999]. By creating a similar process in an organization, we can help its leaders visualize strategic paths into the future, identify challenges and opportunities, and develop and effectively execute strategies.
6. Construction of a Future Map

A field of anticipated events can be built for any domain or entity, whether an enterprise or a country. Following the practice of social scientists, we define an anticipated event as Who will do What to Whom/with Whom, When and Why. The Event Field stretches as far into the future as available information allows. In the Event Field, anticipated events are arranged on a timeline, in columns that represent the key factors that are likely to impact the future of the organization (see Figure 2).
Figure 2: Event Field structure. Anticipated events are arranged along a timeline (past, today, future) in columns representing factors and subfactors.
Factors that influence the future may represent the external environment as well as the internal situation of the organization. The main factors can be divided into subfactors until the right level of detail is achieved. From a methodological point of view, the development of the Event Field is consistent with the principles of morphological analysis, a "method for rigorously structuring and investigating the total set of relationships in inherently non-quantifiable socio-technical problem complexes," introduced in its modern form by Fritz Zwicky and further advanced by Tom Ritchey and others [Ritchey, 2005]. The Event Field integrates into the two-dimensional "Time-Factors" framework information from all contributors inside and outside the organization: internal early warning networks, scenario development groups, external expert communities, newswire reports, RSS feeds, trend forecasts and automatic extraction tools. It creates a space into which the information from all sources can be continuously deposited. The Event Field methodology establishes a step-by-step process for constructing the Event Field in the domain and then continuously updating and assessing it to extract useful knowledge for decision-makers. It defines the framework for identifying and assessing risks, impacts and probabilities associated with anticipated events. It offers a structured approach to the collection of information about the driving forces in the domain, identifying uncertainties and developing multiple scenarios. To ensure consistency, the Event Field is built from two opposite directions at the same time: from the long-term future towards the present by tracking scenario milestones, and from the present towards the future by converting observed trends into anticipated events. As forecasts made using different independent methodologies are translated into events, we can cross-check and refine our assumptions and expectations.
Once the future Event Field for the organization (or a domain) has been created, we can extract from it a number of Future Maps. These Future Maps are "snapshots" of the Event Field that represent the future seen through the lens of different perspectives. Each Future Map can convey a particular scenario, a vision of a particular group of experts, an extract from a specific type of source, or a picture with a specific time horizon.
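As an illustration only, the event-as-common-denominator idea and the extraction of Future Map snapshots can be sketched as a small data structure. All names here (AnticipatedEvent, future_map, the sample factors) are our own hypothetical choices for the sketch, not part of the methodology's actual tooling:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class AnticipatedEvent:
    """One cell of the Event Field: Who will do What to/with Whom, When and Why."""
    who: str
    what: str
    whom: str
    when: date            # position on the timeline
    why: str
    factor: str           # column of the Time-Factors grid
    source: str = ""      # contributor: expert panel, newswire, scenario group...
    scenario: Optional[str] = None

@dataclass
class EventField:
    """The shared field of anticipated events, continuously deposited into."""
    events: List[AnticipatedEvent] = field(default_factory=list)

    def add(self, event: AnticipatedEvent) -> None:
        self.events.append(event)

    def future_map(self, horizon: date, scenario: Optional[str] = None):
        """Extract a Future Map: a snapshot of the field filtered by time
        horizon and, optionally, by scenario, ordered along the grid."""
        selected = [e for e in self.events
                    if e.when <= horizon
                    and (scenario is None or e.scenario == scenario)]
        return sorted(selected, key=lambda e: (e.factor, e.when))
```

A Future Map with a short horizon would then return only the near-term slice of the field, while a scenario filter returns the events belonging to one alternative time path.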
7. Benefits of the Future Map Approach

A Future Map can be seen as an outcome, a picture of the anticipated future, as well as a continuous process of creating, updating, and assessing this picture to extract value from our combined knowledge. A Future Map serves as a unique repository of all information about the future of the organization. It bridges the gap between short-term trend extrapolations and long-term scenarios, helping decision-makers better understand the time horizon within which the majority of policy decisions need to produce results, from a few months to 3-5 years.
By offering the capacity to show a variety of visions of the future side-by-side, a Future Map provides a tool for evaluating scenarios, reconciling conflicting information, understanding alternative perspectives, identifying discontinuities and examining gaps in knowledge. By focusing on future milestones, it forces us to actively search for early indicators of emerging trends. By assessing the impacts of events across time and across domains, a Future Map encourages holistic thinking about the future, promotes more systematic and rigorous analysis of risks and challenges, and fosters discussions among all contributors. A Future Map facilitates the assessment of probabilities and expected impacts of events by providing "prediction market" technologies with structured sets of events to "bet" on. At the same time, a Future Map creates a dynamic scorecard by putting the organization's missions, strategic objectives and measuring milestones on the timeline and in the context of their security environment.
A Future Map complements several established management tools:

- Early Warning Systems: a Future Map enhances corporate early warning systems by linking strategy, intelligence and learning, and by connecting collection and assessment processes.
- The Balanced Scorecard: a Future Map puts corporate objectives and strategies on the timeline and within the context of the changing external and internal environment.
- Decision (Prediction) Markets: a Future Map creates sets of events to which decision market technology can be systematically applied, and enhances its value to decision-makers.
- Organizational Communication and Learning: a Future Map creates a common information repository and collaboration space, while giving decision makers an instant view of anticipated events and challenges.
A Future Map is conceived as a platform for making "the strategic conversation" in an organization permanent by connecting all pertinent contributors around a shared vision of future challenges. A Map's potential as a sharing tool, an integration tool, a learning tool, a strategy development tool, and a progress-tracking tool is greatest when it is used to leverage global collaboration. Using the Future Map software, remote contributors can create and update event fields for multiple domains, track changes, collectively assess probabilities, map and
discuss potential implications, develop and extract scenarios, plan strategic options and build progress scorecards (see Figure 3).
Figure 3: Future Map software screenshot showing an event field on a timeline (entries such as "Beijing Olympics," Nov-2007, Dec-2007).

By creating a foundation for systematic analysis of the future security environment and facilitating inter-organizational collaboration, a Future Map provides a tool for building a focused strategic preparedness process that links strategy, intelligence and learning, and for improving decision making in a complex security environment.
References

Van der Steen, M., 2005, Integrating Future Studies in Public Policy Making, Foresight, Innovation and Strategy, World Future Society, 220.
Wikipedia, 2006, Complexity, www.wikipedia.org.
Kaplan, R. & Norton, D., 2000, Having Trouble with Your Strategy? Then Map It, HBR, September-October.
De Geus, A., 1999, Strategy and Learning, Reflections, Volume 1, Number 1.
Ritchey, T., 2005, Future Studies using Morphological Analysis, adapted from the article for the UN University Millennium Project: Future Research Methodology Series, The Swedish Morphological Society.
Chapter 19
Comparison of chaotic biomagnetic field patterns recorded from the arrhythmic heart and stomach

Andrei Irimia, Michael R. Gallucci, John P. Wikswo Jr.
Living State Physics Laboratories, Vanderbilt University
andrei.irimia@vanderbilt.edu
We here investigate the time evolution of normal and arrhythmic cardiac and gastric biomagnetic signals using simultaneous magnetocardiography (MCG) and magnetogastrography (MGG). Noninvasive MCG/MGG recordings were acquired from ten anesthetized domestic pigs in the control (healthy) state. Thereafter, gastric arrhythmia was induced via surgical stomach division, which disrupted the natural periodicity of the gastric musculature activation cycle. After recording biomagnetic data in this state for one hour, cardiac arrhythmia was induced in each anesthetized pig, which allowed us to compare cardiac and gastric arrhythmia within the framework of an intra-subject animal model. Signal analysis revealed that several features are shared by cardiac and gastric arrhythmias, particularly with respect to the chaos content of the magnetic signal from each organ before and after the onset of pathophysiology. Our findings indicate that chaos phenomena in the gut, which have been investigated only recently, may be similar to those in the heart, which are better understood.
Funding was provided by NIH grants R01 DK58697 and DK58197 and by the New England Complex Systems Institute.
1 Introduction
In recent years, much progress has been made in the direction of turning chaos theory into a reliable tool for the characterization of cardiac pathophysiology, particularly in the context of fibrillation and heart failure [2, 5]. Investigations of magnetically-recorded chaotic patterns from the mammalian heart and gut are important for the elucidation of complex pathological states such as arrhythmia, ischemia and muscle injury. Whereas cardiac physiology has been studied extensively using nonlinear analysis methods, little has been done to investigate the nature of such conditions in the gastrointestinal (GI) system, where their clinical significance is nevertheless comparable. In the normal heart, the sinus node acts as a pacemaker, and abnormal rhythmicity in the node can lead to cardiac arrhythmia. In the case of the stomach, gastric electrical activity (GEA) possesses a pacemaker as well, which is located in the gastric antrum. There, the interstitial cells of Cajal impose periodic waves of cell membrane depolarization and repolarization that advance along the corpus of the stomach at a rate of 3-6 cycles per minute (cpm) in pigs. Each of these waves consists of a potential upstroke followed by a plateau and then by a sustained depolarization phase. In this study, we compare the time evolution of normal and arrhythmic cardiac and gastric biomagnetic signals using simultaneous magnetocardiography (MCG) and magnetogastrography (MGG) and discuss the similarities and differences between the two. Moreover, we quantify normality and pathology in the two organs using various measures, both visual (Lorenz attractors, 2D return maps) and quantitative (capacity dimension, correlation integral, etc.). Finally, we report statistically significant differences in these quantitative measures between the normal and pathological states.
2 Experimental protocol
Our study made use of two noninvasive techniques called magnetocardiography (MCG) and magnetogastrography (MGG). The use of MGG is arguably more advantageous than that of electrogastrography (EGG) because the quality and strength of recorded EGG signals are strongly dependent upon the permittivity of tissues, whereas MGG depends primarily on their permeability, which is approximately equal to that of free space. EGG signals are thus attenuated by the layers of fat and skin located between internal organs and the recording electrodes, while MGG does not suffer from this setback [6]. Because gastric biomagnetic fields are very weak (O(10^-12) T), signals were acquired using a Superconducting QUantum Interference Device (SQUID) biomagnetometer. The multichannel 637i SQUID biomagnetometer (Tristan Technologies Inc., San Diego, CA, USA) in the Vanderbilt University Gastrointestinal SQUID Technology (VU-GIST) Laboratory has detection coils located at the bottom of a dewar filled with liquid helium. The coils are magnetically coupled to the SQUID coils, which convert magnetic flux incident on the detection coils to voltage signals that are amplified and then acquired at 3 kHz. The detection
Figure 1: Sample normal and pathological magnetic signals for the heart (A1-A4) and stomach (B1-B4) acquired from a porcine subject. The first column contains normal signals while the second one displays pathological signals. A1 and B1 show normal raw magnetic field (Hz) data, while A2 and B2 show FICA-processed, artifact-reduced signals. Similarly, A3 and B3 display pathological raw magnetic field (Hz) signals, while A4 and B4 show FICA-processed, artifact-reduced pathological data.
coils are arranged in gradiometer format as a horizontal grid. Each anesthetized animal subject was placed horizontally under the SQUID inside a magnetically shielded room. Our protocol was approved by the Vanderbilt University Institutional Animal Care and Use Committee (VU-IACUC). The animal subject set consisted of 10 healthy domestic pigs (Sus domesticus) of approximately 20 kg each. Initial anaesthesia consisted of intravenous injections of Telazol, Ketamine and Xylazine, each at a concentration of 100 mg/ml. The dosage administered was 4.4 mg/kg Telazol, 2.2 mg/kg Ketamine and 2.2 mg/kg Xylazine. Each animal was intubated and maintained on isoflurane anaesthesia at a concentration of 1.5-2.5%. Because the extent of the SQUID input grid is comparable to the size of the animal's chest and abdomen, simultaneous MCG/MGG signals could be recorded. In each pig, the stomach was surgically divided (which led to gastric electrical source uncoupling) and post-division data were acquired. After one hour of post-surgery recording time, cardiac arrhythmia was induced in the pigs using an intravenous injection as data were being acquired. All animals were under monitored anaesthesia while this was done. The injected solution consisted of pentobarbital sodium at a concentration of 390 mg/ml and a dosage of 86 mg/kg (1 cc/10 lb). This procedure allowed us to record not only the abnormal gastric signals of each pig resulting from stomach division, but also the arrhythmic cardiac signals induced by the injection. The signals from the
two organs were recorded simultaneously.
3 Analysis methods
Both visual (attractors, Poincare return maps, etc.) and numerical (capacity and correlation dimensions, etc.) tools were employed to quantify the time evolution of chaos patterns during arrhythmia. First, we made use of fast independent component analysis (FastICA) to recover the cardiac and gastric sources of interest from the SQUID-recorded mixtures. In [4] we demonstrated the use of ICA for noninvasive MGG signal processing. The dimensionality of each data set was first reduced using principal component analysis (PCA), whereafter FastICA was applied. PCA is a technique that describes the variation of a multivariate data set in terms of a set of uncorrelated variables, each of which is a particular linear combination of the original variables [1]. The principal components (PCs) of PCA are linear combinations of the underlying variables in the data set that maximize the variance of each PC subject to an orthonormality constraint. We used the FastICA algorithm of Hyvärinen and Oja [3], which minimizes the mutual information between the random variables that define the separated signals of interest. A detailed description of our approach to implementing FastICA is available in [4].

Lorenz attractors were used to visualize our MCG/MGG data. These three-dimensional objects can be formally defined as the subspace of the total state space of a system to which the trajectory of the system converges after the initial transients have died out [7]. Differences in attractor characteristics between the healthy and pathological states were quantified numerically using four measures, namely the capacity dimension, information dimension, correlation dimension and correlation integral. A 3D attractor can be divided using a partition of boxes of edge length ε. If N(ε) is the minimum number of boxes required to cover the spatial extent of the attractor, the capacity dimension of the system can be defined as

C = \lim_{\varepsilon \to 0} \frac{\ln N(\varepsilon)}{\ln(1/\varepsilon)}.    (1.1)
The second measure that was used is the Balatoni-Renyi information dimension δ, which is a generalization of the capacity dimension concept that weighs each non-empty box i by its probability p_i:

\delta = \lim_{\varepsilon \to 0} \frac{1}{\ln(1/\varepsilon)} \sum_{i=1}^{N(\varepsilon)} p_i \ln(1/p_i).    (1.2)
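As a sketch of how the box-counting quantities in Eqs. (1.1) and (1.2) can be estimated in practice (our own minimal illustration, not the authors' implementation), one can count occupied boxes at several edge lengths ε and fit the scaling slopes:

```python
import numpy as np

def box_counting(points, eps):
    """Partition space into cubes of edge length eps; return the number of
    occupied boxes N(eps) and the occupation probabilities p_i."""
    idx = np.floor(points / eps).astype(int)       # box index of each point
    _, counts = np.unique(idx, axis=0, return_counts=True)
    p = counts / counts.sum()
    return len(counts), p

def capacity_and_information_dims(points, eps_values):
    """Estimate C as the slope of ln N(eps) versus ln(1/eps), and delta as
    the slope of sum_i p_i ln(1/p_i) versus ln(1/eps): finite-eps
    approximations of Eqs. (1.1) and (1.2)."""
    x, ln_n, info = [], [], []
    for eps in eps_values:
        n, p = box_counting(points, eps)
        x.append(np.log(1.0 / eps))
        ln_n.append(np.log(n))
        info.append(np.sum(p * np.log(1.0 / p)))
    c = np.polyfit(x, ln_n, 1)[0]
    delta = np.polyfit(x, info, 1)[0]
    return c, delta
```

For a trajectory that fills a two-dimensional surface, both estimates approach 2 as ε decreases; in practice, ε must stay large enough that the boxes remain well populated.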
Another popular measure is the correlation dimension ν,

\nu = \lim_{\varepsilon \to 0} \frac{\ln I(\varepsilon)}{\ln \varepsilon},    (1.3)
Figure 2: Sample return maps for SQUID signals recorded from the heart (A) and stomach (B) of a domestic pig. The first column contains maps created from normal data while the second one displays maps from pathological data. Each signal s(t) was normalized for simplicity with respect to max{s(t)} before generating the map.
where I(ε) is the correlation integral

I(\varepsilon) = \lim_{N \to \infty} \frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{j=1}^{N} \Theta(\varepsilon - |r_i - r_j|).    (1.4)

Above, Θ(x) is the Heaviside function (1 if x ≥ 0, 0 otherwise) and |r_i - r_j| is the distance between two points r_i and r_j.
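A direct finite-N estimate of Eqs. (1.3) and (1.4) can be sketched as follows (an illustrative implementation of ours, not the authors' code; the i = j self-pairs are excluded, consistent with the 1/(N(N-1)) normalization):

```python
import numpy as np

def correlation_integral(points, eps):
    """I(eps): fraction of ordered pairs (i != j) separated by less than eps."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    close = np.count_nonzero(d < eps) - n    # subtract the i == j zero distances
    return close / (n * (n - 1))

def correlation_dimension(points, eps_values):
    """nu: slope of ln I(eps) versus ln eps, per Eq. (1.3)."""
    ln_eps = np.log(eps_values)
    ln_i = np.log([correlation_integral(points, e) for e in eps_values])
    return np.polyfit(ln_eps, ln_i, 1)[0]
```

Points sampled along a curve give ν ≈ 1, while points filling a volume give ν ≈ 3; the pairwise distance matrix makes this O(N²) in memory, so long recordings must be subsampled.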
4 Results and discussion
Sample raw and ICA-processed MCG and MGG signals acquired from a porcine subject are shown in Figure 1. What can be concluded from the analysis of this figure is that the underlying properties of the two biological sources are more readily apparent from their respective ICs (A2, A4, B2, B4) than from the raw data traces (A1, A3, B1, B3). Although cardiac interference (A1) is a significant artifact in the gastric signals of (B1), its presence was reduced via FICA. In A3 and A4, arrhythmic heart signals are shown while B3 and
Figure 3: Examples of Lorenz attractors for heart (A) and stomach (B) signals. Both normal (1) and pathological (2) data were used, and normalization was applied as in the previous figure.
B4 display a tachygastric signal induced by stomach division. The differences between the normal and pathological signals of the heart are readily visible from our figure. Whereas the periodicity of the PQRST complex in the first case is normal ((A1) and (A2)), pronounced bradycardia is visible in (A3) and (A4). In the case of the gastric signals, the dominant GEA frequency is approximately 3 cpm in (B1) and (B2), whereas (B3) and (B4) show a tachygastric rhythm of approximately 4.5 cpm. Sample Poincare return maps are displayed in Figure 2 and examples of Lorenz attractors created from our data are shown in Figure 3. The normal cardiac attractor (A1) has a characteristic shape due to the highly rhythmic pattern of the heart signal. This feature is disrupted in the arrhythmic state, which is also reflected in the attractor (A2), which has a more irregular shape. Comparing normal (B1) and abnormal (B2) gastric signals, one can see that a larger amount of chaos is present in the abnormal case (B2) compared to (B1). To ascertain the reliability of our quantitative measures, we analyzed their convergence as a function of ε. The results of this analysis for one case are shown in Figure 4. There, the definition used for the percentage error between two successive (previous vs. current) values of each measure was (current - previous) × 100 / max{previous, current}. Values on the abscissa are shown in units of
Figure 4: Convergence of the capacity dimension C, information dimension δ, correlation dimension ν and correlation integral I as a function of the number of volumetric boxes used for their computation (see text for details). The definition used for the percentage error between two successive (previous vs. current) values of each measure was (current - previous) × 100 / max{previous, current}. Values on the abscissa are shown in units of ln(% error) + 1, such that no error (perfect agreement) corresponds to the horizontal line y = 1, which is also drawn.
ln(% error) + 1, such that perfect agreement corresponds to the horizontal line y = 1. As the figure shows, all four measures were found to converge, albeit at different rates. The convergence of ν was found to be rapid, although oscillatory behavior was found to exist when more than 5 × 10^6 boxes were used for its computation. On the other hand, both C and δ were found to converge smoothly for this example, although much more slowly than ν. The correlation integral I, which is the most computationally intensive measure, was found in general to have a predictable behavior. For most of our data samples, δ was found to best reflect the differences between the healthy and pathological states. Because of this, we focus on the behavior of this parameter throughout the remainder of our discussion. In the case of MGG recordings from normal subjects, because GEA parameters such as frequency and amplitude are approximately constant in time, δ was found to be well behaved, with a normalized variance (σ(δ)/⟨δ⟩) of 0.12 across subjects. In the case of pathology, arrhythmia was found to cause abrupt and frequent changes in GEA parameters, which was reflected in the normalized variance of δ having a value of 0.84 across subjects. These differences in δ were found to be statistically significant (p < 0.001). Normalized variances are reported here instead of absolute numbers because the σ statistic was computed across subjects, where large inter-subject differences in δ were found, although the time behavior of this parameter was found to be very similar in all cases.
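The normalized variance reported above is the across-subject standard deviation of δ divided by its mean. As a trivial sketch of our own (not the authors' analysis code):

```python
import numpy as np

def normalized_variance(delta_values):
    """sigma(delta)/<delta>: spread of delta normalized by its mean, which
    absorbs the large inter-subject offsets mentioned in the text."""
    delta_values = np.asarray(delta_values, dtype=float)
    return np.std(delta_values) / np.mean(delta_values)
```

A nearly constant δ yields a small value, as in the normal recordings, while abrupt changes inflate it, as in the arrhythmic ones.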
5 Conclusion
In conclusion, we have compared cardiac and gastric biomagnetic signals as recorded by simultaneous magnetocardiography and magnetogastrography. It was found that statistically significant differences in the variances of the parameter δ exist between the healthy and pathological states of the stomach, and that these differences are also reflected by the visualization modalities that were presented. In the case of visual measures, distinguishable differences in attractor shape were found between the healthy and pathological states. It is our hope that further study of such differences may one day help us develop novel methods for the noninvasive characterization of gastric disease.
6 Acknowledgments
Research funding was provided by NIH grants R01 DK58697 and DK58197 (AI and MRG) and HL58241 (JPW). We are grateful to the Graduate School of Arts and Sciences at Vanderbilt University for covering travel expenses for the first author. We also wish to thank Phil Williams and Amy Grant from the Surgical Research Division at the Vanderbilt University Medical Center for their assistance with our animal experiments.
Bibliography

[1] BS Everitt and G Dunn (1992) Applied multivariate data analysis, Oxford University Press, New York, NY, 45-64
[2] AL Goldberger, LAN Amaral, JM Hausdorff, PC Ivanov, CK Peng, HE Stanley (2002) Fractal dynamics in physiology: alterations with disease and aging, Proc Natl Acad Sci USA, vol 99, 2466-2472
[3] A Hyvärinen and E Oja (1997) A fast fixed-point algorithm for independent component analysis, Neural Comput, vol 9, 1483-1492
[4] A Irimia, LA Bradshaw (2005) Artifact reduction in magnetogastrography using fast independent component analysis, Physiol Meas, vol 26, 1059-1073
[5] PC Ivanov, LAN Amaral, AL Goldberger, S Havlin, MG Rosenblum, ZR Struzik, HE Stanley (1999) Multifractality in human heartbeat dynamics, Nature, vol 399, 461-465
[6] MP Mintchev, A Stickel, KL Bowes (1998) Dynamics of the level of randomness in gastric electrical activity, Digest Dis Sci, vol 43, 953-956
[7] CJ Stam (2005) Nonlinear dynamical analysis of EEG and MEG: review of an emerging field, Clin Neurophysiol, vol 116, 2266-2301
Chapter 20
Complex Networks in Different Languages: A Study of an Emergent Multilingual Encyclopedia*

F. Canan Pembe¹ and Haluk Bingöl²

¹ Dept. of Computer Engineering, İstanbul Kültür University, 34156 Bakırköy, İstanbul, Turkey
² Dept. of Computer Engineering, Boğaziçi University, 34342 Bebek, İstanbul, Turkey
There is an increasing interest in the study of complex networks in an interdisciplinary way. Language, as a complex network, has been a part of this study due to its importance in human life. Moreover, the Internet has also been at the center of this study by making access to large amounts of information possible. With these ideas in mind, this work aims to evaluate conceptual networks in different languages with data from a large and open source of information on the Internet, namely Wikipedia. As an evolving multilingual encyclopedia that can be edited by any Internet user, Wikipedia is a good example of an emergent complex system. In this paper, different from previous work on conceptual networks, which usually concentrated on single languages, we concentrate on possible ways to compare the usages of different languages and possibly the underlying cultures. This also involves the analysis of
* This work is partially supported by Boğaziçi University Research Fund under the grant number 06A105.
local network properties around certain concepts in different languages. For an initial evaluation, the concept "family" is used to compare the English and German Wikipedias. Although the work is currently at an early stage, the results are promising.
1 Introduction

The study of complex networks has started to receive great attention in recent years [Newman 2003]. Some of the types of networks most studied include the Internet, social networks and biological networks. Language has been another interesting area of this research. There is some work in the literature regarding language as a complex network. In [Motter 2002], a linguistic network is formed by connecting words about similar concepts using thesaurus data, and this network is shown to be both small-world and scale-free, which are two of the properties commonly used in describing complex networks. In another such work, data from two online dictionaries are used to form conceptual networks [Batagelj 2002]. In that work, the network is constructed by connecting the entries to other entries used in their definition. The previous work on this subject mostly concentrated on single languages. The aim of this work is to investigate such conceptual networks in different languages and use some network properties to compare the usage or importance of the concepts in different languages. One of the prerequisites of this work was to find suitable data to construct the networks. There are several dictionaries and thesauri available in different languages, some of which are available in electronic format, including the English WordNet [Fellbaum 1998], the Turkish WordNet [Bilgin 2004], dictionaries in several languages, etc. The problem with using such data sources in a work analyzing and comparing different languages is that the format, content and completeness of each dictionary may be different, because these dictionaries may have been created by different organizations with different purposes in mind. The data to be used in such a work should be comparable both in coverage and format for the different languages involved. Therefore, the Wikipedia project (http://www.wikipedia.org) is selected as the data source for the analysis. The rest of the paper is organized as follows.
The reason to select Wikipedia as the data source and its properties are given in Section 2. This is followed by the methodology of the network construction in Section 3 and some of the initial results obtained for the English and German Wikipedias in Section 4. Then, the conclusion is given in Section 5.
2 Wikipedia

Wikipedia [Wikipedia 2006], started in 2001, is a multilingual web-based encyclopedia project and is considered one of the greatest inventions on the World Wide Web after Google (http://www.google.com). Some of the most important properties of this encyclopedia are its free content and its openness. Any Internet user can create new entries or edit existing ones without even the necessity to register. This openness results in a dynamically evolving and developing source of data. There are almost no restrictions on the editing of the encyclopedia entries. Any user can even delete the passages written by others. In such a free environment, the control of the encyclopedia is maintained in a distributed fashion. This is achieved by making all the versions regarding the edits publicly available in addition to the current version of each entry. Users can correct wrongly written entries or recover intentionally deleted passages owing to the availability of versions. In this way, the encyclopedia evolves in a natural and fast way. Due to these properties, Wikipedia is a good example of a complex and emergent system and an interesting data source for research. Using Wikipedia as the data source in this project has several advantages. First, all the Wikipedias in different languages have the same format, where each encyclopedia entry corresponds to a page and users can make links to other entries from a page. This makes comparison across different languages possible. Second, the encyclopedia is being created from scratch and is open to everyone. As a result, it is a large and natural data source for obtaining results on different languages. Wikipedia is available in over 200 languages. Some of the Wikipedias are more active, including the English and German versions. In Table 1, some of the larger versions in different languages are given together with their current number of articles. In this work, the English and German Wikipedias are used as data.

Table 1: Some of the largest versions of Wikipedia
Language     Number of Articles
English      937,803
German       343,612
French       226,032
Polish       189,106
Japanese     174,476
3 Construction of the Networks

The methodology is to construct conceptual networks for Wikipedias in different languages and to comparatively analyze some of the properties of these networks. The data of both the English and German Wikipedias are obtained as two XML files. Each XML file contains all the entries together with their content in that particular language. These data are collected at different time points and made available for research purposes at the Wikipedia site. In this work, data dumps obtained in December 2005 are used for both English and German. Each entry page contains links to other entries which are embedded into the content. An example portion of an entry is given in Figure 1. The network can be formed in a natural way from these data by connecting each entry title to the entry titles hyperlinked within the content of that entry; e.g. the entry "complex network" is connected to the entry "degree distribution" in the example of Figure 1.
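The construction described above can be sketched in a few lines. The following is a minimal illustration, not the authors' actual pipeline: it uses a tiny hypothetical dictionary of entries in place of a real XML dump, extracts wiki-style `[[target]]` links with a regular expression, and builds a directed graph whose vertices are entry titles and whose edges are distinct links between entries.

```python
import re
from collections import defaultdict

# Hypothetical miniature dump: {entry title: wiki-markup content}.
pages = {
    "complex network": "See [[degree distribution]] and [[small-world network]]s.",
    "degree distribution": "A property of a [[complex network]].",
    "small-world network": "A class of [[complex network]]s.",
}

def build_network(pages):
    """Connect each entry to the entries hyperlinked in its content.

    Wiki links look like [[target]] or [[target|label]]; only the target
    part names the linked entry. Edges to unknown entries are skipped.
    """
    graph = defaultdict(set)
    link_re = re.compile(r"\[\[([^\]|#]+)")
    for title, content in pages.items():
        for target in link_re.findall(content):
            target = target.strip().lower()
            if target != title and target in pages:
                graph[title].add(target)  # directed edge: title -> target
    return graph

g = build_network(pages)
edges = sum(len(targets) for targets in g.values())
print(edges)  # number of distinct links between known entries
```

On the real dumps the same idea yields the vertex and edge counts reported in Table 2; here the toy graph has 3 vertices and 4 edges.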
[Screenshot of the Wikipedia entry "Complex network", showing wiki links embedded in the text]
Figure 1: Example portion from a Wikipedia entry
The dataset is large. In the networks built, entry titles correspond to vertices, whereas the edges correspond to distinct links from an entry page to other entry pages. Some of the properties of the dataset are given in Table 2. Table 2: Properties of the datasets used
                              English      German
Size of contents (GB)            3.71        1.45
Number of vertices          1,587,127     518,070
Number of edges            16,355,115   7,682,508
Average degree                   10.3        14.8
4 Analysis of English and German Wikipedias

In this work, some global properties of the Wikipedia networks, such as the diameter, are not calculated due to the enormous size of the networks and time restrictions. There is some research on Wikipedia showing that the degree distribution of the network has scale-free properties [Voss 2005]. In another work, Jon Kleinberg's HITS algorithm and Larry Page and Sergey Brin's PageRank algorithm are used to extract the most central vertices from the English Wikipedia network [Bellomi 2005]. However, research on Wikipedia is currently at a very early stage, and most of the work concentrates on the Wikipedia of a single language. The aim of this work is not to concentrate on the network properties of Wikipedia in a single language. Instead, the aim is to comparatively analyze conceptual networks in different languages, and this may also involve the analysis of local network properties around a certain concept in different languages. First, the degrees of vertices in the English and German Wikipedias are considered. The average degree of the vertices is given for both languages in Table 2. As seen, although the number of vertices, i.e. entries, in the German Wikipedia is much less than that of the English one, the average degree of vertices is higher. When the degrees of
individual vertices are considered, it is seen that they reach the 3000s in the English version and the 2000s in the German one. However, such large numbers usually occur for entries which are lists, in both languages, such as the entry with the title "List of airlines" and degree 1777. Next, the concept of "family" ("Familie" in German) is investigated in the Wikipedia networks for both languages. For this purpose, the 1-neighborhood and 2-neighborhood of the concept are considered. The number of distinct concepts in the neighborhoods of the concept and the clustering coefficient of "family" are given for both languages in Table 3. The clustering coefficient C2'(v) is calculated as in (1) and (2), where |E(G1(v))| is the number of edges among vertices in the 1-neighborhood of vertex v, |E(G2(v))| is the number of edges among vertices in the 1- and 2-neighborhood of v, and Δ is the maximum degree of a vertex in the network [Batagelj 2002]:

C2(v) = |E(G1(v))| / |E(G2(v))|    (1)

C2'(v) = (deg(v) / Δ) C2(v)    (2)
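A small sketch of this clustering coefficient may help. The graph below is a hypothetical toy example (the real Wikipedia network is directed and far larger; an undirected graph is used here purely for simplicity): edges are counted inside the 1-neighborhood and inside the combined 1- and 2-neighborhood, their ratio gives C2(v), and the correction deg(v)/Δ gives C2'(v).

```python
from itertools import combinations

# Hypothetical toy undirected graph as adjacency sets.
adj = {
    "family": {"parent", "child", "marriage"},
    "parent": {"family", "child"},
    "child": {"family", "parent"},
    "marriage": {"family", "law"},
    "law": {"marriage"},
}

def edges_among(vertices, adj):
    """Number of edges with both endpoints inside `vertices`."""
    return sum(1 for u, w in combinations(sorted(vertices), 2) if w in adj[u])

def c2_prime(v, adj):
    n1 = set(adj[v])                                    # 1-neighborhood of v
    n2 = (n1 | {w for u in n1 for w in adj[u]}) - {v}   # 1- and 2-neighborhood
    delta = max(len(adj[u]) for u in adj)               # max degree in network
    e1, e2 = edges_among(n1, adj), edges_among(n2, adj)
    c2 = e1 / e2 if e2 else 0.0                         # equation (1)
    return len(adj[v]) / delta * c2                     # equation (2)

print(c2_prime("family", adj))  # -> 0.5 for this toy graph
```

Here e1 = 1 (parent-child), e2 = 2 (parent-child, marriage-law), so C2 = 0.5, and deg("family") equals the maximum degree, so the correction factor is 1.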
The initial results regarding the concept "family" are interesting. First, it was stated that the network size of the English Wikipedia is much larger than that of the German Wikipedia, as given in Table 2. However, Table 3 shows that the number of concepts associated with the "family" concept is much larger in the German version than in the English one. Also, the "family" concept is more clustered in the German version. When considering the possible authors of the encyclopedia, the authors of the English version are expected to be more diverse; that is, they will include English-speaking Internet users from all around the world as well as those located in the U.S. or the U.K. In contrast, the number of authors of the German version of the encyclopedia is expected to be more limited when compared to the English version. This raises the question of whether the family concept is more important within German culture. Although there may be several factors behind these initial results, extending this work to other concepts or network properties may be promising. Table 3: Number of vertices in the 1- and 2-neighborhoods of "family" and the clustering coefficient of "family"
                                        English   German
Number of vertices in 1-neighborhood         44      100
Number of vertices in 2-neighborhood       2530     3373
Clustering coefficient (×1000)             0.03     0.20
5 Conclusion

In this paper, an emergent multilingual encyclopedia, Wikipedia, is investigated as a large and evolving complex system. Different from previous work on conceptual networks, which usually concentrated on single languages, we concentrated on possible ways to compare the usages of different languages and possibly the underlying cultures. For an initial evaluation, the concept "family" is used to compare the English and German Wikipedias. Although the work is currently at a very early stage, the results are promising. As future work, the investigated concepts may be extended, possibly with the use of categories; for example, all the concepts in the category "education".
Bibliography
[1] Batagelj, V., Mrvar, A., & Zaversnik, M., 2002, Network analysis of dictionaries, in Jezikovne tehnologije / Language Technologies, T. Erjavec, J. Gros eds., Ljubljana, 135.
[2] Bellomi, F., & Bonato, R., 2005, Network Analysis for Wikipedia, in Proceedings of Wikimania 2005, Frankfurt, Germany.
[3] Bilgin, O., Cetinoglu, O., & Oflazer, K., 2004, Morphosemantic Relations In and Across Wordnets: A Preliminary Study Based on Turkish, in Proceedings of the Global WordNet Conference, Masaryk, Czech Republic.
[4] Fellbaum, C. (Ed.), 1998, WordNet: An Electronic Lexical Database, The MIT Press.
[5] Motter, A. E., de Moura, A. P. S., Lai, Y.-C., & Dasgupta, P., 2002, Topology of the conceptual network of language, in Physical Review E, 65, 065102.
[6] Newman, M. E. J., 2003, The Structure and Function of Complex Networks, in SIAM Review, 45, 167.
[7] Voss, J., 2005, Measuring Wikipedia, in Proceedings of the 10th International Conference of the International Society for Scientometrics and Informetrics, Stockholm.
Chapter 21
Possible Chaotic Structures in the Turkish Language with Time Series Analysis

Gokhan Sahin
Department of MIS, Yeditepe University, Kayisdagi Caddesi, 34755 Kadikoy, Istanbul, Turkey
sahin@yeditepe.edu.tr

Murat Erentürk, Avadis Hacinliyan
Department of Physics, Yeditepe University, Kayisdagi Caddesi, 34755 Kadikoy, Istanbul, Turkey
[email protected]
The possibility of chaotic structures in Turkish and English texts, as well as the possibility of using the pseudo-invariants in a reconstructed phase space as identifying characteristics for languages, is investigated. Texts of length up to 83000 in both languages have been analyzed. Two alternatives for the dependent variable in a time series analysis have been used. Word frequencies based on a corpus have been one alternative, inspired by Zipf's law. The other alternative is based on assigning values to the letters in a word, as inspired by a random walk. A positive maximal Lyapunov exponent has been observed. Values of this exponent are different for the two languages. This, and differing detrended fluctuation analysis results for the two languages for either parametrization, imply that our analysis methods can point to differences in languages.
1 Introduction
The structure of natural languages has recently become an important field of research, following the observation that texts written in natural languages approximately obey the rules of fractal geometry [1, 2]. This behavior is characteristic of many other systems in nature, including many forms of music. On the other hand, time series analysis methods [3, 4] have become an important tool for analyzing fractal structures using a one-dimensional signal in time. Unfortunately, there are several ways in which one can generate a one-dimensional time signal from a literary text. A natural language is a hierarchy of structures that involve both a sound and a meaning. The simplest structures consist of the letters of the alphabet and the syllables; they contribute to the sound but not to the meaning. Structures higher in the hierarchy, such as words, sentences and paragraphs, contribute to the meaning. In order to be an effective communication medium, a language must be patterned. It is expected that the patterns will involve self-similarity [2, 5] and that there should be a certain characteristic distance or distances between words that significantly contribute to the meaning [6]. It would be tempting to see whether time series analysis can give us an estimate on both the possibility of self-similarity and the existence of such a window, since chaotic behavior is related to the long-time predictability of a system. The first issue in an attempt to analyze a natural language as a time series is constructing a time series from a text: a meaningful dependent variable is needed. Two different dependent variables have been used in this work: i) the frequencies derived from a corpus, and ii) a variable derived from values assigned to the letters constituting a given word. The resultant time series are analyzed via nonlinear time series analysis and detrended fluctuation analysis (DFA) [6].
DFA has become an important tool for analyzing long-term correlations in nonstationary time series [6, 7], such as the organization of DNA nucleotides, heartbeat time series, long-range weather forecasting, economic time series and solid-state dynamics [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]. DFA is reported to have advantages over conventional methods (e.g., spectral analysis and Hurst analysis). It permits the detection of intrinsic self-similarity embedded in a seemingly nonstationary time series, and also avoids the spurious detection of apparent self-similarity, which may be an artifact of extrinsic trends [20, 21, 22, 23, 24].
2 DETRENDED FLUCTUATION ANALYSIS
In order to compute the scaling exponent α of a nonstationary time series x(i) [i = 1, ..., N], the time series is first integrated [6]:

y(k) = Σ_{i=1}^{k} [x(i) − x̄]

Here x̄ is the average value of the series x(i), and i ranges between 1 and N.
y(k) is next divided into boxes of equal length n. A line y_n(k) is fitted to each box by a least-squares fit. As the next step, the time series is detrended by subtracting the local trend y_n(k) from the integrated series. The root-mean-square fluctuation of the detrended series, F(n), is computed as

F(n) = sqrt( (1/N) Σ_{k=1}^{N} [y(k) − y_n(k)]² )

F(n) is calculated for all n. The slope of the graph of log(F(n)) versus log(n) is the scaling exponent α. This slope α is related to the 1/f spectral slope m by the relation m = 2α − 1.
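The procedure above (integrate, fit a local trend per box, compute the rms fluctuation F(n), and fit log F(n) against log n) can be sketched directly. This is a minimal illustration of the DFA algorithm, not the authors' code; the box sizes and the white-noise test signal are arbitrary choices.

```python
import numpy as np

def dfa(x, box_sizes):
    """Detrended fluctuation analysis of a 1-D series x.

    Returns the scaling exponent alpha, i.e. the least-squares slope
    of log F(n) versus log n.
    """
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                    # integrated series y(k)
    fs = []
    for n in box_sizes:
        nbox = len(y) // n
        f2 = 0.0
        for b in range(nbox):
            seg = y[b * n:(b + 1) * n]
            k = np.arange(n)
            coef = np.polyfit(k, seg, 1)           # local linear trend y_n(k)
            f2 += np.sum((seg - np.polyval(coef, k)) ** 2)
        fs.append(np.sqrt(f2 / (nbox * n)))        # rms fluctuation F(n)
    alpha = np.polyfit(np.log(box_sizes), np.log(fs), 1)[0]
    return alpha

# Sanity check: uncorrelated white noise should give alpha close to 0.5.
rng = np.random.default_rng(0)
alpha = dfa(rng.standard_normal(4096), [8, 16, 32, 64, 128])
print(round(alpha, 2))
```

For long-range correlated series alpha exceeds 0.5, which is what distinguishes the text-derived series analyzed in the next section from randomized text.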
3 ANALYSIS OF TEXTS
As examples for the proposed analysis method, two Turkish and two English texts (referred to as Turkish text 1, Turkish text 2, English text 1, English text 2) were used. Turkish text 1 and English text 1 are independent of each other, whereas Turkish text 2 and English text 2 are translations of each other. The time series is constructed from all the texts both using the frequencies from the corpora and using a variable derived from values assigned to the letters constituting a given word. In the latter case, DNA random walks [25] served as the main source of inspiration. The methodology is as follows: each word is accepted as a single step in a random walk, and the length of each step (word) is determined from its letters as

s(n) = Σ_{i=1}^{N} y(i)

Here N is the length of the word and y(i) is the ASCII value of the corresponding letter. After the time series is constructed, the scaling exponent and Lyapunov exponent are found.
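The step-length construction is simple enough to show directly. The sketch below (a plain illustration of the formula, with an arbitrary sample phrase) maps each word of a text to the sum of the ASCII values of its letters, giving the random-walk-inspired time series.

```python
def word_walk(text):
    """Step lengths s(n): sum of ASCII codes of the letters of each word."""
    return [sum(ord(ch) for ch in word) for word in text.split()]

series = word_walk("the quick brown fox")
print(series)  # -> [321, 541, 552, 333]
```

The resulting list is the one-dimensional signal fed into the DFA and Lyapunov analyses; punctuation handling and case normalization are details the sketch leaves out.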
3.1 Analysis Using a Variable Derived from Letters
The detrended fluctuation analysis of English text 2 and Turkish text 2 (which are translations of each other) is presented below (Fig. 1). The breakdown in the slopes shows changes in the correlation properties. The slope of the Turkish text is approximately 10% higher. The difference between the correlation relations in the two regions for either analysis is clearly observed. In order to further verify that the correlation properties belong not to the texts but to the languages, the same analysis is applied to two Turkish texts with different contexts. Below is the result of this analysis, where the correlation properties of the two different Turkish texts are quite near each other (Fig. 2). The same analysis applied to the English texts is presented in the next figure (Fig. 3). Again, the correlation properties (the slopes) are similar.
Figure 1: log(F(n)) vs log(n) for Turkish and English texts (Turkish text 2: slopes 0.52 and 0.67; English text 2: slopes 0.48 and 0.57).
Figure 2: log(F(n)) vs log(n) for two Turkish texts.
As a last check, the correlation properties of a random text (in English) generated using steganographic ciphering [26] were checked, and it was found that the correlation properties of the randomized text do not resemble those of the English text (Fig. 4).
Figure 3: log(F(n)) vs log(n) for two English texts.
Figure 4: log(F(n)) vs log(n) for the English randomized text and the English text.
3.2 Analysis Using Corpus
Detrended fluctuation analysis is now applied to the same texts, using frequencies derived from a corpus. The results are therefore independent of the previous parametrization. The results for the different Turkish texts and for the different English texts all parallel those of the previous section; the figures are not included here due to space constraints.
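The corpus-based parametrization maps each word of a text to that word's frequency in a reference corpus. The sketch below illustrates the idea with a hypothetical toy corpus standing in for a real one; how out-of-corpus words are handled (here, mapped to 0) is an assumption of this sketch, not something the paper specifies.

```python
from collections import Counter

def frequency_series(text, corpus_counts):
    """Map each word of `text` to its frequency in the corpus,
    yielding the dependent variable for the time series analysis.
    Words absent from the corpus are mapped to 0 (an assumption)."""
    return [corpus_counts.get(w, 0) for w in text.lower().split()]

# Hypothetical toy corpus in place of a real one.
corpus = "the cat sat on the mat the cat ran"
counts = Counter(corpus.split())

print(frequency_series("the cat is on the mat", counts))  # -> [3, 2, 0, 1, 3, 1]
```

The resulting series is then analyzed exactly like the letter-derived one, via DFA and the maximal Lyapunov exponent; as the conclusion notes, this parametrization inherits any bias of the chosen corpus.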
TEXT             Lyapunov Exponent
English Text 1   0.012
English Text 2   0.018
Turkish Text 1   0.2
Turkish Text 2   0.2

Table 1: Maximal Lyapunov exponents calculated for the texts.
3.3 Lyapunov Exponents
The time series analysis is applied to the dependent variable derived by using the frequencies in the corpus. The first results imply significantly different magnitudes for the maximal Lyapunov exponents in the Turkish texts (of order 0.2) and in the English texts (of order 0.018). For a more reliable conclusion, more sets of data would be useful, as would collecting frequencies from a very large set of data and using the collected frequencies for the segments of data, in order to eliminate the corpus dependence. Still, there is evidence both for a positive maximal Lyapunov exponent and for a possible difference between the two languages (see Table 1).
4 CONCLUSION
Zipf's laws relate frequencies to rank, so frequencies based on a corpus are natural candidates for a dependent variable; unfortunately, this choice depends on the corpus. A random walk model is based on the progression from one word to the next, so it is relatively more localized. Detrended fluctuation analysis results reveal differences between the two languages. This is independent of the two parametrizations used in this work. Time series analysis using the corpus-based parametrization yields a positive maximal Lyapunov exponent. The value of this exponent is different in the two languages. This indicates that word frequencies are an important tool for applying time series analysis to a natural language. Words constitute one of the smallest units in a natural language that carry meaning. Both the Turkish and English languages show evidence of chaotic behavior when word frequencies are used. It is natural to expect a window of length up to eight words (the length of a sentence) where the meanings of nearby words are expected to affect each other to the greatest extent. We think that this is why a time series analysis based on frequencies is meaningful, while a time series analysis based on the random-walk-inspired model is relatively less feasible. It is known that sounds in all natural languages exhibit fractal structure, but the evidence presented here, along the lines that a time series can be derived from a written language, would be of interest in many fields including cryptography. There are indications that both word frequencies based on a corpus and a parametrization based on assigning values to letters can serve as blueprints
for distinguishing languages. However, we have used two languages that have significantly different grammatical structures. The different verb positions in the two languages, the presence of articles before nouns in English, and the much wider usage of suffixes in deriving Turkish words are examples of this difference. Finally, corpus quality is a factor that can also affect these results.
Bibliography
[1] Hrebicek, Ludek, Universitätsverlag Dr. N. Brockmeyer, Bochum, pp. 91-96, 1992.
[2] Hrebicek, Ludek, Quantitative Linguistics Vol. 566, Wissenschaftlicher Verlag Trier, 1995.
[3] Kantz, H. and T. Schreiber, Nonlinear Time Series Analysis, Cambridge University Press, Cambridge, 1997.
[4] Abarbanel, H.D.I., R. Brown, J.J. Sidorowich, L.S. Tsimring, The analysis of observed chaotic data, Revs. of Modern Phys., Vol. 65, No. 4, pp. 1331-1392, 1993.
[5] Hrebicek, Ludek, Quantitative Linguistics Vol. 48, Universitätsverlag Dr. N. Brockmeyer, Bochum, 1992.
[6] C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger, Phys. Rev. E 49, 1685, 1994.
[7] J.W. Kantelhardt, E. Koscielny-Bunde, H.H.A. Rego, S. Havlin, and A. Bunde, Physica A 295, 441, 2001.
[8] S.V. Buldyrev, A.L. Goldberger, S. Havlin, R.N. Mantegna, M.E. Matsa, C.-K. Peng, M. Simons, H.E. Stanley, Phys. Rev. E 51, 5084, 1995.
[9] C.-K. Peng, S.V. Buldyrev, A.L. Goldberger, R.N. Mantegna, M. Simons, H.E. Stanley, Physica A 221, 180, 1995.
[10] S.V. Buldyrev, N.V. Dokholyan, A.L. Goldberger, S. Havlin, C.-K. Peng, H.E. Stanley, G.M. Viswanathan, Physica A 249, 430, 1998.
[11] C.-K. Peng, J. Mietus, J.M. Hausdorff, S. Havlin, H.E. Stanley, A.L. Goldberger, Phys. Rev. Lett. 70, 1343, 1993.
[12] C.-K. Peng, S. Havlin, H.E. Stanley, A.L. Goldberger, Chaos 5, 82, 1995.
[15] C.-K. Peng, J.M. Hausdorff, S. Havlin, J.E. Mietus, H.E. Stanley, A.L. Goldberger, Physica A 249, 491, 1998.
[16] Y.H. Liu, P. Cizeau, M. Meyer, C.-K. Peng, H.E. Stanley, Physica A 245, 437, 1997.
[17] P. Cizeau, Y.H. Liu, M. Meyer, C.-K. Peng, H.E. Stanley, Physica A 245, 441, 1997.
[18] M. Ausloos, N. Vandewalle, P. Boveroux, A. Minguet, K. Ivanova, Physica A 274, 229, 1999.
[19] M. Ausloos, K. Ivanova, Physica A 286, 353, 2000.
[20] Buldyrev, S.V., A.L. Goldberger, S. Havlin, C.-K. Peng, H.E. Stanley and M. Simons, Biophys. J., Vol. 65, 2673, 1993.
[21] Ossadnik, S.M., S.V. Buldyrev, A.L. Goldberger, S. Havlin, R.N. Mantegna, C.-K. Peng, M. Simons, H.E. Stanley, Biophys. J., Vol. 67, 64, 1994.
[22] Hausdorff, J.M., C.-K. Peng, Z. Ladin, J.Y. Wei, A.L. Goldberger, J. Appl. Physiol., Vol. 78, 349, 1995.
[23] Hausdorff, J.M., P. Purdon, C.-K. Peng, Z. Ladin, J.Y. Wei, A.L. Goldberger, J. Appl. Physiol., Vol. 80, 1448, 1996.
[24] Hu, K., P.Ch. Ivanov, Z. Chen, P. Carpena, H.E. Stanley, "Effect of nonstationarities on detrended fluctuation analysis", Phys. Rev. E, Vol. 64, 011114, 2001.
[25] Alexandre Rosas, Edvaldo Nogueira, Jr., and Jose F. Fontanari, Physical Review E 66, 061906, 2002.
[26] http://www.fourmilab.ch/javascrypt/stego.html
Index of authors
Aguilar, Luis Angel - 227 Albino, Vito - 130 Alcazar, Javier - 519 Alderson, David - 74 Angus, Simon - 291 Antoniotti, Marco - 186 Araujo, Tanya - 430 Baras, John S. - 98 Basha, Nagi - 503 Bingol, Haluk - 612 Bonney, Dean J. - 535 Boppana, Krishna - 366 Bosse, Tibor - 42, 259 Bounova, Gergana - 323 Brandt, Kevin - 154 Brantle, Thomas F. - 106 Brunner, Hans-Peter - 580 Byeon, Jeewoong - 178 Bykovsky, Val K. - 34 Carbonara, Nunzia - 130 Cecere, Fred - 374 Chistilin, Dmitry - 382 Chow, Sam - 366 Coore, Daniel - 495 Csardi, Gabor - 90 Dancik, Garrett M. - 243 Danforth, Christopher M. - 339 Darneille, Robert - 374 DeRosa, Joseph - 548 Deng, Hong-Zhong - 66 Deshpande, Balachandra - 27 Dias, Manuel - 430 Doboli, Simona - 479 Dorman, Karin S. - 243 Doumit, Sarjoun - 511 Doursat, Rene - 203 Ebenhoh, Oliver - 211 Erdi, Peter - 90 Erentürk, Murat - 618 Fadel, Georges M. - 564 Fallah, M. Hosein - 106, 122 Fellman, Philip Vos - 114, 138, 162, 398, 422, 454 Fiddner, Dighton - 572 Flo, Rob - 307 Foster, Chad - 527 Gallucci, Michael R. - 604 Garcia, Ephrahim - 519 Gastner, Michael T. - 315 Gayle, Orrett - 495 George, Neena A. - 479 Giannoccaro, Ilaria - 130 Green, David G. - 58 Groothuis, Adam - 422 Gross, Laura K. - 339 Gulak, Yuriy - 50 Hacinliyan, Avadis - 618 Halpern, V. - 438 He, Jiang - 122 Heathfield, Donald - 596 Hoogendoorn, Mark - 235 Hovareshti, Pedram - 98 Irimia, Andrei - 604 Jones, Douglas E. - 243 Kanagarajah, Ashok Kay - 471 Kane, Keelan - 275 Kay, Nigel - 146 Kleinberg, Samantha - 186 Korotkikh, Victor - 19, 19 Kovacevic, Lazar - 194 Krokene, Paal - 251 LaFon, Christian - 366 Lacitignola, Deborah - 267 Lange, Holger - 251 Lapilli, C. M. - 446 Laverdure, Nate - 374 Leishman, Tania G. - 58 Lekkakos, Spyridon D. - 366 Lin, Aizhong - 347 Lindsay, Peter - 471 Lyell, Margaret - 307 Lyneis, James - 366 Maier, Jonathan R.A. - 564 Marczyk, Jacek - 27 Marks, Robert E. - 414 Matthaus, Franziska - 211 McCaughin, Keith - 548 McDonald, Diane M. - 146 McGowan, Clement - 374 Mejia-Tellez, Mateo - 307
Melamede, Robert - 219 Mertz, Sharon A. - 422 Mesjasz, Czeslaw - 170 Mesterton-Gibbons, Mike - 283 Miller, Anne - 471 Minai, Ali A. - 479, 487, 511 Mishra, Bud - 178, 186 Motyka, Matt - 564 Mysore, Venkatesh - 178 Narzisi, Giuseppe - 178 Nasrallah, Walid - 390 Økland, Bjørn - 251 Ozeki, Takeshi - 82 Parker, David - 471 Pembe, F. Canan - 612 Pfeifer, P. - 446 Polani, Daniel - 3 Post, Jonathan Vos - 114, 398, 454 Ramakrishnan, Naren - 186 Rebovich Jr., George - 556 Rinaldi, Matthew - 366 Riofrio, Walter - 227 Ryan, Alex - 588 Sabelli, Hector - 194 Sadedin, Suzanne - 58 Sadek, Adel - 503 Salam, Khan Md. Mahbubush - 299 Salazar, Carlos - 211 Sayama, Hiroki - 463 Schut, Martijn C. - 235 Sgorbati, Susan - 11 Sharpanskykh, Alexei - 42, 259
Sherratt, Tom N. - 283 Strandburg, Katherine - 90 Tadepalli, Satish - 186 Takahashi, Kazuyuki Ikko - 299 Tan, Yue-Jin - 66 Tebaldi, Claudio - 267 Thomas, Gerald H. - 275 Tobochnik, Jan - 90 Treur, Jan - 42, 235, 259 Venkat, Kumar - 406 Venuturumilli, Abhinay - 487 Wakeland, Wayne - 406 Wang, Zhiyong - 366 Webb, Mike - 540 Weber, Bruce - 11 de Weck, Olivier - 323, 366 Wexler, C. - 446 Whitney, Daniel E. - 74, 331 Wheeler, Paul - 366 Wikswo Jr., John P. - 604 Wiley, James B. - 347 Wilkinson, Ian F. - 347, 414 Wojcik, Leonard - 366 Wright, Roxana - 398 Wu, Jun - 66 Young, Louise - 414 Yu, Jun - 339 Zalanyi, Laszlo - 90 Zborovskiy, Marat - 366 Zhu, Da-Zhi - 66 Sahin, Gokhan - 618