Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen University of Dortmund, Germany Madhu Sudan Massachusetts Institute of Technology, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max-Planck Institute of Computer Science, Saarbruecken, Germany
5102
Marian Bubak Geert Dick van Albada Jack Dongarra Peter M.A. Sloot (Eds.)
Computational Science – ICCS 2008 8th International Conference Kraków, Poland, June 23-25, 2008 Proceedings, Part II
Volume Editors

Marian Bubak
AGH University of Science and Technology
Institute of Computer Science and Academic Computer Center CYFRONET
30-950 Kraków, Poland
E-mail: [email protected]

Geert Dick van Albada, Peter M.A. Sloot
University of Amsterdam, Section Computational Science
1098 SJ Amsterdam, The Netherlands
E-mail: {dick,sloot}@science.uva.nl

Jack Dongarra
University of Tennessee, Computer Science Department
Knoxville, TN 37996, USA
E-mail: [email protected]
Library of Congress Control Number: 2008928941
CR Subject Classification (1998): F, D, G, H, I, J, C.2-3
LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues
ISSN 0302-9743
ISBN-10 3-540-69386-6 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-69386-4 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. Springer is a part of Springer Science+Business Media springer.com © Springer-Verlag Berlin Heidelberg 2008 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 12322206 06/3180 543210
Advancing Science Through Computation
I knock at the stone's front door. "It's only me, let me come in. I've come out of pure curiosity. Only life can quench it. I mean to stroll through your palace, then go calling on a leaf, a drop of water. I don't have much time. My mortality should touch you."
Wislawa Szymborska, Conversation with a Stone, in Nothing Twice, 1997

The International Conference on Computational Science (ICCS 2008) held in Kraków, Poland, June 23–25, 2008, was the eighth in the series of highly successful conferences: ICCS 2007 in Beijing, China; ICCS 2006 in Reading, UK; ICCS 2005 in Atlanta; ICCS 2004 in Krakow, Poland; ICCS 2003 held simultaneously in Melbourne, Australia and St. Petersburg, Russia; ICCS 2002 in Amsterdam, The Netherlands; and ICCS 2001 in San Francisco, USA.

The theme for ICCS 2008 was "Advancing Science Through Computation," to mark several decades of progress in computational science theory and practice, leading to greatly improved applications in science. This conference was a unique event focusing on recent developments in novel methods and modeling of complex systems for diverse areas of science, scalable scientific algorithms, advanced software tools, computational grids, advanced numerical methods, and novel application areas where the above novel models, algorithms, and tools can be efficiently applied, such as physical systems, computational and systems biology, environment, finance, and others. ICCS 2008 was also meant as a forum for scientists working in mathematics and computer science as the basic computing disciplines and application areas, who are interested in advanced computational methods for physics, chemistry, life sciences, and engineering. The main objective of this conference was to discuss problems and solutions in all areas, to identify new issues, to shape future directions of research, and to help users apply various advanced computational techniques. During previous editions of ICCS, the goal was to build a computational science community; the main challenge in this edition was ensuring very high quality of scientific results presented at the meeting and published in the proceedings.

Keynote lectures were delivered by:
– Maria E. Orlowska: Intrinsic Limitations in Context Modeling
– Jesus Villasante: EU Research in Software and Services: Activities and Priorities in FP7
– Stefan Blügel: Computational Materials Science at the Cutting Edge
– Martin Walker: New Paradigms for Computational Science
– Yong Shi: Multiple Criteria Mathematical Programming and Data Mining
– Hank Childs: Why Petascale Visualization and Analysis Will Change the Rules
– Fabrizio Gagliardi: HPC Opportunities and Challenges in e-Science
– Pawel Gepner: Intel's Technology Vision and Products for HPC
– Jarek Nieplocha: Integrated Data and Task Management for Scientific Applications
– Neil F. Johnson: What Do Financial Markets, World of Warcraft, and the War in Iraq, all Have in Common? Computational Insights into Human Crowd Dynamics

We would like to thank all keynote speakers for their interesting and inspiring talks and for submitting the abstracts and papers for these proceedings.
Fig. 1. Number of papers in the general track by topic
The main track of ICCS 2008 was divided into approximately 20 parallel sessions (see Fig. 1) addressing the following topics:
1. e-Science Applications and Systems
2. Scheduling and Load Balancing
3. Software Services and Tools
4. New Hardware and Its Applications
5. Computer Networks
6. Simulation of Complex Systems
7. Image Processing and Visualization
8. Optimization Techniques
9. Numerical Linear Algebra
10. Numerical Algorithms
Fig. 2. Number of papers in workshops
The conference included the following workshops (Fig. 2):
1. 7th Workshop on Computer Graphics and Geometric Modeling
2. 5th Workshop on Simulation of Multiphysics Multiscale Systems
3. 3rd Workshop on Computational Chemistry and Its Applications
4. Workshop on Computational Finance and Business Intelligence
5. Workshop on Physical, Biological and Social Networks
6. Workshop on GeoComputation
7. 2nd Workshop on Teaching Computational Science
8. Workshop on Dynamic Data-Driven Application Systems
9. Workshop on Bioinformatics' Challenges to Computer Science
10. Workshop on Tools for Program Development and Analysis in Computational Science
11. Workshop on Software Engineering for Large-Scale Computing
12. Workshop on Collaborative and Cooperative Environments
13. Workshop on Applications of Workflows in Computational Science
14. Workshop on Intelligent Agents and Evolvable Systems
Fig. 3. Number of accepted papers by country
Selection of papers for the conference was possible thanks to the hard work of the Program Committee members and about 510 reviewers; each paper submitted to ICCS 2008 received at least 3 reviews. The distribution of papers accepted for the conference is presented in Fig. 3. ICCS 2008 participants represented all continents; their geographical distribution is presented in Fig. 4.

The ICCS 2008 proceedings consist of three volumes; the first one, LNCS 5101, contains the contributions presented in the general track, while volumes 5102 and 5103 contain papers accepted for workshops. Volume LNCS 5102 is related to various computational research areas and contains papers from Workshops 1–7, while volume LNCS 5103, which contains papers from Workshops 8–14, is mostly related to computer science topics. We hope that the ICCS 2008 proceedings will serve as an important intellectual resource for computational and computer science researchers, pushing forward the boundaries of these two fields and enabling better collaboration and exchange of ideas. We would like to thank Springer for fruitful collaboration during the preparation of the proceedings.

At the conference, the best papers from the general track and workshops were nominated and presented on the ICCS 2008 website; awards were funded by Elsevier and Springer. A number of papers will also be published as special issues of selected journals.
Fig. 4. Number of participants by country
We owe thanks to all workshop organizers and members of the Program Committee for their diligent work, which ensured the very high quality of ICCS 2008. We would like to express our gratitude to Kazimierz Wiatr, Director of ACC CYFRONET AGH, and to Krzysztof Zieliński, Director of the Institute of Computer Science AGH, for their personal involvement. We are indebted to all the members of the Local Organizing Committee for their enthusiastic work towards the success of ICCS 2008, and to numerous colleagues from ACC CYFRONET AGH and the Institute of Computer Science for their help in editing the proceedings and organizing the event. We very much appreciate the help of the computer science students during the conference. We owe thanks to the ICCS 2008 sponsors: Hewlett-Packard, Intel, Qumak-Sekom, IBM, Microsoft, ATM, Elsevier (Journal Future Generation Computer Systems), Springer, ACC CYFRONET AGH, and the Institute of Computer Science AGH for their generous support. We wholeheartedly invite you to once again visit the ICCS 2008 website (http://www.iccs-meeting.org/iccs2008/), to recall the atmosphere of those June days in Kraków.
June 2008
Marian Bubak G. Dick van Albada Peter M.A. Sloot Jack J. Dongarra
Organization
ICCS 2008 was organized by the Academic Computer Centre Cyfronet AGH in cooperation with the Institute of Computer Science AGH (Kraków, Poland), the University of Amsterdam (Amsterdam, The Netherlands) and the University of Tennessee (Knoxville, USA). All the members of the Local Organizing Committee are staff members of ACC Cyfronet AGH and ICS AGH.
Conference Chairs

Conference Chair: Marian Bubak (AGH University of Science and Technology, Kraków, Poland)
Workshop Chair: Dick van Albada (University of Amsterdam, The Netherlands)
Overall Scientific Co-chair: Jack Dongarra (University of Tennessee, USA)
Overall Scientific Chair: Peter Sloot (University of Amsterdam, The Netherlands)
Local Organizing Committee

Kazimierz Wiatr
Marian Bubak
Zofia Mosurska
Maria Stawiarska
Milena Zając
Mietek Pilipczuk
Karol Frańczak
Sponsoring Institutions

Hewlett-Packard Company
Intel Corporation
Qumak-Sekom S.A. and IBM
Microsoft Corporation
ATM S.A.
Elsevier
Springer
Program Committee J.H. Abawajy (Deakin University, Australia) D. Abramson (Monash University, Australia)
V. Alexandrov (University of Reading, UK) I. Altintas (San Diego Supercomputer Centre, UCSD, USA) M. Antolovich (Charles Sturt University, Australia) E. Araujo (Universidade Federal de Campina Grande, Brazil) M.A. Baker (University of Reading, UK) B. Bali´s (AGH University of Science and Technology, Krak´ ow, Poland) A. Benoit (LIP, ENS Lyon, France) I. Bethke (University of Amsterdam, The Netherlands) J. Bi (Tsinghua University, Beijing, China) J.A.R. Blais (University of Calgary, Canada) K. Boryczko (AGH University of Science and Technology, Krak´ ow, Poland) I. Brandic (Technical University of Vienna, Austria) M. Bubak (AGH University of Science and Technology, Krak´ ow, Poland) K. Bubendorfer (Victoria University of Wellington, New Zealand) B. Cantalupo (Elsag Datamat, Italy) L. Caroprese (University of Calabria, Italy) J. Chen (Swinburne University of Technology, Australia) O. Corcho (Universidad Politcnica de Madrid, Spain) J. Cui (University of Amsterdam, The Netherlands) J.C. Cunha (University Nova de Lisboa, Portugal) S. Date (Osaka University, Japan) S. Deb (National Institute of Science and Technology, Berhampur, India) Y.D. Demchenko (University of Amsterdam, The Netherlands) F. Desprez (INRIA, France) T. Dhaene (Ghent University, Belgium) I.T. Dimov (University of Reading, Bulgarian Academy of Sciences, Bulgaria) J. Dongarra (University of Tennessee, USA) F. Donno (CERN, Switzerland) C. Douglas (University of Kentucky, USA) G. Fox (Indiana University, USA) W. Funika (AGH University of Science and Technology, Krak´ ow, Poland) G. Geethakumari (University of Hyderabad, India) B. Glut (AGH University of Science and Technology, Krak´ ow, Poland) Y. Gorbachev (St.-Petersburg State Polytechnical University, Russia) A.M. Go´sci´ nski (Deakin University, Australia) M. Govindaraju (Binghamton University, USA) G.A. Gravvanis (Democritus University of Thrace, Greece) D.J. Groen (University of Amsterdam, The Netherlands) T. Gubala (Academic Computer Centre Cyfronet AGH, Krak´ow, Poland) M. Hardt (Forschungszentrum Karlsruhe, Germany) T. Heinis (ETH Zurich, Switzerland) L. Hluch´ y (Slovak Academy of Sciences, Slovakia) W. Hoffmann (University of Amsterdam, The Netherlands) A. Iglesias (University of Cantabria, Spain) C.R. Jesshope (University of Amsterdam, The Netherlands)
H. Jin (Huazhong University of Science and Technology, China) D. Johnson (University of Reading, UK) B.D. Kandhai (University of Amsterdam, The Netherlands) S. Kawata (Utsunomiya University, Japan) W.A. Kelly (Queensland University of Technology, Australia) J. Kitowski (AGH University of Science and Technology, Krak´ ow, Poland) M. Koda (University of Tsukuba, Japan) D. Kranzlm¨ uller (Johannes Kepler University Linz, Austria) J. Kroc (University of Amsterdam, The Netherlands) B. Kryza (Academic Computer Centre Cyfronet AGH, Krak´ ow, Poland) M. Kunze (Forschungszentrum Karlsruhe, Germany) D. Kurzyniec (Google, Krak´ ow, Poland) A. Lagana (University of Perugia, Italy) L. Lefevre (INRIA, France) A. Lewis (Griffith University, Australia) H.W. Lim (Royal Holloway, University of London, UK) E. Lorenz (University of Amsterdam, The Netherlands) P. Lu (University of Alberta, Canada) M. Malawski (AGH University of Science and Technology, Krak´ ow, Poland) A.S. McGough (London e-Science Centre, UK) P.E.C. Melis (University of Amsterdam, The Netherlands) E.D. Moreno (UEA-BENq, Manaus, Brazil) J.T. Mo´scicki (CERN, Switzerland) S. Naqvi (CETIC, Belgium) P.O.A. Navaux (Universidade Federal do Rio Grande do Sul, Brazil) Z. Nemeth (Hungarian Academy of Science, Hungary) J. Ni (University of Iowa, USA) G.E. Norman (Russian Academy of Sciences, Russia) ´ Nuall´ B.O. ain (University of Amsterdam, The Netherlands) S. Orlando (University of Venice, Italy) M. Paprzycki (Polish Academy of Sciences, Poland) M. Parashar (Rutgers University, USA) C.P. Pautasso (University of Lugano, Switzerland) M. Postma (University of Amsterdam, The Netherlands) V. Prasanna (University of Southern California, USA) T. Priol (IRISA, France) M.R. Radecki (AGH University of Science and Technology, Krak´ ow, Poland) M. Ram (C-DAC Bangalore Centre, India) A. Rendell (Australian National University, Australia) M. Riedel (Research Centre J¨ ulich, Germany) D. Rodr´ıguez Garca (University of Alcal, Spain) K. Rycerz (AGH University of Science and Technology, Krak´ ow, Poland) R. Santinelli (CERN, Switzerland) B. Schulze (LNCC, Brazil) J. Seo (University of Leeds, UK)
A.E. Solomonides (University of the West of England, Bristol, UK) V. Stankovski (University of Ljubljana, Slovenia) H. Stockinger (Swiss Institute of Bioinformatics, Switzerland) A. Streit (Forschungszentrum J¨ ulich, Germany) H. Sun (Beihang University, China) R. Tadeusiewicz (AGH University of Science and Technology, Krak´ow, Poland) M. Taufer (University of Delaware, USA) J.C. Tay (Nanyang Technological University, Singapore) C. Tedeschi (LIP-ENS Lyon, France) A. Tirado-Ramos (University of Amsterdam, The Netherlands) P. Tvrdik (Czech Technical University Prague, Czech Republic) G.D. van Albada (University of Amsterdam, The Netherlands) R. van den Boomgaard (University of Amsterdam, The Netherlands) A. Visser (University of Amsterdam, The Netherlands) D.W. Walker (Cardiff University, UK) C.L. Wang (University of Hong Kong, China) A.L. Wendelborn (University of Adelaide, Australia) Y. Xue (Chinese Academy of Sciences, China) F.-P. Yang (Chongqing University of Posts and Telecommunications, China) C.T. Yang (Tunghai University, Taichung, Taiwan) L.T. Yang (St. Francis Xavier University, Canada) J. Yu (Renewtek Pty Ltd, Australia) Y. Zheng (Zhejiang University, China) E.V. Zudilova-Seinstra (University of Amsterdam, The Netherlands)
Reviewers J.H. Abawajy H.H. Abd Allah D. Abramson R. Albert M. Aldinucci V. Alexandrov I. Altintas D. Angulo C. Anthes M. Antolovich E. Araujo E.F. Archibong L. Axner M.A. Baker B. Bali´s S. Battiato M. Baumgartner U. Behn
P. Bekaert A. Belloum A. Benoit G. Bereket J. Bernsdorf I. Bethke B. Bethwaite J.-L. Beuchat J. Bi J. Bin Shyan B.S. Bindhumadhava J.A.R. Blais P. Blowers B. Boghosian I. Borges A.I. Boronin K. Boryczko A. Borzi
A. Boutalib A. Brabazon J.M. Bradshaw I. Brandic V. Breton R. Brito W. Bronsvoort M. Bubak K. Bubendorfer J. Buisson J. Burnett A. Byrski M. Caeiro A. Caiazzo F.C.A. Campos M. Cannataro B. Cantalupo E. Caron
L. Caroprese U. Catalyurek S. Cerbat K. Cetnarowicz M. Chakravarty W. Chaovalitwongse J. Chen H. Chojnacki B. Chopard C. Choquet T. Cierzo T. Clark S. Collange P. Combes O. Corcho J.M. Cordeiro A.D. Corso L. Costa H. Cota de Freitas C. Cotta G. Cottone C.D. Craig C. Douglas A. Craik J. Cui J.C. Cunha R. Custodio S. Date A. Datta D. De Roure S. Deb V. Debelov E. Deelman Y.D. Demchenko B. Depardon F. Desprez R. Dew T. Dhaene G. Di Fatta A. Diaz-Guilera R. Dillon I.T. Dimov G. Dobrowolski T. Dokken J. Dolado
W. Dong J. Dongarra F. Donno C. Douglas M. Drew R. Drezewski A. Duarte V. Duarte W. Dubitzky P. Edmond A. El Rhalibi A.A. El-Azhary V. Ervin A. Erzan M. Esseffar L. Fabrice Y. Fan G. Farin Y. Fei V. Ferandez D. Fireman K. Fisher A. Folleco T. Ford G. Fox G. Frenking C. Froidevaux K. F¨ ulinger W. Funika H. Fuss A. Galvez R. Garcia S. Garic A. Garny F. Gava T. Gedeon G. Geethakumari A. Gerbessiotis F. Giacomini S. Gimelshein S. Girtelschmid C. Glasner T. Glatard B. Glut M. Goldman
Y. Gorbachev A.M. Go´sci´ nski M. Govindaraju E. Grabska V. Grau G.A. Gravvanis C. Grelck D.J. Groen J.G. Grujic Y. Guang Xue T. Gubala C. Guerra V. Guevara X. Guo Y. Guo N.M. Gupte J.A. Gutierrez de Mesa P.H. Guzzi A. Haffegee S. Hannani U. Hansmann M. Hardt D. Har¸ez˙ lak M. Harman R. Harrison M. Hattori T. Heinis P. Heinzlreiter R. Henschel F. Hernandez V. Hern´andez P. Herrero V. Hilaire y L. Hluch´ A. Hoekstra W. Hoffmann M. Hofmann-Apitius J. Holyst J. Hrusak J. Hu X.R. Huang E. Hunt K. Ichikawa A. Iglesias M. Inda
D. Ireland H. Iwasaki B. Jakimovski R. Jamieson A. Jedlitschka C.R. Jesshope X. Ji C. Jim X H. Jin L. Jingling D. Johnson J.J. Johnstone J. Jurek J.A. Kaandorp B. Kahng Q. Kai R. Kakkar B.D. Kandhai S. Kawata P. Kelly W.A. Kelly J. Kennedy A. Kert´esz C. Kessler T.M. Khoshgoftaar C.H. Kim D.S. Kim H.S. Kim T.W. Kim M. Kisiel-Drohinicki J. Kitowski Ch.R. Kleijn H.M. Kl´ıe A. Kn¨ upfer R. Kobler T. K¨ ockerbauer M. Koda I. Kolingerova J.L. Koning V. Korkhov G. Kou A. Koukam J. Ko´zlak M. Krafczyk D. Kramer
D. Kranzlm¨ uller K. Kreiser J. Kroc B. Kryza V.V. Krzhizhanovskaya V. Kumar M. Kunze D. Kurzyniec M. Kuta A. Lagana K. Lai R. Lambiotte V. Latora J. Latt H.K. Lee L. Lefevre A. Lejay J. Leszczy´ nski A. Lewis Y. Li D. Liko H.W. Lim Z. Lin D.S. Liu J. Liu R. Liu M. Lobosco R. Loogen E. Lorenz F. Loulergue M. Low P. Lu F. Luengo Q. Luo W. Luo C. Lursinsap R.M. Lynden-Bell W.Y. Ma N. Maillard D.K. Maity M. Malawski N. Mangala S.S. Manna U. Maran R. Marcjan
F. Marco E. Matos K. Matsuzaki A.S. McGough B. McKay W. Meira Jr. P.E.C. Melis P. Merk M. Metzger Z. Michalewicz J. Michopoulos H. Mickler S. Midkiff L. Minglu M. Mirto M. Mitrovic H. Mix A. Mohammed E.D. Moreno J.T. Mo´scicki F. Mourrain J. Mrozek S. Naqvi S. Nascimento A. Nasri P.O.A. Navaux E. Nawarecki Z. Nemeth A. Neumann L. Neumann J. Ni G. Nikishkov G.E. Norman M. Nsangou J.T. Oden D. Olson M. O’Neill S. Orlando H. Orthmans ´ Nuall´ B.O. ain S. Pal Z. Pan M. Paprzycki M. Parashar A. Paszy´ nska
M. Paszy´ nski C.P. Pautasso B. Payne T. Peachey S. Pelagatti J. Peng Y. Peng F. Perales M. P´erez D. Pfahl G. Plank D. Plemenos A. Pluchino M. Polak S.F. Portegies Zwart M. Postma B.B. Prahalada V. Prasanna R. Preissl T. Priol T. Prokosch M. Py G. Qiu J. Quinqueton M.R. Radecki B. Raffin M. Ram P. Ramasami P. Ramsamy O.F. Rana M. Reformat A. Rendell M. Riedel J.L. Rivail G.J. Rodgers C. Rodr´ıguez-Leon B. Rodr´ıguez D. Rodr´ıguez D. Rodr´ıguez Garc´ıa F. Rogier G. Rojek H. Ronghuai H. Rosmanith J. Rough F.-X. Roux
X. R´ oz˙ a´ nska M. Ruiz R. Ruiz K. Rycerz K. Saetzler P. Saiz S. Sanchez S.K. Khattri R. Santinelli A. Santos M. Sarfraz M. Satpathy M. Sbert H.F. Schaefer R. Schaefer M. Schulz B. Schulze I. Scriven E. Segredo J. Seo A. Sfarti Y. Shi L. Shiyong Z. Shuai M.A. Sicilia L.P. Silva Barra F. Silvestri A. Simas H.M. Singer V. Sipkova P.M.A. Sloot R. Slota ´ zy´ B. Snie˙ nski A.E. Solomonides R. Soma A. Sourin R. Souto R. Spiteri V. Srovnal V. Stankovski E.B. Stephens M. Sterzel H. Stockinger D. Stokic A. Streit
B. Strug H. Sun Z. Sun F. Suter H. Suzuki D. Szczerba L. Szirmay-Kalos R. Tadeusiewicz B. Tadic R. Tagliaferri W.K. Tai S. Takeda E.J. Talbi J. Tan S. Tan T. Tang J. Tao M. Taufer J.C. Tay C. Tedeschi J.C. Teixeira D. Teller G. Terje Lines C. Te-Yi A.T. Thakkar D. Thalmann S. Thurner Z. Tianshu A. Tirado A. Tirado-Ramos P. Tjeerd R.F. Tong J. Top H. Torii V.D. Tran C. Troyer P. Trunfio W. Truszkowski W. Turek P. Tvrdik F. Urmetzer V. Uskov G.D. van Albada R. van den Boomgaard M. van der Hoef
R. van der Sman B. van Eijk R. Vannier P. Veltri E.J. Vigmond J. Vill´ a i Freixa A. Visser D.W. Walker C.L. Wang F.L. Wang J. Wang J.Q. Wang J. Weidendorfer C. Weihrauch C. Weijun A. Weise A.L. Wendelborn
E. Westhof R. Wism¨ uller C. Wu C. Xenophontos Y. Xue N. Yan C.T. Yang F.-P. Yang L.T. Yang X. Yang J. Yu M. Yurkin J. Zara I. Zelinka S. Zeng C. Zhang D.L. Zhang
G. Zhang H. Zhang J.J. Zhang J.Z.H. Zhang L. Zhang J. Zhao Z. Zhao Y. Zheng X. Zhiwei A. Zhmakin N. Zhong M.H. Zhu T. Zhu O. Zimmermann J. Zivkovic A. Zomaya E.V. Zudilova-Seinstra
Workshops Organizers 7th Workshop on Computer Graphics and Geometric Modeling A. Iglesias (University of Cantabria, Spain) 5th Workshop on Simulation of Multiphysics Multiscale Systems V.V. Krzhizhanovskaya and A.G. Hoekstra (University of Amsterdam, The Netherlands) 3rd Workshop on Computational Chemistry and Its Applications P. Ramasami (University of Mauritius, Mauritius) Workshop on Computational Finance and Business Intelligence Y. Shi (Chinese Academy of Sciences, China) Workshop on Physical, Biological and Social Networks B. Tadic (Joˇzef Stefan Institute, Ljubljana, Slovenia) Workshop on GeoComputation Y. Xue (London Metropolitan University, UK) 2nd Workshop on Teaching Computational Science Q. Luo (Wuhan University of Science and Technology Zhongnan Branch, China), A. Tirado-Ramos (University of Amsterdam, The Netherlands), Y.-W. Wu
(Central China Normal University, China) and H.-W. Wang (Wuhan University of Science and Technology Zhongnan Branch, China) Workshop on Dynamic Data Driven Application Systems C.C. Douglas (University of Kentucky, USA) and F. Darema (National Science Foundation, USA) Bioinformatics’ Challenges to Computer Science M. Cannataro (University Magna Gracia of Catanzaro, Italy), M. Romberg (Research Centre J¨ ulich, Germany), J. Sundness (Simula Research Laboratory, Norway), R. Weber dos Santos (Federal University of Juiz de Fora, Brazil) Workshop on Tools for Program Development and Analysis in Computational Science A. Kn¨ upfer (University of Technology, Dresden, Germany), J. Tao (Forschungszentrum Karlsruhe, Germany), D. Kranzlm¨ uller (Johannes Kepler University Linz, Austria), A. Bode (University of Technology, M¨ unchen, Germany) and J. Volkert (Johannes Kepler University Linz, Austria) Workshop on Software Engineering for Large-Scale Computing D. Rodr´ıguez (University of Alcala, Spain) and R. Ruiz (Pablo de Olavide University, Spain) Workshop on Collaborative and Cooperative Environments C. Anthes (Johannes Kepler University Linz, Austria), V. Alexandrov (University of Reading, UK), D. Kranzlm¨ uller, G. Widmer and J. Volkert (Johannes Kepler University Linz, Austria) Workshop on Applications of Workflows in Computational Science Z. Zhao and A. Belloum (University of Amsterdam, The Netherlands) Workshop on Intelligent Agents and Evolvable Systems K. Cetnarowicz, R. Schaefer (AGH University of Science and Technology, Krak´ ow, Poland) and B. Zheng (South-Central University For Nationalities, Wuhan, China)
Table of Contents – Part II
7th International Workshop on Computer Graphics and Geometric Modeling VII International Workshop on Computer Graphics and Geometric Modeling – CGGM’2008 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Andr´es Iglesias
3
Sliding-Tris: A Sliding Window Level-of-Detail Scheme . . . . . . . . . . . . . . . Oscar Ripolles, Francisco Ramos, and Miguel Chover
5
Efficient Interference Calculation by Tight Bounding Volumes . . . . . . . . . Masatake Higashi, Yasuyuki Suzuki, Takeshi Nogawa, Yoichi Sano, and Masakazu Kobayashi
15
Modeling of 3D Scene Based on Series of Photographs Taken with Different Depth-of-Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Marcin Denkowski, Michal Chlebiej, and Pawel Mikolajczak A Simple Method of the TEX Surface Drawing Suitable for Teaching Materials with the Aid of CAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Masataka Kaneko, Hajime Izumi, Kiyoshi Kitahara, Takayuki Abe, Kenji Fukazawa, Masayoshi Sekiguchi, Yuuki Tadokoro, Satoshi Yamashita, and Setsuo Takato
25
35
Family of Energy Conserving Glossy Reflection Models . . . . . . . . . . . . . . . Michal Radziszewski and Witold Alda
46
Harmonic Variation of Edge Size in Meshing CAD Geometries from IGES Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Maharavo Randrianarivony
56
Generating Sharp Features on Non-regular Triangular Meshes . . . . . . . . . Tetsuo Oya, Shinji Seo, and Masatake Higashi
66
A Novel Artificial Mosaic Generation Technique Driven by Local Gradient Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sebastiano Battiato, Gianpiero Di Blasi, Giovanni Gallo, Giuseppe Claudio Guarnera, and Giovanni Puglisi Level-of-Detail Triangle Strips for Deforming Meshes . . . . . . . . . . . . . . . . . Francisco Ramos, Miguel Chover, Jindra Parus, and Ivana Kolingerova Triangular B´ezier Approximations to Constant Mean Curvature Surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A. Arnal, A. Lluch, and J. Monterde
76
86
96
Procedural Graphics Model and Behavior Generation . . . . . . . . . . . . . . . . . J.L. Hidalgo, E. Camahort, F. Abad, and M.J. Vicent
106
Particle Swarm Optimization for B´ezier Surface Reconstruction . . . . . . . . Akemi G´ alvez, Angel Cobo, Jaime Puig-Pey, and Andr´es Iglesias
116
Geometrical Properties of Simulated Packings of Spherocylinders . . . . . . . Monika Bargiel
126
Real-Time Illumination of Foliage Using Depth Maps . . . . . . . . . . . . . . . . . Jesus Gumbau, Miguel Chover, Cristina Rebollo, and Inmaculada Remolar
136
On-Line 3D Geometric Model Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . H. Zolfaghari and K. Khalili
146
Implementation of Filters for Image Pre-processing for Leaf Analyses in Plantations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jacqueline Gomes Mertes, Norian Marranghello, and Aledir Silveira Pereira
153
5th Workshop on Simulation of Multiphysics Multiscale Systems Simulation of Multiphysics Multiscale Systems, 5th International Workshop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Valeria V. Krzhizhanovskaya and Alfons G. Hoekstra
165
A Hybrid Model of Sprouting Angiogenesis . . . . . . . . . . . . . . . . . . . . . . . . . . Florian Milde, Michael Bergdorf, and Petros Koumoutsakos
167
Particle Based Model of Tumor Progression Stimulated by the Process of Angiogenesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rafal Wcislo and Witold Dzwinel
177
A Multiphysics Model of Myoma Growth . . . . . . . . . . . . . . . . . . . . . . . . . . . Dominik Szczerba, Bryn A. Lloyd, Michael Bajka, and G´ abor Sz´ekely
187
Computational Implementation of a New Multiphysics Model for Field Emission from CNT Thin Films . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. Sinha, D. Roy Mahapatra, R.V.N. Melnik, and J.T.W. Yeow
197
A Multiphysics and Multiscale Software Environment for Modeling Astrophysical Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ´ Nuall´ Simon Portegies Zwart, Steve McMillan, Breannd´ an O ain, Douglas Heggie, James Lombardi, Piet Hut, Sambaran Banerjee, Houria Belkus, Tassos Fragos, John Fregeau, Michiko Fuji, Evghenii Gaburov, Evert Glebbeek, Derek Groen, Stefan Harfst, Rob Izzard, Mario Juri´c, Stephen Justham, Peter Teuben, Joris van Bever, Ofer Yaron, and Marcel Zemp
207
Dynamic Interactions in HLA Component Model for Multiscale Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Katarzyna Rycerz, Marian Bubak, and Peter M.A. Sloot
217
An Agent-Based Coupling Platform for Complex Automata . . . . . . . . . . . Jan Hegewald, Manfred Krafczyk, Jonas T¨ olke, Alfons Hoekstra, and Bastien Chopard
227
A Control Algorithm for Multiscale Simulations of Liquid Water . . . . . . . Evangelos M. Kotsalis and Petros Koumoutsakos
234
Multiscale Models of Quantum Dot Based Nanomaterials and Nanodevices for Solar Cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alexander I. Fedoseyev, Marek Turowski, Ashok Raman, Qinghui Shao, and Alexander A. Balandin
242
Multi-scale Modelling of the Two-Dimensional Flow Dynamics in a Stationary Supersonic Hot Gas Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . Giannandrea Abbate, Barend J. Thijsse, and Chris R. Kleijn
251
Multiscale Three-Phase Flow Simulation Dedicated to Model Based Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dariusz Choi´ nski, Mieczyslaw Metzger, and Witold Noco´ n
261
Simulation of Sound Emitted from Collision of Droplet with Shallow Water by the Lattice Boltzmann Method . . . . . . . . . . . . . . . . . . . . . . . . . . . Shinsuke Tajiri, Michihisa Tsutahara, and Hisao Tanaka
271
Multiscale Numerical Models for Simulation of Radiation Events in Semiconductor Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alexander I. Fedoseyev, Marek Turowski, Ashok Raman, Michael L. Alles, and Robert A. Weller Scale-Splitting Error in Complex Automata Models for Reaction-Diffusion Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alfonso Caiazzo, Jean Luc Falcone, Bastien Chopard, and Alfons G. Hoekstra Wavelet Based Spatial Scaling of Coupled Reaction Diffusion Fields . . . . Sudib K. Mishra, Krishna Muralidharan, Pierre Deymier, George Frantziskonis, Srdjan Simunovic, and Sreekanth Pannala
281
291
301
Domain Decomposition Methodology with Robin Interface Matching Conditions for Solving Strongly Coupled Problems . . . . . . . . . . . . . . . . . . . Fran¸cois-Xavier Roux
311
Transient Boundary Element Method and Numerical Evaluation of Retarded Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ernst P. Stephan, Matthias Maischak, and Elke Ostermann
321
A Multiscale Approach for Solving Maxwell’s Equations in Waveguides with Conical Inclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Franck Assous and Patrick Ciarlet Jr.
331
3rd Workshop on Computational Chemistry and Its Applications 3rd Workshop on Computational Chemistry and Its Applications (3rd CCA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ponnadurai Ramasami First Principle Gas Phase Study of the Trans and Gauche Rotamers of 1,2-Diisocyanoethane, 1,2-Diisocyanodisilane and Isocyano(isocyanomethyl)silane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ponnadurai Ramasami A Density Functional Theory Study of Oxygen Adsorption at Silver Surfaces: Implications for Nanotoxicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Brahim Akdim, Saber Hussain, and Ruth Pachter Mechanism of Influenza A M2 Ion-Channel Inhibition: A Docking and QSAR Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alexander V. Gaiday, Igor A. Levandovskiy, Kendall G. Byler, and Tatyana E. Shubina
343
344
353
360
A Java Tool for the Management of Chemical Databases and Similarity Analysis Based on Molecular Graphs Isomorphism . . . . . . . . . . . . . . . . . . . ´ Irene Luque Ruiz and Miguel Angel G´ omez-Nieto
369
Noncanonical Base Pairing in RNA: Topological and NBO Analysis of Hoogsteen Edge - Sugar Edge Interactions . . . . . . . . . . . . . . . . . . . . . . . . . . Purshotam Sharma, Harjinder Singh, and Abhijit Mitra
379
Design of Optimal Laser Fields to Control Vibrational Excitations in Carboxy-myoglobin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Harjinder Singh, Sitansh Sharma, Praveen Kumar, Jeremy N. Harvey, and Gabriel G. Balint-Kurti Computations of Ground State and Excitation Energies of Poly(3-methoxy-thiophene) and Poly(thienylene vinylene) from First Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A.V. Gavrilenko, S.M. Black, A.C. Sykes, C.E. Bonner, and V.I. Gavrilenko
387
396
Workshop on Computational Finance and Business Intelligence Workshop on Computational Finance and Business Intelligence . . . . . . . . Yong Shi, Shouyang Wang, and Xiaotie Deng
407
Parallelization of Pricing Path-Dependent Financial Instruments on Bounded Trinomial Lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hannes Schabauer, Ronald Hochreiter, and Georg Ch. Pflug
408
Heterogeneity and Endogenous Nonlinearity in an Artificial Stock Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hongquan Li, Wei Shang, and Shouyang Wang
416
Bound for the L2 Norm of Random Matrix and Succinct Matrix Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rong Liu, Nian Yan, Yong Shi, and Zhengxin Chen
426
Select Representative Samples for Regularized Multiple-Criteria Linear Programming Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Peng Zhang, Yingjie Tian, Xingsen Li, Zhiwang Zhang, and Yong Shi
436
A Kernel-Based Technique for Direction-of-Change Financial Time Series Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Andrew Skabar
441
An Optimization-Based Classification Approach with the Non-additive Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nian Yan, Zhengxin Chen, Rong Liu, and Yong Shi
450
A Selection Method of ETF’s Credit Risk Evaluation Indicators . . . . . . . Ying Zhang, Zongfang Zhou, and Yong Shi
459
Estimation of Market Share by Using Discretization Technology: An Application in China Mobile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiaohang Zhang, Jun Wu, Xuecheng Yang, and Tingjie Lu
466
A Rough Set-Based Multiple Criteria Linear Programming Approach for Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhiwang Zhang, Yong Shi, Peng Zhang, and Guangxia Gao
476
Predictive Modeling of Large-Scale Sequential Curves Based on Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wen Long and Huiwen Wang
486
Estimating Real Estate Value-at-Risk Using Wavelet Denoising and Time Series Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kaijian He, Chi Xie, and Kin Keung Lai
494
The Impact of Taxes on Intra-week Stock Return Seasonality . . . . . . . . . . Virgilijus Sakalauskas and Dalia Kriksciuniene
504
A Survey of Formal Verification for Business Process Modeling . . . . . . . . Shoichi Morimoto
514
Workshop on Physical, Biological and Social Networks Network Modeling of Complex Dynamic Systems . . . . . . . . . . . . . . . . . . . . Bosiljka Tadi´c
525
Clustering Organisms Using Metabolic Networks . . . . . . . . . . . . . . . . . . . . . Tomasz Arod´z
527
Influence of Network Structure on Market Share in Complex Market Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Makoto Uchida and Susumu Shirayama
535
When the Spatial Networks Split? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Joanna Natkaniec and Krzysztof Kulakowski
545
Search of Weighted Subgraphs on Complex Networks with Maximum Likelihood Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Marija Mitrovi´c and Bosiljka Tadi´c
551
Spectral Properties of Adjacency and Distance Matrices for Various Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Krzysztof Malarz
559
Simplicial Complexes of Networks and Their Statistical Properties . . . . . Slobodan Maleti´c, Milan Rajkovi´c, and Danijela Vasiljevi´c
568
Movies Recommendation Networks as Bipartite Graphs . . . . . . . . . . . . . . . Jelena Gruji´c
576
Dynamical Regularization in Scalefree-Trees of Coupled 2D Chaotic Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zoran Levnaji´c
584
Physics Based Algorithms for Sparse Graph Visualization . . . . . . . . . . . . . ˇ Milovan Suvakov
593
Workshop on GeoComputation High Performance Geocomputation - Preface . . . . . . . . . . . . . . . . . . . . . . . . Yong Xue, Dingsheng Liu, Jianwen Ai, and Wei Wan
603
Study on Implementation of High-Performance GIServices in Spatial Information Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Fang Huang, Dingsheng Liu, Guoqing Li, Yi Zeng, and Yunxuan Yan
605
Numerical Simulation of Threshold-Crossing Problem for Random Fields of Environmental Contamination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Robert Jankowski
614
A Context-Driven Approach to Route Planning . . . . . . . . . . . . . . . . . . . . . . Hissam Tawfik, Atulya Nagar, and Obinna Anya InterCondor: A Prototype High Throughput Computing Middleware for Geocomputation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yong Xue, Yanguang Wang, Ying Luo, Jianping Guo, Jianqin Wang, Yincui Hu, and Chaolin Wu
622
630
Discrete Spherical Harmonic Transforms: Numerical Preconditioning and Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J.A. Rod Blais
638
A Data Management Framework for Urgent Geoscience Workflows . . . . . Jason Cope and Henry M. Tufo
646
2nd Workshop on Teaching Computational Science Second Workshop on Teaching Computational Science – WTCS 2008 . . . A. Tirado-Ramos and Q. Luo
657
Using Metaheuristics in a Parallel Computing Course . . . . . . . . . . . . . . . . . ´ Angel-Luis Calvo, Ana Cort´es, Domingo Gim´enez, and Carmela Pozuelo
659
Improving the Introduction to a Collaborative Project-Based Course on Computer Network Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Felix Freitag, Leandro Navarro, and Joan Manuel Marqu`es
669
Supporting Materials for Active e-Learning in Computational Models . . . Mohamed Hamada
678
Improving Software Development Process Implemented in Team Project Course . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Iwona Dubielewicz and Bogumila Hnatkowska
687
An Undergraduate Computational Science Curriculum . . . . . . . . . . . . . . . . Angela B. Shiflet and George W. Shiflet
697
Cryptography Adapted to the New European Area of Higher Education . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A. Queiruga Dios, L. Hern´ andez Encinas, and D. Queiruga
706
An Introductory Computer Graphics Course in the Context of the European Space of Higher Education: A Curricular Approach . . . . . . . . . . Akemi G´ alvez, Andr´es Iglesias, and Pedro Corcuera
715
Collaborative Environments through Dialogues and PBL to Encourage the Self-directed Learning in Computational Sciences . . . . . . . . . . . . . . . . . Fernando Ramos-Quintana, Josefina S´ amano-Galindo, and V´ıctor H. Z´ arate-Silva
725
The Simulation Course: An Innovative Way of Teaching Computational Science in Aeronautics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ricard Gonz´ alez-Cinca, Eduard Santamaria, and J. Luis A. Yebra
735
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
745
VII International Workshop on Computer Graphics and Geometric Modeling – CGGM'2008

Andrés Iglesias

Department of Applied Mathematics and Computational Sciences, University of Cantabria, Avda. de los Castros, s/n, E-39005, Santander, Spain
[email protected] http://personales.unican.es/iglesias/
Abstract. This short paper is intended to give our readers a brief insight about the Seventh International Workshop on Computer Graphics and Geometric Modeling-CGGM’2008, held in Krakow (Poland), June 23-25 2008 as a part of the ICCS’2008 general conference.
1 CGGM Workshops

1.1 Aims and Scope
Computer Graphics (CG) and Geometric Modeling have become two of the most important and challenging areas of Computer Science. The CGGM workshops seek high-quality papers describing original research results in those fields. Topics of the workshop include (but are not limited to): geometric modeling, solid modeling, CAD/CAM, physically-based modeling, surface reconstruction, geometric processing and CAGD, volume visualization, virtual avatars, computer animation, CG in Art, Education, Engineering, Entertainment and Medicine, rendering techniques, multimedia, non photo-realistic rendering, virtual and augmented reality, virtual environments, illumination models, texture models, CG and Internet (VRML, Java, X3D, etc.), artificial intelligence for CG, CG software and hardware, CG applications, CG education and new directions in CG.

1.2 CGGM Workshops History
The history of the CGGM workshops dates back eight years, when some researchers decided to organize a series of international conferences on all aspects of computational science. The first edition of this annual conference was held in San Francisco in 2001 under the name of International Conference on Computational Science, ICCS. This year ICCS is held in Krakow (Poland). After ICCS'2001, I realized that no special event devoted to either computer graphics or geometric modeling had been organized at that conference. Aiming to fill this gap, I proposed a special session on these topics to the ICCS'2002 organizers. Their enthusiastic reply encouraged me to organize the first edition of this workshop, CGGM'2002. A total of 81 papers from 21 countries were submitted to the workshop, with 35 high-quality papers finally accepted and published by Springer-Verlag, in its Lecture Notes in Computer Science series, vol. 2330. This great success and the positive feedback from authors and participants led CGGM to become an annual event on its own. Subsequent editions were held as follows (see [1] for details): CGGM'2003 in Montreal (Canada), CGGM'2004 in Krakow (Poland), CGGM'2005 in Atlanta (USA), CGGM'2006 in Reading (UK) and CGGM'2007 in Beijing (China). All of them were published by Springer-Verlag, in its Lecture Notes in Computer Science series, volumes 2668, 3039, 3515, 3992 and 4488, with a total of 52, 24, 22, 22 and 20 contributions, respectively. In addition, one special issue was published in 2004 in the Future Generation Computer Systems – FGCS journal [2]. Another special issue on CGGM'2007 is on the way to be published this year in the Advances in Computational Science and Technology – ACST journal [3].
2 CGGM'2008
This year CGGM has received a total of 39 papers of which 17 have been accepted as full papers and 1 as short paper. The reader is referred to [4] for more information about the workshop. The workshop chair would like to thank the authors, including those whose papers were not accepted, for their contributions. I also thank the referees (see the CGGM’2008 International Program Committee and CGGM’2008 International Reviewer Board in [4]) for their hard work in reviewing the papers and making constructive comments and suggestions, which have substantially contributed to improving the workshop. It is expected that some workshop papers will be selected for publication in extended and updated form in a Special Issue on CGGM’2008. Details will appear in [4] at due time. Acknowledgements. This workshop has been supported by the Spanish Ministry of Education and Science, National Program of Computer Science, Project Ref. TIN2006-13615. The CGGM’2008 chair also thanks Dick van Albada, workshops chair of general conference ICCS’2008 for his all-time availability and diligent work during all stages of the workshop organization. It is a pleasure to work with you, Dick!
References

1. Previous CGGMs, http://personales.unican.es/iglesias/CGGM2008/Previous.htm
2. Iglesias, A. (ed.): Special issue on Computer Graphics and Geometric Modeling. Future Generation Computer Systems 20(8), 1235–1387 (2004)
3. Iglesias, A. (ed.): Special issue on Computer Graphics and Geometric Modeling. Advances in Computational Science and Technology (in press)
4. CGGM web page (2008), http://personales.unican.es/iglesias/CGGM2008/
Sliding-Tris: A Sliding Window Level-of-Detail Scheme

Oscar Ripolles, Francisco Ramos, and Miguel Chover

Universitat Jaume I, Castellon, Spain
{oripolle,jromero,chover}@uji.es
Abstract. Virtual environments for interactive applications demand highly realistic scenarios, which tend to be large and densely populated with very detailed meshes. Despite the outstanding evolution of graphics hardware, current GPUs are still not capable of managing these vast amounts of geometry. A solution to overcome this problem is the use of level-of-detail techniques, which recently have been oriented towards the exploitation of GPUs. Nevertheless, although some solutions present very good results, they are usually based on complex data structures and algorithms. We thus propose a new multiresolution model based on triangles which is simple and efficient. The main idea is to modify the list of vertices when changing to a new level of detail, in contrast to previous models which modify the index list, which simplifies the extraction process. This feature also provides a perfect framework for adapting the algorithm to work completely on the GPU. Keywords: Multiresolution, Level of Detail, GPU, Sliding-Window.
1 Introduction
Nowadays, applications such as computer games, virtual reality or scientific simulations are increasing the detail of their environments with the aim of offering more realism. This objective usually involves dealing with larger environments containing many objects, which amount to a large quantity of triangles. However, despite the constant improvements in performance and capabilities of GPUs, it is still difficult to render such complex scenes, as vertex throughput and memory bandwidth become considerable bottlenecks when dealing with them. As a result, these environments cannot be interactively rendered by brute-force methods. Among the solutions to overcome this limitation, one of the most widely used is multiresolution modeling. A level-of-detail or multiresolution model is a compact description of multiple representations of a single object [1] that must be capable of extracting the appropriate representation in different contexts. In recent years, many solutions based on level-of-detail techniques have been presented. Nevertheless, only a few exploit, in one way or another, some of the current GPU functionalities. However, although they provide interactive rates, they require complex data structures and algorithms to manage them. In this paper we describe a multiresolution model for real-time rendering of arbitrary meshes which helps to close the existing distance between a multiresolution GPU-based solution and its implementation in any 3D application. Our approach includes the following contributions:

– A simple data structure based on vertex hierarchies adapted to the GPU architecture. The vertex hierarchy is given through the edge contraction operations of the simplification process [2].
– Storage cost with low memory requirements.
– Representations are stored and processed entirely in the GPU, avoiding the typical bottleneck between the CPU and the GPU and thus obtaining great performance by exploiting the implicit parallelism existing in current GPUs.

This paper presents the following structure. Section 2 contains a study of the work previously carried out on GPU-friendly multiresolution modeling. Section 3 presents the basic framework of Sliding-Tris. Section 4 provides thorough details of the implementations of the algorithms in both the CPU and the GPU. Section 5 includes a comparative study of spatial cost and rendering time. Lastly, Section 6 comments on the results obtained in our tests.
2 Related Work
Evolution of graphics hardware has given rise to new techniques that allow us to accelerate multiresolution models. This research field has been exploited for many years, and it is possible to find a wealth of papers which present very different solutions. Nevertheless, the authors have lately re-oriented their efforts towards the development of new models which consider the possibilities offered by new graphics hardware. Recent GPUs include vertex and fragment processors, which have evolved from being configurable to being programmable, allowing us to execute shader programs in parallel.

In general, multiresolution models can be classified into two large groups [3]: discrete models, which contain various representations of the same object, and continuous models, which represent a vast range of approximations. With respect to discrete models, they offer a very efficient solution but they usually present visual artifacts when switching between pre-calculated levels of detail. A possible solution to avoid these popping artifacts is the use of geomorphing [4] or blending [5] in the GPU. A more thorough method is that presented in [6], which consists in sending to the GPU a mesh at minimum level of detail and applying later a refining pattern in the GPU to every face of the model. The problem, according to the authors, is that it suffers again the popping effects. Another aspect is the load suffered by the GPU when a model keeps the level of detail, as a pass must be made for each face that the coarser model has.

It is possible to find in the literature continuous algorithms aimed at rendering common meshes by exploiting GPUs. The LodStrips model was reconsidered in [7] to offer a GPU-oriented solution, by creating efficient data structures that can be integrated into the GPU. Ji et al. [8] suggest a method to select and visualize several levels of detail by using the GPU. In particular, they encode the geometry in a quadtree based on a LOD atlas texture. The main problem of this method is the costly process that the CPU must execute in every change of level of detail. Moreover, if the mesh is too complex, the representation with quadtrees can be very inefficient and even the size of the video memory can be an important restriction. Finally, the work presented by Turchyn [9] is based on Progressive Meshes. It builds a complex hierarchical data structure that derives in great memory requirements. Moreover, it changes the mesh connectivity trying to reduce memory costs.

Many of the GPU-based continuous models are aimed at view-dependent rendering of massive models. Works like [10],[11],[12] have adapted their data structures so that CPU/GPU communication can be optimized to fully exploit the complex memory hierarchy of modern graphics platforms. With a similar objective but with a further GPU exploitation, the GoLD method [13] introduces a hierarchy of geometric patches for very detailed meshes with high-resolution textures. The maintenance of boundaries is assured by means of geomorphing performed in the GPU. Finally, the work presented in [14] introduces a multigrained hierarchical solution which avoids the appearance of cracks in the borders of nodes at different LODs by applying a border-stitching approach directly in the GPU.

In general, discrete models are easier to be implemented in GPUs but they do not avoid the popping artifacts. By contrast, continuous models offer a better granularity and avoid that problem, although their memory requirements are high and some of them even need several rendering passes for one LOD change.
3 Our Approach
The solution we are presenting offers an easy and fast level-of-detail update which, contrary to previous multiresolution models, modifies the list of vertices instead of the indices. The basic idea is to order indices and vertices so that we can apply a sliding-window approach to the level-of-detail extraction process. Before explaining in detail the basis of our proposed solution, it is important to comment on the simplification algorithm that we will use for obtaining the sequence of approximations of the original model. This sequence will be used to obtain a progressive coarsening (and refinement) of the original model.

3.1 Mesh Simplification
It is possible to find plenty of research on appearance-preserving simplifications methods. The most important contributions have been made in the areas of geometric-based algorithms[15],[16] and viewpoint-based approaches[17],[18],[19]. Among these works, we will use an edge-collapse based method which preserves texture appearance [19]. It is important to comment that in these collapse operations we will not modify vertices coordinates, as we assumed that the vertex that disappears will collapse to an existing one. The selection of this type of edgecollapse simplifies the data structures of our model and still offers very accurate
Fig. 1. Simplification of a section of a polygonal model
simplifications. An example of this simplification process can be observed in Fig. 1, where a section of a polygonal mesh is simplified with two edge-collapse operations.
3.2   Sliding-Tris Framework
This multiresolution model represents a mesh with three different sets. Let M be the original polygonal surface and V and T its sets of vertices and triangles. We will also define the set E, which refers to the evolution of each vertex and will be explained later. Considering that n is the number of vertices and m is the number of triangles, M = {V, T, E} can be defined as: V = {v0 , v1 , ..., vn }, T = {t0 , t1 , ..., tm }, E = {e0 , e1 , ..., en }
(1)
As we have previously commented, the main idea of Sliding-Tris is to update the contents of the vertices list instead of the indices one. As an example, collapsing vertex i to vertex j would mean that the coordinates values of vertex i will be replaced with the values of vertex j. The multiresolution model we are presenting is also based on the adequate ordering of vertices and triangles: – Vertices: they are ordered following the collapse order, so that vertex i will collapse when changing from lod i − 1 to lod i. – Triangles: they are ordered according to their elimination order, so that the last triangle will be the first one to disappear. Following with the simplification example offered in the previous section, the correct ordering of the initial mesh should be the one presented in Fig. 2. On top we present the original contents of the triangles and vertices lists. In the bottom, we offer the ordered lists obtained following our requirements. Thus, we can see how vertex 4 is now vertex 0 and vertex 0 is now vertex 1, following the order of vertex collapses. In a similar way, triangles 6 and 2 are now the last ones, as they will be the first ones to disappear. As we already commented, we will need to store the evolution of each vertex. The evolution reflects the different vertices an original one collapses to throughout
Fig. 2. Initial ordering (top) and re-order of triangles and vertices (bottom)
// LOD Extraction algorithm.
for (vert=0 to demandedLOD) {
  i = 0;
  while (Evolution[vert][i] < demandedLOD)
    i++;
  CopyVertex(CurrentVertices[vert], OriginalVertices[i]);
}

// Visualization algorithm.
numTriangles -= 2*(demandedLOD-currentLOD);  // If increasing detail, add.
glDrawElements(Triangles, 0, numTriangles, ...);

Fig. 3. Pseudocode of a simple CPU implementation of the LOD algorithms
the levels of detail. Thus, each ei element will be composed of a list of references to the vertices that vertex i collapses to. As the vertices have been ordered following the collapse order, we will be able to know in which LOD a particular vertex must change. More precisely, we can assure that the evolution of vertex i satisfies that we must use the contents of its j-th element while ei,j ≤ demandedLOD < ei,j+1 . Finally, we will also have to store a copy of the original vertices, which will be used for updating the value of each vertex when traversing the different LODs. Once we have fulfilled all these requirements, we are ready to start with the algorithm that will enable us to obtain all the levels of detail (Fig. 3). Each time we change to a different LOD we must check every vertex to see if it is necessary to update it. Nevertheless, due to the order we have chosen for the vertices, it will only be necessary to check vertices from 0 to demandedLOD − 1. Once we have updated the necessary vertices, the sliding-window approach is applied to render the suitable number of indices.
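To make the data layout concrete, the following C++ sketch mirrors the structures described above and the extraction loop of Fig. 3; the type and function names (Vertex, SlidingTrisMesh, extractLOD) are ours, not the authors', normals and texture coordinates are omitted, and the copy source is taken from the selected evolution entry, which is how we read the description in the text.

#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };              // positions only, for brevity

struct SlidingTrisMesh {
    std::vector<Vertex>   currentVertices;     // updated at every LOD change
    std::vector<Vertex>   originalVertices;    // read-only copy of the input vertices
    std::vector<unsigned> indices;             // triangles ordered by elimination order
    std::vector<std::vector<unsigned>> evolution;  // e_i: vertices that vertex i collapses to
    std::size_t trianglesAtFullDetail = 0;
};

// Updates the vertex array for 'demandedLOD' and returns how many indices
// should be rendered.  Vertices are stored in collapse order, so only the
// first 'demandedLOD' entries can be affected; each LOD step removes two
// triangles, which gives the sliding window over the index array.
std::size_t extractLOD(SlidingTrisMesh& m, std::size_t demandedLOD)
{
    for (std::size_t v = 0; v < demandedLOD; ++v) {
        const std::vector<unsigned>& e = m.evolution[v];
        std::size_t j = 0;
        // The first evolution entry not smaller than the demanded LOD refers to
        // the vertex that vertex v currently collapses to (entries are vertex
        // indices, which coincide with the LOD at which they themselves collapse).
        while (j + 1 < e.size() && e[j] < demandedLOD) ++j;
        m.currentVertices[v] = m.originalVertices[e[j]];
    }
    return 3 * (m.trianglesAtFullDetail - 2 * demandedLOD);
}

A renderer would call extractLOD whenever the LOD changes and pass the returned count to glDrawElements, exactly as in the visualization part of Fig. 3.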
Fig. 4. Example of the extraction process of three levels-of-detail
Fig. 4 presents the evolution of the example mesh during three edge collapses. This figure includes, for each level of detail, the array of triangles and vertices and, for each vertex, its evolution. The array of triangles is shaded following the sliding-window approach. With respect to the evolution, the shaded cell reflects the current contents of the vertex. Following the algorithm introduced in Fig. 3, let's suppose we change from LOD 0 to LOD 1. We would decrease the triangle count by two and, in this case, we would only modify vertex 0 so that its coordinates are updated with the contents of vertex 1. For the second LOD change, vertices 0 and 1 must be updated, and according to the contents of their evolution, they must take the values of vertex 5.
4   Sliding-Tris Implementation
In this section we will introduce different possible implementations of the original algorithm which will allow us to exploit the graphics hardware. The first version will store the mesh information in the GPU, updating it in the most appropriate way. Nevertheless, this solution still involves data traffic through the BUS. As a consequence, we will also present the GPU implementations of the algorithm in both the vertex shader and the pixel shader, which reduce the traffic to just uploading the value of the new demandedLOD.
CPU Version. Storing the information in buffers in the GPU offers a faster rendering. Thus, this version uploads indices and vertices to the GPU. It is important to comment that we must keep a copy of the current vertices and also the original ones in the CPU, as well as the evolution of the vertices. The extraction process is similar to that presented in Fig. 3, but once we have updated the array of vertices in the CPU, we will upload it to GPU. More precisely, we will carefully delimit the vertices to transfer to minimize traffic. We can then render the mesh by indicating how many triangles must be considered. Exploiting the Possibilities of the Vertex Shader. For adapting the algorithm to shaders in the GPU, the first problem we must address is how to store the necessary information (the evolution and the original vertices) in the GPU. On the one hand, we create a floating point texture to store the original values of the vertices. On the other, the evolution of each vertex will be stored in different sets of its attributes, mainly in unused MultiTexCoords. These vertex attributes can store 4 components, which in our case represent 4 elements of the evolution. We will use as many attributes as necessary. For efficiency, we store all the vertex attributes in a single interleaved array. Once the data is stored, the only information that the CPU must send to the GPU is the new LOD value. The original extraction process has been carefully adapted to work on a vertex shader, consulting the attributes to analyze the evolution and accessing the texture once the correct value of the evolution is obtained. Thus, each vertex uses this shader to correctly update its coordinates. An important issue with this version is that we oblige the vertex shader to update the coordinates even when the LOD is maintained, as we are not storing the resulting vertex buffer. This is an important disadvantage, but we must consider than in interactive applications the user keeps moving all the time, and under these conditions the detail must be updated very often. Exploiting the Possibilities of the Pixel Shader. To overcome the limitation of the vertex shader implementation, we adapted the Sliding-Tris algorithm to a pixel shader. In this case, we will use a render-to-vertex approach to store the newly calculated vertices. With respect to storing the necessary data, we will still use a texture to store the initial values of the vertices, but we will create a new one to store the evolution of the vertices. The main algorithm must consider the render-to-vertex operation[20]. This way, before rendering the model we will define a viewport and render a quad that fills it, covering as many pixels as vertices we must update. Thus, each pixel will use the shader to compute the value of a different vertex. This pixel shader will use a routine similar to the vertex shader but, in this case, consulting the evolution of a vertex implies accessing a texture instead of the attributes. Once all pixels have been evaluated, the CPU must perform an extra operation which involves reading the output buffer into a VertexBufferObject via ReadPixels. Then, we can disable this pixel shader and render the mesh normally using this buffer object as a new source of vertex data.
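As an illustration of the CPU version's carefully delimited transfer, the sketch below updates only the touched range of the vertex buffer with standard OpenGL buffer-object calls (glBindBuffer, glBufferSubData); it reuses the SlidingTrisMesh structure from the earlier sketch, the buffer and parameter names are ours, and 'vbo' is assumed to already hold the full vertex array uploaded once with glBufferData.

#include <GL/glew.h>   // any OpenGL loader exposing glBindBuffer/glBufferSubData

void uploadChangedVertices(GLuint vbo, const SlidingTrisMesh& mesh,
                           std::size_t changedCount)   // vertices touched by the last extraction
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    // The touched vertices are contiguous at the start of the array, so a
    // single sub-range upload is enough to keep bus traffic low.
    glBufferSubData(GL_ARRAY_BUFFER,
                    0,                                                   // byte offset
                    static_cast<GLsizeiptr>(changedCount * sizeof(Vertex)),
                    mesh.currentVertices.data());
}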
Table 1. Models used in the experiments, with their storing cost (in MB)

Model    Vertices   Faces     Original (triangles)   Progressive Meshes   LodStrips   Sliding-Tris
Cow      2904       5804      0.10                   0.27                 0.17        0.12
Bunny    35947      69451     1.21                   3.28                 2.21        1.45
Dragon   54296      108588    1.86                   5.09                 3.32        2.23
Phone    83044      165963    2.84                   7.86                 5.08        3.45
Isis     187644     375283    6.44                   17.23                11.69       7.72
Buddha   543644     1085634   18.65                  51.28                35.51       22.56

5   Results
In this section we will present some tests that cover the storing cost and the rendering performance of the presented versions of Sliding-Tris. The experiments were carried out under Windows XP on a PC with a 2.8 GHz processor, 2 GB RAM and an nVidia GeForce 7800 graphics card with 256 MB RAM. The different implementations were written in C++, OpenGL and HLSL.
5.1   Storing Cost
Table 1 compares the spatial cost of our model with previous continuous uniform-resolution models: PM [21], a triangle-based approach, and LodStrips [7], which is based on triangle strips. As can be observed, the presented model offers the lowest spatial cost. On average, it requires 1.2 times the size of the original triangle mesh, in contrast to PM and LodStrips, which require 2.7 and 1.9 times respectively. This is due to the fact that the only extra information we store is the evolution of the vertices. Furthermore, it is important to note that each evolution element (e_n) is usually small. Our experiments have shown that in most models the vertex evolutions have at most 12 elements and 2 on average, even for meshes composed of thousands of vertices.
5.2   Rendering Time
We analyzed the frame rate obtained throughout the different levels-of-detail when rendering a bunny model without textures nor illumination. We have also included the results of a version without extraction cost, in order to show which frame-rate could be obtained at most. The results are presented in Fig. 5, where it can be observed that the frame rate of the pixel shader version is always the lowest, as this version obliges the pipeline to stop until the render-to-vertex has been completely performed, thus limiting the performance of both the CPU and the GPU. The best results are offered by the vertex shader implementation. It is important to note that in this test we extracted a new LOD for each frame. Thus, the CPU version and the pixel shader one would be able to achieve the performance of the original model when the LOD remains stable. As commented before, interactive applications tend to change scene conditions permanently, and under these circumstances the vertex shader offers the best performance.
Fig. 5. Frame rate obtained when rendering the bunny model with the different implementations. The original values refer to the visualization cost only.
6   Conclusions
In this paper we have presented a new multiresolution model which has been completely adapted to the GPU. Sliding-Tris offers a low storing cost, easy implementation and a fast extraction process which make it suitable for any rendering engine. A further advantage is that the extraction process is always similar in cost. The shaders must consider the extraction process for each vertex, and, as a consequence, these algorithms would obtain similar results when the difference between the demanded and the current LOD is big or small, in contrast to hierarchical models. For the CPU model, an important limitation is the data traffic involved in extracting the approximations. Updating vertices instead of indices involves working with three floats per vertex instead of one integer per index. This limitation can be worse if the meshes work with normals, textures, etc. Nevertheless, our experiments have shown that the total number of update operations is similar, and the final rendering speed is not affected by this increase in the quantity of data interchanged. The use of shaders to perform the LOD changes avoids the traffic problem, even though the render-to-vertex approach is still slow and demands CPU intervention. As a consequence, even when the three approaches are quite similar in rendering time, the vertex shader offers a wiser solution as it can be running while the CPU and GPU are performing a different operation. Acknowledgments. This work has been supported by grant P1 1B2007-56 (Bancaixa), the Spanish Ministry of Science and Technology (Contiene Project: TIN2007-68066-C04-02) and FEDER funds.
References 1. Clark, J.: Hierarchical geometric models for visible surface algorithms. CACM 10(19), 547–554 (1976) 2. Garland, M., Heckbert, P.: Simplification using quadric error metrics. Computer and Graphics 31, 209–216 (1997) 3. Ribelles, J., Chover, M., Lopez, A., Huerta, J.: A first step to evaluate and compare multiresolution models. In: EUROGRAPHICS, pp. 230–232 (1999) 4. Sander, P.V., Mitchell, J.L.: Progressive buffers: View-dependent geometry and texture for lod rendering. In: Symp. on Geom. Process, pp. 129–138 (2005) 5. Southern, R., Gain, J.: Creation and control of real-time continuous level of detail on programmable graphics hardware. Comp. Graph. For. 22(1), 35–48 (2003) 6. Boubekeur, T., Schlick, C.: Generic mesh refinement on gpu. In: Graphics Hardware, pp. 99–104 (2005) 7. Ramos, F., Chover, M., Ripolles, O., Granell, C.: Continuous level of detail on graphics hardware. In: Kuba, A., Ny´ ul, L.G., Pal´ agyi, K. (eds.) DGCI 2006. LNCS, vol. 4245, pp. 460–469. Springer, Heidelberg (2006) 8. Ji, J., Wu, E., Li, S., Liu, X.: Dynamic lod on gpu. In: CGI (2005) 9. Turchyn, P.: Memory efficient sliding window progressive meshes. In: WSCG (2007) 10. Cignoni, P., Ganovelli, F., Gobbetti, E., Marton, F., Ponchio, F., Scopigno, R.: Adaptive tetrapuzzles: efficient out-of-core construction and visualization of gigantic multiresolution polygonal models. In: SIGGRAPH, pp. 796–803 (2004) 11. Cignoni, P., Ganovelli, F., Gobbetti, E., Marton, F., Ponchio, F., Scopigno, R.: Batched multi triangulation. In: IEEE Visualization, pp. 207–214 (2005) 12. Yoon, S., Salomon, B., Gayle, R.: Quick-vdr: Interactive view-dependent rendering of massive models. IEEE Transactions on Visualization and Computer Graphics 11(4), 369–382 (2005) 13. Borgeat, L., Godin, G., Blais, F., Massicotte, P., Lahanier, C.: Gold: interactive display of huge colored and textured models. Trans. Graph. 24(3), 869–877 (2005) 14. Niski, K., Purnomo, B., Cohen, J.: Multi-grained level of detail using a hierarchical seamless texture atlas. In: Proceedings of I3D 2007, pp. 153–160 (2007) 15. Cohen, J., Olano, M., Manocha, D.: Appearance-preserving simplification. In: SIGGRAPH 1998, pp. 115–122. ACM Press, New York (1998) 16. Gonzalez, C., Gumbau, J., Chover, M., Castello, P.: Mesh simplification for interactive applications. In: WSCG (2008) 17. Lindstrom, P., Turk, G.: Image-driven simplification. ACM Trans. Graph. 19(3), 204–241 (2000) 18. Luebke, D., Hallen, B.: Perceptually-driven simplification for interactive rendering. In: 12th Eurographics Workshop on Rendering, pp. 223–234 (2001) 19. Castello, P., Chover, M., Sbert, M., Feixas, M.: Applications of information theory to computer graphics (part 7). In: Eurographics Tutorial Notes, Eurographics, vol. 2, pp. 891–902 (2007) 20. Biermann, R., Cornish, D., Craighead, M., Licea-Kane, B., Paul, B.: pixel buffer objects (2004), http://www.nvidia.com/dev content/nvopenglspecs/GL EXT pixel buffer object.txt 21. Hoppe, H.: Progressive meshes. In: SIGGRAPH, pp. 99–108 (1996)
Efficient Interference Calculation by Tight Bounding Volumes

Masatake Higashi, Yasuyuki Suzuki, Takeshi Nogawa, Yoichi Sano, and Masakazu Kobayashi

Mechanical Systems Engineering Division, Toyota Technological Institute, 2-12-1, Hisakata, Tempaku-ku, Nagoya, 468-8511 Japan
[email protected]
Abstract. We propose a method for efficient calculation of proximity queries for a moving object. The proposed method performs continuous collision detection between two given configurations according to the exact collision checking (ECC) approach which performs distance calculation between two objects. This method obtains efficient results as it employs the concept of clearance bounds and performs approximate distance calculations with a tight fit of bounding volumes. The high efficiency of the method, when applied to robot path planning, is demonstrated through some experiments.
1   Introduction
Interference calculation or collision detection [1] is one of the key technologies employed in computational geometry and related areas such as robotics, computer graphics, virtual environments, and computer-aided design. In geometric calculations, a collision or proximity query reports information about the relative configuration or placement of two objects, and it checks whether two objects overlap in space, or whether their boundaries intersect; furthermore, it computes the minimum Euclidean separation distance between the boundaries of the objects. These queries are necessary in different applications, including robot motion planning, dynamic simulation, haptic rendering, virtual prototyping, interactive walkthroughs, and molecular modeling. Some of the most common algorithms employed for collision detection and separation distance computation use spatial partitioning or bounding volume hierarchies (BVHs). Spatial subdivision is the recursive partitioning of the embedding space, whereas BVHs are based on the recursive partitioning of the primitives of an object. The cost of performing a proximity query, including collision detection and/or distance computation, is often greater than 90% of the planning time involved in robot motion planning [2]. Due to performance related issues, most of the existing planners use discrete proximity query algorithms and perform queries in several fixed sampled configurations in a given interval. This does not assure that they do not miss any thin objects between sampled configurations. Therefore, a method [3] that employs exact collision checking (ECC) and assures no collision has
been proposed recently. Redon et al. [4], [5] have proposed a different approach; this method checks the collisions between the swept volume of a robot and its obstacles, and achieves a runtime performance roughly comparable to that of the ECC method. In this paper, we propose a method that employs the concept of ECC but still obtains efficient results, by adopting the following principles.

1. We do not calculate the exact minimum distance between an object and its obstacles. However, we obtain the approximate minimum distance between an object and its obstacles by using BVHs.
2. We avoid calculations if the distance between BVHs is larger than a clearance bound that assures no collision by the object movement.
3. We calculate BVHs of obstacles by dividing the bounding volumes according to the approximate convex decomposition (ACD) algorithm to enhance the tightness of fit and perform the collision tests with fewer BVHs.

The rest of the paper is organized as follows. Section 2 introduces the clearance bound calculation after explaining ECC. Section 3 proposes a method for obtaining a tighter fit of BVHs. Section 4 describes the results of the experiments along with the discussion. Section 5 summarizes the research.
2   Exact Collision Checking and Clearance Bound Calculation

2.1   Exact Collision Checking
Schwarzer et al. [3] introduced an algorithm that executes ECC by using the distances between an object and its obstacles. The object does not collide with obstacles if (1) holds for two configurations q and q'.

ρ · d(q, q') < η(q) + η(q')    (1)

Here, d(q, q') denotes a path length in the configuration space C. ρ is a space conversion factor for conversion from the configuration space to the work space; it is calculated by using the present configuration. Hence, ρ · d(q, q') denotes the maximum trajectory length of the object in the work space, for the movement from q to q' in the configuration space. η(q) denotes the Euclidean distance between the object and its obstacles in configuration q, as shown in Fig. 1. Equation (1) indicates that an object can move freely if the sum of distances from the object to the obstacles is larger than its moving distance. When (1) is not satisfied, we insert a mid point qmid and calculate the distances recursively for qmid ∈ Cfree; we return 'collision' for qmid ∈ Cobstacle.
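As an illustration of this recursive test, the following self-contained C++ sketch checks a straight configuration-space segment for a translating point object against a single spherical obstacle; the obstacle, the distance function eta() and the resolution threshold are our stand-ins for the BVH-based distance query used in the paper, and ρ = 1 in this toy translational setting.

#include <cmath>

struct Config { double x, y, z; };   // pure translation, so rho = 1 in this toy setting

// Toy distance oracle: signed distance from configuration q to one spherical
// obstacle of radius 1 centred at the origin (negative inside the obstacle).
double eta(const Config& q)
{
    return std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z) - 1.0;
}

double cspaceDistance(const Config& a, const Config& b)
{
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Recursive exact collision check of the segment [q1, q2] based on Eq. (1):
// the segment is free if rho * d(q1, q2) < eta(q1) + eta(q2).
bool segmentIsFree(const Config& q1, const Config& q2,
                   double rho = 1.0, double resolution = 1e-4)
{
    const double e1 = eta(q1), e2 = eta(q2);
    if (e1 <= 0.0 || e2 <= 0.0) return false;     // an endpoint lies in C_obstacle
    const double d = cspaceDistance(q1, q2);
    if (rho * d < e1 + e2) return true;           // Eq. (1) holds: no collision possible
    if (d < resolution) return false;             // undecided at the resolution limit: be conservative
    const Config mid = {0.5 * (q1.x + q2.x), 0.5 * (q1.y + q2.y), 0.5 * (q1.z + q2.z)};
    return segmentIsFree(q1, mid, rho, resolution) &&
           segmentIsFree(mid, q2, rho, resolution);
}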
2.2   Distance Calculation by a Clearance Bound
The cost of performing the proximity query is given in [6]:

T = N_bv × C_bv + N_p × C_p ,    (2)
Fig. 1. Exact collision check using distances
Fig. 2. Collision check by clearance bound
where T denotes the total cost function for proximity queries; N_bv, the number of bounding volume pair operations; and C_bv, the total cost of a BV pair operation, including the cost of transforming each BV for use in a given configuration of the models and other per BV-operation overhead. N_p denotes the number of primitive pairs tested for proximity, and C_p denotes the cost of testing a pair of primitives for proximity (e.g., overlaps or distance computation). The computation cost increases significantly if the objects consist of many points (triangles). In particular, for the distance calculation, the latter part, N_p × C_p, is relatively large, and we must calculate the distance between all the objects, searching for pairs of vertices, edges, or planes on them. We do not require a precise distance value, but we must determine whether the object can move from the start configuration to the goal one without colliding with objects. Therefore, Schwarzer et al. introduced greedy distance calculation to compute lower distance bounds in [3]. In addition, we introduce a clearance bound. A clearance bound is a distance which is sufficient for an object to clear obstacles. We verify whether the distance between the object and its obstacles is greater than the clearance bound, instead of obtaining the minimum distance. The calculation is executed by using bounding volumes of the objects. If the distance is larger than the clearance bound in the calculations involved in the trees of the bounding volumes, we can stop the calculation at a higher level of the tree. We need not trace the tree to leaves, but we can discontinue the calculation at the corresponding level.

We apply the clearance bound to calculations in ECC. By setting the clearance bound to half the distance between the two configurations, σ = 1/2 · ρ · d(q, q'), we verify whether η(q) > σ at both ends of the interval and cull the distance calculations for cases in which the object sufficiently clears the obstacles. When the object is near an obstacle, the above condition is not satisfied; hence, mid points are inserted until the distance decreases below a given resolution or the robot is in Cobstacle. In Fig. 2, a clearance bound is applied. When η(q') < σ at q', a mid point is inserted and η(q') > σ/2 and η(qmid) > σ/2 are verified. In this example, the check is completed at the second level of dividing the interval. We can also apply the clearance bounds to distance calculations for determining an adaptive step size of the grid-based approach such as BLS [7]. When a global planner wants to select a step size according to the space in a configuration, it
sets the given step size as a clearance bound. The robot is collision free if it advances within the clearance bound. Thus, both the step-size determination and ECC calculation are executed simultaneously. We employ a collision checker called proximity query package (PQP) [8], which pre-computes a bounding hierarchical representation of each object by using two types of bounding volumes: oriented-bounding boxes (OBBs) [9] and rectangle swept spheres (RSSs). OBBs are efficient for detecting collisions of objects, and RSSs are effective for the distance calculations. We execute clearance bound calculation by using the approximate separation-distance-computation, which is provided by the PQP as a function.
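The idea of stopping the traversal near the root once the clearance bound is exceeded can be sketched as follows; this is not PQP code — we use a toy sphere tree so that the conservative distance bound stays obvious, and all names are ours.

#include <cmath>
#include <memory>

// Minimal sphere-tree node used only to illustrate the clearance-bound query.
// PQP uses OBBs and RSSs instead; internal nodes are assumed to have two children.
struct Node {
    double cx, cy, cz, radius;            // bounding sphere of the node
    std::unique_ptr<Node> left, right;    // null for leaves
    bool isLeaf() const { return !left && !right; }
};

static double centreDist(const Node& a, const Node& b)
{
    const double dx = a.cx - b.cx, dy = a.cy - b.cy, dz = a.cz - b.cz;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Returns true if the distance between the two hierarchies is provably larger
// than the clearance bound sigma.  Sub-trees whose conservative lower bound
// already exceeds sigma are never descended into, which is the point of the
// clearance-bound test: the traversal can often stop near the root.
bool clearedBy(const Node& a, const Node& b, double sigma)
{
    const double lowerBound = centreDist(a, b) - a.radius - b.radius;
    if (lowerBound > sigma) return true;            // prune: provably far enough apart
    if (a.isLeaf() && b.isLeaf()) return false;     // leaf pair possibly closer than sigma
    // Descend into the larger node first (a simple splitting rule).
    if (!a.isLeaf() && (b.isLeaf() || a.radius >= b.radius))
        return clearedBy(*a.left, b, sigma) && clearedBy(*a.right, b, sigma);
    return clearedBy(a, *b.left, sigma) && clearedBy(a, *b.right, sigma);
}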
3   Tighter Fit of Bounding Volume Hierarchies

3.1   Approximate Convex Decomposition of OBBs
To create a BVH for OBBs or RSSs in PQP, an intermediate OBB is divided into two parts according to the principal directions of the covariance matrix (PDC) of the distribution of the input data. However, for obtaining a tighter fit of OBBs, we should divide an intermediate OBB according to the position where the shape changes direction sharply, or at the deepest point of a concavity. Hence, we utilize the approximate convex decomposition (ACD) algorithm [10] for generating an OBB hierarchy. First, we generate a convex hull for the input point data; then, we search for the notch point which is not located on the hull and is furthest away from the hull. Finally, we divide the volume into two boxes by the plane passing through this point. The direction of the plane is determined, for example, to equally divide the included angle at the notch vertex. This is repeated until the distance from the point to the hull is below the specified tolerance. We provide an example of collision checking for two types of solids: C-type and S-type. Figure 3 shows the manner in which each type of solid is divided into an OBB hierarchy. The left-hand and right-hand images show the decompositions by PDC and ACD, respectively. The tightness of the OBB fit is measured by the volumes of the bounding boxes at each level (see Table 1). A lower hierarchy level indicates a larger volume difference. At the second level, decomposition by ACD is tighter by 13% as compared to that by PDC. For comparing the computation time required for the two types of decomposed BVHs, a collision check is executed for the movement of the left (green) object, as shown in Fig. 4. The result is shown in Table 2, which also includes the number of collision calculations. The computation time is reduced by approximately 40% because a tighter fit separates objects earlier (at a higher level).
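The reference ACD implementation is not reproduced in the paper, so the sketch below only illustrates its central step — finding the deepest notch with respect to the convex hull — and does so for a 2D outline to keep the hull code short; extending it to the 3D point sets used for OBB splitting follows the same pattern. All names are ours.

#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

struct P2 { double x, y; };

static double cross(const P2& o, const P2& a, const P2& b)
{
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Andrew's monotone-chain convex hull (counter-clockwise, no repeated endpoint).
static std::vector<P2> convexHull(std::vector<P2> pts)
{
    if (pts.size() < 3) return pts;
    std::sort(pts.begin(), pts.end(),
              [](const P2& a, const P2& b) { return a.x < b.x || (a.x == b.x && a.y < b.y); });
    std::vector<P2> h(2 * pts.size());
    std::size_t k = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {               // lower hull
        while (k >= 2 && cross(h[k-2], h[k-1], pts[i]) <= 0) --k;
        h[k++] = pts[i];
    }
    for (std::size_t i = pts.size() - 1, t = k + 1; i-- > 0; ) { // upper hull
        while (k >= t && cross(h[k-2], h[k-1], pts[i]) <= 0) --k;
        h[k++] = pts[i];
    }
    h.resize(k - 1);
    return h;
}

static double segmentDistance(const P2& p, const P2& a, const P2& b)
{
    const double vx = b.x - a.x, vy = b.y - a.y;
    const double wx = p.x - a.x, wy = p.y - a.y;
    const double t = std::max(0.0, std::min(1.0, (vx * wx + vy * wy) / (vx * vx + vy * vy)));
    const double dx = p.x - (a.x + t * vx), dy = p.y - (a.y + t * vy);
    return std::sqrt(dx * dx + dy * dy);
}

// Returns the index of the deepest notch (outline vertex furthest from the
// convex hull) and its concavity; the box is split through that vertex when
// the concavity exceeds the given tolerance.
std::pair<std::size_t, double> deepestNotch(const std::vector<P2>& outline)
{
    if (outline.size() < 3) return {0, 0.0};
    const std::vector<P2> hull = convexHull(outline);
    std::size_t best = 0;
    double bestDist = 0.0;
    for (std::size_t i = 0; i < outline.size(); ++i) {
        double d = 1e300;                           // distance to the hull boundary
        for (std::size_t j = 0; j < hull.size(); ++j)
            d = std::min(d, segmentDistance(outline[i], hull[j], hull[(j + 1) % hull.size()]));
        if (d > bestDist) { bestDist = d; best = i; }
    }
    return {best, bestDist};
}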
3.2   Decomposition for Open Shells
Obstacles are not necessarily composed of complete solids that have no boundaries, but they are sometimes represented by open shells. For example, a car body consists of sheet metal parts; hence it might be composed of open shells.
Table 1. Comparison of the volume of BVHs at each hierarchy level (PDC vs. ACD)

Object   Hierarchy level   Volume (PDC)   Volume (ACD)
C-type   0                 35.9           35.9
         1                 47.5           47.0
         2                 37.8           32.7
S-type   0                 40.0           40.0
         1                 35.2           35.0
         2                 32.2           28.3

Table 2. Comparison of computation time for different BVHs

Object               PDC    ACD
S-type  number       30     28
        time (ms)    0.58   0.37
C-type  number       52     46
        time (ms)    1.24   0.74

Fig. 3. OBBs by PDC and ACD
Fig. 4. Movement of objects for collision check
For an open shell, we require a different algorithm of decomposition as compared to that required for a solid object. Since an open shell has boundaries, a point on the boundary might be the most concave part, although it is located on the convex hull. Hence, in addition to the convex hull calculation, we must verify the distance between the concave vertex on the boundary and the convex boundary. We create a data structure of a shell model from the input data, which include point data with indices and coordinate values and face data with a sequence of vertices expressed by point indices such as mesh data [11]. The input data are stored in two tables, as shown in Fig. 5, in the form of a data structure. From these tables, we form a table of vertex-cycle, which is a sequence of counterclockwise faces around a centered vertex. The list can be generated according to the following procedure. Algorithm 1: Generation of point-based data structure (Vertex cycle) For each vertex (vi ), 1) Collect faces from Faces Table, which include vi . Select one face, for example fp , and its edge (vj , vi ).
Fig. 5. Point-based data structure: input Faces and Vertices tables and the generated vertex-cycle and boundary/branch-edge lists
Fig. 6. Shell model with branch boundary; edge (3, 8) is a branch edge
Fig. 7. Convex boundary and bridges; pm marks a concave vertex
2) Search face fq , which has edge (vi , vj ), and store fq in the vertex-cycle list. If there is no face that includes (vi , vj ), insert “-”; this means no face and implies that it is a boundary. Then, search the vertex cycle in reverse. 3) Get edge (vk , vi ) in fq and set it as the next edge for step 2); repeat steps 2) and 3) until the selected face returns to the first face or the boundary again. In step 2, if the number of edges is greater than one, it is a branch edge, where more than two faces meet. We consider a branch edge as a semi-boundary that separates a shell as a boundary edge. We insert “#” and search faces in the reverse direction, similarly to the boundary edge. As a result, we can obtain multiple vertex cycles. From the table of vertex cycles, we deduce the boundaries and branch boundaries of a shell model, according to the following steps. Algorithm 2: Extraction of boundaries 1) Collect vertices from the vertex-cycle table, which include a boundary edge, for example, (fi , fj ,-). Set vertex vi in the boundary list and set fi which is next to the boundary, that is “-”, as a boundary face. 2) Search vertex vj , which has a sequence of (fi ,-) in the vertex cycle, and add it to extend the boundary list.
3) Select face fk next to “-” in the vertex cycle of vj . Replace fi with fk and repeat step 2) until we reach the first vertex. 4) Repeat this until there remain no vertices in Step 1. If there are holes, we have multiple boundaries. We can obtain the branch boundaries by using the same algorithm provided above except for the following conditions: in a vertex cycle, there are multiple pairs of same edges and there may exist pairs of a branch edge and boundary edge. Figure 6 shows an example of a boundary edge. Here, an open shell is added to the shape in Fig. 5. Edge (3, 8) is a branch edge. The vertex cycles for vertex 8 are (f 1, f 2, f 3, f 7, #, f 8, f 9) and (#, f 10, f 11, -). Thus we simply express a non-manifold surface model instead of exact representation [12]. Now, we describe a method to divide an OBB to generate a tight fit of BV hierarchy. First, we divide an object into independent shells according to the obtained boundaries and compose a binary tree for the open shells according to the positions of the shell boundaries. Then, we divide each OBB as a shell, as follows. Here, we define a concave vertex locally for every three points along a three dimensional boundary and compose a concave part of pairs of a bridge and concave vertices. A bridge [10] is an edge inserted to connect a concave part and generate a convex boundary. Figure 7 shows an example of a convex boundary and bridges for an open shell. The small circles show concave vertices. The concave vertex that is furthest from the corresponding bridge is the point for division of the shell if the distance is larger than a given tolerance. The shell is divided along the edges that connect the division point and the nearest vertex on the opposite boundary or the inner boundaries of holes. This division is repeated until there are no concave vertices whose distances to the bridge are greater than the tolerance. In Fig. 7, pm is a division point and (pm , qn ) is a dividing edge. When there are no concave parts and there exists a hole, we divide its boundary at the extremum vertices along the principal direction of the hole boundary.
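The full vertex-cycle construction of Algorithms 1 and 2 is somewhat lengthy, but the classification it ultimately relies on — which edges are boundary edges and which are branch (non-manifold) edges — can be sketched compactly by counting face incidences; the structures and names below are ours, not the paper's.

#include <array>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Input tables of the shell model: vertex coordinates and triangular faces
// given as index triples (the layout of Fig. 5).
struct ShellModel {
    std::vector<std::array<double, 3>> vertices;
    std::vector<std::array<std::uint32_t, 3>> faces;
};

// Classifies every undirected edge by the number of faces using it:
// 1 face -> boundary edge, 2 faces -> interior edge, more than 2 -> branch edge.
struct EdgeClassification {
    std::vector<std::pair<std::uint32_t, std::uint32_t>> boundary, branch;
};

EdgeClassification classifyEdges(const ShellModel& m)
{
    std::map<std::pair<std::uint32_t, std::uint32_t>, int> uses;
    for (const auto& f : m.faces)
        for (int i = 0; i < 3; ++i) {
            std::uint32_t a = f[i], b = f[(i + 1) % 3];
            if (a > b) std::swap(a, b);            // undirected edge key
            ++uses[{a, b}];
        }
    EdgeClassification out;
    for (const auto& [edge, count] : uses) {
        if (count == 1) out.boundary.push_back(edge);
        else if (count > 2) out.branch.push_back(edge);   // non-manifold, e.g. edge (3, 8) in Fig. 6
    }
    return out;
}

Chaining the boundary edges end to end then yields the boundary loops that Algorithm 2 produces, with branch edges treated as semi-boundaries when the shell is split.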
4   Experiments
We implemented our local planner for application to rapidly-exploring random trees (RRT) [13] and bi-directional local search (BLS) [7] algorithms. We used the Motion Strategy Library (MSL) [14] to implement and modify RRT as well as GUI for BLS. We show the effectiveness of our planner, applied to RRT and BLS, by conducting some experiments for an object movement in a 2D maze and 3D environments along with an articulated robot in 3D environments. The planner was implemented in C++ and the experiments were executed on a Windows PC (2.4 GHz Pentium 4 processor with 1.0 GB memory). First, we report the analysis of the effectiveness of our algorithms. Figure 8 shows the start and goal configurations (small red circles) for a moving object in a 2D maze, and Fig. 9 shows those for a 3D cage. We ran RRT 10 times for the 2D maze and 3D cage. Each case involves three types of calculation methods – without ECC, with ECC, and with ECC along with clearance bounds (CB)
Fig. 8. Start and goal configurations for the 2D maze: (a) start and goal configurations, (b) obtained tree and path
Fig. 9. Start and goal configurations for the cage data: (a) start configuration, (b) goal configuration

Table 3. Experiment results for the moving object

                              Maze    Cage
CPU Time   Without ECC        28.94   3.71
           ECC                35.90   22.05
           ECC with CB        27.32   11.85
No. of     Collision checks   7023    26383
           Nodes of tree      2022    470
           Nodes in path      247     30

Table 4. Experiment results for the articulated robot

                              Case A            Case B
                              RRT      BLS      RRT      BLS
CPU Time   Without ECC        369.74   42.34    55.87    9.66
           ECC                523.59   145.32   238.25   21.44
           ECC with CB        422.60   115.22   69.84    2.87
No. of     Collision checks   16878    90817    7548     1750
           Nodes of tree      5869     4546     2267     467
           Nodes in path      478      2345     110      366
– in the distance calculations. Table 3 shows the results of the calculation. The averages of running CPU time are shown for these methods. The number of points for collision check in the configuration space, number of nodes in the tree, and number of nodes used for the path are shown for ECC with CB. The running time for ECC with CB as compared to that for ECC without CB is reduced by 25%, from 35.9 to 27.3, for the 2D maze. For the cage, the running time is significantly reduced, from 22.1 to 11.9 (approximately half). The difference between the running time in the two cases is due to the following reasons. First, the number of generated nodes in the maze is considerably larger than that in the cage; hence, a large amount of time is required for the tree generation, which is shown in Fig. 8 (b). Second, the time required for 2D distance calculation in ECC is considerably smaller than that required for 3D distance calculation. Next, we executed experiments on articulated robots that have many degrees of freedom and require a large number of interference calculations. Figure 10 shows the start and goal configurations of the robot along with the given obstacles. The robot is a 6-axes articulated robot and has a fixed base. The surfaces of the robot are represented by 989 triangles and those of a car are represented by 2,069 triangles. In Case A: figures (a) and (b), the robot goes into a car from the outside. However, the robot turns between two cars in Case B: figures (c) and (d). We ran RRT and BLS 20 times for Case B and 5 times for Case A.
Fig. 10. Start and goal configurations for the articulated robot: (a) Start-A, (b) Goal-A, (c) Start-B, (d) Goal-B
Fig. 11. Boundaries and extracted shells: (a) output boundaries, (b) extracted shells, (c) door panel, (d) shell decomposition
Each case involves three types of calculation methods, similar to the previous experiments. Table 4 shows the results of the calculations. The running time of ECC with CB compared to ECC without CB is reduced by 20% for Case A; however, for Case B, the running time is reduced drastically from 238.25 to 69.84 (1/3) for RRT and from 21.44 to 2.87 (1/7) for BLS. This is because the robot cannot be separated easily from the BVH due to its position inside the car. In Case B, the time required for ECC with CB is nearly the same as that required “without ECC” for RRT. Hence, the introduction of CB is proven to be effective. For BLS, the time required by the method without ECC is larger than that required for ECC with CB. This is because in the calculation without ECC, we require distance calculations for determining a step size adaptively; however, in the calculation by ECC with CB, we apply the clearance bound to the adaptive step size along with the calculations for ECC. RRT requires a larger time as compared to BLS, because it generates a tree uniformly for free configurations and requires time to search the nearest node to add it to the tree. On the other hand, BLS attacks a target more directly with less nodes and uses discrete adaptive step sizes. Furthermore, BLS applies a lazy evaluation of ECC. Next, we conducted the decomposition of OBBs by ACD and path calculation for Case A. The reduction of execution time was only 10%, because the object consists of open shells. Hence, we generated a data structure of a shell model and obtained boundaries and shells, as shown in Fig. 11. An example of open-shell decomposition is also shown for a door panel. By using extracted shells, the CPU time decreased to less than one third of the initial value.
5   Summary
We have introduced a method for efficient calculation of proximity queries for a moving object. Our method can be employed for continuous collision detection
between two given configurations according to the ECC approach. It obtains results efficiently by using the concept of a clearance bound and approximate distance calculation with close bounding volumes. To obtain close BVHs, we employed algorithms for the decomposition of OBBs. For a solid object, we decomposed OBBs by using ACD; we generated a data structure for a shell object and detected a dividing point furthest from a bridge in the convex boundary. The high efficiency of the method, when applied to RRT and BLS, is demonstrated by experiments for moving objects and practical articulated robots. Acknowledgments. This study was partly supported by the High-Tech Research Center for Space Robotics from the Ministry of Education, Sports, Culture, Science and Technology, Japan. The implementation of the prototype system was performed in cooperation with IVIS Inc.
References 1. Hadap, S., Eberle, D.: Collision Detection and Proximity Queries. In: SIGGRAPH 2004 Course Notes (2004) 2. Latombe, J.C.: Robot Motion Planning. Kluwer, Boston (1991) 3. Schwarzer, F., Saha, M., Latombe, J.-C.: Exact collision checking of robot paths. In: Boissonnat, J.D., et al. (eds.) Algorithmic Found. Robot V., pp. 25–42 (2004) 4. Redon, S., Kim, Y.J., Lin, M.C., Manocha, D.: Fast Continuous Collision Detection for Articulated Robot. In: Proc. ACM Symp. Solid Modeling and Application, pp. 1316–1321 (2004) 5. Redon, S., Lin, C.: Practical Local Planning in the Contact Space. In: Proc. IEEE Int. Conf. Robot. Autom., pp. 4200–4205 (2005) 6. Lin, M.C., Manocha, D.: Collision and Proximity Queries. In: Handbook of Discrete and Computational Geometry, 2nd edn., ch. 35, pp. 787–807. Chapman & Hall/CRC (2004) 7. Umetani, S., Kurakake, T., Suzuki, Y., Higashi, M.: A Bi-directional Local Search for Robot Motion Planning Problem with Many Degrees of Freedom. In: The 6th Metaheuristics International Conference MIC 2005, pp. 878–883 (2005) 8. Larsen, E., Gottschalk, S., Lin, M., Manocha, D.: Fast Proximity Queries with Swept Sphere Volumes. Technical report TR99-018. Department of Computer Science, University of N. Carolina, Chapel Hill (1999) 9. Gottschalk, S., Lin, M., Monacha, D.: OBB-Tree: A Hierarchical Structure for Rapid Interference Detection. In: Proceedings of ACM SIGGRAPH 1996, pp. 171– 180 (1996) 10. Lien, J.-M.C., Amato, M., Approximate, N.: convex decomposition of polygons. In: Proc. 20th Annual ACM Symp. Comutat. Geom (SoCG), pp. 17–26 (2004) 11. Botsch, M., Pauly, M.: Geometric Modeling Based on Polygonal Meshes. In: SIGGRAPH 2007 Course Notes (2007) 12. Higashi, M., Yatomi, H., Mizutani, Y., Murabata, S.: Unified Geometric Modeling by Non-Manifold Shell Operation. In: Proc. Second Symposium on Solid Modeling and Applications, pp. 75–84. ACM Press, New York (1993) 13. Kuffner, J.J., LaValle, S.M.: RRT-connect: An efficient approach to single-query path planning. In: Proc. IEEE Int. Conf. Robot. Autom., pp. 995–1001 (2000) 14. LaValle, S.M.: Motion Strategy Library, http://msl.cs.uiuc.edu/msl/
Modeling of 3D Scene Based on Series of Photographs Taken with Different Depth-of-Field

Marcin Denkowski¹, Michal Chlebiej², and Pawel Mikolajczak¹

¹ Faculty of Computer Science, Maria Curie-Sklodowska University, pl. Marii Curie-Sklodowskiej 5, 20-031 Lublin, Poland
[email protected]
² Faculty of Mathematics and Computer Science, N. Copernicus University, Chopina 12/18, 87-100 Toruń, Poland
Abstract. This paper presents a method for fusing multifocus images into enhanced depth-of-field composite image and creating a 3D model of a photographed scene. A set of images of the same scene is taken from a typical digital camera with macro lenses with different depth-of-field. The method employs convolution and morphological filters to designate sharp regions in this set of images and combine them together into an image where all regions are properly focused. The presented method consists of several phases including: image registration, height map creation, image reconstruction and final 3D scene reconstruction. In result a 3D model of the photographed object is created.
1   Introduction
Macro photography is a type of close-up photography with magnification ratios from about 1:1 to about 10:1. The most crucial parameter of macro photography is the depth of field (DOF) [1]. Because it is very difficult to obtain high values of DOF for extreme close-ups, it is essential to focus on the most important part of the subject. Any other elements that are even a millimeter farther or closer may appear blurred in the acquired photo. The depth of field can be defined as the distance in front of and behind the subject appearing in focus. Only a very short range of the photographed subject will appear in exact focus. The most important factor that determines whether the subject appears in focus is how a single point is mapped onto the sensor area. If a given point is exactly at the focus distance it will be imaged as one point on the sensor; otherwise it will produce a disk whose border is known as a "circle of confusion". These circles can be used to define the measure of focus and blurriness, as they increase in diameter the further away they are from the focus point. For a specific film format, the depth of field is described as a function parametrized by: the focal length of the lens, the diameter of the lens opening (the aperture), and the distance between the subject and the camera. Let D be the distance at which the camera is focused, F the focal length (in millimeters) calculated for an aperture number f, and k the "circle of confusion" for a given film format (in millimeters); then the depth of field (DOF) [1] can be defined as:

DOF1,2 = D / (1 ± (1000 × D × k × f) / F²)    (1)
where DOF1 is the distance from the camera to the far depth-of-field limit, and DOF2 is the distance from the camera to the near depth-of-field limit. The aperture controls the effective diameter of the lens opening. Reduction of the aperture size increases the depth of field; however, it also reduces the amount of light transmitted. Lenses with a short focal length have a greater depth-of-field than long lenses. Greater camera-to-subject distance results in a greater depth-of-field. We used this optical phenomenon to achieve two aims. The first one was to obtain the deepest possible depth-of-field using standard digital camera images and image processing algorithms. The second goal was to create a three-dimensional model of the photographed scene. As an input we have created a series of macro photograph images of the same subject with different focus lengths. In the first step of our method we have to register them together to create a properly aligned stack of images. The next step is to fuse them into one composite image. For that purpose we propose an enhanced multiscale convolution and morphology method, which we have introduced in [2]. Methods for image fusion using multiscale morphology have been broadly discussed in [3,4,5]. As a result of the fusing algorithm we obtain a height map and the reconstructed focused image with a very deep depth-of-field. The height map is a label map which determines the height of each part of the scene. From this map, we can construct a 3D model of the scene. In this work we limit our method to macro photography only and we assume that images were taken perpendicularly or almost perpendicularly to the scene. However, to clearly present advantages and problems of our method, we also show some cases with sets of images acquired in a different way.
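Equation (1) is easy to evaluate directly; the short program below computes both limits for one illustrative set of values (the values are our own, not taken from the paper), assuming — as the factor of 1000 suggests — that D is given in meters while F and k are given in millimeters.

#include <cstdio>

// Near and far depth-of-field limits from Eq. (1).
// D: focus distance [m], F: focal length [mm], f: aperture number,
// k: circle of confusion [mm].
struct DofLimits { double nearLimit, farLimit; };

DofLimits depthOfField(double D, double F, double f, double k)
{
    const double c = 1000.0 * D * k * f / (F * F);
    return { D / (1.0 + c),     // DOF2: near limit
             D / (1.0 - c) };   // DOF1: far limit (a non-positive denominator means "to infinity")
}

int main()
{
    // Illustrative values only: focus at 0.3 m, 100 mm macro lens, f/8,
    // 0.03 mm circle of confusion.
    const DofLimits d = depthOfField(0.3, 100.0, 8.0, 0.03);
    std::printf("in focus from %.4f m to %.4f m\n", d.nearLimit, d.farLimit);
    return 0;
}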
2   Registration
In the first step a set of photographs of the desired object is acquired. Unfortunately, during extreme close-up sessions small movements of the camera are possible even when using tripods for stabilization. To make the reconstruction method more robust we can make use of an image registration procedure. The main idea behind image registration is to find the best geometric alignment between a set of overlapping images. The quality-of-match measure is the matching function parametrized by the geometric transformation. In our method we use the rigid (translations and rotation) or the affine transformation model (rigid + scaling and shears). In most cases it is sufficient to use the simplified rigid transformation (translations only). But when images are acquired without stabilization devices the use of the complete affine transformation is a necessity. In
our approach we use the normalized mutual information [7] as the matching function:

NMI(FI, RI) = (h(FI) + h(RI)) / h(FI, RI)    (2)

where RI represents the reference image and FI represents the floating image, and

h(FI) = − Σ_x pFI(x) log(pFI(x))    (3)
h(RI) = − Σ_x pRI(x) log(pRI(x))    (4)
h(FI, RI) = − Σ_{x,y} pFI,RI(x, y) log(pFI,RI(x, y))    (5)

where h(FI), h(RI) and h(FI, RI) are the single and joint entropies [2], pFI and pRI are the probabilities of each intensity in the intersection volume of both data sets, and pFI,RI is the probability distribution of the joint histogram. For the minimization of the selected similarity measure we use Powell's algorithm [8]. As a result of the registration procedures we obtain a set of geometrically matched images that can be used in the next stages of our wide depth-of-field reconstruction algorithm.
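For concreteness, a direct C++ implementation of Eqs. (2)–(5) from a joint histogram might look as follows; it is a sketch under the assumption of two equally sized, already-overlapping grey-level images with intensities in [0, bins), and the function name is ours.

#include <cmath>
#include <vector>

double normalizedMutualInformation(const std::vector<int>& fi,
                                   const std::vector<int>& ri, int bins)
{
    std::vector<double> joint(static_cast<std::size_t>(bins) * bins, 0.0);
    std::vector<double> pf(bins, 0.0), pr(bins, 0.0);
    const double n = static_cast<double>(fi.size());
    for (std::size_t i = 0; i < fi.size(); ++i)
        joint[static_cast<std::size_t>(fi[i]) * bins + ri[i]] += 1.0 / n;   // joint probabilities
    for (int a = 0; a < bins; ++a)
        for (int b = 0; b < bins; ++b) {
            pf[a] += joint[static_cast<std::size_t>(a) * bins + b];         // marginal of FI
            pr[b] += joint[static_cast<std::size_t>(a) * bins + b];         // marginal of RI
        }
    auto entropy = [](const std::vector<double>& p) {
        double h = 0.0;
        for (double v : p) if (v > 0.0) h -= v * std::log(v);   // log base cancels in the ratio
        return h;
    };
    // NMI = (h(FI) + h(RI)) / h(FI, RI); the joint entropy is non-zero for
    // any non-constant image pair.
    return (entropy(pf) + entropy(pr)) / entropy(joint);
}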
3   Image Fusion

3.1   Overview
Image fusion is a process of combining a set of images of the same scene into one composite image. The main objective of this technique is to obtain an image that is more suitable for visual perception. This composite image has reduced uncertainty and minimal redundancy while the essential information is maximized. In other words, image fusion integrates redundant and complementary information from multiple images into a composite image but also decreases dimensionality. There are many methods discovered and discussed in literature that focus on image fusion. They vary with the aim of application used, but they can be mainly categorized due to algorithms used into pyramid techniques [10,11], morphological methods [3,4,5], discrete wavelet transform [12,13,14] and neural network fusion [15]. The different classification of image fusion involves pixel, feature and symbolic levels [16]. Pixel-level algorithms are low level methods and work either in the spatial or in transform domain. This kind of algorithms work as a local operation despite of transform used and can generate undesirable artifacts. These methods can be enhanced by using multiresolution analysis [10] or by complex wavelet transform [14]. Feature-based methods use segmentation algorithms to divide images into relevant patterns and then combine them to create output image by using various properties [17]. High-level methods combine image descriptions, typically, in the form of relational graphs [18].
3.2   Methodology
In our work we use multiscale convolution and morphology methods combined with a pyramid segmentation algorithm to distinguish homogeneous regions. Our fusion method is also capable of working with color images. Color image fusion has been discussed in [19]. At this stage we assume that the images on the image stack are aligned to each other. At this point the main objective is to create the focused image and the height map. The whole algorithm, shown in Fig. 1, can be divided into 5 stages:

1. Creation of an n-level multiresolution pyramid for every input image. In this case we use a median filter to downscale images.

2. Segmentation of every image on the stack by using pyramid segmentation. For this process we convert images into the HSL color model [9] to separate the luminance (contrast) information contained in the luminance channel from the color description in the hue and saturation channels. Example results of the segmentation process are shown in Fig. 2 as segmentation maps.

3. Calculation of the local standard deviation SD over a local region R for every pixel f(x, y) at each pyramid level L for every image on the stack (z):

SD_R^(L)(x, y, z) = (1/N_R) Σ_{(x,y)∈R,z} (f(x, y) − f_R)²    (6)
Color RGB components are converted to graylevel intensity according to Gf = 0.299R + 0.587G + 0.114B.

4. Reconstruction rules.

Step-1. For the lowest level of the pyramid, pixels with maximum SD_max^(0)(x, y, z) are marked as focused and labeled in the height map HM(x, y) with the z value. If abs(SD_max^(0)(x, y) − SD_min^(0)(x, y)) < Ts, where Ts is a threshold value, the pixel is marked as unresolved because it usually belongs to a smooth region. These pixels are taken care of at subsequent steps.

Step-2. Every pixel is checked against the segmentation map. If it is not near any edge and its SD_R(x, y, z) value drastically differs from the average SD_R(x, y, z) value for its region R, it is marked with the SD_R(x, y, z) value of the median pixel. This prevents marking false or noise pixels.

Step-3. For every i-th pyramid level, starting from i = 1, if SD_R^(i)(x, y, z) of the actual pixel is not equal to SD_R^(i−1)(x, y, z) from the previous pyramid level, then:
(a) if the pixel is near some edge marked on the segmentation map, the pixel with the max(SD_R^(i)(x, y, z), SD_R^(i−1)(x, y, z)) value is taken and the height map HM(x, y) is labeled with the (i) or (i − 1) value,
(b) else, the height map HM(x, y) is labeled as:

HM(x, y) = HM^(i−1)(x, y) + (HM^(i)(x, y) − HM^(i−1)(x, y)) / 2    (7)
Fig. 1. Image Fusion scheme using pyramid decomposition and HSL Segmentation
Step-4. Labeling remaining pixels. If an unresolved pixel belongs to a region with many other unresolved pixels, it is marked as background; otherwise the median value of the region is taken.

5. Creation of the fused image. The value of the fused image pixel f(x, y) is equal to the pixel f^(z)(x, y) from the z-th input image on the stack, where z is the value taken from the created height map HM(x, y).

The main difficulty is to obtain a height map without spikes or noise, generally smooth but with sharp edges. This is not essential from the point of view of image fusion, but it may be crucial in the three-dimensional reconstruction of the scene. Most of such peaks are generated in smooth regions, where noise in a defocused region of one image from the stack often gives greater values of SD than in the corresponding region of the sharp image. This leads to undesired deformations of the reconstructed spatial surface. For that reason, it is necessary to determine a background plane. For now, we assume that the background plane overlaps with the last image on the stack, but the plane equation may also be given by hand. The fusion process often creates halo effects near the edges of objects. This phenomenon can be observed in Fig. 3. To resolve this problem we use the segmentation maps to determine edges. After that we are able to mark pixels near edges properly, as shown in Step-2 and Step-3 of the reconstruction rules.
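The final composition step (stage 5) is a straightforward per-pixel lookup; a sketch for grey-level images, with our own type names, is shown below.

#include <cstdint>
#include <vector>

// Grey-level image stored row by row (a colour version would copy RGB triples).
struct Image {
    int width = 0, height = 0;
    std::vector<std::uint8_t> pixels;     // width * height values
};

// Stage 5 of the fusion algorithm: every output pixel is copied from the
// stack image selected by the height map.  'stack' holds the registered
// input images and 'heightMap' the per-pixel stack index.
Image composeFusedImage(const std::vector<Image>& stack,
                        const std::vector<std::uint8_t>& heightMap)
{
    Image out;
    out.width = stack.front().width;
    out.height = stack.front().height;
    out.pixels.resize(static_cast<std::size_t>(out.width) * out.height);
    for (std::size_t i = 0; i < out.pixels.size(); ++i) {
        const std::uint8_t z = heightMap[i];      // selected stack slice
        out.pixels[i] = stack[z].pixels[i];       // f(x, y) = f^(z)(x, y)
    }
    return out;
}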
4   3D Scene Creation
The spatial scene is generated on the basis of the information contained in the height map, where each pixel value represents the z coordinate of the corresponding mesh vertex. In the 3D reconstruction process we have considered two methods, i.e., the marching cubes algorithm (MC) [20,21] and simply changing the z coordinate in a 3D regular mesh. Both methods have advantages as well as disadvantages. Marching cubes gives more control over the reconstruction process, but it is also more complicated and sometimes produces overly sharp, blocky edges, while the second method is very simple and fast but always produces a regular mesh. The generated mesh is decimated and smoothed. The created surface is textured with a plane mapping of the fused image.
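For the simpler of the two variants — displacing a regular grid by the height map — a minimal sketch could look like this; the names and the zScale parameter are ours, and smoothing, decimation and texturing are left out.

#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

struct Mesh {
    std::vector<Vec3> vertices;
    std::vector<std::uint32_t> indices;   // two triangles per height-map cell
};

// One vertex per height-map pixel, with the (scaled) pixel value used as the
// z coordinate; the marching cubes variant is not shown here.
Mesh heightMapToMesh(const std::vector<float>& heightMap,
                     int width, int height, float zScale)
{
    Mesh m;
    m.vertices.reserve(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            m.vertices.push_back({static_cast<float>(x), static_cast<float>(y),
                                  zScale * heightMap[static_cast<std::size_t>(y) * width + x]});
    for (int y = 0; y + 1 < height; ++y)
        for (int x = 0; x + 1 < width; ++x) {
            const std::uint32_t i  = static_cast<std::uint32_t>(y) * width + x;
            const std::uint32_t r  = i + 1;            // right neighbour
            const std::uint32_t d  = i + width;        // neighbour in the next row
            const std::uint32_t rd = d + 1;
            m.indices.insert(m.indices.end(), {i, d, r,  r, d, rd});   // two triangles per cell
        }
    return m;
}

A plane texture mapping then simply assigns (x/width, y/height) as the texture coordinates of each vertex, so the fused image drapes directly over the displaced grid.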
Fig. 2. Segmentation maps created by using pyramid segmentation (right column) for multifocus images (left column)
Fig. 3. Example of halo effect. Part of the original image (a), the segmentation map (b), the height map created without using the segmentation map - visible halo effect (c) and edges in the height map with help of the segmentation map (d).
5   Experimental Results
The proposed method has been implemented on Linux platform in C++ language using SemiVis framework [22] and Kitware VTK library for visualisation purposes. For testing procedure we have prepared eight image stacks from macrophotography. Each stack contains six to twelve images taken with different depth-of-field, and one control image taken with the largest possible depth-of-field that we were able to receive from our testing digital camera with macro lens. In all cases the procedure is performed in the following order. At first, the registration process aligns multifocus images to each other to minimize misregistration. Then all images are segmented and the pyramid is created up to three levels. Finally, the reconstruction process combine image stack into height map and fused image. Reconstruction time strongly depends on the size of the images used in the fusion and the number of images on the stack. The most computationally expensive is the registration procedure, which consumes above fifty percent of the overal reconstruction time. The fusion process takes about 35%, and generation of three dimensional mesh takes remaining 15%. For a typical set of images, containing ten images with resolution 512x512 the whole procedure lasts about 60 seconds.
Fig. 4. Sets of multifocus images (1,2,3abc), reconstructed focus image (1,2,3e), created height map (1,2,3f), control image taken with the largest possible depth-of-field (1,2,3g)
Examples of multifocus images with height map and reconstructed fused images are shown in Fig. 4. Each fused image is compared to its control image. Mutual Information (MI) and Mean Square Difference (MSD) are useful tools in such comparison. Table 1 contains calculated similarity values for every fused image and corresponding reference image. Table 1 also contains widely used metric QAB/F that measures quality of image fusion. This measure was proposed by Xydeas and Petrovi´c in [23]. In this case, a per-pixel measure of information preservation is obtained between each input and the fused image which is aggregated into a single score QAB/F using a simple local importance assignment. This metric is based on the assumption that fusion algorithm that transfers input gradient information into result image more accuretely performs better. QAB/F is in range [0, 1] where 0 means complete loss of information and 1 means perfect fusion. From the height map and fused image we can generate 3D model of the scene. Additionally, the height map is filtered with strong median and gaussian filter
Table 1. Similarity measures (MI, MSD) between each reconstructed image and the reference image with large depth-of-field, and the fusion quality measure QAB/F

Stack   MI     MSD     QAB/F
S-1     0.82   28.48   0.84
S-2     0.67   32.11   0.73
S-3     0.72   38.43   0.79
S-4     0.88   26.03   0.85
S-5     0.82   27.74   0.80
S-6     0.64   35.81   0.72
S-7     0.69   34.30   0.72
S-8     0.71   41.65   0.69
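For completeness, the two similarity measures of Table 1 can be computed as sketched below. This is a generic Python/NumPy formulation (the histogram-based MI estimate and the number of bins are assumptions of this illustration, not the exact implementation used by the authors).

import numpy as np

def mean_square_difference(a, b):
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    return np.mean((a - b) ** 2)

def mutual_information(a, b, bins=64):
    # Joint histogram of the two images, normalized to a joint distribution.
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz]))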
Fig. 5. Result fused images and 3D models
Fig. 6. Typical image that creates failed 3D model. This photograph presents a common child’s spinning top. Reconstruction algorithms failed because of many smooth and uniform regions and a lack of background plane.
Additionally, the height map is filtered with a strong median filter and a Gaussian filter to smooth uniform regions, and after that the mesh is created. Fig. 5 shows qualitative results of our method for the eight tested image sets. The biggest problem in this 3D reconstruction is to obtain a surface which is smooth enough in uniform regions and simultaneously has sharp edges at the object boundaries. The best results are obtained when the photographs are taken perpendicularly to the background, the objects are within the scene, and they are rough, without smooth regions. Fig. 6 shows an example of a typical failure. Our method often fails when there are large smooth regions which do not belong to the background plane. The main difficulty in such cases is to distinguish between the background and an object without any external spatial knowledge of the scene.
6 Conclusions
This paper presented an approach to the problem of generating a 3D model from a set of multifocus images. We proposed the whole pipeline, from raw photographs to the final spatial model. The input multifocus images were registered together and then, using typical image filters and gradient methods, the height map was created by detecting focused regions in each of them. Based on the height map, an image with a greater depth of field was composed. Finally, further algorithms reconstructed the 3D model of the photographed scene. The presented results of 3D model generation are very promising, but as of now there are still many problems that need to be solved. Future work could include improvements in segmentation and edge detection to help in the automatic detection of the background plane. Second, more complex methods should be used to identify smooth regions of objects. We think that in both cases pattern recognition algorithms should improve the effectiveness of our method. Also, feature-based fusion methods such as [17] could generate more accurate height maps.
References
1. Constant, A.: Close-up Photography. Butterworth-Heinemann (2000)
2. Denkowski, M., Chlebiej, M., Mikolajczak, P.: Depth of field reconstruction method using partially focused image sets. Polish Journal of Environmental Studies 16(4A), 62–65 (2007)
3. Ishita, D., Bhabatosh, C., Buddhajyoti, C.: Enhancing effective depth-of-field by image fusion using mathematical morphology. Image and Vision Computing 24, 1278–1287 (2006)
4. Mukopadhyay, S., Chanda, B.: Fusion of 2D gray scale images using multiscale morphology. Pattern Recognition 34, 1939–1949 (2001)
5. Matsopoulos, G.K., Marshall, S., Brunt, J.N.M.: Multiresolution morphological fusion of MR and CT images of the human brain. IEEE Proceedings Vision, Image and Signal Processing 141(3), 137–142 (1994)
6. Eltoukhy, H., Kavusi, S.: A computationally efficient algorithm for multi-focus image reconstruction. In: Proceedings of SPIE Electronic Imaging (June 2003)
7. Studholme, C., et al.: An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognition 32(1), 71–86 (1999)
8. Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T.: Numerical Recipes in C, 2nd edn. Cambridge University Press, Cambridge (1992)
9. Gonzalez, R.C., Woods, R.E.: Digital Image Processing. Addison-Wesley, Reading (1992)
10. Burt, P.J.: The pyramid as a structure for efficient computation. In: Multiresolution Image Processing and Analysis, pp. 6–35. Springer, Berlin (1984)
11. Toet, A.: Image fusion by a ratio of low-pass pyramid. Pattern Recognition Letters 9(4), 245–253 (1989)
12. Li, H., Manjunath, H., Mitra, S.: Multisensor image fusion using the wavelet transform. Graphical Models and Image Processing 57(3), 235–245 (1995)
13. Chibani, Y., Houacine, A.: Redundant versus orthogonal wavelet decomposition for multisensor image fusion. Pattern Recognition 36, 879–887 (2003)
14. Lewis, L.J., O'Callaghan, R., Nikolov, S.G., Bull, D.R., Canagarajah, N.: Pixel- and region-based image fusion with complex wavelets. Information Fusion 8, 119–130 (2007)
15. Ajjimarangsee, P., Huntsberger, T.L.: Neural network model for fusion of visible and infrared sensor outputs. In: Sensor Fusion, Spatial Reasoning and Scene Interpretation, SPIE vol. 1003, pp. 152–160. SPIE, Bellingham, USA (1988)
16. Goshtasby, A.A.: Guest editorial: Image fusion: Advances in the state of the art. Information Fusion 8, 114–118 (2007)
17. Piella, G.: A general framework for multiresolution image fusion: from pixels to regions. Information Fusion 4, 259–280 (2003)
18. Williams, M.L., Wilson, R.C., Hancock, E.R.: Deterministic search for relational graph matching. Pattern Recognition 32, 1255–1516 (1999)
19. Bogoni, L., Hansen, M.: Pattern-selective color image fusion. Pattern Recognition 34, 1515–1526 (2001)
20. Lorensen, W.E., Cline, H.E.: Marching cubes: A high resolution 3D surface construction algorithm. Computer Graphics 21(4), 163–169 (1987)
21. Durst, M.J.: Additional reference to Marching Cubes. Computer Graphics 22(2), 72–73 (1988)
22. Denkowski, M., Chlebiej, M., Mikolajczak, P.: Development of the cross-platform framework for the medical image processing. Annales UMCS, Sectio AI Informatica III, 159–167 (2005)
23. Xydeas, C., Petrović, V.: Objective image fusion performance measure. Electronics Letters 36(4), 308–309 (2000)
A Simple Method of the TEX Surface Drawing Suitable for Teaching Materials with the Aid of CAS

Masataka Kaneko, Hajime Izumi, Kiyoshi Kitahara(1), Takayuki Abe, Kenji Fukazawa(2), Masayoshi Sekiguchi, Yuuki Tadokoro, Satoshi Yamashita, and Setsuo Takato(3)

Kisarazu National College of Technology, Japan
(1) Kogakuin University, Japan
(2) Kure National College of Technology, Japan
(3) Toho University, Japan
[email protected]
Abstract. The authors have been developing KETpic as a bundle of macro packages for Computer Algebra Systems (CASs) to draw fine TEX-pictures. Recently we have developed a new method of the surface drawing using KETpic. The equation of envelopes is used to draw ridgelines of surfaces. Also the technique of hidden line elimination is used. By these methods, we can draw 3D-graphics which are so simple that the global (i.e. sketchy) shapes of them are easily understood.
1 Introduction
We have been developing KETpic as a bundle of macro packages for CASs. It has been developed for inserting fine (accurate and expressive) figures into LATEX documents. We can readily produce a graphic output within such CASs and export the resulting figures to LATEX documents as source code rather than a graphical file itself. It is downloadable for free from [5]. Since KETpic has been equipped with various commands or accessories, its 2D-graphics have been actually used in our mathematics classroom. We can also draw space curves easily with KETpic. The 3D-graphics of KETpic are monochrome and are composed of lines.
Fig. 1. The graphic of CAS
Fig. 2. The graphic of KETpic
It takes little cost for us to make copies of the figures drawn with KETpic. Moreover, the quality of the figures is maintained when they are copied. Using these printed materials repeatedly, students can establish the mathematical concepts. Thus, they are suitable for teaching materials in the form of mass printed matter. On the other hand, the 3D-graphics of CASs are colourful and illuminated, so they are suitable for demonstration on displays. However, it is costly to maintain the quality of their graphic images when they are copied. Therefore, it is difficult for students to use them repeatedly. Currently KETpic is equipped with the ability to eliminate the parts of curves hidden by other ones [1]. We call this technique the "skeleton method". By using only one command, "skeletondata", of KETpic, we can easily give clear perspective to 3D-graphics as shown in Fig. 3. These figures were actually used to explain the definition of double integral and the concept of repeated integral in our mathematics classroom.
Fig. 3. Figures drawn with skeleton method
Though the ability of KETpic for drawing space curves is exploited as above, there were some deficits for surface drawing. In this paper, the authors introduce a new capability of KETpic for surface drawing. It meets the potential demands of the mathematics teachers who usually use CAS and TEX , and who 1. want to show students the figures of surfaces with the accurate ridgelines just as we see them. 2. want to draw such figures using the minimal number of lines so as to make them easy to see. In fact, a method to draw accurate ridgelines has been known. In that method, a point on the ridgelines is characterized by the following condition: The normal vector of the surface at that point is orthogonal to the line connecting that point with the view point. Since the equation describing this condition is simple, we can solve it by hand. However, this method is ad hoc in a sense, and it would be complicated in case of
the surface given by parameters other than x, y. The method introduced in this paper is applicable to such general cases. Though the equation of our method is a little complicated and can not be solved by hand, it can be solved within almost the same time as in the above method by using CAS. The authors also introduce the method to eliminate the parts of ridgelines hidden by the parts of the surface on this side of the viewer. In the next Section 2, we introduce mathematical background of our method. In Section 3, we explain the procedure to draw surfaces using our method with an example.
2 Mathematical Background

2.1 Setting of the Problem
Suppose z = f(x, y) is a smooth function of two variables defined on a domain D in R². When we draw a graph of f on a plane, we actually draw its image under the projection p onto the plane (the field of our view). The projection depends on the viewpoint and the focus point, and generally has the form of a nonlinear transformation. The 3D-graphics of CAS are also drawn by using the automatic calculation of this projection p. Therefore, it would be practically impossible for us to draw a precise figure of the graph of f without using CAS or other graphics software. In the above setting, the composition p ∘ f : R² → R² becomes a smooth map. For exactness, we give the coordinates (x, y) to the former R² (containing the domain D) and the coordinates (X, Y) to the latter R² (containing the image (p ∘ f)(D)). For example, we consider the graph of the function

    z = 2 − x² − y²   (−1 ≤ x ≤ 1, −1 ≤ y ≤ 1).
It is drawn with CAS, and is inserted into the LATEX document by using KETpic. Remark that Fig. 4 is drawn on the XY-plane.
Fig. 4. Edge lines and ridgelines
As shown in Fig. 4, the boundary of (p ∘ f)(D) in the XY-plane is composed of two kinds of parts. One part is the set of "edge lines", the part of (p ∘ f)(∂D). The other part is the set of "ridgelines". Though this surface is obtained by the rotation of a parabola, the ridgeline is not a parabola. As is easily seen, it is nothing but the envelope of the image of some family of curves on the surface. We can draw edge lines easily by using CAS. On the other hand, it is not so easy to draw ridgelines.

2.2 Main Result
In this subsection, we show a method to draw ridgelines by using CAS.
Fig. 5. Setting of the main result
We represent p ∘ f by its components as follows: (p ∘ f)(x, y) = (F1(x, y), F2(x, y)). Then our main result is the following proposition.

Proposition. If the point (F1(x, y), F2(x, y)) is located on the ridgelines of the graph of f, then the following equality holds at (x, y):

    (∂F1/∂x)(∂F2/∂y) − (∂F1/∂y)(∂F2/∂x) = 0.
Proof. Firstly, remark that, to deduce the equation of the envelope, it is sufficient to consider any family of curves on the image of p ∘ f whose union covers the image of p ∘ f. Hence, we can deduce the equation by using both of the following families of curves on the image:

    X = F1(α, y), Y = F2(α, y)      and      X = F1(x, β), Y = F2(x, β).

Here α and β indicate the parameters of the families.
Fig. 6. Idea of proof
In Fig. 6 the dotted line is a member of the α-family, and the dashed line is a member of the β-family. Remark that these two lines are tangent to each other at one point of the envelope. By the formula for the differentials of functions with parametric representations, the slopes of the tangent lines to the above two families are equal to

    (∂F2/∂x) / (∂F1/∂x)   and   (∂F2/∂y) / (∂F1/∂y),

respectively. Clearly, at each point of the envelope, a curve belonging to the α-family is tangent to a curve belonging to the β-family. Hence, at that point, the equality

    (∂F2/∂x) / (∂F1/∂x) = (∂F2/∂y) / (∂F1/∂y)

holds. This implies the claim of our proposition. (Q.E.D.)

2.3 The Meaning of the Main Result
As is easily seen, the quantity

    (∂F1/∂x)(∂F2/∂y) − (∂F1/∂y)(∂F2/∂x)

is nothing but the Jacobian of the map p ∘ f : R² → R². Therefore, our proposition means that the envelope is given as the singular value set of the map p ∘ f. In singular point theory, it is well known that the singular point set is given as the set of points where the Jacobian of the map vanishes. In our situation,
the envelope corresponds to the singular value set of p (defined on the image of f ). Since f is a diffeomorphism onto its image, the critical value set of p ◦ f is the same as that of p (by the chain rule of Jacobian). From the viewpoint of singular point theory, it seems to be meaningless to consider p ◦ f instead of p. Actually there is almost no opportunity of using p ◦ f because it is very difficult to give the explicit representation of it by hand. However, p ◦ f becomes very important when we draw figures by using CAS. This is because CAS can calculate p ◦ f and its differential. Furthermore, CAS can automatically solve the equation in the proposition. Based on our main result, we can draw simple figures of surfaces with the minimal number of lines. The simpleness of the figures has not only aesthetic value but also mathematical meaning. In fact, it is an important idea of Morse theory [2] that the topological information of a manifold is contained in the structure of some critical submanifolds for a Morse function defined on the manifold.
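As an illustration of how a CAS can produce the equation in the proposition automatically, the following sketch uses Python's sympy (an assumption of this illustration; the authors work inside a CAS) with a simple perspective projection whose viewpoint coordinates are hypothetical parameters, not values from the paper.

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = 2 - x**2 - y**2                      # the example surface z = f(x, y)

# A simple perspective projection onto the plane x = 0,
# seen from a hypothetical viewpoint E = (ex, ey, ez).
ex, ey, ez = 5, 4, 3
t = ex / (ex - x)                        # ray parameter from E through (x, y, f)
F1 = ey + t * (y - ey)                   # first coordinate of the projected point
F2 = ez + t * (f - ez)                   # second coordinate of the projected point

# Jacobian of p o f; its zero set is the preimage of the ridgelines.
J = sp.simplify(sp.diff(F1, x) * sp.diff(F2, y) - sp.diff(F1, y) * sp.diff(F2, x))
print(J)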
3 The Procedure to Draw Surfaces by Using KETpic

In this section, we explain how to draw surfaces by using the method given in the previous section. Here we draw the graph of the following function as an example:

    f(x, y) = 3 (1 − (2x² + y²) e^{−(x²+y²)}).

3.1 Calculate the Critical Point Set
Firstly, we calculate the critical point set of p ∘ f, which is a set of connected curves in the xy-plane. By fixing the focus point and the view point, a command such as "implicitplot" of the CAS enables us to find the points which satisfy the equation in the previous section. Connecting the neighbouring points by segments, we obtain a picture like Fig. 7. Remark that the number of connected components depends on the choice of the focus point and the view point.
Fig. 7. Critical point set
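The same step can also be carried out outside a CAS by sampling the Jacobian of p ∘ f on a grid and extracting its zero level set. The sketch below is only an illustration of the idea in Python/NumPy; the perspective projection and viewpoint are the same hypothetical choices as in the earlier sketch, and matplotlib's contour extraction stands in for "implicitplot".

import numpy as np
import matplotlib.pyplot as plt

ex, ey, ez = 5, 4, 3                      # hypothetical viewpoint

def f(x, y):
    return 3 * (1 - (2 * x**2 + y**2) * np.exp(-(x**2 + y**2)))

def jacobian_pf(x, y, h=1e-5):
    # Project (x, y, f(x, y)) from the viewpoint onto the plane x = 0.
    def proj(x, y):
        t = ex / (ex - x)
        return ey + t * (y - ey), ez + t * (f(x, y) - ez)
    F1x = (proj(x + h, y)[0] - proj(x - h, y)[0]) / (2 * h)
    F1y = (proj(x, y + h)[0] - proj(x, y - h)[0]) / (2 * h)
    F2x = (proj(x + h, y)[1] - proj(x - h, y)[1]) / (2 * h)
    F2y = (proj(x, y + h)[1] - proj(x, y - h)[1]) / (2 * h)
    return F1x * F2y - F1y * F2x

xs = np.linspace(-2, 2, 400)
X, Y = np.meshgrid(xs, xs)
plt.contour(X, Y, jacobian_pf(X, Y), levels=[0.0])   # the critical point set
plt.show()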
3.2 Calculate the Critical Value Set
Secondly, we draw the graph of f restricted to the above critical point set. It is drawn in the XY -plane as in Fig. 8. The ridgelines which we want to draw are composed of some parts of this critical value set.
Fig. 8. Critical value set
There are some cusps in the critical value set. In Fig. 9, these points are specified clearly by using bold points. Topologically, the graph of f is composed of three disks centered at extremal points of f . The cusps correspond to the points where the line segments bounding three disks are attached together. We judge, by rather unrefined method, whether a point in the critical value set is a cusp or not. The judgement is accomplished by calculating the angle between the two vectors from that point to the neighbouring points.
Fig. 9. Cusps
Though there are some parts in the critical value set which are invisible from the view point as shown in Fig. 11, they become visible when other parts on this side are eliminated.
3.3 Add Edges of the Graph and Eliminate Hidden Parts
Thirdly, we add the edge lines of the graph to the critical value set. Then we obtain Fig. 10.
Fig. 10. Add edge lines
Among these curves there are some parts which are drawn in this picture but actually cannot be seen from the viewpoint. So we must eliminate such hidden parts to draw the picture we want (as in Fig. 11). As seen in Fig. 10, ridgelines and edge lines are separated into several parts by their own intersections or cusps. Remark that whether a point on these curves is visible or not is determined by the separated part to which the point belongs. We judge whether a part is visible or not by the following procedure (a sketch of this test in code is given after Fig. 11):
1. Pick an arbitrary point (x0, y0, f(x0, y0)) from that part.
2. Take some points on the line connecting (x0, y0, f(x0, y0)) with the view point by the method of bisection, and find the sign of {z − f(x, y)} for each of these points.
3. If the signs in the second step are constant, then the part containing (x0, y0, f(x0, y0)) is visible. Otherwise, the part is invisible.
Thus, through this process of "hidden line elimination", we obtain Fig. 11, which we aimed at.
Fig. 11. Eliminate hidden lines
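The visibility test above can be sketched as follows. This is an illustrative Python version only: the viewpoint, the surface f and the sample count are assumptions, and uniform sampling along the segment stands in for the bisection used in the paper.

import numpy as np

E = np.array([5.0, 4.0, 3.0])            # hypothetical viewpoint

def f(x, y):
    return 3 * (1 - (2 * x**2 + y**2) * np.exp(-(x**2 + y**2)))

def part_is_visible(x0, y0, samples=64):
    """Return True if the point (x0, y0, f(x0, y0)) -- and hence the
    separated part it represents -- is visible from the viewpoint E."""
    P = np.array([x0, y0, f(x0, y0)])
    signs = []
    for t in np.linspace(0.0, 1.0, samples)[1:-1]:   # strictly between P and E
        q = P + t * (E - P)
        signs.append(np.sign(q[2] - f(q[0], q[1])))
    signs = [s for s in signs if s != 0]
    # A constant sign along the segment means no part of the surface blocks the view.
    return all(s == signs[0] for s in signs) if signs else True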
Even though this figure is precise with the aid of CAS, there remains only minimal information about the shape of the graph of f . This is because the
minimal number of lines are used. Therefore, this figure enables us to grasp the global shape easily.

3.4 Draw Wire Frames
When we attach importance to the preciseness of the figure, we can also use wire frames widely used in 3D-graphics of CAS. Our graphics are more beautiful than those of CAS because the number of wires is small and because wires are smooth curves. In the next Fig. 12, five wires are drawn in both x and y directions. Remark that the technique of hidden line elimination can be also applied to wires.
Fig. 12. Add wire frames
Compared with Fig. 11, Fig. 12 contains additional information about the local shape of the graph. In fact, the shape of the wire frames indicates the slope and curvature of the graph. In case of 3D-graphics of CAS, colours and half tone shadings indicate the local shape. Since it takes much cost for us to use colours and half tone shadings in teaching materials in the form of mass printings, the surface drawing of KETpic is more suitable for such practical use.
Fig. 13. Finished figure
Conversely, the existence of the wire frames makes it a little harder for us to grasp the global shape. This is because we must pick out the curves which are also contained in Fig. 11 from those in Fig. 12 to grasp the sketchy shape. To make up for this deficit, in Fig. 13 we draw a clear distinction between the curves contained in Fig. 11 and the wire frames by utilizing the ability of KETpic to draw thick and thin lines. This distinction makes the figure easier to understand. Thus, KETpic can offer 3D-graphics which are not only accurate (i.e. differential geometric) but also easy to understand (i.e. topological).
4 Conclusion and Future Works
The ability of surface drawing with minimal number of essential curves has great meaning. This is because we can easily understand the topological structure. As illustrated in the figures of this paper, we can draw accurate and simple 3Dgraphics in TEX documents with KETpic. Moreover, KETpic enables us to hand the printed materials with those figures to students.
Fig. 14. The case of polar coordinate
The method which we have introduced in this paper can be applied to the case where the surface is given by parameters other than x, y as follows:

    x = x(u, v),   y = y(u, v),   z = z(u, v).
Here we assume that the differential of the map (u, v) → (x, y, z) is injective. The authors think that our method will be applicable also to the case of several surfaces. In the near future, we should develop a new KETpic functionality as above.
References 1. Kaneko, M., Abe, T., Sekiguchi, M., Tadokoro, Y., Fukazawa, K., Yamashita, S., Takato, S.: CAS-aided Visualization in LATEX documents for Mathematical Education. Teaching Mathematics and Computer Science (to appear) 2. Milnor, J.: Morse theory. Princeton Univ. Press, New Jersey (1963)
3. Sekiguchi, M., Yamashita, S., Takato, S.: Development of a Maple Macro Package Suitable for Drawing Fine TEX-Pictures. In: Iglesias, A., Takayama, N. (eds.) ICMS 2006. LNCS, vol. 4151, pp. 24–34. Springer, Heidelberg (2006) 4. Sekiguchi, M., Kaneko, M., Tadokoro, Y., Yamashita, S., Takato, S.: A New Application of CAS to LATEX-Plottings. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4488, pp. 178–185. Springer, Heidelberg (2007) 5. Sekiguchi, M., http://www.kisarazu.ac.jp/∼ masa/math/
Family of Energy Conserving Glossy Reflection Models

Michal Radziszewski and Witold Alda

AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Krakow, Poland
[email protected],
[email protected]
Abstract. We present an improved reflection model optimized for global illumination. The model produces visually plausible images, is symmetric, and has improved energy preserving capabilities compared to previous approaches which satisfy these requirements. Having an efficient sampling routine, the model is ready to use in Monte Carlo rendering. The presented model is phenomenological, i.e. it has an intuitive glossiness parameter that affects its appearance. Moreover, it can be used as a set of basis functions designed to fit material reflection to measured data. Keywords: Reflection Functions, BRDF, Global Illumination, Ray Tracing.
1 Introduction
Modeling the reflection properties of surfaces is very important for rendering. Traditionally, in global illumination the fraction of light which is reflected from a surface is described by a BRDF (Bidirectional Reflection Distribution Function) abstraction. This function is defined over all scene surface points, as well as two light directions – incident and outgoing. As the name suggests, to conform to the laws of physics, all BRDFs must be symmetric, i.e. swapping the incident and outgoing directions must not change the BRDF value. Moreover, the function must be energy preserving – it cannot reflect more light than it receives. To achieve the best results of rendering with global illumination, the energy preservation of a BRDF should satisfy more strict requirements. It is desirable that the basic BRDF model reflects exactly all the light that arrives at a surface. The actual value of reflection is then modeled by a texture. If a BRDF is unable to reflect all incident light, even a white texture appears to absorb some part of it. In local illumination algorithms this can be corrected a bit by making the reflection value more than unit, but in global illumination such a trick can have fatal consequences due to multiple light scattering. Our model is strictly energy preserving, while it still maintains other desirable properties. This paper concentrates on phenomenological glossy BRDFs only. It does not account for the wave model of light scattering, neither does it touch the concept of the BSSRDF. The latter is defined on two directions and two surface points instead of one, which is used to simulate subsurface scattering, during which light is reflected from the surface at a point slightly different than the illuminated one.
Paper organization. In the subsequent section there is a brief description of former research related to the BRDF concept. Next, requirements, which should be satisfied by a plausible reflection model are presented. Then there is explained the derivation of our reflection function, followed by comparison of our results with previous ones. Finally, we present a summary, which describes what was achieved during our research and what is left for future development.
2 Related Work
The first well-known attempt to create glossy reflection is the Phong model [1]. This model is, however, neither symmetric nor energy conserving. An improved version of it was created by Neumann et al. [2,3]. Lafortune et al. [4] used a combination of generalized Phong reflection functions to adjust the scattering model to measured data. There are popular reflection models based on microfacets. Blinn [5] and Cook et al. [6] assumed that scattering from each individual microfacet is specular, while Oren and Nayar [7] used diffuse reflection instead. A lot of work was dedicated to anisotropic scattering models. The first well-known approach is Kajiya's [8], which uses a physical model of surface reflection. Ward [9] presented a new technique of modeling anisotropic reflection, together with a method to measure real-world material reflectances. Walter's technical report [10] describes how to efficiently implement Ward's model in a Monte Carlo renderer. Ashikhmin and Shirley [11] showed how to modify the Phong reflection model to support anisotropy. Some approaches are based on physical laws. He et al. [12] developed a model that supports well many different types of surface reflection. Stam [13] used wave optics to accurately model diffraction of light. Westin et al. [14] used a different approach to obtain this goal. They employed Monte Carlo simulation of the scattering of light from surface microgeometry to obtain coefficients to be fitted into their BRDF representation. On the other hand, Schlick's model [15] is purely phenomenological. It accounts for diffuse and glossy reflection, in isotropic and anisotropic versions, through a small set of intuitive parameters. Pellacini et al. [16] used a physically based model of reflection and modified its parameters in a way which makes them perceptually meaningful. A novel approach of Edwards et al. [17] is designed to preserve all energy while scattering, however at the cost of a non-symmetric scattering function. A different approach was taken by Neumann et al. [18]. They modified the Phong model to increase its reflectivity at grazing angles as much as possible while still satisfying energy conservation and symmetry as well. Some general knowledge on light reflection models can be found in Lawrence's thesis [19]. More information on this topic is in the Siggraph Course [20] and in Westin's et al. technical report [21]. Westin et al. [22] also provided a detailed comparison of different BRDF models. Stark et al. [23] showed that many BRDFs can be expressed in a more convenient space of dimension less than 4D (two directional vectors).
Shirley et al. [24] described some general issues which are encountered when reflection models are created.
3 Properties of Reflection Functions
In order to create visually plausible images, all reflection functions should satisfy some well defined basic requirements.

Energy conservation. In global illumination it is not enough to ensure that no surface scatters more light than it receives. It is desirable to have a function which scatters exactly all light. We are aware of only the work of Edwards et al. [17] which satisfies this requirement, but at the high price of a lack of symmetry. Neumann et al. [18] improved the reflectivity of the Phong model, but the energy preservation still is not ideal.

Symmetry. The symmetry of a BRDF is very important when bidirectional methods (which trace rays from the viewer and from the light as well) are used. When a BRDF is not symmetric, appropriate corrections similar to those described in [25] must be made in order to get proper rendering results.

Everywhere positive. If a reflection function happens to be equal to zero on part of its domain, the respective surface may potentially render to black, no matter how strong the illumination is. Having a 'blackbody' in the scene is a severe artifact, which is typically mitigated by a complex BRDF with an additional additive diffuse component. However, this option produces a dull matte color and is not visually plausible.

Everywhere smooth. The human eye is particularly sensitive to discontinuities of the first derivative of illumination, especially on smooth, curved surfaces. This artifact occurs in any BRDF which uses functions such as min or max. In particular, many microfacet based models use the so-called geometric attenuation factor with a min function, and look unpleasant at low glossiness values.

Limit #1 – diffuse. It is very helpful in modeling if a glossy BRDF can be made 'just a bit' more glossy than a matte surface. That is, a good quality reflection model should be arbitrarily close to matte reflection when glossiness is near zero. Surprisingly, few BRDF models satisfy this useful and easy to achieve property.

Limit #2 – specular. Similarly, it is convenient if a glossy BRDF becomes close to ideal specular reflection when glossiness approaches infinity. Unfortunately, this property is in part much more difficult to achieve than Limit #1. First, all glossy BRDFs are able to scatter light in the near ideal reflection direction, which is correct. Second, energy preservation typically is not satisfied.

Ease of sampling. Having a probability distribution proportional (or almost proportional) to the BRDF value, which can be integrated and then inverted analytically, allows efficient BRDF sampling in Monte Carlo rendering. This feature is roughly satisfied in the majority of popular BRDF models.

Our work is an attempt to create a BRDF which satisfies all these conditions together.
4 Derivation of Reflection Function
In this section a detailed derivation of the new reflection model is presented. Since this model is purely phenomenological, all mathematical functions used in it are selected just because of the desirable properties they have. This particular choice has no physical basis and, of course, is not unique. Throughout the rest of this section the notation presented in Table 1 is used. By convention,
Table 1. Notation used in BRDF derivation

Symbol   Meaning
fr       Reflection function (BRDF)
R        Reflectivity of BRDF
ωi       Direction of incident light
ωo       Direction of outgoing light
ωr       Ideal reflection direction of outgoing light
N        Surface normal
u, v     Arbitrary orthogonal tangent directions
θi       Angle between ωi and N
θr       Angle between ωr and N
φi       Angle between ωi and u
φr       Angle between ωr and u
Ω        Hemisphere above surface, BRDF domain
all direction vectors are in Ω, i.e. the cosine of the angle between any of them and N is non-negative. Moreover, these vectors are of unit length.

4.1 Symmetry and Energy Conservation
Symmetry requires that the incident and outgoing directions can be swapped without modification of the BRDF value:

    fr(ωi, ωo) = fr(ωo, ωi)   for all ωi, ωo ∈ Ω.                                   (1)

Energy conservation requires that the reflectivity of the BRDF must not be greater than one, and it is desirable for it to be equal to one:

    R(ωo) = ∫_Ω fr(ωi, ωo) cos(θi) dωi ≤ 1.                                         (2)

The reflectivity can be expressed in a different domain. The following expression is used through the rest of this section:

    R(θo, φo) = ∫_0^{2π} ∫_0^{π/2} fr(θi, φi, θo, φo) cos(θi) sin(θi) dθi dφi.      (3)
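As a side note, the reflectivity (2)–(3) of any candidate BRDF can be checked numerically with a simple Monte Carlo estimate over the hemisphere. The sketch below is a generic Python illustration; the cosine-weighted sampling scheme is a standard choice made for this example, not something prescribed by the paper. For an energy preserving model the estimate should be close to one for every outgoing direction.

import math, random

def estimate_reflectivity(brdf, theta_o, phi_o, samples=100000):
    """Monte Carlo estimate of R = integral of f_r * cos(theta_i) d(omega_i),
    using cosine-weighted hemisphere sampling (pdf = cos(theta_i) / pi)."""
    total = 0.0
    for _ in range(samples):
        u1, u2 = random.random(), random.random()
        theta_i = math.acos(math.sqrt(u1))        # cosine-weighted polar angle
        phi_i = 2.0 * math.pi * u2
        # cos(theta_i) in the integrand cancels with the pdf up to the factor pi.
        total += brdf(theta_i, phi_i, theta_o, phi_o) * math.pi
    return total / samples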
It is very useful if the reflection function fr can be separated into a product:

    fr(θi, φi, θo, φo) = fθ(θi, θo) fφ(θi, φi, θo, φo),                             (4)

where fθ is the latitudal reflection and fφ is the longitudal reflection. If fφ integrates to unit regardless of θi and θo, this separation significantly simplifies reflectivity evaluation, which now can be re-expressed as:

    R(θo, φo) = ∫_0^{π/2} ( ∫_0^{2π} fφ(θi, φi, θo, φo) dφi ) fθ(θi, θo) cos(θi) sin(θi) dθi,   (5)

and energy conservation as:

    ∫_0^{2π} fφ(θi, φi, θo, φo) dφi ≤ 1   and   ∫_0^{π/2} fθ(θi, θo) cos(θi) sin(θi) dθi ≤ 1.   (6)

Due to this feature, the latitudal and longitudal reflection functions can be treated separately.

4.2 Latitudal Reflection Function
The domain of the latitudal function is very inconvenient due to the sine and cosine factors in the integrand:

    Rθ(θo) = ∫_0^{π/2} fθ(θi, θo) cos(θi) sin(θi) dθi.                              (7)
However, substituting x = cos²(θi), y = cos²(θo) and dx = −2 sin(θi) cos(θi) dθi leads to a much simpler expression for the reflectivity:

    Ry(y) = 0.5 ∫_0^1 fθ(x, y) dx.                                                  (8)
Despite being much simpler, this space is still not well suited for developing the reflection function, mainly because of the necessity of symbolic integration. Using the final transformation we obtain:

    Fθ(x, y) = ∫_0^y ∫_0^x fθ(s, t) ds dt   and   fθ(x, y) = ∂²Fθ(x, y) / ∂x∂y.     (9)
Designing a function Fθ is much easier than designing fθ. The requirements that Fθ must satisfy are the following:

    ∀x, y:       Fθ(x, y) = Fθ(y, x)                                                (10)
    ∀x:          Fθ(x, 1) = x                                                       (11)
    ∀x1 ≤ x2:    Fθ(x1, y) ≤ Fθ(x2, y)                                              (12)
The requirement (11) can be relaxed a bit. If it is not satisfied, it is enough if Fθ(1, 1) = 1 and Fθ(0, 1) = 0 are satisfied instead. In the latter case, applying

    x′ = Fθ^{−1}(x, 1)   and   y′ = Fθ^{−1}(1, y)                                    (13)

guarantees that Fθ(x′, y′) satisfies the original requirements (10)–(12). A matte BRDF in this space is expressed as Fθ = xy. We have found that the following (unnormalized) function is a plausible initial choice for latitudal glossy reflection:

    fθ(x, y) = sech²(n(x − y)).                                                      (14)

Transforming this equation into the Fθ space leads to:

    Fθ(x, y) = [ln cosh(nx) + ln cosh(ny) − ln cosh(n(x − y))] / (2 ln cosh n).      (15)

This function satisfies only the relaxed requirements, so it is necessary to substitute

    (1/n) artanh( (1 − e^{−2 ln(cosh n) x}) / tanh n )                               (16)

for x, and an analogous expression for y. After the substitution and transformation to the fθ space we obtain:

    fθ(x, y) = m tanh²(n) e^{−m(x+y)} / [ tanh²(n) − (1 − e^{−mx})(1 − e^{−my}) ]²,  (17)

where m = 2 ln cosh n. Finally, one should substitute x = cos²(θi) and y = cos²(θr). Considering how complex the final expression is, it is clear why it is difficult to guess the form of a plausible reflection function, and how useful these auxiliary spaces are.

4.3 Longitudal Reflection Function
The longitudal reflection function should be a function of cos(φi − φr). It has to integrate to unit over [−π, π], so it is reasonable to choose a function that can be integrated analytically:

    fφ(φi, φr) = Cn / [n(1 − cos(φi − φr)) + 1]^6,   with   Cn = (2n + 1)^5.5 / (2π P5(n)),   (18)

where P5(n) = 7.875n^5 + 21.875n^4 + 25n^3 + 15n^2 + 5n + 1. When n = 0, the function becomes constant. When n increases, the function is largest when φi = φr. In the limit, when n approaches infinity, the function converges to δ(φi − φr). There is still one issue – whenever either ωi or ωr is almost parallel to N, φi or φr is poorly defined. In fact, in these cases, the function should progressively become constant. The simple substitution n′ = n sin θi sin θr works fine.
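Since Cn in (18) is chosen so that fφ integrates to one over φi − φr ∈ [−π, π], the normalization is easy to inspect numerically. The short Python check below is only an illustration; the midpoint quadrature and step count are arbitrary choices.

import math

def f_phi(delta_phi, n):
    """Longitudal factor (18) as a function of delta_phi = phi_i - phi_r."""
    p5 = 7.875 * n**5 + 21.875 * n**4 + 25 * n**3 + 15 * n**2 + 5 * n + 1
    c_n = (2 * n + 1)**5.5 / (2 * math.pi * p5)
    return c_n / (n * (1 - math.cos(delta_phi)) + 1)**6

def check_normalization(n, steps=200000):
    h = 2 * math.pi / steps
    total = sum(f_phi(-math.pi + (k + 0.5) * h, n) for k in range(steps)) * h
    return total          # should be close to 1 according to (18)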
4.4 Reflection Model
Combining the latitudal and longitudal scattering functions leads to the final BRDF:

    fr(θi, φi, θo, φo) = [ (2 nφ sin θi sin θr + 1)^5.5 / ( 2π P5(nφ) (nφ sin θi sin θr (1 − cos(φi − φr)) + 1)^6 ) ]
                       · [ mθ tanh²(nθ) e^{−mθ(cos²θi + cos²θr)} / ( tanh²(nθ) − (1 − e^{−mθ cos²θi})(1 − e^{−mθ cos²θr}) )² ].   (19)
The parameters nθ and nφ do not have to satisfy nθ = nφ = n. Using various functions of the form nθ = f1 (n) and nφ = f2 (n) leads to a variety of different glossy scattering models. The reflection angles (θr and φr ) may be computed from outgoing angles θo and φo in a few ways, e.g. with ideal reflection, ideal refraction or backward scattering, leading to variety of useful BRDFs. The reflection model is strictly energy preserving, so cosine weighted BRDF forms probability density functions (pdf ) to sample θi and φi from. Obviously, both pdf s are integrable analytically, which is very helpful.
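For reference, the model (19) can be evaluated directly as sketched below in Python; the angle conventions and the independent choice of nθ and nφ are taken from the text, while the function names are assumptions of this illustration.

import math

def p5(n):
    return 7.875 * n**5 + 21.875 * n**4 + 25 * n**3 + 15 * n**2 + 5 * n + 1

def brdf(theta_i, phi_i, theta_r, phi_r, n_theta, n_phi):
    """Evaluate (19); (theta_r, phi_r) are the reflection angles derived
    from the outgoing direction, e.g. by ideal reflection."""
    # Longitudal part, cf. (18), with n' = n_phi * sin(theta_i) * sin(theta_r).
    s = n_phi * math.sin(theta_i) * math.sin(theta_r)
    f_phi = (2 * s + 1)**5.5 / (
        2 * math.pi * p5(n_phi) * (s * (1 - math.cos(phi_i - phi_r)) + 1)**6)
    # Latitudal part, cf. (17), with x = cos^2(theta_i), y = cos^2(theta_r).
    # Note: the paper warns about numerical instabilities for large n_theta.
    m = 2 * math.log(math.cosh(n_theta))
    t2 = math.tanh(n_theta)**2
    x, y = math.cos(theta_i)**2, math.cos(theta_r)**2
    f_theta = (m * t2 * math.exp(-m * (x + y)) /
               (t2 - (1 - math.exp(-m * x)) * (1 - math.exp(-m * y)))**2)
    return f_phi * f_theta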
5 Results
The following results are generated using a white sphere and a complex dragon model illuminated by a point light source. The proportions of latitudal and longitudal gloss are nθ = n and nφ = 0.75 n√n sin θi sin θr. Fig. 1 examines how selected scattering models cope with little glossiness. Phong-based models expose
Fig. 1. Comparison of different glossy BRDFs with gloss ’just a bit’ more than matte. From left: diffuse reference, reciprocal Phong, max-Phong, microfacet.
Fig. 2. Latitudal scattering only. From left: glossiness 'just a bit' more than matte, medium glossiness, large glossiness, similarity between θi and θr.
Fig. 3. Longitudal scattering only with varying glossiness
Fig. 4. Product of latitudal and longitudal scattering with increasing glossiness
Fig. 5. Scattering with perpendicular (left) and grazing (right) illumination
Fig. 6. Complex dragon model rendered with glossiness n = 2 (left) and n = 4 (right)
a zero reflectivity in certain directions, while the max-Phong and microfacet models have shading discontinuities. Neither of these models is fully energy conserving. Fig. 2 shows the latitudal component of our reflection model. The scattering is increased at grazing angles to achieve energy conservation. Similarly, Fig. 3 presents longitudal scattering only. In Figs. 4, 5 and 6 our BRDF model, defined as a product of latitudal and longitudal scattering, is presented. Fig. 4 shows how the BRDF behaves when glossiness is increased, while Fig. 5 changes the illumination angle using the same glossiness. The BRDF exhibits some anisotropy at nonperpendicular illumination, which is not significant for complex models (Fig. 6).
6 Conclusions
We have presented a novel approach to creating BRDFs, for which we have designed an energy preserving and symmetric reflection function. Energy conservation allows improved rendering results. For example, when a model rendered with our function is placed into an environment with uniform illumination, it vanishes. On the other hand, the majority of other models lose some energy, especially at grazing angles. For example, Phong reflection tends to absorb much light, producing dark borders around objects which are impossible to control. The microfacet model, which conserves energy far better than the Phong model, still is not perfect. In general, the properties of our reflection model are different from the typically used Phong or microfacet models. It conserves energy, and for low glossiness values our model produces neither visual defects like illumination discontinuities, caused by the microfacet model, nor black patches, caused by the Phong model. Moreover, it smoothly becomes matte when glossiness approaches zero. On the other hand, our model produces anisotropy when grazing illumination is encountered, i.e. when the viewer direction is rotated around the ideal reflection direction, the reflection value changes in a way that is difficult to predict. The Phong model does not produce such anisotropy at all, and in the microfacet model it is less distracting. Nevertheless, when the geometry of the illuminated figure is complex, the anisotropy is no longer visible, and at grazing angles the microfacet model loses numerical stability, especially when ωi ≈ −ωr. Moreover, a simple implementation of the latitudal reflection formula causes numerical instabilities when high glossiness is used, i.e. when n exceeds about 25. Currently, our reflection model is best suited for complex geometrical shapes and not extremely high glossiness. In these conditions it produces better results than the simple formulae of the Phong or microfacet models. Minimizing the impact of anisotropy and numerical instability at high glossiness requires further research in this area. AGH Grant no. 11.11.120.777 is acknowledged.
References 1. Phong, B.T.: Illumination for computer generated pictures. Communications of the ACM 18(6), 311–317 (1975) 2. Neumann, L., Neumann, A., Szirmay-Kalos, L.: Compact metallic reflectance models. Computer Graphics Forum 18(3), 161–172 (1999)
3. Neumann, L., Neumann, A., Szirmay-Kalos, L.: Reflectance models with fast importance sampling. Computer Graphics Forum 18(4), 249–265 (1999) 4. Lafortune, E.P., Foo, S.C., Torrance, K.E., Greenberg, D.P.: Non-linear approximation of reflectance functions. In: SIGGRAPH 1997 Proceedings, pp. 117–126 (1997) 5. Blinn, J.F.: Models of light reflection for computer synthesized pictures. In: SIGGRAPH 1977 Proceedings, pp. 192–198. ACM, New York (1977) 6. Cook, R.L., Torrance, K.E.: A reflectance model for computer graphics. ACM Transactions on Graphics 1(1), 7–24 (1982) 7. Oren, M., Nayar, S.K.: Generalization of lambert’s reflectance model. In: SIGGRAPH 1994 Proceedings, pp. 239–246. ACM, New York (1994) 8. Kajiya, J.T.: Anisotropic reflection models. In: SIGGRAPH 1985 Proceedings, pp. 15–21. ACM, New York (1985) 9. Ward, G.J.: Measuring and modeling anisotropic reflection. In: SIGGRAPH 1992 Proceedings, pp. 265–272. ACM, New York (1992) 10. Walter, B.: Notes on the ward brdf. Technical Report PCG-05-06, Cornell University (April 2005) 11. Ashikhmin, M., Shirley, P.: An anisotropic phong brdf model. Journal of Graphics Tools 5(2), 25–32 (2000) 12. He, X.D., Torrance, K.E., Sillion, F.X., Greenberg, D.P.: A comprehensive physical model for light reflection. In: SIGGRAPH 1991 Proceedings, pp. 175–186 (1991) 13. Stam, J.: Diffraction shaders. In: SIGGRAPH 1999 Proceedings, pp. 101–110. ACM Press/Addison-Wesley Publishing Co, New York (1999) 14. Westin, S.H., Arvo, J.R., Torrance, K.E.: Predicting reflectance functions from complex surfaces. In: SIGGRAPH 1992 Proceedings, pp. 255–264 (1992) 15. Schlick, C.: A customizable reflectance model for everyday rendering. In: Fourth Eurographics Workshop on Rendering. Number Series EG 93 RW, pp. 73–84 (1993) 16. Pellacini, F., Ferwerda, J.A., Greenberg, D.P.: Toward a psychophysically-based light reflection model for image synthesis. In: SIGGRAPH 2000 Proceedings, pp. 55–64. ACM Press/Addison-Wesley Publishing Co, New York (2000) 17. Edwards, D., Boulos, S., Johnson, J., Shirley, P., Ashikhmin, M., Stark, M., Wyman, C.: The halfway vector disk for brdf modeling. ACM Transactions on Graphics 25(1), 1–18 (2006) 18. Neumann, L., Neumann, A., Szirmay-Kalos, L.: Reflectance models by pumping up the albedo function. Machine Graphics and Vision (1999) 19. Lawrence, J.: Acquisition and Representation of Material Appearance for Editing and Rendering. PhD thesis, Princeton University, Princeton, NJ, USA (2006) 20. Ashikhmin, M., Shirley, P., Marschner, S., Stam, J.: State of the art in modeling and measuring of surface reflection. In: SIGGRAPH 2001 Course #10 (2001) 21. Westin, S.H., Li, H., Torrance, K.E.: A field guide to brdf models. Technical Report PCG-04-01, Cornell University (January 2004) 22. Westin, S.H., Li, H., Torrance, K.E.: A comparison of four brdf models. Technical Report PCG-04-02, Cornell University (April 2004) 23. Stark, M.M., Arvo, J., Smits, B.: Barycentric parameterizations for isotropic brdfs. IEEE Transactions on Visualization and Computer Graphics 11(2), 126–138 (2005) 24. Shirley, P., Smits, B., Hu, H., Lafortune, E.: A practitioners’ assessment of light reflection models. In: Pacific Graphics 1997 Proceedings, pp. 40–50 (1997) 25. Veach, E.: Non-symmetric scattering in light transport algorithms. In: Proceedings of the Eurographics Workshop on Rendering Techniques 1996, pp. 81–90 (1996)
Harmonic Variation of Edge Size in Meshing CAD Geometries from IGES Format

Maharavo Randrianarivony

Institute of Computer Science, Christian-Albrecht University of Kiel, 24098 Kiel, Germany
[email protected]
Abstract. We shall describe a mesh generation technique on a closed CAD surface composed of a few parametric surfaces. The edge size function is a fundamental entity in order to be able to apply the process of generalized Delaunay triangulation with respect to the first fundamental form. Unfortunately, the edge size function is not known a-priori in general. We describe an approach which invokes the Laplace-Beltrami operator to determine it. We will discuss theoretically the functionality of our methods. Our approach is illustrated by numerical results from the harmonicity of triangulations of some CAD objects. The IGES format is used in order to acquire the initial geometries. Keywords: Geometric modeling, IGES, mesh generation, CAD models, edge size, Delaunay.
1 Introduction
The importance of meshes in computer graphics and geometric modeling has become evident in the past decades [1],[6],[8]. In this paper, we address the problem of creating a mesh [7] on a surface Γ of a CAD model. We report on the results of our intensive implementation using the IGES (Initial Graphics Exchange Specification) format for CAD exchange. Our main goal is to make the variation of the lengths of neighboring edges smooth. That will result in meshes composed of nicely shaped triangles. The generation of a surface mesh by means of the generalized Delaunay technique [4],[6] with respect to the first fundamental form [2] requires knowledge of the edge size function, which is unfortunately unknown a-priori [10]. In [1], the authors triangulate planar domains using the Delaunay technique. A method for evaluating the quality of a mesh is given in [6]. Some upper bounds for the Delaunay triangulation have been investigated in [4]. In this document, our main contribution is the determination of the edge size function for the Delaunay triangulation with respect to the first fundamental form. Additionally, we examine theoretically the functionality of our method. Furthermore, our description is supported by numerical results produced from real IGES data where we investigate mesh harmonicity. In the next section, we will state our problem more specifically and we will introduce various important definitions. After quickly giving a motivation for
planar problems in section 3, we will detail the meshing of a single parametric surface by using a generalization of Delaunay triangulation in section 4. Since the edge size function is known on the boundary, the treatment of the Laplace-Beltrami problem becomes a boundary value problem which we propose to solve numerically in section 5, which also contains the theoretical background of the mesh generation approach. In section 6, we focus on practical aspects and an intuitive description for practitioners. Toward the end of the paper, we report on some benchmarks of CAD objects from IGES files and we compare our method with other approaches.
2 Definitions and Problem Setting
In our approach, the input is a CAD object which is bounded by a closed surface Γ that is composed of n parametric surfaces {Sk}, k = 1, …, n. Each Sk is given as the image of a multiply connected domain Dk ⊂ R² by the following function

    xk : (u1, u2) ∈ R² → (xk,1(u1, u2), xk,2(u1, u2), xk,3(u1, u2)) ∈ R³,            (1)
which is supposed to be bijective and sufficiently smooth [5]. The surfaces Sk will be referred to as the patches of the whole surface Γ. Every patch of the surface Γ is bounded by a list of curves Ci. The CAD models come from the IGES format, whose most important nontrivial entities are listed in Table 1.

Table 1. Most important IGES entities

IGES Entities                          ID numbers   IGES-codes
Line                                   110          LINE
Circular arc                           100          ARC
Polynomial/rational B-spline curve     126          B SPLINE
Composite curve                        102          CCURVE
Surface of revolution                  120          SREV
Tabulated cylinder                     122          TCYL
Polynomial/rational B-spline surface   128          SPLSURF
Trimmed parametric surface             144          TRM SRF
Transformation matrix                  124          XFORM
A mesh Mh is a set of triangles Tk ⊂ Rd (d = 2, 3) such that the intersection of two nondisjoint different triangles is either a single node or a complete edge. If d is 2 (resp. 3), then we will call Mh a 2D (resp. 3D) mesh. For a node A in a mesh Mh , its valence η(A) is the number of edges which are incident upon A. The set of nodes which are the endpoints of edges incident upon A, and which are different from A, will be denoted by ν(A). Our objective is to generate a 3D mesh Mh such that all nodes of Mh are located on the surface Γ . Additionally, we want that the edge lengths vary slowly implying that the lengths of the three edges in any triangle T ∈ Mh are proportional.
We will need the matrix I(xk) := [gij(xk)] which represents the first fundamental form, where

    gij(xk) := < ∂xk/∂ui , ∂xk/∂uj > = Σ_{p=1}^{3} (∂xk,p/∂ui)(∂xk,p/∂uj),   i, j ∈ {1, 2}.      (2)
Motivation for the Planar Case
In this section, we want to treat briefly the mesh generation problem in the planar case (see Fig. 1) that should provide both motivation and intuitive ideas which facilitate the description of the general case of parametric surfaces. For that matter, we want to triangulate a planar multiply connected domain Ωh ⊂ R2 with polygonal boundaries Ph . Note that the boundary edge sizes are generally nonuniform (see Fig. 1). That is usually caused by adaptive discretization of some original curved boundaries P according to some error criteria. In order to
(a)
(b)
(c)
(d)
Fig. 1. Selected steps in mesh recursive refinement
obtain a few of triangles while keeping their good quality shape (Fig. 1(d)), the variation of sizes of neighboring edges should be small. Let us introduce the edge size function (3) ρ : Ωh −→ R+ . If this function is explicitly known, then a way to obtain the mesh is to start from a very coarse mesh (Fig. 1(a)) and to apply node insertion [2] in the middle of every edge [a, b] whose length exceeds the value of ρ at the midnode of [a, b]. Delaunay edge flipping is simultaneously applied to achieve better angle conditions. Unfortunately, the value of ρ is not known in practice. Since the edge
Harmonic Variation of Edge Size in Meshing CAD Geometries
59
size function ρ is known at the boundaries ∂Ωh = Ph , we consider the following boundary value problem: Δρ :=
∂2ρ ∂2ρ + =0 2 ∂u1 ∂u22
in Ωh ,
(4)
with the nonhomogeneous Dirichlet boundary condition given by the edge sizes at the boundary. That means the edge size function is required to be harmonic. A harmonic function satisfies in general the mean value property: 2π 1 ρ(a1 , a2 ) = ρ(a1 + r cos θ, a2 + r sin θ)dθ. (5) 2π 0 That is, ρ(a) is ideally the same as the average of the values of ρ in a circle centered at a = (a1 , a2 ). In that way, the edge size function ρ has practically small variation. Meshes with small edge size variation have advantages [10] in graphics and numerics. First, they use few triangles while not loosing shape quality. Second, they prevent numerical instabilities which are not desired in simulations.
4
Meshing Using the First Fundamental Form
In this section, we summarize the meshing of a single patch Sk specified by the smooth parametric function xk given in (1). To simplify the notation, we will drop the index k in the sequel. The approach to triangulating S proceeds in two steps. First, a 2D mesh on the parameter domain D is generated according to the first fundamental form. Afterwards, the resulting 2D mesh is lifted to the parametric surface S by computing its image by x. We will call an edge of a mesh in the parameter domain a 2D edge and an edge in the lifted mesh a 3D edge. For that purpose, one starts from a coarse 2D mesh of D and a generalized two-dimensional Delaunay refinement is used, as summarized below. Similarly to the planar case, we introduce an edge size function ρ which is now defined on the parametric surface, ρ : S → R+. By composing ρ with the parameterization x of S, we have another function ρ̃ := ρ ∘ x which we will henceforth call the "parameter edge size function" because it is defined for all u = (u, v) in the parameter domain. Let us consider a 2D edge [a, b] ⊂ D and let us denote the first fundamental forms at a and b by Ia and Ib respectively. Further, we introduce the following average distance between a and b:

    dRiem(a, b) := √( (b − a)ᵀ T (b − a) ),   T := 0.5 (Ia + Ib).          (6)

The 2D edge [a, b] is split if this average distance exceeds the value of the parameter edge size function ρ̃ at the midnode of [a, b]. Note that no new boundary nodes are introduced during this refinement because only internal edges are allowed to be split. Consider now a 2D edge [a, c] shared by two triangles which form a convex quadrilateral [a, b, c, d]. Denote by T the average values of the
60
M. Randrianarivony
first fundamental forms Ia, Ib, Ic and Id at those nodes. The edge [a, c] is flipped into [b, d] if the following generalized Delaunay angle criterion is met (here × denotes the scalar 2D cross product):

    ((c − b) × (a − b)) ((a − d)ᵀ T (c − d)) < ((a − d) × (c − d)) ((b − c)ᵀ T (a − b)).
(7)
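The edge-length test based on (6) and the flip criterion (7) can be sketched as follows. This is an illustrative Python version, not the author's code; the array-based points and the callable parameter edge size function are assumptions of the sketch.

import numpy as np

def d_riem(a, b, Ia, Ib):
    """Average Riemannian length of the 2D edge [a, b], cf. (6)."""
    t = 0.5 * (np.asarray(Ia, float) + np.asarray(Ib, float))
    e = np.asarray(b, float) - np.asarray(a, float)
    return float(np.sqrt(e @ t @ e))

def should_split(a, b, Ia, Ib, rho_tilde):
    """Split the edge if its Riemannian length exceeds the parameter
    edge size function evaluated at the midnode."""
    mid = 0.5 * (np.asarray(a, float) + np.asarray(b, float))
    return d_riem(a, b, Ia, Ib) > rho_tilde(mid)

def should_flip(a, b, c, d, T):
    """Generalized Delaunay criterion (7) for the edge [a, c] shared by the
    triangles of the convex quadrilateral [a, b, c, d]; T is the averaged metric."""
    a, b, c, d = (np.asarray(p, float) for p in (a, b, c, d))
    cross = lambda u, v: u[0] * v[1] - u[1] * v[0]
    bc, ba = c - b, a - b
    da, dc = a - d, c - d
    cb = b - c
    return cross(bc, ba) * (da @ T @ dc) < cross(da, dc) * (cb @ T @ ba)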
We would now like to describe the procedure for obtaining the initial coarse triangulation. Suppose that we have a 2D domain P which may contain some holes and which has polygonal boundaries. We may think of P as a polygonal discretization of the parameter domain D. First, the polygon P is split into a few simply connected polygons P = ∪_{i=1}^{N} P(i). Afterwards, we do the following for every simply connected polygon P(i). One initializes its set of triangles as the empty set Th(i) = ∅. Then one finds a triangle T which can be chopped off from P(i). We apply the updates P(i) := P(i) \ T and Th(i) := Th(i) ∪ T. We repeat the same chopping until P(i) has no vertices left. Finally, the triangulation of P is the union of all triangulations: Th := ∪_{i=1}^{N} Th(i).
5 Edge Size and Theoretical Discussion
Let us consider a parametric surface S and a differentiable function F : S → R. The Laplace-Beltrami operator is defined by

    ΔS F = −(1/√g) ∂/∂uj ( √g g^{ij} ∂F/∂ui ),                                       (8)

in which we use the Einstein summation convention and g is the determinant of the matrix I which we introduced in (2). The function F is said to be harmonic if ΔS F = 0. Since the edge size function ρ should be harmonic, we have the following problem:

    −ΔS ρ = 0   in S,      ρ = ρbound   on ∂S.                                       (9)

The values of ρ at the boundary, which are denoted by ρbound, are known because they are described by the boundary discretization. We will sketch in this section how to numerically solve the boundary value problem (9) by means of the finite element method. To that end, we take a temporary mesh Mh on S and we denote its boundary by ∂Mh. For a smooth function φ which takes the value zero at the boundary we have

    ∫_S (−ΔS ρ) φ = ∫_S < ∇S ρ, ∇S φ > =: a(ρ, φ).                                   (10)
Let us define the following set of approximating linear space Vh := {f ∈ C0 (Mh ) : f|T ∈ P1 ∀ T ∈ Mh }, where C0 (Mh ) denotes the space of functions which are globally continuous on Mh and P1 the space of linear polynomials. For a function g we define the set Vhg := {f ∈ Vh : f = g on ∂Mh } which is not in general a linear space. The approximated solution ρh will reside in the set Vhρbound . In order
to find ρh, we pick an element ρ̃ of Vh^{ρbound} and define μh by setting ρh = ρ̃ + μh. The function ρh is therefore completely determined if we know the new unknown function μh, which resides in Vh^0. Observe that Vh^0 is a linear space in which we choose a basis {φi}, i ∈ I. As a consequence, the function μh is a linear combination of the {φi}: μh = Σ_{i∈I} μi φi. By introducing the following bilinear form ah(·, ·):

    ah(ψ, φ) := Σ_{T∈Mh} aT(ψ, φ)   with   aT(ψ, φ) := ∫_T < ∇T ψ, ∇T φ >,           (11)
we have ah(ρh, φ) = 0 for all φ ∈ Vh^0, or equivalently ah(μh, φ) = −ah(ρ̃, φ) for all φ ∈ Vh^0. Since the φi form a basis of Vh^0, this leads to the linear system

    Σ_{i∈I} ah(φi, φj) μi = −ah(ρ̃, φj)   for all j ∈ I.                              (12)
One can assemble the stiffness matrix Mij := ah(φi, φj) and solve (12) for the μi, which yields the value of μh. For every triangle T in Mh with internal angles α1, α2 and α3, its contribution to the stiffness matrix M is

    MT = 0.5 [  cot α2 + cot α3     −cot α3             −cot α2
                −cot α3             cot α1 + cot α3     −cot α1
                −cot α2             −cot α1             cot α1 + cot α2  ].          (13)

We now want to discuss theoretically the applicability of the former triangulation to a given CAD model. We will only emphasize the main points; the complete details can be found in [10]. The main critical aspects which we want to clarify are twofold. First, it has been shown that from every simply connected polygon P [9], one may remove two triangles T1 and T2 (called ears) by introducing internal cuts. Thus, if we suppose that the polygon has n vertices, we need to chop off triangles (n − 2) times in order to obtain the initial coarse mesh. As a consequence, a simply connected polygon can be triangulated by using only boundary nodes. For the triangulation of multiply connected polygons, we need to split them first into several simply connected polygons [9],[10]. Now we will show that the linear system from (12) is uniquely solvable. Consider a triangle T = [A, B, C] of the mesh Mh. Let N3 be the unit normal vector of T. Generate two unit vectors N1 and N2 perpendicular to N3 such that (N1, N2, N3) is an orthonormal system which can be centered at A. Since the triangle T is located in the plane spanned by (N1, N2), every point x ∈ T can be identified by (v1, v2) ∈ R² such that the vector Ax = v1 N1 + v2 N2. Consider the triangle t := [(0, 0), (1, 0), (0, 1)] and let ϕ be the parameterization which transforms t into T :
Let ψ be the linear polynomial such that ψ(0, 0) = μ(A), ψ(1, 0) = μ(B), ψ(0, 1) = μ(C). Its exact expression is ψ(u_1, u_2) = [μ(B) − μ(A)]u_1 + [μ(C) − μ(A)]u_2 + μ(A). We want to compute a_T(·,·) in terms of (u_1, u_2). By introducing a_{ij} := ∂_{v_j}θ_i, the integrand for a_T(·,·) involves I(u_1,u_2) := (a_{11}∂_{u_1}ψ + a_{21}∂_{u_2}ψ)² + (a_{12}∂_{u_1}ψ + a_{22}∂_{u_2}ψ)². Because of (15), we have a_{11} = W_2/D, a_{12} = −W_1/D, a_{21} = −V_2/D and a_{22} = V_1/D, where D = det M. By using the fact that cos α = ⟨V, W⟩/(‖V‖·‖W‖) and sin α = det M/(‖V‖·‖W‖), we obtain
\[
I(u_1,u_2) = \frac{1}{D}\{[\mu(B)-\mu(C)]^2\cot\alpha + [\mu(C)-\mu(A)]^2\cot\beta + [\mu(B)-\mu(A)]^2\cot\gamma\}. \qquad (16)
\]
We have therefore
\[
a_T(\mu,\mu) = \int_t I(u_1,u_2)\,(\det M)\,du_1\,du_2 \qquad (17)
\]
\[
= 0.5\{[\mu(B)-\mu(C)]^2\cot\alpha + [\mu(C)-\mu(A)]^2\cot\beta + [\mu(B)-\mu(A)]^2\cot\gamma\}. \qquad (18)
\]
By introducing Ψ(μ) := a_T(μ, μ), the system in (13) can be obtained by a_T(μ, ν) = 0.5[Ψ(μ + ν) − Ψ(μ) − Ψ(ν)]. Denote by W̃ := (1/D)(∂_{u_1}ψ)W and Ṽ := (1/D)(∂_{u_2}ψ)V; we have I(u_1, u_2) = ‖W̃‖² − 2⟨W̃, Ṽ⟩ + ‖Ṽ‖². Thus, I(u_1, u_2) = 0 iff W̃ = λṼ for some λ ∈ R. Since μ ∈ V_h^0 is globally continuous and takes zero values at the boundary, we have μ = 0. The form a_h(·, ·) is thus symmetric positive definite. Hence, the linear system from (12) is solvable and the edge size function can be deduced.
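For concreteness, the following is a minimal sketch of how the cotangent element contributions (13) could be assembled and the Dirichlet problem (9) solved. It is our own illustration with SciPy, assuming the mesh is given as vertex coordinates plus index triples; it is not the implementation of [10].

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def cot(p, q, r):
    """Cotangent of the internal angle at p in triangle (p, q, r)."""
    u, v = q - p, r - p
    return np.dot(u, v) / np.linalg.norm(np.cross(u, v))

def harmonic_edge_size(verts, tris, boundary_idx, rho_bound):
    """Solve -Laplace_S(rho) = 0 with rho = rho_bound on the boundary,
    cf. (9), assembling the cotangent element matrices of Eq. (13)."""
    n = len(verts)
    A = lil_matrix((n, n))
    for a, b, c in tris:
        ca = cot(verts[a], verts[b], verts[c])   # alpha_1, angle at vertex a
        cb = cot(verts[b], verts[c], verts[a])   # alpha_2, angle at vertex b
        cc = cot(verts[c], verts[a], verts[b])   # alpha_3, angle at vertex c
        MT = 0.5 * np.array([[cb + cc, -cc, -cb],
                             [-cc, ca + cc, -ca],
                             [-cb, -ca, ca + cb]])   # Eq. (13)
        for i, gi in enumerate((a, b, c)):
            for j, gj in enumerate((a, b, c)):
                A[gi, gj] += MT[i, j]
    A = A.tocsr()
    rho = np.zeros(n)
    rho[boundary_idx] = rho_bound                  # Dirichlet data rho_bound
    interior = np.setdiff1d(np.arange(n), boundary_idx)
    rhs = -A[interior][:, boundary_idx] @ rho[boundary_idx]
    rho[interior] = spsolve(A[interior][:, interior], rhs)
    return rho
```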
6
Practical Aspect and Intuitive Description
In this section, we describe a realistic approach which is interesting for practitioners. Intuitively, our meshing approach is similar to the standard 2D Delaunay one, but we replace the usual Euclidean distance by the one given in (6). That means that the 2D parameter mesh, which might be very anisotropic, corresponds to a surface mesh that is very well shaped. In the previous discussions, the triangulation of a single trimmed surface [3] was provided. In practice, an IGES file contains several trimmed surfaces. Therefore, we will describe briefly how to triangulate the whole surface Γ composed of the patches S_1, ..., S_n. First, we discretize (Fig. 2) the curved boundaries C_i by piecewise linear curves C̃_i in which we aim at both accuracy and smoothness: curves which are almost straight need few vertices while those having sharp curvatures need many vertices [10]. Afterwards, we map the 3D nodes of the relevant piecewise linear curves C̃_i back to the parameter domain D_k ⊂ R² for each patch S_k. Thus, we may apply the former approach to the polygons formed by the 2D preimages of the 3D nodes. In other words, a mesh M_k is created by using the technique in Section 4 for each surface patch S_k. Finally, one merges the meshes M_1, ..., M_n in order to obtain the final mesh of Γ. Since we use no boundary nodes other than those corresponding to the preimages, no new nodes are inserted during the refinements. As a consequence, nodes at the interface of two adjacent patches align.
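As a rough illustration of this patch-wise pipeline, a driver could string the four steps together as sketched below. The four callables are placeholders standing in for the corresponding routines of [10]; none of the names come from the paper.

```python
def mesh_cad_surface(patches, discretize_boundary, pull_back, mesh_patch, merge_meshes):
    """Sketch of the whole-surface meshing pipeline; the four callables are
    hypothetical stand-ins for the actual routines (see [10])."""
    patch_meshes = []
    for S_k in patches:
        # 1. discretize the curved 3D boundary curves of the patch S_k
        boundary_nodes_3d = discretize_boundary(S_k)
        # 2. map the 3D boundary nodes back to the parameter domain D_k
        polygon_2d = pull_back(S_k, boundary_nodes_3d)
        # 3. triangulate the 2D polygon using only boundary nodes (Section 4)
        #    with the metric induced by the surface, then lift to 3D
        patch_meshes.append(mesh_patch(S_k, polygon_2d))
    # 4. merge the per-patch meshes; interface nodes align because adjacent
    #    patches share the same boundary discretization
    return merge_meshes(patch_meshes)
```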
Fig. 2. (a) Boundary nodes (b) Surface mesh
7
Numerical Results and Comparison
In this section, we illustrate our former approach numerically. Additionally, we compare our results with other meshing approaches. The CAD objects whose surfaces have to be triangulated are given as input in IGES files. We consider four CAD objects which have respectively 30, 25, 24 and 26 patches. We used the former method to generate meshes on their surfaces. The resulting meshes, which have respectively 11834, 7944, 7672 and 8932 elements, are portrayed in Fig. 3. We would like to investigate the harmonicity of the meshes, which we define now. For any considered node A ∈ R³ of a mesh M_h, we define
\[
\rho(A) := \frac{1}{\eta(A)} \sum_{B \in \nu(A)} \|\overrightarrow{AB}\| \qquad (19)
\]
to be the average edge length, where ν(A) denotes the set of nodes adjacent to A and η(A) their number. Now we define r(A) to be the length of the shortest edge incident to the node A and we let s_i(A) be the intersection of the i-th edge [A, B_i] incident upon A and the sphere centered at A with radius r(A). We define the discrete mean value ρ_mean(A) to be
\[
\rho_{mean}(A) := \frac{1}{\eta(A)} \sum_{i=1}^{\eta(A)} \rho(s_i(A)),
\]
in which ρ(s_i(A)) is the following convex combination of ρ(A) and ρ(B_i):
\[
\rho(s_i(A)) := \frac{\|\overrightarrow{As_i}\|}{\|\overrightarrow{AB_i}\|}\,\rho(B_i) + \left(1 - \frac{\|\overrightarrow{As_i}\|}{\|\overrightarrow{AB_i}\|}\right)\rho(A). \qquad (20)
\]
Fig. 3. Four meshes generated from CAD objects

Table 2. Harmonicity of the four meshes

        Average harmonicity   Smallest harmonicity   Largest harmonicity
mesh1   0.997292              0.738332               1.263884
mesh2   0.997050              0.724865               1.231922
mesh3   0.997709              0.755730               1.239204
mesh4   0.997353              0.745304               1.270894
We have a discrete mean value property if ρ(A) = ρ_mean(A). We define the harmonicity of a node A to be the ratio ξ(A) := ρ(A)/ρ_mean(A). If the value of the harmonicity approaches unity, the discrete mean value property holds. We have computed the average harmonicity of the four meshes; the results can be found in Table 2. As one can note in Table 2, the meshes in our tests practically exhibit good harmonicity. That can equally be observed in Fig. 3.
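As an illustration, the harmonicity measure of (19)-(20) could be evaluated as in the following sketch; representing the mesh as vertex coordinates plus an edge list is our own assumption.

```python
import numpy as np

def harmonicity(verts, edges):
    """xi(A) = rho(A) / rho_mean(A), following Eqs. (19)-(20)."""
    nbrs = [[] for _ in verts]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    # Eq. (19): average length of the edges incident to each node
    rho = np.array([np.mean([np.linalg.norm(verts[b] - verts[a]) for b in nb])
                    for a, nb in enumerate(nbrs)])
    xi = np.zeros(len(verts))
    for a, nb in enumerate(nbrs):
        lengths = [np.linalg.norm(verts[b] - verts[a]) for b in nb]
        r = min(lengths)                        # shortest incident edge r(A)
        # Eq. (20): rho at the sphere/edge intersections s_i, then averaged
        rho_s = [(r / l) * rho[b] + (1.0 - r / l) * rho[a]
                 for b, l in zip(nb, lengths)]
        xi[a] = rho[a] / np.mean(rho_s)
    return xi
```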
Most mesh generation approaches do not take harmonicity into consideration. For the sake of comparison, we have generated four meshes for the same CAD data but with another meshing technique where the harmonicity is neglected. The resulting average harmonicities are respectively 1.603, 1.909, 1.512 and 1.802. That means that neighboring edges are not guaranteed to have proportional edge size.
8
Conclusion and Future Work
We have described a method for generating a triangular surface mesh on a CAD model given in IGES format. Neighboring edges have proportional lengths. That was achieved by imposing that the edge size function be harmonic with respect to the Laplace-Beltrami operator. In the future, we will improve the implementation by including more IGES entities. Since many users are now moving from the graphics standard IGES to STEP, we intend also to accept STEP files. Additionally, we will develop quadrilateral meshes since they are appropriate for some graphical or numerical tasks.
References
1. Borouchaki, H., George, P.: Aspects of 2-D Delaunay Mesh Generation. Int. J. Numer. Methods Eng. 40(11), 1957–1975 (1997)
2. Bossen, F., Heckbert, P.: A Pliant Method for Anisotropic Mesh Generation. In: 5th International Meshing Roundtable, Sandia National Laboratories, pp. 63–76 (1996)
3. Brunnett, G.: Geometric Design with Trimmed Surfaces. Computing Supplementum 10, 101–115 (1995)
4. Edelsbrunner, H., Tan, T.: An Upper Bound for Conforming Delaunay Triangulations. Discrete Comput. Geom. 10, 197–213 (1993)
5. Farin, G.: Curves and Surfaces for Computer Aided Geometric Design: A Practical Guide. Academic Press, Boston (1997)
6. Frey, P., Borouchaki, H.: Surface Mesh Quality Evaluation. Int. J. Numer. Methods Eng. 45(1), 101–118 (1999)
7. Graham, I., Hackbusch, W., Sauter, S.: Discrete Boundary Element Methods on General Meshes in 3D. Numer. Math. 86(1), 103–137 (2000)
8. Kolingerova, I.: Genetic Approach to Triangulations. In: 4th International Conference on Computer Graphics and Artificial Intelligence, France, pp. 11–23 (2000)
9. O'Rourke, J.: Computational Geometry in C. Cambridge Univ. Press, Cambridge (1998)
10. Randrianarivony, M.: Geometric Processing of CAD Data and Meshes as Input of Integral Equation Solvers. PhD thesis, Technische Universität Chemnitz (2006)
Generating Sharp Features on Non-regular Triangular Meshes
Tetsuo Oya1,2, Shinji Seo1, and Masatake Higashi1
1 Toyota Technological Institute, Nagoya, Japan
2 The University of Tokyo, Institute of Industrial Science, Tokyo, Japan
{oya, sd04045, higashi}@toyota-ti.ac.jp
Abstract. This paper presents a method to create sharp features such as creases or corners on non-regular triangular meshes. To represent sharp features on a triangular spline surface we have studied a method that enables designers to control the sharpness of the feature parametrically. Extended meshes are placed to make parallelograms, and then we have an extended vertex which is used to compute control points for a triangular Bézier patch. This extended vertex, expressed with a parameter, enables designers to change the shape of the sharp features. The former method we presented deals only with regular meshes, which is a strong restriction given the variety of meshes found in practice. Therefore, we have developed a method to express sharp features around an extraordinary vertex. In this paper, we present algorithms to express creases and corners for a triangular mesh including extraordinary vertices.
1
Introduction
Computer-aided design tools have supported the designer's work to create aesthetic and complex shapes. However, representing a pleasing and high-quality surface is still a difficult task. The reason for this difficulty is that both high continuity of the surface and the ability to handle 2-manifold surfaces with arbitrary topology are required, especially in the industrial design field. In addition, sharp features such as creases and corners, which play a significant role in expressing a product's shape, should be handled as the designer intends. However, expressing sharp features at an arbitrary place is not easy. Therefore, a method to represent sharp features and to control their shapes would be a great help to designers. There are two major ways to represent surfaces in computer graphics. In CAD/CAM software, tensor product surfaces such as Bézier, B-spline and NURBS are usually used to represent free-form surfaces. These are expressed in parametric form and generally have differentiability high enough to represent class-A surfaces. However, connecting multiple patches with high continuity is rather difficult. Also, techniques like trimming or blending are usually utilized to represent the required complexity, and this can be an exhausting job. Furthermore, it is difficult to generate sharp features on arbitrary edges. The other important method, namely subdivision surfaces, has become popular in recent years, especially in the entertainment industry. Inputting
an original mesh, some subdivision scheme is repeatedly performed on the vertices and the faces of the mesh, and a refined result is obtained. Although its limit surface is theoretically continuous everywhere, the obtained surface is a piecewise smooth surface. Thus it is not applicable to the surfaces used in industrial design, where high quality surfaces are always required. Moreover, the parametric form of the surface is not available. As for sharp features on a subdivision surface, there are many studies dealing with creases and corners. Nasri [1] presented subdivision methods to represent boundary curves, to interpolate data points, and to obtain intersection curves of subdivision surfaces. Hoppe et al. [2] proposed a method to reconstruct piecewise smooth surfaces from scattered points. They introduced a representation technique of sharp features based on Loop's subdivision scheme [3]. To model features like creases, corners and darts, several new masks were defined on regular and non-regular meshes. DeRose et al. [4] described several effective subdivision techniques to be used in character animation. They introduced a method to generate semi-sharp creases whose sharpness can be controlled by a parameter. Biermann et al. [5] improved subdivision rules so as to solve the problems of extraordinary boundary vertices and concave corners. Based on this method, Ying and Zorin [6] presented a nonmanifold subdivision scheme to represent a surface where different patches interpolate a common edge along with the same tangent plane. Sederberg et al. [7] presented a new spline scheme, called T-NURCCs, by generalizing B-spline surfaces and Catmull-Clark surfaces. In this method, features are created by inserting local knots. To create and change features, direct manipulation of knots and control points is required. These methods have succeeded in representing sharp features; however, the subdivision surface technique is not a principal method in the industrial design field, where high quality smooth surfaces are demanded. An alternative method to create surfaces is generating a spline surface composed of Bézier patches. Triangular Bézier patching can be used to represent complex models because the patches are easily computed from the original mesh. The advantage of this method is that it is rather easier to keep continuity across the patches than with conventional tensor product patches. Hahmann [8] [9] has shown the effectiveness of the spline surface technique. Yvart and Hahmann [10] proposed a hierarchical spline method to represent smooth models on arbitrary topology meshes. With their method, designers are able to create complex models preserving tangent plane continuity when refining a local patch to add details. However, they have not mentioned how to represent sharp features. Thus, to be a more practical method, representing sharp features on a triangular spline surface should be studied. Loop [11] represented a sharp edge as a boundary curve by connecting two patches; however, its shape is not controllable because it depends on the patches' boundary curves. Higashi [12] presented a method to express sharp features by using the concept of extended mesh. With that method, the shape of the edge can be changed parametrically. In spite of its high potential, the triangular spline technique is not frequently used like other methods. One of the reasons is its difficulty of handling non-regular meshes. Here, a non-regular mesh means a mesh containing an extraordinary
vertex whose valence is not six. In this paper, we developed a method to represent controllable sharp features on a non-regular triangular mesh. This paper is organized as follows. Sec. 2 presents basics on B´ezier representation used in this paper. Sec. 3 describes the method of mesh extension to express sharp features. In Sec. 4, we present main contributions of this paper, that is, schemes to handle non-regular meshes. Then several examples are shown in Sec. 5, and Sec. 6 concludes this paper.
2
Triangular Bézier Patch
To construct a triangular spline surface, we utilize a triangular Bézier patch. In this section, we briefly describe important backgrounds about Bézier forms [13]. A Bézier surface of degree m by n is defined as a tensor product surface
\[
\mathbf{b}^{m,n}(u,v) = \sum_{i=0}^{m}\sum_{j=0}^{n} \mathbf{b}_{i,j}\, B_i^m(u)\, B_j^n(v), \qquad (1)
\]
where b_{i,j} is the control net of a Bézier surface, and B_i^m(u) is the Bernstein polynomial of degree m. A triangular Bézier patch is defined in barycentric coordinates, denoted u := (u, v, w) with u + v + w = 1. The expression is
\[
\mathbf{b}(\mathbf{u}) = \sum_{|\mathbf{i}|=n} \mathbf{b}_{\mathbf{i}}\, B_{\mathbf{i}}^n(\mathbf{u}), \qquad |\mathbf{i}| = i + j + k, \qquad (2)
\]
where b_i is a triangular array of the control net and
\[
B_{\mathbf{i}}^n(\mathbf{u}) = \binom{n}{\mathbf{i}}\, u^i v^j w^k = \frac{n!}{i!\,j!\,k!}\, u^i v^j w^k; \qquad |\mathbf{i}| = n, \qquad (3)
\]
are the bivariate Bernstein polynomials. In this paper we use quartic Bézier patches, thus the degree n is set to 4. The triangular spline surface used in this paper is represented by Bézier patches. Computing the necessary control points b_i from the original mesh, we can obtain the corresponding Bézier patches. Composing all of them, the resulting surface is a C² surface if the original mesh is regular. We utilize Sabin's rules [14] [15] to compute Bézier points. Let P_0 be the ordinary vertex and P_1(i); i = 1, ..., 6 be the six neighboring vertices of P_0. In the case of a quartic triangular Bézier patch there are fifteen control points; however, just four distinct rules exist due to symmetry. The following are the rules to compute the Bézier control points Q_{ijk} as illustrated in Fig. 1:
\[
24\,Q_{400} = 12P_0 + 2\sum_{i=1}^{6} P_1(i), \qquad (4)
\]
\[
24\,Q_{310} = 12P_0 + 3P_1(1) + 4P_1(2) + 3P_1(3) + P_1(4) + P_1(6), \qquad (5)
\]
\[
24\,Q_{211} = 10P_0 + 6P_1(1) + 6P_1(2) + P_1(6) + P_1(3), \qquad (6)
\]
\[
24\,Q_{220} = 8P_0 + 8P_1(2) + 4P_1(1) + 4P_1(3). \qquad (7)
\]
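As a small worked example, rules (4)-(7) for one ordinary vertex translate into the following sketch; the cyclic ordering of the one-ring follows Fig. 1, and the function name is ours, not the authors'.

```python
import numpy as np

def corner_control_points(P0, ring):
    """Distinct Bezier control points near an ordinary vertex P0 with one-ring
    `ring` = [P1(1), ..., P1(6)] in cyclic order, following Eqs. (4)-(7)."""
    P1 = lambda i: ring[(i - 1) % 6]          # 1-based, cyclic indexing
    Q400 = (12 * P0 + 2 * sum(ring)) / 24.0
    Q310 = (12 * P0 + 3 * P1(1) + 4 * P1(2) + 3 * P1(3) + P1(4) + P1(6)) / 24.0
    Q211 = (10 * P0 + 6 * P1(1) + 6 * P1(2) + P1(6) + P1(3)) / 24.0
    Q220 = (8 * P0 + 8 * P1(2) + 4 * P1(1) + 4 * P1(3)) / 24.0
    return Q400, Q310, Q211, Q220

# Example usage with numpy arrays as vertices:
# P0 = np.array([0.0, 0.0, 0.0]); ring = [np.random.rand(3) for _ in range(6)]
# Q400, Q310, Q211, Q220 = corner_control_points(P0, ring)
```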
Fig. 1. Control points for triangular Bézier patch

Fig. 2. Mesh extension and definition of parameter s
The remaining control points are obtained by applying the same rules to the neighboring vertices of P1 (1) and P1 (2).
3
Mesh Extension
In this section we briefly describe the method of mesh extension [12] to represent sharp features on a regular mesh. First, a designer specifies an edge to be a crease, then the meshes sharing the specified edge are separated. Next, we make extended vertices at the opposite side of the specified edge for both triangles. As shown in Fig. 2, let the original triangles be ABC and BCD. The edge BC is specified to create a crease. Point E is the midpoint of the edge BC and F is the midpoint of AD. Then the position of G is parametrically defined so as to satisfy the relation G = sE + (1 − s)F
(8)
where s denotes the parameter that controls the sharpness of the crease. Finally, the extended vertex H is defined as the position satisfying \overrightarrow{AH} = 2\,\overrightarrow{AG}, and the control triangle BCH is produced. On the opposite side, a similar procedure is conducted with the same parameter s.
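A minimal sketch of this construction for one side of the crease edge is given below; reading the relation for H as \overrightarrow{AH} = 2\overrightarrow{AG} is our interpretation, and the vertices are assumed to be plain coordinate arrays.

```python
import numpy as np

def extended_vertex(A, B, C, D, s):
    """Extended vertex H for the crease edge BC shared by triangles ABC and
    BCD (cf. Fig. 2).  s = 0 keeps the smooth surface, s = 1 gives the
    sharpest crease.  AH = 2*AG is assumed here."""
    E = 0.5 * (B + C)          # midpoint of the crease edge BC
    F = 0.5 * (A + D)          # midpoint of AD
    G = s * E + (1.0 - s) * F  # Eq. (8)
    H = A + 2.0 * (G - A)      # extended vertex; forms control triangle BCH
    return H
```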
Fig. 3. Examples of generating crease edges for a regular mesh: (a) s = 0 (smooth surface), (b) s = 0.5 (crease), (c) s = 1.0 (crease)
Inputting s, all Bézier control points are determined and we get a regenerated surface where a sharp crease appears. By changing the value of s, the designer is able to control the sharpness of the crease. Fig. 3 shows examples of the surfaces created by this method. When s is equal to 0, the resulting surface is the original C² surface itself (Fig. 3(a)). When s is equal to 1, the resulting surface (Fig. 3(c)) is the same as the shape of the subdivision crease. For the details of generating a dart or a corner, see [12].
4
Sharp Features on Non-regular Meshes
This section introduces the rules to generate sharp features such as creases and corners on non-regular meshes. Higashi et al. [12] presented the method of mesh extension to create sharp features on regular meshes described in the previous section, however, computing rules for generating them on non-regular meshes are not shown. We have developed a method based on the mesh extension technique and Peters’ algorithm [16] to generate a smooth surface with sharp features on a mesh including an extraordinary vertex. 4.1
Creating a Crease
There are numerous types of non-regular meshes. Here, non-regular meshes mean the set of meshes around an extraordinary vertex whose valence is not six. In this paper, we deal with the case of n = 5, because the proposed method can be applied to other cases such as n = 4 or 7 as well. In the case of n = 5, there are five meshes around an extraordinary vertex. Crease edges are defined on the two edges that divide these five meshes into two and three. On the side of the three meshes, the same rule as in the case of regular meshes is applied to obtain the control points. On the side of the two meshes, there are three steps to compute the control points. In Fig. 4 (a), the input meshes are shown. For the first step of the process, as depicted in Fig. 4 (b), one edge is changed to connect the vertex in orange with the extraordinary vertex. Then, the two meshes including crease edges are extended to make parallelograms using two new vertices, which are colored red. By this treatment, the original non-regular meshes are now regular. Using the ordinary Sabin rules, the control points represented by small red points are computed. The five white points are computed by Sabin's rules with one extended vertex and five original vertices, and the small black control points are obtained by using the original six vertices. Second, the same process is conducted on the opposite side, as shown in Fig. 4 (c). In the third step, as illustrated in Fig. 4 (d), the positions of the two control points in yellow are modified to lie midway between the two adjacent control points. This calculation is done to keep G¹ continuity along the edge. Now, all of the required control points are available and we obtain the resulting surface by composing five Bézier patches, each expressed with Eq. (2). Note that the vertices of the extended meshes are only used to compute the necessary control
Fig. 4. Description of the presented procedure: (a) the input meshes and vertices, (b) change of an edge (orange) and computation of control points (red) with extended meshes, (c) the same procedure as (b) on the other side, (d) modification of the positions of the control points (yellow) to obtain a smooth surface
points. This surface represents the case of s = 1, therefore, the shape of the crease is identical to the original mesh. In order to represent the case of s = 0, we adopt Peters’s scheme [16] and these two surfaces are linearly interpolated with parameter s. Thus the surface changes its shape between these two shapes by inputting the parameter s. 4.2
Creating a Corner
To represent a sharp corner, we defined another rule. For simplicity, the case of n = 3 is described. As shown in Fig. 5, there are three faces meeting at the target corner. The control points of each face are obtained by using the mesh extension procedure. Making parallelograms, five faces are generated as if the target corner vertex were the center of six meshes. Control points are calculated by Sabin's rules with these six vertices. Performing the same procedure on the other two faces, all required control points are obtained. These control points represent the shape of the input mesh itself (s = 1). Then, using Peters' scheme [16] with the original vertices, we obtain control points that express a smooth surface (s = 0). Seven control points, namely the corner point and its surrounding six points, are employed to represent a sharp corner. These points are colored blue in Fig. 6. Using the shape control parameter s, these seven control points are linearly interpolated as p_new = (1 − s) p_smooth + s p_sharp
(9)
Fig. 5. An illustration of mesh extension to obtain control points for the corner triangle faces. Starting from one of the corner meshes, four vertices are generated by making parallelograms. The same procedure is conducted on the remaining two faces.
Fig. 6. Description of sharp corner generation. To change the shape of the corner, the seven colored control points are used. These control points are linearly interpolated between Peters' control points (red) and the control points (blue) obtained by using mesh extension.
where p_smooth is the position of the control points obtained by using Peters' scheme and p_sharp denotes the position of the control points generated by the mesh extension. Given a value of the parameter s, the new control points p_new are computed by Eq. (9). By changing the value of the parameter s, we obtain a smooth corner (s = 0) or a sharp corner (s = 1).
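The per-control-point blend of Eq. (9) amounts to the small sketch below, where p_smooth and p_sharp are assumed to hold the seven corner control points as coordinate arrays.

```python
import numpy as np

def blend_corner_controls(p_smooth, p_sharp, s):
    """Eq. (9): interpolate the seven corner control points between the
    smooth configuration (s = 0) and the sharp one (s = 1)."""
    p_smooth = np.asarray(p_smooth, dtype=float)
    p_sharp = np.asarray(p_sharp, dtype=float)
    return (1.0 - s) * p_smooth + s * p_sharp
```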
5
Results
This section provides application results where the presented method is used to represent sharp features on non-regular meshes. Here, results of generating creases are given for the cases of valence n = 4, 5 and 7. The tested meshes are depicted in Fig. 7. Figs. 8–10 show the results of each case, where the parameter s is changed from 0 to 0.5 and 1. When s = 0, the resulting surface is identical to the smooth surface that is obtained by using the input mesh. On the other hand, if s becomes greater than 0, creases appear at the specified edges. When s is equal to 1, the shape of the crease is the same as the shape of the input mesh. A small undulation is observed around the extraordinary vertex when s = 0.5. The reason is that the smooth surface (s = 0) is constructed by using Peters' scheme, where the tangential plane at the extraordinary vertex is arbitrarily input by a user, and the shapes of the Bézier patches are influenced by this tangential plane. Therefore, the crease line undulates when the mesh is not regular because the tangent vectors are not necessarily parallel to the crease edges. This must be overcome to generate high-quality creases. Fig. 11 represents the case of a sharp corner. This picture shows that a sharp corner is also successfully generated.
Fig. 7. Types of meshes used to produce examples: n = 4 (non-regular), n = 5 (non-regular), n = 6 (regular), n = 7 (non-regular)
Fig. 8. Results of generating creases in the case of n = 4: (a) s = 0 (smooth surface), (b) s = 0.5 (crease), (c) s = 1.0 (crease)
Fig. 9. Results of generating creases in the case of n = 5: (a) s = 0 (smooth surface), (b) s = 0.5 (crease), (c) s = 1.0 (crease)
Fig. 10. Results of generating creases in the case of n = 7: (a) s = 0 (smooth surface), (b) s = 0.5 (crease), (c) s = 1.0 (crease)
Fig. 11. Results of generating a sharp corner in the case of n = 3: (a) s = 0 (smooth surface), (b) s = 0.5 (corner), (c) s = 1.0 (corner)
6
Conclusion
This paper presented a method to generate sharp features on non-regular meshes. Our method is based upon the regular version of the mesh extension technique, and we have developed new schemes to deal with non-regular meshes. The results suggest that the method is effective for the tested cases. Future work will include an exploration of more varied cases and a pursuit of higher feature quality. Acknowledgement. This study was financially supported by the High-tech Research Center for Space Robotics from the Ministry of Education, Sports, Culture, Science and Technology, Japan.
References
1. Nasri, A.H.: Polyhedral Subdivision Methods for Free-Form Surfaces. ACM Transactions on Graphics 6(1), 29–73 (1987)
2. Hoppe, H., DeRose, T., Duchamp, T., Halstead, M.: Piecewise Smooth Surface Reconstruction. In: Proc. SIGGRAPH 1994, pp. 295–302 (1994)
3. Loop, C.: Smooth Subdivision Surfaces Based on Triangles. Master's thesis, Department of Mathematics, University of Utah (1987)
4. DeRose, T., Kass, M., Truong, T.: Subdivision Surfaces in Character Animation. In: Proc. SIGGRAPH 1998, pp. 85–94 (1998)
5. Biermann, H., Levin, A., Zorin, D.: Piecewise Smooth Subdivision Surfaces with Normal Control. In: Proc. SIGGRAPH 2000, pp. 113–120 (2000)
6. Ying, L., Zorin, D.: Nonmanifold Subdivision. In: Proc. IEEE Visualization, pp. 325–332 (2001)
7. Sederberg, T.W., Zheng, J., Bakenov, A., Nasri, A.: T-splines and T-NURCCs. ACM Transactions on Graphics 22(3), 477–484 (2003)
8. Hahmann, S., Bonneau, G.-P.: Triangular G1 interpolation by 4-splitting domain triangles. Computer Aided Geometric Design 17, 731–757 (2000)
9. Hahmann, S., Bonneau, G.-P.: Polynomial Surfaces Interpolating Arbitrary Triangulations. IEEE Transactions on Visualization and Computer Graphics 9(1), 99–109 (2003)
10. Hahmann, S., Bonneau, G.-P.: Hierarchical Triangular Splines. ACM Transactions on Graphics 24(4), 1374–1391 (2005)
11. Loop, C.: Smooth Spline Surfaces over Irregular Meshes. In: Proc. SIGGRAPH 1994, pp. 303–310 (1994)
12. Higashi, M., Inoue, H., Oya, T.: High-Quality Sharp Features in Triangular Meshes. Computer-Aided Design & Applications 4, 227–234 (2007)
13. Farin, G.: Curves and Surfaces for CAGD, 5th edn. Academic Press, London (2005)
14. Boem, W.: The De Boor Algorithm for Triangular Splines. In: Surfaces in Computer Aided Geometric Design, pp. 109–120. North-Holland, Amsterdam (1983)
15. Boem, W.: Generating the Bézier Points of Triangular Splines. In: Surfaces in Computer Aided Geometric Design, pp. 77–91. North-Holland Publishing Company, Amsterdam (1983)
16. Peters, J.: Smooth Patching of Refined Triangulations. ACM Transactions on Graphics 20(1), 1–9 (2001)
A Novel Artificial Mosaic Generation Technique Driven by Local Gradient Analysis
Sebastiano Battiato, Gianpiero Di Blasi, Giovanni Gallo, Giuseppe Claudio Guarnera, and Giovanni Puglisi
Dipartimento di Matematica e Informatica, University of Catania, Italy
{battiato,gdiblasi,gallo,guarnera,puglisi}@dmi.unict.it
http://www.dmi.unict.it/~iplab
Abstract. Art often provides valuable hints for technological innovations, especially in the field of Image Processing and Computer Graphics. In this paper we present a novel method to generate an artificial mosaic starting from a raster input image. This approach, based on Gradient Vector Flow computation and some smart heuristics, permits us to follow the most important edges while maintaining high-frequency details. Several examples and comparisons with other recent mosaic generation approaches show the effectiveness of our technique. Keywords: Artificial Mosaic, Non Photo-Realistic Rendering, Gradient Vector Flow.
1
Introduction
Mosaics are artworks created by cementing together small colored tiles. They can probably be considered the first example of an image synthesis technique based on discrete primitives. The creation of a digital mosaic of artistic quality is a challenging task. Many factors like position, orientation, size and shape of tiles must be taken into account in the mosaic generation in order to densely pack the tiles and emphasize the orientation chosen by the artist. Digital mosaic generation from a raster image can be formulated as an optimization problem in the following way: given an image I in the plane R² and a vector field Φ(x, y) defined on that region by the influence of the edges of I, find N sites P_i(x_i, y_i) in I and place N rectangles, one at each P_i, oriented with sides parallel to Φ(x_i, y_i), such that all rectangles are disjoint, the area they cover is maximized and each tile is colored by a color which reproduces the image portion covered by the tile [1]. Many mosaic generation algorithms have been recently developed. Hausner [1] uses Centroidal Voronoi Diagrams together with user-selected features and the Manhattan distance. This approach obtains good results but, due to the high number of iterations necessary to reach convergence, it is computationally slow. In [2,3] the authors present an approach based on directional guidelines and distance transform. They obtain very realistic results with a linear complexity
with respect to image size. A novel technique for ancient mosaic generation has been presented in [4]. The authors, using a graph-cut optimization algorithm, are able to work on tile positioning without an explicit edge detection phase. Other related works can be found in [5,6,7]. A complete survey of the existing methods in the field of artificial mosaics is available in [8]. In this paper we propose a novel approach based on Gradient Vector Flow (GVF) [9] computation together with some smart heuristics used to drive tile positioning. Almost all previous approaches filter out high frequencies in order to simplify mosaic generation. Preliminary works [10] have shown that GVF properties permit us to preserve edge information and maintain image details. The novelty of this paper is related to the heuristics used to follow principal edges and to maximize the overall mosaic area covering. In particular, the tile positioning is not based only on gradient magnitude [10] but makes use of local considerations to link together vectors that share the same "logical" edge. Experimental results confirm the better quality of the new technique with respect to state-of-the-art proposals [8]. The paper is structured as follows. The next section describes in detail the proposed methodology. Experiments are reported in Section 3, whereas Section 4 closes the paper outlining directions for future work.
2
Proposed Algorithm
In order to emulate the style of an ancient artisan we have developed an automatic technique based on the following two steps:
– GVF (Gradient Vector Flow) field computation based on the algorithm of [9];
– rule-based tile positioning.
GVF is a dense force field designed by the authors of [9] in order to solve the classical problems affecting snakes (sensitivity to initialization and poor convergence to boundary concavities). Starting from the gradient of an image, this field is computed through diffusion equations. GVF is a field of vectors v = [v, u] that minimizes the following energy function:
\[
E = \iint \mu\,(u_x^2 + u_y^2 + v_x^2 + v_y^2) + |\nabla f|^2\,|\mathbf{v} - \nabla f|^2 \; dx\,dy, \qquad (1)
\]
where the subscripts represent partial derivatives along the x and y axes respectively, µ is a regularization parameter and |∇f| is the gradient computed from the intensity input image. Due to the formulation described above, the GVF field values are close to |∇f| where this quantity is large (the energy E, to be minimized, is dominated by |∇f|²|v − ∇f|²) and are slowly varying in homogeneous regions (the energy E is dominated by the sum of the squares of the partial derivatives of the GVF field). An example of a GVF field is shown in Figure 1.
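For illustration, a simplified GVF iteration in the spirit of [9] could look as follows; this is only a sketch (the authors rely on an external MATLAB module), and the step size, iteration count and wrap-around Laplacian are our own choices.

```python
import numpy as np

def gvf(f, mu=0.2, n_iter=80, dt=0.25):
    """Iterative GVF computation (diffusion of the edge-map gradient).
    `f` is the edge map, e.g. the equalized gradient magnitude in [0, 1]."""
    fy, fx = np.gradient(f)               # spatial derivatives of the edge map
    b = fx**2 + fy**2                     # data-term weight |grad f|^2
    u, v = fx.copy(), fy.copy()           # initialize the field with grad f

    def laplacian(w):
        # simple 5-point Laplacian with wrap-around boundaries
        return (np.roll(w, 1, 0) + np.roll(w, -1, 0) +
                np.roll(w, 1, 1) + np.roll(w, -1, 1) - 4.0 * w)

    for _ in range(n_iter):
        u += dt * (mu * laplacian(u) - b * (u - fx))
        v += dt * (mu * laplacian(v) - b * (v - fy))
    return u, v

# Tile orientation at a pixel (i, j), as used by the placement rules:
# alpha = np.arctan2(v[i, j], u[i, j])
```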
Fig. 1. Input image and its corresponding GVF field
This vector field can be used to effectively drive tile positioning. Edge information is preserved, propagated to nearby regions and merged together in a smooth way. Let I be the input image to be mosaicized. In order to simplify the algorithm we work only on the luminance channel of the image I, eliminating hue and saturation information. The luminance L(I) is then equalized and the discrete gradient of the result is calculated by means of crossing differences. The equalization process, especially for natural images, allows us to normalize the overall gradient distribution. The horizontal and vertical components of the gradient ∇L(I) are used as input for the GVF algorithm. Notice that in the implementation the gradient computation is performed using Robert's kernel. This choice is more noise sensitive and hence incorporates in the final mosaic a little, aesthetically pleasant, randomness. All the tiles have the same rectangular shape and the same size. Moreover, we impose that tiles do not overlap. The placement order is hence fundamental for the overall visual effect. First we consider local |GVF(I)| maxima with values greater than a threshold t_h. These pixels, sorted according to |GVF(I)|, are selected together with their neighbors with |GVF(I)| greater than t_l (chains of tiles are built up and placed). We impose that the tile orientation is obtained according to the GVF direction in its central point. In this way we first locate the neighborhood of the main edges in the input image, in order to follow the perceptual orientation of the image itself. The second step of the algorithm is devoted to covering the homogeneous regions of the image. This is accomplished by simply placing each tile one by one following the order left to right, top to bottom, starting from the upper-left corner of the image. This heuristic strategy, somewhat arbitrary, is justified by the properties of the GVF: this technique leads to aesthetically pleasant results, by preserving the main orientations and covering a wide portion of pixels with densely packed tiles. The algorithm can be summarized as follows:
Input: a raster image I
L(I) ← Luminance(I)
G(I) ← Robert's Gradient(Equalize(L(I)))
[u, v] ← GVF(|G(I)|_∞, µ, nIterations)
gvf(I) ← (u² + v²)^{1/2}
I_n(I) ← NonMaxSuppression(gvf(I))
let t_h, t_l be threshold values, with t_h > t_l
Sort in queue Q pixels (i, j) according to decreasing I_n(i, j) values. Only pixels whose I_n is greater than the threshold t_h are put into Q.
while Q is not empty
    Extract a pixel (i, j) from Q
    if (i, j) is not marked
        Place a tile centering it in (i, j) at angle α = tan⁻¹(v(i, j)/u(i, j))
        if in this way the tile overlaps with previously placed tiles
            Skip tile positioning
        Starting from (i, j) follow and mark as visited the chain of local maxima (i.e. I_n(w, z) > t_l) in both directions perpendicular to α, to obtain a guideline
        Place a tile centering it in each (w, z) in the chain at angle β = tan⁻¹(v(w, z)/u(w, z))
        if in this way the tile overlaps with previously placed tiles
            Skip tile positioning
for j ← 1 to length(I)
    for i ← 1 to width(I)
        Place a tile in the pixel (i, j) at angle γ = tan⁻¹(v(i, j)/u(i, j))
        if in this way the tile overlaps with previously placed tiles
            Skip tile positioning

3
Experimental Results
The above algorithm has been implemented in JAVA, using an external MATLAB module for the GVF computation. The method has been compared, to assess the aesthetic quality, with the mosaics obtained by [2] and [4] using their original implementations. We wish to thank all the authors of these papers for providing the output of their techniques upon our request. For the sake of comparison, Figure 2 reports the mosaic obtained from the standard Lena picture. A first clear advantage of the novel technique is that it is able to better preserve fine details. This happens because high-frequency areas are given priority in tile placement. Observe, to support this claim, the areas around Lena's nose, lips and brow. Another performance index of mosaic algorithms is the amount and spatial distribution of untiled space. The area left uncovered by the proposed technique is comparable with the amount of uncovered area left by [2], but gaps are here better distributed. For example, the constraints of [2] force the appearance of a long "crack" in the vertical band on the wall behind Lena, while the proposed approach achieves in the same region a pleasant smoothness. Compared to [4], it should be observed that the uncovered area left by our technique is considerably smaller. Observe that a higher percentage of covered area leads to a better preservation of the original
Fig. 2. Visual comparison between mosaics generated by our approach (B), [2] (C) and [4] (D), applied on input image (A) considering in all cases the same tile size 5×5
colors of the same picture (see Table 1). The perceived texture obtained with the proposed technique appears, finally, less chaotic than the texture obtained with [4]. As for the parameters adopted to produce Figure 2, the tile size is 5×5 and the image size is 667×667. Typically, considering other images, we obtain a covered area greater than [4] but smaller than [2].

Table 1. Number of tiles and covered area comparison between various approaches

Technique     Number of tiles   Covered area
Our Method    13412             75.4%
[2]           13994             78.6%
[4]           11115             62.5%
Fig. 3. Comparison between the proposed approach and [10] on a Lena image detail (a). The novel heuristics (c) are able to follow the underlying edges (b) maintaining higher fidelity than (d) also considering the original colors.
Figure 3 shows a comparison between the proposed approach and [10] on a Lena image detail (a). Neither approach includes the final step of linear tile placing. The novel heuristics (c) are able to follow the underlying edges (b), maintaining higher fidelity than (d), also with respect to the original colors. In Figure 4 we show the same image processed using increasing tile sizes. The right relation between the image size (and its level of detail) of the input and the tile size to be used in the mosaicing process can be derived only from aesthetic considerations. Our algorithm is able to preserve the global appearance even with larger tile sizes. Finally, we show in Figure 5 an example of a mosaicized image using rectangular tiles (3x9). This example shows how the proposed criteria are able to preserve fine details (due to GVF capabilities) while maintaining the global orientation of almost all edges present in the original picture (due to the tile positioning rules).
Fig. 4. Mosaics generated with increasing tiles size A (3x3), B (6x6), C (10x10), D (14x14)
Fig. 5. An example of mosaic generated with rectangular tiles (3x9)
The overall complexity of the proposed technique is O(kn) + O(n log n), where n is the number of pixels in the source image. Further results are depicted in Figures 6 and 7. The mosaicized images can also be downloaded at the following web address: http://svg.dmi.unict.it/iplab/download/
Fig. 6. Input image (A) and its mosaic (B) generated by our approach (image size 595x744, tile size 5x5 )
Fig. 7. Input image (A) and its mosaic (B) generated by our approach (image size 532x646, tile size 4x4 )
4
Conclusions
We propose a novel technique to produce a traditional-looking mosaic from a digital source picture. The new technique tries to overcome the difficulties related to edge detection by using the Gradient Vector Flow. Tests show that the new technique produces aesthetically pleasant images that have a greater fidelity in dealing with fine details and a better management of gaps. The proposed technique does not cut tiles. Indeed, the next research step will be to integrate heuristics like those proposed in [2] and [3] to cut tiles with the proposed method. Future work will also be devoted to color management and to mosaic generation in vectorial format without using raster-to-vector conversion techniques [11].
References
1. Hausner, A.: Simulating decorative mosaics. In: Proc. SIGGRAPH 2001, pp. 573–580 (2001)
2. Di Blasi, G., Gallo, G.: Artificial mosaics. The Visual Computer 21(6), 373–383 (2005)
3. Battiato, S., Di Blasi, G., Farinella, G.M., Gallo, G.: A novel technique for opus vermiculatum mosaic rendering. In: Proc. ACM/WSCG 2006 (2006)
4. Liu, Y., Veksler, O., Juan, O.: Simulating classic mosaics with graph cuts. In: Yuille, A.L., Zhu, S.-C., Cremers, D., Wang, Y. (eds.) EMMCVPR 2007. LNCS, vol. 4679, pp. 55–70. Springer, Heidelberg (2007)
5. Faustino, G.M., de Figueiredo, L.H.: Simple adaptive mosaic effects. In: Proc. SIBGRAPI 2005, pp. 315–322 (2005)
6. Elber, E., Wolberg, G.: Rendering traditional mosaics. The Visual Computer 19(1), 67–78 (2003)
7. Schlechtweg, S., Germer, T., Strothotte, T.: Renderbots — multi-agent systems for direct image generation. Computer Graphics Forum 24(2), 137–148 (2005)
8. Battiato, S., Di Blasi, G., Farinella, G.M., Gallo, G.: Digital mosaic frameworks - an overview. Computer Graphics Forum 26(4), 794–812 (2007)
9. Xu, C., Prince, L.: Snakes, shapes, and gradient vector flow. IEEE Transactions on Image Processing 7(3), 359–369 (1998)
10. Battiato, S., Di Blasi, G., Gallo, G., Guarnera, G.C., Puglisi, G.: Artificial mosaics by gradient vector flow. In: Proc. EUROGRAPHICS 2008 (2008)
11. Battiato, S., Farinella, G.M., Puglisi, G.: Statistical based vectorization for standard vector graphics. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3992, pp. 334–341. Springer, Heidelberg (2006)
Level-of-Detail Triangle Strips for Deforming Meshes
Francisco Ramos1, Miguel Chover1, Jindra Parus2, and Ivana Kolingerova2
1 Universitat Jaume I, Castellon, Spain
{Francisco.Ramos,chover}@uji.es
2 University of West Bohemia, Pilsen, Czech Republic
{jparus,kolinger}@kiv.zcu.cz
Abstract. Applications such as video games or movies often contain deforming meshes. The most commonly used representation of these types of meshes consists of dense polygonal models. Such a large amount of geometry can be efficiently managed by applying level-of-detail techniques, and specific solutions have been developed in this field. However, these solutions do not offer high performance in real-time applications. We thus introduce a multiresolution scheme for deforming meshes. It enables us to obtain different approximations over all the frames of an animation. Moreover, we provide an efficient connectivity coding by means of triangle strips as well as a flexible framework adapted to the GPU pipeline. Our approach enables real-time performance and, at the same time, provides accurate approximations. Keywords: Multiresolution, Level of Detail, GPU, triangle strips, deforming meshes.
1
Introduction
Nowadays, deforming surfaces are frequently used in fields such as games, movies and simulation applications. Due to their availability, simplicity and ease of use, these surfaces are usually represented by polygonal meshes. A typical approach to represent these kinds of meshes is to store a different mesh connectivity for every frame of an animation. However, this would require a high storage cost, and the time to process the animation sequence would be significantly higher than in the case of using a single mesh connectivity for all frames. Even so, these meshes often include far more geometry than is actually necessary for rendering purposes. Many methods for polygonal mesh simplification have been developed [1,2,3]. However, these methods are not applicable to highly deformed meshes. A single simplification sequence for all frames can also generate unexpected results in those meshes. Hence, multiresolution techniques for static meshes are not directly applicable to deforming meshes, and so we need to adapt these techniques to this context. Therefore, our goal is to create a multiresolution model for deforming meshes. We specifically design a solution for morphing meshes (see Fig. 1),
Fig. 1. A deforming mesh: Elephant to horse morph sequence
although it could be extended to any kind of deforming mesh. Our approach includes the following contributions:
– Implicit connectivity primitives: we benefit from using optimized rendering primitives, such as triangle strips. If compared to the triangle primitive, triangle strips lead us to an important reduction in the rendering and storage costs.
– A single mesh connectivity: for all the frames we employ the same connectivity information, that is, the same triangle strips. It generally requires less spatial and temporal cost than using a different mesh for every frame.
– Real-time performance: meshes are stored, processed and rendered entirely by the GPU. In this way, we obtain greater frame-per-second rates.
– Accurate approximations: we provide high quality approximations in every frame of an animation.
2 Related Work
2.1 Deforming Meshes: Morphing
A solution to approximate deforming meshes is to employ mesh morphing [4,5]. Morphing techniques aim at transforming a given source shape into a target shape, and they involve computations on the geometry as well as the connectivity of meshes. In general, two meshes M0 = (T0 , V0 ) and M1 = (T1 , V1 ) are given, where T0 and T1 represent the connectivity (usually in triangles) and V0 and V1 the geometric positions of the vertices in R3 . The goal is to generate a family of meshes M (t) = (T, V (t)), t ∈ [0, 1], so that the shape represented by the new connectivity T together with the geometries V (0) and V (1) is identical to the original shapes. The generation of this family of shapes is typically done in three subsequent steps: finding a correspondence between the meshes, generating a new and consistent mesh connectivity T together with two geometric positions V (0), V (1) for each vertex so that the shapes of the original meshes can be reproduced and finally, creating paths V (t), t ∈ [0, 1], for the vertices. The traditional approach to generate T is to create a supermesh [5] of the meshes T0 and T1 , which is usually more complex than the input meshes. After
the computation of one mesh connectivity T and two mesh geometries represented by vertex coordinates V(0) and V(1), we must create the paths. The most commonly used technique to create them is linear interpolation [6], see Fig. 1. Given a transition parameter t, the coordinates of an interpolated shape are computed by:
\[
V(t) = (1 - t)\,V(0) + t\,V(1). \qquad (1)
\]
As commented before, the connectivity information generated by morphing techniques is usually denser and more complex than necessary for rendering purposes. In this context, we can make use of level-of-detail solutions to approximate such meshes and thus remove unnecessary geometry when required. We can also represent the connectivity of the mesh with triangle strips, which reduces by a factor of three the number of vertices to be processed [7].
2.2
Multiresolution
A wide range of papers about multiresolution or level of detail [8,9,10,11,12,13] that benefit from using hardware optimized rendering primitives have recently appeared. However, as they are built from a fixed and static mesh, they usually produce low quality approximations when applied to a mesh with extreme deformations. Some methods also provide multiresolution models for deforming meshes [14,15,16], but they are based on the triangle primitive and their adaptation to the GPU pipeline is potentially difficult or does not exploit it maximally. Another important work introduced by Kircher et al. [17] is a triangle-based solution as well. This approach obtains accurate approximations over all levels of detail. However, temporal cost to update its simplification hierarchy is considerable, and GPU-adaptation is not straightforward. Recent graphics hardware capabilities have led to great improvements in rendering. Mesh morphing techniques are also favored when they are employed directly in the GPU. With the current architecture of GPUs, it is possible to store the whole geometry in the memory of the GPU and to modify the vertex positions in real time to morph a supermesh. This would greatly increase performance. In order to obtain all the intermediate meshes, we can take advantage of the GPU pipeline to interpolate vertex positions by means of a vertex shader. A combination of multiresolution techniques and GPU processing for deforming meshes can lead us to an approach that offers great improvements in rendering, providing, at the same time, high quality approximations.
3
Technical Background
Starting from two arbitrary polygonal meshes, M0 = (T0 , V0 ) and M1 = (T1 , V1 ), where V0 and V1 are sets of vertices and T0 and T1 are the connectivity to represent these meshes, our approach is built upon two algorithms: we first obtain a supermesh (Morphing Builder) and later we build the multiresolution scheme (Lod Builder). The general construction process is shown in Fig. 2.
Fig. 2. General construction process data flow diagram
3.1
Generating Morphing Sequences
As commented before, linear interpolation is a well-known technique to create paths for vertices in morphing solutions. Vertex paths defined by this kind of technique are suitable to be implemented in recent GPUs offering considerable performance when generating intermediate meshes, M (t). Thus, we first generate a family of meshes M (t) = (T, V (t)), t ∈ [0, 1] by applying the method proposed by Parus [5]. As paths are linearly interpolated, we only need the geometries V (0) and V (1) and the connectivity information T , to reproduce the intermediate meshes M (t) by applying the equation 1. The FaceToFace morphing sequence shown in Fig. 6 was generated by using this method [5]. 3.2
Construction of the Multiresolution Scheme
A strip-based multiresolution scheme for polygonal models is preferred in this context as we obtain improvements both in rendering and in spatial cost. Thus, we perform an adaptation of the LodStrips multiresolution model [13] to deforming meshes. This work represents a mesh as a set of multiresolution strips. Let M the original polygonal surface and M r its multiresolution representation: M r = (V, S), where V is the set of all the vertices and S the triangle strips used to represent any resolution or level of detail. S can also be expressed as a tuple (S0 , C), where S0 consists of the set of triangle strips at the lowest level of detail and C is the set of operations required to extract the approximations. Every element in C contains the set of changes to be applied in the multiresolution strips so that they represent the required level of detail. Thus, the process to construct the multiresolution scheme performs two fundamental tasks. On the one hand, it generates the triangle strips to represent the connectivity by means of these primitives, S, and, on the other hand, it generates the simplification sequence which allows us to recover the different levels of detail, C. We generate the simplification sequence for each frame that we will consider in the animation. This task is performed by modifying the t factor in the supermesh. The number of frames to be taken into account is called |r|. The LOD builder subprocess first computes S, that is, it converts the supermesh into triangle strips. Later, it transforms the supermesh thus obtained into the subsequent intermediate meshes that we will use in the multiresolution scheme. We thus
store the sequence of modifications required in the triangle strips for each considered frame. After the general construction process has finished, we obtain the sets {V(0), V(1), S_0, C}, where V(0) and V(1) come from the Morphing builder, S_0 is the supermesh in triangle strips at the lowest level of detail, and C contains the sequences of simplification operations that enable us to change the resolution of the supermesh for each frame.
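A possible organization of these sets is sketched below; the class and field names, and the encoding of a single simplification operation, are illustrative assumptions rather than the authors' actual data structures.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# One connectivity update of the simplification sequence: replace vertex `old`
# by vertex `new` inside strip `strip` (a simplified, hypothetical encoding).
@dataclass
class StripChange:
    strip: int
    old: int
    new: int

@dataclass
class MultiresolutionMorph:
    """Data produced by the construction stage: V0/V1 go to GPU vertex
    buffers, S0 is the strip connectivity at the coarsest LOD, and C keeps
    one simplification sequence per frame on the CPU side."""
    V0: List[Tuple[float, float, float]]      # supermesh geometry at t = 0
    V1: List[Tuple[float, float, float]]      # supermesh geometry at t = 1
    S0: List[List[int]]                       # triangle strips (vertex indices)
    C: List[List[StripChange]] = field(default_factory=list)
```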
4
Real-Time Representation
Once construction has finished, we must build a level-of-detail representation with morphing during run-time. According to the requirements of the applications, this involves the extraction of a level of detail at a given frame. In Fig. 3, we show the main functional areas of the pipeline used in our approach. The underlying method to extract approximations of the models is based on the LodStrips work [13]. Among other advantages already mentioned, this model offers a low temporal cost when extracting any level of detail for strip-based meshes. We take advantage of this feature to perform fast updates when traversing a supermesh from frame to frame at any level of detail.
Fig. 3. Multiresolution morphing pipeline using the current technology
Thereby, according to the frame and level of detail required by applications, the level-of-detail extraction algorithm is responsible for recovering the appropriate approximation in the triangle strips by means of the previously computed simplification sequence. In Fig. 3, we show the general operation of this algorithm. It reads the simplification sequence of the current frame from the data
structure Changes, and it modifies the triangle strips located in the GPU so that they always have the geometry corresponding to the level of detail used at the current time. A more detailed algorithm is shown in Fig. 4. After extraction, vertices must also be transformed according to the current frame in such a way that the deforming mesh is correctly rendered. When an application uses the GPU to compute the interpolation operations, the CPU can spend time improving its performance rather than continuously blending frames. Thus, by using the processing ability of the GPU, the task of frame blending is taken off the CPU. Therefore, after extracting the required approximation, we directly compute the linear interpolations between V(0) and V(1) in the GPU by means of a vertex shader.

Function ExtractLODFromFrame(Frame, LOD)
    if Frame != CurrentFrame then
        CurrentFrame = Frame;
        CurrentChanges = Changes[CurrentFrame];
        ExtractLevelOfDetail(LOD);
    else if LOD != CurrentLOD then
        ExtractLevelOfDetail(LOD);
    end if
end Function

Fig. 4. Extraction algorithm
Regarding the GPU pipeline, the first stage is the Input Assembler. The purpose of this stage is to read primitive data, in our case triangle strips, from the user-filled buffers and assemble the data into primitives that will be used by the other pipeline stages. As shown in the pipeline-block diagram, once the Input Assembler stage reads data from memory and assembles the data into primitives, the data is output to the Vertex Shader stage. This stage processes vertices from the Input Assembler, performing per-vertex morphing operations. Vertex shaders always operate on a single input vertex and produce a single output vertex. Once every vertex has been transformed and morphed, the Primitive Assembly stage provides the assembled triangle strips to the next stage.
5
Results
In the previous section, we described the sets required to represent our approach: V (0), V (1), S 0 and C. According to the multiresolution morphing pipeline that we propose in Fig. 3, our sets are implemented as follows: V (0), V (1) and S 0 are located in the GPU, whereas C is stored in the CPU. In particular, V (0) and V (1) are stored as vertex array buffers and S 0 as an element array buffer, which offers better performance than creating as many buffers as there are triangle strips. It is important to notice that we represent the geometry to be rendered by means of the data structures in the GPU, where the morphing process also takes
92
F. Ramos et al.
place. On the other hand, the simplification sequence for every frame is stored in the CPU. These data structures are efficiently managed in runtime in order to obtain different approximations of a model over all the frames of an animation. Tests and experiments were carried out with a Dell Precision PWS760 Intel Xeon 3.6 Ghz with 512 Megabytes of RAM, the graphics card was an NVidia GeForce 7800 GTX 512. Implementation was performed in C++, OpenGL as the supporting graphics library and Cg as the vertex shader programming language. The morphing models taken as a reference are shown in Fig. 6 and Fig. 7. The high quality of the approximations, some of which are reduced (in terms of number of vertices) by more than a 90%, can be seen. 5.1
Spatial Cost
Spatial costs from the FaceToFace and HorseToMan morphing models are shown in Table 1. For each model, we specify the number of vertices and triangle strips that compose them (strips generated by means of the STRIPE algorithm [7]), the number of approximations or levels of detail available, the number of frames generated and, finally, the spatial cost per frame in Kilobytes. It is divided into cost in the GPU (vertices and triangle strips) and cost in the CPU (simplification sequence). Finally, in the total column, we show the cost per frame, calculated as the total storage cost divided by the number of frames. As expected, the cost of storing the simplification sequence of every frame is the most important part of the spatial cost. Table 1. Spatial cost Cost/Frame Model #Verts #Strips #LODs #Frames GPU CPU Total FaceToFace 10,520 620 9,467 25 14.2 KB. 472.1 KB. 486.3 KB. HorseToMan 17,489 890 15,738 26 22.4 KB. 848.3 KB. 870.7 KB.
5.2
Temporal Cost
Results shown in this section were obtained under the conditions mentioned above. Levels of detail were in the interval [0, 1], zero being the highest LOD and one the lowest. Geometry was rendered by using the glMultiDrawElements OpenGL extension, which only sends the minimum amount of information that enables the GPU to correctly interpret data contained in its buffers. With glMultiDrawElements we only need one call per frame to render the whole geometry. In Fig. 5a, we show the level-of-detail extracting cost per frame of the FaceToFace morphing sequence. The per-frame time to extract the required level-ofdetail ranges between 6% and 1.4% of the frame time. If we consider the lowest level of detail as being the input mesh reduced by 90% (see LOD 0.9 in Fig. 5a), we obtain times around 6% of the frame time, which offers us better perfomance than other related works such as [17], which employs far more time in changing and applying the simplification sequence.
Level-of-Detail Triangle Strips for Deforming Meshes
93
(a) Level-of-detail extraction cost per (b) Frame-per-second rates by performing frame at a constant rate of 24 fps. one extraction every 24 frames. Fig. 5. Temporal cost. Results obtained by using the FaceToFace morphing model.
Fig. 6. Multiresolution morphing sequence for the FaceToFace model. Rows mean level of detail, 10,522 (original mesh), 3,000 and 720 vertices, respectively, and columns morphing adaptation, aproximations were taken with t=0.0, 0.2, 0.4, 0.6, 0.8 and 1.0, respectively.
We performed another test by extracting one approximation every 24 frames and, at the same time, we progressively changed the level of detail. This was carried out to simulate an animation which is switching its LOD as it is further from the viewer. In Fig. 5b, we show the results of this test. As expected, our approach is able to extract and render different approximations over all frames of an animation at considerable frame-per-second rates.
94
F. Ramos et al.
Fig. 7. Multiresolution morphing sequence for the HorseToMan model. Rows mean level of detail, 17,489 (original mesh), 5,000 and 1,000 vertices, respectively, and columns morphing adaptation, aproximations were taken with t=0.0, 0.2, 0.4, 0.6, 0.8 and 1.0, respectively.
6
Conclusions
We have introduced a multiresolution scheme suitable for deforming meshes such as those generated by means of morphing techniques. A solution for morphing sequences was specially designed, although it can be adapted to any kind of deformed mesh by storing the vertex positions of every frame within the animation. We also share the same topology storing the whole geometry in the GPU, thus saving bandwidth in the typical CPU-GPU bound bottleneck. Morphing is also computed in the GPU by exploiting its parallelism. We thus obtain real-time performance at high frame-per-second rates. At the same time, we offer high quality approximations in every frame of an animation. Acknowledgments. This work has been supported by the Spanish Ministry of Science and Technology (Contiene Project: TIN2007-68066-C04-02) and Bancaja (Geometria Inteligente project: P1 1B2007-56).
References 1. DeHaemer, M., Zyda, J.: Simplification of objects rendered by polygonal approximations. Computer and Graphics 2(15), 175–184 (1991) 2. Cignoni, P., Montani, C., Scopigno, R.: A comparison of mesh simplification methods. Computer and Graphics 1(22), 37–54 (1998)
Level-of-Detail Triangle Strips for Deforming Meshes
95
3. Luebke, D.P.: A developer’s survey of polygonal simplification algorithms. IEEE Computer Graphics and Applications 3(24), 24–35 (2001) 4. Aaron, L., Dobkin, D., Sweldens, W., Schroder, P.: Multiresolution mesh morphing. In: SIGGRAPH, pp. 343–350 (1999) 5. Parus, J.: Morphing of meshes. Technical report, DCSE/TR-2005-02, University of West Bohemia (2005) 6. Kanai, T., Suzuki, H., Kimura, F.: Metamorphosis of arbitrary triangular meshes. IEEE Computer Graphics and Applications 20, 62–75 (2000) 7. Evans, F., Skiena, S., Varshney, A.: Optimizing triangle srips for fast rendering. In: IEEE Visualization, pp. 319–326 (1996) 8. El-Sana, J., Azanli, E., Varshney, A.: Skip strips: Maintaining triangle strips for view-dependent rendering. In: Visualization, pp. 131–137 (1999) 9. Velho, L., Figueredo, L.H., Gomes, J.: Hierachical generalized triangle strips. The Visual Computer 15(1), 21–35 (1999) 10. Stewart, J.: Tunneling for triangle strips in continuous level-of-detail meshes. In: Graphics Interface, pp. 91–100 (2001) 11. Shafae, M., Pajarola, R.: Dstrips: Dynamic triangle strips for real-time mesh simplification and rendering. In: Pacific Graphics Conference, pp. 271–280 (2003) 12. Belmonte, O., Remolar, I., Ribelles, J., Chover, M., Fernandez, M.: Efficient use connectivity information between triangles in a mesh for real-time rendering. Computer Graphics and Geometric Modelling 20(8), 1263–1273 (2004) 13. Ramos, F., Chover, M.: Lodstrips: Level of detall strips. In: Bubak, M., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2004. LNCS, vol. 3039, pp. 107–114. Springer, Heidelberg (2004) 14. Mohr, A., Gleicher, M.: Deformation sensitive decimation. In: Technical Report (2003) 15. Shamir, A., Pascucci, V.: Temporal and spatial levels of detail for dynamic meshes. In: Symposium on Virtual Reality Software and Technology, pp. 77–84 (2000) 16. Decoro, C., Rusinkiewicz, S.: Pose-independent simplification of articulated meshes. In: Symposium on Interactive 3D Graphics (2005) 17. Kircher, S., Garland, M.: Progressive multiresolution meshes for deforming surfaces. In: EUROGRAPHICS, pp. 191–200 (2005)
Triangular B´ ezier Approximations to Constant Mean Curvature Surfaces A. Arnal1 , A. Lluch1 , and J. Monterde2 1
2
Dep. de Matem` atiques, Universitat Jaume I Castell´ o, Spain
[email protected],
[email protected] Dep. de Geometria i Topologia, Universitat de Val`encia, Burjassot (Val`encia), Spain
[email protected]
Abstract. We give a method to generate polynomial approximations to constant mean curvature surfaces with prescribed boundary. We address this problem by finding triangular B´ezier extremals of the CMCfunctional among all polynomial surfaces with a prescribed boundary. Moreover, we analyze the C 1 problem, we give a procedure to obtain solutions once the tangent planes for the boundary curves are also given.
1
Introduction
Surfaces with constant mean curvature (CMC-surfaces) are the mathematical abstraction of physical soap films and soap bubbles, and can be seen as the critical points of area for those variations that left the enclosed volume invariable. The study of these surfaces is actually relevant since there is a wide range of practical applications involving surface curvatures, ranging from rendering problems to real settings in automotive industry as measurement and calibration problems, for instance. In general, the characterization of “area minimizing under volume constraint” is no longer true from a global point of view, since they could have self-intersections and extend to infinity. But locally, every small neighborhood of a point is still area minimizing while fixing the volume which is enclosed by the cone defined by the neighborhood’s boundary and the origin. An exhaustive discussion of the existence of surfaces of prescribed constant mean curvature spanning a Jordan curve in R3 can be found in [2]. Given H ∈ R the functional DH is defined as follows → → → DH (− x ) = D(− x ) + 2HV (− x) 1 2H → → → → → = − x u 2 + − x v 2 dudv + <− xu ∧− x v, − x > dudv, 2 T 3 T where <, > and ∧ denote the scalar and the cross product respectively. If an isothermal patch is an extremal of the functional DH , then it is a CMC→ surface. The “volume” term, V (− x ), measures the algebraic volume enclosed in M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 96–105, 2008. c Springer-Verlag Berlin Heidelberg 2008
Triangular B´ezier Approximations to Constant Mean Curvature Surfaces
97
→ the cone segment consisting of all lines joining points − x (u, v) on the surface → − with the origin. The first term, D( x ), is the Dirichlet functional. We will give a method to generate B´ezier extremals of DH , for prescribed boundary curves and constant mean curvature. Our method lets to obtain approximations to CMC-surfaces, since we have considered the problem of minimizing this functional restricted to the space of polynomials. Moreover, we will consider the C 1 problem, that is, we give a way to generate a polynomial approximation to CMC-surface once the boundary curves and the tangent planes along them have been prescribed.
2
Existence of Triangular B´ ezier Surfaces of Prescribed Constant Mean Curvature
Here, we are not working with parametrizations, we are working instead with triangular control nets. So, our aim is to find the minimum of the real function → → x P ), − x P being the triangular B´ezier patch associated to the control P → DH (− net P. The Dirichlet functional, D, has a minimum in the B´ezier case due to the following facts: 3(n−1)(n−2) 2 , First, it can be considered as a continuous real function defined on R 3 interior control points which belong to R . since there are (n−1)(n−2) 2 Second, the functional is bounded from below. Third, the infima is attained: when looking for a minimum, we can restrict this function to a suitable compact subset. → On the other hand, the function assigning the value V (− x P ) to each control net, P, with fixed boundary control points, has no global minimum. If that minimum existed, since spatial translations do not affect the curvature of the surface, we could suppose that the origin is located far enough away from the surface so that the control net is enclosed in a half-space passing through the origin. Let us move an interior control point, PI0 , toward the origin. Then, a well→ known property of B´ezier surfaces states that all the points of − x (u, v) change in n a parallel direction with intensity BI0 (u, v) . Then, since the new cone segment is totally included in the initial one, its volume decreases. → As we said, the function, P → D(− x P ), for control nets with fixed boundary → always has a minimum and, as we have just seen, the function P → V (− x P ), never has a minimum. Therefore, by using the constant H to balance both functions → x P ), will have a minimum only for we can say that the function, P → DH (− H ∈ [a, −a] for some constant a ∈ R. It should be noted that when H = 0, DH is reduced to the Dirichlet functional, D, and then there is a minimum, whereas when H is too big, the main term in DH is V , and therefore the minimum does not exist. The value of a depends on the boundary control points and the symmetry of the interval, [a, −a], is a consequence of the fact that reversing the orientation of a surface means a change in the sign of the mean curvature. A detailed explanation
98
A. Arnal, A. Lluch, and J. Monterde
about the existence conditions of CMC-surfaces suited to a boundary and this dependency can be found in [2].
3
The CMC-Functional B´ ezier Form
The following proposition gives a characterization of an isothermal CMC-surface. − Proposition 1. [2] An isothermal patch, → x , is a CMC-surface if and only if → → → Δ− x = 2H − xu∧− x v.
(1)
Expression (1) is the Euler-Lagrange equation of the functional DH . Moreover, an isothermal patch satisfies the PDE in (1) if and only if it is an extremal of DH . In [1], it was proved that an extremal of the Dirichlet functional among all B´ezier triangular surfaces with a prescribed boundary always exists and it is the solution of a linear system. Now we find two qualitative differences, the existence of the extremal of DH can only be ensured with certainty when |H| ≤ a, for a certain constant, a, depending on the boundary configuration, and they are computed as solutions of a quadratic system. Moreover, since the Euler-Lagrange equation of the functional DH , in Equation (1), is not linear we cannot determine a B´ezier solution as a solution of a linear system of equations in terms of the control points. Here we will give an expression of the CMC-functional in terms of the control points of a triangular B´ezier surface, which implies that the restriction of the functional to the B´ezier case can be seen as a function instead of as a functional. The following two results will simplify the way to obtain the formula in terms of control points of the functional DH . → Proposition 2. The Dirichlet functional, D(− x ), of a triangular B´ezier surface, → − x , associated to the control net, P = {PI }|I|=n , can be expressed in terms of the control points, PI = (x1I , x2I , x3I ), with |I| = |{I 1 , I 2 , I 3 }| = n by the formula 3 1 − → D( x ) = 2 a=1
CI0 I1 xaI0 xaI1
(2)
|I0 |=n |I1 |=n
n n
where
CI0 I1 = I0 2nI1 (a1 + a2 + 2a3 − b13 − b23 )
(3)
I0 +I1
and ar =
0 I0r I r (I0r +I r )(I0r +I r −1)
I r = 0, Ir > 0
brs =
(I0r
I0r I s + I0s I r . + I r ) (I0s + I s )
(4)
Proof. The Dirichlet functional is a second-order functional, therefore we compute its second derivative in order to obtain the coefficients CI0 I1 .
Triangular B´ezier Approximations to Constant Mean Curvature Surfaces
99
The first derivative with respect to the coordinates of an interior control point PI0 = x1I0 , x2I0 , x3I0 where I0 = (I01 , I02 , I03 ) for any a ∈ {1, 2, 3}, and any |I0 | = n, with I01 , I02 , I03 = 0, is → → ∂− x→ ∂− x ∂D(− x) u − v − → = (< , x→ u > +< a a a , xv >) du dv, ∂xI0 ∂x ∂x T I0 I0 and the second derivative → n n ∂ 2 D(− x) = BI0 u BI1 u + BIn0 v BIn1 v < ea , ea > dudv ∂xaI0 ∂xaI1 T n n 2n(2n − 1) I0 I1 n2 2n (a1 + a2 + 2a3 − b13 − b23 ), = 2n(2n − 1) n2 I+I 0
where we took into account the formula for the product of the Bernstein polynomials and the value of its integral. Therefore n n CI0 I1 = I0 2nI1 (a1 + a2 + 2a3 − b13 − b23 ), I+I0
where a1 , a2 , a3 , b13 , b23 were defined in Equation (4).
Now, we will work the volume term of the CMC-functional. → Proposition 3. Let − x be the triangular B´ezier surface associated to the control → x ), can be expressed in terms of the net, P = {PI }|I|=n , then the volume, V (− 1 2 3 control points, PI = (xI , xI , xI ), with |I| = n, by the formula → V (− x)=
CI0 I1 I2 x1I0 x2I1 x3I2
|I0 |=|I1 |=|I2 |=n
where
n n n CI0 I1 I2 = I0
I1 I2 3n I0 +I1 +I2
(dI120 I1 I2 + dI230 I1 I2 + dI130 I1 I2 )
(5)
with dIJK = rs
IrJs − JrIs . (I r + J r + K r )(I s + J s + K s )
(6)
→ Proof. The term V (− x ), is a cubical polynomial of the control points, so in order to compute the coefficients CI0 I1 I2 we will compute its third derivative. The derivative with respect to a first coordinate x1I0 of an arbitrary interior point PI0 = x1I0 , x2I0 , x3I0 , where |I0 | = n and I01 , I02 , I03 = 0, is given by
100
A. Arnal, A. Lluch, and J. Monterde
→ ∂V (− x) 1 = ∂x1I0 3 = T
T
→ → → → ( < BIn0 u e1 ∧ − x v, − x >+<− x u ∧ BIn0 v e1 , − x > → → x v , BIn0 e1 > ) du dv +<− xu ∧−
→ → → → → → < BIn0 e1 ∧ − x v, − x >u − < BIn0 e1 ∧ − x vu , − x >+<− x u ∧ BIn0 e1 , − x >v
→ → → → −<− x uv ∧ BIn0 e1 , − x >+<− xu ∧− x v , BIn0 e1 > du dv.
After computing the derivative with respect to an arbitrary first coordinate, we applied the integration by parts formula. Now, bearing in mind that → → → → < BIn0 e1 ∧ − x v, − x >u = <− x u ∧ BIn0 e1 , − x >v = 0, T
T
− v, v) = = = BIn0 (u, 1 − u) = 0 for |I0 | = n with 0, and the properties of the cross and the scalar triple product, we obtain that → ∂V (− x) 1 → → = <− xu ∧− x v , BIn0 e1 > . (7) ∂x1I0 3 T since BIn0 (1 I01 , I02 , I03 =
BIn0 (0, v)
BIn0 (u, 0)
Now we must compute the derivative with respect to a second coordinate, x2I1 , of an arbitrary interior point, such that, as before, |I1 | = n with I11 , I12 , I13 = 0. Using the same process as before we have: → ∂ 2 V (− x) 1 = 1 ∂xI0 ∂x2I1 3
T
→ → < (BIn1 )u e2 ∧ − x v , BIn0 e1 > + < − x u ∧ (BIn1 )v e2 , BIn0 e1 > dudv
= T
→ (BIn0 )u (BIn1 )v − (BIn0 )v (BIn1 )u < e1 ∧ e2 , − x > dudv.
Finally we compute the derivative with respect to an arbitrary third coordinate x3I2 with |I2 | = n and such that I21 , I22 , I23 = 0, that is, → x) ∂ 2 V (− Cx1I x2I x3I = = ((BIn0 )u (BIn1 )v − (BIn0 )v (BIn1 )u ) BIn2 dudv 0 1 2 ∂x1I0 ∂x2I1 ∂x3I2 T n n n = I0
I1 I2 3n I0 +I1 +I2
(dI120 I1 I2 + dI230 I1 I2 + dI130 I1 I2 )
where we have achieved the last formula after computing the integral of the Bernstein polynomials and performing some simplifications like the following: n−1 n−1 n I1 −e2 I2 n−1 n−1 n I0 −e1 3n−2 BI3n−2 BI0 −e1 BI1 −e2 BI2 dudv = dudv 0 +I1 +I2 −e1 −e2 T
T
I +I +I −e −e
n n0 1n 2 1 2 3n(3n − 1) I I I = 0 3n1 2 2 I0 +I1 +I2
n
(I01
+
I11
I01 I12 . + I21 )(I02 + I12 + I22 )
Triangular B´ezier Approximations to Constant Mean Curvature Surfaces
101
Lemma 1. The coefficients CIJK verify the following symmetry relations CIJK = −CJIK = CJKI . Proof. The symmetry of the coefficients C’s is a direct consequence of the sym= −dJIK metry of d’s: dIJK rs rs , which is immediate from its definition in Proposition 3, since: JrIs − IrJs . = r dJIK rs r (I + J + K r )(I s + J s + K s ) → In the following proposition we give a formula for the CMC-functional, DH (− x) → − in terms of the control net, P = {PI }|I|=n , of the B´ezier triangular surface, x . → Proposition 4. Let − x be the triangular B´ezier surface associated to the control net, P = {PI }|I|=n , where PI = (x1I , x2I , x3I ) with |I| = |{I 1 , I 2 , I 3 }| = n. The CMC-functional, DH , can be expressed by the formula 3 1 → DH (− x)= 2 a=1
CI0 I1 xaI0 xaI1 + 2H
|I0 |=n |I1 |=n
CI0 I1 I2 x1I0 x2I1 x3I2
|I0 |=|I1 |=|I2 |=n
n n
where
CI0 I1 = I0 2nI1 (a1 + a2 + 2a3 − b13 − b23 ) I0 +I1
with ar and brs defined in Equation (4) and n n n CI0 I1 I2 = I0
I1 I2 3n I0 +I1 +I2
(dI120 I1 I2 + dI230 I1 I2 + dI130 I1 I2 )
defined in Equation (6). with dIJK rs
4
B´ ezier Approximations to CMC-Surfaces
We have just seen in Proposition 4 that the CMC-functional, is a function of the control points, so let us now compute its gradient with respect to the coordinates of an arbitrary control point. This will let us to give a characterization of the control net of the triangular B´ezier extremals of DH , which are B´ezier approximations to CMC-surfaces. The gradient of the first addend, corresponding to the Dirichlet functional, with respect to the coordinates of a control point PI0 = x1I0 , x2I0 , x3I0 ⎞ ⎛ − → ∂D( x ) ⎝ = CI0 J x1J CI0 J x2J , CI0 J x3J ⎠ = CI0 J PJ ∂PI0 |J|=n
|J|=n
|J|=n
|J|=n
(8)
102
A. Arnal, A. Lluch, and J. Monterde
→ So, let us consider the volume expression V (− x ) = |I|,|J|,|K|=n CIJK x1I x2J x3K , and compute its gradient with respect to the coordinates of a control point PI0 . → ∂V (− x) = ∂PI0 =
|J|,|K|=n
|J|,|K|=n
1 2
=
1 = 2
CI0 JK (x2J x3K , −x1J x3K , x1J x2K ) CI0 JK − CI0 KJ 2 3 (xJ xK , −x1J x3K , x1J x2K ) 2
(9)
CI0 JK (x2J x3K − x2K x3J , x1K x3J − x1J x3K , x1J x2K − x1K x2J )
|J|,|K|=n
(10) CI0 JK PJ ∧ PK .
|J|,|K|=n
Now we can characterize the triangular control net of an extremal of the CMCfunctional among all triangular B´ezier patches constrained by a given boundary. Proposition 5. A triangular control net, P = {PI }|I|=n , is an extremal of the CMC-functional, DH , among all triangular control nets with a prescribed boundary if and only if: 0=
|J|=n
CI0 J PJ + H
CI0 JK PJ ∧ PK
(11)
|J|,|K|=n
for all |I0 = (I01 , I02 , I03 )| = n with I01 , I02 , I03 = 0, where the coefficients CI0 J and CI0 JK are defined in Equation (3) and Equation (5) respectively. The last result lets us to obtain B´ezier approximations to CMC-surfaces since we compute solutions to a restricted problem, that is, we find extremals of the functional DH among all polynomial patches with prescribed border. The following proposition characterizes the extremals of this restricted prob→ lem: − x is an extremal of the functional DH among all triangular B´ezier patches with a prescribed boundary if and only if a weak version of the condition in Equation (1) is fulfilled. → Proposition 6. A triangular B´ezier patch − x is an extremal of the CMC-functional, DH , among all patches with a prescribed boundary if and only if: 0= T
→ → → (Δ− x − 2H − xu∧− x v ) BIn0 dudv
with I01 , I02 , I03 = 0.
for all
|I0 = (I01 , I02 , I03 )| = n
(12)
Triangular B´ezier Approximations to Constant Mean Curvature Surfaces
103
Proof. We simply compute the gradient of the CMC-functional with respect to an arbitrary control point. The boundary curves of our example in Fig. 1 describe an approximation to a circle. Therefore we obtain approximations to spheres. In Fig. 1 top, we have asked the interior control points to fulfill a symmetry condition: P112 =
a cos
4π 4π , a sin ,b 3 3
P121 =
a cos
2π 2π , a sin ,b 3 3
P211 = (a, 0, b)
and we show three different approximations to CMC-surfaces. The three surfaces at the bottom are obtained as a solution of the system of quadratic equations described in Equation (11). Here we don’t ask for any kind of symmetry.
Fig. 1. These surfaces are approximations to CMC-surfaces with curvatures H = −1.5, H = −1 and H = −0.5 respectively
In Fig. 2 we present two more examples. The boundary curves in the first are built in such a way that any associated patch would be isothermal at the corner points and in the bottom surfaces in Fig. 2 the boundaries are approximations to three circular arcs, and therefore our results look like pieces of a sphere. The resulting plots are pleasant and moreover they can be continuously deformed by the parameter H, thus allowing the designer to choose of the shape which best fits the objective. We maintain the good shapes we got with the Dirichlet results in [1], but now the choice of the curvature gives the designer another degree of freedom, although the surfaces are obtained as a solution of a quadratic system of the control points.
104
A. Arnal, A. Lluch, and J. Monterde
Fig. 2. These surfaces are approximations to CMC-surfaces with curvatures H = −1, H = 0 and H = 1 at the top and H = −2, H = −1.5 and H = −1 respectively at the bottom
5
The C 1 Problem
In this section we will consider the prescription of not only the boundary but also the tangent planes along the boundary curves, the C 1 problem. Now, the boundary and the next to the boundary control points are fixed, but again the extremals of the CMC-functional, where the other interior control points are considered as variables, can be computed. Here we show an example. We prescribe the border control points along a planar equilateral triangle and three more lines of control points as it is shown in Fig. 3.
Fig. 3. The border control points and their neighboring lines of control points are prescribed
The following figures show approximations to CMC-surfaces obtained as a solution of the quadratic system of the control points in Equation (11), but now for all |I0 = (I01 , I02 , I03 )| = n with I01 , I02 , I03 > 1. The free points are the interior control points outside the boundary and its next line of control points.
Triangular B´ezier Approximations to Constant Mean Curvature Surfaces
105
Fig. 4. These surfaces are approximations to CMC-surfaces with curvatures H = −2, H = −1.5 and H = −1 respectively
6
Conclusions
An isothermal patch has constant mean curvature H if and only if it is an extremal of the functional → → → DH (− x ) = D(− x ) + 2HV (− x ). We have generated approximations to CMC-surfaces, since we have considered the problem of minimizing this functional restricted to the space of polynomials. We have obtained an expression of DH in terms of the control points of a triangular B´ezier surface. After that, we deduced the condition that a triangular control net must fulfill in order to be an extremal of DH among all B´ezier triangles with a prescribed boundary. This characterization of the B´ezier extremals of DH allowed us to compute them as a solution of a quadratic system of the control points. The surfaces that are obtained have regular shapes and have the advantage of allowing prescription of the desired curvature in addition to the boundary. This makes it possible to ensure, for a given boundary, the existence of a family of polynomial approximations to CMC-surfaces with this boundary and curvatures within a particular interval. Therefore, the prescription of the curvature in this method can be seen as another degree of freedom in comparison with the Dirichlet surface generation method in [1]. Finally, in the last section, we consider the C 1 problem, that is, once the boundary curves and the tangent planes along them have been prescribed we give a way to generate a polynomial approximation to CMC-surface associated to this initial information.
References 1. Arnal, A., Lluch, A., Monterde, J.: Triangular B´ezier Surfaces of Minimal Area. In: Kumar, V., Gavrilova, M.L., Tan, C.J.K., L’Ecuyer, P. (eds.) ICCSA 2003. LNCS, vol. 2669, pp. 366–375. Springer, Heidelberg (2003) 2. Struwe, M.: Plateau’s problem and the calculus of variations. Mathematical Notes. Princeton University Press, Princeton (1988)
Procedural Graphics Model and Behavior Generation J.L. Hidalgo, E. Camahort, F. Abad, and M.J. Vicent Dpto. de Sistemas Inform´ aticos y Computaci´ on Universidad Polit´ecnica de Valencia, 46021 Valencia, Spain
Abstract. Today’s virtual worlds challenge the capacity of human creation. Trying to reproduce natural scenes, with large and complex models, involves reproducing their inherent complexity and detail. Procedural generation helps by allowing artists to create and generalize objects for highly detailed scenes. But existing procedural algorithms can not always be applied to existing applications without major changes. We introduce a new system that helps include procedural generation into existing modeling and rendering applications. Due to its design, extensibility and comprehensive interface, our system can handle user’s objects to create and improve applications with procedural generation of content. We demonstrate this and show how our system can generate both models and behaviours for a typical graphics application.
1
Introduction
Many application areas of Computer Graphics require generating automatic content at runtime. TV and movies, console and computer games, simulation and training applications, and massive on-line games also require large numbers of object models and complex simulations. Automatic content generation systems are also useful to create a large number of members of the same class of an object with unique attributes, thus producing more realistic scenes. With the advent of high-definition displays, simulation and game applications also require highly detailed models. In many situations, it is not enough to procedurally generate the geometric models of the actors in the scene. And it is not practical to create their animations by hand, so automatic modeling behavior is another problem to solve. Our goal in this paper is to provide a unified approach to the generation of models for simulation and computer games. To achieve this goal we implement a system that combines procedural modeling with scripting and object-oriented programming. Procedural models may be of many different kinds: fractals, particle systems, grammar-based systems, etc. Ours are based on L-systems, but it can be used to implement all the other models. Our system supports features like parameterized, stochastic, context-sensitive and open L-systems. Moreover, we want our system to be as flexible as possible, and to allow the user to embed it in her own applications. Thus, we provide a method to use our procedural engines in many application domains. Also, our system is able M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 106–115, 2008. c Springer-Verlag Berlin Heidelberg 2008
Procedural Graphics Model and Behavior Generation
107
to combine different types of objects (both system- and user-provided) within a single framework. Geometry generators based on L-systems are usually targeted at specific applications with fixed symbols and tools based on LOGO’s turtle metaphor. They are highly dependable on the rewriting engine, thus preventing grammar improvement, language extensions, and code reuse. These generators can not generate different models for the same application. They require multiple L-systems that are difficult to integrate within the same application framework. To overcome these problems we introduce a procedural model generator that is easily extensible and supports different types of representations and application areas. It stems from a generator based on modular L-systems, L-systems that generate models using a rewriting engine witten in C/C++ and using Lua [1] as scripting language for grammar programming. We show how we can implement procedural, growth, image-based and other types of models using our system. This paper is structured as follows. The next section reviews previous work in modeling and automatic model generation. The following sections present our system implementation and several results obtained with it. We implement three different improvements on L-system: stochastic, context-sensible and open L-systems. Finally, we finish our paper with some conclusions and directions for future work.
2
Background
Procedural generators are typically based on techniques like fractals [2], particle systems [3], and grammar-based systems [4]. One may also find generators of simple primitives, subdivision surfaces, complex geometries [5] and Constructive Solid Geometry. All these generators allow the creation of texture images [6], terrain models, water and rain, hair, and plants and trees, among others. Historically, the most expressive procedural models have been the grammarbased technique called L-systems. It was introduced by Lindenmayer to model cellular interaction [7]. L-systems use a set of symbols, an axiom and a set of rewriting rules. The axiom is rewritten using the rules. Then, an output symbol string is generated and interpreted. The result of the interpretation is a freshly generated model of a tree, a building or any other object. Initially L-systems were used to create plant ecosystems [4]. Subsequently, they have been used for shell texturing [8], virtual urban landscaping [9],[10],[11], and geometry mesh generation [5]. L-systems have also been used for behavior modeling. Early L-systems were later modified to improve their expressiveness. First, they were parameterized, allowing arithmetic and boolean expressions on the parameters during the rewriting process. Later, stochastic L-systems introduced random variables into parameter expressions to support modeling the randomness of natural species [4]. Finally, context-sensitive rules and external feedback were added to L-systems to support interaction among generated objects and
108
J.L. Hidalgo et al.
between objects and their environment [12]. These systems are all based on the turtle metaphor [6]. Recent improvements on L-systems include FL-systems and the L+C language. In the L+C language the symbols are C data structures and the righthand sides of the rules are C functions from user developed libraries [13]. This improves on earlier systems by adding computation and data management to the rewriting process. Alternatively, FL-systems are L-systems that do not use the turtle metaphor [14]. Instead, they interpret the symbols of the derived string as function calls that can generate any geometry type. FL-systems have been used to generate VRML buildings.
3
Procedural Modeling and L-Systems
A general modeling system must support building many objects of many different types like, for example, crowd models made of people and city models made of streets and buildings. People may be represented using the same geometry, but each actual person should have slightly different properties, appearances and behaviors. Modeling such objects and behaviors by hand is impractical due to their complexity and large number of instances. We need systems to generate models automatically. Procedural modeling has been successfuly used to generate multiple instances of a same class of models. Our system can generate procedural models and behaviors of many kinds. It was originally developed to generate geometry using L-systems [15]. We have now extended it to generate image-based, grammar-based, and growth-based models as well as behaviors. In this paper we will show how to implement in our system stochastic and context-sensitive L-systems, as well as systems that can interact with their environment. The system’s interface and programming are based on Lua [1], a scripting language. Lua is used to handle all the higher-level elements of the modeling like: rule definition, user code, plugins, . . . . Additionally, lower-level objects are implemented in C/C++ and bound to Lua elements. These objects are loaded during the initialization of the system. Our system includes the classes used for basic graphics modeling, and the framework to allow the user to provide his own classes. This was designed with reusability in mind: objects, rules and bindings are organized in plugins that may be selectively loaded depending on the application. Object modeling is decoupled from string rewriting and offers a flexibility unavailable in other systems. To generate a procedural model and/or behavior, the user must provide an axiom and a set of rules written in the scripting language. These rules can use both the modeling objects provided by the system as well as custom, userprovided modeling objects. Currently our system provides objects like extruders, metaball generators, 3D line drawers and geometry generators based on Euler operators.
Procedural Graphics Model and Behavior Generation
109
Rewriting and interpretation rules are applied to the axiom and the subsequently derived strings. Internally the process may load dynamic objects, run code to create object instances, call object methods, and access and possibly modify the objects’ and the environment’s states. During the derivation process, any object in the system can use any other object, both system-provided or user-provided. This is why our L-systems are more flexible and more expressive than previous ones. For example, our system supports the same features as FL-systems and the L+C language: we generate geometry without the turtle metaphor, we include a complete and extensible rewriting engine, and we allow rules that include arithmetic and boolean expressions, flow control statements and other imperative language structures. Using a scripting language allows us to perform fast prototyping and avoids the inconvenience derived of the compiling/linking process. Since we use plugins and C/C++ bindings to handle objects that are instances of Object Oriented Programming classes, we offer more expressiveness than the L+C system.
4
Derivation Engine and Programming
We illustrate the features of our approach by describing its basic execution and a few application examples. To create a model or behavior we need to select the supporting classes. These classes can be already available in the system or they have to be written by the user. Then we use plugins to load them into the system, we instantiate the required objects and we create the system’s global state. Finally, the user has to provide the axiom and the rules (in Lua) that will control the derivation of the model. Currently, our system only implements a generic L-system deriving engine. Other engines are planned to be implemented, like genetic algorithms, recursive systems, etc. Our engine takes an L-system made of a set of symbols, a symbol called axiom, and the two sets of rewriting and interpretation rules. Then it alternatively and repeatedly applies rewriting and interpretation rules to the axiom and its derived strings, thus obtaining new derived strings like any other L-system. The difference is what happens during rewriting and interpretation. Our system is different because each time a rule is applied, a set of C/C++ code may be executed. Rewriting rules modify the derivation string without changing the C/C++ objects’ state and the system’s global state. Interpretation rules change the objects’ state without modifying the derivation string. Both types of rules have a left-hand side (LHS) and a right-hand side (RHS). The LHS has the form AB < S > DE where S is the symbol being rewritten and AB and DE are the left and right contexts, respectively. These contexts are optional. To match a rule we compare the rewriting string with the LHS’s symbol and its contexts. If the rule matches we run its RHS’s associated code, a Lua function that computes the result of the rewriting process or does the interpretation.
110
5
J.L. Hidalgo et al.
Generating Models and Behaviors
To illustrate the power of our approach we show how to generate models and behaviors based on three types of improved L-systems. First, we implement stochastic L-systems, a kind of parametric L-system whose parameters may include random variables [4]. Then, we show how our system supports context-sensitive L-Systems, an improvement that allows modeling interaction of an object with itself. Finally, we implement open L-systems, an extension of context-sensitive L-systems that allows modeling objects and their interactions with each other and the environment [12]. 5.1
Stochastic L-Systems
We implement stochastic L-systems by allowing parameters containing random variables. Fig. 1 shows code to generate the set of three buildings in the back of Fig. 3. Symbol City is rewritten as itself with one building less, followed by a translation and a Building symbol. Building is rewritten as a set of symbols that represent the floor, the columns and the roof of a building. The interpretation rules for those symbols generate geometry using a 3D extruder implemented in C/C++. The extruder can create 3D objects by extruding a given 2D shape and a path. obj:RRule("City", function(c) if c[1] > 0 then return { City{c[1]-1}, T{building_column_separation*building_width_columns+10,0}, Building{}, } end end) Fig. 1. A rule for creating a city made of copies of the same building
The code of Fig. 1 is deterministic and always generates the same set of buildings. To generate different types of buildings we parameterize the Building symbol with random variables representing building size, number of columns, number of steps, etc. Depending on the values taken by the random variables, different number of symbols will be created during rewriting. For example, building and column dimensions will determine how many columns are generated for each instance of a building. This is illustrated in the front row of buildings of Fig. 3. In practice, adding this stochastic behavior requires adding random number generator calls to the rules’ code (see Fig. 2). Fig. 3 shows an scene generated by this code. Note that we do not have to change our rewriting engine to support the improved L-System.
Procedural Graphics Model and Behavior Generation
111
obj:RRule("RandCity", function(c) if c[1] > 0 then return { RandCity{c[1]-1}, T{building_column_separation*building_width_columns+10,0}, Building{ width = rand_int(5)+3, length = rand_int(8)+4, column_height = rand_range(5,12), column_separation = rand_range(5,12), roof_height = rand_range(2,4), steps = rand_int(4) + 1 }, } end end) Fig. 2. A rule for creating a city made of different, randomly-defined buildings
5.2
Context-Sensitive L-Systems
The modeling of fireworks is another example that illustrates both model and behavior generation. We use it to describe the context-sensitive L-system feature of our system. We start with a C++ line generator object that creates a line given a color and two 3D points. Then, we define a L-system with five symbols: A, B, E, F, and L. Fig. 4 shows how the symbols are used to generate the fireworks. Symbol A is responsible for creating the raising tail. Symbol B defines the beginning of the raising tail and symbol L defines one of its segments. The end of the tail and its explosion are represented by symbol E, which is rewritten with a number of arms (symbol F) in random directions. Fig. 4 bottom shows an example derivation. When L is dim it becomes a B. When two Bs are together, the first one is deleted using a context-sensitive rule, thus eliminating unnecessary symbols and speeding up derivation. Symbols contain parameters representing position, direction, timestamp, etc. (parameters have been removed for clarity). They are used to compute the parabolas of the fireworks. They can also be used to add wind and gravity effects and to change the speed of the simulation. Fig. 5 shows three frames of an animation showing the output of the L-System. A video of the animated fireworks can be found in [16]. Note that the particle model and its behavior are both generated using a simple 3D line generator together with an L-system of a few rules. This illustrates the expressiveness of our system. 5.3
Open L-Systems
Open L-systems allow modeling of objects that interact with other objects and/or the environment. We are primarily interested in spatial interactions, even
112
J.L. Hidalgo et al.
Fig. 3. In the back three copies of the same building are shown, generated with the rule of Fig. 1. The buildings in the front are generated with the rule of Fig. 2. Note how a single parameterized rule can be used to generate different building instances.
Fig. 4. Top: the raising tail of the palm grows from B to A, in segments defined by L. Middle: At the top of the palm, symbol E fires the F arms in random directions. Bottom: derivation example.
Procedural Graphics Model and Behavior Generation
113
Fig. 5. Three frames of the firework animation generated using our context-sensitive L-system
if they are due to non-spatial issues, like plants growing for light or bacteria fighting for food. In this section we present an example of an open L-system. The environment in represented by a texture, in which the user has marked several target regions with a special color. The axiom generates a number of autonomous explorers in random positions of the environment. The goal of these explorers is to find a target region and grow a plant. There are two restrictions: (i) only one explorer or one plant can be at any given cell at any given time, and (ii) no explorer can go beyond the limits of the environment. The explorer is parameterized by a two properties: position and orientation. With a single rewriting rule, the explorer checks the color of its position in the environment map. If that color is the target color then, it is rewritten as a plant, else it updates its position. In each step, the explorer decides randomly if it advances using its current orientation, or it changes it. The explorer can not
Fig. 6. Two frames of the animation showing the explorer’s behavior
114
J.L. Hidalgo et al.
advance if the next position is outside of the map or there is another explorer or plant in that position. Fig. 6 shows two frames of an animation created using this rules. In this example, two modeling objects are used: the image manipulator, and the line generator used in the fireworks example. The image manipulator is an object that is able to load images, and read and write pixels from that image. Both objects are used simultaneously, and they communicate (the explorers’ traces are drawn onto the environment map, and the explorer check that map to decide whether they can move in certain direction).
6
Conclusions and Future Work
We present in this paper a new approach to procedural model and behavior generation. We propose a highly flexible and expressive tool to build L-systems using a scripting language and specifying rule semantics with an imperative object-oriented language. Our system combines the power of C/C++ objects with the simplicity and immediacy of Lua scripting. We show how our tool can be used to implement stochastic, context-sensitive and open L-systems. We show how different types of objects can be combined to generate geometry (buildings), images, and behaviors (explorers, fireworks). Our system can be applied to virtually any Computer Graphics related area: landscape modeling, image-based modeling, and modeling of population behavior and other phenomena. We expect to increase the functionality of our system by adding tools to generate models and behaviors based on: fractals and other iterative and recursive functions, genetic algorithms, growth models, particle systems and certain physics-based processes. We then want to apply it to the generation of virtual worlds for different types of games and for simulation and training applications. Acknowledgments. This work was partially supported by grant TIN200508863-C03-01 of the Spanish Ministry of Education and Science and by a doctoral Fellowship of the Valencian State Government.
References 1. Ierusalimschy, R.: Programming in Lua, 2nd edn. Lua.org (2006) 2. Mandelbrot, B.B.: The Fractal Geometry of Nature. W.H. Freeman, New York (1982) 3. Reeves, W.T., Blau, R.: Approximate and probabilistic algorithms for shading and rendering structured particle systems. In: SIGGRAPH 1985: Proceedings of the 12th annual conference on Computer graphics and interactive techniques, pp. 313– 322. ACM Press, New York (1985) 4. Prusinkiewicz, P., Lindenmayer, A.: The algorithmic beauty of plants. Springer, New York (1990)
Procedural Graphics Model and Behavior Generation
115
5. Tobler, R.F., Maierhofer, S., Wilkie, A.: Mesh-based parametrized L-systems and generalized subdivision for generating complex geometry. International Journal of Shape Modeling 8(2) (2002) 6. Ebert, D., Musgrave, F.K., Peachey, D., Perlin, K., Worley, S.: Texturing & Modeling: A Procedural Approach, 3rd edn. Morgan Kaufmann, San Francisco (2002) 7. Lindenmayer, A.: Mathematical models for cellular interaction in development, parts I and II. Journal of Theoretical Biology (18), 280–315 (1968) 8. Fowler, D.R., Meinhardt, H., Prusinkiewicz, P.: Modeling seashells. Computer Graphics 26(2), 379–387 (1992) 9. Parish, Y.I.H., M¨ uller, P.: Procedural modeling of cities. In: SIGGRAPH 2001: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, pp. 301–308. ACM Press, New York (2001) 10. Hahn, E., Bose, P., Whitehead, A.: Persistent realtime building interior generation. In: sandbox 2006: Proceedings of the 2006 ACM SIGGRAPH symposium on Videogames, pp. 179–186. ACM, New York (2006) 11. M¨ uller, P., Wonka, P., Haegler, S., Ulmer, A., Gool, L.V.: Procedural modeling of buildings 25(3), 614–623 (2006) 12. Mˇech, R., Prusinkiewicz, P.: Visual models of plants interacting with their environment. In: SIGGRAPH 1996: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, pp. 397–410. ACM, New York (1996) 13. Karwowski, R., Prusinkiewicz, P.: Design and implementation of the L+C modeling language. Electronic Notes in Theoretical Computer Science 86(2), 141–159 (2003) 14. Marvie, J.E., Perret, J., Bouatouch, K.: The FL-system: a functional L-system for procedural geometric modeling. The Visual Computer 21(5), 329–339 (2005) 15. Hidalgo, J., Camahort, E., Abad, F., Vivo, R.: Modular l-systems: Generating procedural models using an integrated approach. In: ESM 2007: Proceedings of the 2007 European Simulation and Modeling Conference, EUROSIS-ETI, pp. 514– 518 (2007) 16. http://www.sig.upv.es/papers/cggm08
Particle Swarm Optimization for B´ ezier Surface Reconstruction Akemi G´alvez, Angel Cobo, Jaime Puig-Pey, and Andr´es Iglesias Department of Applied Mathematics and Computational Sciences, University of Cantabria, Avda. de los Castros, s/n, E-39005, Santander, Spain {galveza,acobo,puigpeyj,iglesias}@unican.es
Abstract. This work concerns the issue of surface reconstruction, that is, the generation of a surface from a given cloud of data points. Our approach is based on a metaheuristic algorithm, the so-called Particle Swarm Optimization. The paper describes its application to the case of B´ezier surface reconstruction, for which the problem of obtaining a suitable parameterization of the data points has to be properly addressed. A simple but illustrative example is used to discuss the performance of the proposed method. An empirical discussion about the choice of the social and cognitive parameters for the PSO algorithm is also given.
1
Introduction
A major challenge in Computer Graphics nowadays is that of surface reconstruction. This problem can be formulated in many different ways, depending on the given input, the kind of surface involved and other additional constraints. The most common version in the literature consists of obtaining a smooth surface that approximates a given cloud of 3D data points accurately. This issue plays an important role in real problems such as construction of car bodies, ship hulls, airplane fuselages and other free-form objects. A typical example comes from Reverse Engineering where free-form curves and surfaces are extracted from clouds of points obtained through 3D laser scanning [5,11,12,17,18]. The usual models for surface reconstruction in Computer Aided Geometric Design (CAGD) are free-form parametric curves and surfaces, such as B´ezier, Bspline and NURBS. This is also the approach followed in this paper. In particular, we consider the case of B´ezier surfaces. In this case, the goal is to obtain the control points of the surface. This problem is far from being trivial: because the surface is parametric, we are confronted with the problem of obtaining a suitable parameterization of the data points. As remarked in [1] the selection of an appropriate parameterization is essential for topology reconstruction and surface fitness. Many current methods have topological problems leading to undesired surface fitting results, such as noisy self-intersecting surfaces. In general, algorithms for automated surface fitting [2,10] require knowledge of the connectivity between sampled points prior to parametric surface fitting. This task becomes increasingly difficult if the capture of the coordinate data is unorganized or scattered. Most of the techniques used to compute connectivity require a dense data M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 116–125, 2008. c Springer-Verlag Berlin Heidelberg 2008
Particle Swarm Optimization for B´ezier Surface Reconstruction
117
set to prevent gaps and holes, which can significantly change the topology of the generated surface. Some recent papers have shown that the application of Artificial Intelligence (AI) techniques can achieve remarkable results regarding this parameterization problem [5,8,9,11,12,16]. Most of these methods rely on some kind of neural networks, either standard neural networks [8], Kohonen’s SOM (Self-Organizing Maps) nets [1,9], or the Bernstein Basis Function (BBF) network [16]. In some cases, the network is used exclusively to order the data and create a grid of control vertices with quadrilateral topology [9]. After this preprocessing step, any standard surface reconstruction method (such as those referenced above) has to be applied. In other cases, the neural network approach is combined with partial differential equations [1] or other approaches. The generalization to functional networks is also analyzed in [5,11,12]. A previous paper in [7] describes the application of genetic algorithms and functional networks yielding pretty good results for both curves and surfaces. Our strategy for tackling the problem also belongs to this group of AI techniques. In this paper we address the application of the Particle Swarm Optimization method for B´ezier surface reconstruction. Particle Swarm Optimization (PSO) is a popular metaheuristic technique with biological inspiration, used in CAM (Computer-Aided Manufacturing) for dealing with optimization of milling processes [6]. The original PSO algorithm was first reported in 1995 by James Kennedy and Russell C. Eberhart in [3,13]. In [4] some developments are presented. These authors integrate their contributions in [15]. See also [19]. The structure of this paper is as follows: the problem of surface reconstruction is briefly described in Section 2. Then, Section 3 describes the PSO procedure in detail. A simple yet illustrative example of its application is reported in Section 4. This section also discuss the problem of the adequate choice of the social and cognitive parameters of the PSO method by following an empirical approach. The main conclusions and further remarks in Section 5 close the paper.
2
Surface Reconstruction Problem
The problem of surface reconstruction can be stated as follows: given a set of sample points X assumed to lie on an unknown surface U, construct a surface model S that approximates U. This problem is generally addressed by means of the least-squares approximation scheme, a classical optimization technique that (given a series of measured data) attempts to find a function (the fitness function) which closely approximates the data. The typical approach is to assume that f has a particular functional structure which depends on some parameters that need to be calculated. In this work, we consider the case of f being a B´ezier parametric surface S(u, v) of degree (M, N ) whose representation is given by:
S(u, v) =
M N i=0 j=0
Pi,j BiM (u)BjN (v)
(1)
118
A. G´ alvez et al.
where BiM (u) and BjN (v) are the classical Bernstein polynomials and the coefficients Pi,j are the surface control points. Given a set of 3D data points {Dk }k=1,...,nk , we can compute, for each of the cartesian components, (xk , yk , zk ) of Dk , the minimization of the sum of squared errors referred to the data points:
Errμ =
nk k=1
⎛ ⎝μk −
M N
⎞2 Pijμ BiM (uk )BjN (vk )⎠
;
μ = x, y, z
(2)
i=0 j=0
Coefficients Pij = (Pijx , Pijy , Pijz ), i = 0, . . . , M , j = 0, . . . , N , are to be determined from the information given by the data points (xk , yk , zk ), k = 1, . . . , nk . Note that performing the component-wise minimization of these errors is equivalent to minimizing the sum, over the set of data, of the Euclidean distances between data points and corresponding points given by the model in 3D space. Note that, in addition to the coefficients of the basis functions, Pij , the parameter values, (uk , vk ), k = 1, . . . , nk , associated with the data points also appear as unknowns in our formulation. Due to the fact that the blending functions BiM (u) and BjN (v) are nonlinear in u and v respectively, the least-squares minimization of the errors becomes a strongly nonlinear problem [20], with a high number of unknowns for large sets of data points, a case that happens very often in practice.
3 Particle Swarm Optimization
Particle Swarm Optimization (PSO) is a stochastic algorithm based on the evolution of populations for problem solving. PSO is a kind of swarm intelligence, a field in which systems are composed of a collection of individuals exhibiting decentralized or collective behavior, such that simple agents interact locally with one another and with their environment. Instead of a central behavior determining the evolution of the population, it is these local interactions between agents which lead to the emergence of a global behavior for the swarm. A typical example of PSO is the behavior of a flock of birds when moving all together following a common tendency in their displacements. Other examples from nature include ant colonies, animal herding, and fish schooling. In PSO the particle swarm simulates the social optimization commonly found in communities with a high degree of organization. For a given problem, some fitness function such as (2) is needed to evaluate a proposed solution. In order to obtain a good one, PSO methods incorporate both a global tendency for the movement of the set of individuals and local influences from neighbors [3,13]. PSO procedures start by choosing a population (swarm) of random candidate solutions in a multidimensional space, called particles. They are then displaced throughout their domain looking for an optimum, taking into account global and local influences, the latter coming from the neighborhood of each particle. To this purpose, all particles have a position and a velocity. These particles evolve all through the hyperspace according to two essential reasoning capabilities: a
Table 1. General structure of the particle swarm optimization algorithm

begin
  k = 0
  random initialization of individual positions P_i and velocities V_i in Pop(k)
  fitness evaluation of Pop(k)
  while (not termination condition) do
    Calculate best fitness particle P_gb
    for each particle i in Pop(k) do
      Calculate particle position P_ib with best fitness
      Calculate velocity V_i for particle i according to (3)
      while not feasible P_i + V_i do
        Apply scale factor to V_i
      end
      Update position P_i according to (4)
    end
    k = k + 1
  end
end
memory of their own best position and knowledge of the global or their neighborhood's best. The meaning of "best" must be understood in the context of the problem to be solved. In a minimization problem (as in this paper) it means the position with the smallest value of the target function. The dynamics of the particle swarm is considered along successive iterations, like time instants. Each particle modifies its position P_i along the iterations, keeping track of its best position in the domain of the variables implied in the problem. This is done by storing for each particle the coordinates P_ib associated with the best solution (fitness) it has achieved so far, along with the corresponding fitness value, f_ib. These values account for the memory of the best particle position. In addition, members of a swarm can communicate good positions to each other, so they can adjust their own position and velocity according to this information. To this purpose, we also collect the best fitness value among all the particles in the population, f_gb, and its position P_gb from the initial iteration onwards. This is global information used to modify the position of each particle. Finally, the evolution for each particle i is given by:

V_i(k + 1) = w V_i(k) + α R_1 [P_gb(k) − P_i(k)] + β R_2 [P_ib(k) − P_i(k)]
(3)
P_i(k + 1) = P_i(k) + V_i(k)
(4)
where P_i(k) and V_i(k) are the position and the velocity of particle i at time k, respectively, w is called the inertia weight and decides how much the old velocity affects the new one, and the coefficients α and β are constant values called learning factors, which decide the degree of influence of P_gb and P_ib. In particular, α is a weight that accounts for the "social" component, while β represents the "cognitive" component, accounting for the memory of an individual particle
Fig. 1. Example of surface reconstruction through particle swarm optimization: reconstructed bicubic Bézier surface and data points
over time. Two random numbers, R_1 and R_2, with uniform distribution on [0, 1] are included to enrich the exploration of the search space. Finally, a fitness function must be given to evaluate the quality of a position. This procedure is repeated several times (thus yielding successive generations) until a termination condition is reached. Common terminating criteria are that a solution is found that satisfies a lower threshold value, that a fixed number of generations has been reached, or that successive iterations no longer produce better results. The final PSO procedure is briefly sketched in Table 1.
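For illustration, one iteration of the scheme of Table 1 and Eqs. (3)-(4) can be sketched in Python as follows; the data layout, the per-particle random numbers and the omission of the velocity scaling step are our simplifications, not the authors' implementation.

    import random

    def pso_iteration(pos, vel, best_pos, best_fit, g_best, fitness,
                      w=1.0, alpha=0.5, beta=0.5):
        # One sweep over the swarm following Eqs. (3)-(4) (minimization).
        for i in range(len(pos)):
            r1, r2 = random.random(), random.random()
            vel[i] = [w * v + alpha * r1 * (g - x) + beta * r2 * (b - x)
                      for v, x, g, b in zip(vel[i], pos[i], g_best, best_pos[i])]
            pos[i] = [x + v for x, v in zip(pos[i], vel[i])]   # Eq. (4)
            f = fitness(pos[i])
            if f < best_fit[i]:                  # update particle memory
                best_fit[i], best_pos[i] = f, pos[i][:]
        gi = min(range(len(best_fit)), key=lambda k: best_fit[k])
        return best_pos[gi][:], best_fit[gi]     # new global best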
4 An Illustrative Example
In this section we analyze a simple yet illustrative example aimed at showing the performance of the presented method. To this purpose, we consider an input of 256 data points generated from a Bézier surface as follows: for the u's and v's of the data points, we choose two groups of 8 equidistant parameter values in the intervals [0, 0.2] and [0.8, 1]. This gives a set of 256 3D data points. Our goal is to reconstruct the surface from which such points come. To do so, we consider a bicubic Bézier surface, so the unknowns are 3 × 16 = 48 scalar coefficients (3 coordinates for each of the 16 control points) and two parameter vectors for u and v (each of size 16) associated with the 256 data points. That makes a total of 80 scalar unknowns. An exact solution (i.e. with zero error) exists for this problem.
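The sample points of this example can be generated, for instance, as in the short Python sketch below, which reuses the bezier_point routine sketched in Section 2; control_net stands for a hypothetical 4 × 4 net of control points of the target surface.

    def equidistant(a, b, n=8):
        # n equidistant values in [a, b]
        return [a + k * (b - a) / (n - 1) for k in range(n)]

    u_vals = equidistant(0.0, 0.2) + equidistant(0.8, 1.0)   # 16 values for u
    v_vals = equidistant(0.0, 0.2) + equidistant(0.8, 1.0)   # 16 values for v
    params = [(u, v) for u in u_vals for v in v_vals]        # 16 x 16 = 256 pairs
    # data = [bezier_point(control_net, u, v) for u, v in params]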
Fig. 2. Example of surface reconstruction through particle swarm optimization: (left) evolution of the mean (solid line) and the best (dotted line) Euclidean errors along the generations; (right): optimum parameter values for u and v on the parametric domain
The input parameter values for the PSO algorithm are: a population size of 200 individuals or particles, where each particle is represented by two vectors, U and V, each with 16 components initialized with random uniform values on [0, 1] sorted in increasing order; and inertia coefficient w = 1. The termination criterion is that of not improving the solution after 30 consecutive iterations. An example of a reconstructed surface along with the data points is shown in Figure 1. This example does not correspond to the best solution (note that some points do not actually lie on the surface); it is just an average solution instead. It has been attained from eqs. (3)-(4) with α = β = 0.5 at generation 432, with the following results (this case is marked with an asterisk in Table 2 for easier comparison): best error in the fit: 1.6935426; mean error: 1.6935856; computation time: 56.73 seconds. All computations in this paper have been performed on a 2.4 GHz Intel Core 2 Duo processor with 2 GB of RAM. The source code has been implemented in the popular scientific program Matlab, version 7.0. Fig. 2 (left) displays the evolution of the mean (solid line) and best (dotted line) distance errors for each generation along the iterations. The optimum parameter values for (u, v) are depicted in Fig. 2 (right), where one can see how the fitting process grasps the distribution of parameter values assigned to the data points. It is worthwhile to mention the tendency of the obtained parameter values, initially uniformly distributed on the unit square, to concentrate at the corners of the unit square parameter domain, thus adjusting well to the input information. However, neither are the couples (u, v) uniformly distributed at the corners nor do they fall within the intervals [0, 0.2] and [0.8, 1] for u and v, meaning that the current results might actually be improved. A critical issue in this method is the choice of the coefficients α and β, accounting for the global (or social) and the local (or cognitive) influences, respectively. In order to determine their role in our approach and how their choice affects the method's performance, we considered values for α and β ranging from 0.1 to 0.9
Table 2. Executions of the PSO algorithm for our Bézier surface reconstruction problem. Cases: α from 0.9 down to 0.4 with step 0.1, β = 1 − α in all cases.

α = 0.9, β = 0.1
Best error   Mean error   # iter.   CPU time
1.1586464    1.1586805    338       48.49
1.6367439    1.6368298    624       85.03
2.3784306    2.3784327    736       93.27
1.8768595    1.8778174    350       46.26
2.1174907    2.1174907    917       131.72
1.1742370    1.1785017    145       20.97
2.0640768    2.0640795    503       60.19
2.2379692    2.2380810    302       46.43
2.1448603    2.1450512    443       56.22
2.0545017    2.0547538    408       52.48

α = 0.8, β = 0.2
Best error   Mean error   # iter.   CPU time
2.1516721    2.1727290    820       151.74
2.5577003    2.5578876    308       46.45
2.0212768    2.0212868    431       69.06
1.8898777    1.8899836    455       69.23
2.0019422    2.003234     456       57.26
1.6520482    1.6520508    815       106.01
2.1574432    2.1574436    1540      205.35
2.4201197    2.4202493    822       100.09
2.0328183    2.0328913    587       72.24
2.2947584    2.2949567    3144      402.61

α = 0.7, β = 0.3
Best error   Mean error   # iter.   CPU time
1.8639684    1.8653115    162       21.19
1.5992922    1.5993440    192       24.27
2.2059607    2.2059608    1340      149.33
2.3826529    2.3836276    185       22.42
2.0860826    2.0862052    400       49.38
1.5692097    1.5692186    967       174.42
1.6049119    1.6049300    470       58.12
1.3689993    1.3690215    292       39.17
1.6388899    1.6395055    458       50.81
1.7290016    1.7291297    389       41.83

α = 0.6, β = 0.4
Best error   Mean error   # iter.   CPU time
1.9953687    1.9959765    379       44.73
2.1917047    2.1921891    289       39.14
1.2152328    1.2152527    382       49.42
1.9617652    1.9623143    303       43.27
1.2548267    1.2548387    1222      143.48
1.8748061    1.8752081    407       52.23
1.9507635    1.9509406    363       40.67
2.0454719    2.0464246    692       82.73
1.3580824    1.3583463    212       25.62
1.9017035    1.9017552    791       92.15

α = 0.5, β = 0.5
Best error   Mean error   # iter.   CPU time
1.0799138    1.0805757    408       52.47
1.7608284    1.7608294    808       102.93
1.8697185    1.8697928    809       104.86
1.6935426    1.6935856    432       56.73  *
1.2815625    1.2815865    495       61.72
2.1078771    2.1079752    401       61.96
1.7415515    1.7415516    574       78.96
1.6556435    1.6556464    1083      143.02
2.0329562    2.0329594    1286      172.42
1.0632575    1.0632593    413       54.67

α = 0.4, β = 0.6
Best error   Mean error   # iter.   CPU time
0.9491048    0.9492409    382       45.20
1.7165179    1.7165739    1573      178.12
1.3993802    1.3993921    466       58.42
1.1001050    1.1001427    720       104.10
1.2968360    1.2968360    1236      157.12
0.9909381    0.9909412    575       73.24
1.4642326    1.4642397    781       89.32
1.6312540    1.6312592    619       74.42
1.4394767    1.4394768    665       83.01
1.4422279    1.4422380    784       91.03
with step 0.1 in the form of a convex combination, i.e., α + β = 1. This choice allows us to associate the values of the couple (α, β) with a probability, so that their interplay can be better analyzed and understood. Note that the limit values 0 and 1 for either parameter α or β automatically discard the other one, so they are not considered in our study. Furthermore, in [14] some experiments for the two extreme cases, the social-only model and the cognitive-only model, were carried out
Table 3. Executions of the PSO algorithm for our Bézier surface reconstruction problem. Cases: α = 0.3, α = 0.2 and α = 0.1, β = 1 − α in all cases.

α = 0.3, β = 0.7
Best error   Mean error   # iter.   CPU time
2.1435684    2.1435684    974       109.00
1.8060138    1.8060836    531       57.48
2.2339992    2.2339993    1159      129.07
2.0832623    2.0832632    3781      387.39
1.6257242    1.6257283    748       83.71
1.6742073    1.6742091    831       104.53
2.2355623    2.2356626    1262      134.60
2.0420308    2.0420326    510       64.98
2.2102381    2.2102381    741       82.48
2.2140913    2.2141552    2168      243.69

α = 0.2, β = 0.8
Best error   Mean error   # iter.   CPU time
1.866938     1.867126     461       61.99
0.9551159    0.9557403    290       37.26
2.0777061    2.0777782    940       113.08
2.0558802    2.0559779    807       96.07
1.6975429    1.6975450    1330      163.06
1.9514682    1.9514725    1405      177.13
1.8397214    1.8397337    898       107.12
1.8298951    1.8298951    1297      165.25
2.0990000    2.0990008    905       100.92
1.5363575    1.5363722    766       92.41

α = 0.1, β = 0.9
Best error   Mean error   # iter.   CPU time
1.6872942    1.6873196    1885      197.43
1.8395049    1.8395169    1793      194.67
1.3436841    1.3436841    1887      214.16
1.7886651    1.7889720    561       67.38
1.0500875    1.0501339    1442      154.02
1.6657973    1.6658058    1377      156.11
1.8360591    1.8360592    3918      412.28
1.4325258    1.4325283    751       99.35
0.9387540    0.9387899    814       89.28
0.9643215    0.9642876    798       84.32
and the author found out that both parts are essential to the success of PSO. On the other hand, to overcome the randomness inherent in our method, we carried out 25 executions for each choice of these parameters. The 15 worst results were then removed to prevent the appearance of spurious solutions leading to local minima. The remaining 10 executions are collected in Tables 2 and 3. The columns of these tables show the best and mean errors, the number of iterations and the computation time (in seconds), respectively. Note that the best and mean errors take extremely close (although slightly different) values, with differences of the order of 10^{-5} in most cases. Note also that the number of iterations (and hence the computation time) varies considerably among different executions. However, a larger number of iterations does not imply, in general, lower errors.
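The experimental protocol just described can be summarized by the following small sketch, where run_pso stands for a complete PSO execution returning its best error (the name and signature are ours):

    def evaluate_setting(run_pso, alpha, beta, runs=25, keep=10):
        # 25 independent executions per (alpha, beta); keep the 10 best ones
        errors = sorted(run_pso(alpha, beta) for _ in range(runs))
        return errors[:keep]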
5 Conclusions and Future Work
In this paper we consider the problem of the reconstruction of a Bézier surface from a set of 3D data points. The major problem here is to obtain a suitable parameterization of the data points. To this aim, we propose the use of the PSO algorithm, which is briefly described in this paper. The performance of this method is discussed by means of a simple example. In general, the PSO performs well for the given problem. Errors typically fall within the interval [0.9, 2.6] in our executions, although higher (and possibly lower) values can also be obtained. This means that the present method compares
well with the genetic algorithms approach reported in [7], although the PSO seems to yield more scattered output throughout the output domain. Another remarkable feature is that the best and mean errors are very close to each other in all cases, as opposed to the genetic algorithms case, where the differences are generally larger. On the other hand, there is no correlation between the number of iterations and the quality of the results. This might mean that our way of exploring the space domain of the problem is not optimal yet, and consequently there is room for further improvement. This can be achieved by a smart choice of the PSO parameters. As a first step, we performed an empirical analysis of the choice of the (α, β) parameters of the PSO (although other parameters such as the initial population and the number of neighbors for each particle are also relevant). The results show that there are no significant differences when changing the (α, β) values in the form of a convex combination, although the condition β ≥ α seems to achieve slightly better results. However, further research is still needed in order to determine the role of the parameter values to their full extent. Our future work includes further analysis of the influence of the PSO parameters on the quality of the results. Some modifications of the original PSO scheme might lead to better results. Other future work is the consideration of piecewise polynomial models such as B-splines or NURBS, which introduce some changes in the computational process to deal with the knot vectors (additional parameters to be taken into account in these models). Some ideas on how to improve the search process globally are also part of our future work. Acknowledgments. The authors thank the financial support from the SistIngAlfa project, Ref: ALFA II-0321-FA of the European Union, and from the Spanish Ministry of Education and Science, National Program of Computer Science, Project Ref. TIN2006-13615, and National Program of Mathematics, Project Ref. MTM2005-00287.
References 1. Barhak, J., Fischer, A.: Parameterization and reconstruction from 3D scattered points based on neural network and PDE techniques. IEEE Trans. on Visualization and Computer Graphics 7(1), 1–16 (2001) 2. Bradley, C., Vickers, G.W.: Free-form surface reconstruction for machine vision rapid prototyping. Optical Engineering 32(9), 2191–2200 (1993) 3. Eberhart, R.C., Kennedy, J.: A new optimizer using particle swarm theory. In: Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, pp. 39–43 (1995) 4. Eberhart, R.C., Shi, Y.: Particle swarm optimization: developments, applications and resources. In: Proceedings of the 2001 Congress on Evolutionary Computation, pp. 81–86 (2001) 5. Echevarr´ıa, G., Iglesias, A., G´ alvez, A.: Extending neural networks for B-spline surface reconstruction. In: Sloot, P.M.A., Tan, C.J.K., Dongarra, J., Hoekstra, A.G. (eds.) ICCS-ComputSci 2002. LNCS, vol. 2330, pp. 305–314. Springer, Heidelberg (2002)
6. El-Mounayri, H., Kishawy, H., Tandon, V.: Optimized CNC end-milling: a practical approach. International Journal of Computer Integrated Manufacturing 15(5), 453– 470 (2002) 7. G´ alvez, A., Iglesias, A., Cobo, A., Puig-Pey, J., Espinola, J.: B´ezier curve and surface fitting of 3D point clouds through genetic algorithms, functional networks and least-squares approximation. In: Gervasi, O., Gavrilova, M.L. (eds.) ICCSA 2007, Part II. LNCS, vol. 4706, pp. 680–693. Springer, Heidelberg (2007) 8. Gu, P., Yan, X.: Neural network approach to the reconstruction of free-form surfaces for reverse engineering. Computer Aided Design 27(1), 59–64 (1995) 9. Hoffmann, M., Varady, L.: Free-form surfaces for scattered data by neural networks. J. Geometry and Graphics 2, 1–6 (1998) 10. Hoppe, H., DeRose, T., Duchamp, T., McDonald, J., Stuetzle, W.: Surface reconstruction from unorganized points. In: Proc. of SIGGRAPH 1992, vol. 26(2), pp. 71–78 (1992) 11. Iglesias, A., G´ alvez, A.: A new artificial intelligence paradigm for computer aided geometric design. In: Campbell, J.A., Roanes-Lozano, E. (eds.) AISC 2000. LNCS (LNAI), vol. 1930, pp. 200–213. Springer, Heidelberg (2001) 12. Iglesias, A., Echevarr´ıa, G., G´ alvez, A.: Functional networks for B-spline surface reconstruction. Future Generation Computer Systems 20(8), 1337–1353 (2004) 13. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: IEEE International Conference on Neural Networks, Perth, Australia, pp. 1942–1948 (1995) 14. Kennedy, J.: The particle swarm: social adaptation of knowledge. In: IEEE International Conference on Evolutionary Computation, Indianapolis, Indiana, USA, pp. 303–308 (1997) 15. Kennedy, J., Eberhart, R.C., Shi, Y.: Swarm Intelligence. Morgan Kaufmann Publishers, San Francisco (2001) 16. Knopf, G.K., Kofman, J.: Free-form surface reconstruction using Bernstein basis function networks. In: Dagli, C.H., et al. (eds.) Intelligent Engineering Systems Through Artificial Neural Networks, vol. 9, pp. 797–802. ASME Press (1999) 17. Pottmann, H., Leopoldseder, S., Hofer, M., Steiner, T., Wang, W.: Industrial geometry: recent advances and applications in CAD. Computer-Aided Design 37, 751– 766 (2005) 18. Varady, T., Martin, R.: Reverse Engineering. In: Farin, G., Hoschek, J., Kim, M. (eds.) Handbook of Computer Aided Geometric Design. Elsevier, Amsterdam (2002) 19. Vaz, I.F., Vicente, L.N.: A particle swarm pattern search method for bound constrained global optimization. Journal of Global Optimization 39, 197–219 (2007) 20. Weiss, V., Andor, L., Renner, G., Varady, T.: Advanced surface fitting techniques. Computer Aided Geometric Design 19, 19–42 (2002)
Geometrical Properties of Simulated Packings of Spherocylinders
Monika Bargiel
Institute of Computer Science, AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Kraków, Poland
[email protected]
Abstract. In a wide range of industrial applications there appear systems of hard particles of different shapes and sizes, known as "packings". In this work, the force-biased algorithm, primarily designed to model close packings of equal spheres, is adapted to simulate mixtures of spherocylindrical particles of different radii and aspect ratios. The packing densities of the simulated mono- and polydisperse systems are presented as functions of particle elongation and different algorithm parameters. It is shown that spherocylinders can pack more densely than spheres, reaching volume fractions as high as 0.705.
1 Introduction
Historically, dense random close packings (RCP) of spheres were considered as a model for the structure of liquids, especially those of the noble gases. RCP was viewed as a well-defined state [1] with density φ ≈ 0.6366. This value was obtained in experiments [2] [3] as well as in computer simulations [4]. Later work by Jodrey and Tory [5] and Mościński et al. [6] showed that higher densities could easily be obtained at the cost of increasing the order in the sphere system. Since a precise definition of "randomness" is lacking, the distinction between ordered and random is not absolute. Bargiel and Tory [7] introduced a measure of local disorder as the deviation of each 13-sphere complex from the corresponding fragment of either the f.c.c. or h.c.p. lattice. The global disorder is then defined as the average local disorder (see formulae (21) to (23) of [7]). This measure makes it possible to identify crystalline or nearly crystalline regions and hence to track the transition from RCP to higher densities and determine the extent to which the greater density increases the order. By approximating the fraction of quasi-crystalline fragments (f.c.c. or h.c.p.) versus packing density, they observed that the first crystalline fragments begin to form in the random structure at φ ≈ 0.6366 (see Fig. 7 and Table 6 of [7]), a value close to the earlier predictions for RCP. Recently, Torquato et al. [8] described RCP as an ill-defined state and introduced the concept of the maximally random jammed (MRJ) state [9] [10], corresponding to the least ordered among all jammed packings. For a variety of order metrics, it appears that the MRJ state has a density of φ ≈ 0.6366 and is again consistent with what has been thought of as RCP [11].
There exists a wide spectrum of experimental and computational algorithms that can produce packings of equal spheres of different porosity and geometrical properties [12] [13] [6] [14] [15] [4] [5] [16]. However, in many applications we have to deal with granular systems of hard particles which are far from spherical. Those non-spherical particles have to be treated in a quite different way due to their additional rotational degrees of freedom. Recently, Donev et al. [17] [18] and independently Bezrukov et al. [19] adapted some of the known algorithms to produce random packings of ellipsoids. Donev et al. [20] experimented with two kinds of M&M's Candies, and then generalized the well-known Lubachevsky-Stillinger algorithm (LS) [12] [13] to handle ellipsoids. In both cases (experiment and simulation) they obtained high volume fractions (up to 0.71). Abreu et al. [21] used the Monte Carlo technique to study packings of spherocylinders in the presence of a gravitational field. In this paper we use an adaptation of the force-biased (FB) algorithm [6] [14] to produce dense random packings of spherocylinders. The reason for this choice is that spherocylinders are an excellent model for particles ranging from spherical to very elongated (rod-like), depending on their aspect ratio. Furthermore, there exists an efficient algorithm to calculate the distance between two spherocylinders and to detect potential overlaps [22] [21], which is crucial in practically any dense packing simulation process. In addition, Allen et al. [23] argue that spherocylinders show a smectic phase (while ellipsoids do not), since they can be mapped onto the hard sphere fluid by a global change of scale. We tested two systems of hard spherocylinders: isotropic and nematic. In the isotropic system the spherocylinders have random initial orientations, which are constant throughout the simulation. To observe the isotropic-nematic transition we allowed for the rotation of the spherocylinders with different rotation factors. In this case we obtained much higher densities at the cost of increasing the value of the nematic order parameter [23].
2 The Force-Biased Algorithm
2.1 Spheres
The force-biased (FB) algorithm was primarily designed to attain very dense irregular packings of hard spheres [6] [14]. The initial configuration of the system is a set of N (initially overlapping) spheres centered at r_i, i = 1, . . . , N, and of diameters d_i chosen according to a given distribution function. The algorithm attempts to eliminate overlaps by pushing apart overlapping particles while gradually reducing their diameters, as described in [6] [14]. The spheres are moved according to the repulsive "force", F_ij, defined between any two overlapping particles. The new position of the i-th sphere is then given by

r_i' = r_i + ε \sum_{j \neq i} F_{ij},     (1)

and

F_{ij} = α_{ij} p_{ij} \frac{δ_{ij}}{|δ_{ij}|},     (2)
where ε is the scaling factor and α_{ij} is the overlap index

α_{ij} = 1 if particles i and j intersect, and α_{ij} = 0 otherwise.     (3)
The pair "potential", p_{ij}, is proportional to the overlap between spheres i and j. For monosized spheres (of diameter d) the definition of p_{ij} is straightforward:

p_{ij} = d \left( 1 − \frac{δ_{ij}^2}{d^2} \right),     (4)

where δ_{ij} is the distance between the centers of the spheres i and j, i.e.

δ_{ij}^2 = |r_{ij}|^2.     (5)
However, (4) cannot be applied efficiently to spheres of different diameters (especially when the difference in sizes is large). Hence, for diameters of arbitrary distribution, another potential function was devised (similar to those given in [19]):

p_{ij} = d_1 \frac{d_j}{d_i} \left( 1 − \frac{δ_{ij}^2}{\frac{1}{4}(d_i + d_j)^2} \right).     (6)

For equal particles d_i = d, i = 1, . . . , N, and (6) simplifies to (4).
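For illustration only, the displacement step of Eqs. (1)-(6) can be sketched in Python as follows; the sign convention is chosen so that overlapping spheres are pushed apart, d1 denotes the reference diameter appearing in Eq. (6) (equal to d for monosized spheres), and the data layout is ours:

    def fb_displacement_step(r, d, eps, d1):
        # r: list of 3D centres, d: list of diameters, eps: scaling factor
        n = len(r)
        shift = [[0.0, 0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                delta = [rj - ri for ri, rj in zip(r[i], r[j])]
                dist2 = sum(c * c for c in delta)
                touch = 0.5 * (d[i] + d[j])
                if dist2 >= touch * touch:        # alpha_ij = 0: no overlap
                    continue
                dist = dist2 ** 0.5
                p_ij = d1 * (d[j] / d[i]) * (1.0 - dist2 / touch ** 2)  # Eq. (6)
                p_ji = d1 * (d[i] / d[j]) * (1.0 - dist2 / touch ** 2)
                for c in range(3):
                    u = delta[c] / dist           # unit vector from i towards j
                    shift[i][c] -= eps * p_ij * u     # push i away from j
                    shift[j][c] += eps * p_ji * u     # push j away from i
        return [[x + s for x, s in zip(r[i], shift[i])] for i in range(n)]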
2.2 Adaptation to Spherocylinders
Using spherocylinders instead of spheres complicates the algorithm for a couple of reasons. Firstly, a spherocylinder i is described by four parameters. In addition to its diameter, d_i, and spatial position, r_i, we have to consider the length of its cylindrical portion, l_i, and the orientation of its main axis, given by a unit vector u_i. What is more important, the overlap detection and the calculation of the potential function between two spherocylinders are much more complicated than in the case of spheres. Vega and Lago [22] proposed a very efficient algorithm for locating the minimum distance between spherocylinders, later improved by Abreu et al. [21]. This algorithm uses the notion of the shaft of a spherocylinder as the main axis of its cylindrical portion. The coordinates of any point of a shaft i are given by

s_i = r_i + λ_i u_i,     (7)

where λ_i ∈ [−l_i/2, l_i/2]. Spherocylinders i and j overlap if the shortest distance between their shafts, δ_{ij}, is less than the sum of their radii, that is, when δ_{ij} < (d_i + d_j)/2. Let q_i = r_i + λ_i^{*(j)} u_i and q_j = r_j + λ_j^{*(i)} u_j be the points on shafts i and j, respectively, closest to each other. Then

δ_{ij} = q_j − q_i = r_{ij} + λ_j^{*(i)} u_j − λ_i^{*(j)} u_i     (8)

is the vector connecting the closest points of shafts i and j, and

δ_{ij}^2 = |δ_{ij}|^2.     (9)
For parallel shafts (i.e. when |u_i · u_j|^2 = 1) their distance can be expressed as

δ_{ij}^2 = |r_{ij}|^2 − |u_i \cdot r_{ij}|^2 + \max\left(0, |u_i \cdot r_{ij}| − \frac{l_i + l_j}{2}\right)^2.     (10)

Equation (10) can be applied for calculating overlaps between a sphere and a spherocylinder. A sphere i can be considered as a spherocylinder of null length (l_i = 0) and parallel to particle j (u_i = u_j or u_i = −u_j). Obviously, when l_i = l_j = 0 (both particles are spheres) we get (5). In addition to shifting the positions of the particles (see (1)), the spherocylinders can be rotated according to

u_i' = n\left( u_i − ε_r \frac{d_i}{l_i} \sum_{j \neq i} α_{ij} p_{ij}^r λ_i^{*(j)} \frac{δ_{ij}^r}{|δ_{ij}^r|} \right),     (11)
where p_{ij}^r is the rotational potential

p_{ij}^r = 1 − \frac{δ_{ij}^2}{\frac{1}{4}(d_i + d_j)^2},     (12)

δ_{ij}^r is the projection of δ_{ij} onto the plane perpendicular to u_i, given by

δ_{ij}^r = δ_{ij} − u_i (δ_{ij} \cdot u_i),     (13)
ε_r is the scaling factor, and n(x) = x/|x| is the unit vector in the direction of x. In the case of spheres, when all the overlaps are eliminated, the algorithm stops. For spherocylinders the density can be further increased when, after the elimination of overlaps, the size of the particles is increased by a certain factor and the overlap elimination process is repeated. This is done until no further densification is possible.
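As an illustration (not the full Vega-Lago procedure of [22]), the parallel-shaft special case of Eq. (10) and the projection of Eq. (13) can be written in Python as:

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def parallel_shaft_overlap(ri, rj, ui, li, lj, di, dj):
        # Eq. (10): squared shaft distance for parallel spherocylinders,
        # followed by the overlap test delta_ij < (d_i + d_j) / 2.
        rij = [b - a for a, b in zip(ri, rj)]
        axial = dot(ui, rij)
        excess = max(0.0, abs(axial) - 0.5 * (li + lj))
        dist2 = dot(rij, rij) - axial * axial + excess * excess
        return dist2 < (0.5 * (di + dj)) ** 2

    def perpendicular_projection(delta_ij, ui):
        # Eq. (13): component of delta_ij perpendicular to the shaft axis u_i
        a = dot(delta_ij, ui)
        return [dc - a * uc for dc, uc in zip(delta_ij, ui)]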
3 Results
We present results obtained from the FB algorithm for monodisperse systems and binary mixtures of spherocylinders. The objective of this study was to verify the influence of the particle size distribution, shape (i.e. the aspect ratio), and rotation factor, ε_r, on the packing density and the orientational order of the system, represented by the nematic order parameter, S.
3.1 Monodisperse Systems
In this section the packing fraction of the monodisperse beds of spherocylinders is studied using the FB algorithm. Each simulation was performed in a cubic container with periodic boundary conditions. We used systems of N particles of aspect ratios, γ, ranging from 0 (spheres) to 80 (very long rods). The value
Fig. 1. Sample packings of spherocylinders (a) γ = 0.4, (b) γ = 2, (c) γ = 10, (d) γ = 80
of N was 2000 for γ ≤ 20, but it had to be increased to 6000 for γ = 40 and to 15000 for γ = 80 due to the periodic boundary condition requirement that the size of any particle cannot exceed half of the computational box length, L. For monodisperse packings the number of particles required, N, can be estimated on the basis of the nominal packing density, η_0, which can never be exceeded by the actual volume fraction. For details on the meaning of the nominal and actual packing densities see [6] [14]. Additionally, if d_0 is the diameter of the spherocylinder corresponding to η_0, its volume, v_0, is given by

v_0 = \frac{π d_0^3 (1.5γ + 1)}{6}.     (14)
Fig. 2. Dependence of the packing density on the aspect ratio for different values of ε and ε_r

Fig. 3. Dependence of the packing density on the aspect ratio for ε = 0.1 and ε_r = 0.02. The inset is a blowup of the upper left corner of the main figure.
Consequently, the packing density, η_0, is

η_0 = N v_0 / L^3.     (15)

From the periodic boundary condition

d_0 (γ + 1) < \frac{1}{2} L,     (16)

it is easy to obtain

N > \frac{48 (γ + 1)^3 η_0}{π (1.5γ + 1)}.     (17)
We start from an isotropic configuration in which the particle centers are uniformly distributed in the box and the orientations are taken uniformly from the unit sphere. By setting ε_r = 0 we disable rotations and ensure that the final orientations are globally isotropic. It is possible, however, that some positional order will appear in the final configuration. When ε_r > 0 rotations are allowed and some orientational order can appear as well. We measure the degree of orientational order by calculating the well-known nematic order parameter, S [23]. The results obtained for spherocylinders depended strongly on the aspect ratio, γ. Images (rendered with the Persistence of Vision Raytracer, povray, version 3.6.1 [24]) of the random packings for several aspect ratios may be seen in Fig. 1. In all cases rotations were allowed (ε_r = 0.02). Fig. 2 shows the dependence of the final packing density on the aspect ratio for different values of the rotation factor, ε_r, while Fig. 3 presents the same dependence for ε = 0.1, ε_r = 0.02, covering a much wider range of aspect ratios. Each point is an average over 10 runs. It is apparent from the figures that the packing density increases with γ up to a certain value. Further increase of γ
Fig. 4. Dependence of the nematic order parameter, S, on the aspect ratio

Fig. 5. Dependence of the packing density of the bidisperse mixture on the aspect ratios, γ_1 and γ_2, for ε = 0.1 and ε_r = 0.02
causes a density decrease. That means that there exists an aspect ratio, γ_m, for which spherocylinders can be packed the densest. This value does not depend on the rotation factor, although obviously the densities obtained for various ε_r are different. The experimental points in Fig. 3 lie on the line φ = 4.5/γ (solid line in the figure) for γ > 10, which is in very good agreement with theory [21]. Fig. 4 shows the development of the orientational order in the system, represented by the nematic order parameter, S, vs. γ and ε_r. As could be expected, the nematic order parameter is very small (below 0.02) for ε_r = 0, since in this case no rotations are allowed and the directional order remains random. For ε_r = 0.02 and ε = 0.1 the values of S are only slightly higher, but for ε_r = 0.02 and ε = 0.4, S reaches almost 0.08. This is the effect of much faster movement, which enables spherocylinders of similar orientation to group. This can be observed in figures not shown in this paper.
3.2 Bidisperse Mixtures
Aiming at verifying the effect of particle shape on the packing density and ordering of binary mixtures of spherocylinders, FB simulations were carried out using two species of particles with different aspect ratios. Each species is composed of a certain number of particles (N_1 and N_2, respectively) with a specific aspect ratio (γ_1 and γ_2). In order to focus on shape effects, all the simulated particles have the same volume. Fig. 5 shows the dependence of the total packing density of the binary mixture on the aspect ratios, γ_1 and γ_2. It can be observed that the highest density is attained for γ_1 = γ_2 = 0.4 (the monodisperse case). The shape of the function φ(γ_1) is similar for all values of γ_2 used, but for γ_2 = 0.4 the density is much higher than for the other values. Also, only for γ_2 = 0.4 is there a characteristic peak of density.
Fig. 6. Sample packings of bidisperse mixtures for ε = 0.1 and εr = 0.02 (a) γ1 = 0.4, γ2 = 2 (b) γ1 = 0, γ2 = 4
Fig. 6(a) shows a sample packing of a binary mixture with γ_1 = 0.4 and γ_2 = 2, while in Fig. 6(b) γ_1 = 0 (spheres) and γ_2 = 4.
4 Conclusions
The force-biased algorithm for obtaining granular packings has been adapted to handle spherocylinders out to very large aspect ratios. The results for the spherocylinders reproduce existing experimental results for all available aspect ratios [21]. The volume fractions of the long spherocylinders confirm the prediction that the random packing density of thin rods is inversely proportional to the aspect ratio. The agreement is excellent for γ above 10. Our simulation results also agree fairly well with the available experimental densities. For a comparison of simulation and experimental packing densities see Fig. 6 of [25]. Most experiments presented relate to granular rods or fibers made from a variety of materials such as wood, metal wire, and raw spaghetti. It is believed that the scatter of the experimental densities is due to factors such as wall effects, friction, local nematic ordering, and particle flexibility [25]. The random sphere packing density turns out to be a local minimum: the highest density occurs at an aspect ratio of γ ≈ 0.4. The practical implication is that a small deviation in shape from spherical may increase the random packing density significantly without crystallization. It is clear that a polydisperse system of spheres packs more densely than a monodisperse system. For equally sized closely packed spheres the interstices between the particles are small enough that no additional spheres can be placed in them. If the system is made more and more polydisperse, the smaller spheres may be placed where the larger ones previously could not. Perturbing the particle shape from spherical has a
similar effect: a short spherocylinder that may not fit when oriented in a given direction may fit when the orientation is changed. Finally, our simulations clearly show that particles with a given aspect ratio have a unique random packing density: the Bernal sphere packing can be generalized to spherocylinders of arbitrary aspect ratio with one and the same simulation method. This indicates that these packings all follow the same geometrical principle. The parameters of the FB algorithm strongly influence the final packing density of a given mixture as well as the orientational ordering of the resulting bed. Careful choice of these parameters is crucial for the efficiency of the algorithm and the properties of the resulting packings. As long as ε_r = 0 the bed remains isotropic. When ε_r > 0 but ε is small (particles are not allowed to move too quickly) the bed is only slightly more ordered. However, when we increase ε, the orientational order quickly rises, producing packings in the nematic phase. It should be possible to study other particle shapes such as ellipsoids and disks using this method. The only issue here is to find an effective algorithm to calculate the distances between particles of a given shape. Otherwise the adaptation is straightforward. Since there are many computational and experimental studies concerning packings of ellipsoids [17] [18] [20], this is the most probable next step in this research. The packing of disks also presents an interesting problem. Nevertheless, the geometry of a disk, though simple, shows nontrivial difficulties in the calculations. Acknowledgement. Partial support of the AGH Grant No. 11.11.120.777 is gratefully acknowledged.
References 1. Bernal, J.D.: A geometrical approach to the structure of liquids. Nature 183, 141– 147 (1959) 2. Scott, G.D., Kilgour, D.M.: The density of random close packing of spheres. Brit. J. Appl. Phys. 2, 863–866 (1964) 3. Finney, J.L.: Random packing and the structure of simple liquids. Proc. Roy. Soc. A 319, 470–493 (1970) 4. Jodrey, W.S., Tory, E.M.: Computer simulation of isotropic, homogeneous, dense random packing of equal spheres. Powder Technol. 30, 111–118 (1981) 5. Jodrey, W.S., Tory, E.M.: Computer simulation of close random packing of equal spheres. Phys. Rev. A 32, 2347–2351 (1985) 6. Mo´sci´ nski, J., Bargiel, M., Rycerz, Z.A., Jacobs, P.W.M.: The force-biased algorithm for the irregular close packing of equal hard spheres. Molecular Simulation 3, 201–212 (1989) 7. Bargiel, M., Tory, E.M.: Packing fraction and measures of disorder of ultradense irregular packings of equal spheres. II. Transition from dense random packing. Advanced Powder Technology 12(4), 533–557 (2001) 8. Torquato, S., Truskett, T.M., Debenedetti, P.G.: Is random close packing of spheres well defined? Phys. rev. Letters 84(10), 2064–2067 (2000)
9. Donev, A., Torquato, S., Stillinger, F.H., Connelly, R.: Jamming in hard sphere and disk packings. J. Appl. Phys. 95(3), 989–999 (2004) 10. Donev, A., Torquato, S., Stillinger, F.H., Connelly, R.: A linear programming algorithm to test for jamming in hard sphere packings. J. Comp. Phys. 197(1), 139–166 (2004) 11. Kansal, A.R., Torquato, S., Stillinger, F.H.: Diversity of order and densities in jammed hard-particle packings. Phys. Rev. E 66, 41109 (2002) 12. Lubachevsky, B.D., Stillinger, F.H.: Geometric properties of random disk packing. J. Stat. Phys. 60, 561–583 (1990) 13. Lubachevsky, B.D., Stillinger, F.H., Pinson, E.N.: Disks vs. spheres: Contrasting properties of random packings. J. Stat. Phys. 64, 501–525 (1991) 14. Mościński, J., Bargiel, M.: C-language program for simulation of irregular close packing of hard spheres. Computer Physics Communication 64, 183–192 (1991) 15. Bargiel, M., Tory, E.M.: Packing fraction and measures of disorder of ultradense irregular packings of equal spheres. I. Nearly ordered packing. Advanced Powder Technol. 4, 79–101 (1993) 16. Zinchenko, A.Z.: Algorithm for random close packing of spheres with periodic boundary conditions. J. Comp. Phys. 114(2), 298–307 (1994) 17. Donev, A., Torquato, S., Stillinger, F.H.: Neighbor list collision-driven molecular dynamics simulation for nonspherical particles. I. Algorithmic details. Journal of Computational Physics 202, 737–764 (2005) 18. Donev, A., Torquato, S., Stillinger, F.H.: Neighbor list collision-driven molecular dynamics simulation for nonspherical particles. II. Applications to ellipses and ellipsoids. Journal of Computational Physics 202, 765–773 (2005) 19. Bezrukov, A., Bargiel, M., Stoyan, D.: Statistical analysis of simulated random packings of spheres. Part. Part. Syst. Charact. 19, 111–118 (2002) 20. Donev, A., Cisse, I., Sachs, D., Variano, E.A., Stillinger, F.H., Connelly, R., Torquato, S., Chaikin, P.M.: Improving the density of jammed disordered packings using ellipsoids. Science 303, 990–993 (2004) 21. Abreu, C.R.A., Tavares, F.W., Castier, M.: Influence of particle shape on the packing and on the segregation of spherocylinders via Monte Carlo simulations. Powder Technology 134, 167–180 (2003) 22. Vega, C., Lago, S.: A fast algorithm to evaluate the shortest distance between rods. Computers and Chemistry 18(1), 55–59 (1994) 23. Allen, M.P., Evans, G.T., Frenkel, D., Mulder, B.M.: Hard Convex Body Fluids. Advances in Chemical Physics 86, 1–166 (1993) 24. http://www.povray.org 25. Williams, S.R., Philipse, A.P.: Random packings of spheres and spherocylinders simulated by mechanical contraction. Phys. Rev. E 67, 51301 (2003)
Real-Time Illumination of Foliage Using Depth Maps
Jesus Gumbau, Miguel Chover, Cristina Rebollo, and Inmaculada Remolar
Dept. Lenguajes y Sistemas Informaticos, Universitat Jaume I, Castellon, Spain
{jgumbau,chover,rebollo,remolar}@uji.es
Abstract. This article presents a new method for foliage illumination which takes into account direct illumination, indirect illumination and self-shadowing. Both indirect illumination and self-shadowing are approximated by means of a novel technique using depth maps. In addition, a new shadow casting algorithm is developed to render shadows produced by the foliage onto regular surfaces, which enhances the appearance of this kind of shadows compared to traditional shadow mapping techniques. Keywords: Foliage rendering, illumination, shadows, ambient occlusion, shaders, real-time, tree rendering.
1 Introduction
The representation of natural scenes has always been a problem due to the high amount of detail involved in the rendering process. Even a single tree or plant contains too much detail to be displayed efficiently on current graphics hardware. To be able to render this kind of scene, a number of different modelling techniques have been developed that represent approximations of the tree with other rendering primitives that are more efficient for real-time use [1][2][3]. However, improvements in graphics hardware over the last years have made possible the development of multiresolution tree models based on geometry [4][5]. These continuous multiresolution LOD models enable us to reduce the geometric complexity of a tree, so that the time needed to display it is also reduced in a significant way. Due to the scattered nature of the foliage, the standard Phong model is unable to illuminate this kind of tree model realistically, because when simply using a standard local illumination scheme the density of the leaves is lost, as well as the real appearance of the tree. On the other hand, traditional radiosity and global illumination solutions, which are suitable for foliage rendering, are very expensive to calculate in real time. Therefore, special methods for foliage lighting and shading are needed (such as [6]). This work presents a new method for foliage lighting, shading and shadowing that provides good quality illumination in real time while keeping acceptable frame rates. Moreover, it also provides a shadow casting algorithm of the foliage onto regular surfaces which captures the sense of depth of the foliage.
This document is organized as follows. Sect. 2 presents the state of the art in illumination and shadowing, focusing on leaves. Sect. 3 introduces the contribution presented in this paper. After that, Sect. 4 shows the results achieved with our method. Finally, Sect. 5 discusses some aspects of the new method.
2 Related Work
2.1 Illumination
There are some methods to simulate global illumination in real time, such as Precomputed Radiance Transfer [7]. This method uses spherical harmonics to capture low-frequency illumination scenarios (including soft shadows and interreflections of objects). On the other hand, [8] uses a geometry instantiation system and precise phase functions for hierarchical radiosity in botanical environments. Mendez et al. [9] introduce Obscurances as a method to simulate diffuse illumination by considering neighbour light contributions instead of the global ones. Ambient Occlusion [10] enhances the illumination of an object by determining the light visibility of each part of the object, in such a way that the more occluded an object point is, the less light it will receive from the exterior. Other authors [11] adapt [10] to the GPU so that the ambient occlusion is computed directly in the fragment shader. Another approach for real-time illumination of trees is [12]. In this work, ellipsoidal occluders that describe the shape of the tree are evaluated at run time. Reeves and Blau [13] present a tree rendering algorithm which also takes into account lighting and shadowing. The method is based on particle systems. The relative position of each particle inside the tree is used to approximate the illumination and shadowing at a given point. Jensen et al. [14] present a model for subsurface light transport which is useful for translucent objects such as leaves. Franzke et al. [15] introduce an accurate plant rendering algorithm using [14] as a leaf illumination method and improving it for leaf rendering. Finally, [6] presents an expressive illumination technique for foliage. It calculates implicit surfaces that approximate the general shape of the foliage. The implicit surfaces are used both for estimating the global illumination coefficient at a given point and for realigning leaf normals to calculate the diffuse reflection.
2.2 Shadows
Williams introduced shadow mapping for general meshes in 1978 [16]. Although this method is highly suitable for the graphics hardware, its main drawbacks are aliasing and its memory consumption. Thus many authors have suggested their own approaches to solve this problem. Adaptive Shadow Maps (ASM) [17] reduce aliasing by storing the shadow map as a hierarchical grid. This allows for huge memory savings, but it is not graphics-hardware friendly because of its hierarchical structure. Arvo [18]
proposes to use a tiled grid data structure to tessellate the light's viewport, as a simplified version of ASM. Each cell in this grid contains a sampling density depending on a heuristic analysis. There are some perspective parametrizations that maximize the area occupied by shadow casters if they are near the observer. This allows for rendering high quality shadows near the camera at the cost of losing detail, but not quality, at points that are far away from the observer. The most representative shadowing methods that use this scheme are [19][20][21]. Parallel Split Shadow Maps [22] use a similar approach, but they treat the continuous depth range as multiple depth layers. This allows for better utilization of the shadow map resolution. As can be seen, there are quite a few methods on this topic, but there are no specialized shadow casting methods for foliage that take into account the leaf structure and its scattered nature.
3 Contribution
This paper presents a method to handle the illumination of foliage, taking into account both direct and indirect illumination contributions as well as the auto-occlusion information of the foliage itself. Moreover, the method also handles the shadow projection of the foliage onto other surfaces. Our approach is based on the rendering equation introduced by Kajiya [23], described as follows:

L_o(x, w) = L_e(x, w) + \int_Ω f_r(x, w', w) L_i(x, w') (w' \cdot n) \, dw'     (1)
where the amount of light irradiating from an object, L_o(x, w), at a given point x and direction w depends on the light the object emanates, L_e(x, w), which we can ignore because leaves do not emit light, and on the incoming light L_i(x, w'). Light incoming from all directions is modulated by the angle of incidence of the light onto the surface, w' · n, and by the BRDF f_r(x, w', w) that describes the reflectance function, which depends on the material properties. This paper discusses how to implement each part of Eq. 1 to provide realistic illumination of the foliage in real time. Finally, a new shadow casting algorithm is introduced which takes into account leaf density information along a given light direction to render realistic foliage shadows over other surfaces, such as the ground or the trunk.
3.1 Indirect Illumination Contribution
Although Eq. 1 takes into account the light incoming from all directions, evaluating all light directions would be very expensive. Thus, we apply our BRDF calculations only over the light direction which comes directly from the light source, separating the direct from the indirect light contributions. Therefore, we obtain the following formula for light irradiance at a given point x and direction
w, where the direct light contribution is separated from the indirect lighting A_Ω(x):

L_o(x, w) = A_Ω(x) + f_r(x, w', w) L_i(x, w') (w' \cdot n)     (2)
The proposed method replaces the A_Ω term in Eq. 2 by a new indirect lighting algorithm specifically designed for the foliage. The indirect illumination of our model is calculated as a preprocessing step. The amount of light a leaf receives from the scene is calculated as the visibility of the leaf from the exterior of the foliage. Fig. 1 shows the results of applying our indirect lighting algorithm.
(a) Tree foliage without illumination.
(b) Indirect lighting affecting the foliage.
Fig. 1. Results of our indirect lighting approach
The ambient light received by a leaf depends on the visibility of each leaf from the exterior of the foliage. As all leaves on the tree are of the same size, the visibility value for each leaf is compared with the visibility value of a single unoccluded leaf, which is given by the orthogonal projection of the leaf over a known virtual viewport. This has been implemented by rendering each leaf with a unique colour with the depth buffer activated. Thus, the visibility of each leaf from outside the tree is given by the number of pixels of the same colour on the six faces of a cubemap surrounding the foliage. To represent the colour of the ambient light affecting each leaf, a texture read operation is performed over a downsampled cubemap that contains the environment of the tree. The normal of each leaf face is used as texture coordinates to fetch this data from the cubemap. Fig. 2 shows a tree illuminated using only indirect illumination under different ambient light colours of the scene. These visibility calculations are performed per face, because each face of a single leaf can receive a different amount of light with a different colour, depending on the scene and on the direction the leaf is facing. However, light tends to spread across and through the leaf, depending on its translucency. Thus, the transparency level of the leaf is used to add the light reception values from both
Fig. 2. Different views of a tree with different ambient light contributions. From left to right: white, reddish and yellow light.
sides of the leaf. Thus, the light scattering property of the leaves is taken into account to calculate the ambient occlusion term. Finally, the ambient occlusion colour A_Ω for a given face of each leaf i is calculated as shown in Eq. 3:

A_Ω = [C_i I_i (V_i/V)^n] + α [C_i' I_i' (V_i'/V)^n]     (3)
where α is the transparency of the leaf, I_i is the colour the current face of the leaf i absorbs from the scene, I_i' is the colour the opposite face of the leaf absorbs from the scene, C_i and C_i' represent the colours of the front and opposite faces of the leaf i, V_i and V_i' are the number of pixels generated by the current and opposite faces of the triangle i, respectively, and V is the number of pixels generated by the projection of a reference leaf without occlusion. The parameter n is always positive and controls how rapidly the darkening of the ambient term occurs in the foliage. Values of n > 1 will result in a more rapid darkening and values of n < 1 will cause the darkening to slow down from the outer to the inner parts of the foliage. The results of this process are stored per vertex so that they can be applied per leaf at run time.
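A possible CPU-side sketch of this preprocessing step for a single leaf is given below in Python; the pixel counts are assumed to come from the unique-colour cubemap renders described above, and all names and the data layout are ours.

    def leaf_ambient_occlusion(C_front, C_back, I_front, I_back,
                               V_front, V_back, V_ref, alpha, n=1.0):
        # Eq. (3): RGB ambient colour for the front face of one leaf.
        # C_*: face colours, I_*: scene colours fetched from the cubemap,
        # V_*: visible pixel counts, V_ref: pixels of an unoccluded leaf.
        front = [c * i * (V_front / V_ref) ** n for c, i in zip(C_front, I_front)]
        back = [c * i * (V_back / V_ref) ** n for c, i in zip(C_back, I_back)]
        return [f + alpha * b for f, b in zip(front, back)]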
3.2 Direct Illumination and Self-shadowing
Our illumination method has been developed with the nature of the leaves in mind, in order to simulate the complex interaction of the light inside the foliage. A visual analysis of the light interaction with the foliage provides a simple conclusion about this issue: both the inner leaves and those leaves on the opposite side from the light source appear darker. This is caused by the auto-occlusion of the leaves, which prevents the light from reaching those leaves, so they receive less light and appear darker.
Our approach is based on capturing the general shape of the foliage from the light source and illuminating each leaf depending on its position inside the foliage volume, to approximate the self-shadowing of the leaves. We use two depth maps capturing the nearer and further parts of the foliage from the light source, which we will call D_n and D_f. The main idea is that the nearer a leaf is to D_f relative to D_n, the darker it should appear. In addition, another texture C is used to determine the number of leaf intersections per pixel from the light source. This value will be used to determine the leaf density in a given direction.
Fig. 3. Left: foliage without illumination. Middle: ambient lighting only. Right: the complete illumination system, including direct and indirect lighting, self-shadowing and shadows cast over the trunk and branches.
When rendering a leaf at run time, the pixel shader calculates the position of the leaf relative to the light source and computes the adequate texture coordinates to access the depth maps, in the same way as in traditional shadow mapping. The shader compares the depth of the leaf in light space with the minimum and maximum depths at that point to determine its proximity to those values. This value is weighted with the value contained in the texture C, which determines the amount of leaves in that light direction. Eq. 4 shows the formula used to calculate the self-shadowing factor S of the leaf depending on the light source i. This shadowing factor replaces the L_i(x, w') term in Eq. 2:

L_i(x, w') = α N_c \left( 1 − \frac{Z_x − Z_n}{Z_f − Z_n} \right)     (4)

where α is the transparency level of the leaf, N_c is the number of leaf collisions at a certain light direction given by the texture C, Z_x is the depth of the current leaf fragment in light space, and Z_n and Z_f are the minimum and maximum depths in that light direction given by the textures D_n and D_f, respectively.
Therefore, Eq. 4 provides the darkening factor for each pixel depending on the light source direction and the general shape of the foliage volume. The result of this equation corresponds to the light intensity term of Eq. 2. Fig. 3 shows an example of our illumination approach for the foliage. The direct lighting contribution of the foliage is calculated in the following way. Due to the translucent nature of leaves, a subsurface scattering based BRDF is needed to correctly simulate the illumination on the leaves. Jensen et al. [14] propose an efficient method for subsurface scattering which separates the scattering process into a single scattering term L^{(1)} and a diffusion approximation term L_d, as shown in Eq. 5:

L(x, w) = L^{(1)}(x, w) + L_d(x, w)     (5)
Franzke et al. [15] show how Eq. 5 can be approximated as shown in Eq. 6, due to the minimal thickness of the leaves, providing a method that is easier to evaluate in real time:

f_r(x, w', w) = L^{(1)} + L_d = (1 + e^{−s_i} e^{−s_o}) L_i(x_i, w') \cdot (N \cdot w')     (6)
where s_i is the leaf thickness and s_o is a random outgoing distance inside the material from the actual sample position. This approximation matches the f_r(x, w', w) component in Eq. 2 and describes the BRDF associated with the direct illumination. Direct illumination is evaluated in the pixel shader, fetching some parameters, such as leaf thickness and normal information, from textures.
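The following Python pseudo-shader sketches how Eqs. (4) and (6) could be combined per fragment; in the paper this computation runs in a pixel shader, and the texture fetches are abstracted here as plain arguments (names are ours).

    from math import exp

    def leaf_direct_lighting(alpha, Nc, Zx, Zn, Zf, s_i, s_o, N, w_light, light_rgb):
        # alpha: leaf transparency, Nc: leaf count from texture C,
        # Zx/Zn/Zf: fragment, minimum and maximum light-space depths,
        # s_i: leaf thickness, s_o: random outgoing distance,
        # N, w_light: unit normal and light direction, light_rgb: light colour.
        Li = alpha * Nc * (1.0 - (Zx - Zn) / (Zf - Zn))            # Eq. (4)
        n_dot_w = max(0.0, sum(a * b for a, b in zip(N, w_light)))
        scatter = (1.0 + exp(-s_i) * exp(-s_o)) * n_dot_w          # Eq. (6)
        return [scatter * Li * c for c in light_rgb]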
3.3 Shadow Casting Over Other Surfaces
While we have covered the lighting interaction of the leaves, how light penetrates the foliage and reaches other surfaces is also important. The foliage can be seen as a set of multiple layers of translucent leaves. Therefore, the amount of shadowing other objects receive from the foliage depends on how many leaves intersect a light direction and on their transparency. To simulate this, we use the texture C (see Sect. 3.2), which stores the number of leaf intersections along a given direction weighted by the leaf transparency at each point. This texture can be calculated in a single pass and updated along with the other shadow maps. We use this information to render more realistic shadows over surfaces, where the depth of the foliage is taken into account to produce more convincing foliage shadows. Fig. 4 shows the final appearance of our shadow casting algorithm. Notice how the shadow map reflects the depth of the foliage, being more opaque where the leaf density along the light direction is higher and more transparent where it is lower.
Fig. 4. A detailed view of our shadow mapping approach for the foliage. Notice how the depth of the foliage is captured in the shadows.
4 Results
In our tests, we have used a geometry-based continuous level-of-detail algorithm for the foliage. This allows for performance optimizations when rendering a forest scene with such a large number of geometric trees. Fig. 5 shows the results of our illumination solution. Notice how the illumination captures the general shape of the foliage, darkening those parts that are difficult for the light to reach.
Fig. 5. Forest scenes with our illumination and shadowing approach
This method for foliage illumination requires access to three different depth maps per pixel to evaluate the illumination equation. However, this is optimized to require just one texture read by packing all three maps into a single three-channel floating-point texture. Thus the overhead of applying this method is just one texture read and a few arithmetic operations in the pixel shader. Without this method, obtaining proper self-occlusion shadowing of the foliage would still require a shadow map, together with a texture read and some arithmetic operations. Therefore, applying our method adds little overhead in these cases, the main drawback being the need to store three values per texel instead of just one. However, the visual quality of this method for foliage illumination justifies the storage overhead. The cost of casting foliage shadows over the ground or any other surface (such as the trunk) is negligible compared to a standard shadow mapping algorithm, as the only differences are the content of the shadow map and a couple of arithmetic operations in the pixel shader.
5 Conclusions
This paper has presented a foliage illumination approach and an expressive leaf shadow-casting algorithm applicable in real time. The algorithm is based on depth maps. This means that in scenarios where shadow mapping is already being used to simulate the shadows of the foliage as well as the self-occlusion of the leaves, this method will improve the visual quality of the scene at little computational cost. We decided to calculate the ambient occlusion factor as a preprocessing step because of the static nature of trees: a tree always remains in the same place in space. We take advantage of this to accelerate the ambient occlusion calculation by precomputing it and storing it as per-vertex attributes. Thus, the cost of applying the ambient occlusion is negligible. As said before, this method uses depth maps as the basic tool to infer the illumination and to render the shadows, and is therefore compatible with techniques such as trapezoidal, perspective, or light-space perspective shadow maps. It is built on top of existing shadow mapping algorithms, redefining the meaning of the information contained in each texel of the shadow map, so it does not compete with other shadow mapping methods but extends them. Although we have used geometry-based trees in this article, the algorithm is also applicable to image-based or point-based trees, because the information needed to calculate the illumination is stored in a separate map and is not attached to the geometry [6], which is a restriction in real-time rendering. Acknowledgments. This work has been supported by grant P1 1B2007-56 (Bancaixa), the Spanish Ministry of Science and Technology (Contiene Project: TIN2007-68066-C04-02) and FEDER funds.
References
1. Deussen, O., Hanrahan, P., Lintermann, B., Měch, R., Pharr, M., Prusinkiewicz, P.: Realistic modeling and rendering of plant ecosystems. In: SIGGRAPH 1998, New York, NY, USA, pp. 275–286 (1998)
2. Dietrich, A., Colditz, C., Deussen, O., Slusallek, P.: Realistic and Interactive Visualization of High-Density Plant Ecosystems. In: Natural Phenomena 2005, pp. 73–81 (August 2005)
3. Colditz, C., Coconu, L., Deussen, O., Hege, H.: Real-Time Rendering of Complex Photorealistic Landscapes Using Hybrid Level-of-Detail Approaches. In: Conference for Information Technologies in Landscape Architecture (2005)
4. Rebollo, C., Remolar, I., Chover, M., Gumbau, J., Ripollés, O.: A clustering framework for real-time rendering of tree foliage. Journal of Computers (2007)
5. Rebollo, C., Gumbau, J., Ripolles, O., Chover, M., Remolar, I.: Fast rendering of leaves. In: Computer Graphics and Imaging (February 2007)
6. Luft, T., Balzer, M., Deussen, O.: Expressive illumination of foliage based on implicit surfaces. In: Natural Phenomena 2007 (September 2007)
7. Sloan, P.P., Kautz, J., Snyder, J.: Precomputed radiance transfer for real-time rendering in dynamic, low-frequency lighting environments. In: SIGGRAPH 2002, New York, USA, pp. 527–536 (2002)
8. Soler, C., Sillion, F., Blaise, F., Dereffye, P.: An efficient instantiation algorithm for simulating radiant energy transfer in plant models. ACM Trans. Graph., 204–233 (2003)
9. Méndez, A., Sbert, M., Catá, J.: Real-time obscurances with color bleeding. In: SCCG 2003, New York, USA, pp. 171–176 (2003)
10. Pharr, M., Green, S.: Ambient Occlusion (2004)
11. Bunnell, M.: Dynamic Ambient Occlusion and Indirect Lighting (2005)
12. Hegeman, K., Premoze, S., Ashikhmin, M., Drettakis, G.: Approximate ambient occlusion for trees. In: Sequin, C., Olano, M. (eds.) SIGGRAPH 2006, ACM SIGGRAPH, New York (March 2006)
13. Reeves, W., Blau, R.: Approximate and probabilistic algorithms for shading and rendering structured particle systems. In: SIGGRAPH 1985, New York, USA, pp. 313–322 (1985)
14. Jensen, H.W., Marschner, S.R., Levoy, M., Hanrahan, P.: A practical model for subsurface light transport. In: SIGGRAPH 2001 (2001)
15. Franzke, O.: Accurate graphical representation of plant leaves (2003)
16. Williams, L.: Casting curved shadows on curved surfaces. In: SIGGRAPH 1978, New York, USA, pp. 270–274 (1978)
17. Fernando, R., Fernandez, S., Bala, K., Greenberg, D.P.: Adaptive shadow maps. In: SIGGRAPH 2001, New York, USA, pp. 387–390 (2001)
18. Arvo, J.: Tiled shadow maps. In: Proceedings of Computer Graphics International 2004, pp. 240–247 (2004)
19. Stamminger, M., Drettakis, G., Dachsbacher, C.: Perspective shadow maps. In: Game Programming Gems IV (2003)
20. Wimmer, M., Scherzer, D., Purgathofer, W.: Light space perspective shadow maps (June 2004)
21. Martin, T., Tan, T.S.: Anti-aliasing and continuity with trapezoidal shadow maps. In: Rendering Techniques, pp. 153–160 (2004)
22. Zhang, F., Sun, H., Xu, L., Lun, L.K.: Parallel-split shadow maps for large-scale virtual environments. In: VRCIA 2006, pp. 311–318 (2006)
23. Kajiya, J.T.: The rendering equation. In: SIGGRAPH 1986, pp. 143–150 (1986)
On-Line 3D Geometric Model Reconstruction H. Zolfaghari1 and K. Khalili2 1
Computer Engineering Department, Islamic Azad University, Birjand Branch, Birjand, Iran
[email protected] 2 Mechanical Engineering Department, University of Birjand, Iran
[email protected]
Abstract. Triangulation techniques along with laser technology have been widely used for 3D scanning; however, 3D scanning of moving objects and on-line modeling have received less attention. The current work describes a system developed for on-line CAD model generation from scanned data. The system uses a structured laser light pattern and a digital CCD camera to generate and capture images from which range data is extracted. The data is then analyzed to reconstruct the geometric model of the moving object. To exploit the potential of a geometric modeler, the model is represented in commercial CAD software. To reduce errors the system employs on-line calibration, a light distribution correction algorithm, camera calibration, and subpixeling techniques. A data reduction scheme has also been incorporated to eliminate redundant data. Keywords: 3D Modeling, 3D Scanning, Surface Reconstruction, On-line Model Generation.
1 Introduction
3D scanning is a subject of interest to many researchers. During the past few years much effort has been devoted to technology development and to improving the accuracy and speed of 3D scanning while reducing the cost of such systems. Potential areas of application include reverse engineering, product development, quality control, and 3D photography. The general procedure in 3D scanning consists of scanning an object, usually with a range sensor, merging several data sets into a single registered data set, and representing the data set in a computer representation form, such as a mesh or a set of surfaces. Generally, the methods of measurement of three-dimensional geometry are divided into two main classes, i.e., tactile measurement and optical measurement [1]. Contact methods based on mechanical sensors have been used widely for accurate industrial applications; however, the accuracy, speed, cost, and contact nature of such systems are of concern [2]. Non-contact methods use different technologies, amongst which optical methods, including active and passive systems, are the most widely used. Triangulation techniques along with laser technology provide a relatively accurate, fast, and inexpensive system for 3D scanning. Passive stereo vision is attractive in being a passive method, but current systems lack the accuracy required for most industrial applications.
Currently, many 3D scanners use a structured light technique with a laser light source. Structured light is the projection of a light pattern, such as a plane, grid, or more complex shape, at a known angle onto an object. Fanning a beam of light out into a sheet of light is widely used in 3D scanning. When the sheet of light intersects the object of interest, a bright line of light can be seen on the surface of the object. By viewing this line of light, the observed deviation of the line from the base line can be translated into height variations [3]; Fig. 1.
Fig. 1. Active optical triangulation technique
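The height-from-deviation relation of this kind of active triangulation setup can be sketched as follows; the simple geometry used here (height = deviation divided by the tangent of the camera-laser angle, after converting pixels to millimetres) is a common textbook form and only an assumption about the configuration, not the authors' exact calibration model.

```python
import math

def height_from_deviation(dev_px, mm_per_px, cam_laser_angle_deg):
    """Convert an observed laser-line deviation (pixels) into a surface
    height (mm) using a simple active-triangulation geometry."""
    dev_mm = dev_px * mm_per_px
    return dev_mm / math.tan(math.radians(cam_laser_angle_deg))

# Example: a 12-pixel deviation, 0.1 mm per pixel, 30 degrees between camera axis and laser sheet
h = height_from_deviation(12, 0.1, 30)
```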
Although 3D scanning has received great attention from the research community, little work has been carried out on on-line CAD model generation of moving objects. Potential applications include 3D modeling in continuous production lines, such as extruded products [4]. This paper deals with on-line CAD model generation from scanned data. The system developed uses a structured laser light pattern and a digital CCD camera to generate and capture images from which range data is extracted. The data is then analyzed to reconstruct the geometric model of the moving object. Finally, the model is represented in commercial CAD software.
2 Conceptual Design
In the scanning process a set of 3D points from the measured object surface is acquired and then transformed by software into a geometric model of the object. The accuracy of the model depends primarily on the resolution of the scanning, i.e., the distance between adjacent scanned points. Moreover, the mapping process, in which the 3D coordinates of the object are transformed into the 2D coordinates of the image, has to be precise and accurate. Factors such as the resolution of the imaging device, the thickness and, more importantly, the sharpness of the light line and its uniformity, the adjustment and calibration of the light source and the imaging device, environmental factors, object color and texture, and the data extraction algorithm
affect the accuracy of the acquired data. The system has to be designed such that the required final accuracy is obtained. The current work describes an inexpensive 3D scanner using off-the-shelf components and software. In the system design emphasis is given to modularity, use of off-the-shelf components, full use of the potential of existing commercial CAD software, flexibility, and reduced cost. The designed system contains four modules: a lighting module, an imaging module, a processing module, and a display module. The system must be able to generate on-line 3D models of moving objects as they pass under the scanner.
3 Previous Works
Three-dimensional scanning has received great attention in the research community, including both contact and non-contact methods. The Coordinate Measuring Machine (CMM) is the most accurate mechanical measuring machine and has been in use for decades; many commercial systems are currently on the market. CMMs are mainly used for dimensional quality control, but they are also used for 3D scanning, although the process is very time consuming. Non-contact methods, especially optical methods including passive and active techniques, are used widely, and research towards increased accuracy and speed, reduced cost, and ease of use is in progress. Son et al. [1] developed a 3D scanner using structured light. They tried to automate the whole process and eliminate the need for manual operations. Lombardo et al. [3] developed a time-of-scan triangulation technique. An accuracy of 100 μm, a working distance of 20 cm and a very limited range of 10 mm have been reported. The accuracy of their system depends on the instability of the rotation speed of the mechanical scanner, which affects the measurement of the scanning time. Zhang and Wee [5] developed a novel calibration approach to improve the accuracy of the system, applying a back-propagation (BP) neural network to the calibration of structured-light 3D vision inspection. An accuracy of 0.083 mm has been reported. Zexiao et al. [6] developed a multi-probe 3D measurement system based on a CMM with structured-light scanning capability, to increase the capabilities and flexibility of the CMM. A comprehensive review of range sensor development has been carried out by Blais [7], to which readers are referred. There are also several methods for modeling an object surface; for a review of modeling methods see [8].
4 System Development
The lighting module includes a UHL 5-10G-G35-90 laser diode with a visible 635 nm wavelength. A cylindrical lens is used to fan the spot beam out into a single sheet-of-light pattern. The adjustment of the light source is crucial to the accuracy of the final result. To allow adjustment of the light source, the laser head was mounted on a robot wrist. The robot had 5 degrees of freedom (dof), including 2 dof in the wrist. This allowed centering the light line and fine adjustment of the laser head orientation. To obtain the orientation of the laser head, the method developed in [10] was used.
The light had a Gaussian distribution, resulting in an uneven line thickness when imaging, as modeled by Eq. (1):

$$f(x; \mu, \beta) = \sqrt{\frac{\beta}{2\pi}}\; x^{-3/2} \exp\!\left[\frac{-\beta (x - \mu)^2}{2\mu^2 x}\right], \quad x > 0; \qquad f(x; \mu, \beta) = 0 \;\text{elsewhere} \qquad (1)$$
The line of light was corrected by applying the inverse distribution function. This provided a relatively uniform line pattern. The imaging module uses a Canon A640 digital camera. The camera is controlled via computer to take pictures at set time intervals. The camera is mounted above a conveyor belt. The object is positioned on the conveyor belt and moved along at a known and adjustable speed; Fig. 2. The resolution of the picture along the movement direction is directly determined by the time interval between captures and the speed of the conveyor, and these have to be known precisely.
Fig. 2. The laboratory system developed
Prior to use, the camera was calibrated using the Tsai method [9]. The processing module uses the MATLAB environment to preprocess and process the data. The steps are as follows:
− Image capture (Fig. 3)
− Image preprocessing and processing, including:
  − Grey level conversion
  − Object detection
  − Edge detection (Fig. 4)
  − Image enhancement
  − Image segmentation
  − Line thinning
  − Base-line detection and construction (Fig. 5)
  − Range calculation
  − On-line calibration
− Data point formatting (DXF format)
− Surface reconstruction (Fig. 6)
− Data export
− Model display
As the color information is not required, and to reduce the data volume, the color image is converted to a grey level image. It is further converted to a binary image, but the grey level values are kept for downstream use. Salt-type noise is removed using an erosion technique. A structuring element (SE) of size 10 by 10 was used; the SE size is important so that noise is eliminated but data points are not. The width of the line is the distance between its rising and falling edges. To locate the position of the line more accurately, a subpixeling technique was used. To exploit the accuracy of subpixeling, the grey levels of the detected edges and of the inner pixels were retrieved. Using subpixeling, the centre of the line was calculated and its pixels were taken as data points; the line was thus thinned to a width of a single pixel. The base line was detected with the same algorithm, and a straight line was curve-fitted to the base line points. By calculating the distance between the data points and the base line, and knowing the camera–light source angle, the depth information was extracted using the triangulation technique. Using the on-line calibration developed in [4], the depth information was transformed into physical depth data.
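A compact sketch of the subpixel line-centre extraction described above: for each image column, the line centre is estimated as the grey-level-weighted centroid between the detected edges, and the deviation from the fitted base line is turned into depth by triangulation. Function and variable names are illustrative, and the geometry factor is an assumption, not a value from the paper.

```python
import numpy as np

def line_center_subpixel(column, lo, hi):
    """Grey-level-weighted centroid of the laser line between edge rows lo..hi."""
    rows = np.arange(lo, hi + 1)
    weights = column[lo:hi + 1].astype(float)
    return float((rows * weights).sum() / weights.sum())

def depth_from_center(center_row, baseline_row, mm_per_px, tan_angle):
    """Triangulation: deviation from the base line converted to physical depth."""
    return (center_row - baseline_row) * mm_per_px / tan_angle
```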
5 3D Surface Reconstruction
At the end of the scanning procedure, a set of 3D points $d_i = \{x_i, y_i, z_i\} \in \mathbb{R}^3$ on the surface $S_o$ of the object is obtained. The data set $D = \{d_1, \ldots, d_n\}$ consists of completely unstructured data. The data points $d_i$ are arranged as a 3D matrix in which the x and y values are extracted from the grey image and the depth is extracted using the algorithm explained above. The data
Fig. 3. Object scanning
Fig. 4. Edge detection
Fig. 5. Image after preprocessing
Fig. 6. Surface Reconstruction in AutoCAD
points are arranged in a left-to-right, top-down order. To decrease the amount of data, a data reduction algorithm was employed. The data reduction algorithm eliminates, within a row, the intermediate points with the same depth. The final data points are stored and exported to AutoCAD in DXF format. A 3D mesh was used to reconstruct the initial rough surface. Piecewise 3D meshes of points are generated and, using C1 continuity, the pieces are patched together to generate the whole surface.
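The row-wise data reduction can be sketched as below: within each row, runs of points sharing the same depth are collapsed, so flat stretches are represented by only their boundary points. This is a plausible reading of the description, not the authors' code.

```python
def reduce_row(points, tol=1e-6):
    """Drop intermediate points of a row whose depth equals that of both
    neighbours, keeping the boundary points of each constant-depth run."""
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        # keep 'cur' only if its depth differs from either neighbour
        if abs(cur[2] - prev[2]) > tol or abs(cur[2] - nxt[2]) > tol:
            kept.append(cur)
    kept.append(points[-1])
    return kept

# Example row: the flat segment at depth 5.0 loses its interior point
row = [(0, 0, 5.0), (1, 0, 5.0), (2, 0, 5.0), (3, 0, 7.2)]
reduced = reduce_row(row)
```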
6 Discussion and Conclusion
A simple 3D scanner was developed using off-the-shelf components. The system is capable of generating models with a resolution of 0.5 × 0.1 × 0.25 mm (L × W × H). The model is displayed in AutoCAD automatically. The user can also choose to view the
model within MATLAB. The system benefits from the full potential of the AutoCAD geometric modeling software. The scanning and surface construction are performed automatically. Use of on-line calibration reduced the systematic errors significantly. Although a data reduction scheme was successfully employed, further work is required to improve the process of patching the pieces together and to reduce the data further. The system is used as a stationary single-view system, hence the limitation on generating full actual models. To generate complete 3D models, several scans have to be performed, either by relative motion of the scanner and object or, preferably, by using two scanners at the sides. The system was tested on a simulated tire tread production line where the thickness of the tread has to be measured and displayed.
References
1. Son, S., Park, H., Lee, K.H.: Automated Laser Scanning System for Reverse Engineering and Inspection. International Journal of Machine Tools & Manufacture 42, 889–897 (2002)
2. Zhao, D., Li, S.: A 3D Image Processing Method for Manufacturing Process Automation. Computers in Industry 56, 975–985 (2005)
3. Lombardo, V., Marzulli, T., Pappalettere, C., Sforza, P.: A Time-Of-Scan Laser Triangulation Technique for Distance Measurements. Optics and Lasers in Engineering 39, 247–254 (2003)
4. Khalili, K., Nazemsadat, S.M.: Improved Accuracy of On-Line Tire Profile Measurement Using a Novel On-Line Calibration. In: Proc. of the 7th IASTED International Conference on Visualization, Imaging, and Image Processing, Spain, pp. 123–128 (2007)
5. Zhang, Y.: Research into the Engineering Application of Reverse Engineering Technology. Journal of Materials Processing Technology 139(1-3), 472–475 (2003)
6. Zexiao, X., Jianguo, W., Qiumei, Z.: Complete 3D Measurement in Reverse Engineering Using a Multi-Probe System. International Journal of Machine Tools and Manufacture 45(12-13), 1474–1486 (2005)
7. Blais, F.: Review of 20 Years of Range Sensor Development. Journal of Electronic Imaging 13(1), 231–240 (2004)
8. Campbell, R.J., Flynn, P.J.: A Survey of Free-Form Object Representation and Recognition Techniques. Computer Vision and Image Understanding 81, 166–210 (2001)
9. Tsai, R.Y.: A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses. IEEE Journal of Robotics and Automation RA-3(4), 323–344 (1987)
10. Nazemsadat, S.M.: On-Line 3D Scanning. MSc. Dissertation, University of Birjand, Iran (2006)
11. Izquierdo, M.A.G., Sanchez, M.T., Ibanez, A., Ullate, L.G.: Subpixel Measurement of 3D Surfaces by Laser Scanning. Sensors and Actuators 76, 1–8 (1999)
12. Sokovic, M., Kopac, J.: RE (Reverse Engineering) as Necessary Phase by Rapid Product Development. Journal of Materials Processing Technology 175(1-3), 398–403 (2006)
Implementation of Filters for Image Pre-processing for Leaf Analyses in Plantations* Jacqueline Gomes Mertes, Norian Marranghello, and Aledir Silveira Pereira Instituto de Biociências, Letras e Ciências Exatas, Universidade Estadual Paulista, São José do Rio Preto, SP, Brazil
[email protected], {norian,aledir}@ibilce.unesp.br
Abstract. In this work an image pre-processing module has been developed to extract quantitative information from plantation images with various degrees of infestation. Four filters comprise this module: the first one smooths the image, the second one removes the image background enhancing the plant leaves, the third one removes isolated dots not removed by the previous filter, and the fourth one is used to highlight the leaves' edges. At first the filters were tested with MATLAB, for quick visual feedback on the filters' behavior. Then the filters were implemented in the C programming language. At last, the module has been coded in VHDL for implementation on a Stratix II family FPGA. Tests were run and the results are shown in this paper. Keywords: Precision agriculture, hardware description language application, digital image processing, reconfigurable architectures.
1 Introduction
Currently, digital image processing techniques are used to solve a variety of problems. Such techniques can improve image visualization for analysis by humans, as, for example, in the biomedical interpretation of X-ray images. They can also help in recognizing irregular products on a production line in an automated industry [1]. One of the newest uses of image processing techniques came into being with the need for real-time processing. One such use is in the agricultural area, aiming at extracting quantitative information from plantation images with diverse degrees of infestation by various biological agents. In this case a specific image treatment is necessary to help detect and determine the degree of infestation. In agriculture, the control of weeds is of real importance when the cost of herbicides is taken into consideration. With a careful study of the plantation, the amount of herbicides to be applied can be considerably reduced. This implies a reduction in the cost of the *
* This work has been partially supported by the Brazilian National Research Council (CNPq) through a project within the Brazil-IP program, and by the São Paulo State Research Foundation (FAPESP) through travel grant number 2008/01645-7.
final product, as well as preventing unneeded herbicides from being introduced into the environment. Precision agriculture appeared in the late eighties to reduce fertilizer application according to prior chemical analysis of soil composition [2]. With time, new equipment and techniques were developed to improve these studies and maximize productivity. Thus, one of the main objectives of precision agriculture is the adequate handling of the desired culture to maximize productivity and to minimize the impact on the environment. PERNOMIAN reports a work in which weeds are identified in real time using artificial neural nets [3]. YANG et al. present another application of artificial neural nets to weed recognition and classification [4]. DIAS developed an application for rust identification in sugarcane plantations, also using artificial neural nets [5]. In this paper we present a technique based on image filtering algorithms for weed detection in soybean plantations. A hardware module for leaf image pre-processing is developed based on such algorithms. After being described in VHDL, the module is simulated and tested. Then it is implemented onto an FPGA. In the following sections each filter is functionally presented, then the development of the hardware module is described, and some test results are discussed.
2 Development of the Filters
In order to develop the soybean leaf image pre-processing module, four algorithms were elaborated. The module should identify the leaves in the scene and highlight their edges so that, in the future, an artificial neural network (ANN) may use this information to find possible weeds. The image dimension was chosen so as to use the smallest possible number of pixels while still allowing the eventual presence of weeds to be properly identified. When submitted to the ANN, each pixel will be treated by an input node of the net. The smaller the number of pixels, the smaller the total input node count will be. Thus, it is important to keep the number of pixels small so that the implementation of the system comprising the pre-processing module plus the ANN onto an FPGA is viable. Therefore, the images used were restricted to 40x60 pixels, an amount of pixels small enough to be implemented on an FPGA while preserving a good amount of information from the original image. The pre-processing module was developed with the following functionalities:
– noise smoothing in the original image through a median filter;
– separating the leaves from the background through image segmentation;
– removing spots not detected by the previous processing through a custom designed filter; and
– edge enhancement and binarization of the resulting image through a Roberts filter.
At first these filters were implemented in MATLAB, as this software has an easy-to-use language that is also fast to compile, providing prompt results. Then the algorithms were implemented both in C and in VHDL. We had to face the fact that VHDL can only handle one-dimensional array declarations. Thus, the three-dimensional RGB input matrix had to be broken down into three one-dimensional vectors with 2400 positions each. Therefore, both the C and VHDL implementations of the algorithms dealt with the original image as three vectors, one for each fundamental color R, G, and B.
2.1 Filter for Noise Smoothing
The median filter is used to reduce image noise. In this work it is implemented using a 3x3-pixel neighborhood. A 3x3 mask scans the image starting from an initially defined pixel. The pixels within the 3x3 neighborhood of such a pixel are ordered by value, and the pixel with the same number of values below and above it is identified as the median pixel. The mask is then centered on this new pixel. Fig. 1 displays three images that show the diversity of leaves that can be found in soybean crops. In all instances the picture to the left (a) is the original image before being submitted to the filter, and the picture to the right (b) is the corresponding image after being processed by the median filter. In Fig. 1, instances 1 and 3 are soybean leaves and instance 2 is a weed. From these example images one can see the difference among the leaves found in a soybean plantation.
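For reference, a minimal 3x3 median filter over a single channel might look like the following NumPy sketch; the border handling (here: edge replication) is an assumption, since the paper does not specify it.

```python
import numpy as np

def median_filter_3x3(channel):
    """Replace each pixel by the median of its 3x3 neighborhood."""
    padded = np.pad(channel, 1, mode="edge")          # replicate borders
    out = np.empty_like(channel)
    h, w = channel.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + 3, j:j + 3])
    return out

# Example on a 40x60 single-color plane, as used by the module
noisy = np.random.randint(0, 256, (40, 60), dtype=np.uint8)
smoothed = median_filter_3x3(noisy)
```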
Fig. 1. Original images of leaves to the left; to the right leaves resulting from median filtering
2.2 Filter for Background Removal
Image segmentation provides for background removal. This filter removes all background information, such as soil and other particles lying on the ground, ideally leaving only the leaves in the scene. In order to do that, intervals of color values for the background are determined. The image is then scanned one pixel at a time to check whether the pixel's value lies within one of the background intervals. When there is a match the pixel is set to white; it is left unchanged otherwise. Fig. 2(a) displays the resulting images after processing by the background removal filter. The input images for this filtering step are the corresponding numbered images on the right side of Fig. 1.
2.3 Filter for Noise Removal
This filter aims at removing isolated spots from the image which were not detected by the previous filters. To accomplish that it uses a 7x7-pixel mask. While scanning the image with this mask, the number of white pixels within the mask area is counted. We consider the existence of 25 or more white pixels under the mask area as indicative of a background area. Thus, when this count is greater than or equal to 25, the pixel corresponding to the center of the mask is regarded as noise and set to white as well. Fig. 2(b) displays the results of submitting the corresponding images from Fig. 2(a) to the noise removal filter.
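A sketch of this counting rule on a binary mask (True = white/background) is shown below; it assumes zero padding at the image borders, which the paper does not specify.

```python
import numpy as np

def remove_isolated_spots(white_mask, size=7, threshold=25):
    """Set a pixel to white when >= threshold pixels of its size x size
    neighborhood are already white, i.e. the pixel sits in a background area."""
    r = size // 2
    padded = np.pad(white_mask, r, mode="constant", constant_values=False)
    out = white_mask.copy()
    h, w = white_mask.shape
    for i in range(h):
        for j in range(w):
            if padded[i:i + size, j:j + size].sum() >= threshold:
                out[i, j] = True
    return out
```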
Fig. 2. Images resulting from background removal filtering to the left; to the right images resulting from noise removal filtering
2.4 Filter for Edge Enhancing
The goal of this filter is to enhance the edges of the leaves as well as to binarize these edges, making the image suitable for ANN processing. Several options were considered for edge enhancement, with the Roberts filter presenting the best performance and the lowest computational cost. The other options considered were the Sobel, Prewitt, and Nevatia-Babu filters. In Fig. 3, one can observe the images resulting from the application of the edge enhancing filter: the images resulting from the noise removal filtering have their edges enhanced, and the black-and-white format is the result of the binarization process. From Fig. 4 the differences between soybean leaves and weeds can be noted: weed leaves are more elongated, while soybean leaves present a more rounded shape. It is also interesting to note that Fig. 3 shows the final result of the image pre-processing module. We intend to submit this kind of image to an ANN to tell weeds and soybean leaves apart, so that we can determine the amount of herbicides to apply to the crop.
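A possible NumPy rendering of the Roberts edge step followed by binarization is given below; the gradient-magnitude threshold is an assumed parameter, not a value reported by the authors.

```python
import numpy as np

def roberts_edges(gray, threshold=30.0):
    """Roberts cross gradient followed by binarization (True = edge pixel)."""
    g = gray.astype(float)
    gx = g[:-1, :-1] - g[1:, 1:]     # diagonal difference
    gy = g[:-1, 1:] - g[1:, :-1]     # anti-diagonal difference
    magnitude = np.hypot(gx, gy)
    edges = np.zeros(gray.shape, dtype=bool)
    edges[:-1, :-1] = magnitude > threshold
    return edges
```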
Fig. 3. Images resulting from the edge enhancing filter
2.5 Application of the Pre-processing Module in a Real Environment
The module presented above was applied to images of soybean plantations to prove its effectiveness. Such images were collected with the aid of a digital camera, at a height of about 1.5 m, perpendicular to the ground. These images have a dimension of 360x240 pixels. In Fig. 4 the result of the pre-processing carried out on one of the collected images is presented.
Fig. 4. Result of pre-processing: (1) original image; (2) image resulting from median filtering; (3) image resulting from background removal filtering; (4) image resulting from noise removal filtering; and (5) image resulting from edge enhancing filter
3 Composition of the Filters in VHDL
The algorithms presented in the previous section were implemented in VHDL. The integration of these filters into a module that produces a resulting image which can be directly read by an ANN for weed identification requires some adaptation, as well as the creation of auxiliary structures for testing on an FPGA. Fig. 5 illustrates how the module containing the filters was organized for implementation and how its internal logic blocks communicate with each other. In the following we describe each structure comprising the image pre-processing module. From Fig. 5 one can observe the input data, which are composed of the clock, reset and enable signals and the set of pixels of the image. This set of pixels, as previously explained, is comprised of three vectors, namely R, G, and B. These signals reach the FPGA through input ports automatically determined by the Quartus II software during the programming procedure.
Fig. 5. Block diagram for the implementation of the filters in VHDL
The MUX MEMORY structure is a multiplexer for the signals that must be sent to the internal memories. According to the control signals received, the multiplexer switches between the FPGA input ports (corresponding to external data input) and one of the filter outputs (corresponding to internal data input). This multiplexer can be seen in Fig. 6(a).
Fig. 6. Data memory multiplexer (a); and connections of memory blocks (b)
The blocks MEM R, MEM G, and MEM B represent the memories for the storage of the information regarding each image color, and MEM P/B represents the memory that stores the image resulting from the edge enhancing filter. These memories have the same structure; their external connections are presented in Fig. 6(b).
The CONT structure represents an internal counter that is started by the reset signal to count the memory positions used by the input data. The counter is shown in Fig. 7(a). The GEN POS structure represents the pixel position generator used by the median, noise removal, and edge enhancement filters. This is the logic that properly initializes the required mask and moves it by incrementing its position within the corresponding interval. The position generator can be observed in Fig. 7(b).
Fig. 7. Counter of memory positions (a); and pixel position generator (b); position generator multiplexer (c)
The MUX GEN POS structure represents the signal multiplexer of the position generator. It receives signals from the filters and sends to the position generator the information on the position that has to be incremented. The position generator returns the updated position to the multiplexer, which in turn transmits this information to the proper filter. This multiplexer can be observed in Fig. 7(c). The CTRL structure represents the controller for filter enabling. It enables the first filter, the median filter, when it receives a signal from the internal counter indicating that all memories are filled. For the other filters to be enabled, this controller waits until it receives a task completion signal from the corresponding previous filter. This controller is shown in Fig. 8(a). The MUX IN FILTERS structure represents the multiplexer of the signals to be sent to the filters. According to the enabled filter, it determines the data to be sent. This multiplexer is shown in Fig. 8(b). The MUX OUT FILTERS structure represents the multiplexer of the signals that are received from the filters during or after the processing of the image vectors. This multiplexer is shown in Fig. 8(c). Furthermore, Fig. 5 shows four structures named FILTER 1, FILTER 2, FILTER 3, and FILTER 4, corresponding to the median, background removal, noise removal, and edge enhancement filters. Their signals are presented in Fig. 9(a), 9(b), 10(a), and 10(b), respectively. Based on the given description of the interconnection of the sub-system in Fig. 6, the image pre-processing can be described as follows: upon concluding data input, the counter sends the signal ot0 to the controller, which enables the median filter; on finishing its processing this filter sends the signal ot1 to the controller, which enables the background removal filter; this filter processes the required data and then sends the signal
Fig. 8. Signal controller (a); multiplexers of input (b), and output (c) signals for the filters
Fig. 9. Median filter (a); filter for background removal (b)
Fig. 10. Filter for noise removal (a); filter for edge enhancement (b)
ot2 to the controller, which enables the noise removal filter; at last, when the noise removal filter is done with its processing, it sends the signal ot3 to the controller, which finally enables the edge enhancement filter. Upon completion, this filter sends the signal ot4 to the controller to conclude the processing. Before sending the end-of-processing signal (signoff), each filter stores the resulting vectors in the memories represented by the structures MEM R, MEM G and MEM B, except for the edge enhancement filter, which stores the resulting vector in the memory represented by the structure MEM P/B.
4 Conclusions
In this article, filters for the pre-processing of images of soybean plantations were presented. Identification of such images through their edges was possible mainly due to the effective performance of the image segmentation filters and the easy edge detection made possible by the edge enhancement filter. Building on the results presented, we intend to design and train an artificial neural network capable of identifying weeds in images like these. After effective weed detection, it is possible to determine the correct amount of herbicides to be applied to the considered crop, thus reaching the goals of precision agriculture.
References
1. Gonzalez, R.C., Woods, R.E.: Processamento de Imagens Digitais (1993)
2. Whelan, B.M.: Precision Agriculture – An Introduction to Concepts, Analysis & Interpretation. ACPA, Sydney, Australia, p. 154 (2005)
3. Pernomian, V.A.: Identificação de plantas invasoras em tempo real. ICMC, USP, São Carlos (2002)
4. Yang, C.C., et al.: Application of artificial neural networks in image recognition and classification of crops and weeds. Canadian Agricultural Engineering 42(3) (July 2000)
5. Dias, D.N.: Identificação dos sintomas de ferrugem em áreas cultivadas com cana-de-açúcar. ICMC, USP, São Carlos (2004)
Simulation of Multiphysics Multiscale Systems, 5th International Workshop Valeria V. Krzhizhanovskaya1,2 and Alfons G. Hoekstra1 1
Section Computational Science, Faculty of Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands http://www.science.uva.nl/~valeria/SMMS {valeria, alfons}@science.uva.nl 2 St. Petersburg State Polytechnic University, Russia
Abstract. Modeling and Simulation of Multiphysics Multiscale Systems (SMMS) poses a grand challenge to computational science. To adequately simulate numerous intertwined processes characterized by different spatial and temporal scales spanning many orders of magnitude, sophisticated models and advanced computational techniques are required. The aim of the SMMS workshop is to encourage and review the progress in this multidisciplinary research field. This short paper describes the scope of the workshop and gives pointers to the papers reflecting the latest developments in the field. Keywords: Multiphysics, Multiscale, Complex systems, Modeling, Simulation, ICCS, SMMS, Workshop.
1 Introduction to the Workshop
Progress in understanding physical, chemical, biological, sociological, and economic processes strongly depends on the adequacy and accuracy of numerical simulation. All systems important for scientific and industrial applications are inherently multiphysics and multiscale: they involve interactions amongst a wide range of physical phenomena operating at different spatial and temporal scales. Complex flows, fluid-structure interactions, plasma and chemical processes, thermo-mechanical and electromagnetic systems are just a few examples essential for fundamental and applied sciences. Numerical simulation of these multiphysics and multiscale problems requires the development of sophisticated models and methods for their integration, as well as efficient numerical algorithms and advanced computational techniques. To boost scientific cross-fertilization and promote collaboration among the diverse groups of specialists involved, we have launched a series of mini-symposia on Simulation of Multiphysics Multiscale Systems (SMMS) in conjunction with the International Conference on Computational Science (ICCS) [1]. The fifth workshop in this series, organized as a part of ICCS 2008, expands the scope of the meeting from physics and engineering to biological and biomedical applications. This includes computational models of tissue- and organo-genesis, tumor growth, blood vessel formation and interaction with the hosting tissue, biochemical
transport and signaling, biomedical simulations for surgical planning, etc. The topics traditionally addressed by the symposium include modeling of multiphysics and/or multiscale systems at different levels of description, novel approaches to combining different models and scales in one problem solution, advanced numerical methods for solving multiphysics multiscale problems, new algorithms for parallel distributed computing specific to the field, and challenging multiphysics multiscale applications from industry and academia. A large collection of rigorously reviewed papers selected for the workshops highlights modern trends and recent achievements [2]. It shows the progress made in coupling different models (such as continuous and discrete models; quantum and classical approaches; deterministic and stochastic techniques; nano, micro, meso and macro descriptions) and suggests various coupling approaches (e.g. homogenization techniques, multigrid and nested grid methods, variational multiscale methods; embedded, concurrent, integrated or hand-shaking multiscale methods, domain bridging methods, etc.). A number of selected papers have been published in special issues of the International Journal for Multiscale Computational Engineering [3], collecting state-of-the-art methods for multiscale multiphysics applications. Acknowledgments. We would like to thank the participants of our workshop for their inspiring contributions, and the members of the workshop program committee for their diligent work, which led to the very high quality of the conference. The organization of this event was partly supported by the Virtual Laboratory for e-Science Bsik project.
References 1. Simulation of Multiphysics Multiscale Systems, http://www.science.uva.nl/~valeria/SMMS 2. LNCS this volume, 20 papers after this introduction; LNCS V. 4487/2007. DOI 10.1007/978-3-540-72584-8, pp. 755-954; LNCS V. 3992/2006. DOI 10.1007/11758525, pp. 1-138; LNCS V. 3516/2005. DOI 10.1007/b136575, pp. 1-146; LNCS V. 3039/2004. DOI 10.1007/b98005, pp. 540-678 3. Simulation of Multiphysics Multiscale Systems. Special Issues of the International Journal for Multiscale Computational Engineering: V. 4, Issue 2, 2006. DOI: 10.1615/IntJMultCompEng.v4.i2; V. 4, Issue 3, 2006. DOI: 10.1615/IntJMultCompEng.v4.i3; V. 5, Issue 1, 2007. DOI: 10.1615/IntJMultCompEng.v5.i1; V. 6, Issue 1, 2008. DOI: 10.1615/IntJMultCompEng.v6.i1
A Hybrid Model of Sprouting Angiogenesis Florian Milde, Michael Bergdorf, and Petros Koumoutsakos Computational Science, ETH Zürich, CH-8092, Switzerland
[email protected]
Abstract. We present a computational model of tumor induced sprouting angiogenesis that involves a novel coupling of particle-continuum descriptions. The present 3D model of sprouting angiogenesis accounts for the effect of the extracellular matrix on capillary growth and considers both soluble and matrix-bound growth factors. The results of the simulations emphasize the role of the extracellular matrix and the different VEGF isoforms on branching behavior and the morphology of generated vascular networks. Keywords: Blood vessel growth, Sprouting angiogenesis, Computational modeling, Particle-continuum coupling, 3D, Matrix-bound VEGF, Extracellular matrix, Branching.
1 Introduction
Sprouting angiogenesis, the process of new capillaries forming from existing vessels, can be observed in the human body under various conditions. In this work, we focus on tumor-induced angiogenesis, where a tumor in hypoxic conditions secretes growth factors in order to establish its own vasculature, ensuring nutrient and oxygen supply to the tumor cells and leading to increased tumor cell proliferation and enhanced tumor growth. The process of tumor-induced angiogenesis is initiated by tumor cells in conditions of glucose deprivation and hypoxia, with the shortage of oxygen supply triggering the release of angiogenic growth factors. Among the several growth factors known to contribute to the process, Vascular Endothelial Growth Factors (VEGF) have been identified as one of the key components. Upon release from the tumor, VEGFs diffuse through the ExtraCellular Matrix (ECM) occupying the space between the tumor and the existing vasculature and establish a chemical gradient. Once VEGF has reached a vessel, it binds to the receptors located on Endothelial Cells (EC), which line the blood vessel walls. This binding sets off a cascade of events triggering the outgrowth of new vessel sprouts at the existing vasculature near the tumor. While endothelial cell proliferation is confined to a region located behind the sprout tip, endothelial tip cells located at the sprouting front migrate through the ECM, thus defining the morphology of the newly formed vasculature. Migrating tip cells probe their environment by extending filopodia and migrate along the VEGF gradient towards regions of higher concentration, a directed motion referred to as chemotaxis. In addition to the soluble isoform of VEGF, the presence of other VEGF isoforms expressing binding sites
for the ECM has been identified to significantly influence the morphology of capillary network formation [12,9]. These "matrix-bound" VEGF isoforms can be cleaved from the ECM by Matrix MetalloProteinases (MMPs), expressed both by tumors and by migrating ECs. Another component involved in the process of angiogenesis is fibronectin, a glycoprotein distributed in the ECM and at the same time released by migrating tip cells. Fibronectin binds to fibers occupying about 30% of the ECM. Through interaction with transmembrane receptors located on the EC membrane, fibronectin establishes an adhesive gradient which serves as another migration cue for the ECs. This autocrine signaling pathway, promoting cell-cell and cell-matrix adhesion, accounts for a movement referred to as haptotaxis. In addition to the chemotactic and haptotactic cues, the fibrous structures present in the ECM themselves influence cell migration by facilitating movement in the fiber direction. After initial sprouts have extended into the ECM for some distance, repeated branching of the tips can be observed. Sprout tips approaching others may fuse and form loops, a process called anastomosis. Along with anastomosis, the formation of lumen within the strands of endothelial cells establishes a network that allows the circulation of blood. In a final stage, the newly formed vessels mature, establishing a basal lamina and recruiting pericytes and smooth muscle cells to stabilize the vessel walls. An overview of the biological processes involved in angiogenesis can be found in [8,10,11] and references therein. In the following, we propose a mathematical model of sprouting angiogenesis together with the computational methods that implement the equations in 3D. Along with the model, simulation results are shown, underlining the effect of the ECM structure and matrix-bound growth factors on the generated network morphology.
1.1 Computational Modeling of Angiogenesis
Computational models of tumor-induced angiogenesis address a limited number of the involved biological processes. The choice of the modeled processes is dictated by the availability of biological data and by the understanding of the underlying biological processes. In the presented model we consider the motion of the ECs as affected by chemical gradients induced by VEGF, haptotactic gradients induced by fibronectin, and the structure of the ECM. We note that the present assumptions may be more pertinent to in-vitro angiogenesis rather than in-vivo angiogenesis, which is known to depend on the particular microenvironment [15]. In the present work VEGF appears in soluble and matrix-bound isoforms. The soluble VEGF is released from an implicit tumor source and diffuses freely through the ECM. The matrix-bound VEGF isoform is randomly distributed and can be cleaved by MMPs released at the sprout tips. Different VEGF isoforms contribute equally to the migration cues of the ECs (see Fig. 1). Fibronectin is released at sprout tips, establishing a haptotactic gradient for the ECs. In addition, we model the binding of fibronectin to the ECM, which
localises the haptotactic cues. The ECM is explicitly modeled to consist of directed bundles of collagen fibers randomly distributed throughout the domain. A vector field describing the fiber directions modulates the migration velocity of the ECs in the presence of fibers. A summary of work done in the field of modeling angiogenesis can be found in [10]. More recent work includes the influence of blood flow on the process of angiogenesis by Chaplain et al. [6], the model proposed by Sun et al. [14] considering the conductivity of the ECM, and a cell-based model of angiogenesis by Bauer et al. [2]. The present model is the first, to the best of our knowledge, to include a cleaving mechanism and to present simulations in the presence of both VEGF isoforms. The proposed 3D modeling approach combines the continuum representation [1,5,14] with a cell-based approach confined to the migrating tip cells located at the sprouting front. We implement a hybrid approach representing molecular species by their concentrations and migrating EC tip cells by particles. The evolution of the molecular species is governed by reaction-diffusion equations that are discretized on the grid, while a particle approach is employed to model the migrating EC tip cells. The particle and grid descriptions are coupled, as the ECs serve both as a source of fibronectin and MMPs and as a sink for VEGF (binding to the cell surface receptors). As the tip cells migrate through the ECM following the chemotactic and haptotactic cues, they deposit ECs along their way, leaving a trail of endothelial cell density on the grid that defines the 3D vessel structure of the outgrowing sprouts. Filopodia are explicitly modeled to sense chemotactic and haptotactic migration cues, which determine the sprout branching behavior. We report statistics on sprout section length, branching and anastomosis frequency, enabling a quantification of different parametric models and paving the way for future comparisons with experimental work.
2 Vascular Endothelial Growth Factors
Matrix-bound VEGF (bVEGF, $\Psi_b$) does not diffuse and is assumed to be locally distributed on the fibers composing the ECM. bVEGF can be cleaved from the matrix by MMPs ($\chi$) released from migrating ECs. Further, ECs express surface receptors that bind VEGF molecules:

$$\frac{\partial \Psi_b}{\partial t} = -\mathcal{C}(\Psi_b, \chi) - \mathcal{U}(\Psi_b), \qquad (1)$$

with the cleaving function

$$\mathcal{C}(\Psi_b, \chi) = \min(\Psi_b,\; \upsilon_{bV}\, \chi\, \Psi_b), \qquad (2)$$

and the cleaving rate $\upsilon_{bV}$. The uptake function is given by

$$\mathcal{U}[C] = \min([C],\; \upsilon_V \sigma), \qquad (3)$$

with the endothelial uptake rate of VEGF given by $\upsilon_V$ and the endothelial cell density $\sigma$.
Fig. 1. Left: Conceptual sketch of the different VEGF isoforms present in the ECM. Soluble and cleaved VEGF isoforms freely diffuse through the ECM, while matrix-bound VEGF isoforms stick to the fibrous structures composing the ECM and can be cleaved by MMPs secreted by the sprout tips. Right: Conceptual x-z plane through the computational domain. Five sprout tips are initially placed on the y-z plane at the lower end of the domain in the x direction; a tumor source of soluble VEGF is modeled at the upper end in the x direction, outside the computational domain.
Cleaved bVEGF (cVEGF, $\Psi_c$) and soluble VEGF (sVEGF, $\Psi_s$) diffuse through the ECM and are subject to natural decay. Endothelial cell uptake is modeled by the uptake function $\mathcal{U}$:

$$\frac{\partial \Psi_c}{\partial t} = k_V \nabla^2 \Psi_c + \mathcal{C}(\Psi_b, \chi) - \mathcal{U}(\Psi_c) - d_V \Psi_c. \qquad (4)$$
sVEGF creation is implicitly modeled by Dirichlet boundary conditions (Fig. 1):

$$\frac{\partial \Psi_s}{\partial t} = k_V \nabla^2 \Psi_s - \mathcal{U}(\Psi_s) - d_V \Psi_s. \qquad (5)$$

3 Fibronectin
Fibronectin ($\Phi$) is released by the migrating ECs depending on the local fibronectin concentration. We consider fibronectin released by ECs binding to integrins located on the EC membrane and to matrix fibers. Fibronectin diffuses through the ECM when not bound to the matrix and is subject to natural decay:

$$\frac{\partial \Phi}{\partial t} = k_F \nabla^2 \Phi + \gamma_F\, \mathcal{G}(F_{th}, \Phi)\, \Sigma - \upsilon_{bF}(bF_{th} - \Phi) - d_F \Phi, \qquad (6)$$

with the creation function

$$\mathcal{G}(C_{th}, C) = \frac{C_{th} - C}{C_{th}}, \qquad (7)$$

depending on the local fibronectin concentration and the creation threshold level $F_{th}$. The rate of fibronectin binding to the ECM is given by $\upsilon_{bF}$ and limited by $bF_{th}$ to account for binding site saturation in the ECM.
Once fibronectin binds to the ECM, further diffusion is inhibited. The evolution of matrix-bound fibronectin ($\Phi_b$) is given by:

$$\frac{\partial \Phi_b}{\partial t} = \upsilon_{bF}(bF_{th} - \Phi) - d_{bF}\, \Phi_b. \qquad (8)$$

3.1 Matrix-MetalloProteinases
MMPs ($\chi$) cleave the bVEGF isoforms from the binding sites in the ECM and are assumed to be released at the migrating ECs depending on the local MMP concentration. The specific release rate is given by $\gamma_M$, and $\Sigma$ describes the endothelial tip cell density. MMP release is stopped when the local MMP level approaches the threshold level $M_{th}$. Upon release by the ECs, MMPs are assumed to diffuse through the ECM and are subject to natural decay:

$$\frac{\partial \chi}{\partial t} = k_M \nabla^2 \chi + \gamma_M\, \mathcal{G}(M_{th}, \chi)\, \Sigma - d_M \chi. \qquad (9)$$

4 Endothelial Cells
The migration direction of endothelial tip cells is determined by chemotactic and haptotactic cues in the matrix, given by VEGF and fibronectin gradients. As the VEGF level increases, EC surface receptors become occupied, attenuating the cell's ability to sense the chemotactic cues. The attenuation is represented by a function $\mathcal{W}$. The sprout tip acceleration during migration is defined as:

$$\mathbf{a} = \alpha(E_\rho)\, \mathbf{T}\, \big(\mathcal{W}(\Psi)\, \nabla \Psi + w_F \nabla \Phi_b\big), \qquad (10)$$

where

$$\mathcal{W}(\Psi) = \frac{w_V}{1 + w_{V2} \Psi}, \qquad (11)$$

and

$$\Psi = \Psi_s + \Psi_b + \Psi_c. \qquad (12)$$
The presence of fibers ($E_\rho$) promotes mesenchymal motion of the tip cells, thus enhancing the migration speed of ECs. In contrast, a very dense matrix slows down the migration speed of the tip cells, as characterized by the function:
$$\alpha(E_\rho) = (E_0 + E_\rho)\,(E_1 - E_\rho)\,C_1, \qquad (13)$$
where the threshold $E_0$ defines the migration factor in the absence of fibers, $E_1$ the maximal fiber density and $C_1$ the ECM migration constant. To model the directional cues of the matrix fibers, a tensor $\mathbf{T}$ is introduced acting on the migration velocity:
$$\{\mathbf{T}\}_{ij} = (1 - \beta(E_\chi))\,\{\mathbf{1}\}_{ij} + \beta(E_\chi)\,K_i K_j, \qquad (14)$$
with
$$\beta(E_\chi) = \beta_K\,E_\chi, \qquad (15)$$
the ECM strength $\beta_K$ and $\mathbf{K}$ being the vector field the tensor is applied on. Tip cell particle positions $\mathbf{x}_p$ are updated according to:
$$\frac{\partial \mathbf{x}_p}{\partial t} = \mathbf{u}_p, \qquad \frac{\partial \mathbf{u}_p}{\partial t} = \mathbf{a}_p - \lambda\,\mathbf{u}_p, \qquad (16)$$
with drag coefficient $\lambda$. The matrix structure may promote diverging migration directions, leading to branching of the endothelial tip cells and creation of new sprouts. In our model, we locate regions of high anisotropy in the migration acceleration direction field by a curvature measure $k$ as proposed in [16]. Branching occurs in locations where the local curvature $k$ exceeds a threshold level $a_{ith}$. In order to determine the preferred branching direction in 3D, 6 satellite particles are distributed radially around the tip cell particle in a plane perpendicular to the migration direction, modeling the extension of filopodia. The velocity field is compared at opposing satellite positions and branching occurs into the directions that diverge the most. ECs are insensitive to branching cues immediately after a branching event has occurred. In order to account for this effect, a sprout threshold age $sa_{th}$ is introduced. Sprout tips of age smaller than $sa_{th}$ are not considered for branching. Anastomosis occurs when tip cells fuse either with existing sprouts or with other tip cells. In order to obtain the endothelial cell density defining the capillary sprouts, at every time step we interpolate the sprout tip cell density $Q_p$ at $\mathbf{x}_p$ onto the grid using a 4th order B-spline kernel $B_4$ and add the maximum of the interpolated sprout tips and the $\sigma$ field onto the $\sigma$ field:
$$\sigma^{n+1}_{ijk} = \max\Big(\sigma^{n}_{ijk},\; \sum_p B_4(ih - x_p)\,B_4(jh - y_p)\,B_4(kh - z_p)\,Q_p\Big), \qquad (17)$$
with particle weight $Q_p$ and mesh size $h$, where $n$ denotes the $n$th time step.
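A minimal sketch of the particle-to-mesh deposit of Eq. (17) is given below. This is an illustration only: we assume $B_4$ to be the standard cubic B-spline with support [-2, 2] (the kernel actually used in the paper's implementation may differ), and the grid size, spacing and weights are placeholders.

```python
import numpy as np

def b_spline(x):
    """Standard cubic B-spline (support [-2, 2]); assumed here as the B4 kernel."""
    ax = abs(x)
    if ax < 1.0:
        return 2.0 / 3.0 - ax**2 + 0.5 * ax**3
    if ax < 2.0:
        return (2.0 - ax)**3 / 6.0
    return 0.0

def deposit_tips(sigma, tips, weights, h):
    """Eq. (17): sigma^{n+1} = max(sigma^n, sum_p B4*B4*B4*Q_p) on a uniform grid of spacing h."""
    deposit = np.zeros_like(sigma)
    for (x, y, z), q in zip(tips, weights):
        i0, j0, k0 = int(x / h), int(y / h), int(z / h)
        for i in range(i0 - 2, i0 + 3):
            for j in range(j0 - 2, j0 + 3):
                for k in range(k0 - 2, k0 + 3):
                    if (0 <= i < sigma.shape[0] and 0 <= j < sigma.shape[1]
                            and 0 <= k < sigma.shape[2]):
                        w = (b_spline(i - x / h) * b_spline(j - y / h)
                             * b_spline(k - z / h))
                        deposit[i, j, k] += w * q
    return np.maximum(sigma, deposit)

# usage: five tip particles with unit weight on a 64^3 grid of spacing h = 1.0
sigma = np.zeros((64, 64, 64))
tips = np.random.rand(5, 3) * 63.0
sigma = deposit_tips(sigma, tips, np.ones(len(tips)), h=1.0)
```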
5 ECM
In the present work the ECM is modeled as a collection of fiber bundles randomly distributed throughout the computational domain. The ECM is represented by three grid-functions: (i) a vector field K that describes the fiber orientations, (ii) a smooth indicator function Eχ , which indicates the presence of fibers at any given point in space, and (iii) a fiber density field Eρ , which is used to regulate migration speed. These fields are constructed in a straightforward manner by generating N random fibers with a given length fl which is constant for all fibers. These fibers are then put on the grid much like lines are rasterized in computer graphics [4]. In the case of K the directions are rasterized onto the grid, and in the case of Eχ we tag the grid points at the fiber locations with a value of 1, resulting in randomly distributed fibers. The fields K and Eρ are filtered with a Gaussian filter to achieve a smooth matrix representation. In the case of Eχ this is not possible, so the field is
constructed by using smoothed fibers. In cases where fibers overlap the maximum value of the two fibers is retained.
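The construction described above can be sketched as follows (our own illustration, not the authors' code). Fiber endpoints, the sampling step along a fiber and the filter width are arbitrary placeholders, dense sampling along each fiber is used as a stand-in for Bresenham-style line rasterization [4], and the smoothed-fiber construction of Eχ is approximated by a clipped filtered indicator.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_ecm(n_fibers, fiber_len, grid_n, sigma_filter=1.0,
              rng=np.random.default_rng(0)):
    """Rasterize random fibers of fixed length onto a grid_n^3 grid.
    Returns fiber density E_rho, indicator E_chi and orientation field K."""
    e_rho = np.zeros((grid_n,) * 3)
    e_chi = np.zeros((grid_n,) * 3)
    k_field = np.zeros((grid_n,) * 3 + (3,))
    for _ in range(n_fibers):
        start = rng.uniform(0, grid_n - 1, size=3)
        direction = rng.normal(size=3)
        direction /= np.linalg.norm(direction)
        # sample points along the fiber (simple stand-in for line rasterization)
        for s in np.linspace(0.0, fiber_len, int(2 * fiber_len) + 1):
            p = start + s * direction
            idx = tuple(np.clip(np.round(p).astype(int), 0, grid_n - 1))
            e_rho[idx] += 1.0
            e_chi[idx] = 1.0
            k_field[idx] = direction
    # smooth E_rho and K with a Gaussian filter; E_chi is approximated here by
    # clipping a filtered indicator back to [0, 1]
    e_rho = gaussian_filter(e_rho, sigma_filter)
    e_chi = np.clip(gaussian_filter(e_chi, sigma_filter) * 4.0, 0.0, 1.0)
    for d in range(3):
        k_field[..., d] = gaussian_filter(k_field[..., d], sigma_filter)
    return e_rho, e_chi, k_field

e_rho, e_chi, k_field = build_ecm(n_fibers=200, fiber_len=6.0, grid_n=64)
```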
6 Methods
The time step constraint for diffusion of the molecular species is stricter than for the reaction part. A fractional step algorithm is used to solve the system efficiently. In this algorithm the nonlinear and linear reaction parts of the equations are solved simultaneously using explicit Euler steps, while linear diffusion is solved implicitly. The systems can be safely decoupled, as EC migration occurs on a much smaller time scale than molecular diffusion and steady state can be assumed for the sources and sinks of the different proteins. VEGF, fibronectin and acceleration gradients for the migration velocity and the curvature measure are calculated on the grid using second order finite differences. In order to obtain the acceleration and curvature at the particle locations, mesh-particle interpolations are done using the M4 kernel [3], while for the interpolation of the sprout tip density onto the grid, particle-mesh interpolations employ a 4th order B-spline kernel.
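As an illustration of this splitting (a schematic of our own, not the authors' implementation), one time step for a single scalar species in 1D could look like the sketch below: an explicit Euler reaction substep followed by a backward Euler diffusion solve. The reaction function and all parameter values are placeholders.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def fractional_step(c, reaction, dt, k_diff, h):
    """One fractional step: explicit Euler for the (possibly nonlinear) reaction,
    followed by implicit (backward Euler) linear diffusion with zero-flux ends."""
    # 1) reaction substep
    c_star = c + dt * reaction(c)
    # 2) implicit diffusion: (I - dt*k/h^2 * L) c_new = c_star
    n = c.size
    lap = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format="lil")
    lap[0, 0] = -1.0          # zero-flux (Neumann) boundary
    lap[n - 1, n - 1] = -1.0
    A = (sp.identity(n) - dt * k_diff / h**2 * lap).tocsr()
    return spla.spsolve(A, c_star)

# usage: a decaying species with a localized initial source
c = np.zeros(128)
c[60:68] = 1.0
for _ in range(100):
    c = fractional_step(c, reaction=lambda f: -0.1 * f, dt=0.05, k_diff=1.0, h=1.0)
```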
7 Results
We present results demonstrating the effect of the ECM density on the resulting vessel networks. The computational domain is defined as a cube of 1.5³ mm³ discretized with a 128³ uniform grid. The ECM was modeled using random fiber fields, created with five different matrix densities: 15,000 fibers resulting in a volume density of 6%, 30,000 fibers (11%), 70,000 fibers (26%), 100,000 fibers (38%), and 200,000 fibers (75%) (see Fig. 2). The normalized volume density is given by the sum of the fiber density Eρ over all grid points divided by the number of grid points. For each density value we performed 128 simulations with a different random seed for the fiber placement. Comparing the number of branches found in the computational domain at simulation time T = 25.0, corresponding to 10.9 days in physical time (Fig. 3), we find a logarithmic increase of the number of branches for linearly increasing fiber density. Examples of the structure of the corresponding vessel networks are depicted in Fig. 2: in very low density ECMs, hardly any branching occurs, while in very dense ECMs the EC sprouts branch very often. In the 75% density case the fiber density is high enough to impair the migration, which leads to shorter capillary networks (Fig. 2 E). In Fig. 4 we depict the evolution of the vascular network in the presence of initially distributed pockets of bVEGF. The bVEGF pockets are cleaved by MMPs (not shown) as the sprouts approach the VEGF source at the far end of the domain. The vessels grow in diameter by application of a post-processing vessel maturation method (not described in this work).
Fig. 2. Top: Slice of the ECM field for five different densities: A 6%, B 11%, C 26%, D 38%, and E 75%. Bottom: Capillary networks for the different ECM densities.
Fig. 3. Influence of the matrix density on the number of branches of the vessel network (error bars represent min/max of data)
Fig. 4. Capillary network evolution in the presence of bVEGF pockets. bVEGF is cleaved by MMPs during the course of the simulation.
8 Conclusions
The present work describes the first, to the best of our knowledge, simulations of 3D sprouting angiogenesis that incorporate effects of the extracellular matrix structure on the vessel morphology and consider both soluble and matrix-bound growth factor isoforms. The method is formulated as a generalized hybrid particle method and is implemented in the context of a parallel framework (PPM) [13]. This aspect of the method renders it scalable to massively parallel computer architectures, a crucial aspect for the study of angiogenesis at macroscopic scales and for integrative models of vascular tumor growth. Efficient particle-to-mesh and mesh-to-particle interpolation schemes provide a straightforward way of coupling the two levels of representation. The presented simulations of sprouting angiogenesis have shown that the structure and density of the ECM have a direct effect on the morphology, expansion speed and number of branches observed in computationally grown vessel networks. The simulations reflect the influence of the extracellular matrix composition on endothelial cell migration and network formation, corresponding to observations made in [7]. With the number of branches depending on the matrix structure and on the presence and level of matrix-bound VEGF isoforms, this model may be easier to tune against experiments than the branching probabilities that most individual-based methods employ. Limitations of the current model are related to the explicit definition of tip cells, restricting the formation of new sprout tips to predefined locations on the initial vasculature. The formulation of a tip cell selection method combined with cell-type-specific migration and proliferation rules is the subject of current work. The integration of the present framework in studies of tumor-induced angiogenesis is a subject of coordinated investigations with experimental groups.
References 1. Anderson, A.R.A., Chaplain, M.A.J.: Continuous and Discrete Mathematical Models of Tumor-Induced Angiogenesis. Bull. Math. Biol. 60(5), 857–899 (1998) 2. Bauer, A.L., Jackson, T.L., Jiang, Y.: A Cell-Based Model Exhibiting Branching and Anastomosis During Tumor-Induced Angiogenesis. Biophys. J. 92, 3105–3121 (2007) 3. Bergdorf, M., Koumoutsakos, P.: A Lagrangian Particle-Wavelet Method. Multiscale Model. and Simul. 5(3), 980–995 (2006) 4. Bresenham, J.E.: Algorithm for Computer Control of a Digital Plotter. IBM Syst. J. 4(1), 25–30 (1965) 5. Chaplain, M.A.: Mathematical Modelling of Angiogenesis. J. Neurooncol. 50(1-2), 37–51 (2000) 6. Chaplain, M.A.J., McDougall, S.R., Anderson, A.R.A.: Mathematical Modeling of Tumor-Induced Angiogenesis. Annu. Rev. Biomed. Eng. 8, 233–257 (2006) 7. Davis, E.G., Senger, D.R.: Endothelial Extracellular Matrix: Biosynthesis, Remodeling, and Functions During Vascular Morphogenesis and Neovessel Stabilization. Circ. Res. 97(11), 1093–1107 (2005)
8. Folkman, J.: Angiogenesis: an Organizing Principle for Drug Discovery? Nat. Rev. Drug Discov. 6(4), 273–286 (2007) 9. Lee, S., Jilani, S.M., Nikolova, G.V., Carpizo, D., Iruela-Arispe, M.L.: Processing of VEGF-A by Matrix Metalloproteinases Regulates Bioavailability and Vascular Patterning in Tumors. J. Cell Biol. 169(4), 681–691 (2005) 10. Mantzaris, N., Webb, S., Othmer, H.: Mathematical Modeling of Tumor-Induced Angiogenesis. J. Math. Biol. 49, 111–187 (2004) 11. Paweletz, N., Knierim, M.: Tumor-Related Angiogenesis. Crit. Rev. in Oncol. Hematol. 9(3), 197–242 (1989) 12. Ruhrberg, C., Gerhardt, H., Golding, M., Watson, R., Ioannidou, S., Fujisawa, H., Betsholtz, C., Shima, D.T.: Spatially Restricted Patterning Cues Provided by Heparin-Binding VEGF-A Control Blood Vessel Branching Morphogenesis. Genes Dev. 16(20), 1684–2698 (2002) 13. Sbalzarini, I.F., Walther, J.H., Bergdorf, M., Hieber, S.E., Kotsalis, E.M., Koumoutsakos, P.: PPM – a Highly Efficient Parallel Particle-Mesh Library. J. Comput. Phys. 215(2), 566–588 (2006) 14. Sun, S., Wheeler, M.F., Obeyesekere, M., Patrick Jr., C.W.: A Deterministic Model of Growth Factor-Induced Angiogenesis. Bull. Math. Biol. 67, 313–337 (2005) 15. Sung, S.Y., Hsieh, C.L., Wu, D., Chung, L.W.K., Johnstone, P.A.S.: Tumor Microenvironment Promotes Cancer Progression, Metastasis, and Therapeutic Resistance. Curr.Prob. Cancer 31(2), 36–100 (2007) 16. Weinkauf, T., Theisel, H.: Curvature Measures of 3d Vector Fields and their Applications. J. WSCG. 10, 507–514 (2002)
Particle Based Model of Tumor Progression Stimulated by the Process of Angiogenesis Rafał Wcisło and Witold Dzwinel AGH University of Science and Technology, Institute of Computer Science al. Mickiewicza 30, 30-059 Kraków, Poland
[email protected]
Abstract. We discuss a novel metaphor of tumor progression stimulated by the process of angiogenesis. The realistic 3-D dynamics of the entire system consisting of the tumor, tissue cells, blood vessels and blood flow can be reproduced by using interacting particles. The particles mimic the clusters of tumor cells. They interact with their closest neighbors via semi-harmonic forces simulating the mechanical resistance of the cell walls and the external pressure. The particle dynamics is governed by both the Newtonian laws of motion and the rules of the cell life-cycle. The particles replicate by a simple mechanism of division, similar to that of single cell reproduction, and die due to necrosis or apoptosis. We conclude that this concept can serve as a general framework for designing advanced multi-scale models of tumor dynamics. With respect to the spatio-temporal scale, the interactions between particles can define, e.g., cluster-to-cluster, cell-to-cell, red blood cell and fluid particle interactions, cytokine motion, etc. Consequently, they influence the macroscopic dynamics of the particle ensembles in various sub-scales ranging from diffusion of cytokines and blood flow up to growth of the tumor and vascular network expansion. Keywords: tumor progression, angiogenesis, computer simulation, particle model.
1 Introduction
Angiogenesis is a process of blood vessel formation from a pre-existing vasculature [1, 2]. It refers to complex biological phenomena that appear at various scales [3, 4], ranging from the cellular scale to the macroscopic scale corresponding to the concentration of cancer cells into compact clusters. The vessel sprouts are formed in response to cytokines produced by the tumor cells in hypoxia. Initially, they develop mainly due to endothelial cell migration. Finally, they organize themselves into a branched, connected network, which supplies nutrients (oxygen) to the tumor cells. Tumor-induced angiogenesis provides the fundamental link between the avascular phase of tumor growth and the more invasive vascular phase. A vascularized tumor attacks the surrounding tissue and blood system, and the possibility of the cancer spreading (metastasis) increases dramatically. Judah Folkman published in 1971 the theory [1] that angiogenesis is a principal process in tumor progression. The clinical importance of angiogenesis as a prognostic tool is now well understood [2]. Since tumor dynamics depends on angiogenesis,
there is strong interest in using antiangiogenic drugs to treat cancer [5]. The process of angiogenic signaling and formation of blood vessels can be disrupted or slowed down with small molecules. In addition to treatments directed at a specific target, non-specific agents can be used to eliminate endothelial cells, thus inhibiting angiogenesis. This involves numerous expensive and arduous investigations, both of the hundreds of factors inhibiting angiogenesis and of the antiangiogenic chemical species which can be considered in the drug design process. To cut expenses, the predictive power of mathematical modeling and computer simulation has to be employed. As shown in [6], in silico experiments can play the role of angiogenesis assays. Mathematical modeling of angiogenesis extends back a number of years [3-11]. The modeling concentrates on key events such as the response of endothelial cells to tumor angiogenic factors (TAF) secreted by a solid tumor, endothelial cell proliferation, endothelial cell interactions with extracellular matrix macromolecules, capillary sprout branching and vessel maturation. The substantial difference with respect to physical models (such as the kinetic theory) is that the microscopic state of the cells is defined not only by mechanical variables, such as position, velocity and pressure, but also by internal biological microscopic phenomena reflecting the activities of the cells. Several models [4, 6, 7] have used techniques based on partial differential equations (PDE) to examine the functions in space and time of endothelial cell density, cytokine concentration, capillary tip and branch density. Nevertheless, within these models it is not possible to capture such important events as repeated sprout branching and the overall dendritic structure of the network. In contrast to these deterministic, continuum models, several different types of discrete techniques, such as cellular automata and DLA [8], L-systems [9], and the level-set method [10], are being used. Unlike the continuum models, discrete models can follow individual cells and can reveal more details about cell dynamics and their interaction with the tissue. As shown in [3, 4, 6], the results of discrete models agree well with the predictions of the continuum models. In addition, they are able to produce capillary networks with topology and morphology qualitatively similar to those observed in experiments. However, none of these techniques can be considered a flexible framework for developing advanced multi-scale models. The extension of CA to the more general complex automata paradigm, e.g., as shown in [11], is rather an exception. Moreover, the difficulty of modeling tumor progression in its natural environment, involving mechanical forces from the surrounding environment and continuous remodeling of vessel networks in various types of tissue (muscle, bones, lungs), is a serious drawback of these on-grid methods. Other important aspects, lacking in most models of angiogenesis, are related to the blood flow through the capillary network. The blood flow has important consequences for drug delivery and optimization of chemotherapy regimes. Recent work modeling the flow through vascular networks appears in [12]. We discuss here the advantages of the moving particle model [13] as a metaphor of the process of tumor dynamics stimulated by angiogenesis and as a useful framework for more advanced multi-scale models.
In the following sections we describe the model of tumor progression and its particle based realization. Finally, we present the conclusions and discuss possible extensions of the model.
2 Simplified Model of Tumor Growth
The simplified model (based on [3, 4, 6]) of tumor progression can be described in three overlapping phases of growth: the avascular phase (I), the angiogenic phase (II) and the vascular phase (III). Phase I. A solid tumor, which is smaller than 2 mm in diameter, removes waste and acquires nutrients and oxygen through passive diffusion. The oxygen (nutrients), supplied by nearby blood vessels, filters through the surface of the solid tumor and diffuses in its body. The tumor cluster consists of an outer region with proliferating cells, an intermediate region of cells in hypoxia and a necrotic core of dead tumor cells. The cells in hypoxia produce cytokines called TAF (tumor angiogenic factors), such as growth inhibitory factors (GIF) and growth promoting factors (GPF). The GIF and GPF concentrations influence the tumor cells' mitotic rate.
Fig. 1. The scale separation map of the main processes considered in tumor-induced angiogenesis
Phase II. TAF diffuse through the tissue (extracellular matrix), thereby creating chemical gradients in the body surrounding the tumor. Once the cytokines reach any neighboring blood vessel, their critical concentration stimulates several enzymes to degrade the basement membrane. Simultaneously, the extracellular matrix is degraded and endothelial cells migrate toward the growth factor sources and proliferate (chemotaxis). The fingering capillary sprouts are then formed by accumulation of endothelial cells which are recruited from the parent vessel. Due to the migration and further recruitment of endothelial cells at the sprout tip, the sprouts grow in length and continue to move towards the tumor. The high-molecular-weight glycoproteins – fibronectin – promote cell
migration up its concentration gradient (haptotaxis). Initially, the sprouts grow parallel to each other. Then they tend to incline toward each other [5]. This leads to numerous tip-to-sprout links known as anastomoses. A network of loops or arcades develops. The blood starts to circulate in the newly developed vessels. From the primary loops, new sprouts emerge providing for the further extension of the new capillary bed. Phase III. Once the sprouts approach the tumor, their branching dramatically increases until the tumor is eventually penetrated by the vascular network. Due to better oxygenation the concentration of TAF decreases also inside the tumor. However, the newly formed vessels are subsequently remodeled due to tumor growth and pushed away, producing regions of lower oxygen concentration, which initiates TAF production. This causes simultaneous growth in size of both the tumor and its vasculature. In Fig. 1 we present the scale separation map (SSM) for this simplified model of tumor growth. The macroscopic scale refers to phenomena which are typical for continuum systems: diffusion (oxygen, TAF), overall tumor condensation and blood flow, while microscopic processes such as cell motility, cell reproduction (division) and cell death (necrosis and apoptosis) are discrete. The arrows show dependencies between these processes. The particle model presented in the following section refers only to the processes shaded in dark.
3 Particle Model
In Fig. 2 we show the building blocks of our model, while in Fig. 3 its implementation is briefly described. Unlike the simulations reproducing this process outside the tumor [3, 4, 6], we assume that the vasculature develops inside the growing cluster. This lets us simulate the combination of Phases II and III and show how our model copes with vasculature remodeling caused by mechanical forces resulting from the pressure exerted by the swelling tumor on the expanding vascular network. The particle model of angiogenesis follows the general principles of particle model construction described in [13]. The particles mimic the clusters of tumor cells. The cluster size depends on the model granularity. In particular, at the finest spatial scale it can represent a single tumor cell. Nevertheless, this is a rather unrealistic assumption, bearing in mind that a tumor of 1 mm in diameter consists of hundreds of millions of cells. The tumor particles interact with each other via forces which mimic the mechanical interactions of the cell walls. In more advanced models, surrounding tissue cells can be represented by using different kinds of particles (e.g. by defining different interaction potentials). Instead, we assume that only the tumor cells exist, while the external pressure is simulated by the attractive tail of the asymmetric inter-particle potential Ω(dij) defined as follows:
$$\Omega(d_{ij}) = \begin{cases} a\,d_{ij}^2, & \text{for } d_{ij} < d_{cut} \\ a\,d_{cut}^2, & \text{for } d_{ij} \ge d_{cut} \end{cases}, \quad \text{where } a = \begin{cases} a_1 & \text{for } d < 0 \\ a_2 & \text{for } d \ge 0 \end{cases} \text{ and } a_1 \ll a_2, \qquad (1)$$
and
$$d_{ij} = r_{ij} - (r_i + r_j), \qquad (2)$$
where dij is the distance between particles. The particle interactions are depicted in Fig. 4.
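Equations (1)-(2) translate directly into a pair potential and force routine. The sketch below is our own illustration with placeholder constants; the force is taken as the negative derivative of Ω along the line of particle centres, as used in the equation of motion below.

```python
import numpy as np

def omega(d, a1, a2, d_cut):
    """Pair potential of Eq. (1): a*d^2 below the cut-off, constant beyond it;
    a1 applies to overlapping particles (d < 0), a2 to separated ones (d >= 0)."""
    a = a1 if d < 0.0 else a2
    return a * d * d if d < d_cut else a * d_cut * d_cut

def pair_force(ri, rj, rad_i, rad_j, a1, a2, d_cut):
    """Force on particle i from particle j: -dOmega/dd along the line of centres."""
    rij = ri - rj
    dist = np.linalg.norm(rij)
    d = dist - (rad_i + rad_j)            # Eq. (2)
    if d >= d_cut or dist == 0.0:
        return np.zeros(3)                # flat potential beyond the cut-off
    a = a1 if d < 0.0 else a2
    return -2.0 * a * d * (rij / dist)

# usage: two slightly overlapping particles (placeholder constants)
f = pair_force(np.array([0.0, 0.0, 0.0]), np.array([1.8, 0.0, 0.0]),
               rad_i=1.0, rad_j=1.0, a1=0.01, a2=1.0, d_cut=0.5)
```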
Fig. 2. Building blocks of the tumor-induced angiogenesis model: cell growth and evolution, nutrient and TAF conversions and generation, diffusion of nutrients and TAF, mechanical interactions between cells, and calculation of cell positions.
The particle motion is governed by the Newtonian laws, that is:
$$m_i \frac{d\mathbf{V}_i}{dt} = -a \cdot \nabla\Omega(d_{ij}) - \lambda \cdot \mathbf{V}_i, \qquad \frac{d\mathbf{r}_i}{dt} = \mathbf{V}_i, \qquad r_{ij} = \sqrt{(\mathbf{r}_i - \mathbf{r}_j)^T(\mathbf{r}_i - \mathbf{r}_j)}, \qquad (3)$$
where $\mathbf{r}_i$ and $\mathbf{V}_i$ are the position and velocity of particle $i$, respectively, while $\lambda$ is the friction coefficient. The diffusion of oxygen and TAF is faster than the process of tumor growth. Therefore, the concentrations of TAF and oxygen can be computed in the steady state. This allows for using a fast procedure for solving the Laplace equation on a discrete irregular lattice of particles. Only the vessels with circulating blood (anastomosing vessels) are sources of oxygen, while the cells in hypoxia are sources of TAF. In our simplified model we neglect blood flow in capillaries [5]. We assume instead that all the vessels are sources of oxygen. Because blood circulation is slower than diffusion but faster than the mitosis cycle [6], in a following version of the model the flow intensity in anastomosing vessels will be computed on the basis of Kirchhoff's laws [5]. The microscopic processes of the cell life-cycle and the growth of the vascular network are simulated by using the following rules of tumor growth and blood vessel network expansion.
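One possible realization of this steady-state solve (a hedged sketch, not the authors' procedure) is Jacobi relaxation of the Laplace equation on the particle neighbourhood graph, with source particles (anastomosing vessels for oxygen, hypoxic cells for TAF) held at a fixed concentration; the neighbour lists and the iteration count are placeholders.

```python
import numpy as np

def steady_state_field(neighbors, source_mask, source_value, n_iter=200):
    """Jacobi relaxation of the Laplace equation on an irregular particle lattice.
    neighbors[i] is a list of indices of particles adjacent to particle i;
    particles flagged in source_mask are held at source_value (Dirichlet)."""
    c = np.zeros(len(neighbors))
    c[source_mask] = source_value
    for _ in range(n_iter):
        new_c = c.copy()
        for i, nbrs in enumerate(neighbors):
            if not source_mask[i] and nbrs:
                new_c[i] = np.mean(c[nbrs])   # local average of neighbouring particles
        c = new_c
    return c

# usage: a 1D chain of 10 particles with an "oxygen source" at one end
neighbors = [[1]] + [[i - 1, i + 1] for i in range(1, 9)] + [[8]]
source_mask = np.zeros(10, dtype=bool)
source_mask[0] = True
oxygen = steady_state_field(neighbors, source_mask, source_value=1.0)
```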
Fig. 3. The implementation scheme of the particle model of angiogenesis
Fig. 4. The shape of inter-particle potential representing mechanical interactions between cell clusters
Tumor Growth. During its life-cycle the cluster of tumor cells (the particle) changes its state from new to necrotic, according to its individual clock and the oxygen concentration. Particles of a given age and size that are well oxygenated undergo the process of "mitosis", i.e., one particle (the "mother" particle) divides, producing two daughter particles. A newborn particle is in the new state and has a diameter equal to dMIN. It grows with time, proportionally to the oxygen concentration, up to dMAX. The minimum and maximum particle diameters depend on the model granularity. Finally, after time TN, the particles undergo apoptosis, i.e., programmed cell death. The dead particles are in the necrotic state and are removed from the system. For oxygen concentrations smaller than a given threshold, the particle state (excluding the necrotic one) is converted to the hypoxia state, becoming a source of TAF. Particles which remain in hypoxia for a period of time longer than a given threshold die (necrosis) and are removed from the system.
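These life-cycle rules translate almost directly into a per-particle update. The sketch below is an illustration only, with placeholder thresholds and a dictionary-based particle record that is not part of the original model.

```python
# placeholder thresholds for the life-cycle rules
O2_HYPOXIA, T_APOPTOSIS, T_HYPOXIA_DEATH = 0.2, 100.0, 30.0
D_MIN, D_MAX, GROWTH_RATE = 0.5, 1.0, 0.01

def update_cell(cell, o2, dt):
    """Advance one tumor particle by dt. `cell` is a dict with keys
    'state', 'age', 'diameter', 'time_in_hypoxia'. Returns False if the
    particle dies (necrosis/apoptosis) and should be removed."""
    cell['age'] += dt
    if o2 < O2_HYPOXIA:
        cell['state'] = 'hypoxia'            # becomes a TAF source
        cell['time_in_hypoxia'] += dt
        if cell['time_in_hypoxia'] > T_HYPOXIA_DEATH:
            return False                     # necrosis
    else:
        cell['state'] = 'normal'
        cell['time_in_hypoxia'] = 0.0
        cell['diameter'] = min(D_MAX, cell['diameter'] + GROWTH_RATE * o2 * dt)
    if cell['age'] > T_APOPTOSIS:
        return False                         # apoptosis
    return True

def ready_to_divide(cell, o2):
    """Mitosis: well-oxygenated particles of full size split into two of size D_MIN."""
    return cell['state'] == 'normal' and cell['diameter'] >= D_MAX and o2 > O2_HYPOXIA
```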
Expansion of Blood Vessel Network. The blood vessels are also made of particles. A new sprout is formed when the TAF concentration exceeds a given threshold. Then the "vessel particle" undergoes the process of "mitosis" directed along the local gradient of the TAF concentration. Haptotaxis is not yet included in the model. Only particles close to the sprout tip can replicate. The interactions between two identical vessel particles, as well as between a vessel particle and a tumor particle, are described by the same potential as defined by Eq. (1).
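A minimal sketch of this sprouting rule (our own illustration; the threshold and spacing values are placeholders) is:

```python
import numpy as np

TAF_SPROUT_THRESHOLD = 0.5   # placeholder value

def maybe_sprout(tip_position, taf_value, taf_gradient, spacing):
    """Return the position of a new vessel particle if the local TAF level
    exceeds the threshold; the new particle is placed one particle spacing
    up the TAF gradient ("mitosis" of the tip vessel particle)."""
    if taf_value < TAF_SPROUT_THRESHOLD:
        return None
    norm = np.linalg.norm(taf_gradient)
    if norm == 0.0:
        return None
    return tip_position + spacing * taf_gradient / norm

new_tip = maybe_sprout(np.zeros(3), taf_value=0.8,
                       taf_gradient=np.array([1.0, 0.2, 0.0]), spacing=1.0)
```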
Fig. 5. Mechanical remodeling – growing tumor pushes the arteries away
Unlike the simulations of angiogenesis described in [4, 6], where the vascularization starts and develops outside the tumor, our model allows the growth of the tumor simultaneously with its internal vascularization. This situation is much more complicated, mainly due to the changes in TAF production and the mechanical forces involved in blood vessel remodeling. In Fig. 5 we display two snapshots of the simulation, which show how the growing tumor pushes the arteries away.
Fig. 6. The snapshot from 3-D simulation of tumor growth and the vessels formation due to the process of angiogenesis. The cells shaded in dark are better oxygenated.
Fig. 7. Snapshot from 2-D simulations showing simultaneous growth of tumor and its vascular network. Darker regions display well oxygenated tumor cells.
We started the simulations with a single "vessel particle" located in the centre of a small tumor cluster. Of course, this situation is not realistic. The initial conditions should consist of an avascular tumor cluster located close to a blood capillary, which produces the network of capillaries growing towards the tumor. Moreover, only anastomosing vessels allowing for blood circulation are the new sources of oxygen, while in our simulations all the vessel particles are sources of oxygen. However, this is only a technical problem, which can be solved in the near future. Figures 6 and 7 display the unique capability of the particle-based model, which can simulate tumor progression simultaneously with a vascular network continuously remodeled by mechanical forces exerted by the tumor body.
4 Concluding Remarks
Our concept of interacting particles represents a new framework for a more precise multi-scale model of the intrinsically complex process of angiogenesis. This approach is competitive with existing paradigms employing heterogeneous sub-models [4] or cellular automata [8, 11]. The possibility of simulating the progression of a vascularized tumor, involving simultaneous remodeling of the blood vessel system, is a great advantage of the particle-based approach over other approaches. Moreover, the realistic 3-D dynamics of the entire system consisting of the tumor and other tissue cells, blood vessels and blood flow can be reproduced using the same model. The scalability of the particle model allows for simulating large systems consisting of hundreds of millions of particles. Our model can be supplemented by other microscopic and macroscopic processes, e.g., by modifying the "interaction" forces between clusters, incorporating more precise models of the reproduction mechanism, cell life-cycle, lumen growth dynamics and blood flow in the vessels, including the haptotaxis mechanism and others. The model can also be supplemented by such important processes as blood circulation. In [14] we show that blood flow can be modeled by using dissipative particle dynamics at a very short spatio-temporal scale (see Fig. 8).
Fig. 8. The snapshot from a simulation of red blood cells flowing in a vein with a choking point, modeled using the fluid particle model [14]
Currently, we are working on using smoothed particle hydrodynamics (SPH) employing viscoelastic rheological models (like the Casson fluid) for simulating blood flows at larger scales. These two models of the processes of tumor growth and blood flow are planned to be combined (see Fig. 1) within the scope of a generalized multi-scale particle model. The particle model of blood flow can be fine-grained, employing a large number of particles (>10⁶). As we show in [15], the code can be easily parallelized and run on a multiprocessor cluster to speed up computations. We have estimated that we could obtain at least the same speed-up factor (i.e., about 20 on 32 processors) for the model of tumor growth. We plan to validate the model on the basis of comparisons between realistic images of tumor vascular networks from confocal microscopy and computer experiments. The comparisons will be made employing structural properties of tumor vascularization. The vascular networks can be described by feature vectors with statistical and/or algebraic descriptors of complex networks as the vector components. Finally, pattern recognition methods such as clustering and feature extraction will be used for data classification. Acknowledgments. We are indebted to Professor dr Arkadiusz Dudek from the University of Minnesota Medical School for numerous comments, valuable suggestions and his overall contribution to this work. This research is financed by the Polish Ministry of Education and Science, Project No. 3 T11F 010 30.
References 1. Folkman, J.: Tumor angiogenesis: Therapeutic implications. N. Engl. J. Med. 285, 1182– 1186 (1971) 2. Ferrara, N., Chen, H., Davis Smyth, T., Gerber, H.P., Nguyen, T.N., Peers, D., Chisholm, V., Hillan, K.J., Schwall, R.H.: Vascular endothelial growth factor is essential for corpus luteum angiogenesis. Nat. Med. 4, 336–340 (1998)
3. Mantzaris, N., Webb, S., Othmer, H.G.: Mathematical Modeling of Tumor-induced Angiogenesis. J. Math. Biol. 49(2), 1416–1432 (2004) 4. Bellomo, N., de Angelis, E., Preziosi, L.: Multiscale Modeling and Mathematical Problems Related to Tumor Evolution and Medical Therapy. J. Theor. Med. 5(2), 111–136 (2003) 5. Stéphanou, A., McDougall, S.R., Anderson, A.R.A., Chaplain, M.A.J., Sherratt, J.A.: Mathematical Modelling of Flow in 2D and 3D Vascular Networks: Applications to Antiangiogenic and Chemotherapeutic Drug Strategies. J. Math. Comput. Model 41, 1137– 1156 (2005) 6. Chaplain, M.A.J.: Mathematical modelling of angiogenesis. J. Neuro-Oncol. 50, 37–51 (2000) 7. Luo, S., Nie, Y.: FEM-based simulation of tumor growth in medical image. In: Galloway Jr., R.L. (ed.) Medical Imaging 2004: Visualization, Image Guided Procedures, and Display. Proceedings of SPIE, vol. 5367, pp. 600–608 (2004) 8. Moreira, J., Deutsch, A.: Cellular Automaton Models of Tumor Development: A Critical Review. Adv. Complex Syst. 5/2(3), 247–269 (2002) 9. Fracchia, F.D., Prusinkiewicz, P., de Boer, M.J.M.: Animation of the development of multicellular structures. In: Magnenat-Thalmann, N., Thalmann, D. (eds.) Computer Animation 1990, pp. 3–18. Springer, Tokyo (1990) 10. Lappa, M.: A CFD level-set method for soft tissue growth: theory and fundamental equations. J. Biomech. 38(1), 185–190 (2005) 11. Hoekstra, A.G., Lorenz, E., Falcone, L.-C., Chopard, B.: Towards a Complex Automata Framework for Multi-scale Modeling: Formalism and the Scale Separation Map. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4487, pp. 1611–3349. Springer, Heidelberg (2007) 12. McDougall, S.R., Anderson, A.R.A., Chaplain, M.A.J., Sherratt, J.A.: Mathematical modelling of flow through vascular networks: Implications for tumour-induced angiogenesis and chemotherapy strategies. Bull. Math. Biol. 64, 673–702 (2002) 13. Dzwinel, W., Alda, W., Yuen, D.A.: Cross-Scale Numerical Simulations Using DiscreteParticle Models. Molecular Simulation 22, 397–418 (1999) 14. Dzwinel, W., Boryczko, K., Yuen, D.A.: A Discrete-Particle Model of Blood Dynamics in Capillary Vessels. J. Colloid Int Sci. 258(1), 163–173 (2003) 15. Boryczko, K., Dzwinel, W., Yuen, D.A.: Modeling Heterogeneous Mesoscopic Fluids in Irregular Geometries using Shared Memory Systems. Molecular Simulation 31(1), 45–56 (2005)
A Multiphysics Model of Myoma Growth
Dominik Szczerba1, Bryn A. Lloyd1, Michael Bajka2, and Gábor Székely1
1 Computer Vision Laboratory, ETH, CH-8092 Zürich, Switzerland
[email protected]
2 Clinic of Gynecology, University Hospital of Zürich, Switzerland
Abstract. We present a first attempt to create an in-silico model of a uterine leiomyoma, a typical exponent of a common benign tumor. We employ a finite element model to investigate the interaction between a chemically driven growth of the pathology and the mechanical response of the surrounding healthy tissue. The model includes neoplastic tissue growth, oxygen and growth factor transport as well as angiogenic sprouting. Neovascularisation is addressed implicitly by modeling proliferation of endothelial cells and their migration up the gradient of the angiogenic growth factor, produced in hypoxic regions of the tumor. The response of the surrounding healthy tissue in our model is that of a viscoelastic material, whereby a stress exerted by expanding neoplasm is slowly dissipated. By incorporating the interplay of four underlying processes we are able to explain experimental findings on the pathology’s phenotype. The model has a potential to become a computer simulation tool to study various growing conditions and treatment strategies and to predict post-treatment conditions of a benign tumor.
1 Introduction
Tumor growth is one of the leading diseases in humans. Whereas aggressively growing malignant tumors lead to anarchic morphological patterns, also breaking into their surroundings, slowly growing benign tumors usually exhibit regularly developed structures respecting borders, with a well developed network of vessels. These structure-bearing characteristics predestine benign tumors to be analyzed by creating in silico models of the growth process. Better understanding of tumor development might help to optimize existing therapeutic procedures, design new ones, or even provide post-treatment predictions. Uterine leiomyomas (fibroids, a typical exponent of a benign tumor) are the most common uterine neoplasm, affecting more than 30% of women older than 30 years of age. Submucous leiomyomas in particular are of primary clinical interest because they often distort the uterine cavity, causing serious gynecological disturbances including pelvic pain, bleeding disorders and sterility. Fibroids are the dominating benign pathology of the uterus. They are predominantly composed of smooth muscle and a variable amount of fibrous tissue. Despite the amount of research in this area, the exact etiology and pathogenesis of a myoma is not known. It is assumed that the genesis is initiated by regular muscle cells with increased growth potential and that the growth of myomas is driven by estrogen and local
growth factors. In general, a myoma grows slowly but continuously until the beginning of menopause. An increase of volume by a factor of two usually takes several months or years. A myoma has a much stronger tendency to keep its shape than any of the tissues surrounding it, as it is composed of very dense fibrotic tissue. There is no real capsule around a myoma, which is only surrounded by a clustered myometrium. Most often myomas are classified depending on their position relative to the uterine wall: intramural, subserosal (visible from the abdominal cavity) or submucosal (visible from the uterine cavity). The latter are divided into pedunculated myomas (type 0), predominantly intracavitary myomas (type I, forming an acute angle with the uterine wall, intracavitary portion > 50%) and predominantly intramural myomas (type II, forming an obtuse angle with the uterine wall, intracavitary portion <= 50%). These different degrees of intracavitary protrusion may result from myoma migration through the healthy surrounding muscular mesh during the growth process, although the exact mechanism remains unclear. The endometrium is a reactive tissue with viscoelastic characteristics covering the entire uterine cavity, including protruding myomas of any degree. It may be stretched and thinned out until its blood supply is affected and punctual necrosis of the endometrium may occur. Blood supply to the fibroids is provided by one or two large vessels ([1]), whereas [2] found that small fibroids (1-3 mm) were avascular, in larger ones (< 1 cm) a few small vessels invaded the lesion from the periphery, and the largest fibroids (> 1 cm) contained irregular networks of blood vessels with density similar to or lower than in normal myometrium, being surrounded by an extremely dense vascular layer. Walocha concluded that, during development of a leiomyoma, the pre-existing blood vessels undergo regression and new vessels invade the tumor from the periphery, where intense angiogenesis, probably promoted by growth factors secreted by the tumor, leads to the formation of a vascular capsule responsible for the supply of blood to the growing tumor. We aim at a physically correct simulation, relying on realistic physiological parameters when describing the underlying processes. Only such a low-level modeling paradigm will have the predictive power required to study e.g. the effects of drug administration or the influence of various interventional strategies on the surgical outcome. In this paper we present a first attempt to create a coupled mechano-chemical model of a growing leiomyoma. Following a literature overview of morphological and physiological tumor development and previously proposed methods for the simulation of tumor growth, we present our multiphysics model. Results generated by its implementation demonstrate the feasibility of our approach.
2 Literature Overview
Tumor development has been extensively studied for the last three decades. An overview of the available literature relevant to our approach is given in previous works ([3], [4], [5]). We only briefly comment on the presented approaches, which we arbitrarily group into the following categories: 1) mathematical models, 2) cellular automata, 3) finite element methods. In general, mathematical models
are very interesting to study cellular metabolism and the temporal and spatial dynamics of tumor development, they are, however, often limited in scope (e.g. do not cover tissue mechanics) while being already significantly complex in the mathematical formulation used. Moreover, their mathematical sophistication is not always readily implementable for real-life clinical problems defined by patient-specific finite element discretizations of e.g. volumetric radiological data. Cellular automaton approaches are demonstrated to be particularly well suited to model dynamics of developing pathologies while being relatively accessible in formulation. Such models, however, typically do not address the important mechanical interactions between the tumor and healthy tissue. We identify our approach largely with the third group, adopting the modeling techniques from solid and fluid mechanics as well as chemical engineering. This is particularly true for the strain-induced cellular response, gaining acceptance as an influential player in tissue development as shown in [6]. With finite element models we can account for mechanical tissue deformations induced by developing pathologies and capture its interplay with the chemical environment. In this context multiphysics models, i.e. models combining a few independent physical phenomena into one computationally coherent simulation, become of particular interest. Feasibility of such an approach has recently been demonstrated by [4] on the example of micro-vessel growth and remodeling. We propose to apply the same modeling paradigm to simulate the development of a leiomyoma.
3 The Multiphysics Model
We initialize the pathology growth inside a hosting tissue (myometrium) and realize its development as a response to activating factors secreted by the growing tumor. The dividing cells acquire their oxygen and nutrients by intra-cellular diffusion-dominated supply (corresponding to small fibroids of 1-3 mm ([2])). Such development will stop when a certain critical mass has been reached, as diffusion-based transport is no longer efficient for proliferation-dominated growth. In such cases the tumor will attract external blood supply from the adjacent parent vessels to acquire the necessary oxygen and nutrients (corresponding to larger fibroids ([2])). The formation of blood vessels during angiogenesis in general - healthy or cancerous - is a process where capillary sprouts depart from pre-existing parent vessels in response to externally supplied chemical stimuli. By means of endothelial cell proliferation and migration the sprouts then organize themselves into a branched, connected network structure, created in order to feed the developing tumor. A detailed description of this process in the context of computer modeling is given by [3]. The novelty of our model compared to the one described therein is not only to use the delivered nutrients to explicitly control the growth, but also to couple the growth back to the vessel network behavior via secretion of tumor angiogenesis growth factors (AGF). The components of this linked bio-chemo-mechanical model are described in the following sections.
3.1 Tumor Neovascularization
We start our modeling from the widely accepted assumption that the initial response of the endothelial cells to the angiogenic growth factors is due to chemotaxis, enforcing cell migration towards the tumor or ischemic cell. Once secreted, the growth factors diffuse from the tumor (domain $\Omega_2$) into the surrounding tissue (domain $\Omega_1$), establishing a certain concentration gradient between the chemical source and the parent vessels. As the endothelial cells migrate through the extracellular matrix in response to this gradient there is some uptake and binding of the growth factors by the cells ([7]). Therefore, this process can be modeled by a diffusion equation with a natural decay term:
$$\frac{\partial c_1}{\partial t} = D_1 \nabla^2 c_1 - R_1 c_1, \qquad (1)$$
$$c_1\big|_{\partial\Omega_{12}} = \alpha_1 H^-, \qquad (2)$$
$$H^- = \int_{c_3 \le c_3^{th}} d\Omega_2, \qquad (3)$$
with $c_1$ being the chemical concentration of the growth agent, $D_1$ its diffusion constant and $R_1$ its decay rate. The amount of the growth factor released depends (here, linearly) on the amount of tumor cells suffering from hypoxia, $H^-$, with a minimum required oxygenation threshold $c_3^{th}$. Once the distribution of the growth agent has been initiated, endothelial cells start to respond to the stimulus. The endothelial density evolution is governed by a conventional convection-diffusion equation that we write as:
$$\frac{\partial c_2}{\partial t} = D_2 \nabla^2 c_2 + \nabla \cdot (c_2 \mathbf{u}), \qquad (4)$$
$$\mathbf{u} = \frac{k_0 k_1}{k_1 + c_1}\,\nabla c_1, \qquad (5)$$
with $c_2$ being the endothelial cell density and $D_2$, $R_2$ positive constants. Note the dependence of the convective velocity $\mathbf{u}$ on the growth factor concentration $c_1$. Now that a vascular system has changed according to the growth agent, a new distribution of oxygen results, governed by:
$$\frac{\partial c_3}{\partial t} = D_3 \nabla^2 c_3 - R_3 c_3, \qquad (6)$$
$$c_3\big|_{\partial\Omega_{12}} = \alpha_3 c_2, \qquad (7)$$
with $D_3$ being the diffusion constant and $R_3$ the reaction rate. Here, the amount of oxygen delivered to the tumor depends linearly on the density of endothelial cells, assumed to form predominantly capillaries close to the tumor.
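To make the coupling of Eqs. (1) and (4)-(6) concrete, one explicit update of the three fields on a 2D grid could look like the sketch below. This is our own illustration, not the authors' finite element implementation: the boundary source terms on ∂Ω12 (Eqs. (2)-(3) and (7)) are omitted, all parameter values are placeholders, and an implicit or upwinded scheme would be preferable in practice.

```python
import numpy as np

def laplacian(f, h):
    """Five-point Laplacian with zero-flux edges."""
    fp = np.pad(f, 1, mode='edge')
    return (fp[:-2, 1:-1] + fp[2:, 1:-1] + fp[1:-1, :-2] + fp[1:-1, 2:] - 4 * f) / h**2

def step(c1, c2, c3, dt, h, D1, R1, D2, k0, k1, D3, R3):
    """One explicit Euler step of Eqs. (1), (4)-(6): AGF diffusion/decay,
    chemotactic drift of endothelial density, oxygen diffusion/consumption."""
    # AGF (Eq. 1)
    c1_new = c1 + dt * (D1 * laplacian(c1, h) - R1 * c1)
    # endothelial cells (Eqs. 4-5): diffusion plus divergence of c2*u
    g1x, g1y = np.gradient(c1, h)
    chemo = k0 * k1 / (k1 + c1)
    fx, fy = c2 * chemo * g1x, c2 * chemo * g1y
    div = np.gradient(fx, h, axis=0) + np.gradient(fy, h, axis=1)
    c2_new = c2 + dt * (D2 * laplacian(c2, h) + div)
    # oxygen (Eq. 6)
    c3_new = c3 + dt * (D3 * laplacian(c3, h) - R3 * c3)
    return c1_new, c2_new, c3_new

# usage on a 64x64 grid with placeholder parameters
c1 = np.zeros((64, 64)); c2 = np.zeros((64, 64)); c3 = np.full((64, 64), 1.0)
c1[30:34, 30:34] = 1.0    # a patch of AGF released by hypoxic tumor cells
c1, c2, c3 = step(c1, c2, c3, dt=1e-3, h=1.0, D1=1.0, R1=0.1,
                  D2=0.01, k0=1.0, k1=1.0, D3=1.0, R3=0.1)
```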
3.2 Tissue Mechanics
We base our modeling on experimental observations of mechanical tissue responses to chemical growth factors, see e.g. [6]. Production of such growth factors in our
model is a result of ischemia, as described in the preceding section. As the growth itself results largely from an increased proliferation rate, we can model it as an initial strain condition:
$$\epsilon_0(t) = \frac{\int_t^{t+\Delta t} \frac{\partial N}{\partial t}\, dt}{N(t, \mathbf{x}, \theta)} \cong \frac{N(t + \Delta t, \mathbf{x}, \theta) - N(t, \mathbf{x}, \theta)}{N(t, \mathbf{x}, \theta)}, \qquad (8)$$
where $N(t, \mathbf{x}, \theta)$ is the number of cells in a finite element at time $t$ and position $\mathbf{x}$. For a constant growth rate the strain is $\Delta t / T_2$, where $T_2$ is the cell population doubling time. The proliferation rate and, therefore, the number of cells depends on certain environmental factors $\theta$ such as oxygen availability. Based on descriptions in the literature ([8]) we have defined this dependency with a piece-wise linear function ($\epsilon_0 = \frac{\Delta t}{T_2} f(c_3(t))$), which relates the deviation from normal oxygen concentration to the amount of growth. A step function around a hypoxia threshold $c_3$ has been assumed, thus
$$\epsilon_0 = \alpha_4 H^+, \qquad (9)$$
$$H^+ = \int_{c_3 > c_3^{th}} d\Omega_2, \qquad (10)$$
which aims to approximate the fact that cells receiving enough oxygen will proliferate, resulting in stress both in the tumor and the surrounding tissue. To quantify this stress one needs a model for the tissue response to strain. Linear elastic material is long known to be inadequate to describe soft tissue responses to mechanical loads. To describe the tissue behavior under some strain $\epsilon(t)$ we take the standard viscoelastic model after [9]:
$$\frac{d\sigma(t)}{dt} + \frac{E\,\sigma(t)}{\mu} = \frac{E E_0}{\mu}\,\epsilon(t) + (E + E_0)\,\frac{d\epsilon(t)}{dt}. \qquad (11)$$
Solving symbolically for $\sigma(t)$ leads to an expression for stress relaxation:
$$\sigma(t) = \epsilon_0\,\big(E_0 + E\,e^{-E t / \mu}\big), \qquad (12)$$
with $\epsilon_0 (E_0 + E)$ the linear contribution to the total stress and $\mu$ the relaxation time. Under the assumption of tissue incompressibility and approximate spherical symmetry, the thickness of the thin tissue rim surrounding the myoma from the cavity side has to change with the increase of the pathology radius in order to compensate for the volume change. The perfectly elastic wall (hoop) stress in a thin-walled sphere is given as:
$$\sigma(r) = \frac{1}{2}\,\frac{p\,r}{-r + \sqrt{r^2 + 2\,w_0 r_0 + w_0^2}}, \qquad (13)$$
with $r_0$ and $w_0$ the rest-state radius of the sphere and the thickness of its wall, respectively. Even if the stress-strain relation is linear, as seen in Fig. 1 (solid line), the wall stress is not, because of the variable wall thickness. The solution for a general (including viscous) stress will be of a similar form, multiplied by a relaxation factor (dashed line). What this means for our model is that we are now able to simulate a spectrum of tissue behaviors, from perfectly elastic (μ → 0) to perfectly plastic (μ → ∞), regulating the relaxation contribution with μ. Of course, in general, these model parameters will be different for the tumor and the surrounding tissue. The resulting overall model depends on 5 unknowns: 3 concentrations (scalars) and 2 stresses (tensors). The model parameters that we have used in the actual implementation are discussed below.
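For reference, Eqs. (12)-(13) are easy to evaluate directly. The short script below (our own sketch with arbitrary parameter values) reproduces the qualitative behaviour of Fig. 1: the purely elastic hoop stress grows with radius because the wall thins, while the relaxing solution is damped by the factor of Eq. (12).

```python
import numpy as np

def relaxed_stress(eps0, t, E, E0, mu):
    """Eq. (12): sigma(t) = eps0 * (E0 + E * exp(-E*t/mu))."""
    return eps0 * (E0 + E * np.exp(-E * t / mu))

def hoop_stress(r, p, r0, w0):
    """Eq. (13): hoop stress of a thin-walled sphere whose wall thins as it expands."""
    w = np.sqrt(r**2 + 2.0 * w0 * r0 + w0**2) - r   # current wall thickness
    return 0.5 * p * r / w

# placeholder parameters, loosely in the spirit of Fig. 1
r = np.linspace(1.0, 10.0, 50)
sigma_elastic = hoop_stress(r, p=1.0, r0=1.0, w0=0.2)
# damp the elastic curve by the relaxation factor of Eq. (12), normalised to 1 at t = 0
E, E0, mu, t = 1.0, 0.1, 0.5, 1.0
sigma_relaxed = sigma_elastic * relaxed_stress(1.0, t, E, E0, mu) / (E0 + E)
```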
3.3 Estimation of the Model Parameters
Based on measurements of the average thickness $d_v$ of the viable rim of in vitro tumor spheroids in [10,11], and estimates of oxygen consumption rates and the diffusion coefficient in [12,13], we can estimate the critical drop in the oxygen concentration which leads to hypoxia-induced growth factor production. By solving analytically a radially symmetric, constant-boundary reaction-diffusion problem in an idealized spherical geometry and using a linear oxygen reaction rate as before, we get an estimate for $c_3^{thr}$ of $1.2 \times 10^{-3}$ (ml O2)(ml tissue)$^{-1}$, which corresponds to approximately 31 mmHg. This seems reasonable, since the median partial pressure in breast cancers was 30 mmHg as measured by [14]. We were unable to find any measurements of the AGF uptake rate $R_1$ in the literature. We assume the value $R_1 = 10^{-6}\,\mathrm{s^{-1}}$. Estimates of the diffusion coefficient for different AGFs are in the range of one to two orders of magnitude smaller than for oxygen; we use the value $D_1 = 10^{-10}\,\mathrm{m^2 s^{-1}}$ after [15]. In order to set the coupling factor $\alpha_1$, we measure the hypoxic area $H$ of the tumor in the initial configuration where we define $c_1 = c_0 = 10^{-7}\,\mathrm{mole/m^3}$ ([16]), i.e. $\alpha_1 = H/c_0$, which is approximately 1/3. The parameters $D_2$, $k_2$ and $k_0$ are taken from [16]. [17] measured the chemotactic coefficient $k_0$ and diffusion coefficient $D_2$ of migrating endothelial cells in gradients of aFGF. The measurements were, however, made with unconstrained endothelial cells. As pointed out by [16], these experimental conditions overestimate the random motility, therefore they chose a smaller value for $D_2$. The chemotactic function $k(c_1) = \frac{k_0 k_1}{k_1 + c_1}$ is nearly constant for low AGF concentration $c_1$, but decreases for high concentrations in the range of $k_1$. The parameter $k_1$ is therefore chosen close to $c_0 = 10^{-7}\,\mathrm{mole/m^3}$, which is supposed to be a relatively high concentration, again measured for the chemoattractant aFGF. This receptor-kinetic form reflects the more realistic assumption that chemotactic sensitivity decreases with higher AGF concentrations. It has been shown in [10] that the oxygen consumption is reduced at lower oxygen concentrations. Therefore, we take a linear reaction rate $R_3$ in our model. In order to estimate the oxygen delivered to the tissue by the vasculature in terms of the endothelial cell density, we make some simplifications. The oxygen flux through a segment of a vessel is approximately
$$Q_{O_2} = dS\,\frac{1}{w}\,K_{O_2}\,\big(P_{O_2}^{blood} - P_{O_2}^{tissue}\big), \qquad (14)$$
where $P_{O_2}^{tissue}$ is the oxygen partial pressure on the interface between the vessel and the tissue, $dS$ is the vessel surface, $w$ the wall thickness, and $Q_{O_2}$ is the oxygen transported through the vessel wall ([13]). The partial pressure of the blood is given by $P_{O_2}^{blood}$. The coefficient $K_{O_2}$ is also known as Krogh's oxygen diffusion coefficient ([18]). Assuming the known mean radius $R$ of a capillary, and its thickness $w$, we can estimate the oxygen source at the capillaries as
$$S_{O_2} = \frac{Q_{O_2}}{V} = c_2 \cdot \frac{2R}{R^2 - (R - w)^2}\,\frac{K_{O_2}\big(P_{O_2}^{blood} - P_{O_2}^{tissue}\big)}{w} = c_2 \cdot C_0. \qquad (15)$$
In order to obtain an estimate for α3 in eq. 7 we multiply C0 by the approximate diffusion length. For the mean tissue oxygen partial pressure we used the value measured in the breast by [14] to be 65mmHg. The blood oxygen pressure was assumed to be that of arterial blood.
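The bookkeeping of Eq. (15) and the scaling to α3 can be checked numerically; in the sketch below every physical value (capillary radius, wall thickness, Krogh coefficient, pressure difference, diffusion length) is a placeholder assumption inserted only to show the arithmetic, not a value taken from the paper.

```python
def oxygen_source_coefficient(R, w, K_O2, dP):
    """C0 of Eq. (15): oxygen source per unit endothelial cell density."""
    return (2.0 * R / (R**2 - (R - w)**2)) * K_O2 * dP / w

def alpha3(C0, diffusion_length):
    """alpha_3 of Eq. (7): C0 multiplied by the approximate diffusion length."""
    return C0 * diffusion_length

# placeholder values (NOT taken from the paper)
C0 = oxygen_source_coefficient(R=5e-6, w=1e-6, K_O2=1e-13, dP=35.0)
print(alpha3(C0, diffusion_length=1e-4))
```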
4 Results
The development of a virtual myoma is shown in Fig. 2, intensity-coded with stress, using an identical scale to facilitate magnitude comparisons. The neoplasm started growing as a small oval pathology in the myometrium and continues to grow into the surrounding tissue, which largely dissipates the generated strain energy. As shown, the tissue, after being exposed to the critical stress of around 1 MPa (Fig. 2e, value out of scale), gradually relaxes its stress, allowing the pathology to continue expanding and therefore only weakly constraining its progression. The formation of an experimentally observed type I myoma is inevitable. Should the behavior of the surrounding tissue be dominated by elasticity (i.e., the expansion-induced stress in the thin tissue slab would be mostly preserved), the stretched layer of the myometrium would very strongly constrain the growth. This is because the reaction force would be equal to - or greater than - the accumulated
Fig. 1. Wall stress in the thin tissue layer accumulating the strain energy (solid line) and dissipating it (dashed line)
Fig. 2. Simulation results of a myoma (type I) developing at the border of the myometrium. Due to stress dissipation there is no significant resistance from the tissue layer on the cavity side (upwards on the image). As a result the pathology envelope forms an acute angle with the basement layer.
stress, comparable in magnitude to the elastic modulus of the pathology (cf. Fig. 1). We have tested this scenario as well and found a myoma of type II with an appearance identical to that in Fig. 2e, only with much higher (accumulated) stress, constraining further growth. The myoma surfaces, surfaces only partially (if at all), or even protrudes back into the myometrium, depending on the initial starting position of the growth. All these scenarios are observed during myoma resection surgeries and can well be explained by our model (cf. [1]).
5 Discussion and Outlook
The findings resulting from our first computational experiments are in good agreement with observations on real cases. We are now in position to study e.g. how modifications of the vascular system by means of anti-angiogenic therapy could influence the dynamics and appearance of a myoma. We can also investigate how the different tissue properties determine if the myoma surfaces into the cavity or protrudes back into the myometrium. This and other in-silico experiments are under way and will be reported separately.
A point of discussion is the viscoelasticity of the surrounding tissue. This could correspond to e.g. increased proliferation or loss of elasticity due to structural fiber damage under prolonged exposure to stress. From histological stains we know that the healthy myometrium mesh and the membranaceous endometrium may be stretched by a stiff growing myoma to many times their original dimensions, accommodated by hypertrophy and/or hyperplasia. At a certain moment, the thinning of the healthy structures is no longer compatible with sufficient blood supply, leading to necrosis in the most stressed area first, e.g. causing pain and bleeding disorders. Obviously, our model cannot be free from simplifications. We currently only consider oxygen as the nutrient, while tumor cells often do not require much oxygen for growth, but rather glucose, fat, and amino acids. Even when addressing only oxygen in our current model, we work under the assumption that the oxygen is delivered only to the tumor surface. We do not currently model vascular penetration. Modeling the blood vessels explicitly as in [5] would allow making studies and predictions of the actual structure and function of the feeding vessels. Distinguishing between myometrium and endometrium is another extension that we consider for the next version. The quantitative validation of the presented findings is currently being prepared by comparisons to histologic stains and vascular corrosion castings of uteri containing myomas (cf. [2]). Acknowledgments. This work is part of the Swiss National Center of Competence in Research on Computer Aided and Image Guided Medical Interventions (NCCR Co-Me), supported by the Swiss National Science Foundation. We are indebted to Prof. Dr. med. Haymo Kurz from the Institute of Anatomy and Cell Biology, University of Freiburg, for numerous discussions and explanations.
References 1. Mencaglia, L., Hamou, J.E.: Manual of Gynecological Hysteroscopy. Endo-Press (2001) 2. Walocha, J.A., Litwin, J.A., Miodonski, A.J.: Vascular system of intramural leiomyomata revealed by corrosion casting and scanning electron microscopy. Hum. Reprod. 18(5), 1088–1093 (2003) 3. Szczerba, D., Szekely, G.: Simulating vascular systems in arbitrary anatomies. In: Duncan, J.S., Gerig, G. (eds.) MICCAI 2005. LNCS, vol. 3750, pp. 641–648. Springer, Heidelberg (2005) 4. Szczerba, D., Szekely, G., Kurz, H.: A multiphysics model of capillary growth and remodeling. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3992, pp. 86–93. Springer, Heidelberg (2006) 5. Lloyd, B., Szczerba, D., Sz´ekely, G.: A coupled finite element model of tumor growth and vascularization. In: Ayache, N., Ourselin, S., Maeder, A. (eds.) MICCAI 2007, Part II. LNCS, vol. 4792, pp. 874–881. Springer, Heidelberg (2007) 6. Gordon, V.D., Valentine, M.T., Gardel, M.L., Andor-Ardo, D., Dennison, S., Bogdanov, A.A., Weitz, D.A., Deisboeck, T.S.: Measuring the mechanical stress induced by an expanding multicellular tumor system: a case study. Experimental Cell Research 289(1), 58–66 (2003)
7. Ausprunk, D.H., Folkman, J.: Migration and proliferation of endothelial cells in preformed and newly formed blood vessels during tumor angiogenesis. Microvascular Research (1977) 8. Cristini, V., Lowengrub, J., Nie, Q.: Nonlinear simulation of tumor growth. Journal of Mathematical Biology V46(3), 191–224 (2003) 9. Humphrey, J.D., DeLange, S.: An Introduction to Biomechanics. In: Solids and Fluids, Analysis and Design. Springer, Heidelberg (2004) 10. Freyer, J.P., Sutherland, R.M.: A reduction in the in situ rates of oxygen and glucose consumption of cells in EMT6/Ro spheroids during growth. J. Cell. Physiol. 124(3), 516–524 (1985) 11. Frieboes, H.B., Zheng, X., Sun, C.H., Tromberg, B., Gatenby, R., Cristini, V.: An integrated computational/experimental model of tumor invasion. Cancer Res. 66(3), 1597–1604 (2006) 12. Salathe, E.P., Xu, Y.H.: Non-linear phenomena in oxygen transport to tissue. Journal of Mathematical Biology 30(2), 151–160 (1991) 13. Ji, J.W., Tsoukias, N.M., Goldman, D., Popel, A.S.: A computational model of oxygen transport in skeletal muscle for sprouting and splitting modes of angiogenesis. J. Theor. Biol. 241(1), 94–108 (2006) 14. Vaupel, P., Schlenger, K., Knoop, C., H¨ ockel, M.: Oxygenation of human tumors: evaluation of tissue oxygen distribution in breast cancers by computerized o2 tension measurements. Cancer Res. 51(12), 3316–3322 (1991) 15. Gabhann, F.M., Popel, A.S.: Interactions of VEGF isoforms with VEGFR-1, VEGFR-2, and neuropilin in vivo: a computational model of human skeletal muscle. Am. J. Physiol. Heart Circ. Physiol. 292(1), H459–H474 (2007) 16. Anderson, A.R.A., Chaplain, M.A.J.: Continuous and discrete mathematical models of tumor-induced angiogenesis. Bulletin of Mathematical Biology V60(5), 857– 899 (1998) 17. Stokes, C.L., Rupnick, M.A., Williams, S.K., Lauffenburger, D.A.: Chemotaxis of human microvessel endothelial cells in response to acidic fibroblast growth factor. Lab. Invest. 63(5), 657–668 (1990) 18. Ursino, M., Giammarco, P.D., Belardinelli, E.: A mathematical model of cerebral blood flow chemical regulation–Part I: Diffusion processes. IEEE Trans. Biomed. Eng. 36(2), 183–191 (1989)
Computational Implementation of a New Multiphysics Model for Field Emission from CNT Thin Films

N. Sinha1, D. Roy Mahapatra2, R.V.N. Melnik3, and J.T.W. Yeow1

1 Department of Systems Design Engineering, University of Waterloo, Waterloo, Canada
2 Department of Aerospace Engineering, Indian Institute of Science, Bangalore, India
3 M2NeT Lab, Wilfrid Laurier University, Waterloo, Canada
[email protected],
[email protected],
[email protected],
[email protected]
Abstract. Carbon nanotubes (CNTs) grown in a thin film have shown great potential as cathodes for the development of several field emission devices. However, in modeling these important devices we face substantial challenges, since the CNTs in a thin film undergo complex dynamics during field emission, which includes processes such as (1) evolution, (2) electromechanical interaction, (3) thermoelectric heating and (4) ballistic transport. These processes are coupled, nonlinear, and multiphysics in nature. Therefore, they must be analyzed accurately from the viewpoint of the stability and long-term performance of the device. Fairly detailed physics-based models of CNTs considering some of these aspects have recently been reported by us. In this paper, we extend these models and focus on their computational implementation. All components of the models are integrated at the computational level in a systematic manner in order to accurately calculate main characteristics such as the device current, which are particularly important for the stable performance of CNT thin film cathodes in x-ray devices for precision biomedical instrumentation. The numerical simulations reported in this paper are able to reproduce several experimentally observed phenomena, including a fluctuating field emission current, deflected CNT tips and the heating process. Keywords: carbon nanotubes, field emission, current density, phonon.
1 Introduction
Since their discovery in 1991 [1], substantial interest has been shown in potential applications of carbon nanotubes (CNTs). As a result, numerous devices incorporating CNTs have been proposed. Although some of the applications of CNTs may be realized only in the distant future, their application as electron field emitters already shows great potential today [2]. With significant improvement in their growth conditions, they rank among the best emitters that are currently available. These field-emitting cathodes have several advantages over conventional thermionic cathodes: (i) the current density from field emission would be
orders of magnitude greater than in the thermionic case, (ii) a cold cathode would minimize the need for cooling, and (iii) a field-emitting cathode can be miniaturized. The field emission performance of an isolated CNT is found to be remarkable due to its structural integrity, high thermal conductivity and geometry. However, the situation becomes complex for cathodes comprising CNT thin films. In this case, individual CNTs are not always aligned normal to the substrate surface, which is due to the electromechanical interaction among neighboring CNTs. Small spikes in the current have been observed experimentally [3]. These can be attributed to a change in the gap between the CNT tip and the anode plate, either due to elongation of CNTs under a high bias voltage or due to degradation/fragmentation of CNTs. Also, there is a possibility of dynamic contact of pulled-up CNT tips with the anode plate when the bias voltage is very high. In order to stabilize the collective field emission from a CNT-based thin film, preferential breakdown of a small number of CNTs is achieved by increasing the bias voltage after initial exposure to a certain low voltage [4]. In addition, the coupled electron-phonon transport may produce temperature spikes, and the temperature can significantly influence the electrical conductivity [5]. From the modeling point of view, this general case is very challenging. In this paper, we extend the results of [6]-[7] and include some of these aspects, focusing on the device-level performance of CNTs in a thin film. A diode configuration is considered here, where the cathode contains a CNT thin film grown on a metallic substrate. The anode acts as the field emission current collector. A major concern in this work is the inherent coupling between (i) the electromechanical forces causing deformation of the CNTs and (ii) the ballistic electron-phonon transport. From a system perspective, such a detailed study proves to be useful in understanding the reason behind the experimentally observed fluctuation in the device current, which is undesirable for applications such as precision x-ray generation in biomedical devices.
2 CNT Field Emission as a Multiphysics Process: The Development of a Mathematical Model
The physics of field emission from a flat metallic substrate is fairly well understood. The current density (J) due to field emission from a metallic surface is usually obtained by using the Fowler-Nordheim equation [8], which can be expressed as

J = \frac{BE^2}{\Phi} \exp\left( -\frac{C\Phi^{3/2}}{E} \right) ,   (1)

where E is the electric field, Φ is the work function for the cathode material, and B and C are constants. In the CNT thin film problem, under the influence of a sufficiently high voltage at ultra-high vacuum, the electrons emitted from the CNTs (mainly from the CNT tip region and emitted parallel to the axis of the tubes) reach the anode. Unlike metallic emitters, here the surface of the cathode is not smooth. The cathode consists of hollow tubes (CNTs) in curved shapes and with certain spacings. In addition, a certain amount of impurities and carbon clusters may be present within the otherwise empty spaces in the film.
Moreover, the CNTs undergo reorientation due to electromechanical interactions with the neighbouring CNTs during field emission. Analysis of these processes requires the determination of the current density by considering the individual geometry of the CNTs, their dynamic orientations and the variation in the electric field during electronic transport.
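As a simple numerical illustration of Eq. (1), the sketch below evaluates the Fowler-Nordheim current density over a range of local fields. It is only a sketch: the work function value and the field range are illustrative assumptions, and the expressions for B and C follow the values quoted later in Sect. 3 of this paper.

```python
import numpy as np

def fowler_nordheim_current_density(E, phi, C=6.5e7):
    """Eq. (1): J = (B*E^2/phi) * exp(-C*phi^(3/2)/E).

    E   : local electric field at the emitting surface
    phi : work function of the cathode material (eV)
    B, C: Fowler-Nordheim constants, taken as quoted in Sect. 3 of this paper
    """
    B = 1.4e-6 * np.exp(9.8929 / np.sqrt(phi))
    return (B * E**2 / phi) * np.exp(-C * phi**1.5 / E)

# Illustrative sweep over an assumed local tip-field range for phi = 5 eV
E = np.linspace(2e9, 6e9, 5)
print(fowler_nordheim_current_density(E, phi=5.0))
```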
Fig. 1. CNT array configuration
In the present problem, we consider an array as shown in Fig. 1. A representative volume (V_cell), which contains several CNTs with a prescribed distribution of their spacing at the substrate and a random distribution of their curved shapes, is considered for the purpose of modeling. Furthermore, we discretize the CNTs into several segments and nodes by treating each CNT as a 1D nanowire. At each node, we assign the quantities of interest, such as displacement, electron density, electric field and temperature. Next, governing equations involving these quantities of interest are derived in a systematic manner. An initial description of the thin film is given in terms of the tip angles and the curved shapes of the CNTs in V_cell, the uniform conduction electron density of unstrained CNTs, a bias electric field and a reference temperature (the temperature of the substrate). The phenomenological model of evolution of CNTs is given by four nonlinear coupled ordinary differential equations [6]. Based on this model, the rate of degradation of CNTs, v_burn, is defined as

v_{\mathrm{burn}} = V_{\mathrm{cell}} \left[ \frac{s(s-a_1)(s-a_2)(s-a_3)}{n^2 a_1^2 + m^2 a_2^2 + nm(a_1^2 + a_2^2 - a_3^2)} \right]^{1/2} \frac{dn_1(t)}{dt} ,   (2)

where n_1 is the concentration of carbon atoms in cluster form in the cell, a_1, a_2, a_3 are lattice constants, s = (a_1 + a_2 + a_3)/2, and n and m are integers (n ≥ |m| ≥ 0). The pair (n, m) defines the chirality of the CNT. Therefore, at a given time, the length of a CNT can be expressed as h(t) = h_0 − v_burn t, where h_0 is the initial average height of the CNTs and d is the distance between the cathode substrate and the anode (see Fig. 1). The effective electric field component for the field emission calculation in Eq. (1) is expressed as

E_z = -e^{-1} \frac{d\mathcal{V}(z)}{dz} ,   (3)
where e is the positive electronic charge and \mathcal{V} is the electrostatic potential energy. The total electrostatic potential energy can be expressed as

\mathcal{V}(x,z) = -eV_s - e(V_d - V_s)\frac{z}{d} + \sum_j G(i,j)\,(\hat{n}_j - n) ,   (4)

where V_s is the constant source potential (on the substrate side), V_d is the drain potential (on the anode side), G(i,j) is the Green's function [9] with i being the ring position, and \hat{n}_j denotes the electron density at node position j on the ring. The field emission current (I_cell) from the anode surface associated with V_cell of the film is obtained as

I_{\mathrm{cell}} = A_{\mathrm{cell}} \sum_{j=1}^{N} J_j ,   (5)
where A_cell is the anode surface area and N is the number of CNTs in the volume element. The total current is obtained by summing the cell-wise currents (I_cell). This formulation takes into account the effect of CNT tip orientations, and one can perform a statistical analysis of the device current for randomly distributed and randomly oriented CNTs. However, because the CNTs deform under electromechanical forces, the evolution process requires a much more detailed treatment from the mechanics point of view. Based on studies reported in the published literature [10]-[12], it is reasonable to expect that a major contribution is made by the Lorentz force due to the flow of electron gas along the CNT and the ponderomotive force due to electrons in the oscillatory electric field. The oscillatory electric field could be due to hopping of the electrons along the CNT surfaces and the changing relative distances between two CNT surfaces. In addition, the electrostatic force and the van der Waals force are also important. The net force components acting on the CNTs parallel to the Z and the X directions are calculated as [7]

f_z = \int (f_{lz} + f_{vsz})\, ds + f_{cz} + f_{pz} ,   (6)

f_x = \int (f_{lx} + f_{vsx})\, ds + f_{cx} + f_{px} ,   (7)

where f_l, f_vs, f_c and f_p are the Lorentz, van der Waals, Coulomb and ponderomotive forces, respectively, and ds is the length of a small segment of a CNT. Under the assumption of small strain and small curvature, the longitudinal strain ε_zz (including thermal strain) and stress σ_zz can be written as, respectively,

\varepsilon_{zz}^{(m)} = \frac{\partial u_{z0}}{\partial z'} - r^{(m)} \frac{\partial^2 u_x}{\partial z'^2} + \alpha\,\Delta T(z') , \qquad \sigma_{zz}^{(m)} = E'\,\varepsilon_{zz}^{(m)} ,   (8)

where the superscript (m) indicates the m-th wall of the multi-walled CNT (MWNT) with r^{(m)} as its radius, u_x and u_z are the lateral and longitudinal displacements of the oriented CNTs, E' is the effective modulus of elasticity of the CNTs under consideration, ΔT(z') = T(z') − T_0 is the difference between the absolute
temperature (T) during field emission and a reference temperature (T_0), and α is the effective coefficient of thermal expansion (longitudinal). Next, by introducing the strain energy density, the kinetic energy density and the work density, and applying the Hamilton principle, we obtain the governing equations in (u_x, u_z) for each CNT, which can be expressed as

E'A_2 \frac{\partial^4 u_x}{\partial z'^4} - \rho A_2 \frac{\partial^2 \ddot{u}_x}{\partial z'^2} + \rho A_0 \ddot{u}_x - f_x = 0 ,   (9)

-E'A_0 \frac{\partial^2 u_{z0}}{\partial z'^2} - \frac{E'A_0 \alpha}{2} \frac{\partial \Delta T(z')}{\partial z'} + \rho A_0 \ddot{u}_{z0} - f_z = 0 ,   (10)

where A_2 is the second moment of the cross-sectional area about the Z-axis, A_0 is the effective cross-sectional area, and ρ is the mass per unit length of the CNT. We assume fixed boundary conditions (u = 0) at the substrate-CNT interface (z = 0) and forced boundary conditions at the CNT tip (z = h(t)). By considering the Fourier heat conduction and thermal radiation from the surface of the CNT, the energy rate balance equation in T can be expressed as

dQ - \pi d_t^2\, dq_F - \pi d_t\, \sigma_{SB} (T^4 - T_0^4)\, dz' = 0 ,   (11)

where dQ is the heat flux due to Joule heating over a segment of a CNT, q_F is the Fourier heat conduction, d_t is the diameter of the CNT and σ_SB is the Stefan-Boltzmann constant. First, the electric field at the nodes is computed; then all the governing equations are solved simultaneously at each time step and the curved shape s(x + u_x, z + u_z) of each of the CNTs is updated. The angle of orientation θ between the nodes j+1 and j at the two ends of segment Δs_j is expressed as

\theta(t) = \tan^{-1}\left[ \frac{(x^{j+1} + u_x^{j+1}) - (x^j + u_x^j)}{(z^{j+1} + u_z^{j+1}) - (z^j + u_z^j)} \right] , \qquad \begin{pmatrix} u_x^j \\ u_z^j \end{pmatrix} = [\Gamma(\theta(t-\Delta t))] \begin{pmatrix} u_x'^j \\ u_z'^j \end{pmatrix} ,   (12)

where Γ is the usual coordinate transformation matrix which maps the displacements (u_x', u_z') defined in the local (X', Z') coordinate system into the displacements (u_x, u_z) defined in the cell coordinate system (X, Z).
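A minimal sketch of the kinematic update in Eq. (12) is given below. The exact form of the transformation matrix Γ is not spelled out in the text, so a standard 2-D rotation through the previous tip angle is assumed here; the node coordinates and local displacements in the example call are illustrative only.

```python
import numpy as np

def update_segment_angle(theta_prev, node_j, node_j1, u_loc_j, u_loc_j1):
    """Eq. (12): new orientation angle of the segment between nodes j and j+1.

    node_j, node_j1 : undeformed (x, z) coordinates of the two nodes (cell frame)
    u_loc_j(_j1)    : displacements (u_x', u_z') at the nodes in the local frame
    theta_prev      : orientation angle at the previous time step, theta(t - dt)
    """
    c, s = np.cos(theta_prev), np.sin(theta_prev)
    Gamma = np.array([[c, s], [-s, c]])        # assumed 2-D rotation standing in for Gamma
    u_j  = Gamma @ np.asarray(u_loc_j)         # local -> cell-frame displacements
    u_j1 = Gamma @ np.asarray(u_loc_j1)
    dx = (node_j1[0] + u_j1[0]) - (node_j[0] + u_j[0])
    dz = (node_j1[1] + u_j1[1]) - (node_j[1] + u_j[1])
    return np.arctan2(dx, dz)                  # tan(theta) = dx / dz, as in Eq. (12)

# Illustrative call: a nearly vertical segment with a small lateral deflection
print(update_segment_angle(0.0, (0.0, 0.0), (0.0, 1.0e-7), (0.0, 0.0), (5.0e-9, 0.0)))
```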
3 Computational Scheme, Results and Discussions
A key characteristic is the device current, and in what follows we focus on the systematic integration of all the models at the computational level to calculate it. At a given time, the evolved concentration of carbon clusters due to the process of degradation and CNT fragmentation is obtained from the nucleation-coupled model, in which degradation is treated as the reverse of the growth process (nucleation theory has been used for the growth of CNTs [13]). This information is then used in a time-incremental manner to describe the evolved state of the CNTs in the cells.
(Fig. 2 shows the computational flow: the nucleation-coupled model and the momentum balance equation feed the electromechanical forces and geometry update; the electron gas flow and the ballistic transport at the tip region yield the electric field at the CNT tips; the kinematics of the CNTs and the atomic coordinates are updated at each time step t_i, from which the current density and the device current are obtained.)
Fig. 2. Computational flowchart for calculating the device current
Fig. 3. Spikes in the field emission current at low bias voltage due to reorientation and pull-up of few CNTs
At each time step, the net electromechanical force is computed using the momentum balance equation and the equation for electron gas flow. Subsequently, the orientation angle of each CNT tip is obtained. Thereafter, we compute the electric field at the tips of the CNTs at each time step. Finally, the current density and the device current are calculated by employing Eq. (1). The computational flow chart for calculating the device current is shown in Fig. 2. The CNT film considered in this study consists of randomly oriented MWNTs. The film was grown on a stainless steel substrate. The film surface area (projected on the anode) is 49.93 mm2 and the average height of the film (based on randomly distributed CNTs) is 12 μm. Actual experiments were carried out in a pressure-controlled vacuum chamber and field emission current histories were measured under various DC bias voltages.
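The sketch below restates the flow chart of Fig. 2 as a plain time-stepping loop. All helper functions named here (degrade_cnts, electromechanical_forces, and so on) are hypothetical placeholders standing in for the model components of Sect. 2, not an actual API from the paper.

```python
def simulate_device_current(cnts, A_cell, phi, bias_voltage, dt, n_steps):
    """Hypothetical driver loop following Fig. 2; all helpers are placeholders."""
    current_history = []
    for step in range(n_steps):
        degrade_cnts(cnts, dt)                                  # nucleation-coupled model, Eq. (2)
        forces = electromechanical_forces(cnts, bias_voltage)   # Eqs. (6)-(7)
        update_kinematics(cnts, forces, dt)                     # Eqs. (9)-(10), tip angles via Eq. (12)
        update_temperature(cnts, dt)                            # energy rate balance, Eq. (11)
        E_tips = electric_field_at_tips(cnts, bias_voltage)     # Eqs. (3)-(4)
        J_tips = [fowler_nordheim_current_density(E, phi) for E in E_tips]  # Eq. (1)
        current_history.append(A_cell * sum(J_tips))            # cell current, Eq. (5)
    return current_history
```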
Fig. 4. Fluctuation of field emission current from a baked sample having vertically aligned CNTs
Fig. 5. Field emission current histories for various initial tip deflections under a bias voltage of 500V
Fig. 3 shows the occurrence of current spikes at a voltage of 500V, which indicates that a few CNTs are emitting heavily and are pulled up towards the anode. More spikes are observed as the bias voltage is increased to 700V (see Fig. 4). In the simulation and analysis, the constants B and C in Eq. (1) were taken as B = (1.4 × 10^{-6}) × exp(9.8929 × Φ^{-1/2}) and C = 6.5 × 10^7, respectively [14]. The initial heights h and the orientation angles θ were randomly distributed. The electrode gap (d) was maintained at 34.7 μm. The orientation of the CNTs was parametrized in terms of the upper bound of the CNT tip deflection (denoted by h_0/m, m ≫ 1). Several computational runs were performed and the output data were averaged at each sampling time step. Figures 5-6 show the simulated current histories at different tip deflections and at different bias voltages.
Fig. 6. Field emission current histories for various initial tip deflections under a bias voltage of 700V
Fig. 7. Current at the CNT tips at t=100 s of field emission
The following observations have been made from the results: (i) at a constant bias voltage, as the initial state of deflection of the CNTs increases (from h_0/25 to h_0/20), the average current reduces, until the initial state of deflection becomes large enough (h_0/15) that the electrodynamic interaction among CNTs produces a sudden pull of the deflected tips towards the anode, resulting in current spikes; (ii) the amplitude factor of the current spikes at the higher bias is of the order of ∼10^3, whereas the trend indicates current spikes with an amplitude factor of ∼10^2 for the lower bias voltage. Figure 7 shows the tip current distribution at t = 100 s for an array of 100 CNTs. After calculating the strain from Eq. (8), the corresponding changes in the bandgap along the CNT length were calculated using a tight-binding formulation for the band structure as a function of strain [15]. The value of Young's modulus used for the calculation was 0.27 TPa. The strained energy bandgap along the length of the CNT is shown in Fig. 8. The unstrained bandgap value was found to be 3.0452 eV.
Fig. 8. Cross-sectional energy bandgap distribution along the CNT at 500V and 700V
Fig. 9. Maximum temperature of CNT tips during 100 s of field emission
As evident from Fig. 8, as the strain increases, the energy bandgap decreases. In Fig. 9, the maximum tip temperature distribution over the 100 CNTs during 100 s of field emission is plotted. The maximum temperature rises to approximately 358 K.
4 Concluding Remarks
In this paper, a new multiphysics model has been proposed, which incorporates nonlinearities and coupling related to electrodynamic, mechanical and thermodynamic phenomena during the process of field emission. This model handles several complexities at the device scale and helps in understanding the fluctuation in the device current. Using the developed computational scheme, we were able to capture the transients in the field emission current, which have been
observed in actual experiments. This model can be useful in designing CNT thin film cathodes that have a stable field emission current without compromising their lifetime.
References 1. Iijima, S.: Helical tubules of graphitic carbon. Nature 354, 56–58 (1991) 2. Bonard, J.M., Salvetat, J.P., Stockli, T., Forro, L., Chatelain, A.: Field emission from carbon nanotubes: perspectives for applications and clues to the emission mechanism. Appl. Phys. A 69, 245–254 (1999) 3. Bonard, J.M., Klinke, C., Dean, K.A., Coll, B.F.: Degradation and failure of carbon nanotube field emitters. Phys. Rev. B 67, 115406 (2003) 4. Siedel, R.V., Graham, A.P., Rajashekharan, B., Unger, E., Liebau, M., Duesberg, G.S., Kreupl, F., Hoenlein, W.: Bias dependence and electrical breakdown of small diameter single-walled carbon nanotubes. J. Appl. Phys. 96, 6694–6699 (2004) 5. Huang, N.Y., She, J.C., Chen, J., Deng, S.Z., Xu, N.S., Bishop, H., Huq, S.E., Wang, L., Zhong, D.Y., Wang, E.G., Chen, D.M.: Mechanism responsible for initiating carbon nanotube vacuum breakdown. Phys. Rev. Lett. 93, 75501 (2004) 6. Sinha, N., Roy Mahapatra, D., Yeow, J.T.W., Melnik, R.V.N., Jaffray, D.A.: Carbon nanotube thin film field emitting diode: understanding the system response based on multiphysics modeling. J. Compu. Theor. Nanosci. 4, 535–549 (2007) 7. Sinha, N., Roy Mahapatra, D., Sun, Y., Yeow, J.T.W., Melnik, R.V.N., Jaffray, D.A.: Electromchanical interactions in a carbon nanotube based thin film field emitting diode. Nanotechnology 19(1-12), 25701 (2008) 8. Fowler, R.H., Nordheim, L.: Electron emission in intense electric field. Proc. R. Soc. Lond. A 119, 173–181 (1928) 9. Svizhenko, A., Anantram, M.P.: Effect of scattering and contacts on current and electrostatics in carbon nanotubes. Phys. Rev. B 72, 85430 (2005) 10. Slepyan, G.Y., Maksimenko, S.A., Lakhtakia, A., Yevtushenko, O., Gusakov, A.V.: Electrodynamics of carbon nanotubes: dynamic conductivity, impedance boundary conditions, and surface wave propagation. Phys. Rev. B 60, 17136–17149 (1999) 11. Glukhova, O.E., Zhbanov, A.I., Torgashov, I.G., Sinistyn, N.I., Torgashov, G.V.: Ponderomotive forces effect on the field emission of carbon nanotube films. Appl. Surf. Sci. 215, 149–159 (2003) 12. Ruoff, R.S., Tersoff, J., Lorents, D.C., Subramoney, S., Chan, B.: Radial deformation of carbon nanotubes by van der Waals forces. Nature 364, 514–516 (1993) 13. Watanabe, T., Notoya, T., Ishigaki, T., Kuwano, H., Tanaka, H., Moriyoshi, Y.: Growth mechanism for carbon nanotubes in a plasma evaporation process. Thin Solid Films 506/507, 263–267 (2006) 14. Huang, Z.P., Tu, Y., Carnahan, D.L., Ren, Z.F.: Field emission of carbon nanotubes. Encycl. Nanosci. Nanotechnol. 3, 401–416 (2004) 15. Yang, L., Anantram, M.P., Han, J., Lu, J.P.: Band-gap change of carbon nanotubes: effect of small uniaxial strain and torsion strain. Phys. Rev. B 60, 13874–13878 (1999)
A Multiphysics and Multiscale Software Environment for Modeling Astrophysical Systems

Simon Portegies Zwart1, Steve McMillan2, Breanndán Ó Nualláin1, Douglas Heggie3, James Lombardi4, Piet Hut5, Sambaran Banerjee6, Houria Belkus7, Tassos Fragos8, John Fregeau8, Michiko Fuji9, Evghenii Gaburov1, Evert Glebbeek10, Derek Groen1, Stefan Harfst1, Rob Izzard10, Mario Jurić5, Stephen Justham11, Peter Teuben12, Joris van Bever13, Ofer Yaron14, and Marcel Zemp15

1 University of Amsterdam, Amsterdam, The Netherlands, [email protected]
2 Drexel University, Philadelphia, PA, USA
3 University of Edinburgh, Edinburgh, UK
4 Allegheny College, Meadville, PA, USA
5 Institute for Advanced Study, Princeton, USA
6 Tata Institute of Fundamental Research, India
7 Vrije Universiteit Brussel, Brussel, Belgium
8 Northwestern University, Evanston IL, USA
9 University of Tokyo, Tokyo, Japan
10 Utrecht University, Utrecht, the Netherlands
11 University of Oxford, Oxford, UK
12 University of Maryland, College Park, MD, USA
13 Saint Mary's University, Halifax, Canada
14 Tel Aviv University, Tel Aviv, Israel
15 University of California Santa Cruz, Santa Cruz, CA, USA
Abstract. We present MUSE, a software framework for tying together existing computational tools for different astrophysical domains into a single multiphysics, multiscale workload. MUSE facilitates the coupling of existing codes written in different languages by providing inter-language tools and by specifying an interface between each module and the framework that represents a balance between generality and computational efficiency. This approach allows scientists to use combinations of codes to solve highly-coupled problems without the need to write new codes for other domains or significantly alter their existing codes. MUSE currently incorporates the domains of stellar dynamics, stellar evolution and stellar hydrodynamics for a generalized stellar systems workload. MUSE has now reached a "Noah's Ark" milestone, with two available numerical solvers for each domain. MUSE can treat small stellar associations, galaxies and everything in between, including planetary systems, dense stellar clusters and galactic nuclei. Here we demonstrate an example calculated with MUSE: the merger of two galaxies. In addition we demonstrate the working of MUSE on a distributed computer. The current MUSE code base is publicly available as open source at http://muse.li.
Keywords: Stellar Dynamics And Evolution; Radiative Transfer; Grid Computing; High-Performance Computing; Multi-Scale Computing.
1 Introduction
The Universe is a multi-physics environment in which, from an astrophysical point of view, Newton's gravitational force law, radiative processes, nuclear reactions and hydrodynamics mutually interact. Astrophysical problems are generally multi-scale, with, in the extreme, spatial and temporal scales ranging from 10^4 meters and 10^{-3} seconds on the small end to 10^{20} m and 10^{17} s on the large end. The combined multi-physics, multi-scale environment presents a tremendous theoretical challenge for modern science. While observational astronomy fills important gaps in our knowledge by harvesting ever wider spectral coverage with continuously increasing resolution and sensitivity, our theoretical understanding lags behind these exciting developments in instrumentation. Computational astrophysics is situated between observations and theory. The calculations generally cover a wider range of physical phenomena, whereas purely theoretical studies are often tailored to a relatively limited range of spectral coverage. On the other hand, extensive calculations can support observational astronomy by mimicking observations, and they support interpretation by enabling wide parameter-space studies. They can elucidate complex consequences of physical theories. But extensive computer simulations intended to deepen our knowledge of the physics require large programming efforts and a good fundamental understanding of the underlying physics. Where modern instruments are generally built by tens or hundreds of people, the development of theoretical models and software environments is generally a one-person endeavor. Theory lends itself excellently to this relatively individualistic approach, but scientific computing is in a less favorable position. Developing a simulation environment suitable for multi-physics scientific research is not a simple task. In contrast to purely theoretical studies, computer models often require a much broader scope, with non-linear couplings between various physical domains. As long as the physical scope remains relatively limited, the software only needs to address the problem of solving sets of differential equations in a single physical domain and with a limited range in size scales and time scales. Such software can be built by a single scientific programmer or a numerically well-educated astronomer. Regrettably, these packages are often "single-written single-use", and thus single purpose: reuse of scientific software within astronomy is still rarely done. Problems which encompass multiple time or size scales are sometimes coded by small teams of astronomers. There are several examples of successful projects, such as FLASH [1], GADGET [2] and starlab [3], in which a team of several scientists collaborates in writing a large-scale simulation environment. The resulting software of these projects has a broad user base and is applied to a variety of problems. These packages, however, address one very specific task,
and their use is limited to the type of physics that is addressed and the solver that is used. In addition, it requires considerable expertise to use these packages. In this paper we describe a software framework that targets multi-scale, multiphysics problems in a hierarchical and somewhat consistent implementation. The development of this framework is based on the philosophy of “open knowledge” 1 .
2 The Concept of MUSE
The development of MUSE was initiated during the MODEST-6a2 workshop in Lund (Sweden [12]), but the first lines of code were written during MODEST-6d/e in Amsterdam (The Netherlands). During the two workshops MODEST-7f in Amsterdam and MODEST-7a in Split (Croatia) the concept of Noah's Ark was initiated and realized (see Sect. 2.1).
Fig. 1. Basic structural design of the framework (MUSE). The top layer (flow control) is connected to the middle (interface) layer, which controls the command structure for the individual applications. These parts and the underlying interfaces are written in Python, whereas the applications can be written in any language. In this example only a selection of numerical techniques is shown for each of the applications, such as smoothed particle hydrodynamics (e.g. starcrash [4] and gadget [2]) to solve the gas dynamics, Metropolis-Hastings Monte Carlo for addressing the radiative transfer and ionization/thermal/chemical balance (e.g. Moccasin [5]), Henyey (STARS [6], EZ [7]) or parameterized codes (like SeBa [8]) for stellar evolution, and direct integration for the stellar dynamics (Barnes-Hut tree code [9], Hermite0 or the kira integrator in starlab [3]).
The development of a multi-physics simulation environment can be approached from a monolithic or from a modular point of view. In the monolithic approach
1 See for example http://www.artcompsci.org/ok/.
2 MODEST stands for MOdeling DEnse STellar Systems, and the term was coined during the first MODEST meeting in New York (US) in 2001. The web page for this coalition is http://www.manybody.org/modest. See also [10,11].
a single numerical solver is subsequently expanded to include more physics. Basic design choices made for the initial numerical solver are petrified in the initial architecture. Nevertheless, such codes are sometimes successfully redesigned to include two or possibly even three solvers for different physical phenomena (see FLASH, where hydrodynamics has been combined with magnetic fields). Rather than forming a self-consistent framework, the different physical domains in these environments are made to co-exist. This approach is prone to errors, and the resulting large simulation packages are often hampered by bugs, redundancy in source code, chunks of dead code and a lack of homogeneity. The assumptions needed to make these codes compile and operate without fatal errors often hamper the science. In addition, the underlying assumptions are rarely documented and the resulting science is at best hard to interpret. We address these issues by developing a modular numerical environment, in which independently developed specialized numerical solvers are coupled at a meta level, resulting in a framework as depicted in Fig. 1. The modular approach has many advantages. Existing codes which have been well tuned and tested in their own domains can be reused by wrapping them in a thin layer and interfacing them to a framework. The identification and specification of suitable interfaces for such codes allows the codes to be interchanged easily. An important element of the framework will be the provision of documentation and exemplars for the design of new modules, and their integration into the framework. A user can "mix and match" modules like building blocks to find the most suitable combination for his application. The resulting framework is also more easily maintainable, since the dependencies between modules are well separated from their functionality. A particular advantage of a modular framework is that a motivated scholar can focus attention on a narrower area, write a module for it and integrate it with knowledge of only the bare essentials of the framework interfaces. For example, it will take little extra work to adapt the results of a successful student project into a separate module, or a researcher working with his own code for one field of physics may wish to find out how his code interacts in a multiphysics environment. The shallower learning curve of the framework will lower the barrier to entry, make it more accessible and ultimately lead to a more open and extensible system. The only constraint that a code must meet to be wrapped as a module is that it is written in a programming language with a Foreign Function Interface which can be linked to a contemporary Unix-like system. This includes many popular languages such as C, C++ and Fortran as well as other high-level languages such as C#, Java or Haskell. The flexibility of this framework allows a much broader range of applications to be prototyped, and the bottom-up approach makes the code much easier to understand, expand and maintain. If a particular combination of modules is found to be particularly suited to an application, greater efficiency can be achieved, if desired, by hard-coding the interfaces and factoring out the glue code, thus ramping up to a specialized monolithic code.
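To make the "thin wrapper" idea concrete, the sketch below shows how an existing solver could be hidden behind a small, uniform interface. Both the stand-in legacy class and the method names are purely illustrative assumptions; they are not the actual MUSE interface.

```python
class LegacyNBody:
    """Stand-in for an existing N-body code that would normally sit behind an FFI."""
    def load(self, particles):  self.particles = list(particles)
    def integrate(self, t_end): pass            # the wrapped domain code would do the work
    def dump(self):             return self.particles

class NBodyModule:
    """Thin wrapper exposing only what the framework needs to see."""
    def __init__(self):             self.solver = LegacyNBody()
    def setup(self, particles):     self.solver.load(particles)
    def evolve(self, t_end):        self.solver.integrate(t_end)
    def get_state(self):            return self.solver.dump()

# The framework can then treat any wrapped solver interchangeably:
module = NBodyModule()
module.setup([{"m": 1.0, "pos": (0.0, 0.0, 0.0), "vel": (0.0, 0.0, 0.0)}])
module.evolve(t_end=1.0)
```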
2.1 Noah's Ark
Instead of writing a new code from scratch, we envision a software framework in which a glue language is used to bind a wide collection of diverse applications. We call this environment MUSE, for MUltiphysics Software Environment. The MUSE framework consists of a hierarchical component architecture that encapsulates dynamic shared libraries for simulating stellar evolution, stellar dynamics and treatments for colliding stars. Additional packages for file I/O, data analysis and plotting are included. Our objective is to eventually include gas dynamics and radiative transfer, but at this point these are not yet incorporated. We have so far included at least two working packages for each domain of stellar collisions (hydrodynamics), stellar evolution and stellar dynamics, in what we label the Noah's Ark milestone. The homogeneous interface which connects the kernel modules enables us to switch packages at runtime via a scheduler. In this paper we demonstrate modularity and interchangeability.

Stellar Collisions. The physical interaction between stars is incorporated into the framework by means of (semi)hydrodynamic solvers. At the moment two methodologies are incorporated: one is based on the make-me-a-star (MMAS) package [13]3 and its revised version make-me-a-massive-star (MMAMS) [14]4; the other solution is based on sticky spheres. The former (MMAS and MMAMS) can be combined with full stellar evolution models, as they process the internal stellar structure in a similar fashion to the stellar evolution codes. The sticky sphere approximation only works with parameterized stellar evolution, as it does not require any knowledge of the internal stellar structure.

Stellar Dynamics. To simulate gravitational dynamics (e.g., between stars and/or planets), we incorporate packages to solve Newton's equations of motion by means of gravitational N-body solvers. Currently two N-body kernels are available: a direct force evaluation method and a tree code. The direct N-body code is based on the 4th-order Hermite predictor-corrector N-body integrator with block time steps [15]. If present, the code can benefit from special hardware like GRAPE [16] and modern GPUs [17,18]. This method provides the high accuracy needed for simulating dense stellar systems, but even with special computer hardware it lacks the performance to simulate systems with more than 10^6 particles. For simulating large-N systems we have incorporated a Barnes-Hut [9] tree code.

Stellar Evolution. Many applications require the structure and evolution of stars to be followed at various levels of detail. Examples are stellar masses and radii as a function of time (important for feedback on stellar dynamics), luminosities and the photon energy distribution of the stellar spectrum (important for feedback on radiative transfer), mass loss rates, outflow velocities and yields of
3 See http://webpub.allegheny.edu/employee/j/jalombar/mmas/
4 See http://modesta.science.uva.nl/
various chemical elements (returned to the gas in the system and to be followed hydrodynamically) and even the detailed interior structure (to follow the outcome of a stellar merger or collision). Consequently the stellar evolution module should ideally incorporate both a very rapid but approximate code for applications where speed (i.e. huge numbers of stars) is paramount (such as when using the Barnes-Hut tree code for addressing the stellar dynamics) and a fully detailed (but much slower) structure and evolution code where accuracy is most important (for example when studying relatively small but dense star clusters). Currently two stellar evolution modules are incorporated. One is based on fits to precalculated stellar evolution tracks [19], the other solves the set of coupled partial differential equations of stellar structure and evolution [6]. The lower speed of the second method is inconvenient, but the better physics allows for a much more realistic treatment of unconventional stars, such as collision products.
2.2 Performance
Large-scale simulations, in particular the multiscale and multiphysics simulations for which our framework is intended, require a large number of very different algorithms, many of which achieve their highest performance on a specific computer architecture. For example, the gravitational N-body simulations are best performed on a GRAPE-enabled PC, the hydrodynamical simulations are accelerated using GPU hardware, whereas the trivially parallel execution of a thousand single stars is best done on a Beowulf cluster computer. The top-level organization of what should run where is managed using a resource broker, which is grid enabled (see Sect. 2.4). The individual packages have to indicate on what hardware they operate optimally. Some of these modules are individually parallelized using the MPI library, whereas others (like stellar evolution) are handled via a master-slave approach by the top-level manager. Certain parts of the individual modules benefit enormously from dedicated computing. For example, the gravitational direct N-body calculations are sped up by special-purpose GRAPE-6 [20,16] or GPU hardware to orders of magnitude faster than on workstations [17,18,21].
2.3 Units
A notorious pitfall in combining scientific software is the failure to perform correct conversion of physical units between modules. In a highly modular environment such as MUSE, this is a significant concern. One approach to the problem could have been to insist on a standard set of units for modules to be incorporated into MUSE but this is neither practical nor in keeping with the MUSE philosophy. Instead we provide a Units module, in which is encoded information about the physical units used in all other modules, conversion factors between them and certain useful physical constants. When a module is added to MUSE, the programmer adds a declaration of the units which that module prefers. When several modules are imported into a MUSE experiment, the Units module then
A Multiphysics and Multiscale Software Environment
213
takes care of ensuring that all values passed to each module are in its preferred units. Naturally, the flexibility which this approach affords also introduces an overhead. But it is this flexibility which is MUSE's great advantage; it allows the experimenter to easily mix and match modules until the desired combination is found. When that combination is found, the dependence on the Units module can be removed and the conversion of physical units performed by explicit code. This leads to more efficient interfacing between modules, while the correctness of the manual conversion can be checked against that of the Units module.
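A minimal sketch of the unit-conversion idea is given below: each value is rescaled between the units preferred by the producing and the consuming module. The unit names, conversion factors and example values are illustrative only, not the actual MUSE Units module.

```python
TO_SI = {
    "m": 1.0, "km": 1.0e3, "pc": 3.0857e16,   # length
    "s": 1.0, "yr": 3.156e7,                  # time
    "kg": 1.0, "Msun": 1.989e30,              # mass
}

def convert(value, from_unit, to_unit):
    """Rescale a value from one declared unit to another via SI."""
    return value * TO_SI[from_unit] / TO_SI[to_unit]

# e.g. a stellar evolution module returning a radius in km feeds a dynamics
# module that prefers parsecs:
r_km = 6.96e5                         # roughly a solar radius, for illustration
print(convert(r_km, "km", "pc"))
```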
2.4 MUSE on the Grid
Due to the wide range in computational characteristics of each of the modules, we plan on running MUSE on a computational grid with a number of specialized machines. Here we report on our preliminary grid interface, which allows us to use remote machines to distribute individual MUSE modules on the grid. In this way we reduce the runtime by adopting the computers which are best suited for each module, rather than continuing a calculation even though the selected machine may be less suitable for that particular part of the calculation. For example, we can select a large GRAPE cluster in Tokyo for a direct N-body calculation while the stellar evolution is calculated on a Beowulf cluster in Amsterdam. The current preliminary interface uses the MUSE scheduler as the manager of grid jobs and replaces internal module calls with a job execution sequence. This is implemented with PyGlobus, an application programming interface to the Globus grid middleware written in Python. The execution sequence for each module consists of the following steps (a schematic sketch is given after this list):
• Write the state of a module, such as initial conditions, to file.
• Transfer the state file to the destination site.
• Construct a grid job definition using the Globus resource specification language.
• Submit the job to the grid. The launched job subsequently:
  - reads the state file,
  - executes the specified MUSE module,
  - writes the new state of the module to a file,
  - and copies the state file back to the MUSE scheduler.
• Then read the new state file and resume the simulation.
The grid interface has been tested successfully using the distributed ASCI computer (DAS-3). We executed individual invocations of the stellar dynamics, stellar evolution and stellar collisions modules on remote machines.
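The schematic below restates that execution sequence as code. The helper callables (stage_out, submit, stage_in) and the module's get_state/set_state methods are placeholders for the PyGlobus file-transfer and job-submission machinery used by the prototype; no actual Globus API calls are shown.

```python
def run_module_remotely(module, site, stage_out, submit, stage_in):
    """Schematic remote execution of one MUSE module on a grid site."""
    state = module.get_state()                 # 1. write the module state (e.g. initial conditions)
    stage_out(state, site)                     # 2. transfer the state file to the destination site
    job = {"site": site,                       # 3. job description (RSL-like, greatly simplified)
           "module": type(module).__name__}
    submit(job)                                # 4. remote job: read state, run the module,
                                               #    write the new state and copy it back
    new_state = stage_in(site)                 # 5. fetch the returned state file
    module.set_state(new_state)                #    and resume the local simulation
```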
3 MUSE Example: Two Black Holes in Merging Galaxies
Here we demonstrate the possibility of changing the integration method within a MUSE application during runtime.
Fig. 2. Time evolution of the distance between two black holes which initially reside in the center of a galaxy with 2048 particles that is a hundred times more massive than the black hole. Initially the two “galaxies” were located far apart. The curves indicate calculations with the direct integrator (PP), a tree code (TC) and using the hybrid method in MUSE (PP+TC). The units along the axis are in dimensionless N -body units [22].
We deployed two integrators for simulating the merging of two galaxies, each with a central black hole. The final stages of such a merger, with two black holes orbiting each other, can only be integrated accurately using a direct method. Since this is computationally expensive, the early evolution of such a merger is generally ignored and these calculations are typically started some time during the merger process, for example when the two black holes form a hard bound pair inside the merged galaxy. These rather arbitrary starting conditions for binary black hole mergers can be improved by integrating the initial merger between the two galaxies. We use the BHTree code to reduce the computational cost of simulating this merger process. When the tree code fails to produce accurate results, the simulation is continued using the direct integration method. Overall this results in a considerable reduction of the runtime while still preserving an accurate integration of the close interactions. In Fig. 2 we show the results of such a simulation. The initial conditions are two Plummer spheres with 1024 particles each, all with the same mass. Each "galaxy" receives a black hole with a mass of 1% of that of the galaxy. The two stellar systems are subsequently set on a collision orbit, but at a fairly large distance from each other. The simulation is performed three times: once using Hermite0, once using BHTree and once using the hybrid method. In the latter case the equations of motion are integrated using Hermite0 if the two black holes are within 0.3 N-body units [22]5, otherwise we use the tree code. In Fig. 2 we show the time evolution of the distance between the two black holes.
5 Or see http://en.wikipedia.org/wiki/Natural units#N-body units.
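The switching logic of the hybrid run can be summarized as in the sketch below: the direct Hermite integrator is used whenever the two black holes are closer than 0.3 N-body units, and the tree code otherwise. The system and integrator objects, with their method names, are hypothetical placeholders, not the actual MUSE classes.

```python
def evolve_hybrid(system, hermite, tree, t_end, dt, r_switch=0.3):
    """Advance the system, choosing the integrator from the black-hole separation."""
    t = 0.0
    while t < t_end:
        bh1, bh2 = system.black_holes()
        r = sum((a - b) ** 2 for a, b in zip(bh1.pos, bh2.pos)) ** 0.5
        integrator = hermite if r < r_switch else tree   # PP inside 0.3 N-body units, TC outside
        integrator.set_state(system)                     # hand the current particle set over
        integrator.evolve(t + dt)
        system = integrator.get_state()                  # take the updated particles back
        t += dt
    return system
```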
The integration with only the direct Hermite0 integrator took about 4 days on a normal workstation and the tree code took about 2 hours. The hybrid code took almost 2 days. As expected, the relative error in the energy of the direct N-body simulation (< 10^{-6}) is orders of magnitude smaller than the error in the tree code (∼ 1%). The energy error in the hybrid code is comparable to that of the tree code, which in part is caused by the regular change of methodology. Even though we were unable to further reduce the energy error in the hybrid code, it seems safe to assume that the close encounters of the two black holes are treated much more accurately in the hybrid approach than in the pure tree-code simulation. This is supported by the very small O(10^{-6}) error in the energy during the close interactions between the two black holes, which were computed using the direct integrator. If the system was integrated using the tree code, the energy errors were also characteristic of the adopted methodology. The latter then obviously dominates the total error, irrespective of the accurate integration of the close encounters.

Acknowledgments. We are grateful to Atakan Gürkan, Junichiro Makino, Stephanie Rusli and Dejan Vinković for many discussions. This research was supported in part by the Netherlands Organization for Scientific Research (NWO grant No. 635.000.001 and 643.200.503), the Netherlands Advanced School for Astronomy (NOVA), the Leids Kerkhoven-Bosscha fonds (LKBF), the International Space Science Institute (ISSI) in Bern, the ASTRISIM program of the European Science Foundation, by NASA ATP grant NNG04GL50G and grant No. NNX07AH15G, by the National Science Foundation under Grant No. PHY9907949 (S.L.W.McM.) and No. 0703545 (J.C.L.), the Special Coordination Fund for Promoting Science and Technology (GRAPE-DR project), the Japan Society for the Promotion of Science (JSPS) for Young Scientists, Ministry of Education, Culture, Sports, Science and Technology, Japan and DEISA. Part of the calculations were done on the LISA cluster and the DAS-3 wide area computer in the Netherlands. We are also grateful to SARA computing and networking services, Amsterdam for their support.
References 1. Fryxell, B., et al.: FLASH: An Adaptive Mesh Hydrodynamics Code for Modeling Astrophysical Thermonuclear Flashes. Apjs 131, 273–334 (2000) 2. Springel, V., Yoshida, N., White, S.D.M.: GADGET: a code for collisionless and gasdynamical cosmological simulations. New Astronomy 6, 79–117 (2001) 3. Portegies Zwart, S.F., McMillan, S.L.W., Hut, P., Makino, J.: Star cluster ecology - IV. Dissection of an open star cluster: photometry. MNRAS 321, 199–226 (2001) 4. Lombardi Jr., J.C., Warren, J.S., Rasio, F.A., Sills, A., Warren, A.R.: Stellar Collisions and the Interior Structure of Blue Stragglers. apj 568, 939–953 (2002) 5. Ercolano, B., Barlow, M.J., Storey, P.J.: The dusty mocassin: fully self-consistent 3d photoionization and dust radiative transfer models. MNRAS 362, 1038–1046 (2005) 6. Eggleton, P.P.: The evolution of low mass stars. MNRAS 151, 351 (1971) 7. Paxton, B.: EZ to Evolve ZAMS Stars: A Program Derived from Eggleton’s Stellar Evolution Code. PASP 116, 699–701 (2004)
8. Portegies Zwart, S.F., Verbunt, F.: Population synthesis of high-mass binaries. A & A 309, 179–196 (1996) 9. Barnes, J., Hut, P.: A Hierarchical O(NlogN) Force-Calculation Algorithm. Nat 324, 446–449 (1986) 10. Hut, P., et al.: MODEST-1: Integrating stellar evolution and stellar dynamics. New Astronomy 8, 337–370 (2003) 11. Sills, A., et al.: MODEST-2: a summary. New Astronomy 8, 605–628 (2003) 12. Davies, M.B., et al.: The MODEST questions: Challenges and future directions in stellar cluster research. New Astronomy 12, 201–214 (2006) 13. Lombardi, J.C., Thrall, A.P., Deneva, J.S., Fleming, S.W., Grabowski, P.E.: Modelling collision products of triple-star mergers. MNRAS 345, 762–780 (2003) 14. Gaburov, E., Lombardi, J.C., Portegies Zwart, S.: Mixing in massive stellar mergers. MNRAS 9, L5–L9 (2008) 15. Makino, J., Aarseth, S.J.: On a hermite integrator with ahmad-cohen scheme for gravitational many-body problems. Publ. Astr. Soc. Japan 44, 141–151 (1992) 16. Makino, J.: Direct Simulation of Dense Stellar Systems with GRAPE-6. In: Deiters, S., Fuchs, B., Just, A., Spurzem, R., Wielen, R. (eds.) ASP Conf. Ser. 228: Dynamics of Star Clusters and the Milky Way, p. 87 (2001) 17. Portegies Zwart, S.F., Belleman, R.G., Geldof, P.M.: High-performance direct gravitational N-body simulations on graphics processing units. New Astronomy 12, 641–650 (2007) 18. Belleman, R.G., B´edorf, J., Portegies Zwart, S.F.: High performance direct gravitational N-body simulations on graphics processing units II: An implementation in CUDA. New Astronomy 13, 103–112 (2008) 19. Eggleton, P.P., Fitchett, M.J., Tout, C.A.: The distribution of visual binaries with two bright components. 347, 998–1011 (1989) 20. Makino, J., Taiji, M.: Scientific simulations with special-purpose computers: The GRAPE systems. In: Makino, J., Taiji, M. (eds.) Scientific simulations with specialpurpose computers: The GRAPE systems, John Wiley & Sons, Chichester, Toronto (1998) 21. Hamada, T., Fukushige, T., Makino, J.: PGPG: An Automatic Generator of Pipeline Design for Programmable GRAPE Systems. In: ArXiv Astrophysics eprints (March 2007) 22. Heggie, D.C.: Binary evolution in stellar dynamics. MNRAS 173, 729–787 (1975)
Dynamic Interactions in HLA Component Model for Multiscale Simulations

Katarzyna Rycerz1,2, Marian Bubak1,3, and Peter M.A. Sloot3

1 Institute of Computer Science, AGH, al. Mickiewicza 30, 30-059 Kraków, Poland
2 Academic Computer Centre CYFRONET AGH, Nawojki 11, 30-950 Kraków, Poland
3 Faculty of Sciences, Section of Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands
{kzajac,bubak}@uci.agh.edu.pl, [email protected]
Phone: (+48 12) 617 39 64, Fax: (+48 12) 633 80 54
Abstract. In this paper we present a High Level Architecture (HLA) component model, particularly suitable for distributed multiscale simulations. We also present a preliminary implementation of HLA components and the CompoHLA environment that supports setting up and managing multiscale simulations built in the described model. We propose to integrate solutions from High Level Architecture (such as advanced time and data management) with possibilities given by component technologies (such as reusability and composability) and the Grid (such as joining geographically distributed communities of scientists). This approach will allow users working on multiscale applications to more easily exchange and join the simulations already created. The particular focus of this paper is on the design of a HLA component. We show how to insert simulation logic into a component and make it possible to steer from outside its connections with other components. Its functionality is shown through example of multiscale simulation of dense stellar system. Keywords: Components, Grid computing, HLA, distributed multiscale simulation, problem solving environments.
1 Introduction and Motivation
Building a simulation system from modules of different scale is an interesting, but non-trivial issue. There are many approaches to combining different models and scales in one problem solution; this is difficult and depends on the actual models being combined. When designing problem solving environments for the simulation of multiphysics multiscale systems, one can identify issues related to the actual connection of two or more models together (this includes support for joining models with different internal time management, efficient data exchange between two modules of different scale, etc.). Apart from that, there are also issues related to reusing existing models and their composability. For scientists working on multiscale systems it could be useful to find already existing models they need and to connect them either together or to their own models. To facilitate
that, there is a need to wrap their simulations into recombinant components that can be selected and assembled in various combinations to satisfy specific requirements. Also, there is a need for an infrastructure that allows scientists to exchange these components in a relatively easy way across administrative domains. To summarize, there is a need for a component-based model with support specific to multiscale simulations (complex time interactions, efficient data transfer between modules of different scale, etc.) and for an environment that allows such components to be built and reused across administrative domains. For the first requirement, we decided to use solutions provided by the High Level Architecture (HLA) [1], especially because of its advanced time management mechanism that allows connecting modules with different internal time management. Together with time management, other HLA services (data management, ownership management etc.) give a powerful tool for multiscale simulations [2]. Additionally, HLA partially supports the second requirement – interoperability and composability of simulation models – as it separates the actual simulations from the communication infrastructure. Also, each simulation (federate), when connecting to a simulation system (federation), is required to have a description of the objects and events exchanged with others. HLA even provides the Management Object Model (MOM), which allows one of the federates in the federation to manage the connections of the others. However, for a component approach this is not enough, as dynamically joining federates to a federation and easily manipulating their connections from outside by a third party are not directly possible. To satisfy this requirement we propose an HLA-based component model that allows an external module (e.g. a builder) to define and control the particular behavior of components and their connections on the user's request. A user is able not only to dynamically set up a multiscale simulation system composed of chosen HLA components residing on the Grid [3], but also to decide how components will interact with each other by setting up appropriate HLA data and time management, and to change the nature of their interactions during simulation runtime. To make HLA components more user friendly for their developers, we have designed them in such a way that they do not require full knowledge of the quite complicated HLA API. The third requirement – the ability to manipulate components across administrative domains – can be fulfilled by using a Grid infrastructure. As the Grid platform hosting HLA components we have chosen the H2O environment [4]. The main purpose of our approach is to facilitate joining simulations of different scale, so it is directed at users who want to create new multiscale simulation systems from existing components or join their own new component to a multiscale system. For users who have their own HLA application and want to run it almost unaltered efficiently using the Grid, we suggest our previous work [5], where we focused on the execution management of existing legacy HLA applications and the best usage of available Grid resources, which can be achieved by using the provided migration and monitoring services. In this paper we focus on a dynamic change of the component's time policy and on manipulating the objects' exchange between simulation modules (which is
done by switching the publish–subscribe mechanism on and off). The functionality of the system is shown through an example of a multiscale simulation of dense stellar systems. The paper is organized as follows: in Section 2 we outline related work, and in Section 3 we describe the HLA component model and a prototype of its implementation. Section 4 presents the idea of a Grid support system for such a model. Section 5 presents an example multiscale simulation, from which we have taken the simulation logic for the two components used in our experiment, as well as the results of the experiment. A summary and future plans are given in Section 6.
2 Related Work
Multiscale simulations are an important and interesting field of research; examples include multiphysics capillary growth [6] and the modeling of colloidal dynamics [7]. The rapidly growing number of papers in this area shows the need for an environment that facilitates the exchange of developed models between scientists working in the field, and for a reusable component solution. Component standards worth mentioning include the Common Component Architecture (CCA) [8] (with implementations such as XCAT [9] and MOCCA [10]), the CORBA Component Model [11], and the Grid Component Model [12] (with its implementation ProActive [13]). However, none of these models provides advanced features for distributed multiscale simulations (in particular, they do not support an advanced time management mechanism). An important approach to applying service and component technology to distributed simulations is described in [14]; it addresses distributed simulation in general, without a special focus on multiscale simulation systems. Another important component framework for simulations [15] is specifically designed for partial differential equations.
3 HLA Component Model
The HLA component model differs from popular component models (e.g. CCA [8]) in that the federates do not use direct connections (in CCA, one component is connected to another when its uses port is connected to the partner's provides port, as shown in Fig. 1a). Instead, all federates within a federation are connected together and can communicate using HLA mechanisms such as time, data and ownership management. This is illustrated in Fig. 1b. The difference between the component view proposed in this paper and the original HLA approach is that the particular behavior of a component and its connections are defined and set by an external module on the user's request. The main advantage of this approach over original HLA is that it makes it easier for a user to create federations from federates developed by others. The particular federation in which a federate is going to take part does not need to be defined by the federate developer, but can be created later – from outside – in the process of setting up a distributed simulation system. Therefore, the presented approach increases the reusability and composability of simulations.
Fig. 1. CCA and HLA component models
Dynamic interactions in the HLA component model. All federates within a federation are connected together using a tuple space, which they can use for subscribing to and publishing events and data objects. The tuple space takes care of sending the appropriate data from the publisher to the subscriber (in most HLA implementations the objects and interactions to be exchanged have to be specified before federation creation, but there exists an HLA implementation based on Grid Services [16] which does not have this limitation). HLA also includes an advanced time management mechanism that allows federates with different internal time management to be connected. There is no single time for all simulations; every simulation has its own internal time, and some federates (called regulating) can regulate the time of others (called constrained) if necessary. The communication between federates is done by exchanging time-stamped objects and interactions. In the HLA component model, as shown in Fig. 1b, it is possible not only to dynamically join a federate to a federation from outside [17], but also to steer the connections between federates. It is therefore possible to request one of the components to subscribe to a particular object class (from the objects declared in its description) and another to publish this object class. In this way one can steer the data flow between simulations during runtime. It is also possible to set one federate to be constrained and another to be regulating. Section 5 provides an example that illustrates this. HLA component implementation. As the Grid framework we have chosen the H2O platform [4], as it is lightweight and enables dynamic remote deployment. The HLA component is implemented as an H2O pluglet. In our prototype we have implemented basic requests to start and stop the simulation and requests for federation management – join and resign – described in more detail in [17]. In this
Fig. 2. Relationship between the component developer's code (simulation logic), the HLA RTI implementation and the compoHLA library
paper we focus on chosen requests for time and data management: setting the time policy to regulating or constrained, and subscribing to and publishing an object class. The set of requests can easily be extended in the future according to what HLA offers. The component developer has to provide the simulation logic code, which is connected to the pluglet by interfacing with the compoHLA library, as shown in Fig. 2. The compoHLA library introduces two classes with abstract methods that have to be overridden by the component developer. One is the CompoHLASimulator class, from which the developer has to inherit and point to the main function starting the simulation. There is also the CompoHLADataObject class, which has to be inherited for each data object that is going to be published by the federate and made visible outside to an external user (who may choose this component to be connected to his simulation system). The developer has to specify how the actual simulation data fits into the HLA data objects that can be exchanged with other federates. Also, the developer has to override the FederateAmbassador class callbacks (these are used by the RTI to communicate with the developer's code, e.g. when receiving data from other federates), as in an original RTI federate. The simulation developer can also call methods of the CompoHLAFederate class which, in turn, uses the HLA RTIambassador class (the main class providing HLA services). These methods include getting information about the federate time, requesting time advance, and checking whether a stop request has arrived (in order to perform final operations before the simulation exits). The use of the compoHLA library does not free the developer from understanding the HLA time management and data exchange mechanisms, but it simplifies their use and allows the HLA component to be steered from outside (by the external requests described above).
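To illustrate the developer's side, a minimal sketch of such a wrapping is given below. The class names CompoHLASimulator, CompoHLADataObject and CompoHLAFederate come from the description above, but every method name and signature is a hypothetical assumption rather than the actual compoHLA API.

    // Hypothetical sketch; all method names are assumptions.
    public class DynamicsSimulator extends CompoHLASimulator {

        private final CompoHLAFederate federate; // time info, time-advance requests, stop flag
        private final StarObject stars = new StarObject();

        public DynamicsSimulator(CompoHLAFederate federate) {
            this.federate = federate;
        }

        // entry point the developer points the library to
        @Override
        public void simulate() {
            double dt = 0.01; // internal time step of this kernel
            while (!federate.stopRequested()) {
                integrate(federate.getTime(), dt); // developer's own simulation logic
                stars.publishUpdate();             // push state into the HLA data object
                federate.advanceTimeBy(dt);        // blocking; honours the regulating/constrained policy
            }
        }

        private void integrate(double t, double dt) { /* ... */ }
    }

    // one subclass per published object class declared in the component's SOM
    class StarObject extends CompoHLADataObject {
        void publishUpdate() { /* map simulation state to HLA attributes */ }
    }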
4 CompoHLA – Environment Supporting HLA Component Model
Fig. 3 shows the proposed CompoHLA environment, which supports setting up and managing applications consisting of HLA components. It consists of the following elements: HLA Component Description Repository – stores the descriptions of components, including information about the data objects and interactions that the
Fig. 3. Collaboration diagram illustrating the interactions between the basic elements of the CompoHLA environment. The Simulation Object Model (SOM) contains the description of objects and events that can be exchanged by an HLA component. The Federation Object Model (FOM) consists of SOMs and describes a federation of HLA components.
component can exchange with others (its Simulation Object Model – SOM), the type of time management that makes sense for the component, and additional information that may be useful for a user who wants to set up a multiscale system (e.g. the units of the produced data, the scale of the simulation time, whether rollback is possible, how subscription to particular data affects the simulation, the average execution time, etc.). HLA Components Description Assembler – produces the Federation Object Model (FOM), needed to start a federation, from the SOMs of the components that will comprise the simulation system. Builder – sets up a simulation system on behalf of the user. It uses the Federation Manager Component to create a federation and instructs HLA components to join it. It can also instruct chosen components to set an appropriate time management mechanism and to subscribe to or publish chosen data objects or events. Federation Manager Component – manages the whole federation at the component level and sets up the connection with the RTI coordination process for federations (RTIExec in the DoD HLA RTI implementation, rtig in CERTI). HLA Components – wrap the actual functionality of federates into components. In this paper we particularly focus on this part of the system.
The collaboration between system elements is shown in the Fig. 3. A user can search the HLA Components Description Repository to find HLA components of interest to him. Then he can use the HLA Components Description Assembler to build a federation description (FOM) from the available components’ descriptions (SOMs) and pass it to the Builder that sets up the federation from appropriate HLA components. The user than can dynamically change the nature of connections between components using a Builder that makes direct requests (described in this paper) to HLA components.
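To make this collaboration concrete, the fragment below sketches the Builder's side of Fig. 3. All identifiers are illustrative assumptions and not the actual CompoHLA interfaces; the request names mirror those introduced in Section 3 (join, set time policy, publish, subscribe).

    // 1) find components and assemble a FOM from their SOMs
    List<ComponentDescription> descriptions = repository.searchDescriptions("MUSE");
    FederationObjectModel fom = assembler.createFOM(descriptions);

    // 2) create the federation and let the chosen components join it
    builder.createFederation("muse-federation", fom);   // via the Federation Manager Component
    builder.join("evolution");
    builder.join("dynamics");

    // 3) configure time and data management; the same requests may also be issued later at runtime
    builder.setTimePolicy("evolution", TimePolicy.REGULATING);
    builder.setTimePolicy("dynamics", TimePolicy.CONSTRAINED);
    builder.publish("evolution", "Star");
    builder.subscribe("dynamics", "Star");

    builder.startAll();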
5 Experiments with an Example Application – Multiscale Multiphysics Scientific Environment (MUSE)
For the purposes of this research we have used simulation modules of different time scales taken from the Multiscale Multiphysics Scientific Environment (MUSE) [18], designed for simulating dense stellar systems such as globular clusters and galactic nuclei. However, the presented HLA component approach can be applied to modules of any other multiscale simulation system. Section 3 describes in detail how new simulation kernels are wrapped into components. The original MUSE consists of a Python scheduler and three simulation modules of different time scales: stellar evolution (macro scale), stellar dynamics (an N-body simulation – meso scale) and hydrodynamics (simulation of collisions – micro scale). There are also plans to add radiative transfer, 3D gas and dust density distribution, and stellar spectral energy distribution. In [2] we have shown that distributing MUSE modules using HLA on the Grid can be beneficial for such applications. In particular, we have shown the usefulness of HLA's advanced time management for this kind of simulation. For the purposes of this paper, we have chosen to build components from two MUSE modules: evolution (macro scale) and dynamics (meso scale), which run concurrently. Data are sent from the macro scale to the meso scale, as the stellar evolution data (changes in star mass) are needed by dynamics. In this particular case, no data are needed from dynamics to evolution. The simulation system has to make sure that dynamics receives the update from evolution before it actually passes the appropriate point in time. The HLA time management mechanism [1] offers many useful features here. The mechanism in which a regulating federate (evolution) controls the time flow in a constrained federate (dynamics) is essential. Its main advantage is that it does not require explicitly specifying the period in which the constrained federate should stop and wait for the regulating federate. Instead, the maximal point in time which the constrained federate can reach at a given moment is calculated dynamically according to the position of the regulating federate on the time axis. HLA data management is the second important mechanism useful here. Used together with time management, it allows federates to exchange data objects and interactions with time stamps and ensures that objects arrive in the appropriate order at the appropriate time of the simulation. To obtain data from evolution, the dynamics federate subscribes to the data object, containing information about stars, published by evolution.
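The interplay between the two components can be sketched as follows, using hypothetical wrapper calls in the spirit of Section 3 (this is not the authors' code; DT_EVOLUTION and DT_DYNAMICS stand for the internal time steps of the two modules).

    // evolution component (macro scale, regulating federate)
    for (double t = 0; !federate.stopRequested(); t += DT_EVOLUTION) {
        evolveStars(t);                          // macro-scale stellar evolution step
        starObject.sendTimestamped(t);           // time-stamped star-mass update (HLA data management)
        federate.advanceTimeBy(DT_EVOLUTION);    // its position on the time axis bounds the constrained federate
    }

    // dynamics component (meso scale, constrained federate)
    for (double t = 0; !federate.stopRequested(); t += DT_DYNAMICS) {
        federate.advanceTimeBy(DT_DYNAMICS);     // blocks until the regulating federate permits this time
        applyReceivedStarUpdates();              // mass changes arrive in time-stamp order
        integrateNBody(t);                       // meso-scale N-body step
    }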
This HLA component model makes these mechanisms accessible to an external user who sets up the simulation system from existing dynamics and evolution components created by someone else. This allows users working on multiscale simulations to exchange already created models more easily.

Performance Results. We have created two prototype HLA components for the dynamics and evolution simulations and measured the execution time of requests to them. The dynamics component was asked to set its time policy to constrained and to subscribe to the star object class. The evolution component was asked to set its time policy to regulating and to publish the star object class. In our prototype, publication and subscription apply to a whole object class, but it is easy to extend this to support subscription to and publication of a subset of class attributes. In our implementation we have used H2O v2.1 and the CERTI HLA implementation v3.2.4. The experiments were performed on the Dutch Grid DAS3 [19]. The RTI control process was run on a Grid node at the Amsterdam Free University (dual-CPU, dual-core, 2.4 GHz AMD Opteron, 4 GB RAM), and the client was run at Leiden University (dual-CPU, single-core, 2.6 GHz AMD Opteron, 4 GB RAM). In the first experiment the dynamics component was run at the University of Amsterdam (dual-CPU, dual-core, 2.2 GHz AMD Opteron, 4 GB RAM) and the evolution component at Delft (dual-CPU, single-core, 2.4 GHz AMD Opteron, 4 GB RAM). As the requests to evolution and dynamics were of different kinds, in the second experiment we switched the locations of dynamics (run at Delft) and evolution (run at Amsterdam) in order to compare the times of all kinds of requests made to components residing on the same site. The network bandwidth between the Grid sites is 10 Gbps.

Table 1 shows the results of the experiments (averages over 10 runs) in milliseconds. In both experiments, the time of requests to components residing on the Amsterdam site was slightly shorter than to those in Delft, probably because of the slightly different architectures of the two sites (see above). It is worth noting that for both sites the overhead of the CompoHLA component layer is rather small – on the order of a few milliseconds for all requests. The elapsed time of the pure RTI execution depends on the particular HLA implementation. In [2] we have shown that the execution times of the sequential and distributed versions of MUSE (evolution and dynamics modules) are comparable and that the synchronisation between the two modules does not produce much overhead. In [2] we have also discussed the different types of time interactions that can appear between components in a multiscale simulation system. Although they were described using the example of MUSE, in general they can appear in applications from any other knowledge domain. We have shown that such systems can benefit from being distributed using HLA. Additionally, using the HLA component approach presented in this paper will increase the reusability and interoperability of multiscale modules.

Table 1. Time of HLA component request execution for the evolution and dynamics components measured from MUSE (averages over 10 runs, in ms)

                          Amsterdam                       Delft
  action                  whole request    RTI time       whole request    RTI time
                          avr     sigma    avr    sigma   avr     sigma    avr    sigma
  set time constrained    4.5     0.5      0.1    0.02    7       0.3      0.1    0.02
  set time regulating     5       0.3      0.1    0.02    7.5     0.5      0.1    0.03
  publish                 6       0.4      1      0.1     10      0.5      2      0.1
  subscribe               6       0.4      1      0.1     10      0.4      3      0.1
6 Summary and Future Work
The main objective of the research presented in this paper was to offer scientists working with multiscale simulations a component model that facilitates joining elements of different scale, and to provide them with an environment supporting applications in that model. The novelty of the proposed approach lies in integrating solutions from HLA, component technologies and the Grid to support multiscale applications. As none of the existing component models explicitly addresses the requirements of multiscale simulations, we have presented the idea of a model based on HLA, which enables users to dynamically compose and decompose distributed simulations from multiscale components residing on the Grid. Using the proposed CompoHLA environment, a user is able to decide how components will interact with each other (e.g. by setting up appropriate HLA subscription/publication and time management mechanisms) and to change the nature of their interactions during simulation runtime. This approach differs from the original HLA, where all decisions about actual connections are made by the federates themselves. The functionality of the prototype is shown on the example of a multiscale simulation of a dense stellar system – the MUSE environment [18]. The results of the experiments show that the execution times of the requests are relatively low and that the component layer does not introduce much overhead. In the future we plan to fully design and implement the other modules of the presented support system. We also plan to extend the HLA component prototype to allow the insertion of plugins (e.g. built by a physicist who wants to join two already existing components) that can scale the output of one federate into an appropriate input for another federate. Acknowledgments. The authors wish to thank Maciej Malawski for discussions on component models and Simon Portegies Zwart for valuable discussions on MUSE. We also acknowledge access to the DAS3 distributed computer system. This research was partly funded by the EU IST Project CoreGRID and the Polish State Committee for Scientific Research SPUB-M, through the BSIK project Virtual Laboratory for eScience and ACK CYFRONET AGH Grant No. 500–08.
References 1. IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) (2004), http://standards.ieee.org/catalog/olis/compsim.html 2. Rycerz, K., Bubak, M., Sloot, P.M.A.: Using HLA and Grid for Distributed Multiscale Simulations. In: Parallel Processing and Applied Mathematics: 7th International Conference (PPAM 2007). LNCS, vol. 4967. Springer, Heidelberg (to appear, 2008)
3. Foster, I., Kesselman, C., Nick, J., Tuecke, S.: The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure WG, Global Grid Forum (June 2002) 4. Kurzyniec, D., Wrzosek, T., Drzewiecki, D., Sunderam, V.S.: Towards SelfOrganizing Distributed Computing Frameworks: The H2O Approach. Parallel Processing Letters 13(2), 273–290 (2003) 5. Rycerz, K.: Grid-based HLA Simulation Support. PhD thesis, University of Amsterdam, Promotor: Prof. Dr. P.M.A. Sloot, Co-promotor: Dr. M.T. Bubak (June 2006), http://dare.uva.nl/en/record/192213 6. Szczerba, D., Sz´ekely, G., Kurz, H.: A Multiphysics Model of Capillary Growth and Remodeling. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3992, pp. 86–93. Springer, Heidelberg (2006) 7. Dzwinel, W., Yuen, D., Boryczko, K.: Bridging diverse physical scales with the discrete-particle paradigm in modeling colloidal dynamics with mesoscopic features. Chemical Engineering Sci. 61, 2169–2185 (2006) 8. Armstrong, R., Kumfert, G., McInnes, L.C., Parker, S., Allan, B., Sottile, M., Epperly, T., Dahlgren, T.: The CCA component model for high-performance scientific computing. Concurr. Comput.: Pract. Exper. 18(2), 215–229 (2006) 9. Krishnan, S., Gannon, D.: XCAT3: A Framework for CCA Components as OGSA Services. In: Proc. Int. Workshop on High-Level Parallel Progr. Models and Supportive Environments (HIPS), Santa Fe, New Mexico, USA, April 2004, pp. 90–97 (2004) 10. Malawski, M., Kurzyniec, D., Sunderam, V.S.: MOCCA – Towards a Distributed CCA Framework for Metacomputing. In: 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), CD-ROM / Abstracts Proceedings, Denver, CA, USA, 4-8 April (2005) 11. CORBA Component Model, v4.0 (2006), http://www.omg.org/technology/documents/formal/components.htm 12. Deliverable D.PM.02 – Proposals for a Grid Component Model (2006), http://www.coregrid.net 13. ProActive project homepage, http://www-sop.inria.fr/oasis/ProActive/ 14. Chen, X., Cai, W., Turner, S.J., Wang, Y.: SOAr-DSGrid: Service-Oriented Architecture for Distributed Simulation on the Grid. In: Principles of Advanced and Distributed Simulation (PADS), pp. 65–73 (2006) 15. Parker, S.G.: A component-based architecture for parallel multi-physics PDE simulation. Future Generation Computer Systems 22, 204–216 (2006) 16. Pan, K., Turner, S.J., Cai, W., Li, Z.: A Service Oriented HLA RTI on the Grid. In: IEEE International Conference on Web Services, 9-13 July 2007, pp. 984–992 (2007) 17. Rycerz, K., Bubak, M., Sloot, P.M.A.: HLA Component Based Environment for Distributed Multiscale Simulations (submitted to Special Issue of Scientific Programming on Large-Scale Programming Tools and Environments) 18. MUSE Web page, http://muse.li/ 19. The Distributed ASCI Supercomputer 3 web page, http://www.cs.vu.nl/das3
An Agent-Based Coupling Platform for Complex Automata

Jan Hegewald(1), Manfred Krafczyk(1), Jonas Tölke(1), Alfons Hoekstra(2), and Bastien Chopard(3)

(1) Technical University Braunschweig, Germany, {hegewald,kraft,toelke}@irmb.tu-bs.de
(2) University of Amsterdam, The Netherlands, [email protected]
(3) University of Geneva, Switzerland, [email protected]
Abstract. The ability to couple distinct computational models of science and engineering systems is still a recurring challenge when developing multiphysics applications. The applied coupling technique is often dictated by various constraints (such as hard- and software requirements for the submodels to be coupled). This may lead to different coupling strategies/implementations in case a submodel has to be replaced in an existing coupled setup. Additional efforts are required when it comes to multiscale coupling: at least one of the submodels has to be modified to provide a matching interface on a specific spatial and temporal scale. In the present paper we describe a generic coupling mechanism/framework to reduce these common problems and to facilitate the development of multiscale simulations consisting of a multitude of submodels. The resulting implementation allows the coupling of legacy as well as dedicated codes with only minor adjustments. As the system is built upon the JADE library, our platform fully supports computations on distributed heterogeneous hardware. We discuss the platform's capabilities by demonstrating the coupling of several cellular-automata kernels to model a coupled transport problem. Keywords: generic coupling, heterogeneous, JADE, multi-scale, mutual interactions.
1 Complex Automata
In order to be able to develop non-trivial computational models it is usually an essential prerequisite to identify the major elements of the target simulation system. In doing so, one can construct complex multi-science models, and the same elements can be used to start the software design. Each contributing sub-model may use a different modelling technique, such as cellular automata (CA), a numerical kernel for PDEs or multi-agent based systems (MABS) (e.g. used for biomedical simulations [1]). In addition the models
will most likely use varying abstractions of a shared item, e. g. different spatial or temporal scales of a shared domain. In this work the resulting combined model will be termed Complex Automata (CxA) [2], a paradigm emerging from the EU funded project COAST [3].
2 Coupling Environment
The software design for a CxA should follow the concepts of the involved sub-models as closely as possible to allow for better maintenance and reusability. This results in separate software components for every sub-model, which have to be coupled to build the complete CxA, instead of a monolithic code for each individual CxA. Since there is no limit to the number of sub-models (kernels) of a CxA, there should be some kind of middleware or library to aid the development and maintenance of complex CxA. This middleware should also embed the execution model of the CxA approach, so the implementation of each kernel can focus on the specific sub-model itself. This way we can easily exchange individual sub-model implementations, and the reuse of existing sub-model code is greatly simplified.
2.1 Core Implementation
Though our CxA middleware implementation, the Distributed Space Time Coupling Library (DSCL), is still work in progress, we already have non-trivial CxA simulations up and running. The major core modules of the DSCL are implemented in the Java programming language. As of now, the library supports the coupling of Fortran, C and C++ kernels, next to pure Java codes. Multiple languages may be freely intermixed within a single CxA. A setup is configured via ASCII files which declare the CxA graph and the scale separation map (SSM). Each vertex in the graph represents an involved sub-simulation (e.g. a CA or a MABS) and the edges designate sub-model couplings. The SSM describes the different spatial and temporal scales of the individual sub-models [4]. Because each sub-simulation has its own controlling task (i.e. thread), a CxA may run fully distributed in a distributed environment. Since the idea is to keep each sub-simulation implementation unaware of the whole CxA layout, the system can make use of distributed-memory environments, even on heterogeneous hardware setups. Thread control is done via the MABS middleware JADE [5], where each thread belongs to a software agent [6]. JADE provides peer-to-peer message-passing functionality, which is currently used for the communication along the graph edges. A bandwidth comparison between JADE and MPICH MPI [7,8] is shown in Fig. 1. These bandwidth measurements were performed on a Linux cluster whose nodes (AMD 64, Debian) are directly connected via various network interfaces. For these tests the 100 Mbit and 1000 Mbit channels were used, with the nodes connected via a Gbit-capable network switch. See also [9] for further scalability and performance tests regarding JADE.
Fig. 1. Comparison of JADE and MPICH MPI bandwidth using TCP
MABS are well suited for building large software systems. Complex software systems, like CxA, consist of several more or less tightly coupled subsystems that are organized in a hierarchical fashion, i.e. the subsystems can contain other subsystems. It has been argued that MABS are well suited to reduce the complexity of building such software systems [10]. The topic is influenced by ongoing research in areas such as parallel and distributed discrete event simulation [11] and object-oriented simulation [12].
2.2 Reusability and Coupling Interface
Since the connected kernels of a CxA have no information about the whole graph, we can keep the kernel implementations separate from a specific CxA application. This allows us to reuse the kernels in different CxA setups, even without recompilation. To achieve this, the coupling knowledge has to reside where the data transfer takes place: at the coupling interfaces (edges). In the DSCL these smart edges are called “conduits”, which act as a kind of network (or local) pipe with a data sink and a data source. In its simplest form such a conduit is a unidirectional pipe and can pass data within a single machine or transfer it across a network. For a generic coupling approach we have to map the data from the sending side to the required input format of the receiving kernel. This involves scale mapping and also data conversion, if necessary. Using this technique it is possible to connect a kernel k1 to multiple kernels (k2, k3, ...) at runtime, which may require different I/O data formats, e.g. due to the different scales required by the sub-models.
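A minimal sketch of the conduit idea is given below, assuming a simple generic interface; the class and method names are illustrative and not the actual DSCL API.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // A unidirectional "smart edge": data written at the entrance is filtered
    // (scale mapping, data conversion) before it reaches the receiving kernel.
    interface ConduitFilter<A, B> {
        B apply(A data); // e.g. interpolate from a coarse to a fine grid, convert units
    }

    class Conduit<A, B> {
        private final BlockingQueue<A> queue = new LinkedBlockingQueue<>();
        private final ConduitFilter<A, B> filter;

        Conduit(ConduitFilter<A, B> filter) {
            this.filter = filter;
        }

        // called by the sending kernel (entrance side)
        void put(A data) {
            queue.add(data);
        }

        // called by the receiving kernel (exit side); blocks like an MPI receive,
        // which implicitly synchronizes the time loops of the two kernels
        B take() throws InterruptedException {
            return filter.apply(queue.take());
        }
    }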
Another benefit of the conduits is the flexible substitution of kernels which provide functionality for the same vertex in the CxA graph. If, for example, a kernel k0,α is used to represent vertex v0, the kernel may be exchanged with another implementation k0,β. This is even possible at runtime. In order to reduce the effort required to use legacy code as a participating kernel in a CxA setup, each kernel retains full control over its main time-loop. All communication calls to the conduits are issued by the responsible kernel. Except for booting, the kernel implementation does not need to provide a public interface. We allow the CxA to influence the local time-loop execution of a kernel by means of blocking calls from kernel to conduit. This can be compared to blocking MPI send/receive calls, which can often simply be replaced by their DSCL equivalents (highlighted lines in Fig. 4). The time loops are implicitly synchronized this way. To make a kernel available to the DSCL, it has to be glued to a Java class which is (derives from) a JADE agent (Fig. 2).
    public class PseudoKernelWrapper extends CAController {

        protected void addPortals() {
            // add data inlets/outlets
            addExit(reader = new ConduitExit(...));
            addEntrance(new ConduitEntrance(...));
        }

        protected void execute() {
            // launch kernel
            bootKernel();
            // done
        }

        private void bootKernel() {
            // kernel main-loop
            for (time = 0; !stop(); time += INTERNAL_DT) {
                // read from our conduit(s) at designated frequency
                if (time % DT_READ == 0)
                    data = read();

                // process data

                // dump to our conduit(s) at designated frequency
                if (time % DT_WRITE == 0)
                    write(data);
            }
        }
    }

Fig. 2. Pseudo code of a kernel agent
3 Transport Problem Example
During the initial development phase, a testbed with two distinct simulation kernels was developed from an originally single-kernel flow solver. This existing code is a preliminary kernel to simulate river channel erosion and sediment transport within a fluid flow. The solver uses a modified lattice-Boltzmann automaton [13] to simulate incompressible flow, to which terms modeling buoyancy were added. Currently the automaton works as a single monolithic solver on uniform grids. The research done on this sediment-erosion model is part of another research project [14] and is written in the C++ programming language. The main elements of the simulation are:
– advection of the sediment in the fluid
– diffusion of the sediment in the fluid
– sediment concentration in the fluid
– morphological changes of the river channel (erosion, deposition)
Fig. 3. Sediment concentration and boundary changes due to erosion (snapshots at t = 500 Δt and t = 10000 Δt)
In Fig. 3 we show two different snapshots of such a simulation, where the current sediment concentration is displayed. Some parts of the bedrock are subject to erosion (removal) of sediment, while in other areas deposition (adding of sediment) takes place.
To couple two distinct automata with the first prototype of the COAST coupling mechanism, the existing sediment erosion kernel was split into two mutually interacting solvers: one lattice-Boltzmann automaton to simulate the fluid flow and a second automaton to simulate the sediment advection/diffusion/erosion processes. Both kernels rely on the same uniform grid, so there is no scale separation here [4]. The sediment solver depends on the current fluid velocity at each discrete location, whereas the flow solver depends on the changing sediment boundary. These two kernels were then coupled with the first implementation of the DSCL. The calculation results and the integrity of the CxA implementation could successfully be validated against the original monolithic solver.
Flow solver:

    for (t = 1; t <= stepCount; t++) {
        if (stop()) break; // alternative

        // calculate flow solver results
        scale(...);
        calcInCompressibleCollision(...);
        addForcing(...);
        propagate(...);
        setFs(...);

        // send/put velocity
        fillVelocityBuffer(velocityBuffer);
        sendVelocity(velocityBuffer);

        // receive/get sediment boundary changes
        receiveActive(activeBuffer);
        unpackActiveBuffer(activeBuffer);
        receiveQs(qsBuffer);
        unpackQsBuffer(qsBuffer);
    }

Sediment solver:

    for (t = 1; t <= stepCount; t++) {
        if (stop()) break; // alternative

        // receive/get velocity
        receiveVelocity(velocityBuffer);
        unpackVelocityBuffer(velocityBuffer);

        // calculate sedi solver results
        scaleBeforeTRTCollisionSediment(...);
        calcTRTCollision(...);
        propagateSediment(...);
        setFsSedi(...);
        depositionLB(...);
        sedierosionLB(...);
        toppling(...);

        // send/put sediment boundary changes
        fillActiveBuffer(activeBuffer);
        sendActive(activeBuffer);
        fillQsBuffer(qsBuffer);
        sendQs(qsBuffer);
    }

Fig. 4. Pseudo code of the coupled sediment and flow kernels
4 Conclusion and Outlook
The Complex Automata Simulation Technique describes a method to couple multi-science models in order to simulate multi-scale problems. To facilitate the design and implementation of coupled simulation software, we developed the Distributed Space Time Coupling Library, which implements the ideas developed in the COAST project (i.e. the CxA graph and the scale separation map) using an agent-based approach. The library aims to provide a flexible and powerful way for model developers to integrate new kernels as well as legacy code. The different software kernels of a CxA are strictly separated, which guarantees their reusability for future CxA projects. This way CxA implementations may evolve with upcoming (re)implementations of kernels (vertices).
Within the COAST project we will focus on the complex simulation of coronary artery in-stent restenosis [15] as a demonstrator application to validate the CxA modeling concept and the coupling library. These first simulations will include multi-science models from physics, biology and biochemistry, acting on different spatial and temporal scales. The results of these simulations will be subject to future publication.
References 1. Walker, D.C., Southgate, J., Holcombe, M., Hose, D.R., Wood, S.M., Neil, M.S., Smallwood, R.H.: The epitheliome: Agent-based modelling of the social behaviour of cells. Biosystems 76(1-3), 89–100 (2004) 2. Hoekstra, A.G., Chopard, B., Lawford, P., Hose, R., Krafczyk, M., Bernsdorf, J.: Introducing complex automata for modelling multi-scale complex systems. In: Proceedings of European Complex Systems Conference, European Complex Systems Society, Oxford (2006) 3. Mission of coast – complex automata, http://www.complex-automata.org/ 4. Hoekstra, A.G., Lorenz, E., Falcone, J.L., Chopard, B.: Towards a complex automata framework for multi-scale modeling: Formalism and the scale separation map. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4487, pp. 922–930. Springer, Heidelberg (2007) 5. Bellifemine, F.L., Caire, G., Greenwood, D.: Developing Multi-Agent Systems with JADE. Wiley Series in Agent Technology. Wiley, Chichester (2007) 6. Davidsson, P.: Multi agent based simulation: Beyond social simulation. In: Moss, S., Davidsson, P. (eds.) MABS 2000. LNCS (LNAI), vol. 1979, pp. 97–107. Springer, Heidelberg (2001) 7. MPICH Home Page, http://www-unix.mcs.anl.gov/mpi/mpich1 8. Message Passing Interface Forum: Message passing interface (mpi) forum home page, http://www.mpi-forum.org 9. Scalability and Performance of JADE Message Transport System, AAMAS Workshop, Bologna (2002) 10. Jennings, N.R.: An agent-based approach for building complex software systems. Communications of the ACM 44(4), 35–41 (2001) 11. Fujimoto, R.: Parallel and distributed simulation. In: Winter Simulation Conference, pp. 122–131 (1999) 12. Page, B.: Diskrete Simulation. Eine Einf¨ uhrung mit Modula-2. Springer, Berlin (1991) 13. Geller, S., Krafczyk, M., T¨ olke, J., Turek, S., Hron, J.: Benchmark computations based on lattice-boltzmann, finite element and finite volume methods for laminar flows 35, 888–897 (2006) 14. Stiebler, M., T¨ olke, J., Krafczyk, M.: An Advection-Diffusion Lattice Boltzmann Scheme for Hierarchical Grids (Computers and Mathematics with Applications) (in press) 15. Mudra, H., Regar, E., Klauss, V., Werner, F., Henneke, K.H., Sbarouni, E., Theisen, K.: Serial Follow-up After Optimized Ultrasound-Guided Deployment of Palmaz-Schatz Stents: In-Stent Neointimal Proliferation Without Significant Reference Segment Response. Circulation 95(2), 363–370 (1997)
A Control Algorithm for Multiscale Simulations of Liquid Water

Evangelos M. Kotsalis and Petros Koumoutsakos

Computational Science, ETH Zürich, CH-8092, Switzerland
[email protected]
Abstract. We present multiscale simulations of liquid water using a novel control algorithm to couple non-periodic molecular dynamics (MD) and continuum descriptions in the context of a Schwarz alternating method. In the present multiscale approach the non-periodic MD simulations are enhanced by an effective external boundary force that accounts for the virial component of the pressure and eliminates spurious density oscillations close to the boundary. This force is determined by a simple control algorithm that enables coupling of the atomistic description to a coarse grained or a continuum description of liquid water. The proposed computational method is validated in the case of equilibrium and parallel flow.
1 Introduction
Multiscale modeling is critical for computational studies of new technologies such as those at the interface of nanotechnology and biology. Nanodevices present us with the opportunity to interact with biological systems at the molecular level for applications such as molecular drug delivery and artificial nanosyringes. These nanodevices are in turn often embedded in continuous systems (e.g. when nanofluidic channels interface microfluidic domains) and their simulation dictates an inherently multiscale problem. Multiscale approaches coupling MD simulations of atomistic descriptions with grid/particle discretizations of the continuum domain have mainly been developed for Lennard-Jones fluids. These approaches can be distinguished by the way information is exchanged between the two descriptions, resulting in two broad classes of coupling schemes: the method of direct flux exchange [1,2,3,4] and the Schwarz alternating method [5,6,7,8]. In recent years a number of efforts have extended multiscale simulations to systems involving polyatomic, polar molecules such as water. In these simulations the handling of electrostatic and steric effects becomes critical and requires several extensions over the multiscale modeling of monatomic systems. De Fabritiis et al. [4] applied the flux exchange scheme to simulate liquid water. Praprotnik et al. [9] presented multiscale simulations of liquid water using a spatially adaptive molecular resolution procedure to change from a coarse-grained to an all-atom representation on-the-fly. In this paper we extend the Schwarz alternating method implemented in [7,8] to conduct multiscale simulations of liquid water. The key challenge of these
simulations is the appearance of spurious density fluctuations due to the presence of non-periodic boundary conditions in the MD domain. The present method circumvents this difficulty by introducing a boundary force that is determined via a control algorithm. We note that this approach does not involve the calculation of the pressure tensor that is required in the flux-based methods. This helps reduce the computational cost of the method, as the calculation of the pressure tensor [5] requires a large number of samples or the use of large cell sizes at the expense of spatial resolution. In addition, in the case of the flux-based methods a buffer is necessary that must be large enough that the atoms and molecules inside the bulk liquid do not interact with the rarefied region [8]. In the present method the "buffer" is restricted to the region where we impose the velocity boundary condition from the continuum.
2 Coupling Continuum Liquids with Molecular Dynamics of Water
The coupling of continuum and atomistic descriptions of liquid water implies MD simulations subject to non-periodic boundary conditions (BC). In MD simulations the position r_i = (x_i, y_i, z_i) and velocity v_i = (u_i, v_i, w_i) of the i-th particle evolve according to Newton's equations of motion:

$$\frac{d}{dt}\mathbf{r}_i = \mathbf{v}_i(t), \qquad (1)$$

$$m_i \frac{d}{dt}\mathbf{v}_i = \mathbf{F}_i = -\sum_{j \neq i} \nabla U(r_{ij}), \qquad (2)$$

where m_i is the mass and F_i the force on particle i. The interaction potential U(r_ij) models the physics of the system. Here we consider MD simulations using the rigid SPC/E water model [10], with an O-H bond length of 1 Å and an H-O-H angle of 109.47°, constrained using the SHAKE algorithm [11]. The SPC/E model involves a Lennard-Jones potential between the oxygen atoms,

$$U_{\alpha\beta}(r_{ij}) = 4\epsilon_{\alpha\beta}\left[\left(\frac{\sigma_{\alpha\beta}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{\alpha\beta}}{r_{ij}}\right)^{6}\right], \quad \text{for } r_{ij} < r_c, \qquad (3)$$

where r_c is the truncation radius (U_{αβ}(r_{ij}) = 0 for r_{ij} > r_c), and α and β refer to the atomic species (here oxygen-oxygen: ε_OO = 0.6501 kJ mol−1 and σ_OO = 0.3166 nm). The model also involves a Coulomb potential acting between all atom pairs of different water molecules,

$$U(r_{ij}) = \frac{1}{4\pi\epsilon_0}\,\frac{q_i q_j}{r_{ij}}, \qquad (4)$$

where ε_0 is the permittivity of vacuum and q_i is the partial charge, with q_O = −0.8476 and q_H = 0.4238, respectively. These partial charges describe the fixed, static dipole moment of water [10]. In this work all interaction potentials are
truncated for distances beyond a cutoff radius (r_c) of 1.0 nm, and the equations of motion are integrated using a Verlet algorithm with a time step dt of 2 fs. The elimination of periodic boundary conditions in the MD simulations results in a non-uniform density across the domain and hinders the calculation of the virial pressure. In order to correct the calculation of the mean virial pressure, we impose a boundary force F_m, along with a specular wall that prevents molecules from leaving the atomistic domain and maintains the kinetic component of the system pressure. We detect the collision of a water molecule with the wall by checking whether its center of mass has crossed the boundary. Molecules that hit a moving wall during time step n are identified by computing the collision time t′ = (com^n − x_w)/(u_w − ũ_x^{n+1/2}), where com^n is the center of mass of the water molecule, x_w and u_w are the initial wall position and wall speed, and ũ_x^{n+1/2} is the center-of-mass velocity of the molecule after the regular leap-frog update but before a possible reflection. If t′ is smaller than the time step dt, the molecule is crossing the boundary and the new velocity and position of each atom are calculated as:

$$u_x^{\,n+1/2} = \tilde{u}_x^{\,n+1/2} - 2\,\bigl(\tilde{u}_x^{\,n+1/2} - u_w\bigr), \qquad (5)$$

$$x^{\,n+1} = x^{\,n} + t'\,\tilde{u}_x^{\,n+1/2} + (dt - t')\,u_x^{\,n+1/2}. \qquad (6)$$
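Per coordinate, the collision handling of Eqs. (5) and (6) amounts to only a few lines; the following sketch (our variable names, not the authors' code) illustrates the update for a single molecule.

    // x: atom position, uTilde: centre-of-mass velocity after the leap-frog update,
    // com: molecule centre of mass, xw/uw: wall position and speed, dt: time step.
    static double[] specularWallUpdate(double x, double com, double uTilde,
                                       double xw, double uw, double dt) {
        double tPrime = (com - xw) / (uw - uTilde);              // collision time
        if (tPrime >= 0.0 && tPrime < dt) {                      // molecule crosses the wall in this step
            double u = uTilde - 2.0 * (uTilde - uw);             // Eq. (5): specular reflection off the moving wall
            double xNew = x + tPrime * uTilde + (dt - tPrime) * u; // Eq. (6)
            return new double[] { xNew, u };
        }
        return new double[] { x + dt * uTilde, uTilde };         // no collision: regular update
    }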
The wall force (F_m) can be computed from the pair potential and the pair correlation function g(r) of the working fluid [7]. This technique was shown to alleviate many of the drawbacks of existing methods, but it is not sufficient for all state points of dense liquids. In [8] a simple control algorithm was proposed that can achieve the target density in the system for a wide range of temperatures. This algorithm, originally developed for monatomic liquids, is extended here to the case of water. Since the kinetic energy in the system is conserved, we enforce the correct virial pressure in the non-periodic system by applying an external boundary force that eliminates the disturbances in the density.
Fig. 1. Schematic of the control algorithm for reducing density fluctuations
3 Control Algorithm and Validation
We examine the validity of this method in a water system at equilibrium in the liquid regime (T = 300 K, ρ = 997 kg m−3 ). The size of the computational domain is 3 nm × 3 nm × 3 nm and periodicity is removed in the x-direction. The system is weakly coupled to a Berendsen thermostat [12] with a time constant of 0.1 ps. The heat bath is imposed cellwise using 6 × 1 × 1 cells and after equilibration we heat only the atoms located in the cells close to the x boundary. In order
A Control Algorithm for Multiscale Simulations of Liquid Water
237
to reduce spurious oscillations in the density, we apply the control algorithm to the mean external boundary force applied to the MD system. The control approach is sketched in Fig. 1. Each iteration involves the following steps: we start by applying no external boundary force. Then we measure the
Fig. 2. (a) The resulting external boundary force computed after applying the control algorithm. (b) The corresponding controlled (—) and uncontrolled (no external boundary force, - - -) reduced density values (ρ+ = ρ/ρbulk). The value used for K_P is 0.00332 nm³ mol/(amu kJ), and both the force and the density have been sampled over 2.4 ns.
density in short time intervals, filtering away high-frequency noise. This filtering makes the method more suitable for coupling the atomistic to the continuum description, since it is found to improve the convergence of the method. The density ρ_m is measured with a spatial resolution δx of 0.0166 nm in time intervals of 6 ps and processed twice through a Gaussian filter, resulting in:

$$\bar{\rho}_m(x) = \int \rho_m(y)\,\exp\!\left(-\frac{(x-y)^2}{2\epsilon^2}\right) dy, \qquad (7)$$

$$\bar{\bar{\rho}}_m(x) = \int \bar{\rho}_m(y)\,\exp\!\left(-\frac{(x-y)^2}{2\epsilon^2}\right) dy, \qquad (8)$$

where ε = 2δx. The cutoff used for the discrete evaluation of the convolution is 3δx [8]. We then evaluate the error as:

$$e(r_w) = \rho_t - \bar{\bar{\rho}}_m(r_w), \qquad (9)$$
where r_w is the distance to the boundary, ρ_t the desired constant target density, and $\bar{\bar{\rho}}_m$ the measured filtered value. We compute the gradient of this error, ε(r_w) = ∇e(r_w) = −∇$\bar{\bar{\rho}}_m$(r_w), and amplify it with a factor K_P to obtain the adjustment ΔF to the boundary force:

$$\Delta F_i = K_P\,\varepsilon_i, \qquad (10)$$
for each i-th bin with |x_i − x_w| < 0.9 r_c. The boundary force is finally computed as:

$$F_i^{\,new} = F_i^{\,old} + \Delta F_i, \qquad (11)$$
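A compact sketch of the control update of Eqs. (7)-(11) on a one-dimensional array of bins is given below; the data layout, the names, and the normalization of the discrete Gaussian filter are our own choices, not the authors' implementation.

    // rho: sampled density per bin, force: boundary force per bin (updated in place),
    // dx: bin width (delta x), rc: cutoff radius, kP: controller gain, rhoTarget is constant.
    static void updateBoundaryForce(double[] force, double[] rho,
                                    double dx, double rc, double kP) {
        double eps = 2.0 * dx;                              // filter width
        double[] f1 = gaussianFilter(rho, dx, eps);         // Eq. (7)
        double[] f2 = gaussianFilter(f1, dx, eps);          // Eq. (8)

        for (int i = 1; i < force.length - 1; i++) {
            if (i * dx >= 0.9 * rc) continue;               // only bins with |x_i - x_w| < 0.9 r_c
            // e_i = rhoTarget - f2_i (Eq. 9); since rhoTarget is constant, grad e = -grad f2
            double errGrad = -(f2[i + 1] - f2[i - 1]) / (2.0 * dx);
            force[i] += kP * errGrad;                       // Eqs. (10)-(11)
        }
    }

    static double[] gaussianFilter(double[] f, double dx, double eps) {
        int cut = 3;                                        // discrete convolution cutoff of 3 * dx
        double[] out = new double[f.length];
        for (int i = 0; i < f.length; i++) {
            double sum = 0.0, norm = 0.0;
            for (int j = Math.max(0, i - cut); j <= Math.min(f.length - 1, i + cut); j++) {
                double r = (i - j) * dx;
                double w = Math.exp(-r * r / (2.0 * eps * eps));
                sum += w * f[j];
                norm += w;
            }
            out[i] = sum / norm;                            // normalized weighting is our choice
        }
        return out;
    }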
Fig. 3. Convergence of the root mean square of the error in density, $E = \sqrt{\frac{1}{N}\sum_{i=1}^{N} e_i^2}$, when adjusting the external boundary force (K_P = 0.00332 nm³ mol/(amu kJ)). Each estimation of E corresponds to 30 updates of the online control force.
and applied to the center of mass of each water molecule. Here we note that throughout the iterations ΔF_i is constrained to the trivial value if |x_i − x_w| > 0.9 r_c, to avoid any undesirable influence of noise in the region close to the cutoff, where no oscillation in the density is observed (see Fig. 2). We consider that the method has converged once the root mean square of the error, $E = \sqrt{\frac{1}{N}\sum_{i=1}^{N} e_i^2}$, is less than a prescribed value, here 1%. The online controller keeps acting on the system and E is computed in time intervals of 180 ps. We test this approach for the liquid state. We start by applying no external force, and density oscillations close to the boundary are observed. We set K_P = 0.00332 nm³ mol/(amu kJ), and the results shown in Fig. 2 demonstrate that the control approach eliminates the density oscillations. Here we note that the averaged value (over 2.4 ns) of the integral of the external boundary force, ρ_n ∫₀^{r_c} F_m(r) dr, is equal to the virial pressure of liquid water in an atomistic simulation subject to periodic boundary conditions, where ρ_n is the number density of water. In [8] we showed that the value of K_P determines the stability properties and the convergence rate of the algorithm. In Fig. 3 we show the convergence of the method for K_P = 0.00332 nm³ mol/(amu kJ). In this case the method has converged after approximately 900 ps. As a final diagnostic we measure the angle φ between the normal of the wall and the dipole moment of each water molecule. In Fig. 4 we show the probability distribution of cos(φ) at the wall, at the cutoff distance from the boundary, and at the center of the computational domain (1.5 nm from the wall). The NPBC result in a spurious preference in the orientation of the water molecules at the wall, which vanishes as the distance from the boundary increases. The mean deviation from the periodic reference case is 1.0° at the cutoff distance. At the center of the domain the distribution is uniform, with no deviation from the reference case. In addition, we examine the performance of the control algorithm in the case of parallel flow at the same liquid state. The periodicity is broken in the flow
Fig. 4. The probability distribution of the cosine of the angle between the dipole moment of each water molecule and the normal to the x-boundary at the wall (a), at the cutoff distance of 1 nm (b), and at the center of the domain (c). As the distance from the wall increases, the liquid recovers the correct uniform distribution.
Fig. 5. (a) The resulting external boundary forces in the case of parallel flow: (- - -) inflow, (—) outflow. (b) The resulting reduced density (ρ+ = ρ/ρbulk) profiles in the x-direction: (solid line) controlled case, (dashed line) uncontrolled case (no external boundary force). (c) The resulting velocity profiles in the x-direction in the case of parallel flow: (—) controlled case, (- - -) uncontrolled case (no external boundary force). The value used for K_P is 0.0166 nm³ mol/(amu kJ), and the force, density and velocity have been sampled over 2.4 ns.
direction (x). The system is weakly coupled to a Berendsen thermostat [12] with a time constant of 0.1 ps. The heat bath is imposed cellwise using 6 × 1 × 1 cells. The flow is imposed by adjusting the mean flow velocity of the atoms [13] in the computational cells with center points x = 0.25 nm and x = 2.75 nm to achieve a mean velocity of 10 m s−1. After equilibration we heat only the atoms located in the boundary boxes at the inlet and the outlet. As described in [7], atoms at the non-periodic boundary of the computational domain bounce off hard walls, which move with the local fluid velocity. At the end of each time step these walls are reset to their initial positions to maintain a fixed frame of reference. As a consequence, some particles may remain outside the computational domain; they are reinserted in regions of inflow using the Usher algorithm [14]. This removal and insertion of particles is not symmetric, and we therefore control the forces at the inlet and the outlet of the computational domain separately, with a K_P of 0.0166 nm³ mol/(amu kJ). We update the control force every 60 ps. In Fig. 5 we show the external boundary force in the controlled and uncontrolled cases and the resulting reduced density and velocity profiles. The perturbed density in the uncontrolled case also leads to oscillations in the stream velocity (u) of the molecules. The controller successfully eliminates the deviations from the target value for both quantities at both the inlet and outlet boundaries.
4 Conclusions and Future Work
We have presented a control algorithm to eliminate density fluctuations in the coupling of atomistic models with continuum descriptions of liquid water. A dynamic controller, based on the errors measured in the local fluid density, provides an appropriate boundary forcing, which applies the correct virial pressure to the system. The algorithm is validated for water at rest and it is shown to eliminate density oscillations with amplitude of the order of 10 %. In simulations of a uniform parallel flow the controller prevents the propagation of the perturbations, removing density and velocity oscillations of 11 %. Future work will involve the extension of the method to non-equilibrium configurations and the treatment of long range electrostatics by the reaction field method [15] for multiscale modeling of ionic solvents.
References 1. O’Connell, S.T., Thompson, P.A.: Molecular Dynamics-Continuum Hybrid Computations: A Tool for Studying Complex Fluid Flow. Phys. Rev. E 52, 5792–5795 (1995) 2. Flekkoy, E.G., Wagner, G., Feder, J.: Hybrid Model for Combined Particle and Continuum Dynamics. Europhys. Lett. 52, 271–276 (2000) 3. Flekkoy, E.G., Delgado-Buscalioni, R., Coveney, P.V.: Flux Boundary Conditions in Particle Simulations. Phys. Rev. E. 72, 26703 (2000) 4. De Fabritiis, G., Delgado-Buscalioni, R., Coveney, P.V.: Multiscale Modeling of Liquids With Molecular Specificity. Phys. Rev. Lett. 97, 134501 (2006)
5. Hadjiconstantinou, N.G.: Hybrid Atomistic-Continuum Formulations and the Moving Contact-Line Problem. J. Comput. Phys. 154, 245–265 (1999) 6. Nie, X.B., Chen, S.Y., Robbins, W.N.E., Robbins, M.O.: A Continuum and Molecular Dynamics Hybrid Method for Micro- and Nano-fluid Flow. J. Fluid Mech. 55, 55–64 (2004) 7. Werder, T., Walther, J.H., Koumoutsakos, P.: Hybrid Atomistic-Continuum Method for the Simulation of Dens Fluid Flows. J. Comput. Phys. 205, 373–390 (2005) 8. Kotsalis, E.M., Walther, J.H., Koumoutsakos, P.: Control of Density Fluctuations in Atomistic-Continuum Simulations of Dense Liquids. Phys. Rev. E 76, 16709 (2007) 9. Praprotnik, M., Matysiak, S.L., Delle Site, L., Kremer, K., Clementi, C.: Adaptive Resolution Simulation of Liquid Water. J. Phys. Condens. Matter 19, 292201 (2007) 10. Berendsen, H.J.C., Grigera, J.R., Straatsma, T.P.: The Missing Term in Effective Pair Potentials. J. Phys. Chem. 91, 6269 (1987) 11. Ryckaert, J.P., Cicotti, G., Berendsen, H.J.C.: Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-alkanes. J. Comput. Phys. 23, 327–341 (1977) 12. Berendsen, H.J.C., Postma, J.P.M.: van Gunsteren,: The Missing Term in Effective Pair Potentials. J. Phys. Chem. 91, 6269 (1987) 13. Walther, J.H., Werder, T., Jaffe, R.L., Koumoutsakos, P.: Hydrodynamic Properties of Carbon Nanotubes. Phys. Rev. E 69, 62201 (2004) 14. De Fabritiis, G., Delgado-Buscalioni, R., Coveney, P.V.: Energy Controlled Insertion of Polar Molecules in Dense Fluids. J. Chem. Phys. 121, 12139 (2004) 15. Tironi, I.G., Sperb, R., Smith, P.E., van Gunsteren, W.F.: A Generalized Reaction Field Method for Molecular-Dynamics Simulations. J. Chem. Phys. 102, 5451–5459 (1995)
Multiscale Models of Quantum Dot Based Nanomaterials and Nanodevices for Solar Cells

Alexander I. Fedoseyev(1), Marek Turowski(1), Ashok Raman(1), Qinghui Shao(2), and Alexander A. Balandin(2)

(1) CFD Research Corporation (CFDRC), 215 Wynn Drive, Huntsville, AL 35805, {aif, mt, ar2}@cfdrc.com
(2) Nano-Device Laboratory, Department of Electrical Engineering, University of California – Riverside, Riverside, California 92521, {qshao, alexb}@ee.ucr.edu
Abstract. NASA future exploration missions and space electronic equipment require improvements in solar cell efficiency and radiation hardness. Novel nano-engineered materials and quantum-dot (QD) based photovoltaic devices promise to deliver more efficient, lightweight, radiation hardened solar cells and arrays, which will be of high value for the long term space missions. We describe the multiscale approach to the development of Technology Computer Aided Design (TCAD) simulation software tools for QD-based semiconductor devices, which is based on the drift – diffusion and hydrodynamic models, combined with the quantum-mechanical models for the QD solar cells. Keywords: Nanostructured solar cell, quantum dot, photovoltaic, nanostructures, hydrodynamics, drift-diffusion, multiscale, computer-aided design, intermediate band solar cells.
1 Introduction

The novel modeling and simulation tools, which include multiscale, fluid, and quantum models for the quantum-dot-based nanostructures, help one to better understand and predict the behavior of nano-devices and novel materials in the space environment, and to assess technologies, devices, and materials for new electronic systems [1]. The QD models are being integrated into our photonic-electronic device simulator NanoTCAD [2,3], which can be useful for the optimization of QD superlattices as well as for the development and exploration of new solar cell designs.
A prototype structure for the modeling of the quantum-dot superlattice (QDS)-based photovoltaic (PV) cell is shown in Fig. 1. The basic element of this PV cell is a stack of quantum dot arrays, referred to as the QDS. The QDS can be implemented on Si/Ge or other material systems, including III-V materials such as GaAs. The QDS forms the intrinsic layer in a regular n-i-p (p-i-n) solar cell configuration. Quantum confinement of charge carriers (electrons and holes) in variable-size quantum dots, which form the i-layer, increases the effective band gap of the material. The quantum dot size variation allows one to optimize absorption at different wavelengths and create a multicolor quantum PV cell with estimated efficiency greater than 50% [4].
Fig. 1. Schematic structure of the PV cell based on the quantum dot superlattice (QDS), which is used as a prototype for the development of the PV cell simulation tools. The structure contains a stack of multiple quantum-dot arrays with the variable dot size, which maximizes absorption of the different light wavelengths in a controllable way.
2 Multiscale Approach for Efficient Solution of Quantum and Fluid Level Models

Drift-diffusion (DD) based models have a long and fruitful history in applications to 3D simulations not only of modern electronic devices, but also of optoelectronic ones. In recent years, however, a new class of devices has been emerging which requires tools that include quantum effects (quantum wells, tunnel junctions, Schottky contacts, quantum dot nanostructures, etc.) and also allow for efficient numerical implementation. We have proposed and tested in the 3D device simulator NanoTCAD a number of reduced models for the quantum scale of the problem, which have been successfully verified against experimental and numerical data [3]. For example, for the quantum well within a 3D electronic device, we have proposed a "tunneling mobility", which was calculated from the pure quantum problem for carriers tunneling through the barrier. That approach has shown good accuracy and efficiency compared to the Wigner function method and to the Boltzmann transport equation with quantum corrections [3].
In the current problem we are developing an approach similar to the one successfully developed for the incorporation of kinetic effects into the 3D drift-diffusion (DD) model. The problem for the device region with strong kinetic effects has been solved using the 4D Boltzmann transport equation (BTE), with 3D geometry and 1D energy space, and the macroscopic transport coefficients were calculated from the kinetic probability distribution function (see Fig. 2). The details of the solution are reported elsewhere [5].

Fig. 2. (a) Comparison of I-V characteristics of the MIT NMOSFET transistor calculated with BTE and DD versus experimental data [4]. (b) CFDRC modeling of the Schottky-based Philips MSM photodetector [6], I-V characteristic for reverse and forward bias. A good agreement has been obtained for the forward bias and the correct trend for the reverse bias.

The modeling of the photovoltaic cell with the QDS (see Fig. 3(a)) is conducted with the 3D NanoTCAD device simulator, which uses the quantum-level computed transport parameters for the i-layer, the device region containing the quantum dot superlattice, while for the other device regions the classical DD models are used. Typical I-V curves for a photovoltaic device, a silicon p-i-n solar cell, calculated with NanoTCAD are shown in Fig. 3(b).
Fig. 3. (a) Typical photovoltaic cell with quantum dot superlattice and (b) our results for solar cell simulation with NanoTCAD [2]: I-V curves of a silicon p-i-n solar cell (current in mA/m versus voltage in V) for doping densities from P+ = 2e21, N+ = 4e21 /m3 up to P+ = 2e25, N+ = 4e25 /m3.
3 Models Implemented with 3D Device Simulator NanoTCAD

The multiscale photovoltaic (PV) models discussed above are being integrated within the advanced software tool NanoTCAD, which is a 3D device simulator developed and commercialized by CFD Research Corporation [2]. This integration provides a user-friendly interface and access to the large database of semiconductor material properties available in NanoTCAD. It also makes possible a complete PV-cell simulation, including both quantum and classical models for the appropriate PV-cell elements, both DC and transient regimes, etc. The models are currently being extended to incorporate simulation of the electron-phonon transport in QDS made of semiconductors with
both cubic and hexagonal crystal lattices, e.g., InAs/GaAs, Ge/Si, CdSe, ZnO. The drift-diffusion model implemented in NanoTCAD is described below.

3.1 Drift-Diffusion Model

Drift-diffusion models are formulated based on continuity equations for electrons and holes and the Poisson equation for the electrostatic potential. They are able to provide good agreement with experimental data for transistors with channel lengths down to 15 nm. Conservation of charge for electrons is represented by the continuity equation

q ∂n/∂t + ∇·J_n = qR ,   (1)

and similarly for holes as

q ∂p/∂t + ∇·J_p = −qR ,   (2)

where the electron current is

J_n = −qμ_n (U_T ∇n − n∇Ψ) ,   (3)

and the hole current is

J_p = −qμ_p (U_T ∇p − p∇Ψ) .   (4)

Here n and p are the electron and hole densities [1/cm^3], Ψ is the electrostatic potential [V], q is the carrier charge (electron charge e), U_T = k_B T / q, and the diffusion coefficients are D_n = U_T μ_n and D_p = U_T μ_p [cm^2/s]. The electron and hole mobilities μ_n, μ_p are calculated parameters (the models depend on the material and device, or they are calculated from quantum- or kinetic-level problems). The electrostatic potential which appears in the current equations is governed by the Poisson equation

∇ε∇Ψ = q (n − p − C) ,   (5)
where Ψ is the electrostatic potential [V], ε is the dielectric constant, and C is the doping, C = N_D^+ − N_A.

3.2 Boundary Conditions

Boundary conditions for n, p, and Ψ are shown below for the example of an Ohmic contact. At the Ohmic contact we assume thermal equilibrium and vanishing space charge, which results in
n·p − n_i^2 = 0 ,   (6)

n − p − C = 0 .   (7)
Solving a quadratic equation for n, p we get Dirichlet conditions for n and p on the boundary (Ohmic contact)
n_0 = (1/2) ( √(C^2 + 4n_i^2) + C ) ,   (8)

p_0 = (1/2) ( √(C^2 + 4n_i^2) − C ) .   (9)

The boundary potential at an Ohmic contact is the sum of the externally applied potential (voltage) V_C(t) and the so-called built-in potential, which is produced by the doping:

Ψ = Ψ_bi + V_C(t) .   (10)

The built-in potential is

Ψ_bi = U_T ln[ (C(x) + √(C^2 + 4n_i^2)) / (2n_i) ] ,   (11)

where the intrinsic concentration n_i is

n_i = √(n·p) .   (12)
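To make the boundary treatment concrete, the short sketch below (an illustration only, not code from NanoTCAD; the silicon-like values for n_i and U_T and the chosen doping are assumptions) evaluates the Dirichlet values n_0, p_0 and the built-in potential Ψ_bi of Eqs. (8)-(11) for a given net doping C. The total contact potential then follows from Eq. (10) by adding the applied voltage V_C(t).

```python
import numpy as np

def ohmic_contact_bc(C, n_i=1.0e10, U_T=0.0259):
    """Dirichlet boundary values at an Ohmic contact, Eqs. (8)-(11).

    C   : net doping N_D+ - N_A at the contact [1/cm^3]
    n_i : intrinsic carrier concentration [1/cm^3] (silicon-like value assumed)
    U_T : thermal voltage k_B*T/q [V] at room temperature
    """
    root = np.sqrt(C**2 + 4.0 * n_i**2)
    n0 = 0.5 * (root + C)                               # Eq. (8)
    p0 = 0.5 * (root - C)                               # Eq. (9)
    psi_bi = U_T * np.log((C + root) / (2.0 * n_i))     # Eq. (11)
    return n0, p0, psi_bi

# Example: moderately doped n-type contact region (assumed value)
n0, p0, psi_bi = ohmic_contact_bc(C=1.0e15)
print(f"n0 = {n0:.3e} cm^-3, p0 = {p0:.3e} cm^-3, psi_bi = {psi_bi:.3f} V")
```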
3.3 Solution of Governing Equations for the Drift-Diffusion Based Model

Governing equations (1) to (5) are discretized by the finite volume method and solved simultaneously using the Newton technique to ensure good convergence. In NanoTCAD, we use the high-performance iterative linear solver CNSPACK, developed by Fedoseyev [7]. CNSPACK uses high-order preconditioning by incomplete decomposition to ensure the accuracy, stability and convergence of the simulations. The linear algebraic system is solved in CNSPACK using a CGS-type iterative method with preconditioning by the incomplete decomposition of the matrix. Comparing the CGS and GMRES methods [8], [9] in different tests, it was found that both methods converge well if a good preconditioner is used. The CGS method needs less memory, storing only eight work vectors. To reduce the memory requirements, a compact storage scheme for matrices is used in CNSPACK; it stores only the nonzero matrix entries. The incomplete decomposition (ID) used for preconditioning is constructed as a product of triangular and diagonal matrices, P = LDU. To avoid diagonal pivot degeneration, the Kershaw
diagonal modification is used [10]. If the value of a diagonal element becomes small during the construction of the preconditioning matrix, i.e. |a_ii| < α_i = 2^(−t) σ_i μ_i, the diagonal a_ii is replaced by α_i. Here σ_i and μ_i are the maximum magnitudes of the current row and column elements, and 2^(−t) is the machine precision (t bits in the mantissa; see the proof of eligibility in [10]). For the first-order ID, the matrix P has the same non-zero entry pattern as the original matrix. For a second-order or higher ID, matrix P has one or more additional entries near the locations of the non-zero matrix entries, where the original matrix entries are zeros.
Fig. 4 shows a typical dependence of total memory and CPU time for NanoTCAD simulations of 3D devices using unstructured meshes. Compared to the linear solvers used in other commercial device simulators, the NanoTCAD solver uses dramatically less memory (by more than one order of magnitude), and the CPU time is also smaller by one order of magnitude compared to [11] for a similar problem size / number of unknowns. The NanoTCAD simulator can solve a transient 3D multi-branched ion strike problem with a 100,000-node unstructured mesh within 512 MB of memory. It can solve the problem with a 3D unstructured mesh of up to 500,000 nodes within 2 GB of memory. A typical steady-state solution for a 3D semiconductor device / solar cell takes 5 to 10 Newton iterations to reach a ten-order-of-magnitude reduction of the initial residual. This corresponds to a very short computational time of NanoTCAD (typically a couple of minutes or less).
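As an illustration of the diagonal safeguard described above (a sketch only, not the CNSPACK implementation; the dense-matrix loop, the use of off-diagonal entries for σ_i and μ_i, and the variable names are assumptions made for clarity), the following fragment applies the Kershaw modification to the pivots of a matrix:

```python
import numpy as np

def kershaw_safeguard(A, t=52):
    """Replace dangerously small pivots a_ii by alpha_i = 2**(-t) * sigma_i * mu_i.

    A : square matrix whose diagonal supplies the pivots
    t : number of mantissa bits (52 for IEEE double precision)
    """
    A = A.copy()
    eps = 2.0 ** (-t)                       # machine precision 2^(-t)
    n = A.shape[0]
    for i in range(n):
        off_row = np.delete(np.abs(A[i, :]), i)
        off_col = np.delete(np.abs(A[:, i]), i)
        sigma_i = off_row.max() if off_row.size else 0.0   # max |row element|
        mu_i = off_col.max() if off_col.size else 0.0      # max |column element|
        alpha_i = eps * sigma_i * mu_i
        if abs(A[i, i]) < alpha_i:          # pivot too small -> replace it
            A[i, i] = alpha_i
    return A

A = np.array([[1e-20, 2.0], [3.0, 4.0]])
print(kershaw_safeguard(A))
```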
Fig. 4. NanoTCAD simulator performance for modeling 3D devices: memory, solid curve with diamonds (in MB), and CPU time, solid curve with circles (in seconds per Newton iteration), versus number of mesh nodes. One can see that the dependence is almost linear for both memory and CPU time, and close to the theoretical estimate for the CPU time, T = O(N^(5/4)) (dashed curve).
Large transient problems, for example the simulation of radiation effects produced by a high-energy ion strike through the electronic device (not relevant to the problems considered in this paper; see our companion paper [15]), may need more time (a few days). This efficient linear solver made it possible to use adaptive unstructured 3D mesh generation for a transient 3D multi-branched ion strike problem.

3.4 Solution for Carrier Transport in 3D Quantum Dot Superlattice

We use the Lazarenkova-Balandin model for the computation of the electron energy spectra of a 3D regimented quantum dot superlattice, which has been proposed earlier [12, 13]. The schematic structure of the quantum dot superlattice is shown in Fig. 5. The electron spectrum is analyzed by using the one-electron Schroedinger equation:
[ −(ħ^2/2) ∇_r (1/m*) ∇_r + V(r) ] ϕ(r) = E ϕ(r) .   (13)
Here ϕ(r) is the electron wave function, E is the electron energy, and the confining potential profile V(r) corresponds to an infinite sequence of quantum dots of sizes L_x, L_y, L_z separated by barriers of thicknesses H_x, H_y, H_z. The potential V(r) is set to zero in the barrier region, while inside the quantum dot it is equal to the band offset in the conduction (or valence) band of the considered material system taken with a negative sign. The solution of the Schroedinger equation admits a semi-analytical approximation using the simplified potential model [13]. Fig. 5(b) shows the results of the quantum-level approximation only, for the efficiency of the PV cell depending on the QDS dimensions, for a particular material.

3.5 Coupling of Quantum and Fluid Level Models of the Electronic Device

The electron and hole mobilities μ_n, μ_p in Eqs. (3) and (4) are calculated parameters. The mobility depends on the material and device structure for the device region dominated by fluid transport, or it is calculated from quantum- or kinetic-level problems for the device regions where the fluid model is not valid. Then, this calculated "quantum" or "kinetic" mobility is used in Eqs. (3), (4) of the fluid model. Finally, the whole device is simulated with the fluid-level model, using the transport coefficients from the quantum level only where this is needed (in the device regions with dominant quantum transport, like the quantum dot layers in Fig. 3). For example, for the quantum dot layers in Fig. 3, we calculate the "quantum" mobility in the device region with quantum-dominated transport as follows:
μ_q = Σ_i μ_i (n_si / n_s) ,   (14)

where n_si = (m / (π ħ^2)) ∫_{E_si}^{∞} f(E) dE is the concentration of electrons in subband i with bottom energy E_si, f(E) is the Fermi-Dirac distribution function, and the subband mobility is μ_i = (e/m) ⟨1/τ_E⟩^(−1), where the brackets denote the average value.
The electron momentum relaxation time (lifetime) τ_i in the state E of subband i is calculated using the subband energy levels from the solution of Eq. (13) and the scattering rates of an electron, see for example [14], p. 205. A similar approach has been successfully developed for the incorporation of kinetic effects into the 3D carrier fluid transport model. In the device region with strong kinetic effects, we solved the 4D Boltzmann transport equation (BTE) (3D geometry plus 1D energy space), and the macroscopic transport coefficients have been calculated from the kinetic probability distribution function. The results compared well with the ones published in the literature, obtained using accurate, sophisticated models such as the Wigner equations, and with experimental data (see details in [5]).
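A minimal numerical sketch of Eq. (14) is given below (an illustration only, not NanoTCAD code; the subband energies, subband mobilities, chemical potential and effective mass are made-up inputs). It sums the subband contributions weighted by their Fermi-Dirac populations:

```python
import numpy as np

K_B_T = 0.0259               # thermal energy at 300 K [eV]
M_EFF = 9.11e-31 * 0.067     # assumed effective mass [kg] (GaAs-like value)
HBAR = 1.0546e-34            # reduced Planck constant [J s]
Q_E = 1.602e-19              # elementary charge, used for eV -> J conversion

def fermi_dirac(E, mu=0.05):
    """Fermi-Dirac occupation for energy E [eV], chemical potential mu [eV]."""
    return 1.0 / (1.0 + np.exp((E - mu) / K_B_T))

def subband_density(E_si, E_max=1.0, n_pts=2000):
    """n_si = m/(pi*hbar^2) * integral_{E_si}^{inf} f(E) dE (integral truncated at E_max)."""
    E = np.linspace(E_si, E_max, n_pts)              # energy grid [eV]
    integral = np.trapz(fermi_dirac(E), E) * Q_E     # convert eV -> J
    return M_EFF / (np.pi * HBAR**2) * integral      # sheet density [1/m^2]

def quantum_mobility(E_levels, mobilities):
    """mu_q = sum_i mu_i * n_si / n_s, Eq. (14), for the given subband data."""
    n_si = np.array([subband_density(E) for E in E_levels])
    n_s = n_si.sum()
    return np.sum(np.array(mobilities) * n_si / n_s)

# Hypothetical subband bottoms [eV] and subband mobilities [m^2/Vs]
E_levels = [0.03, 0.08, 0.15]
mobilities = [0.50, 0.35, 0.20]
print(f"quantum mobility = {quantum_mobility(E_levels, mobilities):.3f} m^2/Vs")
```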
Fig. 5. (a) Schematic structure of the quantum dot superlattice in the PV cell; L_i, H_i, d_i are the dot size, inter-dot separation and QD lattice period in the i-direction, i = x, y, z. (b) Photovoltaic power conversion efficiency as a function of the quantum dot size in an InAs0.9N0.1/GaAs0.98Sb0.02 quantum dot superlattice. The results are shown for several interdot separations.
4 Conclusions

We have discussed how the new multiscale models for the charge carrier transport in quantum dot superlattices are being implemented in the advanced photonic-electronic 3D device simulator NanoTCAD. Based on our successful multiscale coupling of the kinetic and hydrodynamic transport levels in sub-micron semiconductor devices, we propose an approach of multiscale integration for quantum dot based solar cells. This includes the quantum-physics-based model for the quantum dot superlattice and the classical hydrodynamics / drift-diffusion model for carrier transport in 3D semiconductor devices.
References

1. Balandin, A.A., Lazarenkova, O.L.: Mechanism for thermoelectric figure-of-merit enhancement in regimented quantum dot superlattices. Appl. Phys. Lett. 82, 415 (2003)
2. CFDRC, NanoTCAD website (2006), http://www.cfdrc.com/bizareas/microelec/micro_nano/
3. Fedoseyev, A.I., Turowski, M., Wartak, M.S.: Kinetic and Quantum Models for Nanoelectronic and Optoelectronic Device Simulation. J. Nanoelectr. Optoelectr. 2, 234–256 (2007)
4. Shao, Q., Balandin, A.A., Fedoseyev, A.I., Turowski, M.: Intermediate-band solar cells based on quantum dot supracrystals. Appl. Phys. Lett. 91(16) (2007)
5. Fedoseyev, A.I., Kolobov, V., Arslanbekov, R., Przekwas, A.: Kinetic simulation tools for nano-scale semiconductor devices. Microelectronic Eng. 69, 577–586 (2003)
6. Seto, M., Leduc, J.-V., Lammers, A.M.F.: Al-n-Si Double Schottky Photodiodes for Optical Storage Systems. In: Proc. ESSDERC 1997 (1997)
7. Fedoseyev, A.I., Bessonov, O.A.: Iterative solution for large linear systems for unstructured meshes with preconditioning by high order incomplete decomposition. Comput. Fluid Dynam. J. 10, 296 (2001)
8. Wesseling, P., Sonneveld, P.: Numerical experiments with a multiple grid and a preconditioned Lanczos type method. Lecture Notes in Mathematics, vol. 771, pp. 543–562. Springer, Berlin (1980)
9. Saad, Y.: GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Statist. Comput. 7, 856 (1986)
10. Kershaw, D.S.: The incomplete Cholesky-conjugate gradient method for the iterative solution of systems of linear equations. J. Comput. Phys. 2, 43 (1978)
11. Schenk, O., Röllin, S., Hagemann, M.: Recent advances in sparse linear solver technology for semiconductor device simulation matrices. In: Proc. Int. Conf. Simulation of Semi. Processes and Devices, Boston, MA, USA, vol. 103 (2003)
12. Lazarenkova, O.L., Balandin, A.A.: Electron and phonon energy spectra in a three-dimensional regimented quantum dot superlattice. Phys. Rev. B 66, 245319 (2002)
13. Balandin, A.A., Lazarenkova, O.L.: Miniband formation in a quantum dot crystal. J. Appl. Phys. 89, 5505 (2001)
14. Landau, L.D., Lifshitz, E.M., Pitaevskii, L.P.: Statistical Physics, Science, p. 387 (1980)
15. Fedoseyev, A.I., Turowski, M., Raman, A., Alles, M.L., Weller, R.A.: Multiscale Numerical Models for Simulation of Radiation Events in Semiconductor Devices. In: Bubak, M., et al. (eds.) ICCS 2008, Part II. LNCS, vol. 5102, pp. 281–290. Springer, Heidelberg (2008)
Multi-scale Modelling of the Two-Dimensional Flow Dynamics in a Stationary Supersonic Hot Gas Expansion

Giannandrea Abbate1, Barend J. Thijsse2, and Chris R. Kleijn1

1 Dept. of Multi-Scale Physics & J.M. Burgers Centre for Fluid Mechanics, Delft University of Technology, Prins Bernhardlaan 6, Delft, The Netherlands
[email protected], [email protected], http://www.msp.tudelft.nl
2 Dept. of Materials Science and Engineering, Delft University of Technology, Mekelweg 2, Delft, The Netherlands
[email protected], http://www.3me.tudelft.nl
Abstract. A stationary hot gas jet supersonically expanding into a low pressure environment is studied through multi-scale numerical simulations. A hybrid continuum-molecular approach is used to model the flow. Due to the low pressure and high thermodynamic gradients, the accuracy of continuum mechanics results is doubtful, while, because of its excessive computational expense, a full molecular method is not feasible. The results of the proposed hybrid continuum-molecular approach have been successfully validated against experimental data. An important question for the full understanding of the processes governing the flow is addressed: the demonstration of an invasion of the supersonic part of the flow by background particles. Through the tracking of particles and collisions in the supersonic region it could be definitively proven that background particles are present in this region. We present a complete two-dimensional picture of how the invading background particles distribute and collide with local particles in the supersonic region. Keywords: Direct Simulation Monte Carlo, Coupled Method, Hybrid Method, Rarefied Gas Flow, Supersonic Expansion.
1 Introduction
The necessity to model a hot gas jet supersonically expanding into a low pressure environment is present in several gas fluidic applications of current importance; gas thruster nozzles and plume flows [1] and processes of thin film deposition from expanding plasma or gas jets [2] are some examples. The gas in the jet is generally at relatively high pressure, and then it rapidly expands into a low pressure environment. For this reason, the gas first supersonically expands and then quickly compresses through a stationary shock wave
(Mach disk). In addition, the expansion zone is surrounded by a barrel-shaped shock (barrel shock) (Fig. 1). Although several studies have been devoted to supersonic expansion of gas jets in a low pressure environment [3,4,5,6], full understanding of the processes governing the flow has not been reached yet. In particular, it is still not completely clear whether, due to rarefaction effects, the barrel shock still protects the supersonic part of the flow from the invasion of background molecules, i.e. molecules that are present outside the expansion-shock region [3]. Invading the supersonic part of the jet, these background molecules could, therefore, influence its properties. Already Fenn and Anderson in 1966 [7] and Campargue in 1970 [8] predicted this phenomenon, but a full understanding of it has not yet been reached. In the current paper the important issue of the presence of background particles in the supersonic region will be addressed. This issue will be studied through the tracking of particles and collisions in the supersonic region. It has been shown [5] that, because of rarefaction effects, the Navier-Stokes equations are not adequate for the description of this problem, and therefore the continuum CFD (Computational Fluid Dynamics) approach fails in predicting the flow field in the expansion-shock region. A kinetic simulation model should be applied to correctly study the flow. For this reason, in the past, DSMC (Direct Simulation Monte Carlo) has been used [5]. However, because DSMC computational expenses scale with Kn^-4, it was impossible to fulfill the DSMC requirements with respect to the size of grid cells and timesteps, especially near the inlet, where the Knudsen number is quite low [5]. In order to overcome this problem, one needs to construct a model that on the one hand accounts for the molecular nature of the gas flow where needed, and on the other hand uses a continuum model where allowed. In the past years several hybrid continuum/molecular models have been proposed [9,10,11,12,13,14]. The present hybrid method has been presented in [12,13] and validated in [14]. The method applies the continuum approach in the wide continuum region in order to save computational time and to get numerical simulations of the flow, pressure and temperature of the expanding jet at large scales. DSMC is only applied in the expansion-shock region, where it is necessary in order to correctly model the rarefied nature of the flow and to show how the invading background particles distribute and collide with local particles in the supersonic region. Whereas in [14] we focused on the validation of 1-dimensional temperature and velocity profiles along the axis against experimental data from [3], in the present paper we will present 2-dimensional velocity fields which will be compared to experimental data from [6].
2 Studied Configuration and Measurement Technique
In order to have experimental support for our conclusions, all our numerical results have been validated by comparing them to experimental measurements by Engeln et al. [3], Vankan et al. [4] and Gabriel et al. [6].
The measurements described in [3,4,6] were performed on an expanding thermal plasma jet. Nevertheless, because of the low ionization degree, they can be used to validate our present results on a neutral gas flow, neglecting the presence of electrons and ions [3] and the effects of ionization and recombination on the flow field [3,15]. The experimental set-up in which the expanding thermal plasma jet is created has been extensively described in [16] and is schematically depicted in Fig. 1. Two techniques, Thomson-Rayleigh scattering and laser-induced fluorescence spectroscopy (LIF), have been used to study the flow. The latter provided detailed, 2-dimensional information on the velocity field. A description of the two techniques is given in [3,4].
3 Numerical Simulation Method
The present method has been described in detail in [12,13] and validated in [14]; here only a short description will be given. In order to properly characterize the various regimes in the gas flow, the Knudsen number Kn is defined as the ratio between the mean free path and a relevant macroscopic length scale. When the Knudsen number is low (Kn < 0.01), the gas may be treated as a continuum and the gas flow modelled using CFD (Computational Fluid Dynamics). When the Knudsen number is high (Kn > 10), the gas behavior is molecular and may be modelled using Molecular Dynamics techniques. In the intermediate regime, the DSMC (Direct Simulation Monte Carlo) approach is the most commonly used simulation technique. However, its computational expenses scale with Kn^-4 and it becomes very time demanding as Kn becomes lower than ∼ 0.05. In order to overcome this dilemma and solve the flow throughout the expanding gas jet, we use a hybrid CFD/DSMC model, which takes into account the molecular nature of the gas flow where needed, and uses a continuum model where allowed. The continuum breakdown parameter Kn_max [18] is employed in the present study as a criterion for selecting the proper solver. If the calculated value of the continuum breakdown parameter in a region is larger than a limiting value Kn_split, then that region cannot be accurately modelled using the N-S equations, and DSMC has to be used. For Kn_split a value of 0.05 was used. The method has been found to be rather insensitive to the precise CFD/DSMC interface location w.r.t. Kn_split [12]. The CFD code used is a 2-D, unsteady code based on a finite volume formulation in compressible form. It uses an explicit, second-order, flux-splitting, MUSCL scheme for the Navier-Stokes equations. The 2-D DSMC code developed is based on the algorithm described in [17]. A "particle reservoirs" approach was used to implement the inlet (outlet) boundary conditions. Molecules were generated in those reservoirs with a Chapman-Enskog velocity distribution.
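A minimal sketch of this solver-selection step is shown below (an illustration, not the actual code of the hybrid solver; the gradient-length-local form of the breakdown parameter, the 1-D toy profiles and the array layout are assumptions). Each cell whose continuum breakdown parameter exceeds Kn_split = 0.05 is flagged for DSMC, while the rest stays with the Navier-Stokes solver:

```python
import numpy as np

KN_SPLIT = 0.05   # limiting value used in the paper

def breakdown_parameter(mfp, rho, T, dx):
    """Gradient-length-local Knudsen numbers Kn_Q = mfp*|grad Q|/Q on a 1-D grid,
    combined into Kn_max = max(Kn_rho, Kn_T) per cell (simplified form of [18])."""
    kn_rho = mfp * np.abs(np.gradient(rho, dx)) / rho
    kn_T = mfp * np.abs(np.gradient(T, dx)) / T
    return np.maximum(kn_rho, kn_T)

def select_solver(kn_max, kn_split=KN_SPLIT):
    """Return an array of solver tags: 'DSMC' where the continuum breaks down, 'NS' elsewhere."""
    return np.where(kn_max > kn_split, "DSMC", "NS")

# Toy fields on a 1-D cut through the expansion: density drops roughly as 1/z^2
z = np.linspace(0.005, 0.10, 50)                # axial position [m]
rho = 1.0e-3 / z**2                             # made-up density profile
T = 8000.0 * (z[0] / z)**1.3                    # made-up temperature profile
mfp = 1.0e-5 * (rho[0] / rho)                   # mean free path grows as density drops
dz = z[1] - z[0]

tags = select_solver(breakdown_parameter(mfp, rho, T, dz))
print("DSMC cells:", int(np.count_nonzero(tags == "DSMC")), "of", z.size)
```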
3.1 Schwarz Coupling
The multi-scale hybrid coupling approach used is based on the overlapped Schwarz method with Dirichlet-Dirichlet boundary conditions [10] and it consists of two stages. The first stage is a prediction stage, where the unsteady N-S equations are integrated in time until steady state on the entire domain. From this steady state solution, the continuum breakdown parameter Kn_max is computed and its values are used to split the domain into subdomains, where the flow field will be evaluated respectively using DSMC and the N-S solver. Between the DSMC and CFD regions an overlap region is considered, where the flow is computed with both the DSMC and the CFD solver. In the second stage, DSMC and CFD are run in their respective subdomains with their own time steps until steady state. The boundary conditions to the DSMC region come from the solution in the CFD region and are imposed by a "particle reservoir" approach, whereas the boundary conditions to the CFD region come from the solution in the DSMC region, averaged over the CFD cells. Once a steady state solution has been obtained in both the DSMC and N-S regions, the continuum breakdown parameter Kn_max is re-evaluated and a new boundary between the two regions is computed. This second stage is iterated until the relative difference (in pressure, velocity and temperature) between the DSMC and CFD solutions in the overlapping region is less than a prescribed value. The advantage of using a Schwarz method with Dirichlet-Dirichlet boundary conditions, instead of the more common flux-based coupling technique [11], is that the latter requires a much higher number of samples than the Schwarz method [11]. In fact, the DSMC statistical scatter involved in determining the fluxes is much higher than that associated with the macroscopic state variables.
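The essence of the coupling is an overlapping Schwarz iteration with Dirichlet-Dirichlet data exchange. The runnable toy below (a deliberately simplified 1-D diffusion problem, not the CFD/DSMC solvers of the paper; the overlap location and tolerance are arbitrary choices) shows the same pattern: two overlapping subdomains are solved alternately, each taking its interface boundary value from the other, until the solutions agree in the overlap.

```python
import numpy as np

def solve_dirichlet(n, left, right):
    """Steady 1-D diffusion u'' = 0 on n nodes with Dirichlet ends (toy subdomain solver)."""
    return np.linspace(left, right, n)

def schwarz_1d(n=101, overlap=(0.4, 0.6), tol=1e-10, max_iter=100):
    x = np.linspace(0.0, 1.0, n)
    i_lo = np.searchsorted(x, overlap[0])   # left edge of the overlap region
    i_hi = np.searchsorted(x, overlap[1])   # right edge of the overlap region
    u = np.zeros(n)                         # initial guess; global BCs u(0)=0, u(1)=1
    u[-1] = 1.0
    for it in range(max_iter):
        # Subdomain 1: [0, x_hi], right Dirichlet value taken from subdomain 2
        u1 = solve_dirichlet(i_hi + 1, left=0.0, right=u[i_hi])
        # Subdomain 2: [x_lo, 1], left Dirichlet value taken from subdomain 1
        u2 = solve_dirichlet(n - i_lo, left=u1[i_lo], right=1.0)
        u_new = u.copy()
        u_new[:i_hi + 1] = u1
        u_new[i_lo:] = u2                   # subdomain 2 overwrites the overlap
        diff = np.max(np.abs(u_new[i_lo:i_hi + 1] - u1[i_lo:i_hi + 1]))
        u = u_new
        if diff < tol:                      # the two solutions agree in the overlap
            return u, it + 1
    return u, max_iter

u, iters = schwarz_1d()
err = np.max(np.abs(u - np.linspace(0.0, 1.0, u.size)))
print(f"converged in {iters} iterations, max error vs exact solution: {err:.2e}")
```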
3.2 Modelled Geometry
The computational domain (Fig. 1) is a d = 32 cm diameter cylinder of length L = 50 cm. From a circular hole of diameter d_in = 8 mm on its top, a flow of 56 sccs of argon is injected at a temperature T_in = 8000 K. The top and lateral walls are at a temperature T_w = 400 K, while the bottom wall is at a temperature T_sub = 600 K. The pumping exit, which in reality is a circular hole, has been represented in our 2-D model as an l_out = 2 cm wide ring on the bottom of the cylinder at a distance of R_out = 12 cm from the axis. A pressure P_out = 20 Pa at the exit has been considered, since for this outlet pressure a large amount of experimental data is available [3,6]. Inside the chamber we suppose the flow to be 2-D axi-symmetric. The continuum grid is composed of 100 cells in the radial direction and 200 cells in the axial direction. Grid independence has been tested by doubling the continuum grid in each of the two directions, leading to variations in the solution below 3%. The code automatically refines the mesh in the DSMC region to fulfill its requirements. The number of simulated particles in the DSMC region is N ≈ 8·10^5.
Fig. 1. Scheme of the low pressure chamber
4 Results

4.1 General Flow Field Characteristics
In Fig. 2 the number density (a) and pressure (b) profiles along the z-axis in the expansion-shock region, as evaluated by the hybrid approach at 20 Pa chamber pressure, are shown. It is well known [19] that, in the expansion, the density decreases quadratically with the distance z to the inlet (1/z^2), whereas the pressure has a 1/z^(2γ) dependence, where γ is the specific heat ratio (for argon γ = 1.67). In Fig. 2(a), the number density profiles measured with the Thomson-Rayleigh technique by Vankan et al. at 40 and 10 Pa chamber pressures are also presented. Although no density measurements were available at 20 Pa chamber pressure, the present hybrid results lie exactly between the experimental data at 40 and 10 Pa chamber pressures measured by Vankan et al. [4]. Fig. 3(a) shows the division between the DSMC, continuum and overlapping regions in our hybrid method. In Fig. 3(b)-(e) a comparison between experimental data from [6], results from the present hybrid method, results from full DSMC simulations performed by Selezneva et al. [5], and results from present continuum simulations for the 2-dimensional velocity field in the expansion-shock region is presented. The velocity contours in Fig. 3(b) are the result of an interpolation of measured velocities at various positions in the expansion-shock region [6]. It is clear that the hybrid method predicts the experimental data better than the
Fig. 2. Number density (a) and pressure (b) distributions along the z-axis in the expansion-shock region. Hybrid approach at 20 Pa (solid line), theoretical trend in the expansion (dashed line), experimental number density distribution from [4] at 10 Pa (bullets) and at 40 Pa (triangles).
other approaches. The reason why the hybrid approach predicts the experimental data even better than the full DSMC simulations is that, as discussed in Section 1 and as already highlighted by Selezneva et al. [5], in the full DSMC simulations it was not possible to respect the DSMC requirements in the near-inlet region and a too coarse mesh had to be used. If we first compare the experimental data from [6] (Fig. 3(b)) to the results of the full CFD approach (Fig. 3(e)), the velocity predicted by the continuum approach in the expansion-shock region is significantly (200-500 m/s) lower than the experimental one. Because of rarefaction, in fact, upstream of the shock the expansion is stronger, reaching higher velocity values. If we compare the experimental data from [6] (Fig. 3(b)) to the full DSMC simulations by Selezneva et al. [5] (Fig. 3(d)), we can notice that the DSMC predicts correct velocity values in the expansion, but the maximum velocity along the z-axis is moved ∼ 1 cm upstream with respect to the experimental data. Finally, the hybrid simulations (Fig. 3(c)) result in a very good agreement with the experiments (Fig. 3(b)); the hybrid approach, in fact, was able to predict the correct velocity values and the right position for the velocity peak in the expansion.
4.2 Invasion of the Expansion-Shock Region
In this section we want to demonstrate the invasion of background particles into the expansion-shock region. In continuum conditions, because of the presence of the shock, these particles would not be able to enter the supersonic region. However, we will show that, because of the rarefaction effects, the shock becomes transparent and does not protect the supersonic region. Therefore some particles may actually move into it from the subsonic part of the flow. To demonstrate this hypothesis, it is necessary to know the origin of the particles present in the supersonic region. For this reason, for the particles in the DSMC region, two different labels were used: one for the particles which, after entering the reactor chamber, have always been in the supersonic region (the so-called "inlet particles"), and a different one for the background particles.
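A sketch of this bookkeeping is shown below (an illustration of the labelling idea only, not the authors' DSMC code; the Particle class, the region boundaries and the region test are hypothetical). Each simulated particle carries a tag that is set once, when it enters the domain, and the tag is later used to split any sampled quantity, such as a local concentration or a velocity distribution, into inlet and background contributions.

```python
from dataclasses import dataclass
from collections import Counter

INLET = "inlet"            # particles that entered through the inlet and stayed supersonic
BACKGROUND = "background"  # particles originating from the low-pressure chamber

@dataclass
class Particle:
    z: float                  # axial position [m]
    r: float                  # radial position [m]
    label: str = BACKGROUND   # origin tag, assigned once at creation

def in_supersonic_region(p, z_mach_disk=0.03, r_barrel=0.02):
    """Crude membership test for the expansion region (made-up boundaries)."""
    return p.z < z_mach_disk and p.r < r_barrel

def background_fraction(particles):
    """Fractional concentration of background particles inside the supersonic region."""
    tags = Counter(p.label for p in particles if in_supersonic_region(p))
    total = sum(tags.values())
    return tags[BACKGROUND] / total if total else 0.0

# Tiny example population
particles = [Particle(0.010, 0.005, INLET),
             Particle(0.020, 0.010, INLET),
             Particle(0.025, 0.015, BACKGROUND),
             Particle(0.080, 0.050, BACKGROUND)]
print(f"background fraction in supersonic region: {background_fraction(particles):.2f}")
```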
Fig. 3. N-S/DSMC domains splitting (a) and velocity field (m/s) zoomed in the expansion-shock region at 20 Pa chamber pressure. Experimental data from [6] (b), present hybrid simulations (c), DSMC data from [5] (d), present continuum simulations (e).
In Fig. 4 we compare the axial velocity distribution of our simulation at r = 0 and z = 59 mm (a) and the radial velocity distribution of our simulation at r = 22 mm and z = 50 mm (b) to the ones measured by Engeln et al. It is clear that there is a very good agreement between our current hybrid simulations and the experiments from [3]. In Fig. 4 we also show the contribution of the background particles and the inlet particles to the axial velocity distribution, at the position r = 0 and z = 59 mm (a), and to the radial velocity distribution at the position r = 22 mm and z = 50 mm (b). The presence of background particles in the supersonic region is evident. The peaks of the two contributions to the radial velocity distribution (Fig. 4(b)) are located on opposite sides of the zero velocity position, meaning that particles coming from the inlet are moving away from the axis because of the expansion, whereas background particles are penetrating into the supersonic region and moving toward the axis. Once the background particles have penetrated the supersonic region, they start colliding and interacting with the particles that are already there, decelerating them and being accelerated by them. In order to further prove the hypothesis of the presence of background particles in the supersonic region and explain how they collide and interact with the local particles, a study was performed, at the molecular scale, by tracking particles and collisions in the supersonic region. The results of this study are presented in Fig. 5. The background particles concentration in Fig. 5(a) further proves their presence in the supersonic region.
Fig. 4. Relative contribution of background particles to the axial velocity distribution at r = 0 and z = 59 mm (a) and to the radial velocity distribution at r = 22 mm and z = 50 mm (b). Hybrid simulation total velocity distribution (solid line), experimental total velocity distribution from [3] (bullets), inlet particles contribution (dash-dotted line), background particles contribution (dashed line).
Fig. 5. Fractional concentration of background particles in the supersonic region (a) and average number of times that inlet particles have collided with background particles before reaching the given location (b).
In the expansion region, the velocity increases, reaching a maximum value on the axis at a distance z = 3 cm from the inlet (Fig. 3), whereas density and pressure decrease, reaching a minimum (Fig. 2) at the same location. The invading background particles are driven into the region of minimum pressure by favorable pressure gradients. For the same reason, once they are there, it is difficult for them to cross the Mach disk because of the adverse pressure gradient. Therefore, the invading background particles concentrate in the region of minimum pressure, reaching values of up to 25% of the total number of particles. Finally, Fig. 5(b) presents the average number of collisions with background particles that an inlet particle has undergone before reaching its position. As expected, the number of collisions increases along the z-axis and it reaches the maximum value of ∼ 1.8 collisions. This is of course an averaged value, meaning that there are inlet particles which did not collide as well as inlet particles that have collided much more than 1.8 times with background particles. This clearly
demonstrates that the inlet particles do interact with the background particles that invaded the supersonic region. Engeln et al. in [3] and Gabriel et al. in [6] also found experimental indications for the presence of background particles in the expansion-shock region. Therefore, our study gives numerical support to the hypothesis of Engeln et al. and Gabriel et al. that background particles can penetrate the supersonic region and, by interacting with the inlet particles, can influence the flow field.
5 Conclusions
The gas dynamics of a hot gas jet supersonically expanding into a low pressure environment is studied by means of a multi-scale hybrid coupled continuum-DSMC method. This method gives the possibility to save computational time, using CFD in most of the domain and DSMC only where it is necessary in order to correctly model the flow. The answer to an important question about supersonic expansion in a low pressure environment has been found: the invasion of the supersonic region by background particles. By tracking particles and collisions in the supersonic region, we have demonstrated the presence of background particles in this region, thus proving the invasion of the supersonic region by background particles and describing how they can influence the flow field by colliding and interacting with the local particles. Acknowledgments. We thank Profs. D.C. Schram, M.C.M. van de Sanden, R. Engeln and O. Gabriel for useful discussions and for making available to us their experimental data, and the DCSE (Delft Centre for Computational Science and Engineering) for financial support.
References

1. Cai, C., Boyd, I.D.: 3D Simulation of Plume Flows from a Cluster of Plasma Thrusters. In: 36th AIAA Plasmadynamics and Laser Conference, Toronto, Ontario, Canada, June 6-9, 2005, AIAA-2005-4662 (2005)
2. Gielen, J.W.A.M., Kessels, W.M.M., van de Sanden, M.C.M., Schram, D.C.: Effect of Substrate Conditions on the Plasma Beam Deposition of Amorphous Hydrogenated Carbon. J. Appl. Phys. 82, 2643 (1997)
3. Engeln, R., Mazouffre, S., Vankan, P., Schram, D.C., Sadeghi, N.: Flow Dynamics and Invasion by Background Gas of a Supersonically Expanding Thermal Plasma. Plasma Sources Sci. Technol. 10, 595 (2001)
4. Vankan, P., Mazouffre, S., Engeln, R., Schram, D.C.: Inflow and Shock Formation in Supersonic, Rarefied Plasma Expansions. Phys. Plasmas 12, 102303 (2005)
5. Selezneva, S.E., Boulos, M.I., van de Sanden, M.C.M., Engeln, R., Schram, D.C.: Stationary Supersonic Plasma Expansion: Continuum Fluid Mechanics Versus Direct Simulation Monte Carlo Method. J. Phys. D: Appl. Phys. 35, 1362 (2002)
6. Gabriel, O., Colsters, P., Engeln, R., Schram, D.C.: Invasion of Molecules and Supersonic Plasma Expansion. In: Proc. 25th Int. Symp. Rarefied Gas Dynamics, St. Petersburg, Russia (2006)
7. Fenn, J.B., Anderson, J.B.: Rarefied Gas Dynamics. In: de Leeuw, J.H. (ed.), 2nd edn. Academic Press, New York (1966)
8. Campargue, R.: Aerodynamic Separation Effect on Gas and Isotope Mixtures Induced by Invasion of the Free Jet Shock Wave Structure. J. Chem. Phys. 52, 1795 (1970)
9. Le Tallec, P., Mallinger, F.: Coupling Boltzmann and Navier-Stokes Equations by Half Fluxes. Journal of Computational Physics 136, 51 (1997)
10. Wu, J.S., Lian, Y.Y., Cheng, G., Koomullil, R.P., Tseng, K.C.: Development and Verification of a Coupled DSMC-NS Scheme Using Unstructured Mesh. Journal of Computational Physics 219, 579 (2006)
11. Schwartzentruber, T.E., Boyd, I.D.: A Hybrid Particle-Continuum Method Applied to Shock Waves. Journal of Computational Physics 215(2), 402 (2006)
12. Abbate, G., Thijsse, B.J., Kleijn, C.R.: An Adaptive Hybrid Navier-Stokes/DSMC Method for Transient and Steady-State Rarefied Gas Flows Simulations. Journal of Computational Physics (submitted)
13. Abbate, G., Thijsse, B.J., Kleijn, C.R.: Coupled Navier-Stokes/DSMC Method for Transient and Steady-State Gas Flows. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4487, p. 842. Springer, Heidelberg (2007)
14. Abbate, G., Thijsse, B.J., Kleijn, C.R.: Validation of a Hybrid Navier-Stokes/DSMC Method for Multiscale Transient and Steady-State Gas Flows. Special SMMS 2007 issue of International Journal for Multiscale Computational Engineering 6(1), 1 (2008)
15. Selezneva, S.E., Rajabian, M., Gravelle, D., Boulos, M.I.: Study of the Structure and Deviation From Equilibrium in Direct Current Supersonic Plasma Jets. J. Phys. D: Appl. Phys. 34(18), 2862 (2001)
16. van de Sanden, M.C.M., de Regt, J.M., Jansen, G.M., van der Mullen, J.A.M., Schram, D.C., van der Sijde, B.: A Combined Thomson-Rayleigh Scattering Diagnostic Using an Intensified Photodiode Array. Rev. Sci. Instrum. 63, 3369 (1992)
17. Bird, G.A.: Molecular Gas Dynamics and Direct Simulation Monte Carlo. Clarendon Press, Oxford Science (1998)
18. Wang, W.L., Boyd, I.D.: Continuum Breakdown in Hypersonic Viscous Flows. In: 40th AIAA Aerospace Sciences Meeting and Exhibit, January 14-17, 2002, Reno, NV (2002)
19. Ashkenas, H., Sherman, F.S.: Experimental Methods in Rarefied Gas Dynamics. In: de Leeuw, J.H. (ed.) Rarefied Gas Dynamics, vol. II, p. 84. Academic Press, New York (1965)
Multiscale Three-Phase Flow Simulation Dedicated to Model Based Control Dariusz Choiński, Mieczyslaw Metzger, and Witold Nocoń Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, ul. Akademicka 16, 44-100 Gliwice, Poland {dariusz.choinski, witold.nocon, mieczyslaw.metzger}@polsl.pl
Abstract. Multiphysics and multiscale three-phase flow simulation is proposed for model based control. Three-phase flow is considered by means of particle movement in a pipe with two-phase gas and liquid vacuum pumping. The presented model and simulation algorithm were implemented using a software system working in real-time mode. The software system can simulate a part of the pipe network with a configured pipe profile, pump station and valve parameters, and also the inlet mixture composition. In addition, the system includes an algorithm for pressure control. Keywords: Multiphysics and multiscale modelling and simulation, three-phase flow, particle movement, model based process control.
1 Introduction

Unsteady two-phase flow is a challenging problem for control engineering. This type of flow is typical for vacuum pumping technology such as vacuum sewerage systems (see e.g. [1],[2],[3],[4]) and also occurs in petroleum applications (see e.g. [5],[6]). The complexity of the problem increases when three-phase flow is considered by means of particle movement in a pipe. Such a system needs unsteady flow in order to avoid plugging by insoluble parts and particles. Sawtooth lifts, i.e. a sawtooth pipe profile, are necessary for uphill liquid transport; when no valves are opened, no liquid transport takes place and the medium in the pipe lies in the low spots. Only part of the pipe cross section is occupied by liquid, so that momentum transfer from air to liquid takes place largely through the action of shear stresses. Although a complex system of pipes with complicated profiles and three-phase flow control is difficult to investigate, the control system can improve the capability of the vacuum pumping system by applying model based control algorithms, once simulation has been used for model validation. The complexity of the model should be adjusted to the requirements of the real-time computing involved in the control system. Even non-standard efforts such as special methods for real-time simulation [7] only slightly improve the computing possibilities for the problem under consideration. Fulfilment of these requirements can be helped by modelling of the multiphysics and multiscale system (see e.g. [8], [9], [10]). The multiphysics-multiscale approach couples
calculations at different scales for two-phase flow and for particle movement as the third phase. A specially rearranged three-phase model with a simulation implementation for control purposes is presented in this paper. Mathematical descriptions such as two-phase gas-liquid modelling, modelling of particle movement in the liquid pipeline, and modelling of two-phase flow through pipe inclinations, already known when investigated separately, have been implemented together for the presented investigations. The presented model and simulation algorithm were implemented using a software system working in real-time mode. The software system can simulate a part of the pipe network with a configured pipe profile, pump station and valve parameters, and also the inlet mixture composition. The system also includes an algorithm for pressure control. The multiscale approach reduces the number of parameters sent between pipe sections. This gives the ability to connect particular programs running on different computers using the TCP/IP protocol for parallel simulation of a wide network of pipes and valves.
2 Problem Under Consideration and Two-Phase Flow Mathematical Model

The paper deals with a mechanized system of pipe transport that uses differences in gas pressure to move the liquid. The system requires a normally closed vacuum valve and a central vacuum pump station. The pressure difference between atmosphere and vacuum becomes the driving force that propels the liquid to the vacuum station.
For two-phase flow (gas and liquid) the momentum equations can be written separately for each phase, and such a description is sufficient for the present properties of the whole flow. In such a model the phases are treated as if they were separated and as if they were flowing in unspecified parts of the cross section. Both phases have different velocities and fluid viscosities. Gas compression is the main reason why the air velocity differs from the water velocity [11][12][13], and it is considered in the presented simulation. The presented model was developed under assumptions based on vacuum pipe network design guidelines. Such an approach facilitates model calibration and validation.
The pressure drop caused by separated phase flow can be correlated using the Lockhart-Martinelli method [14,15]. The pressure drop multipliers Φ_l^2 and Φ_g^2 are defined as follows:

X^2 = Φ_g^2 / Φ_l^2 = [ (ΔP/L)_Mix / (ΔP/L)_g ] / [ (ΔP/L)_Mix / (ΔP/L)_l ] = (ΔP/L)_l / (ΔP/L)_g ,   (1)

where:
(ΔP/L)_Mix – pressure drop gradient along the pipe section, which can be measured and controlled; the following gradients are calculated and depend on the liquid and gas phase correlation,
(ΔP/L)_l – pressure gradient for the flow of liquid along the pipe section,
(ΔP/L)_g – pressure gradient for the flow of gas along the pipe section.
The pressure drop gradient along the whole pipe is evaluated based on the measured absolute pressure in the vacuum station, the absolute pressure in the pipe connected to a valve or to the previous part of the pipe network, and also the boundary conditions determined by the pipe profile inclination angle. The pressure gradients for the flow of liquid and gas along the pipe section are calculated from the Lockhart-Martinelli parameter X^2. The parameter X^2 may be evaluated in terms of the air mass fraction [14]:
x = G_g / G_Mix ,   (2)
G_g [kg/s m^2] – the superficial mass flux of gas, calculated from the volume flow of the vacuum pump in the pump station multiplied by the density of gas corrected for pressure, temperature and humidity using the Beattie-Bridgeman real gas state equation [14,15]. The simulation procedures should take advantage of the fact that gas is compressible and its specific volume is a function of pressure, temperature and humidity. For example, as the absolute pressure varies between 15 kPa and 65 kPa, the specific volume of air changes between 5.61 m^3/kg and 1.294 m^3/kg (ratio 4.3:1). For wet air (humidity 80%) the ratio is 4.5:1. The liquid is considered incompressible.
G_Mix [kg/s m^2] – the superficial mass flux of gas mass and liquid mass measured in the pump station.
G_l = G_Mix − G_g [kg/s m^2] – the superficial mass flux of the liquid, measured in the pump station as the volume of liquid multiplied by the liquid density corrected to the ambient temperature.
The mass of gas and liquid is used for model validation. These parameters are investigated during real system design. The Lockhart-Martinelli parameter X^2 is defined as:
X^2 = ((1 − x)/x)^1.8 (ρ_g/ρ_l) (μ_l/μ_g)^0.2 ,   (3)
where:
ρ_g – gas density [kg/m^3], ρ_l – liquid density [kg/m^3], μ_g – dynamic gas viscosity [Pa s], μ_l – dynamic liquid viscosity [Pa s].
The separate side-by-side flow of gas and liquid is considered. When the vacuum valve is closed, the liquid rests at the bottom of the pipe, according to the sawtooth profile of the pipe. The pipe is divided into sections with similar liquid level and pressure loss in the steady state. The initial conditions of the liquid level can be adjusted during simulation. An additional liquid level simulates solid wastes, whose velocity is much slower than the liquid linear velocity.
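To illustrate how Eqs. (1)-(3) can be used in practice, the short sketch below (an illustrative calculation only, not part of the described software; the fluid properties are assumed values for air and water at roughly 20 °C) computes the gas mass fraction, the Lockhart-Martinelli parameter and the resulting ratio of the pressure-drop multipliers from measured superficial mass fluxes:

```python
# Assumed fluid properties (air / water around 20 degC)
RHO_G, RHO_L = 1.2, 998.0      # densities [kg/m^3]
MU_G, MU_L = 1.8e-5, 1.0e-3    # dynamic viscosities [Pa s]

def lockhart_martinelli(G_g, G_mix):
    """Return (x, X2, phi_g2_over_phi_l2) for superficial mass fluxes [kg/(s m^2)].

    x  : gas mass fraction, Eq. (2)
    X2 : Lockhart-Martinelli parameter, Eq. (3)
    The ratio of the pressure-drop multipliers equals X2 by Eq. (1).
    """
    x = G_g / G_mix                                                       # Eq. (2)
    X2 = ((1.0 - x) / x) ** 1.8 * (RHO_G / RHO_L) * (MU_L / MU_G) ** 0.2  # Eq. (3)
    return x, X2, X2

# Example: pump station measures G_mix = 12 kg/(s m^2), of which G_g = 2 kg/(s m^2) is gas
x, X2, ratio = lockhart_martinelli(G_g=2.0, G_mix=12.0)
print(f"x = {x:.3f}, X^2 = {X2:.4f}, phi_g^2 / phi_l^2 = {ratio:.4f}")
```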
The liquid and air momenta, M_l and M_g respectively, can be calculated as follows:

M_l = G_l S ,   (4)

M_g = G_g S ,   (5)
where S – cross section of the pipe [m^2], S_g – cross section occupied by gas [m^2], S_l – cross section occupied by liquid [m^2]. For further consideration, the void fraction α and the mass fraction of gas ω are used; they are defined as follows:
α=
ω=
Sg S
(6)
,
Mg G Mix S
.
(7)
The above equations are valid, assuming the following condition:
dω/dx = 0 .   (8)
The pipe is subdivided into n sections of length L. The liquid level in a section is the same as for the boundary condition. For the separated flow, assuming equilibrium, the force balance equation for the gas phase is as follows:
−(dP/dL)_g = [ 2 K_g ( d ω G_Mix / μ_g )^(−m) ω^2 G_Mix^2 ] / ( d ρ_g ) ,   (9)
where d – internal tube diameter [m]. For laminar flow K_g = 16 and m = 1, for turbulent flow K_g = 0.046 and m = 0.2. The Reynolds number necessary for choosing the flow type is calculated as follows:

Re_Mix = G_Mix d / ( ω μ_g + (1 − ω) μ_l ) .   (10)
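A small sketch of how Eqs. (9) and (10) fit together is given below (an illustration only, not code from the described simulator; the numerical inputs are made-up and the laminar/turbulent threshold is an assumption, since the text only quotes the two coefficient sets):

```python
def gas_pressure_gradient(G_mix, omega, d, rho_g, mu_g, mu_l):
    """Gas-phase friction pressure gradient -(dP/dL)_g from Eqs. (9)-(10) [Pa/m]."""
    re_mix = G_mix * d / (omega * mu_g + (1.0 - omega) * mu_l)   # Eq. (10)
    if re_mix < 2300.0:            # assumed laminar/turbulent threshold
        K_g, m = 16.0, 1.0
    else:
        K_g, m = 0.046, 0.2
    arg = d * omega * G_mix / mu_g
    dp_dl = 2.0 * K_g * arg ** (-m) * (omega * G_mix) ** 2 / (d * rho_g)  # Eq. (9)
    return dp_dl, re_mix

# Example: d = 0.05 m pipe, omega = 0.3, G_mix = 12 kg/(s m^2)
dp_dl, re = gas_pressure_gradient(G_mix=12.0, omega=0.3, d=0.05,
                                  rho_g=1.2, mu_g=1.8e-5, mu_l=1.0e-3)
print(f"Re_Mix = {re:.0f}, -(dP/dL)_g = {dp_dl:.2f} Pa/m")
```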
Pressure drop calculation for the liquid phase is based on the Hazen-Williams formula [14,15]. This formula is used as a design guide for vacuum sewerage, and coefficients for several tube types are described. Generally, an empirical relationship for the friction head loss h_l [m] in a PVC pipe segment typically used for vacuum sewerage may be expressed in the form:

h_l = (1.2128 / d^4.87) (G_l / ρ_l)^1.85 .   (11)
The pipe is subdivided into sections with varying cross section occupied by the liquid (see Fig. 1).
Fig. 1. Multiscale diagram of computing; l – liquid, g – gas, Mix – gas+liquid, p – particle, S – cross section of the pipe, Θ - pipe inclination angle, h – friction head loss, G – superficial mass flux, V – velocity, P – pressure, L – pipe segment length.
The initial conditions for the simulation start from the set-up of the liquid volume in each particular section of the pipe in the steady state. The program automatically calculates the slope of the pipe section according to the volume in the previous, current and next part. This set-up assumes the pressure drop profile for the gas phase and the friction head loss. Further initial conditions are: the maximum absolute pressure of the pump station, the buffer capacity of the vacuum pump, the minimum start pressure for valve opening and the time period during which the valve is closed. The volume inserted into the pipe while the valve is opened and the particle content are established as well. The valve is located at the first pipe segment, while the pump station is located at the last segment. The output from the pipe can be connected not only to the pump station, but also to the next pipe segment or to another simulated system using the TCP/IP protocol. Correspondingly, the first segment can be connected to the last segment of another pipe. For clear presentation of the flow, the pipe is presented as straight, while the effect of the sawtooth profile is the varying liquid levels (see Fig. 2). In the particular steps of the simulation algorithm for the two-phase flow scale, the following values are calculated:
• Static absolute pressure of the gas phase in relation to the pressure drop profile.
• Differential pressure for the pipe segment.
• Density and viscosity of the gas and liquid phase.
• Superficial mass flux of gas and liquid.
• New volume of liquid in the pipe section.
• Check of the continuity equation through the liquid phase mass balance in the whole pipe.
• Pressure correction for proper mass balances.
• Medium velocity for the mixture, gas and liquid phase with respect to the pipe cross section.
• Calling the coarse scale for friction head loss correction (a minimal sketch of this per-segment update loop is given after the list).
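The sketch below outlines the per-segment update listed above (a deliberately simplified, runnable toy under assumed names; the Segment fields, the imposed pressure-drop gradient and the liquid balance are hypothetical stand-ins for the real-time software's calculations):

```python
from dataclasses import dataclass

@dataclass
class Segment:
    length: float         # segment length L [m]
    area: float           # pipe cross section S [m^2]
    liquid_volume: float  # liquid volume in the segment [m^3]
    pressure: float       # absolute gas pressure [Pa]

def update_segments(segments, G_mix, omega, dt, dp_dl):
    """One two-phase-scale time step over the pipe segments (toy version).

    Walks the pipe from the valve (first segment) to the pump station (last
    segment): updates the gas pressure from an imposed pressure-drop gradient
    and moves liquid between neighbouring segments according to the liquid flux.
    """
    rho_l = 998.0
    G_l = (1.0 - omega) * G_mix                 # liquid superficial mass flux
    q_l = G_l * segments[0].area / rho_l        # liquid volume flow [m^3/s]
    for i, seg in enumerate(segments):
        # static absolute pressure follows the pressure drop profile
        seg.pressure = segments[0].pressure - dp_dl * sum(s.length for s in segments[:i])
        # liquid volume balance: inflow from upstream, outflow downstream
        inflow = q_l * dt if i > 0 else 0.0
        outflow = q_l * dt if i < len(segments) - 1 else 0.0
        seg.liquid_volume = max(seg.liquid_volume + inflow - outflow, 0.0)
    return segments

pipe = [Segment(length=10.0, area=2.0e-3, liquid_volume=5.0e-3, pressure=65e3)
        for _ in range(5)]
update_segments(pipe, G_mix=12.0, omega=0.3, dt=0.1, dp_dl=50.0)
print([round(s.pressure) for s in pipe])
```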
Fig. 2. Model simplification for the coarse scale: the pipe cross-section in the steady state is transformed into the model cross-section (gas, liquid, particles), with only vertical and horizontal forces considered.
3 Multiscale Particle Motion Simulation

For particle motion simulation a 2-D coarse scale is implemented. Stratified, annular and slug flow are considered. The type of two-phase flow is determined by the gas and liquid superficial mass fluxes, the cross section of the pipe occupied by liquid and the inclination angle of the pipe profile. The coarse grid is used for the simulation of bubble and particle motion, the pressure drop for the liquid phase and the friction head loss. The simulated bubble movement and the local liquid velocity enable calculation of the particle behaviour: motion and sedimentation. The grids are considered in particular pipe segments under the assumption that density and viscosity are constant, enabling simple momentum calculation. For example, in a pipe with an inner diameter of 0.05 [m] the boundary conditions are as follows [2]: stratified flow for G_g < 2 and G_l < 100; annular flow for G_g > 10 and G_l < 1000; slug flow for 2 < G_g < 10 and 100 < G_l < 1000. The resolution of the grid can be different for each particular pipe segment and is selected according to the estimated gas bubble diameter based on empirical investigation. The grid is considered for the particular pipe segment as far as bubble behaviour is concerned, and for the whole pipe as far as particle motion and sedimentation are concerned. The matrix construction for the grid representing the liquid and gas phase in the pipe, used for the investigation of bubble motion in a pipe segment, starts from the calculation of the liquid level inclination angle with respect to the gas velocity vector and of the onset of slug flow based on the empirical equation [15]:
G_g ≥ G_l + 0.487 [ (ρ_l / ρ_g) g h_g ]^(1/2) ,   (12)
where h_g is the height of the gas pocket during slug flow. For an accurate friction head loss correction in two-phase slug flow, the prediction of the liquid holdup is necessary. The dimensionless correction factor R_S equals [16]:
⎡ ⎛ ρ V d ⎞⎤ RS = exp ⎢− ⎜⎜ 0.45Θ + 2.48 • 10 −6 l Mix ⎟⎟⎥ μ l ⎠⎥⎦ ⎢⎣ ⎝
(13)
where: Θ - the pipe inclination angle in radians (0..1.57) In practice, mixture velocity VMix value is usually low hence practically, the correction factor mainly depends on pipe inclination angle. The most important factor is slug flow condition that is described by gas bubble to liquid motion. The matrix for pipe segment is divided into a moving window with resolution 3x3. Motion of this window is parallel to the liquid velocity vector. Matrix in the window describes simple relation between liquid and gas elements representative for investigated stratified, annular and slug flow. In this step, elements of the matrix which represent particles are moving parallel to liquid elements. In case of the slug flow, scrolling direction is upwards, otherwise it is downwards. The proper gas, liquid and particle volume, summarized momentum and maximum height of the gas pockets for slug flow are checked after scrolling the whole segment. The matrix constructed by contact of matrixes representing pipe segments is used for investigation of particle motion. In such investigations, a moving window is also used but in opposite direction and dimension of the window along the pipe is 6, across the pipe it is 3. For every matrix element that represents particle, a velocity vector based on the near 17 elements is calculated and this vector determines a target matrix element. After that, the matrix element that represents the particle is being interchanged with target element. Also, liquid volume in pipe segment is calculated and corrected. The last step is calculation of friction head loss factor which is proportional to the value derived using the above equation and the height of the gas pocket after particle motion correction. During stratified flow of particles the gravity force vector is considered. For gas phase in this flow case, the most significant force vector depends on uplift pressure. Simulation program is prepared as a stand-alone application dedicated only to the presented model as a real-time simulator of the distributed object with controller and communication interface for real controller connection. Calculation complexity, realtime simulation and 3D graphs for on-line presentation of the distributed parameters object require a program that is well optimized for speed and size.
4 Selected Results and Concluding Remarks The simulation investigations show, that the best way to force movement of undesirable particles in the inclined pipes is to destabilise the flow (see Fig. 3).
268
D. Choiński, M. Metzger, and W. Nocoń
Fig. 3. Visualization of the pipe segment (with reduced resolution for print); start of the unsteady process (top), movement of the particles (bottom)
35
slug flow
p ip e le n g th [m ]
30
25
20
annular flow
15
10
particle trace 5 2
4
6
8 tim e [s]
10
12
14
Fig. 4. Time changes of normalized density for three-phase flow. Vacuum being controlled at the pipe output. The valve is opened at t=4 s.
For clear presentation of flow, the pipe is presented as straight, while the effect of the sawtooth profile are the varying liquid levels. At the beginning of simulation experiment, a number of particles lie at the bottom of sawtooths (black points), whereas the liquid at the end of the pipe becomes unsteady (white bubbles in the liquid). The transient response presented at the bottom window shows the situation in which some of the particles jump over the sawtooths and move in the desired direction. Additional information is presented in 3D diagrams, in which the x-t profiles are presented (see Figs. 4 and 5). The x variable in Figs. 4 and 5 represents the normalized density at the pipe cross section, hence the ratio of the averaged density for three phases and liquid density. For higher densities, the color on the plot is darker. For steady state, the density gradient corresponds to the pipe leveling. “averaged color” corresponds to annular flow, hence to the mixing of different phases. A clearly visible color gradient displaced with respect to the pipe leveling accompanies the slug flow. Additionally, different type of visualization of experimental studies are plots of liquid phase flow along the pipe (see for example Fig. 6).
Multiscale Three-Phase Flow Simulation Dedicated to Model Based Control
269
stable annular flow
35
p ip e le n g th [m ]
30
25
20
15
10
5 5
10
15 tim e [s ]
20
25
30
Fig. 5. Time changes of normalized density for three-phase flow. Vacuum being controlled along the pipe. The valve is opened at t=4s and t=18s. A stable annular flow is present, that decreases the hydraulic resistance and enables flow of solid particles.
liquid velocity [m/s]
2 1.5 1 0.5 0 40 30 24
20
20
pipe length [m]
16 12
10
8
time [s]
4 0
Fig. 6. Linear velocity of liquid-phase flow in case of vacuum being controlled along the pipe
The process operator should make the appropriate decision based on the simulated control system with monitoring and visualisation. A real-time simulator of the process improves the optimal operating control. The proposed software system is dedicated as a simulation support for model based operating control and monitoring. In this case, the operator can check control decisions using the simulator and find the most appropriate decision. In the mode of the reduced coarse grid resolution, the proposed system can also be helpful for direct digital control based on PCembedded controller.
270
D. Choiński, M. Metzger, and W. Nocoń
Acknowledgements. This work was supported by the Polish Ministry of Science and Higher Education using funds for 2006-2008, under grant no. N514 006 31/1739.
References 1. Manual: Alternative Wastewater Collection Systems, United States Environmental Protection Agency, Office of Water, EPA/625/1-91/024 (1991) 2. Mao, F., Desir, F.K., Ebadian, M.A.: Pressure drop measurement and correlation for threephase flow of simulated nuclear waste in a horizontal pipe. Int. J. Multiphase Flow 23(2), 397–402 (1997) 3. Petruk, A., Cassar, A., Dettmar, J.: Advanced real time control of a combined sewer system. Wat. Sci. Technol. 37(1), 319–326 (1998) 4. Choinski, D.: Real-time simulation of two-phase flow in vacuum sewerage with pressure control. In: Proceedings of the 11th IEEE MMAR Conference, Międzyzdroje, pp. 813–818 (2005) 5. Storkaas, E., Skogestad, S., Alstad, V.: Stabilization of Desired Flow Regimes in Pipelines. In: Proceedings of AIChE Annual Meeting in Reno USA, November 9 (2001) 6. Drengstig, T., Magndal, S.: Slug control of production pipeline. SIKT-rapport Nr: SIKTPR-8-2, Hogskolen i Stavanger (2001) 7. Metzger, M.: A comparative evaluation of DRE integration algorithms for real-time simulation of biologically activated sludge process. Simul. Pract. Theory 7, 629–643 (2000) 8. Chopard, B., Droz, M.: Cellular Automata Modeling of Physical Systems. Cambridge University Press, Cambridge (1998) 9. Chopard, B., Dupuis, A.: Lattice Boltzmann models: an efficient and simple approach to complex flow problems. Comp. Phys. Communications 147, 509–515 (2002) 10. Artoli, A.M., Hoekstra, A.G., Sloot, P.M.A.: Optimizing lattice Boltzmann simulations for unsteady flows. Computers & Fluids 35, 227–240 (2006) 11. Jansen, F.E., Shoham, O., Taitel, Y.: The elimination of severe slugging-experiments and modeling. Int. J. Multiphase Flow 22(6), 1055–1072 (1996) 12. Taitel, Y., Barnea, D., Brill, J.P.: Stratified three phase flow in pipes. Int. J. Multiphase Flow 21(1), 53–60 (1995) 13. Taitel, Y., Barnea, D.: Simplified transient of two phase flow using quasi-equilibrium momentum balances. Int. J. Multiphase Flow 23(3), 493–501 (1997) 14. Holland, F.A., Bragg, R.: Fluid Flow for Chemical Engineers, Arnold, London (1995) 15. Hager, W.H.: Abwasser-hydraulik Theorie und Praxis. Springer, Heidelberg (1994) 16. Gomez, L.E., Shoham, O., Taitel, Y.: Prediction of slug liquid holdup: horizontal to upward vertical flow. Int. J. Multiphase Flow 26, 517–521 (2000)
Simulation of Sound Emitted from Collision of Droplet with Shallow Water by the Lattice Boltzmann Method Shinsuke Tajiri1, Michihisa Tsutahara1, and Hisao Tanaka2 1
Graduate school of Engineering, Kobe University Rokko, Nada-ku, Kobe 657-8501 Japan
[email protected],
[email protected] 2 Universal Shipbuilding Corporation 1-3 Kumozu-kokan-cho, Tsu, Mie 514-0398, Japan
[email protected]
Abstract. The sound emitted from splash of water droplet colliding with shallow water is simulated by the finite difference lattice Boltzmann method. Twoparticle immiscible fluid model is used, and the under water sound is considered by introducing the elasticity for the liquid phase. After the collision, sounds propagating into the gas and liquid phases are successively detected. The directivity of the sound is shown to depend on the depth of the water. Keywords: Finite difference lattice Boltzmann method, Aero-acoustics, Twophase flow, Two-fluid interface.
1 Introduction The sounds generated at interfaces between liquid and gas are commonly heard in ordinary life such as sound from raindrops and so on. But the mechanism is still not clear because the scales of motion of the interface and sound are very different and the experimental investigations are still hard. It is also still very difficult to calculate sound waves and deformation of interfaces simultaneously. As to the fluid dynamic sound, direct simulations have become popular by using high-resolution schemes. On the other hand, many calculation methods for interfaces have been developed. MAC (Marker And Cell) [1], VOF (Volume Of Fluid) [2] and LS (Level-Set) [3] methods are widely used. However, these methods sometimes suffer the lack of conservation of volume. CIP (Cubic- Interpolated Pseudo-particle [4]) has been considered to reduce the diffusivity at the interface. Immiscible models based on the LBM (Lattice Boltzmann Method), which are excellent in conservation of volume or mass, have been developed [5,6] for simulation of multi-phase flows. In this paper, we present a two-particle model for two-phase flows, gas and liquid, considering large density difference, surface tension effect, and compressibility of the liquid phase. M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 271 – 280, 2008. © Springer-Verlag Berlin Heidelberg 2008
272
S. Tajiri, M. Tsutahara, and H. Tanaka
2 Basic Equations 2.1 Discrete BGK Equation The basic equation for the finite difference lattice Boltzmann method is the following discrete BGK equation [7] (1) where f i is the distribution function which is the number density of the particle having velocity ciα and subscripts i represents the direction of particle translation and α is the Cartesian co-ordinates. fi eq is the local equilibrium distribution function. The term on the right hand side represents the collision of particles and τ is called the single relaxation time factor. Superscript k represents gas and liquid phases, and we will use two particle model and k = G represents gas phase and k = L liquid phase, respectively. Macroscopic variables, density ρ , flow velocity u , and internal energy e are obtained by (2) (3) (4) where m represents the number of the particles. The pressure P = ρ e and sound p speed in gas phase csG = 2e for two-dimensional flows. 2.2 Interface Treatment In order to obtain sharp interface for immiscible two-particle model, artificial separation techniques [5] are sometimes introduced. In this study, phase separation or recolor technique by Latva-Kokko et al [8,9] is employed. In this technique, an additional term is introduced to the discrete BGK equation as (5) where fi ′k is re-distributed function calculated from the gradient of the interface. fi ′k is given by
Simulation of Sound Emitted from Collision of Droplet with Shallow Water
273
(6a,b)
where κ is a parameter to control the thickness of the diffusive interface and f i eqk (0) is the equilibrium distribution function for velocity zero considering the natural or pure diffusion. Each sum of fi ′G , f i ′L is not changed at the collision, and then the density, the momentum and the energy are conserved. ϕ is the angle between the gradient of the density and the particle velocity and given by (7) (8) 2.3 Introduction of External Forces External forces are introduced [10] to the discrete BGK equation as (9) where R is the gas constant and T is the absolute temperature. Alternatively, the force is introduced to the equilibrium distribution function as an impulsive force, and we employ the latter technique The equilibrium distribution function fi eqk = fi eqk ( t , ρ k , uα ) is modified by the external force Fα as (10) (11) (12) where W is the work done by the external force, and give by (13)
2.4 Surface Tension We employ a model proposed by Gunstensen [11] and Continuum Surface Force (CSF) method [12]. In CSF, the surface tension force FS is given by (14)
274
S. Tajiri, M. Tsutahara, and H. Tanaka
where σ is the surface (interfacial) tension coefficient Κ is the curvature of the interface, nˆ is the normal unit vector of the interface, and nˆ ( x ) = n ( x ) n ( x ) . The normal vector on the interface n ( x ) is calculated by (15) The curvature Κ is also calculated by (16)
2.5 Model for Two Fluids with Large Density Difference He et al [13] have proposed a large density difference fluid model up to about 40, and Inamuro et al [14] have proposed a model with density difference up to 1000. But the fluid of liquid phase is completely incompressible in Inamuro’s model, and the sound in liquid phase can not simulated by his model. Therefore we will propose a novel model for two-phase flows with large density difference. The density difference is realized by changing the acceleration, and this effect is also introduced to the model by impulsive force for each particle (17)
where P′ is the effective pressure, and a is the acceleration to cancel the original force term, a′ is newly introduced acceleration due to the density and the viscosity of considering fluid. m represents an averaged density and μ ′ is an averaged viscosity (18a,b) The density ratio of the two fluids is given as mL / mG , and in case of water and air, it is about 800, and the ratio of viscosity is about 70 and they change continuously across the interface.
2.6 Compressibility of Liquid A simple model of bulk elasticity is introduced to consider the compressibility of the liquid as (19)
Simulation of Sound Emitted from Collision of Droplet with Shallow Water
275
where β is a parameter to control the elasticity and corresponds the bulk (elasticity) modulus and P0 is a reference pressure and ρ0 is a reference density and is fixed to unity in this study. In liquid phase, we do not consider the temperature change.
3 Two-Dimensional Thermal Model 3.1 Two-Dimensional 21 Velocity Model In this study, two-dimensional thermal 21-velocity (D2Q21) model [15,16] is used, whose equilibrium distribution function is given by (20) The velocity set and the coefficients are shown in Fig.1, Tables 1 and 2, respectively.
Fig. 1. Distribution of particles of D2Q21 model Table 1. Velocity set in D2Q21 model
i
Velocity vector
1
( 0, 0)
0
2-5
( 1, 0), ( 0, 1), (-1, 0), ( 0,-1)
1
6-9
( 2, 0), ( 0, 2), (-2, 0), ( 0,-2)
2
10-13
( 3, 0), ( 0, 3), (-3, 0), ( 0,-3)
3
14-17
( 1, 1), (-1, 1), (-1,-1), ( 1,-1)
2
18-21
( 2, 2), (-2, 2), (-2,-2), ( 2,-2)
2 2
276
S. Tajiri, M. Tsutahara, and H. Tanaka Table 2. The coefficients Fi and B in D2Q21 model
i
Fi
5 17 35 49 1 2 2 4 2 4 Bc 96 B c 48Bc 45 1 13 71 3 2 2 4 8 Bc 16 B c 24 Bc2 1 5 25 3 16 Bc2 16 B 2 c 4 24 Bc2 5 1 1 1 1 24 Bc2 16 B 2 c 4 8 Bc2 15
1 2-5 6-9 10-13
Bc2 1 3 6 3 4B c
14-17
1 8
1 2 Bc2 3 1536 B 3 c 6
18-21
Sound speed Cs
B
8 7 6 5
1 2e
e=0.125 (Liquid) e=0.320 (Liquid) e=0.500 (Liquid) e=1.000 (Liquid) Full marks - Gas (Original)
4 3 2 1 0
0
1
2
3
4
5
6
7
8
Fig. 2. Sound speed versus bulk modulus by using D2Q21 model on internal energies e0=0.125, 0.32, 0.5, and 1. Full marks represent the sound speed of original FDLBM on each internal energy. The sound speed of liquid depends on only bulk elasticity modulus.
3.2 Sound Speed of Liquid Phase The sound speed of the liquid phase is given as (21) In order to check the sound speed, we calculated very weak pressure wave by a piston problem. The results are shown in Fig. 2, and the relation given above is shown to
Simulation of Sound Emitted from Collision of Droplet with Shallow Water
277
be satisfied. This result does not depend on numerical scheme. If we consider a system consists of water and air, the ratio of both sound speeds is about 4.4 and β m can be adequately chosen.
4 Sound Generated by Collision of a Water Drop on a Shallow Water The sound generated by rain drops is a very well-known sound, and intensive studies have been done [17]. In this repot, we calculated the sound generated by collision of a water drop to shallow water using the above mentioned model. The reason why we chose this problem is that the results, such as splash shape except sound, can be compared with other simulation results. In this calculation, we did not impose the gravitational force, because some internal compression waves are generated just after the gravitational force is imposed. Instead, the initial velocity was imposed as shown in Fig. 3. Space differential is discretized by the third order upwind scheme (UTOPIA) and time integration is done by the second order Runge-Kutta method. The parameters of the calculation are given as follows. The number of grids is 503 × 401 and the minimum grid size is 2 ×10−5 . Time increment is set 2 × 10−6 . The density ratio mL / mG = 1000 , kinematic viscosity of gas ν G = 1 ×10 −6 , and the bulk elastic modulus of liquid is 6400. The phase separation coef-
ficient κ = 0.9 , the surface tension coefficient σ = 1×10 −7 , the droplet diameter D = 2 ×10−3 , the water depth D f = 0.075 D , and the initial velocity of the droplet U = 0.02 . It should be noted here that these parameters are all non-dimensional values based on the minimum particle speed c = 1 , the reference time t = 1 , and the ref-
erence length c / t = 1 . Mirror boundary condition
Calculation domain
Gas Liquid droplet Non-slip wall
z y
D x
U
Df
Fig. 3. Calculation domain for impact of a drop on a thin film of the same liquid. Mirror boundary condition was employed for the central domain, and the upper boundary condition was free boundary.
S. Tajiri, M. Tsutahara, and H. Tanaka
Spread factor r*=r/D
.
278
10
1
Re=100 Re=500 Re=1000 R=(DUt)^0.5
0.1
r
0.01 0.01
0.1
1
10
Dimensionless time t*=Ut/D
Fig. 4. Log-log plot of the spread factor r * . The solid line corresponds to the power law R = DUt .
Pressure propagation in liquid phase
t*=0.16
Pressure fluctuation of impact
Splashing
t*=0.40
Small bubble
Fig. 5. Density field and pressure distribution for Reynolds number Re=1000
Figure 4 shows the relationship between another non-dimensional time t* = Ut / D and space factor r* = r / D , where U is the initial velocity of the drop, D is the diameter of the drop and r is the radius of the splash root. The relationship fits well to a power law. The sound generated at the collision is shown in Fig. 5. Sounds propagating into the gas phase and also into the liquid phase are simultaneously shown there. The sound generated by the deformation of the drop (round shaped sound) and generated by the underwater sound (sound speed is 4.4 times larger than the former) are clearly
Simulation of Sound Emitted from Collision of Droplet with Shallow Water
(a) D f
0.075 D
(b) D f
0.3 D
279
Fig. 6. The directivity of sound propagating into gas phase for different water depth
presented. It should be noted that small gas bubbles are also generated and caught by water. Figure 6 shows that the sound propagating into gas phase has complicated directivity, and that the directivity depends on the depth of the water.
5 Conclusion A two-phase-flow model with large density difference and including the elasticity of liquid is proposed and simulation of sound generated when a water droplet collides with shallow water was performed. The sounds propagating into gas and liquid phases were successively simulated.
References 1. Harlow, F.H., Welch, J.E.: Numerical calculation of time-dependent viscous incompressible flow of fluid with free surface. Physics of Fluids 8, 2182–2189 (1965) 2. Hirt, C.W., Nichols, B.D.: Volume of fluid (VOF) method for the dynamics of free boundaries. Journal of Computational physics 39, 201–225 (1981) 3. Sussman, M., Smereka, P.: Axisymmetric free boundary problem. Journal of Fluid Mechanics 341, 269–294 (1997) 4. Yabe, T., et al.: The constrained Interpolation profile method for multiphase analysis. Journal of Computational physics 169(2), 556–593 (2001) 5. Shan, X., Chen, H.: Simulation of non-ideal gases and liquid-gas phase transition by the lattice Boltzmann equation. Physical Review E 49, 2941–2948 (1994) 6. Swift, M.R., Osborn, W.R., Yeomans, J.M.: Lattice Boltzmann simulation of non-ideal fluids. Physical Review Letter 75, 830–833 (1995)
280
S. Tajiri, M. Tsutahara, and H. Tanaka
7. Succi, S.: The lattice Boltzmann equation for fluid dynamics and beyond, Oxford, pp. 70– 73 (2001) 8. Latva-Kokko, M., Rothman, D.H.: Diffusion properties of gradient-based lattice Boltzmann models of immiscible fluids. Physical Review E 71, 056702 (2005) 9. Latva-Kokko, M., Rothman, D.H.: Static contact angle in lattice Boltzmann models of immiscible fluids. Physical Review E 72, 046701 (2005) 10. He, X., Shan, X., Doolen, G.D.: Discrete Boltzmann equation model for nonideal gases. Physical Review E 57, 13–16 (1998) 11. Gunstensen, A.K., Rothman, D.H., Zaleski, S., Zanetti, G.: Lattice Boltzmann model of immiscible fluid. Physical Review A 43, 4320–4327 (1991) 12. Brackbill, J.U., Kothe, D.B., Zemach, C.: A continuum method for modeling surface tension. Journal of Computational Physics 100, 335–354 (1992) 13. He, X., Chen, S., Zhang, R.: A lattice Boltzmann Scheme for Incompressible Multiphase Flow and Its Application in Simulation of Rayleigh-Taylor Instability. Journal of Computational Physics 152, 642–663 (1999) 14. Inamuro, T., Ogata, T., Tajima, S., Konishi, N.: A lattice Boltzmann method for incompressible two-phase flows with large density differences. J. Computational Physics 198, 628–644 (2004) 15. Tsutahara, M., Kurita, M., Kataoka, T.: Direct simulation of Aeolian tone by the finite difference Lattice Boltzmann method. Computational Fluid Dynamics 2002, 508–513 (2003) 16. Tsutahara, M., Kataoka, T., Shikata, K., Takada, N.: New model and scheme for compressible fluids of the finite difference lattice Boltzmann method and direct simulations of aerodynamic sound. 37, 79–89 (2007) 17. Pumphrey, H.C., Crum, L.A., Bjomo, L.: Underwater sound produced by indivisual drop impacts and rainfall. Journal of Acoustical Society of America 85(4), 1518–1526 (1989)
Multiscale Numerical Models for Simulation of Radiation Events in Semiconductor Devices Alexander I. Fedoseyev1, Marek Turowski1, Ashok Raman1, Michael L. Alles 2, and Robert A. Weller 2 1
CFD Research Corporation (CFDRC), 215 Wynn Drive, Huntsville, AL 35805
[email protected] 2 Vanderbilt University, Nashville, Tennessee 37237, USA
Abstract. This paper describes the new CFDRC mixed-mode simulator which combines multiscale 3D Technology Computer Aided Design (TCAD) device models (fluid carrier transport and nuclear ion track impact), and advanced compact transistors models. Key features include an interface and 3D adaptive meshing to allow simulations of single event radiation effects with nuclear reactions and secondary particles computed by Vanderbilt’s MRED/Geant4 tools. Keywords: Nanoscale device, radiation event, nuclear reaction, 3D transient simulation, 3D adaptive mesh generation, hydrodynamics, drift-diffusion, multiscale, finite volume method, computer-aided-design.
1 Introduction Space and ground-level electronic equipment with devices of nanometer-scale, extremely high densities and highly complex circuits may be strongly affected by ionizing radiation. Accurate multiscale numerical models are necessary for better analysis of radiation effects and design of radiation tolerant devices [1,2]. Nano-scale electronic device responses to radiation are strongly related to the microstructure of the radiation event. Multiscale models and adaptive dynamic 3D mesh generation are needed to enable efficient simulations of transient semiconductor device and circuit responses to realistic multi-branched track, produced by ionizing particle. Predictive calculations can leverage efficient mixed-mode simulation capability, that is, threedimensional (3D) physical simulations of ionizing radiation effects in semiconductor structure coupled with external circuit solver. To design radiation-hardened (rad-hard) electronics and predict circuit/system characteristics, modeling tools are required at multiple levels (Fig. 1). The energy deposition by ionizing particles (so called single event effects) can no longer be treated as an average deposition (linear energy transfer, LET) along a linear path (Fig. 2). The impact of such radiation events has implications for nano-scale devices reliability operating in space exploration environments. The new multiscale approach to understanding the single-event response of semiconductor materials, devices, and circuits is necessary for reliable engineering models. M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 281 – 290, 2008. © Springer-Verlag Berlin Heidelberg 2008
282
A.I. Fedoseyev et al.
Adaptive dynamic 3D mesh generation capability has been developed to allow complex, multi-branched track data generated by Vanderbilt’s Geant4/MRED tools [1,2] simulations to be incorporated into CFDRC NanoTCAD device simulator [3-5] for transient device response simulations (Fig 3).
Fig. 1. Space electronics systems require automated design tools across multiple levels: macro, micro, nano. More accurate radiation-effects models are needed for new, nano-scale generations of semiconductor technologies, devices, and ICs.
Fig. 2. Examples of realistic ion tracks, to be simulated for device respond, computed by the Vanderbilt University group (middle: 100MeV proton of energy goes through strikes silicon device, generates hundreds of secondary tracks; right: 1GeV proton strike generates few secondary high-energy tracks)
In the following sections we provide the description of the multiscale simulation approach with focus on the new adaptive 3D mesh generation around ion tracks, and efficient solution methods for large 3D unstructured meshes which are necessary for transient radiation effects simulations.
2 Numerical Models Implemented and Governing Equations The numerical models for efficient 3D simulation of realistic radiation events in semiconductor nano-scale devices are being integrated within the advanced software tool NanoTCAD, which is a 3D device simulator developed and commercialized by
Multiscale Numerical Models for Simulation of Radiation Events
283
incoming particle
(a)
Metal 1
Metal 2
NMOS
PMOS
(b) Fig. 3. (a) 3D model of the CMOS inverter test structure, with two metal layers (M1 and M2) and examples of two radiation events computed by MRED/Geant4, where the incoming particle is hitting M1 (copper) above the NMOS-source tungsten contact. (b) 3D mesh generated (semitransparent) for NanoTCAD simulator with ion tracks shown.
CFD Research Corporation [4]. This integration provides a user-friendly interface and a large database of the semiconductor material properties available in NanoTCAD. It also makes possible a complete PV-cell simulation including both quantum and classical models for the appropriate PV-cell elements, both DC and transient regimes, etc. The models are currently being extended to incorporate simulation of the electronphonon transport in QDS made of semiconductors with both cubic and hexagonal crystal lattice, e.g., InAs/GaAs, Ge/Si, CdSe, ZnO. The drift-diffusion model implemented in NanoTCAD is described below. 2.1 Fluid Carrier Transport Model Fluid carrier transport drift-diffusion models are formulated based on continuity equations for electrons and holes and Poisson equation for electrostatic potential. They are able to provide good comparison with experimental data for transistors with channel length down to 15 nm. Conservation of charge for electrons is represented by continuity equation q
∂n +∇⋅J =qR n ∂t
(1)
q
∂p +∇⋅J =-qR p ∂t
(2)
and similarly for holes as
where the electron current is J =qμ (U ∇n-n∇Ψ) n n T
(3)
J =-qμ (U ∇p-p∇Ψ) p n T
(4)
and hole current is
284
A.I. Fedoseyev et al.
3 Here n and p are electron and hole densities [1/cm ], Ψ is electrostatic potential [V], q k T B is carrier charge (electron charge e), U = , and diffusion coefficients are T q 2 D =U μ and D =U μ [cm /s]. n T n p T p Electron and hole mobilities μ , μ are calculated parameters (models depend on n n the material, device, or calculated from quantum or kinetic level problems). Electrostatic potential which appears in current equations is governed by Poisson equation ∇ε∇Ψ=q(n-p-C)
(5)
where Ψ is electrostatic potential [V], ε is dielectric constant, and C is a doping, + + − C=ND-NA where N D and N A are constitutive doping . 2.2 Boundary Conditions Boundary conditions for n, p, and Ψ are shown below for the example of Ohmic contact. At the Ohmic contact we assume thermal equilibrium and vanishing space charge which results in 2 n⋅p-ni =0 (6) n-p-C=0 (7) Solving a quadratic equation for n, p we get Dirichlet conditions for n and p on the boundary (Ohmic contact)
n0 =
1 2
(
C 2 + 4ni2 + C
)
(8)
p0 =
1 2
(
C 2 + 4ni2 − C
)
(9)
The boundary potential at an Ohmic contact is the sum of the externally applied potential (voltage) V (t) and the so called built-in potential, which is produced by C doping Ψ=Ψ +V (t) bi C The built-in potential is
(10)
Multiscale Numerical Models for Simulation of Radiation Events
⎡ C ( x ) + C 2 + 4n 2 i Ψ bi = U T ln ⎢ 2ni ⎢⎣
⎤ ⎥ ⎥⎦
285
(11)
where the intrinsic concentration n is: i n= i
n⋅p
(12)
2.3 Discretization of Governing Equations and Efficient Solution of Linear Algebraic System Governing equations (1) to (4), (5) are discretized by finite volume method and solved simultaneously using the Newton technique, to ensure a good convergence. Unstructured adaptive meshes are used for domain discretization. In NanoTCAD, we use a high-performance iterative linear solver CNSPACK, developed by Fedoseyev et al. in (2001) [8]. CNSPACK uses a high order preconditioning by incomplete decomposition to ensure the accuracy, stability and convergence of the simulations. The linear algebraic system is solved in CNSPACK using a CGS-type iterative method with preconditioning by the incomplete decomposition of the matrix. Comparing the CGS and GMRES methods in different tests, it was found that both methods converge well, if a good preconditioner is used. The CGS method needs less memory to store only eight work vectors. To reduce the memory requirements, a compact storage scheme for matrices is used in CNSPACK. It stores only the nonzero matrix entries. The incomplete decomposition (ID) used for preconditioning, is constructed as a product of triangular and diagonal matrices, P=LDU. To avoid diagonal pivot degeneration, the Kershaw diagonal modification is used [9]. If the value of diagonal element becomes small dur-t ing the construction of preconditioning matrix, i.e. |a |<α= 2 σμ , the diagonal a ii ii is replaced by α . Here σ and μ are the maximum magnitudes of current row and i -t column elements, and 2 is a machine precision (t bits in mantissa). i For the first order ID, the matrix P has the same non-zero entry pattern as the original matrix. For a second order or higher ID, matrix P has one or more additional entries near the locations of the non-zero matrix entries, where the original matrix entries are zeros. The incomplete decomposition order is selected automatically to ensure a stable and fast convergence, typical values for ID are 2 to 6. Compared to other 3D commercial device simulators, NanoTCAD solver uses dramatically less memory (by a factor of 20), and CPU time is much smaller (by a factor of 10 to 20) for the similar problem and similar mesh size. Very large problems are solved within days instead of months compared to other commercial software.
286
A.I. Fedoseyev et al.
NanoTCAD simulator can solve a transient 3D multi-branched ion strike problem with 100,000 node unstructured mesh within 512MB memory. It can solve the problem with 3D unstructured mesh for up to 500,000 nodes within 2GB memory. This efficient linear solver made possible to use adaptive unstructured 3D mesh generation for a transient 3D multi-branched ion strike problem.
3 Novel Multiscale Single-Event Radiation Effect Models We are developing multiscale single-event models, which better reflect the true space environment and help to: (i) better understand and predict response of nano-devices and novel materials to the space radiation environment, particularly high atomic number and energy particles (HZE particles) and energetic protons, and the interaction with materials and devices; (ii) develop and assess radiation hardening techniques for space electronics; (iii) better evaluate the radiation response of new electronic devices and circuits at early design stage; (iv) reduce the amount radiation-laboratory testing, hence reducing the cost and time of exploration electronic system design. These new models have required development of a new 3D adaptive mesh generation tool (which we have named Germes, for Gerris-based Mesher) to ensure better accuracy of the simulations and more reliable prediction of radiation issues in nanoscale technologies and devices. The adaptive 3D mesh is particularly important for accurate modeling of complex, multi-branched tracks of single events (ion strikes) computed by MRED/Geant4 tools from Vanderbilt University, like, for example, presented in Fig.4 .
Fig. 4. View of “event 267” track-energy data (“tsv” file), plotted using “filaments” in CFDRC 3D software package. Boundaries of track segments from Geant4 are visible as black stripes. The incident particle was moving in the +Y direction (from bottom to top). All sizes are in μm.
Multiscale Numerical Models for Simulation of Radiation Events
287
The NanoTCAD package contains an unstructured mesh solver. This makes a wider choice of possible meshes and refinement methods. Ones of particular interest are binary-tree (bin-tree) mesh-refinement methods. The mesh refinement is based on bin-tree levels, to discretize each particular part of the sub-domain. In this case, one can generate the refined grid cells/points in the volume of the transistor in the vicinity of particle tracks. We use more levels for higher energy deposition in the vicinity of tracks, Fig. 5.
(a)
(b) Fig. 5. (a) Adaptive mesh generation for 2D example case, which uses more levels for higher energy deposition in the vicinity of tracks, and (b) for the real 3D event 267 tracks. Different isosurface of energy value is plotted with more details. The adaptive mesh refinement in the vicinity of tracks is shown in different plane cross-sections at Y=const .
The advantage of this approach is that the refined mesh is quite localized in the vicinity of tracks only, dramatically saving the computational time. The difficulty of this approach is a much more complicated implementation. For our experiments, we have used the open-source modules from Gerris flow solver [6], [7]. It has 2D and 3D capabilities, and both static and dynamic mesh refinement modules. We have developed the interfaces from MRED/GEant4 radiation event data to the Gerris mesh generation module and performed a series of tests for the mesh refinement with different particle-track models in the volume of the devices, both 2D and 3D cases. Fig. 5 shows bin-tree mesh refinement examples in the vicinity of ion tracks
288
A.I. Fedoseyev et al.
for 2D and 3D test cases. One can see that the mesh refinement is local, in the vicinity of tracks only. This is very important for the efficiency of computations for very large problems .
4 Nanotcad Transient Simulations with MRED/Geant4 Tracks with Developed Accurate Radiation Event Model The 3D mesh, created using the new CFDRC tools described above, can be used for transient 3D simulations of semiconductor nano-device response to such a radiation event generated by the Vanderbilt MRED/Geant4 interface. First such simulations have been performed at CFDRC, for a NMOS-transistor with external circuit (Fig. 6). Vdd
Vdd
Vdd
Vdd
10 (floating) load_inv
Vdd
P1
1 low input
P2
2
P3
3
4
P4
5
6
7
8
9 Vout
Vin inv5 N1
S
G
Ion Strike
N2
N3
inv6
inv7
inv8
N4
D Vss
0
Vss
inv2
Vss
inv3
P-subs
Vss
inv4
Circuit Simulated by SPICE 3D NMOS Model solved by NanoTCAD
Fig. 6. The 8-inverter chain simulation setup used for our mixed-mode simulations
The ion strike duration time-scale is one ps, but the time-scale of transient device respond is up to six orders of magnitude higher, 1μs. Numerical simulation of such transient behavior is difficult and required adaptive integration in time. Sample results are shown in Fig. 7 (b), the collected charge conservation is very good. For the device response computation, we simulated the NMOS device using the CFDRC NanoTCAD 3D solver [6] and the rest of the circuit in Spice (using the CFDRC mixed-mode package, MixSpice). In the NMOS 3D device volume, we applied the radiation event data obtained from Vanderbilt MRED simulation, which were then used in CFDRC new tool, Germes, to generate a 3D model with adaptive meshing around the ion tracks. The multi-branched ion-track segments outside of the NMOS 3D TCAD model can be ignored, as not affecting the device behavior. A series of test simulations was done, with different mesh sizes, with adaptive mesh refined near the ion tracks and doping profiles. We found that a 3D mesh with about 30,000 nodes for NMOS was sufficiently good, producing results for currents and collected charge <8% different from the results obtained for 3D mesh sizes of 80,000
Multiscale Numerical Models for Simulation of Radiation Events
289
Source Drain
(a)
(b) Fig. 7. Multiscale numerical simulations : MRED/Geant4 nuclear reaction radiation event 3D simulation in NMOS by CFDRC NanoTCAD. (a) Ion tracks (isosurface of deposited energy), 3D mesh locally refined around tracks, and resulting electron concentration (color map), at t = 2.4 ps. (b) Transient currents flowing through the contacts and the collected charges versus time (for this event, the total collected charge Qcol = 59 fC).
and 120,000 nodes, while saving the computation time dramatically. Our typical transient simulation with 3D mesh of 30K-40K nodes takes about 40-60 minutes on a 2GHz CPU.
5 Conclusions A description of multiscale numerical models involving fluid carrier transport and nuclear ion track generation, adaptive 3D meshing approach, and efficient solution method for large 3D problems using unstructured meshes has been provided. Multiscale models and adaptive dynamic 3D mesh generation presented in this paper enable efficient simulations of transient semiconductor device response to realistic multibranched track, produced by ionizing particle. The developed approach efficiently
290
A.I. Fedoseyev et al.
uses computer resources, makes possible solution of transient 3D multi-branched ion strike problems on a regular desktop computer. Key features include an interface and adaptive meshing to allow simulations of single event radiation effects with nuclear reactions and secondary particles computed by MRED/Geant4 tools from Vanderbilt University. As shown by many recent papers, the contribution of nuclear reactions to single event upset cross-section of modern ICs may be very significant.
References 1. Weller, R.A., Sternberg, A.L., Massengill, L.W., Schrimpf, R.D., Fleetwood, D.M.: Evaluating average and atypical response in radiation effects simulations. IEEE Trans. Nucl. Sci. 50, 2265–2271 (2003) 2. Kobayashi, A.S., Sternberg, A.L., Massengill, L.W., Schrimpf, R.D., Weller, R.A.: Spatial and Temporal Characteristics of Energy Deposition by Protons and Alpha Particles in Silicon. IEEE Trans. Nucl. Sci. 51, 3312 (2004) 3. Turowski, M., Raman, A., Schrimpf, R.: Non-Uniform Total-Dose-Induced Charge Distribution in Shallow-Trench Isolation Oxides. IEEE Trans. on Nuclear Science 51, 3166–3171 (2004) 4. CFDRC, NanoTCAD (2006), http://www.cfdrc.com/bizareas/microelec/micro_nano/ 5. Fedoseyev, A.I., Turowski, M., Wartak, M.S.: Kinetic and Quantum Models for Nanoelectronic and Optoelectronic Device Simulation. J. of Nanoelectronics and optoelectronics 2, 234–256 (2007) 6. Popinet, S.: Numerical modelling of fluid flow. Marsden Update (July 2004) 7. Popinet, S.: Free Computational Fluid Dynamics. ClusterWorld, 2 (June 2004) 8. Fedoseyev, A.I., Bessonov, O.A.: Iterative solution for large linear systems for unstructured meshes with preconditioning by high order incomplete decomposition. Comput. Fluid Dynamics Journal 10, 296–304 (2001) 9. Kershaw, D.S.: The incomplete Cholesky-conjugate gradient method for the iterative solution of systems of linear equations. J. Comput. Phys. 2, 43 (1978)
Scale-Splitting Error in Complex Automata Models for Reaction-Diffusion Systems Alfonso Caiazzo1, Jean Luc Falcone2 , Bastien Chopard2 , and Alfons G. Hoekstra1 1
Section Computational Science, University of Amsterdam, The Netherlands {acaiazzo,alfons}@science.uva.nl 2 CUI Department, University of Geneva, Switzerland {Jean-Luc.Falcone,Bastien.Chopard}@cui.unige.ch
Abstract. Complex Automata (CxA) have been recently proposed as a paradigm for the simulation of multiscale systems. A CxA model is constructed decomposing a multiscale process into single scale sub-models, each simulated using a Cellular Automata algorithm, interacting across the scales via appropriate coupling templates. Focusing on a reactiondiffusion system, we introduce a mathematical framework for CxA modeling. We aim at the identification of error sources in the modeling stages, investigating in particular how the errors depend upon scale separation. Theoretical error estimates will be presented and numerically validated on a simple benchmark, based on a periodic reaction-diffusion problem solved via multistep lattice Boltzmann method. Keywords: Complex Automata, multiscale algorithms, reaction-diffusion, lattice Boltzmann method.
1
Introduction
Complex Automata (CxA) have been recently proposed [1,5,6] as a paradigm for the simulation of multiscale systems. The idea of a CxA is to model a complex multiscale process using a collection of single scale algorithm, in the form of Cellular Automata (CA), lattice Boltzmann methods, Agent Based Models, interacting across the scales via proper coupling templates [6]. To construct a CxA we identify the relevant sub-processes, defining their typical time and space scale. The concept of the Scale Separation Map (SSM) [6] helps in this modeling stage. It is defined as a two dimensional map with the Cartesian axes coding the temporal and spatial scales. Each single scale model defines a box on the SSM, whose leftmost and bottom edges indicate the resolution of the single scale Automaton, while the rightmost and top edges are defined by the characteristics of the single scale process (e.g. the extreme of spatial and temporal domains). The key idea of a CxA is to transform a single big box spanning several time and space scales on the SSM in a set of interconnected smaller boxes (see Fig. 1). M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 291–300, 2008. c Springer-Verlag Berlin Heidelberg 2008
292
A. Caiazzo et al. space scale CA2
CA3
L1 ξ1
CA1 CA4
Δx1 Δt1
τ1
T1
time scale
Fig. 1. Example of a scale separation map for a CxA model. The solid box represents a complex process spanning several space and time scales. The CxA model splits the box in N smaller boxes (dashed lines, centered in the typical scales (τi , ξi ) of the sub-processes i) interacting across the scales.
Formally, we replace the original multiscale process, identified by state variable + update rule, with a collection of N single scale models, defined by (state variables + update rule)i , i = 1, . . . , N plus a set of coupling templates, describing how the single scale models interact together. Aim of this paper is to propose a formalism and a procedure to analyze CxA models. Focusing on an algorithm designed for reaction-diffusion problems, we investigate the scale-splitting error, i.e. the difference between the numerical solution obtained using a single multiscale algorithm based on single time and space discretization (equal to the ones needed to resolve the small scales) and the numerical solution obtained using the corresponding CxA model. In section 2 we introduce shortly a lattice Boltzmann scheme for a reactiondiffusion problem, describing how a CxA model for that system can be constructed. In section 3 we investigate the error introduced in the CxA model, deriving explicit estimates for the considered benchmark, which are validated on simple numerical simulations. Finally, we draw the conclusion in section 4.
2 2.1
From a Multiscale Algorithm to Single Scale Models Lattice Boltzmann Method for a Reaction-Diffusion Problem
We want to construct a CxA model for a reaction-diffusion process (RD) described by the equation ∂t ρ = D∂xx ρ + R(ρ), t ∈ (0, Tend ], x ∈ (0, L]
(1)
(plus additional initial and boundary conditions). To solve numerically (1) one can use an algorithm based on the lattice Boltzmann method (LBM) (overviews of the LBM can be found for example in [4,8], or in [2] for application to RD problems). For a chosen small parameter h, we
Scale-Splitting Error in Complex Automata Models
293
discretize the space interval with a regular grid Gh = {0, . . . , Nx − 1} of step size Δxh = h, and employ a D1Q2 model based on two velocities ci ∈ {−1, 1} and an update rule of the form 1 1 eq tn (f (ρ (j)) − fitn (j)) + h2 R(ρtn (j)) . τ i 2
fitn +1 (j + ci ) = fitn (j) +
(2)
Here j ∈ Gh is the spatial index of the node1 , while n ∈ N0 is the index which counts the time steps of length such that Δth = const. Δx2h
(3)
For the D1Q2 model we have fieq (ρ) =
ρ , i = 1, −1 . 2
(4)
Algorithm (2) leads to a second order approximation of the solution of (1) [7] if the parameter τ in (2) is chosen according to the diffusion constant in (1) as τ=
Δth 1 +D . 2 Δx2h
(5)
Observe that τ is independent from h in virtue of (3). In a more compact form, we can rewrite (2) as t fˆhn+1 = Ph (Ih + ΩDh (τ ) + ΩRh )fˆhtn ,
(6)
where fˆh (omitting the subscript i) represents a h-grid function , i.e. a real valued function defined on a grid of step h: fˆh : Gh → IR2 , fˆh : j → fˆhn (j) . Introducing the set Fh = φ : Gh → IR2 we have fˆh ∈ Fh . With this notation the subscript h denotes operators acting from Fh to itself: Ih is simply the identity on Fh , Ph acts on a grid function shifting the value on the grid according to ci Ph fˆh (j) = fˆi,h (j − ci ), i
while ΩDh and ΩRh are the operations defined in the right hand side of (2): 1 1 n ΩDh fˆh (j) = (fieq (ρ(fˆh (j)) − fˆh,i (j)), ΩRh fˆh (j) = h2 R(ρ(fˆh (j)) . τ 2 i Since ΩDh depends on the non-equilibrium part of fˆh and ΩRh is a function of the moment of the distribution, i.e. of the equilibrium part, it can be shown [3] that (7) ∀fˆh : ΩDh ΩRh fˆh = 0 . 1
For simplicity we assume periodic boundary conditions, considering the addition j + ci modulo Nx .
294
A. Caiazzo et al.
and similarly, since the operator ΩDh maps any element of Fh in grid function with zero moment: ∀fˆh : ΩRh (Ih + ΩDh )fˆh = ΩRh fˆh .
(8)
Relation (7) allows to split the LB algorithm (6) (see also [2]) as t fˆhn+1 = Ph (Ih + ΩDh (τ ))(Ih + ΩRh )fˆhn = Dh Rh fˆhtn ,
(9)
separating reaction Rh = Ih + ΩRh and diffusion Dh = Ph (Ih + ΩDh (τ )). Scale Separation Map. The next step towards the CxA model is the scale separation map. We start with the discrete process described by equation (6), which represents in our case the multiscale algorithm. On the SSM (in this simplified case restricted to the time scale axis) it spans a range of time scales between Δth = h2 and Tend (fig. 2a). Equation (9) shows that we can split the problem in two processes. Assuming that the diffusion is characterized by a typical time scale larger than the reaction time scale (for example if D is small compared to κ), we introduce a coarser time step ΔtD = M Δth ,
M ∈ N.
In practice, it corresponds to execute M steps of the reaction R, up to a time TR = ΔtD , followed by a single diffusion step. (a)
(b)
(RD)
Δth
(R)
Tend
Δth
(c) (D)
ΔtD =TR
(D)
(R)
Tend
T Δth R
ΔtD
Tend
Fig. 2. The SSM for the reaction-diffusion problem. In (a) reaction (dashed line) and diffusion (solid line) are considered as a single multiscale algorithm. In (b) we assume to use different schemes, where the diffusion time step ΔtD is larger than the original Δth . (c) represents the situation where the two processes are time separated, with a very fast reaction yielding an equilibrium state in a time TR ΔtD .
Back to the SSM, we have shifted to the right the lower edge of the diffusion box (figure 2b) and to the left the upper extreme of the reaction box. We have TR = ΔtD , since the reaction is run in the whole time interval [0, ΔtD ] with time step Δth and reinitialized after each diffusion step. Observe that this is not the only possibility. In some systems, the reaction leads very quickly to an equilibrium state in a typical time smaller than the discrete time step of the diffusion. The scale map for this situation is depicted in
Scale-Splitting Error in Complex Automata Models
295
Fig. 2c. However, this is a rather special case and will not be discussed here. We restrict the analysis to a single species reaction-diffusion with linear reaction, focusing on the SSM in Fig. 2b, where the time scales are not completely separated. A more complete treatment of the different cases (including more general multiscale processes) shall be topic of an upcoming work. CxA Model for Reaction-Diffusion. After coarsening the time scale of diffusion algorithm with ΔtD = M Δth , we can define a coarser space discretization selecting a new parameter h2 Δxh2 = h2 = MX h,
MX ∈ N .
(10)
Note that equation (2) must be modified according to√(5). We restrict to two possibilities: MX = 1 (time coarsened CxA) or MX = M (time-space coarsened CxA, in order to preserve relation (3)). Introducing the vector of small parameter H = (h, h2 ),the CxA can be formally described with the state variable fˆH = fˆ1,h , fˆ2,h2 , whose components denote the state after reaction fˆ1,h and after diffusion fˆ2,h2 and evolve according to (CA R) t1,0 =t2,n2 init ˆt2,n2 ˆ f1,h1 = f1 f2,h2 , t1,n1 +1 1,t fˆ1,h = R1,h1 fˆ1,hn1 1 , n1 =0,...,M−1 1
(CAD ) 0 init ˆ ˆ f2,h2 = f2,h2 (ρ0 ), (11) t2,n2 +1 t2,n2 +MΔtR ˆ f fˆ2,h = D Π . 2,h2 h2 ,h 1,h1 2
init being the initial condition. fˆ2,h 2 Using different discretizations, we define also a projection Πh2 ,h from the grid Gh to Gh2 and a new operator Dh2 . In fact, it depends on the relaxation time τ , which must be modified according to (5). With the terminology introduced in [6] (11) is a CxA composed of two Automata coupled through the following templates. (i) CAR is coupled to CAD through the initial conditions, since the output value of a single CAD iteration is used to define CAR initial conditions. (ii) CAD is coupled to CAR through the collision operator, since the output value of the reaction (after M iterations of CAR ) is used to compute CAD collision operator. This example is a special case where the two processes act on the same variable, and it is possible to write the algorithm only depending on fˆ2,h2 : t2,n +1 ˆt2,n2 fˆ2,h 2 = D2,h (τ(M) )RM 2,h f2,h , 0 init fˆ2,h = fˆ2,h (ρ0 ) .
(12)
General Formalism. From a more general point of view, we can formalize the Complex Automata modeling technique starting with an algorithm in the form t
fhn+1 = Φh fhtn
(13)
296
A. Caiazzo et al.
defined on fine space and time scales, identified by a discretization parameter h. Equation (13) describes the evolution of a single multiscale algorithm for a complex process where the numerical solution fhtn denotes the state of the system at the n-th iteration and Φh is the update rule. As before, we introduce the spatial grid Gh so that fhtn ∈ Fh = {φ : G(h) → R2 },
Φh : Fh → Fh .
(14)
Constructing a CxA we replace algorithm (13) with several simpler single ˆ scale Cellular Automata, described by the state variable fH = fˆ1,h , . . . , fˆR,h 1
R
(where H = (h1 , . . . , hR )) and an update rules for each component of fˆH : ΦH = (Φ1,h1 , . . . , ΦR,hR ) .
(15)
CxA The numerical solution of the CxA model fˆH belongs to a space FH ⊂ F1 × . . . FR (since the state spaces can be shared by different Automata), and ΦH : FH → FH . We can also introduce a general projection operator Π which defines the way we coarsen our process and lift operator Λ which describes an approximate reconstruction of the fine grid solution starting from the CxA one:
ΠHh , Πhr ,h : Fh → Fhr , r = 1, . . . , R,
ΛhH : FH → Fh .
The relevant spaces and operators introduced in the formalism in the particular case of reaction-diffusion can be represented with the diagram fh ∈ Fh
ΠHh =(Πh1 ,h ,Πh2 ,h )
- fH = fˆ1,h1 , fˆ2,h2 ∈ FH
M ΦM h =(Dh Rh )
? fh ∈ Fh
ΦH =(RM 1,h ,D2,h2 )
ΛhH
fH
? ∈ FH
where a single step of the update rule ΦH corresponds to M steps of the Φh , since ΔtD = M ΔtR . In this example example, if h1 = h2 = h, the components of the projection ΠHh are equal to the identity on Fh . If h1 = h2 , Πh2 ,h can be a sampling of the numerical solution on a coarser grid, and Λh,h2 an interpolation routine.
3
The Scale-Splitting Error
The idea of the CxA model is to replace (Ah ): the original (complex) multiscale algorithm depending on a discretization parameter h, with (CxA): a collection of coupled single scales algorithms.
Scale-Splitting Error in Complex Automata Models
297
This yields an improvement of the performance which is paid by a possible loss of precision. In analyzing the CxA modeling procedure, we are interested in quantifying E Ah ,CxA , expressing the difference between the numerical solutions of (Ah ) and (CxA), which we call scale-splitting error or CxA-splitting error. This quantity can be related to the loss of accuracy. Calling E CxA,EX the error of the CxA model with respect to the exact solution and E Ah ,EX the absolute error of the model (Ah ), we can write CxA,EX A ,EX A ,CxA E ≤ E h + E h . (16) 3.1
Error Estimates for the Reaction-Diffusion Model
For the CxA model of (RD), since the algorithm is designed to approximate the variable ρ, we can define the scale-splitting error at time iteration tN as tN = ρ ΠHh fˆtN − fˆtN E A,CxA (ρ; M, MX , tN ) = ρ Πh2 ,h fˆhtN − ρ fˆ2,h h 2,h 2 2 (17) representing the difference between ρ(fˆh ), i.e. the numerical solution of the finegrid algorithm (9) and ρ(fˆ2,h2 ), i.e. the output of the CxA model (11) after both reaction and diffusion operators have been applied. Observing that fˆh is the solution of (2) and fˆ2,h2 is obtained from (11) we can rewrite (17) as M ˆtN −MΔth Λ f = ρ Πh2 ,h (Dh Rh ) fˆhtN −MΔth − D2,h2 Πh2 ,h RM (18) h,h 2 2,h2 h For simplicity, let us assume that the two solutions coincide at the previous time and that we can write2 tN −MΔth fˆhtN −MΔth = Λh,h2 fˆ2,h 2
(for example if tN − M Δth corresponds to the initial time). Since the reaction is local, we have M ∀M > 0 : RM h Λh,h2 = Λh,h2 R2,h2 . Additionally, note that Πh2 ,h Λh2 ,h fˆ2,h2 = fˆ2,h2 (projecting after reconstructing gives the same function). Hence, we conclude that the distance between the numerical solutions can be estimated by measuring the distance between the algorithms 2
In general, it holds t −M Δth ) tN −Δt2 = ΛhH fˆH + (MX , M, tN − Δt2 ) + Λ,Π (H, h) fˆhN
where (MX , M, t) is bounded by E(MX , M, t) and Λ,Π (H, h) depend on the accuracy of projection and lift operations. The derivation of the estimate for this case is not reported in this short communication [3].
298
A. Caiazzo et al.
M ≤ E A,CxA (MX , M ) := ρ Πh2 ,h (Dh Rh ) Λh,h2 − D2,h2 RM 2,h2 M Λh,h2 + ρ Πh2 ,h (Dh Rh ) − DhM RM h
ρ ΠHh DhM − D2,h2 Πh2 ,h RM = E (1) + E (2) . h Λh,h2
(19)
The contribution E (1) depends on the difference (Dh Rh )M − DhM RM h , which can be estimated as a function of [Dh , Rh ] = Dh Rh −Rh Dh , i.e. the commutator of the operators Rh and Dh . For example, if M = 2 we have 2
(Dh Rh ) = Dh Rh Dh Rh = Dh2 R2h − Dh [Dh , Rh ]Rh . An argument based on asymptotic analysis [3,7], assuming that the numerical solution (which is a h-grid function) can be approximated by a smooth function evaluated on the grid points, i.e. fˆh (n, j) ≈ f (tn , xj ),
(20)
can be used to show (also in virtue of the properties (7)-(8)) that
[Dh , Rh ] ∈ O h3 ∂x ρκ .
(21)
By counting the number of times the commutator appears in the difference M (Dh Rh ) − DhM RM h , we have 2 3
(Dh Rh )M − DhM RM h ∈ O M h ∂x ρκ .
(22)
(2)
The part EH derives from the coarsening of the original lattice Boltzmann algorithm. Assuming (20) for both the coarse- and the fine-grid solutions we obtain
2 2 ρ Πh2 ,h DhM − D2,h2 Πh2 h = O(MX h + M 2 D 3 h2 ) . (23) We are interested in how the error depends on the coarsening in time and space, fixed a fine discretization parameter h and for particular values of κ and D which regulate the typical time scales of the system. In conclusion, (22)-(23) yield
if MX = 1: E A,CxA (M ) = O M 2 h3 ∂x ρκ + O(M 2 D3 h2 ),
2 : E A,CxA (M ) = O M h2 . if MX = 1, M = MX 3.2
(24)
Numerical Results
We consider the benchmark problem (1) with D = 0.1, κ = 3 and initial value ρ(0, x) = sin (2πx), for which we have the analytical solution ρEX (t, x) = sin (2πx) exp
R − D4π 2 t .
(25)
Scale-Splitting Error in Complex Automata Models
299
From the simulation results fˆh of algorithm (2) and fˆ2,h2 of model (11), we evaluate (17) as E(MX , M ; tN ) =Πh2 ,h ρ(fˆh ) − ρ(fˆ2,h2 ) = ⎧ ⎫ ⎨ ⎬ (26) 1 tN = max (j)) , Πh2 ,h ρ(fˆhtN )(j) − ρ(fˆ2,h 2 tN Nx (h2 ) ⎩ ⎭ j∈Gh2
Using (25), we compare also the scale-splitting error with the quantity ⎧ ⎫ ⎨ ⎬ 1 ρ(fˆtn (j)) − ρEX (tn , xj )) , E Ah ,EX = ρ(fˆh ) − ρEX h = max h n Nx (h) ⎩ ⎭ j∈Gh
(27) i.e. the error of the original fully fine-discretized algorithm (6) (evaluated with an opportune norm on the fine-grid space, according to the norm chosen in (26)). Fig. 3 shows the results of scale-splitting error investigation, choosing different 2 . values of M and comparing the cases MX = 1 and M = MX (b)
(c)
0.06 −2
slope ∼ 1.1
0.05
maxtN E
EH (1,M,tN )
(a) 0.02
0.015
−3
0.04
M =16
−4
0.03
0.01
−5
0.02 −6
M =9
0.005
0.01
M =4 0
0
0.5
1
tN
1.5
E Ah
slope ∼ 2
−7
0 2
0
5
10
15
20
25
−8
1
1.5
2
2.5
3
3.5
M
Fig. 3. (a): CxA-splitting error versus time for M = 4, 9, 16, fixing M√X = 1. (b): Maximum values of the CxA-splitting error for MX = 1 (◦) and MX = M (×) over A ,EX a complete simulation, as a function of M . The dashed line shows the error Eh h 2 of the fully fine-discretized algorithm (Δxh = h, Δth = h ) with respect to the exact solution (25). (c): Order plot (fig. (b) in double logarithmic scale) of maximum CxA-splitting error versus M . The approximate slopes confirm the behavior predicted by (24).
In particular, in fig. 3b we compare the maximum scale-splitting error in time for different values of M also with the error E Ah ,EX defined in (27). It shows that for small M the splitting error is of the same order of the discretization error of the original lattice Boltzmann algorithm (6). In this cases, the simplification obtained with the CxA model does not affect the quality of the results. The order plot in fig. 3c confirms estimates (24). In fact, the splitting error increases linearly in M , while for MX = 1 the increment is quadratic. As a consequence, for a moderate range of M , MX = 1 can produce quantitatively better results.
300
4
A. Caiazzo et al.
Conclusions and Outlook
We introduced a formalism to describe Complex Automata modeling. In particular we investigated the scale-splitting error, i.e. the modeling error introduced by replacing a fully fine-discretized problem with multiple Cellular Automata on different scales. Restricting to a lattice Boltzmann scheme for reaction-diffusion problems, we have derived explicit estimates for the splitting error verifying the expectation on simple numerical simulations. The investigation of the scale-splitting error represents the basis of a theoretical foundation of the Complex Automata simulation technique. In a future work we will discuss generalizations of the concepts presented here and of the estimates derived in this particular example to more complicate systems and more general CxA models. Acknowledgments. This research is supported by the European Commission, through the COAST project [1] (EU-FP6-IST-FET Contract 033664).
References 1. The COAST project, http://www.complex-automata.org 2. Alemani, D., Chopard, B., Galceran, J., Buffle, J.: LBGK method coupled to time splitting technique for solving reaction-diffusion processes in complex systems. Phys. Chem. Chem. Phys. 7, 1–11 (2005) 3. Caiazzo, A., Falcone, J.L., Hoekstra, A.G., Chopard, B. Asymptotic Analysis of Complex Automata models for Reaction-Diffusion systems. (in preparation, 2008) 4. Chopard, B., Droz, M.: Cellular Automata Modelling of Physical Systems. Cambridge University Press, Cambridge (1998) 5. Hoekstra, A.G., Chopard, B., Lawford, P., Hose, R., Krafczyk, M., Bernsdorf., J.: Introducing Complex Automata for Modelling Multi-Scale Complex Systems. In: Proceedings of European Complex Systems Conference 2006 (CD), European Complex Systems Society, Oxford (2006) 6. Hoekstra, A.G., Lorenz, E., Falcone, J.-L., Chopard, B.: Towards a Complex Automata Formalism for Multi-Scale Modeling. (accepted for publication) International Journal for Multiscale Computational Engineering 5, 491–502 (2007) 7. Junk, M., Klar, A., Luo, L.-S.: Asymptotic analysis of the lattice Boltzmann Equation. Journal Comp. Phys. 210, 676–704 (2005) 8. Succi, S.: The Lattice Boltzmann Equation for Fluid Dynamics and Beyond. Oxford University Press, Oxford (2001)
Wavelet Based Spatial Scaling of Coupled Reaction Diffusion Fields Sudib K. Mishra1, Krishna Muralidharan2, Pierre Deymier2, George Frantziskonis1,2, Srdjan Simunovic3, and Sreekanth Pannala3 1
Civil Engineering & Engineering Mechanics, University of Arizona 2 Materials Science & Engineering, University of Arizona, Tucson, AZ 85721 USA 3 Computer Science & Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA {sudib,krishna,deymier,frantzis}@email.arizona.edu, {pannalas,simunovics}@ornl.gov
Abstract. Multiscale schemes for transferring information from fine to coarse scales are typically based on some sort of averaging. Such schemes smooth the fine scale features of the underlying fields, thus altering the fine scale correlations. As a superior alternative to averaging, a wavelet based scheme for the exchange of information between a reactive and diffusive field in the context of multiscale reaction-diffusion problems is proposed and analyzed. The scheme is shown to be efficient in passing information along scales, from fine to coarse, i.e. up-scaling as well as from coarse to fine, i.e. down-scaling. In addition, it retains fine scale statistics, mainly due to the capability of wavelets to represent fields hierarchically. Critical to the success of the scheme is the identification of dominant scales containing the majority of useful information. The scheme is applied in detail to the analysis of a diffusive system with chemically reacting boundary. Reactions are simulated using kinetic Monte Carlo (KMC) and diffusion is solved by finite differences. Spatial scale differences are present at the interface of the KMC sites and the diffusion grid. The computational efficiency of the scheme is compared to results obtained by local averaging, and to results from a benchmark model. The spatial scaling scheme ties to wavelet based schemes for temporal scaling, presented elsewhere by the authors. Keywords: Multiscale, wavelets, up-scaling, down-scaling, reaction, diffusion.
1 Introduction Diffusion from a reactive boundary is inherently a spatially and temporally multiscale problem, since the reaction process typically involves spatiotemporal scales quite different than those in the diffusion process. Reaction phenomena involve microscopic spatial and temporal scales, while diffusion involves mesoscopic ones. Thus, coupling of the two processes efficiently is challenging. Recently, the authors and co-workers developed a wavelet based scheme that addresses temporal scaling in M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 301–310, 2008. © Springer-Verlag Berlin Heidelberg 2008
302
S.K. Mishra et al.
such problems [1]. The present work complements those schemes efficiently by offering efficient spatial scaling that can be used in conjunction with the temporal one. In fact, it is shown in this paper that a combined spatial and temporal scaling is needed in order to minimize error from the tradeoff between computational efficiency and loss of information due to up and down-scaling in multiscale schemes. Modern simulation techniques deal with the coupling of the spatial and temporal scales involved in a problem. Multiscale models fall under two general categories, namely sequential, and concurrent. In the former ones, typically, a set of calculations performed at lower scales is used to estimate parameters needed in a model at higher scales. An example is molecular dynamic simulations used to come up with the parameters needed to describe the constitutive behavior of materials and processes at higher scales modeled by finite difference or finite element methods. In the latter ones (concurrent) different models are used in different spatial regions concurrently, while compatibility of the methods at interfaces is enforced or assumed. An example is the simulation of crack propagation, where atomistic simulations are used in the vicinity of the crack tip while continuum finite elements are employed to simulate the far field response. A general review on multiscale methods is that of [2]. Discipline specific review papers are also available in the literature, e.g. in general chemical engineering [3], surface science [4], reactor technology [5], heat transfer and thermal modeling [6], and works implementing general approaches such as gap-tooth and time stepping [7]. A general multiscale methodology based on wavelets has been examined for both spatial scales [8] as well as temporal scales [9, 10]. These works take full advantage of the inherent capabilities of wavelet analysis to represent objects in a multiscale fashion. The waveletbased approach, termed the compound wavelet matrix method, establishes a communication bridge between phenomena at different scales.
2 Reaction Diffusion System The reaction diffusion system is simulated by utilizing principles of operator splitting, which allows treating the reaction operator separately from the diffusion one. Once the two phenomena are simulated at two different scales the multiscale bridging and passing information from lower to higher scales and getting feedback from higher to lower scales is accomplished through multiscale interfacing. This section presents the details of the reaction and diffusion processes for the problems herein. The chemical processes are treated under the framework of reaction kinetics. First order reactions are considered in this work, i.e. the reaction rate is proportional to the concentration of the reactant to the first power. For reversible reactions such as k AB A ⎯⎯→ B (1) k BA B ⎯⎯→ A the first order rate constants k AB , k BA , each of inverse time units, define the reaction kinetics governed by d [ A] d [ B] = −k AB [ A] + k BA [ B ] , = −k BA [ B ] + k AB [ A] (2) dt dt The stochastic formulation of (2) yielding a KMC process is based on the probability distribution function for reaction events expressed as an exponential
Wavelet Based Spatial Scaling of Coupled Reaction Diffusion Fields
303
P ( R = ri ) = 1 − e − k [ S ]Δt (3) where P is the probability of the event ri . Here k denotes the reaction rate constant, [S] denotes concentration, and Δt is the reaction time demand (time until the next reaction event will occur). Using the time demand, as is standard in formulation of KMC algorithms, the following equations for forward (AÆB) and backward reaction (BÆA) are obtained 1 ln(1 − R1 ) t AB = − k AB [ A] (4) 1 ln(1 − R2 ) t BA = − k BA [ B ] where R1 and R2 are random numbers uniformly distributed between zero and unity. At any time in the simulation the reaction which requires the least time is the one that will occur. Thus, at every KMC iteration step, two random numbers are generated, i.e. R1 , R2 , and t AB , t BA are evaluated based on (4). The minimum of t AB , t BA is the time increment associated with the selected reaction event. Since the chemical processes at the boundary are coupled to the diffusion of species by operator splitting, the uncoupled reaction process was described above. For the same reason, the uncoupled diffusion process is described next, with proper boundary condition at the reactive site. The governing equations for the diffusion of species on a two dimensional, x-y spatial domain is ⎛ ∂ 2 u ( x, y, t ) ∂ 2 u ( x, y, t ) ⎞ ∂u ( x, y, t ) = D⎜ + (5) ⎟⎟ ⎜ ∂t ∂x 2 ∂y 2 ⎝ ⎠ where D denote the diffusion coefficients, considered constant over the domain, and u denotes concentration of species. A finite difference explicit Euler scheme, first order in time and second order in space, with fixed time steps and fixed spatial discretization is used to solve (5). The stability criteria for the numerical integration process is guaranteed when the Courant condition is satisfied, i.e. ⎡ ( Δs ) 2 ⎤ Δt ≤ ⎢ (6) ⎥ −1 ⎢⎣ 2 D ⎥⎦ where Δt , Δs denote the time increment and minimum spatial grid size, respectively. In this work, the diffusion of species in the 2-D domain is considered deterministic. The stochastic version of the diffusion process yields a Brownian motion process, not examined herein. The spatially 2-D model consists of the semi-infinite positive half space (diffusion domain) with chemical reactions taking place at the boundary of the half space (reaction domain). The boundary condition at the reactive site is that both A, B are specified by the values evaluated from the reaction kinetics during the operation splitting process. The reflecting boundary condition for the half space is implemented in the finite difference scheme by setting the outgoing flux to zero. At the other end of the discretized diffusion problem (far from the reactive site), the species are absorbed. Simulating an absorbing boundary in a finite difference algorithm is not trivial. However, true absorbing boundary can be simulated using an infinite element in a finite element method. For the present problem this issue is
304
S.K. Mishra et al.
tackled, simply, by taking a sufficiently large diffusion domain along diffusion direction (x) so that species do not reach the end within the time frame considered. In a general context, however, appropriate measures such as using infinite finite elements must be implemented. Finally, periodic boundary conditions are assumed for the (y) diffusion direction. Let { R}r denote the spatial “signal” or vector of concentrations along r discrete
equidistantly spaced reactive points, x1 , x2 ,..., xr . Similarly, let { D}d denote the spatial “signal” of concentrations along d discrete equidistantly spaced diffusion nodes, x1 , x2 ,..., xd , along the reactive boundary. For a benchmark solution of the reaction diffusion problem r = d ; in this benchmark solution, no spatial scaling is required, yet the process is computationally cumbersome. For an efficient solution, however, d r , thus, the number of diffusion grid nodes along the reactive boundary are less than the number of reaction sites. This is examined within the context of averaging and wavelet based mapping in the following two sections. The averaging, wavelet based, and benchmark schemes are illustrated using the discretization parameters listed in Table 1. The benchmark scheme is the most accurate one, since it uses a diffusion domain discretization at the scale of the reaction sites. For this reason, it is computationally expensive. The wavelet based scheme maps 256 reaction sites to 32 nodes in the diffusion grid, thus there are 8 reactive sites contributing to each node in the diffusion domain. The averaging scheme maps 256 reaction sites to 32 nodes in the diffusion grid. Table 1. Discretization used in the three schemes Diffusion domain
Benchmark Wavelet scheme Averaging scheme
Nodes in x 64 64 64
Nodes in y 256 32 32
Discretization in x 0.125 0.125 0.125
Reaction sites Discretization in y 0.1250 1.0282
Reaction sites 256 256
1.0282
256
Discretization in x 0.125 0.125 0.125
Isotropic diffusion is considered with a diffusion coefficient D=0.05 spaceunits/sec2. The reaction rate for both forward (AÆB) and backward (BÆA) reactions is considered the same, 2.5 per sec. Thus the reaction kinetics is solely governed by the reactive species concentration and is biased to the direction of lower species concentration. The initial concentration of species A is 100 and of B is 10 at the reactive boundary, while away from the boundary (diffusion domain) the initial concentrations are zero.
3 Multiscale Interfacing by Averaging Averaging can be used for passing information from fine to coarse scales. Here, the concentrations at the nodes along the reactive boundary belonging to the diffusion
Wavelet Based Spatial Scaling of Coupled Reaction Diffusion Fields
305
grid are taken as the mean concentration of the reactive sites contributing to that node, i.e. those in its vicinity. Conversely, these mean concentration fields are taken as the initial concentrations along the reactive boundary for the next KMC step. A simple way to map the reaction “signal” over the diffusion grid is to take the mean field of the contributing reaction sites, yielding two vectors based on operations on { R}r , i.e.
⎧ ..... ⎫ ⎪R ⎪ ⎪ i−2 ⎪ ⎪ Ri −1 ⎪ ⎧.... ⎫ ⎪ ⎪ ⎪ ⎪ ⎨ Ri ⎬ → ⎨ Di ⎬ , ⎪ R ⎪ ⎪.... ⎪ ⎪ i +1 ⎪ ⎩ ⎭ ⎪ Ri + 2 ⎪ ⎪ ⎪ ⎩ ..... ⎭
⎧ ..... ⎫ ⎪δ ⎪ ⎪ i−2 ⎪ ⎪ δ i −1 ⎪ ⎪ ⎪ ⎨ δi ⎬ ⎪δ ⎪ ⎪ i +1 ⎪ ⎪δ i + 2 ⎪ ⎪ ⎪ ⎩ ..... ⎭
(7)
where {δ }r denotes the deviators (residuals) from the mean field at the reaction sites,
and { D}r denotes the mean concentrations at the diffusion nodes. Based on averaging,
{D}d = [ H ]d ,r {R}r t
t
(8)
holds, with superscript t denoting evaluation at time t, and matrix H of dimensions d,r, is circulant, expressed as ⎡ ci ⎢c ⎢ i −1 H = [ ] ⎢0 ⎢ ⎢ci + 2 ⎢⎣ ci +1
ci +1 ci 0 0 ci + 2
ci + 2 ci +1 0 0 0
0
ci + 2 ... 0 0
0 0 ... 0 0
0 0 0 0 ... 0 0 ci − 2 0 0
0 0 0 ci −1 ci − 2
ci − 2 0 0 ci ci −1
ci −1 ⎤ ci − 2 ⎥⎥ 0 ⎥ ⎥ ci +1 ⎥ ci ⎥⎦ d , r
(9)
The coefficients in H can be set to appropriate weights for evaluating the mean field. In a way, H acts as a multiscale operating matrix, and this will become more evident in the sequence where the wavelet based scaling is described. The deviators from the mean field are expressed as
{δ }r = { R}r − [ M ]r , d {D}d = {R}r − [ M ]r , d [ H ]d , r {R}r
(10)
where matrix M maps the d diffusion grid nodes along the reactive boundary to the r reactive sites, and has a sort of unit matrix structure. Thus a stochastic species concentration spatial signal can be decomposed into a mean field which is sent across to coarser scales, i.e. to the diffusion grid, which diffuses the species, and the residual part which contains information on the correlation structure among the individual sites. However, for a deterministically reactive boundary, all sites are reacting equivalently so the residual vector is null in that case. Statistically, all of them are fully correlated. Since mean field in this case is the entire reaction field and the residual part is zero, such a deterministic system should behave as if there is no spatial scaling. This fact will be used in the sequence for validating the implementations.
306
S.K. Mishra et al.
Up-scaling maps the reaction sites to the diffusion grid, where species are allowed to diffuse over a time period that covers several reaction time steps. The reason for this is that the characteristic time scale for reaction is orders of magnitude lower than that of the diffusion time scale. An analytical estimate of the number of intermediate KMC steps to be covered between two successive diffusion steps is provided. At the end of the diffusion time step, the species concentrations as they appear on the diffusion grid need to be mapped back onto the reactive sites. This step, crucial to the success of the simulation is called down-scaling (not to be confused to “up-sampling” which is a terminology used mostly in signal processing), or backward feedback from diffusion grid to the reaction sites. Within the averaging framework, this is done by first assigning the updated mean field of concentrations from the diffusion grid to its contributing reactive sites and then adding to them the residuals δ . This operation is expressed as (constituting an explicit scheme)
{R}r
t +1
= [ M ]r , d { D}d + {δ }r t
t
t
(11)
where superscript t+1 indicates the time at one diffusion time step after time t.
4 Multiscale Interfacing Using Wavelets Here, analogously to the averaging scheme, the concentration at the reactive sites represented through vector { R}r are to be up-scaled to the diffusion grid nodes, thus
yielding vector { D}d . Let WR ( s, x ) denote the wavelet transform of { R}r , expressed
in terms of scale s and spatial coordinate x. WR ( s, x ) is decomposed as
WR ( s, x ) = f R ( s0 , Δx ) ⊕ f R ( s1 , Δx ) ⊕ f R ( s2 , 21 Δx ) ⊕ ... ⊕ f R ( sn , 2n −1 Δx )
(12)
where f R ( si , 2i −1 Δx ) , i = 1,.., n denote the wavelet transform at scale si and sampling interval 2i −1 Δx , and f R ( s0 , Δx ) denotes the transform at the coarsest scale using the
scaling function (thus it appears as if there are two wavelet decompositions at the first level, however one is that of the scaling function and the other of the wavelet in (12). Biorthogonal splines of so-called order [10,4] were used in this work. It is noted that for vector { R}r there are n = log 2 r − 22 number of scales. In (12), ⊕ implies scalewise association in the wavelet analysis formality. This hierarchical decomposition can be used to up-scale { R}r to the diffusion grid by using only the relevant few dominant scales in wavelet domain and truncating the remaining scales. Since the spatial resolution of the diffusion grid is smaller than that of the reaction sites, it may not be possible to up-scale { R}r and also have the same level of resolution as in
{D}d . In the wavelet domain this translates to having coarser resolutions than those
dictated by the sampling interval at the corresponding scale. Thus, if there are d number of points in the diffusion grid desired to have Δxd resolution then out of n = log 2 r − 22 available scales at the reaction sites only m = log 2 d − 22 scales,
Wavelet Based Spatial Scaling of Coupled Reaction Diffusion Fields
307
which preferably should all belong to the dominant ones, should be retained. Then, the wavelet transform of {D}d, WR ( s, x ) can be written as WD ( s, x ) = f R ( s0 , Δx ) ⊕ f R ( s1 , Δx ) ⊕ f R ( s2 , 21 Δx ) ⊕ ... ⊕ f R ( sm , 2m −1 Δx )
(13)
where the notation is the same as in (12). By truncating the coefficient at higher scales, it is natural to query whether the energy norm of { R}r is preserved. This partially guaranteed by the dominant scale representation of the field. However, truncation of finer scales introduces loss/gain of energy contained in those higher wavelet sub bands. But for all practical purposes the total energy (or mean) remains same; in addition the basic correlation structure is transferred through these scales of dominance. Relevant results are presented in subsequent section. With the species concentration up-scaled from the reaction sites to the diffusion grid, it is convenient and efficient to solve the diffusion equations in the wavelet domain, until the next, i.e. down-scaling step, described below. Since the wavelet transform is linear, it is permitted to carry over the diffusion equation on the wavelet domain and then back transform it to the physical domain. With the bridging of scales along the y coordinate, the diffusion equation (5) is transformed, with respect to the spatial axis y only in this case, into the wavelet domain, yielding ∂Wy [u ( x, y, t )] ∂t
⎡ ∂ 2Wy [u ( x, y, t )] ∂ 2Wy [u ( x, y, t )] ⎤ = D⎢ + ⎥ ∂x 2 ∂y 2 ⎣⎢ ⎦⎥
(14)
where Wy denotes the wavelet transform in the y direction only. Equation (14) can be written as
⎡ ∂ 2 u ( x, W y , t ) ∂ 2 u ( x, W y , t ) ⎤ ⎥ (15) = D⎢ + ∂t ∂x 2 ∂y 2 ⎢⎣ ⎥⎦ thus expressing u as a function of x, t, and the wavelet transform of spatial coordinate y. The inverse wavelet transform yields u in the x, y, t domain, i.e. ∂u ( x, Wy , t )
u ( x, y, t ) = Wy−1 ⎡⎣u ( x, Wy , t ) ⎤⎦
(16)
Finally, it is mentioned that the scales considered in (14)-(16) are the dominant ones. Similarly to the steps in the averaging scheme, after the diffusion step, { D}d is down-scaled to { R}r thus providing feedback to the reaction sites from the diffusion grid. The wavelet transform of
{D}d
expressed in (14) contains m = log 2 d − 22
scales, while n = log 2 r − 22 scales are required to define the wavelet transform of
{R}r .
Since n>m, the information in the “lacking” scales, i.e. the finer ones, is
obtained from the previous time step. This saliently assumes that the fine scale fluctuations of the concentrations remain invariant or change only a little between two successive up- and down-scaling operations and small diffusion time steps. Given the dominant scales described above, this is a reasonable assumption that does not introduce measurable error. Then, after down-scaling, the wavelet transform of {R}r , WR ( s, x ) is expressed appropriately.
308
S.K. Mishra et al.
5 Results Before proceeding to present any results the first thing is to check for the conservation. Conservation is one necessary condition that must be satisfied for any multiscale schemes. We check the mass conservation through counting the total number of species present in the system in kinetic evolution time steps. The relevant evolution shows that, with varying level of spontaneous fluctuations, mass is conserved in both the multiscale schemes. The parameters listed in TABLE 1 are used for three different schemes. From these, it can be concluded that all three schemes show obvious fluctuations in the kinetic evolution of species, which is due to the KMC reaction process. However, the fluctuations in both the wavelet and averaging schemes are reduced, as compared to the benchmark model. One reason for this is the intermediate time steps. Another reason for the wavelet scheme is that by only considering the dominant scales in the wavelet transform, small scale fluctuations are smeared out. Such smearing is more pronounced in the averaging scheme, since, even though it is incapable of recognizing dominant information, each averaging yields a spatially flat reaction field. The efficiency of the multi scale schemes depends on their capability to accurately retain temporal as well as spatial information on the concentrations. Thus, along with the temporal evolution the spatial distribution of concentrations should be studied; for compactness, such spatial maps are not shown here. It can be concluded that, as expected, both the averaging and wavelet spatial scaling schemes compromise the fine scale fluctuations for the gain in computational efficiency. However, it will be shown in the sequence that the wavelet scheme provides superior information as compared to the averaging scheme. The ensemble statistics of the diffusion profile along the diffusion field show that while the ensemble mean values are captured accurately by the multiscale schemes, the standard deviations are not captured as accurately by the averaging scheme as are by the wavelet one. In particular, the averaging scheme overestimates the standard deviations whereas the wavelet scheme underestimates them. The benchmark model is thus sandwiched between the two. The enhanced standard deviation values result from the sharp peaks present in the averaging model. The exclusion of ultra fine scales in wavelet filtering is responsible for a lower standard deviation of the fluctuations along reactive sites. However by increasing the number of wavelet scales in up- and down-scaling brings increasingly more fluctuations. Another important measure is the capturing of correlation structures by the multiscale schemes. Even though the KMC reactions are uncorrelated, diffusion of species in the y-direction introduces correlations. In order to examine whether the multiscale schemes are capable of capturing such correlations, an enhanced correlation on the concentrations along the reactive sites is imposed. Even though these do not contribute to the physics of the problem, they do help understand the effect of multiscaling in detail. Figure 1 shows the spatial autocorrelation of concentrations along the reactive sites (y-direction, at x=0) as it results from the three schemes. The benchmark scheme has the correlation structure as it results from the diffusion processes. The wavelet scheme is capable of capturing the spatial correlations by even including only first two
Wavelet Based Spatial Scaling of Coupled Reaction Diffusion Fields
1.0
309
(a)
autocorrelation
0.8 0.6 0.4 0.2 0.0
autocorrelation
1.0
(b)
0.8
3
6
9 12 spatial lag
1/9 th scaling 1/8 th scaling
0.6 0.4 0.2 0.0 -0.2 0
3
6
9 spatial lag
12
15
18
15
18
1.0 autocorrelation
0
(c)
0.8 0.6 0.4 one wavelet scales three wavelet scales
0.2 0.0 0
3
6
9 12 spatial lag
15
18
Fig. 1. The autocorrelation function resulting from (a) benchmark, (b) homogenization, (c) wavelet scheme. The scaling in the homogenized and wavelet schemes is defined as
S = (log 2 r / log 2 d ) where
r is the number of reaction sites, and d is the number of
diffusion sites.
coarsest scales (i.e. the scales containing the scaling coefficients and the next finer scale containing the coarsest wavelet coefficients). With three (or more) scales the autocorrelation structure from the wavelet scheme is practically identical to the one from the benchmark. To the contrary the averaging scheme is incapable of capturing the correlations.
6 Conclusions The spatial scaling wavelet scheme presented herein is offered as a complement to the wavelet based temporal multiscaling presented elsewhere. Even though the spatial multiscaling offers reduced CPU time demand while providing reasonable accuracy, a combined temporal and spatial scaling is required to simulate problems as the present reaction diffusion system efficiently, i.e. representing the behavior accurately at all spatiotemporal scales. An averaging scheme may offer a simple alternative to the wavelet one, yet its efficacy is inferior, especially with respect to correlations.
Acknowledgements This research is sponsored by the Mathematical, Information, and Computational Sciences Division; Office of Advanced Scientific Computing Research; U.S. Department of Energy with Dr. Anil Deane as the program manager. The work was partly performed at the Oak Ridge National Laboratory, which is managed by
310
S.K. Mishra et al.
UT-Battelle, LLC under Contract No. De-AC05-00OR22725. Discussions with M. Syamlal, T. J. O'Brien, and D. Alfonso of the National Energy Technology Laboratory (NETL), C.S Daw and P. Nukala of Oak Ridge National Laboratory, Rodney Fox and Z. Gao of Iowa State University have been very useful.
References 1. Frantziskonis, G., Mishra, S.K., Pannala, S., Simunovic, S., Daw, C.S., Nukala, P., Fox, R.O., Deymier, P.A.: Wavelet-based spatiotemporal multi-scaling in diffusion problems with chemically reactive boundary. Int. J. Multiscale Comp. Eng. 4, 755–770 (2006) 2. Vvedensky, D.D.: Multiscale modelling of nanostructures. J. Phys. Cond. Matter 16, R1537–R1576 (2004) 3. Marin, G.B. (ed.):Multiscale Analysis. Advances in Chemical Engineering, book series, vol. 30, pp. 1–309 (2005) 4. Dollet, A.: Multiscale Modeling of CVD Film Growth—a Review of Recent Works. Surface and Coatings Technology, 177–178, 245–251 (2004) 5. Hsiaotao, T., Jinghai, L.I.: Multiscale Analysis and Modeling of Multiphase Chemical Reactors. Advanced Powder Technol. 15, 607–627 (2004) 6. Murthy, J.Y., Narumanchi, S.V.J., Pascual-Gutierrez, J.A., et al.: “Review of Multiscale Simulation in Submicron Heat Transfer. Int. J. Multiscale Computational Engineering 3, 5–31 (2005) 7. Vasenkov, A.V., Fedoseyev, A.I., Kolobov, V.I., et al.: Computational Framework for Modeling of Multi-scale Processes. Computational and Theoretical Nanoscience 3, 453– 458 (2006) 8. Frantziskonis, G., Deymier, P.A.: Wavelet Methods for Analyzing and Bridging Simulations at Complementary Scales – the Compound Wavelet Matrix and Application to Microstructure Evolution. Modelling Simul. Mater. Sci. Eng. 8, 649–664 (2000) 9. Frantziskonis, G., Deymier, P.: Wavelet-based Spatial and Temporal Multiscaling: Bridging the Atomistic and Continuum Space and Time Scales. Phys. Rev. B 68, 024105 (2003) 10. Muralidharan, K., Mishra, S., Frantziskonis, G., Deymier, P.A., Nukala, P., Simunovic, S., Pannala, S.: The dynamic compound wavelet matrix for multiphysics/multiscale problems. Phys. Rev. E 77, 026714 (2008)
Domain Decomposition Methodology with Robin Interface Matching Conditions for Solving Strongly Coupled Problems Fran¸cois-Xavier Roux ONERA, High performance Computing Unit, 29 avenue de la Division Leclerc, 92320 CHATILLON, France
[email protected], http://www.onera.fr
Abstract. In the case of strongly coupled problems like fluid-structure models in aero-elasticity or aero-thermo-mechanics, a standard solution methodology is based on so called Dirichlet-Neumann iterations. This means that, for instance, the velocity at the interface between the two media is imposed in the fluid, the solution of the fluid problem gives a pressure that is imposed at the boundary of the structure, and then the solution of the problem in the structure gives a new velocity to be imposed to the fluid. This method is not always stable, depending on the relative properties of the media, unless a suitable relaxation parameter is introduced. In order to enforce both velocity and pressure continuity at the interface, the matching conditions can be formulated, like in domain decomposition methods, in a mixed form. This means that the boundary conditions derived in one physical domain from the other one is of Robin type. With Robin boundary condition, an interface stiffness, in the case of velocity-pressure conditions, is introduced. The optimal choice for this stiffness can be proved to be, in the case of linear problems, the so called ”Dirichlet-Neumann” operator of the opposite domain, this means for the discrete equations, the static condensation on the interface of the domain stiffness matrix. Of course, the static condensation cannot be performed in practice, since it is extremely expensive and that the resulting matrix is dense. But it can be approximated in several ways. The underlying general idea behind that methodology is the following: with Robin boundary conditions on the interface, a constitutive law is imposed on the boundary of each media that should optimally exactly represent the interaction with the other media. Keywords: Domain decomposition, strongly coupled, fluid-structure coupling.
1
Introduction
Fluid-structure interaction for aero-elasticity of airplanes or for aero-thermomechanics of aeronautical engines involve strong coupling. That means that not only primal quantities, like temperature or velocities, must be continuous, but M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 311–320, 2008. c Springer-Verlag Berlin Heidelberg 2008
312
F.-X. Roux
also dual quantities, like heat flux or pressure, must be balanced. The solution context then is mathematically similar to the one of domain decomposition for a single equation. A very standard methodology for solving strongly coupled problem relies on alternate iterations on the various quantities at the interface between the domains until stagnation. When stagnation occurs, interface continuity requirements are fulfilled. In the context of domain decomposition method, this methodology is called, for second order PDEs like steady heat equation or linear elasticity, DirichletNeumann iteration [1]. In the second section of this paper standard convergence results for this method will be recalled. In the third section, Robin interface conditions will be introduced and will be proved to give a better stability to the method. General methodological approaches to design optimal Robin interface conditions will be mentioned in the concluding section.
2 2.1
Dirichlet-Neumann Iterations Model Problem
For a sake of simplicity, we considerthe steady heat transfer problem with two domains Ω1 and Ω2 . Note Γ3 = ∂Ω1 ∂Ω2 , the interface between both domains, like in Fig. 1. Inside each domain, the steady heat equation must be solved.
Ω2
Ω1 Γ3
Fig. 1. Splitting in two domains
in Ωi −ki Δui = fi ui = 0 on ∂Ωi \Γ3
The Stokes formula gives the variational form of equation (1). ∂ui 1 ki ∇ui ∇vi = fi vi + ki vi ∀vi ∈ H0∂Ω (Ωi ) i \Γ3 ∂ni Ωi Ωi Γ3
(1)
(2)
The interface coupling conditions are then continuity of the temperature: u1 − u2 = 0 on Γ3 and equilibrium of heat flux:
(3)
Domain Decomposition Methodology with Robin Interface
k1
∂u1 ∂u2 = −k2 ∂n1 ∂n2
313
on Γ3
(4)
The principle of Dirichlet-Neumann method is as follows. Solve the problem in Ω1 , with prescribed values of u1 on interface (Dirichlet problem). Compute the ∂u1 ∂u2 ∂u1 flux k1 ∂n on Γ3 . Solve the problem in Ω2 with balanced flux k2 ∂n = −k1 ∂n on 1 2 1 Γ3 (Neumann problem). Take the trace on interface of the solution of Neumann problem in Ω2 and impose it as Dirichlet boundary condition in Ω1 , u1 = u2 on Γ3 . Restart iteration until stagnation. This simple iterative process consists in enforcing alternatively the continuity of temperature and the equilibrium of flux at interface. It is clear that if the process converges it gives local solutions that satisfy both interface conditions of equations (3) and (4). Hence, the converged local solution fields satisfy the global coupled problem. 2.2
Discretization
Consider a discretization of equation (1) using finite element method. Each subdomain has its own mesh and the interface nodes are present in both meshes as in Fig. 2. Local stiffness matrices of the discrete problems in the two domains
Ω1
Ω2 Γ3 Fig. 2. Two meshes with interface
can be written like in equation (5), using subscripts 1 and 2 for inner degrees of freedom of domains Ω1 and Ω2 and subscript 3 for degrees of freedom of interface Γ3 . Interface Γ3 is present in both meshes. Hence, there are two interface blocks, one in each local matrix, noted with superscripts (1) and (2). K11 K13 K22 K23 K1 = K2 = (5) (1) (2) K31 K33 K32 K33
314
F.-X. Roux
The discretization of variational formulation of equation (2) in domain Ωi is: Kii Ki3 xi bi (6) (i) (i) = (i) (i) K3i K33 x3 b3 + g3 (i)
∂ui on Γ3 . where g3 is the vector representing the discretization of flux ki ∂n i For vectors that satisfy inner equation of (6), there is an explicit relationship between the inner values and the interface values:
xi = Kii−1 bi − Kii−1 Ki3 x3
(i)
(7)
From explicit relation between inner and interface values in equation (7) and thanks to the representation of discrete flux in equation (6), the standard following relation linking the trace and the flux of a vector satisfying inner equation can be derived: (i)
(i) (i)
(i)
g3 = K3i xi + K33 x3 − b3 (i) (i) (i) (i) = K3i (Kii−1 bi − Kii−1 Ki3 x3 ) + K33 x3 − b3 (i) (i) (i) = (K33 − K3i Kii−1 Ki3 )x3 − (b3 − K3i Kii−1 bi ) (i) (i) (i) = S x3 − c3
(8)
Matrix S (i) = K33 − K3i Kii−1 Ki3 is the Schur complement matrix. It is the discretization of the so called Dirichlet to Neumann mapping that defines the bi-continuous one to one correspondence between the trace and the flux on the boundary of a field that satisfies the steady heat equation inside the domain. It is symmetric positive definite if the Ki matrix is symmetric positive definite. The same holds, of course, for any other second order coercive PDE. For the Dirichlet problem in domain Ω1 , only test functions with null trace on Γ3 have to be considered, so the discrete form of this problem is, for a prescribed trace x3 on interface: (1) K11 x1 + K13 x3 = b1 (9) (1) x3 = x3 (i)
The discrete flux associated with the solution of equation (9) can be recovered from equation (6), as already done to define Schur complement in equation (8): (1)
(1) (1)
(1)
g3 = K31 x1 + K33 x3 − b3
(10)
Equation (6) gives directly the discrete formulation of Neumann problem in (2) domain Ω2 for a prescribed value g3 of flux on Γ3 : x2 b2 K22 K23 = (11) (2) (2) (2) (2) K32 K33 x3 b3 + g3 Finally, continuity relations (3) and (4) have the following discrete counterparts: (1)
(2)
x3 = x3
(12)
Domain Decomposition Methodology with Robin Interface (1)
(2)
g3 = −g3
315
(13)
If local solution vectors satisfy the continuity condition (12), then, thanks to (1) (2) equations (9) and (11), the global vector x, with x3 = x3 = x3 satisfy inner equations in domains Ω1 and Ω2 :
K11 x1 + K13 x3 = b1 K22 x2 + K23 x3 = b2
(14)
Equilibrium condition (13), combined with local equations (10) and (11), gives the following interface equation: (1)
(1)
(2)
(2)
K31 x1 + K33 x3 − b3 = −(K32 x2 + K33 x3 − b3 ) ⇔ (1) (2) (1) (2) K31 x1 + K32 x2 + (K33 + K33 )x3 = b3 + b3
(15)
Inner equations (14) and interface equation (15) mean that local solution vectors satisfying both the continuity condition (12) and the equilibrium condition (13) define a global vector solution of the assembled global problem: ⎡
⎤⎡ ⎤ ⎡ ⎤ K11 0 K13 x1 b1 ⎣ 0 K22 K23 ⎦ ⎣ x2 ⎦ = ⎣ b2 ⎦ K31 K32 K33 x3 b3 (1)
(2)
(1)
(16)
(2)
where K33 = K33 + K33 and b3 = b3 + b3 . Of course, equation (16) is the standard discretization of the global coupled problem, that is, in the present case, the steady heat equation on the complete domain. Performing elimination of inner unknowns as already done to define Schur complement in equation (8) in the global coupled system (16) leads to a coupled condensed problem on Γ3 : −1 −1 −1 −1 (K33 − K31 K11 K13 − K32 K22 K23 )x3 = b3 − K31 K11 b1 − K32 K22 b2
(17)
−1 −1 K13 − K32 K22 K23 is the The complete Schur complement S = K33 − K31 K11 sum of the two local ones: −1 −1 K13 − K32 K22 K23 = S (1) + S (2) S = K33 + K33 − K31 K11 (1)
(2)
(18)
In the same way, the right hand side of equation (17) is the sum of two local contributions: −1 −1 b1 − K32 K22 b 2 = c3 + c3 c3 = b3 + b3 − K31 K11 (1)
(2)
(1)
(2)
(19)
316
2.3
F.-X. Roux
Discrete Dirichlet-Neumann Iteration
Iteration p of Dirichlet-Neumann iteration consists in the following steps. 1. Given xp3 , solve the Dirichlet problem in domain Ω1 : (1)p K11 xp1 + K13 x3 = b1 (1)p x3 = xp3
(20)
2. Compute interface flux: (1)p
g3
(1) (1)p
= K31 xp1 + K33 x3
(1)
− b3
3. Solve Neumann problem in domain Ω2 : p x2 bi K22 K23 (2) (2)p = (2) (1)p K32 K33 x3 b3 − g3
(21)
(22)
4. Derive new value of interface trace: (2)p
= x3 xp+1 3
(23)
In reality, only interface values matter in the iteration, since inner values of x can be derived from interface ones, thanks to the inner equations (14). More precisely, thanks to the relationship between x3 and g3 via the Schur complement as stated in equation (8), the Dirichlet-Neumann iteration can be redefined only in terms of values of x3 and g3 . 1. Given xp3 , solve the Dirichlet problem in domain Ω1 and compute interface flux: (1)p (1) (24) g3 = S (1) xp3 − c3 2. Derive new value of interface trace from solution of Neumann problem in domain Ω2 : (2) (1)p = c3 − g 3 (25) S (2) xp+1 3 Finally, xp3 and xp+1 satisfy the following relation: 3 (1)
(2)
S (2) xp+1 = −S (1) xp3 + c3 + c3 3
(26)
Equation (26) shows that the Dirichlet-Neumann iteration is a relaxation iteration for the coupled global problem whose condense form stated in equation (17) can be written: (1) (2) (27) (S (1) + S (2) )x3 = c3 + c3 The iteration matrix for the error of this method is derived from equation (26): xp+1 = −S (2)−1 S (1) xp3 + S (2)−1 c3 = −S (2)−1 S (1) xp3 + S (2)−1 Sx3 3 = −S (2)−1 (S − S (2) )xp3 + S (2)−1 Sx3 = −S (2)−1 Sxp3 + xp3 + S (2)−1 Sx3 (28) and so: − x3 = (I − S (2)−1 S)(xp3 − x3 ) = −S (2)−1 S (1) )(xp3 − x3 ) xp+1 3
(29)
Domain Decomposition Methodology with Robin Interface
317
Equation (29) shows that the method converges if and only if S (2)−1 S (1) < 1. This means that S (2) must be, in some sense, larger than S (1) . Suppose the two domains are geometrically symmetric, with the same thermal conductivity, then the S (1) and S (2) matrices are identical. If the thermal conductivity is different from one domain to the other, the ratio between the two matrices is equal to the ratio between the thermal conductivity of both domains. This means that the Dirichlet-Neumann iteration will converge if the Dirichlet problem is solved in the domain with lower conductivity and the Neumann problem in the domain with the higher conductivity. The physical interpretation is that the domain with the higher conductivity receives the heat flux on the interface from the domain with the lower conductivity and in return imposes its temperature on the interface. Convergence can be obtained by introducing a suitable over-relaxation parameter β: p+ 1 = (1 − β)xp3 + βx3 2 (30) xp+1 3 p+ 1
where x3 2 is given by the simple Dirichlet-Neumann iteration like in equation (26). This leads to a new iteration matrix for the error: xp+1 − x3 = (I − βS (2)−1 S)(xp3 − x3 ) 3
(31)
Since S (2) and S are symmetric positive definite, I − βS (2)−1 S < 1 if β is small enough. Unfortunately, if the Neumann problem is not solved on the right side, β may have to be chosen very small and, in such a case, the convergence is very slow. Furthermore, it may be very difficult to find the right value of β if the materials are heterogeneous on both sides of the interface.
3
Robin Interface Condition
3.1
Robin Boundary Condition
The continuous Robin boundary condition for the steady heat equation in domain Ωi takes the following form: ki
∂ui + αi ui = wi ∂ni
on Γ3
(32)
Introducing the Robin boundary condition (32) in variational formula (2) leads to the following new variational form for the steady heat equation in domain Ωi : 1 ki ∇ui ∇vi + αi ui vi = fi vi + wi vi ∀vi ∈ H0∂Ω (Ωi ) (33) i \Γ3 Ωi
Γ3
Ωi
Γ3
The discretization of variational formulation of equation (33) in domain Ωi is: xi bi Kii Ki3 (34) (i) (i) (i) = (i) (i) K3i K33 + A33 x3 b3 + g˜3
318
F.-X. Roux (i)
∂ui where g˜3 is the vector representing the discretization of augmented flux ki ∂n + i αi ui on Γ3 . Elimination of inner unknowns in equation (34) gives the following relation between the trace of the solution of the Robin problem and the discretization of augmented flux on Γ3 : (i)
(i)
(i)
(i)
(i)
g˜3 = K3i xi + (K33 + A33 )x3 − b3 (i) (i) (i) (i) (i) = K3i (Kii−1 bi − Kii−1 Ki3 x3 ) + (K33 + A33 )x3 − b3 (i) (i) (i) (i) = (K33 + A33 − K3i Kii−1 Ki3 )x3 − (b3 − K3i Kii−1 bi ) (i) (i) (i) = (S (i) + A33 )x3 − c3
(35)
This means that augmented system (34) gives an augmented Schur complement (i) with the same augmentation matrix A33 . ∂ui and the augmented The relation between the discretization of the flux ki ∂n i ∂ui flux is ki ∂ni + αi ui given by: (i)
(i)
(i) (i)
g3 = g˜3 − A33 x3 (i) (i) = (S (i) )x3 − c3 3.2
(36)
Dirichlet-Robin Method
The Dirichlet-Robin method consists in enforcing the equilibrium of flux between domain Ω1 and Ω2 via a mixed condition, leading to a Robin problem in Ω2 : k2
∂u2 ∂u1 + α2 u2 = −k1 + α2 u1 ∂n2 ∂n1
on Γ3
(37)
The discrete Dirichlet-Robin can be directly written in term of interface unknowns: 1. Given xp3 , solve the Dirichlet problem in domain Ω1 and compute interface flux: (1)p (1) (38) g3 = S (1) xp3 − c3 2. Derive new value of interface trace from solution of Robin problem in domain Ω2 : (2) (2) (1)p (2) (S (2) + A33 )xp+1 = c3 − g3 + A33 xp3 (39) 3 Finally, xp3 and xp+1 satisfy the following relation: 3 (2)
(2)
(1)
(2)
= (−S (1) + A33 )xp3 + c3 + c3 (S (2) + A33 )xp+1 3
(40)
Since the solution of coupled problem (17) satisfies the same equation as (40): (2)
(2)
(1)
(2)
(S (1) + S (2) + A33 − A33 )x3 = c3 + c3
(41)
the iteration matrix for the error of this method is given by the following equation: − x3 = −(S (2) + A33 )−1 (S (1) − A33 )(xp3 − x3 ) xp+1 3 (2)
(2)
(42)
Domain Decomposition Methodology with Robin Interface
319
The method converges if and only if (S (2) + A33 )−1 (S (1) − A33 ) < 1. It can be the case, even though S (2) is not larger than S (1) , provided that the (2) augmentation matrix A33 is close to the Shur complement of domain Ω1 , S (1) . (2) If A33 = S (1) , the method converges in only one iteration: it is a direct method. This property can be also derived from the elimination of inner unknowns of domain Ω1 in global system (16), that reduces the system as follows in Ω2 : b2 K22 K23 x2 = (43) −1 −1 x3 K32 K33 − K31 K11 K13 b3 − K31 K11 b1 (2)
(2)
This system can be rewritten: b2 K23 K22 x2 = (2) (2) (1) x3 K32 K33 + S (1) b 3 + c3
(44)
Equation (44) means that the restriction in Ω2 of the solution of the global coupled problem is solution of a generalized local Robin problem, where the operator in the Robin condition is the Schur complement of domain Ω1 . This is hardly a surprise on a physical point of view. The Schur complement is the discretization of Dirichlet-Neumann mapping. If a temperature and a flux satisfy the relation given by equation (8), they are respectively the trace and the flux of solution of local problem (6). A generalized Robin boundary condition sets a constitutive law for the interface. If the constitutive law on interface of domain Ω2 is such that the trace and the flux of the local problem in domain Ω2 with the corresponding Robin boundary condition are the trace and flux of the solution of the local problem in domain Ω1 , it entails that the solution of Robin problem in Ω2 is the restriction of the global coupled problem. The optimal generalized Robin condition forces the boundary of a domain to behave in the same way as the boundary of the neighboring domain. 3.3
Robin-Robin Method
The Dirichlet problem in domain Ω1 can be also replaced by a Robin problem similar to (40). It can be proved that the iteration matrix for the error of the Robin-Robin method is equal to: (S (2) + A33 )−1 (S (1) − A33 )(S (1) + A33 )−1 (S (2) − A33 ) (2)
(2)
(1)
(1)
(45)
Equation (45) means that performing a Robin-Robin iteration is equivalent to performing alternatively a Dirichlet-Robin and a Robin-Dirichlet iteration.
4
Conclusion
The coupling algorithms based on Robin interface conditions have many interesting features. First, Robin boundary condition can always be formulated in such a
320
F.-X. Roux
way that associated local problem is always well posed, even though the Neumann problem is not. Secondly, it is not necessary to impose the Robin condition on the ”right” side to get convergence. It can even be imposed on both sides. In practice however, it is not possible to compute the exact Schur complement that defines the optimal Robin interface condition, for several reasons. Its computation is very expensive and it is a dense operator that couples all interface unknowns. At continuous level, the Dirichlet-Neumann mapping is not local. Furthermore, it can be computed only for linear problems with adequate variational formulation. Nevertheless, there are several approximation methods for the DirichletNeumann mapping. It can be done by defining a local approximation of the continuous operator, like in the methods to set up approximate absorbing conditions for wave equation [2], and by discretizing it. The analysis can also be performed directly on the discrete operator. Any physical approach to set up the constitutive law associated with the optimal Dirichlet-Neumann mapping can be valuable as well. Also, purely algebraic approaches, like the ones developed in [3] can also be used to compute a sparse approximation of the Schur complement. This methodology has already been successfully used for various coupled fluid-structure problems.
References 1. Funaro, D., Quarteroni, A., Zanolli, P.: An Iterative Procedure with Interface Relaxation for Domain Decomposition Methods. SIAM J. Numer. Anal. 25(6), 1213–1236 (1998) 2. Gander, M.J., Halpern, L., Nataf, F.: Optimal Schwarz Waveform Relaxation for the One Dimensional Wave Equation. SIAM J. Numer. Anal. 41, 1643–1681 (2003) 3. Magoules, F., Roux, F.-X., Series, L.: Algebraic Approximation of Dirichlet-toNeumann Maps for the Equations of Linear Elasticity. Computer Methods in Applied Mechanics and Engineering 195(29-32), 3742–3759 (2006)
Transient Boundary Element Method and Numerical Evaluation of Retarded Potentials Ernst P. Stephan1 , Matthias Maischak2 , and Elke Ostermann1 1
2
Leibniz Universit¨ at Hannover, Institut f¨ ur Angewandte Mathematik, Am Welfengarten 1, 30167 Hannover, Germany {stephan,osterman}@ifam.uni-hannover.de Brunel University, School of Information Systems, Computing & Mathematics John Crank Building, Uxbridge UB8 3PH, United Kingdom
[email protected]
Abstract. We discuss the modeling of transient wave propagation with the boundary element method (BEM) in three dimensions. The special structure of the fundamental solution of the wave equation leads to a close interaction of space and time variables in a so-called retarded timeargument. We give a detailed derivation of the discretization scheme and analyse a new kind of ”geometrical light cone” singularity of the retarded potential function. Moreover, we present numerical experiments that show these singularities. Keywords: retarded potential, transient boundary element method.
1
Introduction
The simulation of sound radiation for automotive systems has become of great industrial interest and here the transient boundary element method is a powerful tool, especially for high frequency radiation phenomena. Nevertheless, the simulation of three dimensional time dependent problems is a challenging task. HaDuong [1] was able to prove the unconditional stability of the transient boundary element method with Galerkin’s method in space and time. Hence, the question arises, why have so many instabilities in numerical experiments been reported. For a first kind integral equation with the retarded single layer potential we perform a p-version boundary element method both in space and time. We address especially the question of how to gain an accurate quadrature method. Thus we analyse the potentials that arise in the computation of the Galerkin elements and show that a new kind of ”geometrical light cone” singularity can cause severe inaccuracies in the computation of the Galerkin stiffness matrix. Therefore, an accurate quadrature scheme which respects these singularities, e.g. by an appropriate grading, needs to be used for the computation of the matrix entries. The paper is structured as follows. First, we represent the solution u of the Dirichlet problem of the wave equation via a single layer ansatz with unknown density function p and derive a first kind integral equation for p on the surface Γ of the scatterer. For this integral equation we give a variational formulation which M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 321–330, 2008. c Springer-Verlag Berlin Heidelberg 2008
322
E.P. Stephan, M. Maischak, and E. Ostermann
we solve approximately with Galerkin’s method using piecewise polynomials in space and time. We show in detail how the retarded time argument in the kernel function of the single layer potential influences the entries of the Galerkin stiffness matrix. Here the time discretization leads to special ”domains of influence” Ek , annular regions in space whose radii are given by the time discretization. If the triangle of the trial function and Ek do not overlap, the resulting matrix entry vanishes. For the computation of these entries, we introduce a new quadrature scheme paying special attention to the interaction between the triangles and the domain of influence. We show that the retarded integral operators poss ”geometrical light cone” singularities (section 4). Finally, we present numerical experiments in 3 confirming this fact.
Ê
2
Model Problem and Single Layer Potential
Let us consider the transient sound radiation of some bounded domain Ω − , a − bounded open domain with connected complement Ω := R3 \Ω . We investigate the wave equation for the displacement u(t, x) with t ∈ + , x ∈ Ω
Ê
∂2u − Δu = 0. ∂t2
(1)
We assume causal functions f (t), i.e. f (t < 0) = 0 and use the initial conditions u(0, x) = ut (0, x) = 0
for x ∈ Ω,
(2)
where the lower index t indicates the time derivative. Given Dirichlet boundary conditions on Γ := ∂Ω,

u = f  on ℝ₊ × Γ.  (3)

We make a single layer potential ansatz for the solution u:

u(t, x) = Sp(t, x) = (1/4π) ∫_Γ p(t − |x − y|, y) / |x − y| ds_y   (x ∉ Γ),  (4)

with density function p and the retarded time argument t − |x − y|. The latter connects the space and time variables and physically results in the retarded signal of a source point. As we will see later, this retarded time argument has a significant impact on our discretization scheme. As the single layer potential S is continuous everywhere, we have for its limit on Γ

V p(t, x) = (1/4π) ∫_Γ p(t − |x − y|, y) / |x − y| ds_y   (x ∈ Γ),  (5)

and using (3), there holds the boundary integral equation for p

f(t, x) = V p(t, x).  (6)
Note that the mapping V is continuous (see [1]). We can deduce the following variational formulation: given f, find p such that for all η

∫_0^∞ ∫_Γ V p(t, x) η(t, x) ds_x dt = ∫_0^∞ ∫_Γ f(t, x) η(t, x) ds_x dt.  (7)
Due to the theory of Ha-Duong [1] this formulation is uniquely solvable in appropriate Sobolev spaces on Γ and [0, ∞). It can be solved approximately with a Galerkin scheme in space and time for appropriate piecewise polynomial subspaces. For some convergence results see [1] and the references therein. We concentrate on suitable quadrature schemes for actually computing the Galerkin solution leading to a fully discrete system discussed in the next section.
3 The Discrete Problem
In the following we denote by N_t the number of degrees of freedom in time and by N_s the number of degrees of freedom in space. Let us assume that the boundary Γ is subdivided into triangles T_j with maximal diameter h, and that we have a decomposition of the time interval [0, T], T < ∞, into N_t subintervals with step size Δt. Let t_n = nΔt. We discretize the variational form of the boundary integral equation (7) using basis functions of order at least constant in both space and time, such that

p_h(t, x) = Σ_{m=1}^{N_t} Σ_{i=1}^{N_s} b_i^m γ^m(t) ϕ_i(x) = Σ_{m=1}^{N_t} γ^m(t) φ^m,   γ^m ∈ S^{p_t}(Δt), ϕ_i ∈ S^{p_s}(h),  (8)

where γ^m(t) is a Legendre polynomial of degree p_t:

γ^m(t) = Σ_{i=0}^{p_t} β_i t^i  for t ∈ (t_m, t_{m+1}],  and γ^m(t) = 0 else.

Then, the p-version of the Galerkin scheme reads: find p_h as in (8), such that

∫_0^T ∫_Γ V p_h(t, x) γ^n(t) ϕ_j(x) ds_x dt = ∫_0^T ∫_Γ f(t, x) γ^n(t) ϕ_j(x) ds_x dt  (9)

for n = 1, ..., N_t and j = 1, ..., N_s. For the left-hand side, we have to compute integrals of the type

V_{ij}^{mn} := (1/4π) ∫_0^T ∫_Γ ∫_Γ γ^m(t − |x − y|) ϕ_i(y) γ^n(t) ϕ_j(x) / |x − y| ds_y ds_x dt.  (10)

If we change the order of integration, we can evaluate the time integral analytically, and the resulting integral only depends on the time difference, such that V_{ij}^{mn} = V_{ij}^{n−m}. Let k := n − m and let p ∈ ℕ; then
∫_0^T t^p 1_{t−|x−y| ∈ (t_0, t_1]} 1_{t ∈ (t_k, t_{k+1}]} dt =
  (1/(p+1)) ((t_1 + |x − y|)^{p+1} − t_k^{p+1})   if (x, y) ∈ E_{k−1},
  (1/(p+1)) (t_{k+1}^{p+1} − |x − y|^{p+1})       if (x, y) ∈ E_k,
  0                                               else,   (11)
where 1 denotes an indicator function on the time intervals, and we define the domain of influence (light cone) E_k by

E_k := {(x, y) ∈ Γ × Γ : t_k ≤ |x − y| ≤ t_{k+1}}.  (12)
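The piecewise time integral (11) is straightforward to evaluate in code. The following Python sketch is purely illustrative (the function name and variable names are hypothetical, not the authors' implementation); it returns the value of (11) for a monomial power p, a uniform time step Δt and a given distance |x − y|.

```python
def time_integral(p, k, dt, dist):
    """Evaluate the time integral (11) for the monomial t**p.

    The result depends only on whether the pair (x, y) lies in the
    domain of influence E_{k-1} or E_k, i.e. on where dist = |x - y|
    falls relative to the time grid t_j = j*dt.
    """
    t1, tk, tk1 = dt, k * dt, (k + 1) * dt
    if tk - t1 <= dist < tk:            # (x, y) in E_{k-1}
        return ((t1 + dist) ** (p + 1) - tk ** (p + 1)) / (p + 1)
    if tk <= dist < tk1:                # (x, y) in E_k
        return (tk1 ** (p + 1) - dist ** (p + 1)) / (p + 1)
    return 0.0                          # no overlap with the light cone
```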
The rather unusual domain E_k is due to the retarded time argument in the single layer potential. Let us now return to the evaluation of the Galerkin integrals (10). As the matrix depends only on the time difference, we can translate our basis functions γ^m(t) and γ^n(t) in time onto (t_0, t_1) and (t_k, t_{k+1}), respectively. To simplify the notation, we indicate all terms connected to the ansatz function γ^m(t) with the index A and those connected to the test function γ^n(t) with T; thus γ^m(t) is of polynomial degree p_A and γ^n(t) of degree p_T. Using (11), applying the binomial series and reordering, we can rewrite (10) with k := n − m as

V_{ij}^k = Σ_{l=0}^{p} α_l^{k−1} ∫∫_{E_{k−1}} |x − y|^{l−1} ϕ_i(x) ϕ_j(y) ds_x ds_y + Σ_{l=0}^{p} α_l^{k} ∫∫_{E_k} |x − y|^{l−1} ϕ_i(x) ϕ_j(y) ds_x ds_y,  (13)

where p = p_A + p_T + 1 and the coefficients α_l^{k−1}, α_l^k are explicit combinations, obtained from the binomial expansion in (11), of powers of t_k and t_{k+1} and of the coefficients β_i^A, β_j^T of the ansatz and test polynomials.
We see from (13) that the entries of the Galerkin matrix are given by a sum of integrals whose integration domains depend on the light cone of the time step and whose kernels depend only on the space variables. Therefore, returning to (9), we can rewrite our Galerkin scheme with (8) as the linear system

Σ_{m=1}^{n} V^{n−m} φ^m = F^n   (n = 1, ..., N_t),  (14)
where the matrix entries are defined by (13). We can formulate a time-stepping algorithm for (14) as summarized below. In each time step we have to compute a new matrix V^{n−m} and a new solution vector φ^n. The matrices are sparsely populated due to the restriction of the interaction of the triangles with the domain of influence and, as the radii are time dependent and our body Ω⁻ is bounded, at some point the domain of influence has passed the body and the new matrices vanish. Note that the system matrix V^0 remains the same in each time step. Also note that F^n is computed via numerical evaluation of the right-hand side of (9). As mentioned before, severe numerical instabilities have been reported for this algorithm, although Ha-Duong proved its unconditional stability [1]. In the next section we point out special "geometrical light cone" singularities that occur in the computation of the inner integrals in (13) and that have to be dealt with by a sufficiently accurate quadrature scheme for the outer integrals in (13).

Time-stepping algorithm:
for n = 1, ..., N_t do
  if V^{n−2} = 0 then
    the domain of influence has passed the body; no further matrix evaluation is needed
  else
    compute the matrix V^{n−1}
  end
  compute the right-hand side F^n
  solve V^0 φ^n = F^n − Σ_{m=1}^{n−1} V^{n−m} φ^m and store the new solution vector φ^n
end
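The overall marching-on-in-time structure can be summarised in a few lines of Python. The sketch below is schematic only: the helper functions assemble_matrix and assemble_rhs are hypothetical stand-ins for the quadrature of (13) and of the right-hand side of (9), and a dense factorisation replaces whatever solver is used in practice.

```python
import numpy as np

def marching_on_in_time(Nt, Ns, assemble_matrix, assemble_rhs):
    """Schematic solve of (14): sum_{m<=n} V^{n-m} phi^m = F^n."""
    V = [assemble_matrix(0)]            # V^0 is reused in every time step
    V0_inv = np.linalg.inv(V[0])        # sketch only; a factorisation would be kept in practice
    phi = []
    for n in range(1, Nt + 1):
        # assemble the next retarded matrix until the light cone has left the body
        if len(V) < n and np.any(V[-1]):
            V.append(assemble_matrix(len(V)))
        elif len(V) < n:
            V.append(np.zeros((Ns, Ns)))
        F = assemble_rhs(n)
        # move the history terms V^{n-m} phi^m, m = 1,...,n-1, to the right-hand side
        rhs = F - sum(V[n - m] @ phi[m - 1] for m in range(1, n))
        phi.append(V0_inv @ rhs)
    return phi
```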
4 Analysis of the Retarded Potential and Quadrature Scheme
In this section we analyse the behavior of typical potential terms like (15) below. Such an integral has a different smoothness depending on whether the light cone
hits a corner or an edge of the triangle T; see the situations given in Fig. 4. As we see in the proof of Theorem 1, the potential becomes singular in the second case. In the following we restrict ourselves to constant basis functions in space and time and use a numerical quadrature for the outer integrals in (13). Let us take y ∈ ℝ³ as a quadrature point. Then, when the integration is performed only on one triangle T ⊂ ℝ³, the integration domain becomes

E_T := {x ∈ T : r_min ≤ |x − y| ≤ r_max},   y ∈ ℝ³ (fixed),

with 0 ≤ r_min < r_max. Thus the integrals are of the type

I(y) = ∫_{E_T} k(x, y) ds_x,  (15)
where k(x, y) = |x − y|^α for α ≥ −1. In the following, we will refer to (15) as the potential integral. Let us focus on the situation where the triangle T and the quadrature point y lie in the same plane. Introduce a local coordinate system with respect to the triangle T. The domain of influence of the triangle, E_T, becomes for fixed y ∈ ℝ³ an annular ring. Therefore, we introduce polar coordinates (r, ϕ) in our local coordinate system with origin y, orient our triangle in these coordinates and divide the triangle into elements of the type sketched in Fig. 1. The upper
Fig. 1. 4 Quadrature Elements
and lower boundaries of all these elements can be parameterized as r(ϕ), and hence the integrals (15) can be computed (both analytically and numerically). The generalization to the situation where y and T do not lie in the same plane can be handled by applying the above scheme, as the kernel k(x, y) only depends on the distance between x and y; see Fig. 2 (a more detailed description of the quadrature scheme is given in [2,3]). The following theorem describes the regularity of the typical potentials (15) and gives the most relevant singularities, resulting from so-called edge singularities. When taking higher derivatives of the potentials even more singularities might appear, but this is not further investigated here for brevity.

Theorem 1. The potential integrals (15), i.e. I(ε) in (16) and (17), possess "geometrical light cone" singularities, which are of order ε^{−1/2} and ln ε for the case α = −1. These singularities lie parallel to the triangle sides at distances r_min and r_max, as illustrated in Fig. 3.
Proof. In order to understand what is happening when the light cone hits the triangle, we study the following two reference situations. Here we choose a ball of fixed radius R with center m_ε (ε ≥ 0) moving either towards an edge or a corner. Note that R corresponds to r_min and r_max. These two reference situations are sketched in Fig. 4.

For the corner, the reference situation (Fig. 4, left) is as follows. We have two rays g_1 and g_2 given by γ(0, 1)^T and γ(1, 0)^T respectively, with γ ≥ 0. We move the center of the ball along (1/√2)(1, 1)^T towards these rays with m_ε = ((ε − R)/√2)(1, 1)^T (ε ≥ 0). Thus the intersection points with the rays are

S_1(ε) = (0, (1/√2)[(ε − R) + √(2R² − (ε − R)²)])^T,
S_2(ε) = ((1/√2)[(ε − R) + √(2R² − (ε − R)²)], 0)^T.

For our integral (15) we have I(y) = I(m_ε) and therefore

I(m_ε) =: I(ε) = ∫_0^{π/2} ∫_0^{(1/√2)[(ε−R)+√(2R²−(ε−R)²)]} dr dϕ = (π/(2√2)) [(ε − R) + √(2R² − (ε − R)²)],  (16)

thus

I'(ε) = (π/(2√2)) [1 + (R − ε)/√(2R² − (ε − R)²)].

We see that I'(ε) → π/√2 for ε → 0. The singularity for ε → (1 + √2)R is only relevant if the circle lies completely in the triangle.
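As a quick plausibility check of (16), the derivative I'(ε) can be evaluated numerically. The short sketch below is not part of the paper's computations; it simply confirms the limit I'(ε) → π/√2 ≈ 2.221 as ε → 0 and the blow-up of the derivative as ε approaches (1 + √2)R (here with R = 1).

```python
import numpy as np

R = 1.0

def I(eps):
    """Corner-case potential (16) for a ball of radius R."""
    return np.pi / (2 * np.sqrt(2)) * ((eps - R) + np.sqrt(2 * R**2 - (eps - R)**2))

h = 1e-6
for eps in [1e-4, 0.5 * R, 2.0 * R, (1 + np.sqrt(2)) * R - 1e-4]:
    dI = (I(eps + h) - I(eps - h)) / (2 * h)      # central difference
    print(f"eps = {eps:.4f}   I'(eps) ~ {dI:.3f}")
# near eps = 0 the value approaches pi/sqrt(2) ~ 2.221;
# near eps = (1 + sqrt(2))R the derivative grows without bound in magnitude.
```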
Fig. 2. Geometry of quadrature

Fig. 3. Triangle light cone with "geometrical light cone" singularities indicated by 12 straight line segments
Fig. 4. Corner reference situation (left) and edge reference situation (right)
In order to understand what happens when a sphere meets the edge of a triangle (see Fig. 4, right), we choose a reference edge along γ(0, 1)^T (γ ∈ ℝ) and a sphere with midpoint (−R, 1/2)^T. This sphere intersects the edge at the points S_1(ε) = (0, 1/2 + √(ε(2R − ε)))^T and S_2(ε) = (0, 1/2 − √(ε(2R − ε)))^T when the midpoint moves towards the edge, m_ε = (ε − R, 1/2)^T with ε ≥ 0. The angle of S_1(ε) relative to m_ε is now

ϕ_1 = arctan(√(ε(2R − ε)) / (ε − R)),

and the angle of S_2(ε) is ϕ_2 = 2π − ϕ_1. With r_1(ϕ) = √(ε(2R − ε))/sin ϕ and r_2(ϕ) = R, (15) becomes in polar coordinates

I(ε) := ∫_{−ϕ_1}^{ϕ_1} ∫_{r_1(ϕ)}^{r_2(ϕ)} dr dϕ = 2Rϕ_1 − 2√(ε(2R − ε)) ln tan(ϕ_1/2) =: I_1(ε) + I_2(ε).  (17)

Now, with ± determined by the sign of cos(ϕ_1), it holds that

I_1'(ε) = ± 2R / √(ε(2R − ε)),

and differentiating I_2(ε) yields a corresponding explicit expression with the factor 2(R − ε)/√(ε(2R − ε)) in front of the logarithmic term. Therefore, the derivative of I(ε) becomes unbounded for ε → 0, ε → R and ε → 2R. What does this mean for (15)? The singularity for ε = R corresponds to the case where y lies on the edge of a triangle in (15), and this corresponds to the
Fig. 5. Contour plot of the triangle-potential (left) and the norm of its gradient (right) for rmin = 2.0, rmax = 2.1
Fig. 6. Surface plot of the norm of the gradient of the triangle potential for rmin = 2.0, rmax = 2.1
so-called edge singularities for solutions of 3D boundary value problems, which are singular due to the kernel. For ε = 0 and ε = 2R, we have geometrical singularities that occur when the boundary of the light cone E_T passes the edge with the outer and inner circle of radius r_max and r_min, respectively. If we apply the polar quadrature scheme described above to the reference triangle with vertices (0, 0)^T, (0, 1)^T, (1, 0)^T, we can observe exactly the behavior discussed in Theorem 1. In the left plot of Fig. 5 we see the contour plot of the integral (15) with k(x, y) = |x − y|^{−1}. We fix the usually time-dependent radii to r_max = 2.1 and r_min = 2.0 and evaluate (15) for y ∈ [−2.5, 3.5]². We observe that the contour levels become dense at the propagated edges, see Fig. 3. This observation is underlined by the right plot of Fig. 5, where the norm of the gradient of (15) is plotted and the propagated edges become more visible. In Fig. 6 the gradient of the potential of the triangle is plotted again as a 3D plot.
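The qualitative behaviour shown in Fig. 5 can be reproduced with a crude quadrature. The sketch below is illustrative only: it replaces the graded polar scheme of the paper by a simple uniform subdivision of the reference triangle, which is sufficient to make the "geometrical light cone" lines visible but is not the method advocated here.

```python
import numpy as np

rmin, rmax = 2.0, 2.1

# crude midpoint quadrature on the reference triangle (0,0), (1,0), (0,1)
n = 200
u, v = np.meshgrid((np.arange(n) + 0.5) / n, (np.arange(n) + 0.5) / n)
inside = (u + v) < 1.0
pts = np.column_stack([u[inside], v[inside]])
w = 1.0 / n**2                            # area weight of each small cell

def potential(y):
    """Approximate I(y) = integral of |x - y|^{-1} over E_T, cf. (15)."""
    d = np.linalg.norm(pts - y, axis=1)
    annulus = (d >= rmin) & (d <= rmax)   # restrict to the light cone E_T
    return np.sum(w / d[annulus])

# evaluate on a grid of observation points y in the plane of the triangle
ys = np.linspace(-2.5, 3.5, 41)
vals = np.array([[potential(np.array([yx, yy])) for yx in ys] for yy in ys])
# contour levels of vals cluster along lines parallel to the triangle edges
# at distances rmin and rmax, as in Fig. 5.
```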
Let us finally remark that the domain of influence of a triangle is of course a three-dimensional domain. In this paper we have restricted our analysis to the case where a triangle T and a point of observation y lie in the same plane, but the main arguments hold true for the fully three-dimensional case. This will be the topic of a further paper. Here, our main result is that the classical near-field and far-field notion for elliptic integral operators does not carry over to hyperbolic problems. Nevertheless, from our potential analysis (Theorem 1) we can obtain a grading strategy for the outer quadrature of the Galerkin integrals.
References
1. Ha-Duong, T.: On retarded potential boundary integral equations and their discretisation. In: Topics in Computational Wave Propagation. Lect. Notes Comput. Sci. Eng., vol. 31, pp. 301–336. Springer, Berlin (2003)
2. Maischak, M., Ostermann, E., Stephan, E.P.: On the numerical evaluation of retarded potentials. Institut für Angewandte Mathematik, Leibniz Universität Hannover, Preprint no. 89 (2007)
3. Ostermann, E.: PhD thesis, Institut für Angewandte Mathematik, Leibniz Universität Hannover (in preparation)
A Multiscale Approach for Solving Maxwell's Equations in Waveguides with Conical Inclusions

Franck Assous¹,² and Patrick Ciarlet, Jr.³

¹ Bar-Ilan University, 52900 Ramat-Gan, Israel
² Ariel University Center, 40700 Ariel, Israel
[email protected]
³ ENSTA, 32 bvd Victor, 75739 Paris Cedex 15
[email protected]
Abstract. This paper is devoted to the numerical solution of the instationary Maxwell equations in waveguides with metallic conical inclusions on their internal boundary. These conical protuberances are geometrical singularities that generate strong electromagnetic fields in their neighborhood. Using some recent theoretical and practical results on curl-free singular fields, we have built a method which allows one to compute the instationary electromagnetic field. It is based on a splitting of the spaces of solutions into a regular part and a singular one. The singular part is computed with the help of a multiscale representation, written in the vicinity of the geometrical singularities. As an illustration, numerical results in a rectangular waveguide are shown.

Keywords: Maxwell equations; waveguides; conical inclusions; multiscale approach.
1 Introduction
Many practical problems require the computation of electromagnetic fields. In this paper, we are interested in computing the electromagnetic fields which propagate in three-dimensional waveguides with metallic conical inclusions. This can be useful in microwave and millimeter-wave technology, for instance for design purposes. The knowledge of the electromagnetic fields in the vicinity of irregular points of a surface is an old, widely studied problem (cf. [1], [2]). However, it remains a current problem, particularly regarding the influence and the numerical treatment of such irregular points in wave propagation (see for instance [3] and the references therein). Within this framework, we developed a numerical method for solving the instationary Maxwell equations [4], with continuous approximations of the electromagnetic field. However, in practical examples, the boundary of the computational domain may include reentrant corners and/or edges. They are called geometrical singularities, generate strong fields, and require a careful computation of the electromagnetic field in their neighborhood. Using some new theoretical and practical results, we have built a method, the singular complement method (hereafter SCM), which consists in splitting the space of
solutions into a two-term direct sum (cf. [5]). The first subspace (the regular one) coincides with the whole space of solutions, provided that the domain is either convex or has a smooth boundary. So, one can compute the regular part of the solution with the help of a classical method such as the finite element or finite difference method. The singular part is computed with the help of a multiscale representation, written in the vicinity of the geometrical singularities. In this paper, we consider a 3D waveguide with sharp conical metallic protuberances on its internal boundary. The singularities then consist of conical vertices. We propose an application of the SCM to this non-symmetric 3D case. Based on a multiscale representation of the singular part of the solution at the tip of each cone, coupled with a three-dimensional Maxwell formulation, we construct a numerical method of solution. Numerical results are shown. This extends the numerical results obtained in two-dimensional Cartesian [5] or axisymmetric domains [6]. See also [7] for an extension to prismatic domains based on a Fourier expansion. Following [8], extensions to rounded corners also seem to be possible.
2 Instationary Maxwell Equations
Let Ω be an open polyhedral subset of ℝ³, unbounded in one direction (typically a hollow metallic cylinder or a rectangular waveguide), and Γ its boundary. We denote by n the unit outward normal to Γ. If we let c, ε_0 and μ_0 be respectively the light velocity, the dielectric permittivity and the magnetic permeability (ε_0 μ_0 c² = 1), Maxwell's equations in vacuum read

∂E/∂t − c² curl B = −(1/ε_0) J,   div E = ρ/ε_0,  (1)
∂B/∂t + curl E = 0,   div B = 0,  (2)
where E and B are the electric and magnetic fields, and ρ and J the charge and current densities, which depend on the space variable x and on the time variable t. It is well known that ρ and J have to verify the charge conservation equation

∂ρ/∂t + div J = 0.  (3)
Waveguide problems are generally set without charges (ρ = 0), and this condition expresses that the current density J is divergence-free. These equations are supplemented with appropriate boundary conditions. Since the domain of interest is unbounded in one direction, we denote by Γ_C the "real part" of the boundary Γ (i.e. the metallic part), on which one imposes the perfect conducting boundary condition. As is well known, this is modelled by

E × n = 0 and B · n = 0 on Γ_C.  (4)
This condition expresses that the tangential components of E and the normal component of B uniformly vanish on Γ_C. Nevertheless, these boundary conditions are not sufficient to model our problem efficiently. Since the domain is not
bounded, it has to be "numerically closed" to perform computations. As a consequence, the boundary Γ of Ω is also made up of an artificial part Γ_A = Γ \ Γ_C, on which a Silver-Müller boundary condition is imposed, namely

(E − cB × n) × n = e × n on Γ_A,  (5)
where the surface field e is given. In waveguide problems, the artificial boundary Γ_A is often split into Γ_A^i and Γ_A^a. On Γ_A^i, we model incoming plane waves by a non-vanishing function e, whereas on Γ_A^a we impose an absorbing boundary condition by choosing e = 0. Without loss of generality, one can choose the location of the artificial boundary Γ_A in such a way that it does not intersect the conical inclusions. Moreover, one can also choose a regular shape for Γ_A. Hence, the boundary Γ_A is not an issue for the conical inclusions. For this reason, we will only consider a perfectly conducting boundary in what follows (i.e. Γ_A = ∅); the case of the Silver-Müller boundary condition will be re-introduced later. Writing the Maxwell equations as two second-order in time equations, the electric and the magnetic fields can be handled separately. In this paper, we will focus on the electric field formulation. Let us recall the definitions of the following spaces:

H(curl, Ω) = {u ∈ L²(Ω), curl u ∈ L²(Ω)},  (6)
H(div, Ω) = {u ∈ L²(Ω), div u ∈ L²(Ω)},  (7)
H¹(Ω) = {u ∈ L²(Ω), grad u ∈ L²(Ω)}.  (8)
We define the space of electric fields E as

X = {x ∈ H(curl, Ω) ∩ H(div, Ω) : x × n|_Γ = 0},  (9)

endowed with the norm ‖·‖_X = [‖·‖_0² + ‖curl ·‖_0² + ‖div ·‖_0²]^{1/2}. When the domain is convex (or has a smooth boundary), the space of electric fields X is included in H¹(Ω). That is no longer the case in a singular domain (see for instance [9]), in particular in a waveguide with conical inclusions. One thus introduces the regular subspace for electric fields (indexed with R)

X_R = X ∩ H¹(Ω),  (10)

which is actually closed in X [10]. Hence, one can consider its orthogonal subspace (called the singular subspace and indexed with S), and then define a two-part, direct, and orthogonal sum of the space as

X = X_R ⊕^{⊥_X} X_S.  (11)

As a consequence, one can split an element x into an orthogonal sum of a regular part and a singular one, namely x = x_R + x_S. This is the principle of the Singular Complement Method. As far as numerical computations are concerned, one can also introduce a direct, but non-orthogonal, two-part sum for the space X. The choice is made
with respect to the ease of implementation and depends on the characterization of the singular space. The characterization (and so the computation) of the elements of X_S is complicated (see [10]). So, one introduces another (non-orthogonal) singular subspace L_S and defines a two-part, direct sum for the space X as

X = X_R ⊕ L_S,  (12)

so that an element x of X will be split into x = x_R + l_S. Following [10], elements of L_S are the curl-free elements of X that satisfy

div l_S = s_D in Ω,  (13)
l_S × n|_Γ = 0,  (14)

where s_D is the (non-vanishing) singular solution (i.e. a solution that belongs to L²(Ω)) of the homogeneous Dirichlet boundary problem

Δs_D = 0 in Ω,  (15)
s_D = 0 on Γ.  (16)
Hence, it appears more convenient to solve a scalar Laplace problem to determine L_S than the vector one that would appear for the characterization of X_S. Let us now briefly re-introduce the boundary Γ_A. A first way could be to consider the space of solutions

X_{Γ_A} = {x ∈ H(curl, Ω) ∩ H(div, Ω) : x × n|_{Γ_C} = 0}.  (17)

One would then introduce the regular subspace X_{Γ_A}^R = X_{Γ_A} ∩ H¹(Ω) and construct the ad hoc orthogonal (or non-orthogonal) splitting, in which a singular space appears, say X_{Γ_A}^S (or L_{Γ_A}^S for the non-orthogonal curl-free part). Nevertheless, it is more interesting from a numerical point of view to consider the (non-orthogonal) splitting

X_{Γ_A} = X_{Γ_A}^R ⊕ L_S,  (18)

first because the subspace of singular fields is L_S, as before (with only a perfect conducting boundary condition), and second because modelling incoming plane waves, or imposing an absorbing boundary condition, has no impact as far as the singular subspace is concerned. It is sufficient, as soon as Γ_A is not empty, to add to the variational formulation integral terms on Γ_A, as for a regular domain Ω.
3 Multiscale Representation for the Singular Part
In general 3D domains, the main difficulty of such an approach is to take the singular subspaces into account accurately. They originate from geometrical singularities, such as reentrant edges, which can meet at reentrant vertices. In addition to these generating infinite-dimensional singular subspaces, the challenge is to understand and resolve the links between singular edge and singular vertex functions.
In that case, other approaches are possible and very useful, see [11]. Nevertheless, in many cases, one can proceed with this approach. This is the case for 3D prismatic domains, or for 3D domains invariant by rotation (cf. [7,12]). This is also the case for conical vertices, which generate a finite-dimensional singular subspace (L_S in our case), even in a full 3D geometry. However, computations retain their 3D character. To simplify the presentation, we assume in this section that there is only one conical vertex, so that L_S is of dimension 1. The numerical method consists in first computing numerically the basis of the singular subspace L_S, based on a multiscale representation. Following the characterization (13)-(14), we shall need s_D to compute l_S ∈ L_S. Then, using ad hoc singular mappings (see [10]), one can associate to s_D a unique scalar potential ψ_S such that

−Δψ_S = s_D in Ω,  (19)
ψ_S = 0 on Γ.  (20)
Then, the singular basis function l_S will be easily inferred from ψ_S by taking its gradient, l_S = grad ψ_S. As the key point is to compute s_D, let us now examine the behavior of this solution in the vicinity of a sharp cone. This can be done by deriving asymptotic expansions. As the conical inclusion is locally invariant by rotation, we consider the system of spherical coordinates (r, θ, ϕ) centered at the tip of the cone. Let Γ_cone be the part of the boundary which intersects the cone of equation θ = w_0/2, with w_0 the aperture angle of the cone (a priori π ≤ w_0 ≤ 2π). Hence, one can find s_D in the form of spherical harmonic functions

r^μ P_μ(cos θ) sin(mϕ),  (21)

where P_μ denotes a Legendre function of order 0 and index μ, which is an eigenfunction of the Laplace-Beltrami operator, related to the eigenvalue −μ(μ + 1).

Proposition 1. There exists a sequence (c_l)_{l∈ℕ} such that one can write, for any N ∈ ℕ, the expansion

s_D(r, θ, ϕ) = r^{−λ−1} P_λ(cos θ) + Σ_{l=1}^{N} c_l r^{λ_l} P_{λ_l}(cos θ) + O(r^{λ_{N+1}}),   r → 0.  (22)

In addition, the coefficient λ can be uniquely determined by solving

P_λ(cos(w_0/2)) = 0.  (23)
Remark first that, since the boundary Γ_cone is locally invariant by rotation, the above expansion does not depend on the variable ϕ, i.e. it is axisymmetric (see for instance [13]). Moreover, computing the behavior of λ as a function of the angle w_0, we find that only sharp conical inclusions (i.e. cones for which the aperture is larger than a given value β ≈ 130°43′) generate a singular function.
In other words, r^{−λ−1} is singular (i.e. belongs to L²(Ω), but not to H¹(Ω)) only for cones whose aperture is larger than β. This result is in accordance with the ones obtained in domains invariant by rotation [14]. For the other conical inclusions (with an aperture smaller than β), no specific treatment is necessary. We are now ready to compute s_D. If we denote by s̃_D = s_D − r^{−λ−1} P_λ(cos θ) the regular part of s_D (i.e. the part that belongs to H¹(Ω)), and recall that r^{−λ−1} P_λ(cos θ) is known as soon as λ has been determined, one can compute s_D by solving

Δs̃_D = 0 in Ω,  (24)
s̃_D = −r^{−λ−1} P_λ(cos θ) on Γ.  (25)
Note that the above equation is homogeneous, as Δ[r^{−λ−1} P_λ(cos θ)] = 0. Moreover, the boundary condition also vanishes on Γ_cone, due to the term P_λ(cos θ). Then, one solves the discrete variational formulation, using continuous P1 Lagrange finite elements. The computation of ψ_S is carried out analogously. In that case, we have to solve a non-homogeneous Dirichlet problem. From a numerical point of view, the term ∫_Ω s_D v dx appears in the right-hand side of the variational formulation, where v is a test function. It is split as ∫_Ω s̃_D v dx + ∫_Ω r^{−λ−1} P_λ(cos θ) v dx. Then, in order to compute the second term accurately, one uses the analytical knowledge of r^{−λ−1} P_λ(cos θ) – up to the quasi-exact value of λ – in conjunction with Gauss quadrature formulas, exact up to the 6th order. These formulas require 15 integration points per element of the mesh (in our case, a tetrahedron). It is emphasized that these formulas are well suited, in the sense that no integration point coincides with the tip of the cone, where the (absolute) value of r^{−λ−1} P_λ(cos θ) is infinite. Finally, the computation of l_S (= ∇ψ_S) is carried out with the help of the analytical expression in spherical coordinates of l_S^P (= ∇ψ_S^P), namely

l_S^P = λ r^{λ−1} ( (cos ϕ / sin θ)[P_λ(cos θ) − cos θ P_{λ−1}(cos θ)],
                    (sin ϕ / sin θ)[P_λ(cos θ) − cos θ P_{λ−1}(cos θ)],
                    P_{λ−1}(cos θ) )^T.  (26)
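Determining λ from (23) and evaluating the closed-form singular field (26) are simple numerical tasks. The Python sketch below is an illustration only, not the authors' procedure: it uses mpmath's Legendre function legenp (which accepts non-integer degree) together with a plain bisection, assuming a sharp cone with aperture π < w_0 < 2π so that P_0 and P_1 bracket a root in (0, 1).

```python
import mpmath as mp

def smallest_lambda(w0, tol=1e-12):
    """Smallest root lambda in (0, 1) of P_lambda(cos(w0/2)) = 0, cf. (23)."""
    x = mp.cos(w0 / 2)
    f = lambda lam: mp.re(mp.legenp(lam, 0, x))   # Legendre function P_lambda(x)
    a, b = mp.mpf('1e-8'), mp.mpf(1)              # P_0(x)=1>0, P_1(x)=x<0 bracket the root
    while b - a > tol:
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

def l_S_P(lam, r, theta, phi):
    """Singular basis function (26) in Cartesian components."""
    P_l   = mp.re(mp.legenp(lam, 0, mp.cos(theta)))
    P_lm1 = mp.re(mp.legenp(lam - 1, 0, mp.cos(theta)))
    bracket = (P_l - mp.cos(theta) * P_lm1) / mp.sin(theta)
    fac = lam * mp.mpf(r) ** (lam - 1)
    return (fac * mp.cos(phi) * bracket,
            fac * mp.sin(phi) * bracket,
            fac * P_lm1)

# example: a sharp 270-degree conical inclusion (w0 = 3*pi/2)
lam = smallest_lambda(1.5 * mp.pi)
print(lam, l_S_P(lam, 0.1, mp.pi / 3, 0.0))
```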
4 Numerical Algorithms
Now, to solve the problem, we have to modify a classical method to handle the above decomposition. We first write Ampère's and Faraday's laws as two second-order in time equations, and focus on the electric field problem. To enforce the divergence constraint div E = 0, we introduce a Lagrange multiplier (say p) to dualize Coulomb's law. In this way, one builds a mixed variational formulation (VF) of the Maxwell equations. It is well-posed as soon as the well-known inf-sup (or Babuska-Brezzi [15,16]) condition holds. In addition to being Mixed,
we use here an Augmented VF (see [4]), which results in a Mixed, Augmented VF. To this end, one adds to the bilinear form ∫_Ω curl E · curl F dx the term ∫_Ω div E div F dx. This formulation reads: Find (E, p) ∈ X × L²(Ω) such that

(d²/dt²) ∫_Ω E · F dx + c² ∫_Ω curl E · curl F dx + c² ∫_Ω div E div F dx + ∫_Ω p div F dx = −(1/ε_0) (d/dt) ∫_Ω J · F dx,   ∀F ∈ X,  (27)

∫_Ω div E q dx = 0,   ∀q ∈ L²(Ω).  (28)
To include the SCM in this formulation, the electric field E is split as E(t) = E_R(t) + E_S(t). One has E_S(t) = κ(t) l_S, where κ is a continuous time-dependent function (to be determined). The same splitting is used for the test functions of the variational formulation, which is discretized in time with the help of the leap-frog scheme. From a practical point of view, we choose the Taylor-Hood P2-iso-P1 finite element. In addition to being well suited for discretizing saddle-point problems, it allows one to build diagonal mass matrices when suitable quadrature formulas are used. Thus, the solution of the linear system which involves the mass matrix is straightforward [4]. This results in the fully discretized VF

M_Ω E_R^{n+1} + M_RS κ^{n+1} + L_Ω p^{n+1} = F^n,  (29)
M_RS^T E_R^{n+1} + M_S κ^{n+1} + L_S p^{n+1} = G^n,  (30)
L_Ω^T E_R^{n+1} + L_S^T κ^{n+1} = H^n.  (31)
Above, M_Ω denotes the usual mass matrix, and L_Ω corresponds to the divergence term involving the regular unknowns and the discrete Lagrange multiplier p_h(t). Then, M_RS is a rectangular matrix, which is obtained by taking L² scalar products between regular and singular basis functions, M_S is the "singular" mass matrix, and finally, L_S corresponds to the divergence term involving l_S and p_h(t). To solve this system, one first eliminates the unknown κ^{n+1}, so that the unknowns (E_R^{n+1}, p^{n+1}) can be obtained with the help of a Uzawa-type algorithm. Finally, one concludes the time-stepping scheme by computing κ^{n+1} with the help of (30). What is added, when compared to the same formulation posed in a waveguide without conical inclusions, is, first, equation (30); second, additional terms that appear in (29) and (31), which express the coupling between the singular and regular parts. However, these terms are independent of the time variable, so they are computed once and for all at the initialization stage.
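With a single conical vertex, κ^{n+1} is a scalar and its elimination from (29)-(31) is elementary. The dense-matrix sketch below is purely illustrative: small random matrices stand in for the assembled finite element matrices, the Schur complement is formed explicitly (where a Uzawa-type iteration would be used in practice), and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
Ne, Np = 12, 5                       # toy sizes: regular unknowns and multiplier unknowns

# toy SPD "mass" block over (E_R, kappa); its Schur complement remains SPD
A_full = rng.standard_normal((Ne + 1, Ne + 1))
A_full = A_full @ A_full.T + (Ne + 1) * np.eye(Ne + 1)
M, Mrs, Ms = A_full[:Ne, :Ne], A_full[:Ne, -1], A_full[-1, -1]

Lo = rng.standard_normal((Ne, Np))   # divergence coupling, regular basis vs. multiplier
Ls = rng.standard_normal(Np)         # divergence coupling, singular basis vs. multiplier
F, G, H = rng.standard_normal(Ne), rng.standard_normal(), rng.standard_normal(Np)

# eliminate kappa using (30): kappa = (G - Mrs.E - Ls.p) / Ms
A = M - np.outer(Mrs, Mrs) / Ms
B = Lo - np.outer(Mrs, Ls) / Ms
f = F - Mrs * G / Ms
h = H - Ls * G / Ms
D = np.outer(Ls, Ls) / Ms

# reduced saddle-point system:  A E + B p = f,   B^T E - D p = h
S = B.T @ np.linalg.solve(A, B) + D          # Schur complement in p
p = np.linalg.solve(S, B.T @ np.linalg.solve(A, f) - h)
E = np.linalg.solve(A, f - B @ p)
kappa = (G - Mrs @ E - Ls @ p) / Ms          # recover the singular coefficient from (30)

# check the original block equations (29)-(31)
print(np.allclose(M @ E + Mrs * kappa + Lo @ p, F),
      np.isclose(Mrs @ E + Ms * kappa + Ls @ p, G),
      np.allclose(Lo.T @ E + Ls * kappa, H))
```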
5 Numerical Experiments
We consider a metallic rectangular waveguide with two conical inclusions on its boundary (see Fig. 1). The metallic walls of the guide are assumed to be entirely perfectly conducting, and absorbing boundary conditions are imposed at the extremities.
Fig. 1. Rectangular waveguide with conical inclusions
Fig. 2. Comparison of the total field, its singular and regular parts

Fig. 3. Comparison of the total field with and without the SCM
The initial conditions are set to zero. The electromagnetic wave is generated by a current with density J(t) = J_z e_z, the support of which is depicted in Fig. 1. The value of the z-component is

J_z(x, y, z) = C sin(ωt).  (32)
Above, C is a constant, set to C = 10⁻⁵, and ω is associated with the frequency ν = 5·10⁹ Hz. We focus on the evolution of the electric field over time at different nodes of the finite element mesh. First, one can check that the numerical method enforces causality by studying the evolution at node A. By causality, we mean that the value of the electric field at A should be zero as long as the wave generated by the current has not reached A. To that end, we show in the same figure the – strong – total electric field E_y(A), its regular part E_y^R(A), and its singular
part E_y^S(A) (see Fig. 2). The singular coefficient κ(t) is non-zero as soon as the wave hits the tip of the cone, and it adopts a more or less sinusoidal pattern, with a frequency which depends on ν. Still, the total electric field at A, here E_y(A), does not vary until the wave actually reaches A: the regular part 'compensates' for the variations of the singular part before that. Then, we compare the results at some point B, with and without the SCM. In particular, the results differ significantly in amplitude (see the component E_y(B), Fig. 3): the amplitudes differ by a factor of more than two.
6 Conclusion
In this paper, we were interested in the treatment of conical protuberances in waveguide problems. This is a truly 3D problem, even if, around each sharp conical vertex, the singular subspace is finite-dimensional. Using a multiscale representation of the electromagnetic field in the vicinity of singular points, we propose an extension of the SCM to this non-symmetric 3D case. Such an approach may be useful in models in which it is possible and reasonable to determine or approximate the behavior of the singular electromagnetic field near irregular points. Extensions to "pseudo-singularities", namely rounded corners, also seem possible (see [8]).
References
1. Jackson, J.D.: Classical Electrodynamics. John Wiley & Sons, New York (1975)
2. Van Bladel, J.: Electromagnetic Fields. McGraw-Hill, New York (1985)
3. Juntunen, J.S., Tsiboukis, T.D.: On the FEM Treatment of Wedge Singularities in Waveguide Problems. IEEE Trans. Microwave Theory Tech. 48(6), 1030–1037 (2000)
4. Assous, F., Degond, P., Heintzé, E., Raviart, P.-A., Segré, J.: On a finite element method for solving the three-dimensional Maxwell equations. J. Comput. Phys. 109, 222–237 (1993)
5. Assous, F., Ciarlet Jr., P., Segré, J.: Numerical solution to the time-dependent Maxwell equations in two-dimensional singular domains: The Singular Complement Method. J. Comput. Phys. 161, 218–249 (2000)
6. Assous, F., Ciarlet Jr., P., Labrunie, S., Segré, J.: Numerical solution to the time-dependent Maxwell equations in axisymmetric singular domains: the singular complement method. J. Comput. Phys. 191, 147–176 (2003)
7. Ciarlet Jr., P., Garcia, E., Zou, J.: Solving Maxwell equations in 3D prismatic domains. C. R. Acad. Sci. Paris, Ser. I 339, 721–726 (2004)
8. Ciarlet Jr., P., Kaddouri, S.: A justification of Peek's empirical law in electrostatics. C. R. Acad. Sci. Paris, Ser. I 343, 671–674 (2006)
9. Grisvard, P.: Singularities in Boundary Value Problems, vol. 22. RMA Masson, Paris (1992)
10. Assous, F., Ciarlet Jr., P., Garcia, E.: A characterization of singular electromagnetic fields by an inductive approach. Int. J. Num. Anal. Mod. 5(3), 491–515 (2007)
11. Ciarlet Jr., P., Jamelot, E.: Continuous Galerkin methods for solving the time-dependent Maxwell equations in 3D geometries. J. Comput. Phys. 226, 1122–1135 (2007)
12. Labrunie, S.: La méthode du complément singulier avec Fourier pour les équations de Maxwell en domaine axisymétrique. Technical Report 2004-42, Institut Elie Cartan, Nancy, France (in French) (2004)
13. Assous, F., Ciarlet Jr., P., Garcia, E., Segré, J.: Time-dependent Maxwell's equations with charges in singular geometries. Comput. Methods Appl. Mech. Engrg. 196(1-3), 665–681 (2006)
14. Bernardi, C., Dauge, M., Maday, Y.: Spectral Methods for Axisymmetric Domains. Series in Applied Mathematics. Gauthier-Villars, Paris and North-Holland, Amsterdam (1999)
15. Babuska, I.: The finite element method with Lagrange multipliers. Numer. Math. 20, 179–192 (1973)
16. Brezzi, F.: On the existence, uniqueness and approximation of saddle point problems arising from Lagrange multipliers. RAIRO Anal. Numér., 129–151 (1974)
3rd Workshop on Computational Chemistry and Its Applications (3rd CCA)

Ponnadurai Ramasami

Department of Chemistry, University of Mauritius, Réduit, Republic of Mauritius
[email protected]
The 3rd Workshop on Computational Chemistry and Its Applications (3rd CCA) will be held, as part of ICCS 2008, from 23 to 25 June 2008 in Kraków, Poland. This workshop is the third one, after two successful events held in Reading, UK and Beijing, China for ICCS 2006 and ICCS 2007, respectively. The main aim of this workshop is to consider only original, unpublished work of a high standard, selected after peer review. Computational chemistry is one of the fields of computational science, and hence this workshop is well suited to the ICCS conference. Computational chemistry is rapidly expanding with the explosive growth of computational power. It is widely used in research and, more interestingly, in interdisciplinary research involving computational science. The major goals of this workshop are to highlight the latest scientific advances within the broad field of computational chemistry in education, research, industry and society. This workshop will provide the opportunity for researchers from all corners of the world to meet on a single platform for discussion, exchanging ideas and possibly developing collaborations. It will also be a suitable platform for researchers from different fields of computational science to meet, so that ideas for interdisciplinary research can emerge. This year, for the 3rd CCA, some excellent papers have been received, and these papers cover a wide range of research from fundamental to applied chemistry. It is interesting to note that increasingly more interdisciplinary research papers are being submitted for consideration. As the workshop organizer, I would like to thank all those who have submitted papers for this workshop and the ICCS conference. I would also like to thank all the reviewers for their efforts and comments to improve some of the papers. I look forward to a fruitful gathering and a successful conference in Kraków.
First Principle Gas Phase Study of the Trans and Gauche Rotamers of 1,2-Diisocyanoethane, 1,2-Diisocyanodisilane and Isocyano(isocyanomethyl)silane

Ponnadurai Ramasami

Department of Chemistry, University of Mauritius, Réduit, Republic of Mauritius
[email protected]
Abstract. The trans and gauche rotamers of 1,2-diisocyanoethane, 1,2-diisocyanodisilane and isocyano(isocyanomethyl)silane have been studied in the gas phase. A transition state has also been obtained for the interconversion of these rotamers. The methods used are MP2 and DFT/B3LYP, and the basis sets used for all atoms are 6-311++G(d,p). The optimised geometries, dipole moments, moments of inertia, energies, energy differences and rotational barriers are reported. Vibrational frequencies of the rotamers are presented with appropriate assignments. The results indicate that the trans rotamer is more stable, and the G2MP2 rotational energy differences are 2.97 kJ/mol (1,2-diisocyanoethane), 3.02 kJ/mol (isocyano(isocyanomethyl)silane) and 2.12 kJ/mol (1,2-diisocyanodisilane). The rotational barrier for 1,2-diisocyanoethane is larger than its energy difference, but the barrier becomes comparable to the energy difference for isocyano(isocyanomethyl)silane.

Keywords: 1,2-diisocyanoethane, 1,2-diisocyanodisilane, isocyano(isocyanomethyl)silane, MP2, DFT/B3LYP, energy difference, rotational barrier.
1 Introduction

1,2-Diisocyanoethane is a symmetrical 1,2-disubstituted ethane and therefore exists as two rotamers, namely the trans and gauche rotamers, which are in equilibrium [1-5]. 1,2-Dicyanoethane has been the subject of experimental [6] and theoretical conformational studies [7-11]. However, 1,2-diisocyanoethane has not been comparably studied, although the isocyano group is one of the most polar neutral substituents. Schrumpf and Martin [12] reported the conformation and vibrational spectra of 1,2-diisocyanoethane by measuring its infrared spectra in the vapour, liquid, crystalline solid and several solvents. They also carried out semiempirical computations at the INDO level and concluded that the two forms are of about equal stability, with the gauche form being slightly favoured. However, they pointed out the shortcomings of semiempirical computations, which can be remedied at a more refined ab initio level. Schrumpf et al. [13] reported the molecular structure of 1,2-diisocyanoethane obtained using the gas electron-diffraction method. They identified two distinct rotamers, namely the trans and gauche forms. They concluded that the trans form is
more stable, with a proportion of 56.9% in the equilibrium mixture. In a previous communication [11], a theoretical gas phase study of the trans and gauche rotamers of 1,2-dicyanoethane and the structurally analogous compounds 1,2-dicyanodisilane and cyano(cyanomethyl)silane was reported in terms of molecular structures, energies and infrared frequencies. The results indicate that, in general, the energy difference between the trans and gauche rotamers is in the order 1,2-dicyanoethane > cyano(cyanomethyl)silane > 1,2-dicyanodisilane. Thus, although 1,2-diisocyanoethane has received attention, to the best of our knowledge there have not been studies involving 1,2-diisocyanodisilane and isocyano(isocyanomethyl)silane, although they are structurally related to 1,2-diisocyanoethane. Some of the motivations of this work derive from the limited literature on 1,2-diisocyanodisilane and isocyano(isocyanomethyl)silane and, more importantly, from the fact that isocyano compounds are widely used as ligands [14] and occur naturally in marine organisms [15]. In this paper, the molecular structures, energy difference (ΔE) between the gauche and trans rotamers, rotational barrier, rotational thermodynamics, and vibrational spectra of the trans and gauche rotamers have been obtained for the title compounds using MP2 and DFT/B3LYP methods. Some of the results obtained are compared with the reported parameters of their isomeric dicyano compounds [11]. The findings of this work are hereby reported.
2 Methods

MP2 and DFT/B3LYP computations have been carried out for the molecular geometry optimisation of the trans and gauche rotamers of the title compounds. The trans rotamers of 1,2-diisocyanoethane and 1,2-diisocyanodisilane have been studied in C2h symmetry, but the trans rotamer of isocyano(isocyanomethyl)silane has been considered in Cs symmetry. The gauche rotamers of 1,2-diisocyanoethane and 1,2-diisocyanodisilane have been studied in C2 symmetry, but the gauche rotamer of isocyano(isocyanomethyl)silane has been considered in C1 symmetry. A transition state has also been obtained for the interconversion of these rotamers. The basis sets used for all atoms are 6-311++G(d,p). G2/MP2 computations have also been carried out. Frequency computations have been carried out using the optimised structures. All computations have been done using the Gaussian 03W [16] program suite, and GaussView [17] has been used for visualising the rotamers.
3 Results and Discussion

The relevant optimised structural parameters of the trans and gauche rotamers of the title compounds are reported in Table 1. Several conclusions can be drawn from Table 1. Firstly, there is little difference between the values of the parameters obtained using the two methods of theory. Secondly, there is a good comparison
between some of the calculated parameters and the available literature data for 1,2-diisocyanoethane. To be more precise, the reported bond lengths (Å) of C-H, N≡C, C-C and C-N are 1.123, 1.172, 1.529 and 1.424, respectively [13]. Further, the torsional angle N-C-C-N for the gauche conformer is reported to be 56.9º [13]. Thirdly, the torsional angles calculated for the gauche rotamers are generally greater from DFT than from MP2 computations. Lastly, the moments of inertia calculated for the rotamers follow the order IA >> IB ≈ IC. Apart from these, the effect of substituting the cyano group by the isocyano group can be found by comparing the results obtained with those reported [11]. It is found that substitution affects mainly the bond length between carbon and nitrogen, the bond being shorter in the case of the dicyano compounds. The transition state modelled corresponds to the isocyano group eclipsing a hydrogen atom.

Table 1. Optimised parameters of the trans and gauche rotamers of 1,2-diisocyanoethane, 1,2-diisocyanodisilane and isocyano(isocyanomethyl)silane
DFT
Parameter r (C-H)/ Å r (N≡C)/ Å r (C-C)/ Å r (C-N)/ Å τ (N-C-C-N)/ º μ/ Debye IA/ GHz IB/ GHz IC/ GHz
Trans 1.092 1.186 1.534 1.423 180.0 0 24.689 1.649 1.576
r (Si-H)/ Å r (N≡C)/ Å r (Si-Si)/ Å r (Si-N)/ Å τ (N-Si-Si-N)/ º μ/ Debye IA/ GHz IB/ GHz IC/ GHz
1.470 1.191 2.349 1.762 180.0 0 7.647 1.059 0.951
r (Si-H)/ Å r (C-H)/ Å r (CN≡C)/ Å r (SiN≡C)/ Å r (Si-C)/ Å r (Si-N)/ Å r (C-N)/ Å τ (N-Si-C-N)/ º μ/ Debye IA/ GHz IB/ GHz
1.446 1.094 1.187 1.191 1.894 1.756 1.426 180.0 0.642 12.994 1.347
CNCH2CH2NC Gauche Trans 1.093 1.092 1.187 1.171 1.529 1.540 1.423 1.423 64.2 180.0 5.239 0 7.110 25.465 2.690 1.633 2.110 1.564 CNSiH2SiH2NC 1.472 1.479 1.191 1.178 2.348 2.360 1.758 1.758 64.1 180.0 4.584 0 3.441 7.833 1.494 1.045 1.186 0.942 CNSiH2CH2NC 1.469 1.475 1.095 1.094 1.187 1.172 1.190 1.178 1.893 1.905 1.750 1.753 1.424 1.424 63.9 180.0 6.192 0.451 4.299 13.273 1.900 1.331
IC/ GHz
1.248
1.440
1.237
Gauche 1.093 1.171 1.535 1.422 67.6 5.554 7.579 2.508 2.029 1.479 1.178 2.360 1.754 69.5 4.836 3.716 1.360 1.126 1.478 1.095 1.172 1.178 1.903 1.748 1.421 68.0 5.236 5.028 1.911
1.517
First Principle Gas Phase Study of the Trans and Gauche Rotamers
347
Table 2. Energies of the trans and gauche rotamers of 1,2-diisocyanoethane, 1,2diisocyanodisilane and isocyano(isocyanomethyl)silane
¨E/ kJ/mol Level MP2/6-311++G(d,p) DFT/B3LYP/ 6-311++G(d,p) MP2/ 6-311+G(3df,2p) G2MP2 MP2/6-311++G(d,p) DFT/B3LYP/ 6-311++G(d,p) MP2/ 6-311+G(3df,2p) G2MP2 MP2/6-311++G(d,p) DFT/B3LYP/ 6-311++G(d,p) MP2/ 6-311+G(3df,2p) G2MP2
Trans/Hartrees -263.553797 (0.073121)* -264.308002 (0.072546) -263.703031 -263.785502 (0.070277)
Gauche/Hartrees
¨H/ kJ/mol 0 K
1,2-Diisocyanoethane -263.553770 0.07 (0.073350) -264.306678 3.48 (0.072599) -263.702261 2.02
0.20
2.97
2.12
-514.604265 (0.061471) -515.745887 (0.060865) -514.754754
-765.904854 (0.049077) Isocyano(isocyanomethyl)silane -514.603668 1.57 (0.061624) -515.744528 3.57 (0.060880) -514.753655 2.88
-514.851786 (0.058859)
-514.850637 (0.058900)
3.02
-765.905662 (0.049055)
¨G/ kJ/mol 298.15 K
¨S J/mol 298.15 K
0.99 (42.7)** 3.50 (67.2)
2.64
2.70
2.84 (61.1)
0.48
5.49
5.89 (58.4) 6.49 (68.2)
1.36
2.34
2.09 (62.0)
-0.50
1.63
2.56 (60.4) 3.61 (58.4)
3.33
-263.784369 (0.070364) 1,2-Diisocyanodisilane -765.642636 1.22 (0.051513) -767.174855 2.64 (0.050591) -765.792823 2.21
-765.643099 (0.051346) -767.175861 (0.050625) -765.793665
¨H/ kJ/mol 298.15 K
6.09
3.43
2.84
2.94 (52.3)
0.55
1.34
3.12 0.60
0.32
* Values in bracket are zero point energies in Hartrees; ** Values in bracket are percentage of trans rotamers in the equilibrium mixture at 298.15K.
Further, the bond lengths of the transition state structures are comparable to those of the trans and gauche rotamers. Table 2 summarises the energies of the trans and gauche rotamers, ΔE and related thermodynamic parameters. Some conclusions can be drawn from Table 2. Firstly, for all three molecules, the trans rotamer has a lower energy than the gauche form. Secondly, ΔE is generally larger from DFT than from MP2 computations when the same basis set is used. Lastly, the percentage of the trans rotamer can be calculated using the free energy difference (ΔG) between the trans and gauche rotamers and the fact that there are two equivalent gauche forms. It is found that at 298.15 K, in general, the equilibrium mixture is more populated with the trans conformer, and this is in agreement with the literature for 1,2-diisocyanoethane [13]. The rotational barriers, i.e. the energy difference between the gauche rotamer and the transition state, for 1,2-diisocyanoethane, 1,2-diisocyanodisilane and isocyano(isocyanomethyl)silane are 16.32, 6.59 and 2.32 kJ/mol (MP2) and 11.35, 4.46 and 1.77 kJ/mol (DFT), respectively. It can be seen that the rotational barrier decreases along this series and becomes comparable to the energy difference for isocyano(isocyanomethyl)silane. Apart from this, for a given method, the relative energy indicates that a dicyano compound has a lower energy than its isomeric diisocyano compound [11].
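The percentages of the trans rotamer quoted in Table 2 follow from the Boltzmann distribution over one trans form and two equivalent gauche forms. The short Python sketch below is not part of the paper's computations; it simply reproduces this kind of estimate from a free-energy difference ΔG in kJ/mol (standard gas constant assumed).

```python
import math

R_GAS = 8.314462618e-3   # gas constant in kJ/(mol K)

def percent_trans(dG_kJ, T=298.15):
    """Equilibrium percentage of the trans rotamer.

    dG_kJ = G(gauche) - G(trans) in kJ/mol; the gauche form is doubly
    degenerate (two equivalent gauche rotamers), hence the factor 2.
    """
    x = math.exp(-dG_kJ / (R_GAS * T))
    return 100.0 / (1.0 + 2.0 * x)

# e.g. the MP2 value dG = 0.99 kJ/mol for 1,2-diisocyanoethane gives about
# 42.7 % trans, consistent with the percentage quoted in Table 2.
print(round(percent_trans(0.99), 1))
```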
Table 3. Vibrational analysis of the trans and gauche rotamers of 1,2-diisocyanoethane
Vibration Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20 Q21 Q22 Q23 Q24
MP2 3115.4 (0) 2134.4 (0) 1513.1 (0) 1428.5 (0) 1079.4 (0) 1017.1 (0) 469.4 (0) 186.0 (0) 3192.2 (1.8) 1268.5 (4.1) 804.5 (0.03) 280.6 (2.8) 82.5 (23.1) 3154.0 (0) 1326.9 (0) 1043.1 (0) 336.1 (0) 3123.7 (12.7) 2137.2 (242.4) 1517.8 (10.8) 1334.8 (21.5) 986.0 (43.7) 416.2 (13.5) 118.9 (19.5)
Trans DFT 3058.2 (0) 2219.9 (0) 1494.4 (0) 1403.7 (0) 1036.5 (0) 996.7 (0) 459.1 (0) 196.3 (0) 3117.5 (5.3) 1258.0 (3.1) 794.6 (2.8) 297.9 (4.1) 84.1 (20.4) 3096.1 (0) 1323.8 (0) 1108.5 (0) 265.1 (0) 3067.2 (14.4) 2217.6 (396.7) 1503.3 (11.9) 1336.6 (22.1) 965.6 (38.2) 416.9 (16.1) 126.4 (16.1)
Experimental 2990
Mode Assignment Ag C-H stretch
2155
N{C stretch
1445
CH2 scissors CH2 wag C-C stretch
1002
C-NC stretch
520
C-CN bend
209
C-N-C bend
2998
Au
C-H stretch
1231
CH2 twist
778
CH2 rock
306
C-N-C bend torsion
2958
Bg
1022
C-H stretch CH2 twist CH2 rock
274 2958
C-N-C bend Bu
C-H stretch
2155
N{C stretch
1459
CH2 scissors
1344
CH2 wag
945
C-NC stretch
520
C-CN bend
187
C-N-C bend
MP2 3170.9 (1.9) 3113.0 (15.5) 2134.0 (122.9) 1508.1 (0.05) 1415.3 (12.4) 1316.8 (0.7) 1112.7 (0.02) 1066.9 (8.6) 851.8 (11.1) 395.6 (0.4) 304.4 (0.6) 167.7 (0.6) 86.2 (4.4) 3180.9 (2.8) 3111.3 (2.7) 2136.5 (100.5) 1508.5 (18.0) 1402.2 (19.4) 1285.3 (3.4) 1047.7 (4.6) 868.5 (12.2) 565.2 (18.7) 255.5 (0.5) 191.5 (3.6)
Gauche DFT 3095.4 (4.2) 3057.5 (18.5) 2219.9 (184.5) 1486.7 (0.1) 1388.4 (12.7) 1308.3 (0.5) 1086.6 (0.5) 1037.3 (7.7) 821.1 (8.6) 389.3 (0.3) 299.7 (0.9) 173.4 (0.6) 82.2 (4.6) 3106.1 (6.1) 3054.1 (2.9) 2217.6 (183.3) 1487.2 (19.6) 1390.1 (22.2 1270.8 (2.5) 1029.1 (5.0) 855.9 (12.7) 550.7 (20.1) 263.4 (1.0) 196.3 (2.1)
Experimental 2990
Mode Assignmen A C-H stretch
2958
C-H stretch
2155
N{C stretch
1445
CH2 scissors
1358
CH2 wag
1275
CH2 twist
1073
C-C stretch
1002
CH2 rock
823
C-NC stretch
395
C-C-N bend
308
C-N-C bend
184
C-N-C bend
108
torsion
2990
B
C-H stretch
2958
C-H stretch
2180
N{C stretch
1445
CH2 scissors
1344
CH2 wag
1248
CH2 twist
1022
C-C stretch
834
CH2 rock
558
C-C-N bend
274
C-N-C bend
207
C-N-C bend
Frequencies are in (cm-1), values in bracket are intensities in (km mol-1), experimental frequencies are from Ref [12]
Table 4. Vibrational analysis of the trans and gauche rotamers of 1,2-diisocyanodisilane Vibration Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20 Q21 Q22 Q23 Q24
MP2 2345.5 (0) 2083.9 (0) 983.8 (0) 894.3 (0) 659.4 (0) 473.6 (0) 266.1 (0) 112.8 (0) 2361.9 (131.1) 735.1 (48.3) 376.2 (27.3) 180.2 (0.4) 25.8 (12.0) 2354.3 (0) 754.0 (0) 608.0 (0) 176.3 (0) 2342.8 (99.9) 2082.0 (506.2) 972.0 (220.6) 802.6 (594.3) 658.7 (233.1) 227.4 (3.7) 61.5 (16.1)
Trans DFT 2260.4 (0) 2148.9 (0) 947.7 (0) 870.1 (0) 655.5 (0) 454.3 (0) 268.7 (0) 122.2 (0) 2275.5 (128.9 726.0 (43.0) 368.1 (22.7) 201.7 (0.1) 32.9 (10.0) 2267.1 (0) 740.2 (0) 592.3 (0) 197.1 (0) 2256.8 (91.6) 2144.5 (747.0) 943.4 (182.8) 784.2 (527.5) 654.2 (215.9) 242.8 (3.9) 67.4 (13.3)
Mode Ag
Assignment Si-H stretch N{C stretch SiH2 scissors SiH2 wag Si-N stretch Si-Si stretch Si-Si-N bend Si-N{C bend
Au
Si-H stretch SiH2 twist SiH2 rock Si-N{C bend torsion
Bg
Si-H stretch SiH2 twist SiH2 rock Si-N{C bend
Bu
Si-H stretch N{C stretch SiH2 scissors SiH2 wag Si-N stretch Si-Si-N bend Si-N{C bend
MP2 2351.9 (14.5) 2341.0 (88.2) 2084.4 (292.5) 986.8 (76.1) 889.8 (125.0) 755.0 (7.3) 674.1 (73.8) 570.3 (13.4) 431.5 (13.9) 222.1 (1.8) 186.7 (0.1) 77.3 (0.08 36.6 (3.3) 2356.3 (115.4) 2333.9 (48.2) 2082.2 (187.8) 963.8 (115.6) 837.5 (487.9) 744.2 (31.2) 659.9 (64.7) 431.5 (13.9) 222.1 (1.8) 186.7 (0.1) 111.5 (9.9
Gauche DFT 2264.6 (12.6) 2254.5 (71.8) 2149.4 (406.7) 949.4 (74.8) 861.9 (83.4) 739.4 (8.8) 669.8 (57.6) 543.0 (9.7) 401.1 (12.7) 235.1 (1.4) 201.7 (0.1) 82.0 (0.3) 33.9 (3.2) 2269.1 (109.6) 2245.3 (46.5) 2144.5 (321.0) 928.6 (85.3) 811.1 (479.3) 731.1 (29.7) 656.5 (72.2) 453.6 (42.8) 269.2 (9.1) 195.7 (0.3) 116.5 (8.1)
Mode A
Assignment Si-H stretch Si-H stretch N{C stretch SiH2 scissors SiH2 wag SiH2 twist Si-N stretch SiH2 rock Si-Si stretch Si-Si-N bend Si-N{C bend Si-N{C bend torsion
B
Si-H stretch Si-H stretch N{C stretch SiH2 scissors SiH2 wag SiH2 twist Si-N stretch SiH2 rock Si-Si-N bend Si-N{C bend Si-N{C bend
Frequencies are in (cm-1), values in bracket are intensities in (km mol-1)
The calculated infrared raw vibrational frequencies, their intensities and assignments of the trans and gauche rotamers of the title compounds are reported in Tables 3-5 respectively. The experimental infrared vibrational frequencies of the trans and gauche
Table 5. Vibrational analysis of the trans and gauche rotamers of isocyano(isocyanomethyl)silane
Vibration Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20 Q21 Q22 Q23 Q24
MP2 3165.6 (0.004) 2380.8 (74.8) 1251.3 (8.0) 869.9 (22.4) 724.6 (32.1) 507.0 (14.9) 268.9 (0.9) 177.8 (0.05) 52.7 (18.3) 3103.6 (5.8) 2366.5 (43.6) 2125.3 (83.7) 2087.0 (268.4) 1476.2 (9.8) 1285.2 (6.8) 993.3 (83.6) 978.8 (28.1) 880.8 (335.7) 737.8 (19.8) 669.4 (100.0) 375.3 (1.3) 275.0 (5.5) 148.0 (0.7) 81.9 (18.7)
Trans DFT 3091.7 (0.4) 2292.1 (74.1) 1242.7 (6.6) 855.0 (21.3) 713.0 (30.1) 502.7 (13.4) 279.7 (1.5) 198.7 (0.005) 55.1 (16.5) 3046.6 (5.9 2279.2 (41.3 2209.2 (137.0) 2150.2 (405.5) 1465.3 (9.9) 1265.3 (4.3) 968.4 (30.8) 956.7 (64.4) 864.6 (301.2) 711.6 (18.3) 664.4 (87.0) 375.2 (2.1) 281.6 (6.1) 158.6 (0.5) 89.2 (15.9)
Mode Acc
Assignment C-H stretch Si-H stretch CH2 twist CH2 rock SiH2 twist SiH2 rock C-N{C bend Si-N{C bend torsion
Ac
C-H stretch Si-H stretch N{C stretch
C
Si
N{C stretch
CH2 scissors CH2 wag SiH2 scissors SiH2 scissors SiH2 wag Si-C stretch Si-N stretch C-N{C bend Si-N{C bend C-N{C bend Si-N{C bend
MP2 3154.3 (0.3) 3093.6 (7.0) 2376.4 (67.8) 2355.2 (78.7) 2127.6 (80.3) 2087.4 (246.5) 1474.6 (13.1) 1292.1 (2.2) 1258.9 (14.7) 993.1 (13.1) 985.1 (125.4) 921.6 (278.5) 814.0 (24.8) 742.1 (26.4) 714.0 (24.6) 547.5 (72.0) 575.5 (42.9) 380.8 (11.7) 282.5 (0.9) 267.7 (2.1) 186.2 (0.4) 149.6 (7.3) 114.2 (0.6) 57.8 (4.6)
Gauche DFT 3078.1 (1.1) 3033.7 (8.3) 2290.0 (64.7) 2268.2 (69.7) 2211.8 (152.4) 2150.1 (363.3) 1460.3 (13.3) 1271.5 (1.2) 1250.8 (12.4) 976.7 (2.3) 950.4 (113.1) 900.4 (261.3) 794.1 (22.3) 729.4 (24.7) 701.7 (28.3) 635.8 (62.4) 556.9 (35.6) 378.4 (12.1) 283.1 (0.9) 273.1 (2.5) 198.2 (0.1) 156.8 (5.6) 120.7 (0.5) 53.2 (4.8)
Mode A
Assignment C-H stretch C-H stretch Si-H stretch Si-H stretch N{C stretch
C
Si
N{C stretch
CH2 scissors CH2 wag CH2 twist C-N stretch SiH2 scissors SiH2 wag SiH2 twist SiH2 twist Si-C stretch Si-N stretch SiH2 rock S-C-N bend C-N{C bend C-N{C bend Si-N{C bend C-N{C bend Si-N{C bend torsion
Frequencies are in (cm-1), values in bracket are intensities in (km mol-1)
rotamers of 1,2-diisocyanoethane obtained in the liquid phase are also included in Table 3. The 24 modes of vibration account for the irreducible representations Γv = 8Ag + 5Au + 4Bg + 7Bu of the C2h point group of the trans rotamer of 1,2-diisocyanoethane and
1,2-diisocyanodisilane, and Γv = 13A + 11B of the C2 point group of the gauche rotamer of 1,2-diisocyanoethane and 1,2-diisocyanodisilane. Further, the 24 modes of vibration account for the irreducible representations Γv = 9A″ + 15A′ of the Cs point group of the trans rotamer of isocyano(isocyanomethyl)silane and Γv = 24A of the C1 point group of the gauche rotamer of isocyano(isocyanomethyl)silane. These infrared vibrational frequencies are dominated by the high intensity of the N≡C stretching frequency. It is interesting to note the close agreement of the calculated infrared raw vibrational frequencies (gas phase) with the experimental literature data (liquid phase) [12] for 1,2-diisocyanoethane. Apart from this, it can be found that substitution of the cyano groups by the isocyano groups affects mainly the vibrational frequencies where these groups are attached [11].
4 Conclusion
This paper reports the gas phase theoretical study of the trans and gauche rotamers of 1,2-diisocyanoethane, novel 1,2-diisocyanodisilane and isocyano(isocyanomethyl)silane in terms of optimised molecular structures and infrared raw vibrational frequencies. The rotational barrier and related thermodynamic parameters are also given. The trans rotamer is more stable than the gauche rotamer. An interesting outcome of this work is that some of the calculated parameters for 1,2-diisocyanoethane compare satisfactorily with experimental literature data. More interestingly, although the changes in structure and energy are relatively small when carbon is substituted by silicon, the data for the trans and gauche rotamers of the novel 1,2-diisocyanodisilane and isocyano(isocyanomethyl)silane could serve as a reliable set of reference data, as the literature on these compounds is limited.
Acknowledgments. The author acknowledges facilities from the University of Mauritius. The author also thanks anonymous referees for their comments to improve the manuscript.
References 1. Radom, L., Baker, J., Gill, P.M.W., Nobes, R.H., Riggs, N.V.: A Theoretical Approach to Molecular Conformational Analysis. J. Mol. Struct. 126, 271–290 (1985) 2. Wiberg, K.B., Murko, M.A.: Rotational barriers. 2. Energies of Alkane Rotamers. An Examination of Gauche Interactions. J. Am. Chem. Soc. 110, 8029–8038 (1988) 3. Wiberg, K.B., Keith, T.A., Frisch, M.J., Murcko, M.: Solvent Effects on 1,2-Dihaloethane Gauche/Trans Ratios. J. Phys. Chem. 99, 9072–9079 (1995) 4. Ramasami, P.: Gauche and Trans Conformers of 1,2-Dihaloethanes: A Study by Ab initio and Density Functional Methods. Lecture Series on Computer and Computational Sciences 4, 732–734 (2005) 5. Ramasami, P.: Gas Phase Study of the Gauche and Trans Conformers of 1-Fluoro-2- Haloethanes CH2F-CH2X (X=Cl, Br, I) by Ab initio and Density Functional Methods: Absence of Gauche Effect. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3993, pp. 153–160. Springer, Heidelberg (2006)
6. Fitzgerald, W.E., Janz, G.J.: Vibrational Spectra and Molecular Structure of 1,2 Dicyanoethane. J. Mol. Spectrosc. 1, 49–60 (1957) 7. Walker, G.A., Bhatia, S.C., Hall Jr., J.H.: Theoretical Investigation of the Hydration Properties for the Trans and Gauche Rotamers of Succinonitrile. Int. J. Quan. Chem. 38, 85–92 (1990) 8. Fengler, O.I., Ruoff, A.: Vibrational Spectra of Succinonitrile and its [1,4-13C2]-, [2,2,3,32 H4]- and [1,4-13C2-2,2,3,3-2H4]-Isotopomers and a Force Field of Succinonitrile. Spectrochimica Acta 57, 105–117 (2001) 9. Fengler, O.I.: The Vibrational Spectra of [15N2]–Succinonitrile. Spectrochimica Acta A57, 1627–1634 (2001) 10. Umar, Y., Morsy, M.A.: Ab initio and DFT Studies of the Molecular Structures and Vibrational Spectra of Succinonitrile 66, 1133–1140 (2007) 11. Ramasami, P.: Gas Phase Study of the Trans and Gauche Rotamers of 1,2-Dicyanoethane, novel 1,2-Dicyanodisilane and Cyano(cyanomethyl)silane by Ab initio and Density Functional Methods. Spectrochimica Acta 68, 752–756 (2007) 12. Schrumpf, G., Martin, S.: Conformation and Vibrational Spectra of 1,2-Diisocyanoethane. J. Mol. Struc. 81, 155–165 (1982) 13. Schrumpf, G., Trætteberg, M., Bakken, P., Seip, R.: The Molecular Structure and Conformational Equilibrium of 1,2-Diisocyano-ethane Studied by Gas-phase Electron Diffraction. J. Mol. Struc. 197, 339–347 (1989) 14. Dartiguenave, M., Dartiguenave, Y., Guitard, A., Mari, A., Beauchamp, A.L.: Synthesis and Characterization of Copper, Manganese and Zinc Complexes of the Bidentate Diisocyano Ligands 2,5-Dimethyl 2,5-Diisocyanohexane (TMB) and 1,2-Diisocyanoethane(DHB). Polyhedron 8, 317–323 (1989) 15. Gross, H., König, G.M.: Terpenoids from Marine Organisms: Unique Structures and their Pharmacological Potential. Phytochem. Rev. 5, 115–141 (2006) 16. Frisch, M.J., et al.: Gaussian 03, Revision B04, Gaussian Inc., Wallingford, CT (2004) 17. GaussView, Version 3.09, R. Dennington II, T. Keith, J. Millam, K. Eppinnett, W.L. Hovell and R. Gilliland, Semichem, Inc., Shawnee Mission, KS (2003)
A Density Functional Theory Study of Oxygen Adsorption at Silver Surfaces: Implications for Nanotoxicity Brahim Akdim, Saber Hussain, and Ruth Pachter Air Force Research Laboratory, Wright-Patterson Air Force Base, Ohio 45433, USA
[email protected],
[email protected],
[email protected]
Abstract. The formation of superoxide at Ag(100) and Ag(111) surfaces for cluster and periodic slab models is studied by applying first-principles density functional theory calculations, including ab-initio molecular dynamics. Adsorption energies and structural parameters are discussed in detail. Charge transfer analyses indicate that O2- preferentially forms on clusters, particularly at an Ag(100) surface. Keywords: Density Functional Theory, Ag Nanoparticles, Oxygen Adsorption.
1 Introduction
The potential for toxicity in nanomaterials has been recently summarized by Nel et al. [1], particularly regarding reactive oxygen species (ROS) generation, pointing out that under conditions of excess ROS production, natural antioxidant defenses may be overwhelmed, which can impair or destroy cell proteins, lipids and DNA, thus leading to deterioration of cell function, or toxicity [2]. The effects of oxidative stress on cell behavior in vivo and in cell culture, where glutathione (GSH) is depleted, were recently described [3]. Indeed, concerns regarding the health safety of nanoparticles have been consistently growing in the last few years [4,5,6,7,8]. However, the mechanisms by which nanoparticles generate ROS are still unclear. For example, it was proposed that the early phase of ROS production is due to redox cycling chemistry, while mitochondria are responsible for the late and progressive increase in O2- radicals [9]. Possible mechanisms range from H2O2 production, generation of superoxide anion by redox cycling reactions, electron-acceptor active groups for superoxide anion generation, catalyzing electron transport to oxygen, and the presence of metal nanoparticles that may induce Fenton-like chemistry to generate hydroxyl radicals from H2O2, to light activation, e.g. of TiO2 or C60, for the formation of superoxides [9]. At the same time, the antibacterial effect of silver nanoparticles resulted in their extensive application in health and home products, therefore making an evaluation of their toxicity an important undertaking, motivating animal experimental studies, and also the development of means for distribution of specific concentrations of silver nanoparticles [10]. Toxicity of Ag nanoparticles (15-50 nm in diameter) was recently
demonstrated in vitro [11], where an investigation of the role of oxidative stress has shown increased ROS generation, noting differences due to the Ag nanoparticle size. Of course, the interaction of oxygen with silver has been studied for a long time, particularly as Ag is one of the best partial oxidation catalysts for industrial reactions, including the oxidative coupling of methane, partial dehydrogenation of methanol to formaldehyde, and epoxidation of ethylene, for example, as investigated by Avdeev and co-workers [12, and references therein], in order to explain the enhanced reactivity, also by understanding the role of subsurface oxygen upon dissociation, for example, theoretically [13,14,15, and references therein]. It has also been shown that an electron is transferred to O2, where the reduction of the dioxygen results in a superoxide anion metastable species, as a precursor for its dissociation [16,17,18], also depending on the silver facet. It was noted that the O-O bond at a silver cluster is weaker than at a bulk surface [19], possibly supporting the supposition that superoxide anions at the surface may result in toxicity for small Ag nanoparticles. Note however that the particles that compose the Ag catalyst, for example, are affected by many surface defects and their chemistry could be different from that of perfect low-Miller-index surfaces, as has been suggested, and studied [20]. In this work, as the first step towards gaining an understanding of the charge transfer characteristics and adsorption of oxygen, applying first principles density functional theory (DFT), we investigated the effects of O2 adsorption at facets of a truncated octahedron (TO) cluster, as observed experimentally [21], and for comparison, also at a Ag(100) slab, as test problems, which may possibly preliminarily explain, in part, changes in toxicity of Ag nanoparticles of different size. Note, however, that comprehensive simulations to model the system in room temperature will be carried out in the future to further understand this system.
2 Computational Details
Calculations were performed with the all-electron linear combination of atomic orbitals DFT method as implemented in DMol3 [22]. The Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional [23] and a double numerical basis set with d-functions (DND) were applied. For validation, O2 molecular calculations were carried out using unrestricted spin calculations, resulting in an O—O distance (rO-O) of 1.22 Å and a dissociation energy of 5.3 eV in the gas phase, in good agreement with experimental results (1.21 Å and 5.2 eV) [24] and a previous theoretical study (1.24 Å and 5.3 eV) [25].
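For orientation, the dissociation energy quoted above and the adsorption energies reported in Table 1 below are presumably obtained from the usual total-energy differences; the paper does not spell these definitions out, so the following is an assumed convention rather than a quotation:

D_e(O_2) = 2 E(O) - E(O_2)
E_ads = E(Ag + O_2) - E(Ag) - E(O_2)

Under this sign convention, the negative adsorption energies listed in Table 1 correspond to exothermic (favourable) binding of O2 to the silver surface or cluster.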
3 Results and Discussion 3.1 Structural Parameters Fig. 1 summarizes the optimized structures for Ag(100) and Ag(111) surface cluster models, upon oxygen adsorption, as well as for the corresponding slab structures. Structural parameters are listed in Table 1.
Fig. 1. O2 adsorption on Ag surfaces: (a) TO cluster models C(100)1, C(100)2, C(100)3, C(111)1, and C(111)2; (b) 3x4 Ag(100) slab models S(100)1, S(100)2, and S(100)3
As shown in Fig. 1, the adsorption sites that were considered for Ag(100) consisted of O2 chemisorption at the four-fold site (C(100)1), a crossed configuration (C(100)2), and dissociated O2, where each oxygen occupies a four-fold site (C(100)3). In the Ag(111) plane, we considered a bridged configuration (C(111)1) and three-fold adsorption (C(111)2). For the TO cluster, a preference for the adsorption of O2 at an Ag(100) surface was calculated, specifically for configuration C(100)1. The results show that for model C(100)1 the rO-O distances are consistent with previous work on discerning O-O distances upon superoxide formation when adsorbed at Ag [16,17,18,26], while a shorter distance is noted for the Ag(111) surface. Although the O-O bond is weakened, charge transfer is less pronounced in
Table 1. Adsorption energies, Mulliken population analyses, and structural parameters for the TO cluster and slab models
Surface  Adsorption  rO—O (Å)  rAg—O (Å)           rAg—Ag (Å)  Mulliken partial atomic charges (e)  Adsorption energy (kcal/mol)
(100)    C1          1.45      2.33                3.11        -0.74/O2                             -21.5
(100)    S1          1.42      2.32                3.08        -0.64/O2                             -17.1
(100)    C2          1.41      2.45                3.14        -0.64/O2                             -12.0
(100)    S2          1.42      2.38                3.19        -0.62/O2                             -4.9
(100)    C3          3.47      2.30                3.17        -0.77/O                              -45.0
(100)    S3          3.26      2.35                3.07        -0.72/O                              -40.9
(111)    C1          1.34      2.32                3.16        -0.46/O2                             -0.5
(111)    S2          1.37      O1: 2.25, O2: 2.48  3.04        -0.56/O2                             -8.8
this case, which is not yet in the distance range of superoxide formation. Interestingly, the results for a cluster and a slab, carried out for comparison, show a somewhat larger rO-O for the cluster, consistent with the previous suggestion that the O-O bond may be weaker at a cluster than on a bulk surface [19]. The Ag—O distances are in relatively good agreement with previously reported results [27,28] for a periodic system, obtained, however, with a plane wave basis set. The lengthened Ag—Ag distances calculated in the four-fold hollow site for the cluster (3.17 Å), as compared to the extended system (3.07 Å), may explain the slight increase of rO-O in C(100)3.
3.2 Adsorption Energies
Charged molecular species binding to the silver metal have been proposed as intermediates, namely the superoxide and peroxide species, suggesting the possibility of long-lived trapped states due to deep potential wells [18]. Indeed, in carrying out ab initio MD simulations (Fig. 2), with runs of about 1 ps, in order to assess preferred configurations, with C(111)2 as the starting configuration, C(111)1 was found to be the stable structure at about 400 K, with an oxygen atom moving toward the center of the four-fold site, possibly denoting a trapped metastable state and no dissociation. The characteristics of dissociative oxygen adsorption on silver surfaces have drawn significant interest, because dissociation is an important step in the catalytic oxidation reaction [17,29]. Although dissociation of the molecule on clusters was observed at temperatures lower than 110 K [19], where dissociation occurs from molecularly chemisorbed O2-, those experiments were carried out for very small clusters of up to 25 Ag atoms, and it was noted that a larger energy barrier would be exhibited for a bulk surface. An investigation [30] of the interaction of molecular oxygen with bulk Ag(100) has shown that although some oxygen molecules adsorb dissociatively at low temperatures, a higher
Fig. 2. Ab-initio MD simulations of O2 adsorption on an Ag(100) TO cluster (configuration C(100)2)
pressure and a temperature of about 470 K are required to achieve larger coverages of adsorbed oxygen atoms on the surface. Overall, our preliminary results suggest that superoxide species are more likely to form on Ag clusters, specifically at the Ag(100) facet. However, conclusive results have yet to be obtained, particularly regarding calculations of barrier heights and consideration of realistic samples.
3.3 Charge Transfer
As expected, our calculations showed charge transfer from the Ag surfaces to the adsorbed O2 molecule, where for configuration C(100)1 a larger partial atomic charge (-0.74/O2) than for C(100)2 (-0.64/O2) was calculated (Table 1). A larger charge transfer is consistent with the increase in the rO-O distance, more pronounced for the cluster than for the slab model. A smaller charge transfer for Ag(111) adsorption, as compared to the Ag(100) surface, was calculated.
4 Summary In this communication, we presented a DFT first principles study, including MD simulations, to investigate the possibility of superoxide formation on Ag(100) and
Ag(111) surfaces. We found the four-fold adsorption site to be more favorable in Ag(100) as compared to C(100)2, and also to adsorption at Ag(111). Because of the lack of comprehensive experimental data on silver nanoparticles, oxygen dissociation mechanisms are yet to be definitively elucidated in comparison with the bulk. The charge transfer results indicate a higher preference for superoxide formation on an Ag(100) cluster surface, providing the first stage towards attempting to understand ROS formation in silver nanoparticles.
References
1. Nel, A., Xia, T., Madler, L., Li, N.: Science 311, 622 (2006)
2. Stone, V., Donaldson, K.: Nat. Nanotechnol. 1, 23 (2006)
3. Halliwell, B.: Biochem. Soc. Trans. 35, 1147 (2007)
4. Maynard, A.D., Aitken, R.J., Butz, T., Colvin, V., Donaldson, K., Oberdoerster, G., Philbert, M.: Nature 444, 267 (2006)
5. Oberdester, G., Oberdester, E., Oberdester, J.: Environ. Health Perspect. 113, 823 (2005)
6. Kreyling, W.G., Semmler-Behnke, M., Moller, W.: J. Nanopart. Res. 8, 543 (2006)
7. Tsuji, J.S., Maynard, A.D., Howard, P.C., James, J.T., Lam, C.-W., Warheit, D.B., Santamaria, A.B.: Toxicol. Sci. 89, 42 (2006)
8. Borm, P.J.A., Robbins, D., Haubold, S., Kuhlbusch, T., Fissan, H., Donaldson, K., Schins, R., Stone, V., Kreyling, W., Lademann, J., Krutmann, J., Warheit, D., Oberdorster, E.: Part. Fibre Toxicol. 3 (2006)
9. Xia, T., Kovochich, M., Brant, J., Hotze, M., Sempf, J., Oberley, T., Sioutas, C., Yeh, J.I., Wiesner, M.R., Nel, A.E.: Nano Lett. 6, 1794 (2006)
10. Ji, J.H., Jung, J.H., Kim, S.S., Yoon, J.-U., Park, J.D., Choi, B.S., Chung, Y.H., Kwon, I.H., Jeong, J., Han, B.S., Shin, J.H., Sung, J.H., Song, K.S., Yu, I.J.: Inhalation Toxicol. 19, 857 (2007)
11. Hussain, S.M., Hess, K.L., Gearhart, J.M., Geiss, K.T., Schlager, J.J.: Toxicol. in vitro 19, 975 (2005)
12. Avdeev, V.I., Boronin, K.S.V., Zhidomirov, G.M.: J. Mol. Catal. 154, 257 (2000)
13. Xu, Y., Mavrikakis, M.: J. Am. Chem. Soc. 127, 12823 (2005)
14. Li, W.-X., Stampfl, C., Scheffler, M.: Phys. Rev. B 68, 165412 (2003)
15. Michaelides, A., Reuter, K., Scheffler, M.: J. Vac. Sci. Technol. A 23, 1487 (2005)
16. Campbell, C.T.: Surf. Sci. 157, 43 (1985)
17. Citri, O., Baer, R., Kosloff, R.: Surf. Sci. 351, 24 (1996)
18. Katz, G., Zeiri, Y., Kosloff, R.: Surf. Sci. 425, 1 (1999)
19. Schmidt, M., Masson, A., Brechignac, C.: Phys. Rev. Lett. 91, 243401 (2003)
20. Bonini, N., Kokalj, A., Dal Corso, A., de Gironcoli, S., Baroni, S.: Phys. Rev. B 69, 195401 (2004)
21. Harfenist, S.A., Wang, Z.L., Alvarez, M.M., Vezmar, I., Whetten, R.L.: J. Phys. Chem. 100, 13904 (1996)
22. Delley, B.: J. Chem. Phys. 113, 7756 (2000); implemented in DMol3, Accelrys Inc.
23. Perdew, J.P., Burke, K., Ernzerhof, M.: Phys. Rev. Lett. 77, 3865 (1996)
24. Weast, R.: CRC Handbook of Chemistry and Physics. CRC Press, Boca Raton (1985)
25. Honkala, K., Laasonen, K.: Phys. Rev. Lett. 84, 705 (2000)
26. Nakatsuji, H., Nakai, H.: J. Chem. Phys. 98, 2423 (1993)
27. Wang, Y., Jia, L., Wang, W., Fan, K.: J. Phys. Chem. B 106, 3662 (2002)
28. Wang, Y., Jia, L., Wang, W., Fan, K.: J. Chem. Phys. 119, 11210 (2003)
29. German, E.D., Shientuch, M.: J. Phys. Chem. A 109, 7957 (2005)
30. Costine, I., Schmid, M., Schiechl, H., Gajdos, M., Stierle, A., Kumaragurubaran, S., Hafner, J., Dosch, H., Varga, P.: Surf. Sci. 617 (2006)
Mechanism of Influenza A M2 Ion-Channel Inhibition: A Docking and QSAR Study
Alexander V. Gaiday1, Igor A. Levandovskiy1, Kendall G. Byler2, and Tatyana E. Shubina1,2
1 Department of Organic Chemistry, Kiev Polytechnic Institute, pr. Pobedy 37, 03056 Kiev, Ukraine
2 Computer-Chemie-Centrum der Friedrich-Alexander Universität Erlangen-Nürnberg, Nägelsbachstraße 25, 91052 Erlangen, Germany
[email protected]
Abstract. Binding of blockers to the Influenza A ion-channel is studied using automated docking calculations. Our study suggests that studied cage compounds inhibit the M2 ion channel by binding to the His37 residue. The adamantane cage fits into a pocket formed by Trp41 residue, while the hydrogen bond is formed between hydrogen atom of ammonium nitrogen and the nitrogen of histidine residue. This finding is supported by experimental data and should help to obtain better understanding of the inhibition mechanism of the Influenza A M2 ion channel. Keywords: Influenza A, ion-channel inhibition, QSAR, antiviral drugs, cage compounds.
1 Introduction
The influenza A virus, with 27 known subtypes that range from low to high pathogenicity in the bird population, also infects humans and other mammals. Subtype H5N1, currently known as avian or bird flu, is of particular interest because of its increasing pathogenicity and its ability to form a new viral subtype to which there is no native immunity in human hosts. The influenza virus is bound by a membrane taken from the plasma membrane of an infected cell. The M2 protein is present in this membrane and forms proton-selective channels that are essential to viral function. Based on experimental data, it is assumed that the M2 proton channel consists of a tetrameric array of parallel TM peptides with their N-termini directed towards the outside of the virus and has the following amino acid sequence: N-term-L26VVAAS31IIGILHLILWIL43-C-term [1],[2],[3],[4]. It is also believed that residues Val27, Ala30, Ser31, Gly34, His37, Lys38, and Trp41 form the pore lining, and it is also possible that histidine-37 plays an important role in the gating of the channel, in the mechanism of H+ transport, or in both.
Adamantane derivatives have been used successfully for the prevention and treatment of influenza A virus infection for more than 30 years [5],[6],[7]. It is proposed that these drugs inhibit influenza A virus replication by blocking the M2-protein ion channel. This prevents fusion of the viruses with the host-cell membranes and the release of viral RNA into the cytoplasm of infected cells [8]. There is experimental evidence that the amantadine blockade of the M2 channel occurs within the TM region. However, the mechanism of this process is still unclear. It was recently discovered that the Influenza A virus now exhibits resistance to both amantadine and rimantadine [9]. This resistance to adamantane derivatives is believed to be due to a serine-31-to-asparagine mutation in the M2 protein [10]. Also, it is possible that an amino acid substitution at position 26, 27, 30 or 34 in the transmembrane region of the M2 protein is responsible for resistance to these drugs [7],[11]. In such cases, amantadine and rimantadine will be less predictably effective for treatment or prophylaxis in the event of a pandemic outbreak of influenza. Obviously, a better understanding of the exact nature of M2 proton channel inhibition by amantadine and other possible drug candidates is needed. The present study aims at shedding some light on this problem.
2 Results and Discussion
Cyclopentylamine (1), cyclohexylamine (2), amantadine (3), rimantadine (6) and other adamantane derivatives (Fig. 1) in their protonated states were initially optimized at the B3LYP/6-31G(d) level [12],[13] of theory within the Gaussian03 program package [14].
Fig. 1. Compounds 1-16 used in this work: cyclopentylamine (1), cyclohexylamine (2), amantadine (3), rimantadine (6), and related adamantane derivatives 4, 5 and 7-16 bearing R1/R2 = H or CH3 substituents on the cage (3: R1=R2=H; 4: R1=H, R2=CH3; 5: R1=R2=CH3; 6: R1=R2=H; 7: R1=H, R2=CH3; 8: R1=R2=CH3; 9: R1=R2=H; 10: R1=H, R2=CH3; 11: R1=R2=CH3; 12: R1=R2=H; 13: R1=H, R2=CH3; 14: R1=R2=CH3; 15: R1=H, R2=CH3; 16: R1=R2=CH3)
Table 1. Computed docking energies and estimated binding constants of the compounds 1-16 Docking energy, -E, kcal/mol
Inhibition constant (K i )
Binding pocket
Binding pocket
N
I
II
III
I
II
III
1
3.79
3.43
5.73
0.00
0.00
2.48·10−5
2
3.87
4.34
6.56
6.17·10−4 2.81·10−4 6.44·10−6
3
5.09
6.00
8.73
5.68·10−5 1.13·10−5 1.29·10−7
4
5.27
6.31
8.39
4.78·10−5 8.25·10−6 2.45·10−7
5
5.65
6.87
9.15
1.05·10−5 1.87·10−6 3.98·10−8
6
5.84
6.74
9.84
1.66·10−5 3.86·10−6 1.97·10−8
7
6.34
7.20
10.23
7.34·10−6 1.40·10−6 8.16·10−9
8
6.87
7.89
10.63
2.32·10−6 4.53·10−7 3.52·10−9
9
5.67
6.27
9.51
1.34·10−5 5.02·10−6 2.21·10−8
10
6.96
8.09
10.31
4.03·10−6 5.75·10−7 1.38·10−8
11
6.58
7.28
10.02
3.27·10−6 9.93·10−7 8.06·10−9
12
6.67
7.53
10.95
7.71·10−6 1.84·10−6 5.74·10−9
13
6.96
8.09
10.31
4.03·10−6 5.75·10−7 1.33·10−8
14
7.30
8.60
10.99
2.17·10−6 3.24·10−7 5.25·10−9
15
5.02
5.64
8.61
2.29·10−5 8.64·10−6 6.01·10−8
16
4.38
5.26
8.11
1.61·10−5 3.40·10−6 2.66·10−8
For modelling the transmembrane part of the M2 channel we used the PDB entry 1NYJ with the following amino acid sequence order: SSDPLVVAASIIGILHLILWILDRL. The blockers and the protein were prepared for docking using ADT, the AutoDock graphical interface [15]. For each ligand, Gasteiger charges were assigned and atomic solvation parameters were added. For the protein, polar hydrogen charges of Kollman type were assigned and the non-polar hydrogens were merged with the carbons. The internal degrees of freedom and torsions were set for each ligand. The AutoDock 3.0 software [16] was used to obtain unbiased binding positions and to estimate docking energies and inhibition constants for the studied compounds 1–16. The search was carried out with the Lamarckian Genetic Algorithm: a starting population of 150 ligand conformations with a stopping criterion of a maximum of 10 million energy evaluations. Evaluation of the results was done by sorting the different complexes with respect to the predicted binding energy. A cluster analysis based on RMSD values, with reference to the starting geometry,
Fig. 2. A detailed view of ligand 14 docked into binding pocket I. For a clear view only the Ala30-Ser31-Gly34-Ile35 residues are shown in sticks. Bond lengths in Å.
was subsequently performed. All the figures in this paper were produced with PyMOL [17]. The results of the docking study show that the TM part of the M2 channel can be treated as three different binding pockets within the large pore. The first binding pocket (I) is defined by the Val27-Ala30-Ser31 residues of each α-helix. It is believed that amantadine resistance is determined by amino acid exchanges within this region; also, in an early molecular modelling study [18] Sansom et al. suggested localization of the amantadine binding site close to Ser31. The second binding pocket (II) is formed mostly by Gly34 and Ile35. Finally, the His37 residue is responsible for the formation of the third binding pocket (III). The computed docking energies of 1–16 within each binding pocket and the corresponding computed inhibition constants (Ki) are summarized in Table 1. Both cyclic amines 1 and 2 exhibit the lowest binding energies to all three binding sites. Our study confirms that cyclopentylamine 1 does not block the M2 channel, due to its low binding energy and, thus, its probable ability to pass through the pore; this is in agreement with its lack of antiviral activity in infected cells [19],[20]. While cyclohexylamine 2 is bulkier than cyclopentylamine 1, it still shows low inhibition properties. This correlates with the sensitivity of influenza virus replication to the size of the aliphatic ring in the cyclic amines. For all studied compounds the His37 binding site (III) was found to be the most preferred one. Rimantadine 6 has higher binding energies and exhibits ca. 10-fold stronger inhibition than amantadine 3. This can be explained by the presence of the CH group between the amino group and the adamantane cage, which allows rotation of the amino group and therefore permits a closer and more favorable interaction with the amino acid residues of the proton channel than in amantadine.
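The roughly exponential relationship between the docking energies and the estimated inhibition constants in Table 1 follows from the standard thermodynamic conversion that AutoDock applies to its free-energy estimate. The Java sketch below only illustrates that conversion; the class and method names are ours, and because the program uses its empirical free-energy score rather than the raw docking energy, the numbers will not reproduce Table 1 exactly.

// Minimal sketch: estimated inhibition constant from a binding free energy in kcal/mol.
// Assumption: deltaG is the (negative) estimated free energy of binding at 298.15 K.
public final class InhibitionConstant {
    private static final double R = 1.987e-3; // gas constant, kcal mol^-1 K^-1
    private static final double T = 298.15;   // temperature, K

    static double ki(double deltaG) {
        return Math.exp(deltaG / (R * T));    // Ki = exp(deltaG / RT)
    }

    public static void main(String[] args) {
        // e.g. deltaG = -8.73 kcal/mol (compound 3, pocket III) gives a Ki on the order of 10^-7
        System.out.println(ki(-8.73));
    }
}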
Fig. 3. A detailed view of ligand 14 docked into binding pocket II. For a clear view only the Ala30-Gly34-Ile35-His37 residues are shown in sticks. Bond lengths in Å.
Successive substitution of the hydrogens in the 3- and 5-positions of the adamantane cage by methyl groups, leading to compounds 4, 7, 10, 13 and 5, 8, 11, 14, increases the binding energies and inhibition properties of 4, 5, 7–14 (Table 1). The analogous successive introduction of methyl groups into the adamantane cage of compounds 9 and 12, leading to compounds 10, 13 and 11, 14, also increases the binding energies and inhibition properties, as was the case for amantadine and rimantadine (Table 1). In fact, the increased molecular volume of the inhibitor and the presence of the acetyl group make 14 the strongest blocker among the studied compounds. It should be mentioned that the nitrogen atom of the amino group of all four ligands forms a hydrogen bond with the residues that form the binding pocket. Figures 2–4 show a detailed view of the docked representative ligands 14 or 8 within the first, second and third binding pockets, respectively. The distance between the ammonium hydrogen and the oxygen atom of the serine residue in the first binding pocket ranges from 2.37 Å (in 2) and 2.0 Å (in 1 and 3) to 1.84 Å (in 4). In the second binding pocket there is also the possibility of forming hydrogen bonds between a hydrogen atom of the NH3 group and the oxygen atom of glycine. Nevertheless, for all ligands studied here the most preferred binding pocket is the third one. Interestingly enough, in contrast to our findings, Sansom et al. [18] showed that near the His37 residue there is a substantial barrier in the case of amantadine and a favorable interaction with cyclopentylamine. Our docking study shows that the aliphatic rings and the adamantane cage fit into a pocket formed by tryptophan-41 (Fig. 4), while the hydrogen bond is formed between a hydrogen of the NH3
group and the nitrogen of the histidine residue, leading to more efficient blocking of the channel. Moreover, the difference in the estimated inhibition constants of amantadine (3) and rimantadine (6) docked into binding pocket III agrees fortuitously with the experimentally observed order. These findings are in agreement with the recently proposed model [21] in which the ammonium nitrogen of amantadine shares a hydrogen bond with an unprotonated histidine-37 imidazole while the adamantyl group fits in the lipophilic pocket formed in the vicinity of Gly34. It is also worth noting that a similar adamantane-NH3–His interaction has also recently been found [22] for the Hepatitis C virus ion channel p7.
Fig. 4. A detailed view of ligand 8 docked into binding pocket III. Only His37 and Trp41 are shown in sticks. Bond lengths in Å.
The structures in the influenza data set were aligned using the SYBYL 7.0 program. Semi-empirical single-point calculations with the AM1 Hamiltonian were performed on each using VAMP 9.0 [23] to calculate charges and orbital information. A three-dimensional grid with a point spacing of 2 Å and a 4 Å border was generated and used in the calculation of seven local properties: electron density de, electron affinity EAL, electronegativity cL, hardness hL, ionization potential IEL, electrostatic potential V, and polarizability aL at each grid point, using the ParaSurf 07 program package [24]. Points interior to the molecular surface were removed from the grid by using a generalized van der Waals radius of 1.16 Å in order to ensure that property values
that bear no direct relation to surface activity would not appear in the PLS analyses. This atomic radius, which is slightly smaller than the van der Waals radius of the hydrogen atom (1.20 Å), was chosen in order to leave enough surface electron density to use the local electron density as a steric parameter in the 3D-QSAR analysis. The local property values at the grid points were then used as independent variables in separate partial least squares regressions, using the associated physical property values as target values. The partial least squares analyses were performed using an in-house program based on the SIMPLS algorithm and a full cross-validation scheme, re-centering and re-scaling the included data for each run. The PLS regressions were carried out initially to ten vector components in order to determine the maximum number of components to be included in the final model, using the cross-validated standard error of prediction (SEP) of the model to choose the appropriate number of components. In the cases where PLS analyses were performed using single local properties, the property data were normalized by the mean. The regression coefficients for the final model were then used to generate a three-dimensional representation of the property space with colored spheres using PyMOL [17]. Those grid points with a positive relationship to the particular target property are color-coded red, while those with a negative relationship to the target property are color-coded blue. In addition to the color-coding, the size of the grid spheres, determined by the standardized magnitude of the regression coefficients, represents the magnitude of the relationship between the local property at that point and the target value. The PLS analysis based on SYBYL CoMFA yields r2 = 0.993 for 10 components; however, reducing the number of components to 4 leads to r2 = 0.909, q2 = 0.348, σ = 3.46. The steric field was found to account for 78.2% of the model and the electrostatic interaction for 21.8%. The QSAR performed with ParaSurf 07 (with 3 components) gives a somewhat better correlation: r2 = 0.913, q2 = 0.401, σ = 3.73. The 3D-QSAR results for the two properties with the largest contributions - electron density de and ionization potential IEL - are shown below (Fig. 5, left and right, respectively). In this work, we have reported modeling of the M2-TM channel inhibition by different ligands. We have demonstrated the difference in binding mechanism between the small cyclic amines (1–2) and the adamantane derivatives (3–16) in the M2 proton channel. The docking study shows that the M2 proton channel is blocked by cage compounds and suggests that the preferred binding site is in the vicinity of the His37 residue.
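For readers unfamiliar with the regression machinery used above, the following Java sketch fits a single-component PLS model (NIPALS form) to mean-centered descriptors. It is a minimal stand-in for illustration only, not the in-house SIMPLS code used in this work, and all identifiers are ours.

// Minimal one-component PLS (NIPALS) sketch for mean-centered data.
// X: n samples x p local-property descriptors, y: n target property values.
final class Pls1 {
    double[] w;   // weight vector (unit length)
    double q;     // regression of y on the scores t = X w

    void fit(double[][] X, double[] y) {
        int n = X.length, p = X[0].length;
        w = new double[p];
        // w = X^T y, then normalize
        for (int j = 0; j < p; j++)
            for (int i = 0; i < n; i++) w[j] += X[i][j] * y[i];
        double norm = 0;
        for (double v : w) norm += v * v;
        norm = Math.sqrt(norm);
        for (int j = 0; j < p; j++) w[j] /= norm;
        // scores t = X w, then q = (t . y) / (t . t)
        double ty = 0, tt = 0;
        for (int i = 0; i < n; i++) {
            double t = 0;
            for (int j = 0; j < p; j++) t += X[i][j] * w[j];
            ty += t * y[i];
            tt += t * t;
        }
        q = ty / tt;
    }

    double predict(double[] x) {       // x must be centered with the training means
        double t = 0;
        for (int j = 0; j < x.length; j++) t += x[j] * w[j];
        return q * t;
    }
}

Further components would be extracted by deflating X and y and repeating, and the cross-validated SEP mentioned above is obtained by refitting with each sample left out in turn.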
Fig. 5. Electron density de (left) and ionization potential IEL (right)
The calculated inhibiting ability of amantadine and rimantadine reproduces the relations observed clinically. Based on our docking study and the 3D-QSAR results, we suggest that the next generation of potential M2 inhibitors should include a lipophilic adamantane or other hydrocarbon cage (e.g. trishomocubane) combined with polar groups (e.g. hydroxyl, carboxyl, or urea/thiourea moieties) separated from the cage by methylene bridges. Nevertheless, a clearer understanding of the nature of the inhibition mechanism is needed. One way to achieve this is via molecular dynamics simulation of the whole M2 ion channel and its interaction with inhibitors. The results of such a study will be published elsewhere. Acknowledgments. This work was supported by grants for young scientists from the President of Ukraine (GP 2006, 2007 and 2008).
References 1. Lamb, R.A., Holsinger, L.J., Pinto, L.H., Wimmer, E. (eds.): vol. 28, p. 526. Cold Spring Harbor Lab Press, New York (1994) 2. Hay, A.J.: Semin. Virol 3, 21 (1992) 3. Chizhmakov, I.V., Geraghty, F.M., Ogden, D.C., Hayhurst, A., Antoniou, M., Hay, A.J.: J. J. Physiol. 494, 329 (1996) 4. Shimbo, K., Brassard, D.L., Lamb, R.A., Pinto, L.H.: Biophys. 70, 1335 (1996) 5. Tominack, R.L., Hyden, F.G.: Rimantadine hydrochloride and amantadine hydrochloride use in Influenza A virus infection. Infect. Dis. Clin. North. Am. 1, 459 (1987) 6. Dolin, R., Reichman, R.C., Madore, H.P., Maynard, R., Linton, P.N., WebberJones, J.: A controlled trial of amantadine and rimantadine in the prophylaxis of Influenza A virus. N. Engl. J. Med. 307, 580 (1982) 7. Belshe, R.B., Smith, M.H., Hall, G.B., Bettis, R., Hay, A.J.: Genetic basis of resistance to rimantadine emerging during treatment of influenza virus infection. J. Virol. 62, 1508 (1988) 8. Wang, C., Takeuchi, K., Pinto, L.H., Lamb, R.A.: Ion channel activity of Influenza A virus M2 protein: characterization of the amantadine block. J. Virol. 67, 5585 (1993)
9. Evolution of H5N1 avian influenza viruses in Asia, http://www.cdc.gov/ncidod/EID/vol11no10/pdfs/05-0644.pdf 10. Puthavathana, P., Auewarakul, P., Charoenying, P.C.: Molecular characterization of the complete genome of human influenza H5N1 viruses isolated from Tailand. J. Gen. Virol. 86, 423 (2005) 11. Hay, A.J., Zambon, M.C., Wolstenholme, A.J., Skehel, J.J., Smith, M.H.: Molecular basis of resistance of Influenza A viruses to amantaine. J. Antimicrob. Chemother. 8(suppl. B), 19 (1986) 12. Becke, A.D.: J. Chem. Phys. 98, 5648 (1993) 13. Lee, C., Yang, W., Parr, R.G.: Phys. Rev. 37, 785 (1988) 14. Frisch, M.J., Trucks, G.W., Schlegel, et al.: Gaussian 03, Revision B.03, Gaussian, Inc., Wallingford CT (2004) 15. Sanner, M., Duncan, B., Carillo, C., Olson, A.: Integrating computation and visualization for biomolecular analysis: an example using python and avs. Pac Symp Biocomput, 401–412 (1999) 16. Morris, G.M., Goodsell, D.S., Halliday, R.S., Huey, R., Hart, W.E., Belew, R.K., Olson, A.: Automated docking using a lamarckian genetic algorithm and empirical binding free energy function. J. Comp. Chem. 19, 1639 (1998) 17. DeLano, W.L.: The pymol molecular graphics system (2002), http://www.pymol.org 18. Sansom, M.S., Kerr, I.D.: Influenza virus M2 protein: a molecular modelling study of the ion channel. Protein Engineering 6, 65 (1993) 19. Hay, A.J., Wolstenholme, A.J., Skehel, J.J., Smith, M.H.: EMBO Journal 4, 3021 (1985) 20. Hay, A.J., Zambon, M.: Multiple actions of amantadine against influenza virus, volume Antiviral Drugs and Interferon: The Molecular basis of their activity (1984) 21. Mould, J.A., Li, H.-C., Dudlak, C.S., Lear, J.D., Pekosz, A., Lamb, R.A., Pinto, L.H.J.: J. Biol. Chem. 275, 8592 (2000) 22. Ptagrias, G., Zitzmann, N.D., Fischer, W.B.: J. Med. Chem. 49, 648 (2006) 23. Clark, T., Alex, A., Beck, B., Chandrasekhar, J., Gedeck, P., Horn, A.H.C., Hutter, M., Martin, B., Rauhut, G., Sauer, W., Schindler, T., Steinke, T.: VAMP 9.0. Computer-Chemie-Centrum, Universit¨ at Erlangen-N¨ urnberg, Erlangen (2005) 24. ParaSurf 2007 CEPOS InSilico Ltd (2007)
A Java Tool for the Management of Chemical Databases and Similarity Analysis Based on Molecular Graphs Isomorphism Irene Luque Ruiz and Miguel Ángel Gómez-Nieto University of Córdoba. Department of Computing and Numerical Analysis. Campus de Rabanales. Albert Einstein Building E-14071 Córdoba (Spain) {iluque,mangel}@uco.es
Abstract. This paper describes a computational chemistry solution for the management of large chemical databases of molecules and for performing isomorphism calculations for the analysis of database similarity and diversity. The system has been fully developed using the Java language and it uses other free and standard Java libraries. The system allows the user to build databases of molecules, store information about the molecules, match molecules using different isomorphism paradigms, and perform similarity/diversity analysis of databases through a wide number of similarity indices. Keywords: Computational chemistry, isomorphism, matching, similarity and diversity analysis, chemical database management, Java.
1 Introduction
The investigations in Computational Chemistry [1] use computers to assist in solving chemical problems; they incorporate the results of theoretical chemistry into efficient computer programs aimed at calculating the structures and properties of molecules. While its results normally complement the information obtained by chemical experiments, computational chemistry can also predict hitherto unobserved chemical phenomena. Several major areas may be distinguished within computational chemistry [1-3]:
− The prediction of the molecular structure of substances.
− Storing and searching for data on chemical entities.
− Identifying correlations between chemical structures and properties (QSPR and QSAR).
− Computational approaches to help in the efficient synthesis of compounds.
− Computational approaches to design molecules that interact in specific ways with other molecules (e.g. drug design).
Computational Science plays an important role in modern molecular discovery systems [4-5]. Nowadays, a large number of compounds are tested and optimized in
order to identify a molecule or sets of molecules that act favorably in some desired circumstances. One of the limiting factors in identifying new candidates is the availability of proper collections of chemical compounds with which to propose a prediction model of the molecular property of interest. Similarity analysis engines [6-7] are often used to compute resemblances between molecules in order to analyze and organize databases of molecules and to develop new QSAR models. The similarity between two molecules can be measured as the difference between the molecular graphs representing their chemical structures, since structurally similar molecules are more likely to have similar properties. In this paper, we present a tool devoted to the management of databases of molecules for the development of chemoinformatic solutions. Chemoinformatics, or chemical informatics, is the use of computers and informational techniques applied to a range of problems in the field of chemistry. These in silico techniques are used in pharmaceutical companies in the process of drug discovery. Applications of chemoinformatics include the storage of information related to chemical compounds, the in silico representation of chemical structures using specialized formats, the virtual screening of large in silico virtual libraries (usually built using combinatorial chemistry) of compounds, QSAR used to predict the activity of compounds from their structures, etc. The tool described in this paper embraces some of these chemoinformatic applications and is open to the inclusion of new ones in the future. We describe the database management component, which is aimed at building databases of molecules and permits the storing, retrieving, visualizing, and similarity analysis of sets of molecules organized in different projects. The paper is organized as follows: first we describe the tool functionality using the main use case. In the next section the database model is described; later the system architecture and tool characteristics are described using some representative screenshots. Finally, we discuss the applications and future work.
2 Background
Although there is a wide set of databases [8] on the Internet aimed at offering researchers access to chemical information, these databases do not permit users to manage their own information. They are government, company or industry owned databases, and researchers can only access and visualize the information stored in them. There are other proprietary systems aimed at the management of chemical information. They are usually modular systems, where researchers have to pay a high cost to obtain each of the functionalities needed (database management, similarity analysis, molecular descriptor calculation, etc.). In the last few years, different free solutions have been developed. CDK [9] is a free Java library of chemoinformatics packages aimed at QSAR and at 2D and 3D chemical solutions. Other examples are Bioeclipse [10] and Instant JChem [11]. Both software products currently allow the researcher to manage chemical information, but they do not include other functionality. However, Bioeclipse, a free
Java tool, allows expert researchers to develop new functionality and to include it in the user environment of the tool. We describe in this paper a tool fully developed in Java, aimed at both the management of user chemical databases and similarity analysis. This system includes different free Java components developed by other researchers and algorithms developed by the authors, in order to offer the scientific community a tool for the management of databases of chemical compounds and for similarity/diversity database analysis using similarity principles based on isomorphism calculation.
3 System Description
The system functionality is shown in the use case diagram of Fig. 1. Three main functionalities are represented:
• Database management: the user can build new databases from existing databases or files storing molecules in any of the standard file formats (SMILES, MOL, SDF, etc.) [12]. Databases can be built using both Oracle 10g and MySQL 5.0 database systems, using the relational database model.
• Project management: the user can build new projects incorporating molecules from the existing databases and/or external files of molecules. The project management is quite similar to the database management; however, here the user can modify, delete or insert new molecule characteristics (e.g. properties) in order to tailor specific groups of molecules for a later study.
• Isomorphism calculation: the user can execute different isomorphism algorithms on the groups of molecules stored in the projects. The results of the isomorphism calculation are managed by the user, who can carry out different similarity analyses in order to study database diversity, molecular resemblance, or to develop different QSAR models.
3.1 Database and Project Management
The user can create as many databases of molecules as required. Databases are created empty using a creation script or by importing molecules from other databases or from files containing molecules in some of the standard formats. For existing databases new molecules can be added, extracted or transferred to another database, or even saved in external files. Using menu items, keyboard shortcuts or drag and drop (as in standard Windows applications), the user can manage the database content easily. When molecules are imported into the database, properties of the molecules are stored. These properties are allocated in a referenced table storing the property value
and its meaning. New properties can be imported from external files or added using specific functionality of the system. Panels distributed along the application window allow the user to visualize the information stored in the database. Molecules are visualized using the Structure [13] paint tool, a free tool aimed at the representation of molecular graphs. The remaining information about the molecules, such as property values, number and type of nodes and edges, molecular weight, etc., is visualized in tables distributed along the window environment. The management of a project (see Fig. 1) is quite similar to the previously described database management. Indeed, the system considers a project as a tailored, user-owned database.
Fig. 1. Use case diagram of Project Management
The user can create as many projects as necessary, importing the information from existing databases or external files, and afterwards modify or extend the content in order to prepare a set of molecules for a chemoinformatic analysis.
3.2 Isomorphism Calculation
Several isomorphism criteria are included in the system described in this paper.
• Exact isomorphism: calculates whether two molecular graphs are exactly equal. This calculation is performed using a CDK class [9], and the results simply state whether the two molecular graphs are equal or not.
• Subgraph isomorphism: calculates whether a molecular graph is part of (is included in) another one. A CDK class is used for this calculation. Results indicate whether one of the matched molecules is an entire subgraph of the other one.
• All Common Fragments (CF): calculates the set of all common fragments between the two matched molecular graphs. In this set some fragments may be included in other larger fragments. As a result a list of the common fragments between the two
matched molecules is obtained. An in-house method also allows us to obtain all the non-common fragments of both matched molecules.
• Maximum common subgraph (MCS): calculates the maximum common subgraph between two given molecular graphs. This isomorphism method is performed using an algorithm developed by the authors [14].
• Maximum overlapping set (MOS): also called Maximum Common Edge Subgraph (MCES), is the set of common fragments, without repetition, between two molecules.
The calculation of MCS and MOS isomorphism also retrieves the non-isomorphic fragments of both matched graphs; that is, those fragments belonging to each matched molecular graph that are not included in the maximum common fragment. Non-isomorphic fragments are widely applied in QSAR models for activity prediction of drugs. In all the isomorphism calculation methods, the user can include all molecules of a project or only a selected group in the calculation. The results of the isomorphism calculation are represented as tables in an appropriate canvas. The user can interact with the tables in order to visualize the graphs resulting from the isomorphism calculation (common and non-common) and the corresponding information related to the result. Moreover, these results can be stored in external text files, and the isomorphic and non-isomorphic subgraphs in standard molecule formats.
3.3 Similarity Analysis
Similarity analysis is carried out with any of the results obtained with the isomorphism calculation utility. Several similarity indices can be considered for this analysis (Tanimoto, cosine, Raymond, Dice, Kulczynski, Simpson, Forbes, Hamman, Pearson, Yule, etc.) [5-6], and the results are shown in symmetrical tables in a specific window canvas. Moreover, the user can save these results in external text files in order to carry out, for instance, a further statistical analysis.
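As an illustration of how such graph-based indices are typically derived from an isomorphism result, the sketch below computes Tanimoto, Dice and cosine values from the sizes of the two molecular graphs and of their maximum common subgraph. It is a generic sketch of the standard formulas, not the tool's own code, and the class and method names are ours; in the tool itself the common-subgraph sizes would come from the isomorphism step described in Sect. 3.2.

// Generic similarity indices computed from graph sizes (e.g. atom or bond counts):
// a = size of molecule A, b = size of molecule B, c = size of their maximum common subgraph.
final class GraphSimilarity {
    static double tanimoto(int a, int b, int c) { return (double) c / (a + b - c); }
    static double dice(int a, int b, int c)     { return 2.0 * c / (a + b); }
    static double cosine(int a, int b, int c)   { return c / Math.sqrt((double) a * b); }

    public static void main(String[] args) {
        // Example: two molecules with 10 and 12 heavy atoms sharing an 8-atom common subgraph.
        int a = 10, b = 12, c = 8;
        System.out.printf("Tanimoto=%.3f Dice=%.3f Cosine=%.3f%n",
                tanimoto(a, b, c), dice(a, b, c), cosine(a, b, c));
    }
}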
4 Database Structure
Figure 2 shows a summarized class diagram of the database structure [15]. Projects are stored in a table with information related to project identification and management. Each project maintains information about a set of molecules. For each molecule, information on its identification, name, formula, SMILES description, and references to the database from which it was extracted, among others, is stored. The class Properties maintains information about the different properties which may be related to the molecules; this is a master class. The PropertyMolecule class is related to both the Molecule and Properties classes, thus keeping information about the property values of each molecule stored in the database. Isomorphism results are represented by different classes depending on the isomorphism type. Hence, the GraphIsomReslt class stores information about exact and subgraph isomorphism; its attributes exact and subgraph store the information relative to these isomorphism types.
Fig. 2. Main classes of the database
Furthermore, the MCSIsomReslt, MOSIsomReslt and CFIsomReslt classes store information about the MCS, MOS and CF isomorphism types, respectively. For all these classes the isomorphic subgraphs are stored as a set of subgraphs; the non-isomorphic subgraphs can later be obtained from the knowledge of both the matched molecules and the isomorphic subgraphs (both class attributes). All classes related to the representation of isomorphism information define constraints in order not to store duplicate information; that is, pairs (idmolecule-1, idmolecule-2) are unique without order consideration.
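One simple way to enforce this unordered-pair uniqueness before writing a result row is to canonicalize the identifier order in code, backed by a uniqueness constraint on the ordered columns in the database. The following minimal Java sketch only illustrates the idea; the class and field names are ours, not the tool's.

// Canonicalize an unordered molecule pair so that each (id1, id2) combination
// is stored exactly once, regardless of the order in which the match was requested.
final class MoleculePairKey {
    final long id1; // smaller identifier
    final long id2; // larger identifier

    MoleculePairKey(long a, long b) {
        if (a <= b) { id1 = a; id2 = b; } else { id1 = b; id2 = a; }
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof MoleculePairKey)) return false;
        MoleculePairKey k = (MoleculePairKey) o;
        return id1 == k.id1 && id2 == k.id2;
    }

    @Override public int hashCode() { return Long.hashCode(id1) * 31 + Long.hashCode(id2); }
}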
5 System Architecture The system has been designed using UML 2.0 method [15], and as we have previously commented, fully codified using Java, free source Java code and Eclipse software environment [16]. Figure 3 shows some of the main system classes. Any remarkable functionality of the system is implemented using Eclipse Perspectives model. A Perspective is a set or windows or Panels with related functionality and that the user usually uses in a combined way. The user can change between different Perspectives as desired; however, only a Perspective (a set of Panels) can be active at the same time. In an execution session different database connections can be open at the same time. In this way, the user can manage information (molecules) of different databases in order to insert new molecules in a project. Besides, different projects can be open at a time. However, the user can only manipulate the molecules of the selected or active
Fig. 3. Some of the main system classes
project. For each one of the specific functionalities a Panel class in charge of the manipulation of the corresponding information is defined. So, the user can manage and visualize huge information at a time in the same Perspective. As can be observed in Fig. 3, the user navigates among the projects and databases through a hierarchical menu or tree where database and project objects information can be accessed and managed (TreePanel class). Similarity calculation is performed using the SimilarityCalculation class. This class implements methods for the calculation of all the similarity indices. Results are presented in different tables using ResultsTablePanels class. 5.1 Molecule Management Interface Figure 4 shows the screenshots corresponding to the molecule management perspective. As observed in the windows left-hand two panels are devoted to the project and database management respectively.
Fig. 4. Project and database management perspective
Using a tree menu, different projects and the corresponding molecules can be accessed. Furthermore, in the windows right-hand a table is used to show all the molecule data. When a molecule is set both in the tree menu or molecule table, the corresponding molecular graph is also shown. The user can manage different projects and tables of molecules in the database. So, the user can copy molecules between projects or between database tables and projects. Moreover, the user can import new molecules to a project or database table at any time. 5.2 Isomorphism Management Interface Figure 5 shows the screenshots corresponding to the isomorphism management perspective. The left-hand of the window is the molecule management perspective. However, these panels are dynamic, and the user can minimize them if they are not necessary any more. As Fig. 5 shows, a panel is devoted to the user selection of the isomorphism type calculation. User can select all project molecules or a subset and the type of isomorphism to be calculated. Once the isomorphism is calculated, the isomorphism results are shown through two panel: a) a panel shows the molecular graphs corresponding to the common and non-common fragments corresponding to the matched molecules which are selected in the left-hand tree project menu (or in the similarity results table), and b) a table shows the information corresponding to the selected isomorphism, namely, number and type of nodes and edges of the molecules, common and non-common fragment. Isomorphism results are automatically stored in the project database (see tree project menu) as we described in section 4. Another panel is used for the visualization of the similarity analysis results. As is observed in Fig. 5 (upper right-hand panel), user can select any of the similarity indices for the similarity calculation. Results are shown in a linkage table to the other
Fig. 5. Isomorphism and Similarity perspective
panels (tree project menu, molecular graphs visualization and isomorphism data table). Results of similarity analysis can be saved by the user in external files using the corresponding option of the main menu or right-hand mouse button.
6 Discussion
Applications of Computer Science to Chemistry are playing an important role in research advances. Many of these advances are based on the building and manipulation of large databases of chemical compounds and the development of new algorithms for the classification, screening and analysis of chemical databases. In this paper we have described a useful system for the management of chemical databases and the development of computational chemistry applications based on the similarity among chemical compounds using different isomorphism paradigms. The system described in this paper is able to work with some of the most popular relational database systems, and it allows the user to build different projects using molecules from other projects, databases or external files. Manipulation of projects and molecules is guided by a window environment in which different Perspectives (sets of panels with related functionality) facilitate the management and visualization of information. The system includes matching calculation among molecules using different isomorphism paradigms. Matching results can be visualized, stored and used for similarity analysis of projects, databases or sets of molecules. The similarity calculation can be carried out using a wide set of similarity indices. The modular character of the system allows us to develop new computational chemistry solutions and to enlarge the system with new Perspectives.
Nowadays, we are working to include a new perspective for the calculation of molecular descriptors. Acknowledgements. This work was supported by the Comisión Interministerial de Ciencia y Tecnología (CiCyT) and FEDER (Project: TIN2006-02071).
References 1. Young, D.: Computational Chemistry. A Practical Guide for Applying Techniques to Real World. John Wiley & Sons, Chichester (2004) 2. Cramer, C.J.: Essentials in Computational Chemistry. John Wiley & Sons, Chichester (2002) 3. Jensen, F.: Introduction to Computational Chemistry. John Wiley & Sons, Chichester (2007) 4. Van de Waterbeemd, H. (ed.): Structure-Property Correlations in Drug Research. Academic Press, Austin (1996) 5. Johnson, M.A., Maggiora, G.M. (eds.): Concepts and Applications of Molecular Similarity. John Wiley & Sons Ltd, Chichester (1990) 6. Willett, P.: Chemical Similarity Searching. J. Chem. Inf. Comput. Sci. 38, 983–996 (1998) 7. Hansch, C., Leo, A.: Exploring QSAR: Fundamentals and Applications in Chemistry and Biology. ACS Professional Reference Book, Washington DC (1995) 8. Irwin, J.J., Shoichet, B.K.: ZINC- a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 45, 177–182 (2005) 9. Chemistry Development Kit, http://www.sourceforge.net/projects/cdk 10. Bioeclipse, http://www.bioclipse.net 11. ChemAxon Kft., http://www.chemaxon.com 12. Dalby, A., Nourse, J.G., Hounshell, W.D., Gushurst, A., Grier, D.L., Leland, B.A., Laufer, J.: Description of Several Chemical-Structure File Formats Used by Computer-Programs. J. Chem. Inf. Comput. Sci. 32, 244–255 (1992) 13. Structure, http://sourceforge.net/projects/structure 14. Cerruela García, G., Luque Ruiz, I., Gómez-Nieto, M.A.: Step-by-Step Calculation of All Maximum Common Substructures through a Constraint Satisfaction Based Algorithm. J. Chem. Inf. Comput. Sci. 44, 30–41 (2004) 15. Booch, G., Rumbaugh, J., Jacobson, I.: Unified Modeling Language User Guide, 2nd edn. Addison-Wesley Reading, Reading (2005) 16. Eclipse Software, http://www.eclipse.org
Noncanonical Base Pairing in RNA: Topological and NBO Analysis of Hoogsteen Edge - Sugar Edge Interactions

Purshotam Sharma, Harjinder Singh, and Abhijit Mitra

Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
[email protected]
Abstract. The Hoogsteen:Sugar edge pattern of RNA base pairing is studied using density functional theory. The hydrogen bonding patterns of these base pairs are characterized using NBO and AIM analyses. A correlation between the strength of base pairing and the nature of the donor-acceptor combinations is also investigated.
1 Introduction
Molecular recognition abilities, the constitution of active sites and the associated conformational dynamics, all of which are guided by noncovalent interactions in their respective three-dimensional structures, are important aspects in understanding the functioning of biological macromolecules [1]. The molecular biology community has greatly benefitted from the concomitant understanding of their sequence-structure-function space in terms of the physicochemical properties of the constituent amino acids in protein enzymes. The recent surge of interest in understanding the functioning of ribozymes, the functional RNA molecules, has highlighted the need for developing a fresh understanding of these aspects. RNA molecules, with a preponderance of negatively charged phosphate groups and a choice of univalent and divalent counter ions, present a different set of physicochemical environments. They need to function with the help of only four types of bases, apart from a few modified bases, instead of the twenty different amino acids. Though there exists abundant literature related to the understanding of molecular recognition and conformational dynamics of DNA molecules [2], the scenario in RNA molecules is far more complex than in DNA. In DNA molecules we predominantly deal with double-stranded helices with complementary Watson-Crick base pairs holding the two strands together. RNA molecules, on the other hand, are single strands folded onto themselves. The relative positioning of different regions in an RNA strand is thus not necessarily mediated by canonical Watson-Crick base pairs. More than 40% of base pairing interactions in RNA molecules are noncanonical [3]. The second important difference in RNA molecules is the presence of the 2′-OH group in the ribose moiety. This gives rise to extra possibilities for interstrand interactions in RNA as distinct from those in DNA. Each nucleobase has three distinct edges:
the Watson-Crick, Hoogsteen and Sugar (W, H and S) edges. Leontis and Westhof have shown how, on the basis of the six possible edge interactions and the two possible glycosidic bond orientations (cis and trans) for each, pairing between any two interacting bases can be classified into twelve geometric families [3]. With sixteen possible pair combinations arising out of an alphabet of four bases, this leads to a total of 168 distinct base pair types, only four of which are canonical [4],[5]. This does not include the variety that may arise out of modifications, including protonation and tautomerism, of the interacting bases. Also not included are the possibilities of multimodality within each geometric family. Notably, this count considers, for example, A:G W:W cis and G:A W:W cis as distinct types. This is not unreasonable, since studies within evolutionarily related sequences have, in many cases, shown a lack of covariation and have highlighted the distinct nature of such combinations. Several studies exist involving the compilation and characterization of noncanonical base pairs in available RNA structures [4],[6]. These not only highlight the variety in the number and nature of hydrogen bonds involved in base-base interactions, but also underline the need for evaluating the associated interaction energies. Several groups are currently involved in ab initio quantum chemical studies of interaction energies of RNA molecules [5],[7,8,9],[12]. In our studies involving more than 45 representative base pairs, most of which are mediated by two or more good N-H...N/O hydrogen bonds [7,8,9], we have argued in favor of the possibility that the variety in physicochemical properties of several of these base pairs may define their respective unique roles in the context of the structure and dynamics of functional RNA molecules [7],[9]. While choosing representative examples and while building models for computations, we have so far not considered hydrogen bonding interactions involving the sugar O2′-H group. In this paper, we describe the results of ab initio quantum chemical studies on some representative members of two structurally and biochemically important families of RNA base pairs, the cis and trans Hoogsteen:Sugar edge families, where the sugar O2′ interactions have been explicitly considered.
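As a small bookkeeping aid for the counting just described, the sketch below (Python, illustrative only) enumerates the twelve Leontis-Westhof geometric families from the three edges and the two glycosidic bond orientations.

```python
# Enumerate the twelve geometric base-pair families: three interacting edges
# per base (W, H, S) give six edge combinations, each occurring with a cis or
# trans glycosidic bond orientation.
from itertools import combinations_with_replacement

edges = ["W", "H", "S"]
orientations = ["cis", "trans"]

families = [f"{a}:{b} {o}"
            for a, b in combinations_with_replacement(edges, 2)
            for o in orientations]

print(len(families))   # 12
print(families)        # e.g. 'W:W cis', 'H:S trans', ...
```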
2 Computational Methods
The Gaussian [10] suite of quantum chemical programs was used for all electronic and structural calculations carried out in this work. The initial structures of the base pairs were built on the basis of their respective crystal geometries. The ribose sugar of the base interacting through its sugar edge was retained in our computations, and the 5′-OH group of the retained ribose was replaced by a hydrogen atom during model building. The ribose sugar of the base interacting through its Hoogsteen edge was not involved in the base pairing interaction and was therefore replaced by a hydrogen atom. Such models have been used recently in the literature and are fully justified for studying base pairing interactions in RNA [5],[12]. We carried out unconstrained geometry optimizations, with all parameters relaxed, at the DFT level. Becke's three-parameter exchange [13] and
Lee-Yang-Parr's correlation functional [14] (together known as B3LYP) were used with the 6-31G** basis set for geometry optimizations. B3LYP/6-31G** optimized base pair geometries have been shown to agree very well with geometries obtained from RIMP2/cc-pVTZ (resolution-of-identity Møller-Plesset second-order perturbation method with the correlation-consistent polarized valence triple-zeta basis set) optimizations [15]. This level of theory is well established in the literature for studying nucleic acid base pairing properties [5],[12]. Natural Bond Orbital (NBO) [16] and Atoms in Molecules (AIM) [17] analyses have been performed in order to understand the nature of interbase binding and to characterize the hydrogen bonding interactions in terms of the spatial distribution of electron densities.
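For illustration only, a minimal helper of the following kind could be used to generate Gaussian-style input decks for B3LYP/6-31G** optimizations with NBO population analysis of the sort described above; the exact route options and the toy geometry are assumptions, not the authors' actual inputs.

```python
# Illustrative sketch only: writes a Gaussian-style input deck for an
# unconstrained B3LYP/6-31G** optimization with an NBO population analysis.
# The route keywords and the placeholder geometry are assumptions for
# demonstration, not the inputs used in the paper.

def write_opt_input(name, atoms, charge=0, multiplicity=1):
    """atoms: list of (element, x, y, z) tuples in Angstrom."""
    lines = ["%chk={}.chk".format(name),
             "# B3LYP/6-31G** Opt Pop=NBO",   # optimization + NBO population
             "",
             "{} base pair optimization".format(name),
             "",
             "{} {}".format(charge, multiplicity)]
    lines += ["{:2s} {:12.6f} {:12.6f} {:12.6f}".format(el, x, y, z)
              for el, x, y, z in atoms]
    lines.append("")  # Gaussian inputs end with a blank line
    with open(name + ".com", "w") as fh:
        fh.write("\n".join(lines))

# Toy two-atom placeholder (a real base pair would list all atoms).
write_opt_input("AA_HS_cis", [("N", 0.0, 0.0, 0.0), ("H", 0.0, 0.0, 1.01)])
```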
Fig. 1. Optimized geometries of the cis (first five) and trans (last five) H:S base pairs
3 Results and Discussion
We carried out geometry optimizations for ten selected base pair geometries, five belonging to each of the cis and trans H:S base pair families. We have obtained minimum energy structures, which correspond to at least local minima on the intrinsic potential energy surfaces of these isolated molecular systems. All the studied structures are intrinsically stable and do not deviate much from their respective crystal geometries. The main geometrical parameters of these structures are summarized in Table 1. All optimized structures possess at least two hydrogen bonds and can therefore be considered stable base pairs. In order to characterize the hydrogen bonding in these base pairs, AIM and NBO analyses have been carried out. An analysis of the results of our computations on base pairs of the H:S family revealed several features which could be helpful in gaining insight into the physicochemical basis of RNA structures and in developing a correlation between structural aspects and functional characteristics. The 2′-OH group of the ribose sugar seems to have an equal tendency to behave as a hydrogen bond donor and as an acceptor towards the base of the other nucleotide, as revealed by the short hydrogen bonding distances and relatively linear bond angles (Table 1). The additional hydrogen bonds formed by the 2′-OH group of the sugar of a base with its partner base seem to be strong enough to provide intrinsic stability to these base pairs. Except for the A:U H:S cis and C:C H:S cis geometries, all other base pairs are stabilized by at least one hydrogen bond involving O2′ interactions. In the H:S cis base pairs, wherever an O2′ interaction is involved, O2′ always acts as a donor atom. In the base pairs of the H:S trans family, O2′ acts as an acceptor atom in the A:A and A:C geometries, and as a donor atom in the A:G, C:C and C:A base pairs. Thus the dual ability of the 2′-OH group of the base interacting through its sugar edge provides additional stabilization to these base pairs. Occasionally, a third hydrogen bond also contributes to the stabilization of the studied H:S cis and trans binding patterns. Two of the five studied nucleoside-base systems (called base pairs for simplicity) of the H:S trans family, viz. A:A H:S trans and A:G H:S trans, possess three hydrogen bonds, but the nature of the hydrogen bonding differs among these bonds. In the case of A:A H:S trans, a strong hydrogen bond involving an N-H...N interaction and a weaker hydrogen bond involving a C-H...N interaction stabilize the interbase contacts; an additional hydrogen bond of N-H...O type, involving an O2′ interaction, is also present. In the case of A:G, both interbase hydrogen bonds involve N-H...N interactions, and O2′ provides additional stabilization by forming a hydrogen bond with N6 of adenine.

Because the electron density of hydrogen atoms obtained using X-ray crystallography is very weak, there is ambiguity in the assignment of the positions of hydrogen atoms in crystal structures of macromolecules. As a result, the correct assignment of donor-acceptor tendencies to particular atoms forming hydrogen bonds is an exceedingly difficult task and is prone to many errors. The optimized geometries of the base pairs can correctly describe the intermolecular interactions in such well-defined, intrinsically stable building blocks of RNA.
Table 1. Selected geometrical parameters of the studied base pairs (distances in Å, angles in degrees)

Family      Base Pair  Bond               D-A    A-H    D-H    D-H-A
H:S cis     A:A        O2'-H(A1)...N7     2.79   1.88   0.98   153.4
            A:A        N6-H(A2)...N3      3.00   2.08   1.02   160.0
            A:C        O2'-H(C)...N7      2.82   1.87   0.99   161.5
            A:C        N6-H(A)...O2       2.86   1.85   1.02   172.4
            C:C        C5-H(C1)...O2      3.23   2.49   1.08   124.5
            C:C        N4-H(C1)...O2      2.88   1.93   1.02   154.5
            A:U        N6-H(A)...O2       3.01   2.01   1.01   170.7
            A:U        C1'-H(U)...N7      3.30   2.45   1.09   133.6
            C:U        N4-H(C)...O2       3.10   2.13   1.01   162.8
            C:U        C5-H(C)...O2       3.52   2.46   1.08   167.0
H:S trans   A:A        C2-H(A1)...N7      3.48   2.48   1.09   152.5
            A:A        N6-H(A2)...O2      3.01   2.12   1.01   145.6
            A:A        N6-H(A2)...N3      3.08   2.09   1.02   165.0
            A:C        N6-H(A)...O2       2.98   2.24   1.01   129.8
            A:C        N6-H(A)...O2       3.17   2.16   1.02   167.5
            C:A        O2'-H(A)...N4      2.89   1.92   0.98   165.3
            C:A        N4-H(C)...N3       3.03   2.02   1.03   172.0
            C:C        N4-H(C)...O2       2.86   1.92   1.02   153.1
            C:C        O2'-H(C1)...N4     2.89   1.92   0.98   170.4
            A:G        N2-H(G)...N7       3.04   2.03   1.02   170.5
            A:G        O2'-H(G)...N6      2.93   1.98   0.98   162.6
            A:G        N6-H(A)...N3       3.18   2.20   1.02   160.5
For example, in the case of the C:C H:S cis, C:A H:S cis, C:C H:S trans and A:G H:S trans geometries, O2′ is considered as an acceptor atom by Leontis et al. [4] on the basis of crystal structure analysis. The full optimization of the crystal geometry, however, clearly shows this oxygen atom behaving as a donor in these cases. Some hydrogen bonds are not assigned in the crystal geometries of these base pairs, but the optimized geometries unambiguously show their presence. For example, the O2′ interaction is not considered in the crystal geometries of the A:C and C:U H:S cis base pairs. Similarly, the hydrogen bonding interaction of C5-H with O2 is not considered in the C:C H:S cis geometry, and the weak interaction of C1′-H with N7 is not considered in the crystal geometry of A:U H:S cis. In the case of A:A H:S cis, the O2′-H...N7 interaction can be considered a hydrogen bond according to geometric criteria, but the results obtained using NBO and AIM analyses do not support this assumption (discussed later); thus the nature of this interaction remains ambiguous. The accurate strength and nature of the interactions in such hydrogen bonded complexes can therefore be assigned only using optimized geometries. Analysis of the charge density (ρ(rc)) and the associated Laplacian of the charge density at the hydrogen bond critical points (HBCP) provides a good measure for predicting the strengths of hydrogen bonds [18]. In general, the values of the charge density and the Laplacian of the charge density lie between 0.002-0.34 and
0.016-0.13 a.u., respectively. The values of these two parameters lie well within these proposed limits for the studied complexes. The values suggest that C-H...O and C-H...N bonds are much weaker than the other hydrogen bonds, which involve O and N as donor atoms. However, the values of ρ are very close to each other for the O-H...N, N-H...N and N-H...O hydrogen bonds; hence the relative strength of these bonds cannot be predicted with accuracy. The O2′-H...N7 interaction in the A:A H:S cis geometry, although satisfying the geometrical criterion for hydrogen bonding (Table 1), cannot be regarded as a proper hydrogen bond, owing to its much weaker electron density at the HBCP compared with similar hydrogen bonds in the other studied complexes. This is further substantiated by the absence of an interaction between the nonbonding orbital of N7 and the antibonding orbital of the O2′-H bond in the NBO analysis.

Table 2. Results of AIM and NBO analysis for the studied base pairs (ρ and ∇²ρ at the HBCPs in a.u.; occupancies n_a and σ*(D-H) in e)

Bonds, listed by family and base pair:
H:S cis    A:A: O2'-H(A1)...N7, N6-H(A2)...N3;  A:C: O2'-H(C)...N7, N6-H(A)...O2;  C:C: C5-H(C1)...O2, N4-H(C1)...O2;  A:U: N6-H(A)...O2, C1'-H(U)...N7;  C:U: N4-H(C)...O2, C5-H(C)...O2
H:S trans  A:A: C2-H(A1)...N7, N6-H(A2)...O2, N6-H(A2)...N3;  A:C: N6-H(A)...O2, N6-H(A)...O2;  C:A: O2'-H(A)...N4, N4-H(C)...N3;  C:C: N4-H(C)...O2, N4-H(C)...O2;  A:G: N2-H(G)...N7, O2'-H(G)...N6, N6-H(A)...N3

ρ:        0.005 0.021 0.024 0.022 0.009 0.026 0.022 0.012 0.012 0.019 0.023 0.013 0.018 0.024 0.021 0.026 0.033 0.025 0.029 0.018
∇²ρ:      0.040 0.091 0.136 0.124 0.034 0.079 0.062 0.035 0.031 0.053 0.056 0.048 0.046 0.123 0.091 0.084 0.073 0.062 0.065 0.045
n_a:      1.89257 1.88982 1.89277 1.95478 1.96017 1.96017 1.99974 1.99944 1.95618 1.97102 1.92191 1.70220 1.70220 0.98520 0.98632 1.97160 1.89000 1.89162 1.77412 0.95337 0.88855 0.95236
σ*(D-H):  0.04161 0.05501 0.03889 0.01435 0.03512 0.02777 0.03875 0.02103 0.01934 0.02952 0.02465 0.03249 0.00583 0.01758 0.04718 0.05030 0.04798 0.03168 0.01929 0.01994 0.01466
E(2):     13.60 19.35 10.89 0.09 5.75 4.52 2.10 3.86 1.95 3.13 5.48 10.80 1.38 2.68 16.18 15.82 9.44 17.41 6.23 6.89 3.86
We had earlier shown a strong linear relationship between the sum of the electron densities at the HBCPs and the binding energy of base pairs. Using the same criterion, the strength of binding of these base pairs can be estimated to decrease in the order A:G H:S trans > C:C H:S trans > A:A H:S trans > A:C H:S cis > C:A H:S trans > C:C H:S cis > A:U H:S cis > A:C H:S cis > A:A H:S cis. The strength of binding could not be predicted for the C:U H:S cis geometry, owing to the very high nonplanarity of this system and the presence of two critical points between some atoms.
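A minimal sketch of this ranking procedure is given below; the per-bond ρ values in the dictionary are hypothetical placeholders, not the entries of Table 2.

```python
# Sketch of the ranking idea used above: order base pairs by the sum of the
# electron densities rho at their hydrogen-bond critical points (a.u.), which
# correlates linearly with the binding energy. The per-bond values below are
# hypothetical placeholders for illustration only.

rho_at_hbcp = {                       # base pair -> rho at each HBCP (a.u.)
    "pair A (three H-bonds)": [0.030, 0.025, 0.020],
    "pair B (two H-bonds)":   [0.028, 0.031],
    "pair C (two H-bonds)":   [0.006, 0.020],
}

ranking = sorted(rho_at_hbcp, key=lambda bp: sum(rho_at_hbcp[bp]), reverse=True)
for bp in ranking:
    print("{:25s} sum(rho) = {:.3f}".format(bp, sum(rho_at_hbcp[bp])))
```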
NBO analysis was carried out in order to understand the physical nature of the intermolecular contacts in these base pair complexes. The charge transfer interactions between the nonbonding orbital of the acceptor and the antibonding σ* orbital of the donor-hydrogen (D-H) bond are of crucial importance in hydrogen bonded complexes. The occupancies of the nonbonding orbitals of the acceptor atoms (n_a) and of the antibonding orbitals of the D-H bonds (σ*(D-H)) are listed in Table 2. The occupancies of the σ*(C-H), σ*(N-H) and σ*(O-H) bonds lie between 0.01435-0.02952, 0.00583-0.05031 and 0.01994-0.05501 e (e stands for one electronic charge), respectively, showing the greatest charge transfer in hydrogen bonds where O2′ acts as a donor atom. Moreover, the charge transfer stabilization energy E(2) is also greatest for the O2′-H...N bonds. These observations further substantiate the importance of O2′ interactions in the stabilization of these base pairing interactions. As discussed earlier, no charge transfer interaction between the n_a and σ*(D-H) orbitals is apparent from the NBO analysis for the O2′-H...N7 interaction in the A:A H:S cis base pair; thus, it cannot be regarded as a hydrogen bond.
4 Conclusion
In this paper, we present the first ab initio studies on base pairs belonging to the cis and trans H:S families. In addition to the characterization of hydrogen bonds, these calculations refine the crystallographic data and evaluate the intrinsic stabilities of these base pairs. None of the studied structures shows any difference in edge interaction or glycosidic bond orientation compared with its respective crystal geometry; hence, they can be considered well defined, intrinsically stable building blocks of RNA. The calculations thus unambiguously show that hydrogen bonding is the most important stabilizing interaction in these noncanonical base pairs. In addition to the assignment of the correct donor-acceptor interactions, these calculations also highlight the importance of interactions involving the sugar O2′ in RNA base pairing. Previous studies [4] on the detection of base pairs in RNA structures considered the oxygen atom of the sugar hydroxyl group as a hydrogen bond acceptor only. We find that in many cases this hydroxyl group acts as a hydrogen bond donor. Recently it was also indicated from statistical analysis that O-H...N is a very strong hydrogen bond [19]. This indicates the need to revisit the problem of base pair finding by automated tools such as BPFind [11] or X3DNA [20]. Further studies on the evaluation of interaction energies, the complete quantum chemical characterization, and the physicochemical impact of changes in edge orientation on optimization in other members of these RNA base pairing families are in progress.

Acknowledgements. We thank the Department of Biotechnology, Govt. of India, for supporting this work (Grant No. BT/PR5451/BID/07/111/2004) and CDAC, Pune, for computational support. PS thanks CSIR, Govt. of India, for a research fellowship. We also thank Prof. Jiří Šponer, Institute of Biophysics, Czech Republic, for useful discussions.
References ˇ 1. Cerny, J., Hobza, P.: Non-covalent interactions in biomacromolecules. Phys. Chem. Chem. Phys. 9, 5291–5303 (2007) ˇ 2. Sponer, J., Hobza, P.: Molecular interactions of nucleic acid bases. A review of quantum-chemical studies. Collect; Czech Chem. Commun. 68, 2231–2282 (2003) 3. Leontis, N.B., Westhof, E.: Geometric nomenclature and classification of RNA base pairs. RNA 7, 499–512 (2001) 4. Leontis, N.B., Stombaugh, J., Westhof, E.: The non-Watson-Crick base pairs and their associated isostericity matrices. Nucleic Acids Res. 30, 3497–3531 (2002) ˇ ˇ 5. Sponer, J.E., Spackov, N., Pulhanek, P., Lesczynski, J., Sponer, J.: Non-WatsonCrick Base Pairing in RNA. Quantum chemical analysis of the cis WatsonCrick/Sugar Edge base pair family. J. Phys. Chem. A 109, 2292–2301 (2005) 6. Das, R., Baker, D.: Automated de novo prediction of native-like RNA tertiary structures. Proc. Natl. Acad. Sci. 104, 14664–14669 7. Bhattacharyya, D., Koripella, S.C., Mitra, A., Rajendran, V.B., Sinha, B.: Theoretical analysis of noncanonical base pairing interactions in RNA molecules. J. Biosci. 32, 809–825 (2007) 8. Sharma, P., Mitra, A., Sharma, S., Singh, H.: Basepairing in RNA: A computational study of structural aspects and interaction energies. J. Chem. Sci. 119, 525–531 (2007) 9. Sharma, P., Mitra, A., Sharma, S., Singh, H., Bhattacharyya, D.: Quantum chemical studies of noncanonical RNA base pairs: The trans Watson-Crick: Watson-Crick family. J. Biomol. Struct. Dyn. (in press) 10. Frisch, M.J., et al.: Gaussian03, revision B.05; Gaussian, Inc.: Pittsburgh, PA (2003) 11. Das, J., Mukherjee, S., Mitra, A., Bhattacharyya, D.: Noncanonical base pairs and higher order structures in nucleic acids: Crystal structure database analysis. J. Biomol. Struct. Dyn. 24, 149–162 (2006) ˇ ˇ 12. Sponer, J.E., Spackova, L.J., Sponer, J.: Principles of RNA Base Pairing: Structures and Energies of the Trans Watson-Crick/Sugar Edge Base Pairs. J. Phys. Chem. B 109, 11399–11410 (2005) 13. Becke, A.D.: Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 98, 5648–5652 (1993) 14. Lee, C., Yang, P.R.G.: Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B 37, 785–789 (1988) ˇ 15. Sponer, J., Jurecka, P., Hobza, P.: Accurate Interaction Energies of HydrogenBonded Nucleic Acid Base Pairs. J. Am. Chem. Soc. 126, 10142–10151 (2004) 16. Reed, A.E., Weinstock, R.B., Weinhold, F.J.: Natural population analysis. J. Chem. Phys. 83, 735–746 (1985) 17. Bader, R.F.W.: Atoms in molecules. A Quantum Theory. The Clarendon Press, Oxford (1990) 18. Kolandaivel, P., Nirmala, V.: Study of proper and improper hydrogen bonding using Bader’s atoms in molecules (AIM) theory and NBO analysis. J. Mol. Struct. 694, 33–38 (2004) 19. Roy, A., Bhattacharyya, M., Bhattacharyya, D.: Structures, Stability and Dynamics of Canonical and Noncanonical base pairs: Quantum Chemical Studies. J. Phys. Chem. B (accepted) 20. Lu, X.J., Olson, W.K.: 3DNA: a software package for the analysis, rebuilding and visualization of three-dimensional nucleic acid structures. Nucleic Acids. Res. 31, 5108–5121 (2003)
Design of Optimal Laser Fields to Control Vibrational Excitations in Carboxy-myoglobin

Harjinder Singh1, Sitansh Sharma1, Praveen Kumar2, Jeremy N. Harvey3, and Gabriel G. Balint-Kurti3

1 Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
[email protected]
2 Department of Chemistry, Panjab University, Chandigarh 160014, India
3 Centre for Computational Chemistry and School of Chemistry, University of Bristol, Bristol BS8 1TS, UK
Abstract. Optimal control theory is applied to obtain infrared laser pulses for the selective vibrational excitation of a model of carboxy-myoglobin with two mathematical dimensions. Density functional theory is used to obtain the potential energy and dipole moment surfaces of the active site model. The conjugate gradient method is employed to optimize the cost functional and to obtain the optimized laser pulses. Optimized laser fields are found which give virtually 100% excitation probability to preselected vibrational levels.
1 Introduction
Application of control theory [1,2,3,4,5,6,7,8,9,10,11] to biological systems is quite difficult because of their large size and complexity. The availability of ultrafast and intense mid-IR pulses has opened the way to achieving laser control in molecules such as proteins. Hemoglobin and myoglobin are among the most studied proteins; in particular, the structure and dynamics of the Fe-CO bond have been studied extensively for several years [12,13,14,15]. The possibility of laser control of dynamics in biological systems using frequencies in the visible regime has been established by various groups [15]. Recently, Joffre et al. demonstrated coherent vibrational climbing up to the v = 6 state of the C-O vibration in carboxy-hemoglobin using mid-IR chirped ultrashort pulses [16]. Our aim is to use optimal control theory (OCT) to design laser pulses for the selective vibrational excitation of the Fe-C-O fragment of the active site of carboxy-myoglobin. Peirce et al. [11] introduced the concept of OCT in molecular control, in which the problem is posed as the optimization of a cost functional. We have applied optimal control theory to the selective vibrational excitation of the Fe-C bond at the active site of carboxy-myoglobin. We have treated the interaction of the molecule with the electric field within the dipole approximation. As the dipole moment of the C-O bond in the Fe-C-O fragment is much higher than that of the Fe-C bond, the C-O dipole will interact more strongly with the applied laser field and hence will absorb more energy. A reasonable physical mechanism for the
excitation of the Fe-C vibrational motion would be to first deposit energy in the C-O bond and then to selectively transfer this energy into the Fe-C bond. For the theoretical design of a laser field, we have performed optimal control calculations on the Fe-C-O portion of the optimized structure. The potential energy and dipole moment surfaces are obtained using DFT calculations of a 60-atom model of the active site of the system. The Fe-C and C-O separations are systematically varied while all other internal coordinates are held fixed at their values in the equilibrium geometry. The paper is organized as follows: in Sec. 2.1 we present the details of the electronic structure calculations and the description of the active site under consideration; in Sec. 2.2 the optimal control formulation for the triatomic system (Fe-C-O) is described in detail; Sec. 2.3 describes the conjugate gradient method used for optimisation of the functional constructed in Sec. 2.2. In Sec. 3 the results are discussed in detail, and finally in Sec. 4 we conclude with a summary.
2 Theory and Method

2.1 Electronic Structure Calculations and Active Site Model
In the present study, ab initio electronic structure calculations are performed using the Gaussian03 suite [17] of quantum chemical programmes. In order to model the active site of carboxy-myoglobin, we chose hexa-coordinated iron-porphyrin-imidazole-CO as our model structure, where the imidazole ligand is used to mimic the effect of the proximal histidine residue. Geometry optimization of this model structure was performed using density functional theory (DFT) with the popular B3LYP hybrid extended-correlation functional. For iron, oxygen, nitrogen and carbon we used the 6-311G* basis set, while the 6-31G basis was used for the hydrogen atoms. We have chosen the Fe-C-O fragment to be our control system, treating the rest of the molecule as frozen. Crystallographic studies of carboxy-myoglobin (MbCO) show an almost linear Fe-C-O geometry as well as a nearly perfect square planar symmetry with respect to the nitrogen plane formed by the four pyrrole N-ligands of the heme iron [18]. Figure 1 shows that the Fe-C-O fragment in the optimized structure is almost straight, so we have taken the Fe-C-O fragment to be linear throughout our optimal field calculations. This is substantiated by the geometrical parameters obtained from the electronic structure calculations. The variations of the potential energy and dipole moment with changes in the bond lengths (Fe-C, C-O) are shown in Fig. 2.
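The two-coordinate scan described above can be organized as in the following sketch, which is purely illustrative: the grid limits are assumed for demonstration, and single_point_energy is a placeholder for the actual electronic-structure call.

```python
# Illustrative sketch of the two-coordinate scan: the Fe-C and C-O separations
# are varied systematically while all other internal coordinates stay frozen,
# and each grid point is passed to an external single-point energy calculation.
# Grid limits and the energy routine are placeholders, not the actual setup.
import numpy as np

fe_c = np.linspace(1.2, 2.4, 13)   # Fe-C distance grid (Angstrom), assumed range
c_o  = np.linspace(0.9, 1.5, 13)   # C-O distance grid (Angstrom), assumed range

def single_point_energy(r_fec, r_co):
    # Placeholder: in practice this would rebuild the 60-atom model with the two
    # bond lengths set to (r_fec, r_co) and run a DFT single-point calculation.
    return 0.0

pes = np.array([[single_point_energy(r1, r2) for r2 in c_o] for r1 in fe_c])
print(pes.shape)   # (13, 13) table of energies, one per (Fe-C, C-O) pair
```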
2.2 Dynamical Evolution and Application of Optimal Control Theory
In our control system (Fe-C-O) there are two bonds, and the problem is formulated using the Jacobi coordinates R and r, shown in Fig. 1(b), where the coordinate R is the distance of the iron atom from the center of mass of the C-O bond and r is the carbon-oxygen bond length. Our aim is to design a laser pulse that can selectively
Fig. 1. (a) Geometry optimized hexa-coordinated active site model, obtained using DFT, where the imidazole ring mimics the effect of the proximal histidine residue; (b) Jacobi coordinate representation of linear triatomic fragment Fe-C-O, where R = distance of Fe from center of mass of C-O bond while r = bond length of C-O bond
transfer population from an initial vibrational state to a desired target vibrational state at a fixed time. For the design of such a laser pulse we construct a cost functional containing our objective, a penalty term for the fluence, and a dynamical constraint requiring that the time-dependent Schrödinger equation be obeyed at all times. The grand cost functional to be optimized takes the following form [7,11]:

J[\epsilon(t)] = \bigl|\langle \psi_i(T)|\phi_f\rangle\bigr|^2 - \alpha_0 \int_0^T [\epsilon(t)]^2 \, dt - 2\,\mathrm{Re}\!\left[ \int_0^T \Bigl\langle \chi_f(t) \Bigm| \frac{\partial}{\partial t} + i\hat{H}(\epsilon(t)) \Bigm| \psi_i(t) \Bigr\rangle dt \right],    (1)
where the first term contains the physical objective, the second accounts for the penalty on the pulse energy, and the last term comes from the dynamical constraint. The function ψi(t) is the initial wavefunction propagated to time t under the action of the optimal laser field ε(t), and φf is the target state to be reached at the final time T. The function χf(t) can be regarded as a Lagrange multiplier introduced to ensure satisfaction of the time-dependent Schrödinger equation. The factor α0 is a constant positive weighting parameter that specifies the weight of the pulse energy in the functional. The time evolution of the quantum state |ψ(t)⟩ under the influence of an external field ε(t), within the dipole approximation, is governed by the time-dependent Schrödinger equation (atomic units are used throughout the paper, ℏ = 1),

i \frac{\partial}{\partial t} |\psi(t)\rangle = \hat{H}\,|\psi(t)\rangle = \bigl[\hat{H}_0 - \hat{\mu}\,\epsilon(t)\bigr]\,|\psi(t)\rangle,    (2)

where Ĥ and μ̂ are the system Hamiltonian and the dipole moment operator, respectively (see [7] for further details). Variation of the grand functional (Eq. (1)) with respect to small independent changes in the wave function ψi(t), the Lagrange multiplier χf(t) and the electric field ε(t) leads to the following set of nonlinear pulse design equations for vibrational control in the Fe-C-O part of carboxy-myoglobin:
Fig. 2. Variation in potential energy and z-component (along Fe-C-O) of dipole moment with change in bond lengths of Fe-C and C-O bonds, obtained using DFT
i \frac{\partial \psi_i(t)}{\partial t} = \hat{H}\,\psi_i(t), \qquad \psi_i(0) = \phi_i,    (3)

i \frac{\partial \chi_f(t)}{\partial t} = \hat{H}\,\chi_f(t), \qquad \chi_f(T) = \langle \phi_f | \psi_i(T) \rangle\,\phi_f,    (4)

\alpha_0\,\epsilon(t) = \mathrm{Im}\!\left( \Bigl\langle \chi_f(t) \Bigm| \frac{\partial \hat{H}}{\partial \epsilon(t)} \Bigm| \psi_i(t) \Bigr\rangle \right).    (5)
These coupled differential equations can be solved iteratively. In order to maximize our cost functional (Eq. (1)), subject to Eqs. (3) and (4), we have used the conjugate gradient method discussed in detail in Sec. 2.3.
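A schematic of this iterative solution of Eqs. (3)-(5) is sketched below in Python. It is not the production code: propagate_step is a first-order stand-in for the split-operator propagator used in the actual calculations, and H0, mu, phi_i and phi_f are assumed to be supplied as NumPy arrays.

```python
# Minimal sketch of one iteration of the coupled pulse-design equations (3)-(5):
# propagate psi forward under the current field, set the final condition on chi,
# propagate chi backward, then update the field from Eq. (5).
import numpy as np

def propagate_step(psi, H0, mu, eps, dt):
    # Placeholder one-step propagator for H = H0 - mu*eps (a real implementation
    # would use the second-order split-operator scheme cited in the text).
    H = H0 - mu * eps
    return psi - 1j * dt * (H @ psi)        # first-order Euler stand-in

def oct_iteration(eps, H0, mu, phi_i, phi_f, dt, alpha0):
    nt = len(eps)
    # forward propagation of psi_i, Eq. (3)
    psi = [phi_i]
    for k in range(nt - 1):
        psi.append(propagate_step(psi[-1], H0, mu, eps[k], dt))
    # final condition and backward propagation of chi_f, Eq. (4)
    chi = [np.vdot(phi_f, psi[-1]) * phi_f]
    for k in range(nt - 1, 0, -1):
        chi.insert(0, propagate_step(chi[0], H0, mu, eps[k], -dt))
    # field update from Eq. (5): with dH/deps = -mu, alpha0*eps = -Im<chi|mu|psi>
    return np.array([-np.imag(np.vdot(chi[k], mu @ psi[k])) / alpha0
                     for k in range(nt)])
```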
2.3 Conjugate Gradient Method
In order to theoretically design a laser pulse that can selectively control vibrational excitation in the iron-carbon part of carboxy-myoglobin, calculations are carried out to maximize the value of the cost functional J of Eq. (1) using the conjugate gradient method [7]. After k iterations of the optimization cycle, the gradient of J with respect to ε at time t is given by

\frac{\partial J}{\partial \epsilon^k(t)} = -2 \left[ \alpha_0\,\epsilon^k(t) - \mathrm{Im}\!\left( \Bigl\langle \chi_f^k(t) \Bigm| \frac{\partial \hat{H}}{\partial \epsilon^k(t)} \Bigm| \psi_i^k(t) \Bigr\rangle \right) \right].    (6)

In addition, a Gaussian envelope function s(t) = e^{-(t-T/2)^2/(T/4)^2} is preserved throughout the optimization (see [7]) to ensure that the optimized field switches on and off smoothly. The wave function ψi(t) and the Lagrange multiplier χf(t) are propagated in time using the second-order split-operator method [19]. In order to determine the maximum value of the cost functional, a line search is performed along the Polak-Ribiere-Polyak search direction calculated using Eq. (6) as

d^k(t_i) = g^k(t_i) + \zeta^k\, d^{k-1}(t_i),    (7)

where

\zeta^k = \frac{\sum_i g^k(t_i)^T \bigl( g^k(t_i) - g^{k-1}(t_i) \bigr)}{\sum_i g^{k-1}(t_i)^T\, g^{k-1}(t_i)},    (8)

k = 2, 3, \ldots, and d^1(t_i) = g^1(t_i). To prevent the algorithm from sampling ε(t) values outside of the desired field amplitude range [\epsilon_{\min}, \epsilon_{\max}] during the line search, the direction d^k(t) is projected. The projected search direction is Fourier transformed to obtain a function of frequency, and this function is filtered using a 20th-order Butterworth bandpass filter in order to restrict the frequency components of the electric field to a specified range. The field is updated using the expression

\epsilon^{k+1}(t) = \epsilon^k(t) + \lambda\, d_p^k(t),    (9)

where λ is determined by the line search.
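The field-update machinery of Eqs. (6)-(9) can be sketched as follows (illustrative only): the gradient arrays are assumed to have already been evaluated from Eq. (6) on the time grid, and a SciPy Butterworth filter stands in for the 20th-order bandpass filter mentioned above.

```python
# Sketch of the conjugate-gradient field update, Eqs. (7)-(9). Inputs g_k and
# g_prev are gradient arrays on the time grid (assumed precomputed from Eq. (6));
# the filter order and band edges below are stand-in values.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def prp_direction(g_k, g_prev, d_prev):
    """Polak-Ribiere-Polyak search direction, Eqs. (7)-(8)."""
    zeta = np.dot(g_k, g_k - g_prev) / np.dot(g_prev, g_prev)
    return g_k + zeta * d_prev

def bandpass(d, dt, f_lo, f_hi, order=10):
    """Restrict the frequency content of the search direction (Butterworth
    bandpass stand-in); f_lo, f_hi in cycles per time unit."""
    sos = butter(order, [f_lo, f_hi], btype="bandpass", fs=1.0 / dt, output="sos")
    return sosfiltfilt(sos, d)

def update_field(eps_k, d_k_filtered, lam):
    """Eq. (9): eps_{k+1}(t) = eps_k(t) + lambda * d_p^k(t)."""
    return eps_k + lam * d_k_filtered
```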
3 Results and Discussion
We now discuss the results obtained using the optimal control formalism described above for two vibrational excitations of the molecule. The first excitation is from the ground vibrational state to the vibrational state corresponding to a C-O stretch (v = 0 → v = 5). The second is an excitation from the ground vibrational
Fig. 3. Results for the transitions v = 0 → v = 5 and v = 0 → v = 1: (a1, b1) convergence behavior of the transition probability and the cost functional J with the number of iterative steps involved in the optimization; (a2, b2) optimized laser field; (a3, b3) its frequency distribution; and (a4, b4) populations of different states as a function of time
state to the vibrational state corresponding to the Fe-C stretch (v = 0 → v = 1). The Fourier Grid Hamiltonian method [20] is used to create a direct product basis for the variational calculation of the triatomic vibrational wave functions of the Fe-C-O system. For the time propagation, we employed 64 grid points for both the R and r coordinates. In order to impose constraints on the total laser fluence, the constant penalty parameter α0 in Eq. (1) is set to 0.1 for both transitions. (a) Transition v = 0 → v = 5: For this transition we have propagated the system (Fe-C-O) using a pulse of length 40,000 ℏ/Eh, which is divided into 16384 time steps. The initial guess laser field is chosen as ε(t) = 0.0075 · sin(ω_{0,5} t/2) s(t), where ω_{0,5} is the transition frequency corresponding to the v = 0 → v = 5 vibration
and s(t) is the envelope function (see above). We illustrate the results obtained in Fig. 3(a1-a4). The convergence of the transition probability and the cost functional with the number of iteration steps in the optimization is shown in plot a1. The figure shows that the dynamical goal is reached essentially perfectly, with nearly 100% population transfer to the desired state. Plot a2 shows the converged optimized electric field as a function of time. The maximum amplitude of the electric field is 0.0051 Eh/ea0. The frequency distribution of the optimized field is shown in plot a3. The power spectrum shows that the major component of the optimized electric field has a frequency of 1014 cm-1, while the amplitude of the component at the frequency corresponding to the C-O stretch vibration (2005 cm-1) is considerably smaller, which implies a multiphoton nature of the excitation process. Small contributions from higher frequencies are also seen in plot a3. The mechanism of population transfer from the initial vibrational state to the final vibrational state with time is shown in plot a4. As time increases, the population of the v = 0 state decreases while that of the target vibrational state keeps increasing. Up to a time of 32,000 ℏ/Eh the system oscillates between these vibrational states, and at the end of the pulse the entire population is transferred to the target v = 5 vibrational state. (b) Transition v = 0 → v = 1: In this case, we have propagated the system for 64,000 ℏ/Eh of time. The pulse length is divided into 32768 time steps. The initial guess laser field is chosen as ε(t) = 0.0001 · sin(ω_{0,5} t) s(t), where ω_{0,5} is the same transition frequency as chosen for the previous transition. The convergence of the transition probability and the cost functional with the number of iterative steps is shown in Fig. 3(b1). The figure shows that the desired goal is achieved, but clearly population has been transferred to some more highly excited states during the excitation process. Further work is in progress to better characterize the excitation process and to try to find better electric fields to achieve the desired goal. The optimized electric field is shown in plot b2 of Fig. 3. The frequency spectrum of the optimized field, shown in plot b3, is quite complex, showing contributions from many frequencies. Plot b4 shows the evolution of the populations of the lowest six vibrational states as a function of time.
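As a hedged illustration of how frequency distributions such as those in plots a3 and b3 can be obtained, the sketch below Fourier transforms a field sampled on the atomic-unit time grid and converts the frequency axis to cm-1; the field itself is a toy placeholder, not the optimized pulse.

```python
# Power spectrum of a field sampled in atomic units (time in hbar/Eh). The
# Hartree-to-cm^-1 conversion factor is standard; the pulse below is a toy
# placeholder, not the optimized field of Fig. 3.
import numpy as np

HARTREE_TO_CM = 219474.63          # 1 Eh expressed in cm^-1

def field_spectrum(eps, dt):
    """Return (wavenumber in cm^-1, spectral intensity) of a real field eps(t)."""
    amp = np.abs(np.fft.rfft(eps))**2
    freq = np.fft.rfftfreq(len(eps), d=dt)            # cycles per (hbar/Eh)
    wavenumber = 2.0 * np.pi * freq * HARTREE_TO_CM   # photon energy in cm^-1
    return wavenumber, amp

# Toy pulse on a 40,000 hbar/Eh grid of 16384 points, with a Gaussian envelope.
t = np.linspace(0.0, 40000.0, 16384)
eps = 0.005 * np.sin(0.00914 * t) * np.exp(-((t - 20000.0) / 10000.0) ** 2)
wn, amp = field_spectrum(eps, t[1] - t[0])
print(wn[np.argmax(amp[1:]) + 1])   # dominant wavenumber of this toy pulse
```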
4 Conclusion
Our optimal control calculations of vibrational excitations in the active part of the biologically important molecule carboxy-myoglobin provide laser fields which lead to selective vibrational excitations within the molecule, subject to the simplifications inherent in the model of the system. We have discussed our results for two vibrational excitation processes. The shape of the optimized electric field and its corresponding frequency spectrum are quite complex for the second transition considered, namely v = 0 → v = 1, corresponding essentially to excitation of the Fe-C bond. We are able to achieve our dynamical goals with nearly 100% population transfer at the end of the pulse. The evolution of the vibrational states involved in both transitions shows that there is some missing population. This population must have been transferred temporarily to higher-
lying vibrational states. Another pertinent issue is a change in the polarity of the triatomic for Fe-C bond distances beyond the range shown in Fig. 2. It is possible that this introduces some complexity into the molecule-radiation interaction. These aspects are being investigated further. Application of OCT to the photodissociation of the Fe-C bond in carboxy-myoglobin is also in progress.

Acknowledgements. We thank the Royal Society and the British Council, India, and DST, GOI, for supporting this work. SS thanks the Council of Scientific and Industrial Research, Government of India, for a research fellowship.
References 1. Rice, S., Zhao, M.: Optical Control of Molecular Dynamics. John Wiley & Sons, New York (2000) 2. Shapiro, M., Brumer, P.: Principles of the Quantum Control of Molecular Processes. John Wiley & Sons, New York (2003) 3. Ohtsuki, Y., Nakagami, K., Fujimura, Y.: Quantum optimal control of multiple targets: Development of a monotonically convergent algorithm and application to intramolecular vibrational energy redistribution control. J. Chem. Phys. 114, 8867– 8876 (2001) 4. Tannor, D.J., Rice, S.A.: Control of selectivity of chemical reaction via control of wave packet evolution. J. Chem. Phys. 83, 5013–5018 (1985) 5. Tannor, D.J., Kosloff, R., Rice, S.A.: Coherent pulse sequence induced control of selectivity of reactions: Exact quantum mechanical calculations. J. Chem. Phys. 85, 5805–5820 (1986) 6. Tannor, D.J., Rice, S.A.: Coherent pulse sequence control of product formation in chemical reactions. Adv. Chem. Phys. 70, 441–523 (1988) 7. Balint-Kurti, G.G., Manby, F.R., Ren, Q., Artamonov, M., Ho, T., Rabitz, H.: Quantum control of molecular motion including electronic polarization effects with a two-stage toolkit. J. Chem. Phys. 122, 84110 (2005); Ren, Q., Balint-Kurti, G.G., Manby, F.R., Artamonov, M., Ho, T., Rabitz, H.: Quantum control of molecular vibrational and rotational excitations in a homonuclear diatomic molecule: A full three-dimensional treatment with polarization forces. J. Chem. Phys. 124, 14111 (2006) 8. Shapiro, M., Brumer, P.: Laser Control of Molecular Processes. Ann. Rev. Phys. Chem. 43, 257–282 (1992) 9. Gross, P., Singh, H., Rabitz, H., Mease, K., Huang, G.M.: Inverse QuantumMechanical control: A mean for design and a test of intuition. Phys. Rev. A 47, 4593–4604 (1993) 10. Sharma, S., Sharma, P., Singh, H.: Quantum control of vibrational excitations in a heteronucler diatomic molecule. J. Chem. Sci. 119(2007), 433–440 (2007); Kumar, P., Singh, H.: Controlling dynamics in a diatomic systems. J. Chem. Sci. 119, 441–447 (2007) 11. Peirce, A.P., Dahleh, M.A., Rabitz, H.: Optimal control of quantum-mechanical systems: Existence, numerical approximation, and applications. Phys. Rev. A 37, 4950–4964 (1988) 12. Franzen, S.: An Electrostatic Model for the Frequency Shifts in the Carbonmonoxy Stretching Band of Myoglobin: Correlation of Hydrogen Bonding and the Stark Tuning Rate. J. Am. Chem. Soc. 124, 13271–13281 (2002)
13. Dreuw, A., Duniez, B.D., Head-Gordon, M.: Characterization of the Relevant Excited States in the Photodissociation of CO-Ligated Hemoglobin and Myoglobin. J. Am. Chem. Soc. 124, 12070–12071 (2002) 14. Rovira, C., Schulze, B., Eichinger, M., Evanseck, J.D., Parrinello, M.: Influence of the Heme Pocket Conformation on the Structure and Vibrations of the Fe-CO Bond in Myoglobin: A QM/MM Density Functional Study. Biophys. J. 81, 435–445 (2001) 15. Meier, C., Heitz, M.C.: Laser control of vibrational excitation in carboxyhemoglobin: A quantum wave packet study. J. Chem. Phys. 123, 44504 (2005) 16. Ventalon, C., Fraser, J.M., Vos, M.H., Alexandrou, A., Martin, J.L., Joffre, M.: Coherent vibrational climbing in carboxyhemoglobin. Proc. Nat. Aac. Sci. 101, 13216–13220 (2004) 17. Frisch, M.J., et al.: Gaussian03, revison B.05, Gaussian, Inc., Pittsburgh P.A (2003) 18. Kachalova, G.S., Popov, A.N., Bartunik, H.D.: A Steric Mechanism for Inhibition of CO Binding to Heme. Prot. Sci. 284, 473–476 (1999) 19. Feit, M.D., Fleck Jr., J.A.: Solution of the Schr¨ odinger equation by a spectral method II: Vibrational energy levels of triatomic moleculesa ). J. Chem. Phys. 78, 301–308 (1983) 20. Clay, C., Marston, Balint-Kurti, G.G.: The Fourier grid Hamiltonian method for bound state eigenvalues and eigenfunctions. J. Chem. Phys. 91, 3571–3578 (1989)
Computations of Ground State and Excitation Energies of Poly(3-methoxy-thiophene) and Poly(thienylene vinylene) from First Principles

A.V. Gavrilenko, S.M. Black, A.C. Sykes, C.E. Bonner, and V.I. Gavrilenko

Center for Materials Research, Norfolk State University, 700 Park Ave, Norfolk, VA 23504
[email protected]
Abstract. Ground state and excitation energies of the poly(3-methoxy-thiophene) (PMT) and poly(thienylene vinylene) (PTV) conjugated polymers are studied by first-principles density functional theory (DFT). Two basic approaches of computational chemistry and physics are compared: time-dependent DFT (TDDFT) applied to clusters and ab initio pseudopotentials within standard DFT (PP-DFT) applied to infinite polymer chains. We demonstrate that the series of excitation energies of PMT calculated by TDDFT with increasing unit number converges well to the real, experimentally measured energy gaps. The combination of the TDDFT cluster method with the PP-DFT approach for the infinite chain provides the single-gap quasiparticle correction value needed for optical calculations. The infinite chain model is used to calculate the optical absorption of PTV.

Keywords: Density functional theory, equilibrium geometry, excitation energies, conjugated polymers.
1 Introduction
The search for inexpensive renewable energy sources has sparked considerable interest in the development of photovoltaics based on conjugated polymers and organic molecules [1,2]. In organic materials, the dominant light absorption of interest for applications is caused by π-electron transitions between the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) [3,4]. The spectral location of the optical absorption band (or the HOMO-LUMO gap value) determines the efficiency of photovoltaic and optoelectronic devices [3,5,6]. It has been understood earlier that the gap value in poly(3-methoxy-thiophene) (PMT) and poly(thienylene vinylene) (PTV) is determined by the rings and is less affected by the side groups [7]. The technologically challenging goal of bandgap engineering could be substantially supported by modeling and simulation, in addition to highlighting methods of materials modification to achieve many of the desired properties. However, realistic modeling of the ground state and excitation energies of conjugated polymers, which are required for optics, still remains a challenging task for
first-principles theories [8,9,10]. In the present work we compare different computational approaches for realistic prediction of the ground and excited states of conjugated polymers. The first approach involves modeling the atomic structure of conjugated polymers as clusters [7,11]; the excitation energies are then calculated by time-dependent density functional theory (TDDFT). The second approach involves modeling conjugated polymers as infinitely long chains, thus allowing the application of ab initio pseudopotentials within DFT (PP-DFT). We also present calculated optical absorption spectra and discuss both computational approaches in comparison to each other and to the data available in the literature.
2 Methods
Density functional theory (DFT) has been shown for decades to be very successful for the ground state analysis of different materials [12]. In order to predict the physical properties of conjugated polymer systems it is important to describe realistically both the short-range (covalent) and long-range (Coulomb and van der Waals, vdW) components of the intermolecular interactions [8]. The local density approximation (LDA) [13] and generalized gradient approximation (GGA) [14] are frequently used to account for the exchange and correlation (XC) interaction. The Coulomb part dominates the intermolecular potential energy over the higher-order vdW part. The Coulomb interaction is realistically reproduced by LDA, but the vdW interaction is not included in standard DFT. It has been shown that the equilibrium distance between an organic molecule and a solid (graphene) surface predicted by LDA is in good agreement with the value following from explicit inclusion of the vdW interaction in the interaction Hamiltonian [8], thus justifying the neglect of the vdW interaction for predicting interatomic equilibrium distances. In this work the TDDFT method as implemented in the Gaussian03 package [15,14] was used for the cluster calculations of PMT. For atomic structure optimization the B3LYP/6-31G* basis was used. Excitation energies were obtained through extrapolation of the values calculated for n = 2 to 7 repeat units. The calculated data for the n = 1 case do not obey the 1/n extrapolation, owing to the less pronounced delocalization of π-electrons in a single unit, and the relevant data are not considered here. A similar approach to obtaining the ground and excited electronic states of an infinite PMT chain by extrapolation of TDDFT cluster data was used in [11]. According to our findings, the differences in excitation energies between the B3LYP/6-31G* and B3LYP/6-31+G* basis sets become negligible for n > 5, as shown in Table 1. The effect of the basis could be an interesting point for future studies in the field; however, it is out of the scope of this work. Solvent effects were not taken into account for the cluster calculations. In addition, the PMT and PTV conjugated polymers were also modeled as infinite atomic chains [9,10], in two steps. We used the DMol3 [16,17] and CASTEP [18] (based on ab initio pseudopotentials, PP) computational packages as implemented in Materials Studio. First, the geometry relaxation was performed using the DFT-LDA method as implemented in DMol3, and equilibrium atomic structures were obtained.
Table 1. Basis set effect on the HOMO-LUMO gap and excitation energies of 3-methoxy-thiophene oligomers (in eV)

Oligomer   HOMO-LUMO Gap         Excitation Energies
Units      6-31G*    6-31+G*     6-31G*    6-31+G*
2          3.97      3.92        3.76      3.66
3          3.19      3.16        3.01      2.94
4          2.65      2.62        2.49      2.43
5          2.36      2.33        2.19      2.14
6          2.16      n/a         1.98      n/a
7          2.02      n/a         1.83      n/a
Only the Coulomb interaction, which provides the most important contribution to the interchain interaction [8], is included. The unit cell is replicated in all three dimensions; the height d3 was set to 15 Å in order to quench the interaction between the thiophene ring and the methyl group. We found only a very weak effect of d3 on the optics, and this value was therefore constrained. The fully relaxed structure was then used to calculate the electronic and optical properties of the conjugated polymer within PP-DFT as implemented in CASTEP. The three-dimensional periodicity in this case allows the application of the ab initio pseudopotential method, which makes the optical calculations straightforward [19]. Optical absorption spectra of the PTV conjugated polymers are calculated within the independent-particle picture (random phase approximation, RPA) employing ultrasoft pseudopotentials [9]. Details of the optical calculations are given in Ref. [20].
3 Results and Discussion
We studied the equilibrium atomic configurations and excitation energies of PMT and PTV using the different computational approaches described above. First we consider the results of the cluster calculations of PMT. As stated above, the excitation energies are obtained by extrapolating the values predicted for 2 to 7 units to the infinitely long chain. Our previous work with substituted poly(phenylene vinylenes) (PPV) resulted in the observation of supramolecular-scale quasicrystalline arrangements of these polymers with distinctive unit cell dimensions [19,21,22]. In those reports, both experimental and computational methods were used to confirm the formation of crystalline domains of the PPV conjugated polymer with interchain dimensions on the order of the van der Waals radii, which would be expected to lead to a strong interchain interaction. In the present work, we extend these methods to the study of two more technologically relevant conjugated polymers, poly(3-methoxy-thiophene) and poly(thienylene vinylene). These conjugated polymers are similar to PPV but contain five-membered rings with sulfur substitution in the ring and, as a result, a lower bandgap than their PPV counterparts. Each polymer has an alkoxy substituent on the thiophene ring which donates electrons to the chain, leaving
Fig. 1. (a) Converged atomic structure of a pentamer (5 units) of 3-methoxy-thiophene. 3D view of the poly(3-methoxy-thiophene) (b) and poly(thienylene vinylene) (c) unit cells. Optimized unit cell dimensions for (b) are d1 = 7.74 Å, d2 = 10 Å, and d3 = 15 Å, and for (c) are d1 = 6.55 Å, d2 = 3.75 Å, and d3 = 15 Å.
the chain slightly electron rich, forming an n-type material. Experimentally, an eight-carbon (or longer) alkoxy chain is used in part to assist the solvation of the polymer in organic solvents. In this work, the alkoxy side chain was shortened to a single-carbon methoxy chain, as the chain length does not affect the electron-donating ability of the alkoxy substituent and the arrangement of the chains did not include solvent effects. Additionally, the shorter methoxy side chain allows the computational resources to be reserved for the conjugated backbone.
3.1 Equilibrium Geometry
The fully relaxed configuration of the 3-methoxy-thiophene pentamer is shown in Fig. 1(a) as an example. The geometries of the PMT and PTV polymers modeled as infinite chains are given in Figs. 1(b) and 1(c). The interchain interaction is an important aspect, since the close proximity of neighboring chains will split the electronic states of the polymer, thereby reducing the bandgap and creating additional states. As mentioned above, d3 has a very weak effect on the optics, and this value was therefore constrained to 15 Å. The equilibrium intermolecular distances (d1) for PMT and PTV are 7.74 Å and 6.55 Å, respectively, and were determined by cluster calculations of a single polymer chain.
3.2 Excitation Energies
The excitation energies (EXC) were calculated using both the TDDFT approach and the ab initio pseudopotential method, as described in Section 2. In order to obtain correct values of the EXC of PMT within TDDFT, we studied the convergence of the EXC with increasing number of units in the polymer chain. The predicted EXC values are given in Table 1.
Fig. 2. Energy extrapolation to infinite chain length for both the HOMO-LUMO gap and the excitation energy. The excitation energy yields a value of 1.1 eV which is lower than that of the HOMO-LUMO gap by about 0.1 eV.
In Fig. 2 we demonstrate that the calculated EXC data can be fitted by a 1/n function, where n is the number of units in the chain. The observed 1/n dependence allows a simple extrapolation to the infinitely long polymer chain, as shown in Fig. 2; however, the analysis of this dependence requires further study. The TDDFT excitation energy value of 1.1 eV obtained in this way is compared with that calculated by the PP-DFT method. The excitation energies of the PMT polymers are obtained from the projected density of states (PDOS). The PDOS spectrum of the infinite-chain PMT polymer is given in Fig. 3. The calculated PP-DFT HOMO-LUMO gap of 0.9 eV (excitation energy) is lower than the TDDFT value by 0.2 eV. It is well known that DFT excitation energies underestimate the actual energy gaps. In order to obtain realistic values, a quasiparticle (QP) correction to the DFT data is required, which substantially complicates the computations [23]. On the other hand, the QP values for the HOMO-LUMO gap can be obtained from comparison with available experimental data or with TDDFT gap values [23]. By comparison with optical absorption data we demonstrate that both methods provide very close results.
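The 1/n extrapolation can be reproduced directly from the Table 1 data, as in the following short sketch; a straight-line fit in 1/n to the 6-31G* excitation energies gives an intercept close to the 1.1 eV value quoted above.

```python
# 1/n extrapolation of the TDDFT/6-31G* excitation energies of Table 1
# (n = 2..7, in eV). A linear fit in 1/n gives the n -> infinity intercept.
import numpy as np

n = np.array([2, 3, 4, 5, 6, 7])
exc = np.array([3.76, 3.01, 2.49, 2.19, 1.98, 1.83])   # eV, 6-31G* column

slope, intercept = np.polyfit(1.0 / n, exc, 1)
print("extrapolated excitation energy (n -> inf): %.2f eV" % intercept)
```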
3.3 Optical Absorption
The calculated optical absorption spectrum of the PTV conjugated polymer is shown in comparison with experimental data [24,19] in Fig. 4. In order to match the spectral location of the theoretical spectrum to the experimental one (to the dominant peak located at 577 nm), a QP correction (blue shift) of 0.18 eV was applied to the calculated data in Fig. 4. The QP-corrected theoretical spectrum correctly reproduces the shoulders on the long-wavelength wing of the spectrum. These shoulders are attributed to molecular vibrations (at 619 nm) and to aggregation (at
Fig. 3. Electronic structure of 3-methoxy-thiophene in the form of Projected Density of States (PDOS)
685 nm) caused by the interchain interaction [19]. The shape of the calculated spectrum, however, shows remarkable differences from the experimental data (see Fig. 4). It has been demonstrated before that the inclusion of excitonic effects is necessary for the analysis of optical line shapes in polymers [10]. However, this technically challenging theory of the optical response of polymers is out of the scope of the present study. Inclusion of exciton effects is expected to improve the comparison between the line shapes of the predicted and measured optical absorption spectra, which should be addressed in future work. From the comparison between the calculated and measured optical absorption spectra of PTV we obtained the QP correction value of 0.18 eV (the blue shift), which is applied to the calculated spectrum in order to match the experimental data. The QP value of 0.18 eV obtained for the PTV polymer is very close to the DFT gap underestimation (by 0.2 eV) which we obtained for the PMT polymer from comparison with our TDDFT value. The TDDFT data realistically predict the experimental HOMO-LUMO gap values in conjugated polymers [11]. DFT+QP calculated gap values agree well with those predicted by the TDDFT method for both organic and inorganic materials [23]. This comparison suggests that the excitation energy of the PMT polymer obtained by extrapolation of the TDDFT values to an infinitely long chain correctly reproduces the real HOMO-LUMO gap, in agreement with the general conclusions following from comparative DFT+QP and TDDFT analyses [11,23]. Therefore, the results of this work demonstrate that relatively simple TDDFT cluster calculations combined with the extrapolation method provide correct QP values for the HOMO-LUMO gap, which can be used with the PP-DFT method for realistic predictions of optical absorption in conjugated polymers. This is a substantial simplification of the theory which still incorporates important many-body
Fig. 4. Calculated optical absorption spectrum of PTV (solid line) in comparison with experimental data (circles) measured in [24]
effects in polymer optics. Thus, for conjugated polymers, realistic first-principles predictions of optical spectra can be performed within a simplified scheme that avoids large-scale computations.
4 Conclusions
In this work we present results of a ground state and excitation energy study of poly(3-methoxy-thiophene) (PMT) and poly(thienylene vinylene) (PTV) conjugated polymers by first-principles density functional theory (DFT). We used two approaches for the computation of excited states: time-dependent DFT (TDDFT) of clusters and ab initio pseudopotential-based DFT (PP-DFT) of infinite polymer chains. The excitation energies of PMT calculated by TDDFT with increasing unit numbers converge well to the experimentally measured energy gaps. The combination of the TDDFT cluster method with the PP-DFT approach for the infinite chain provides the single quasiparticle correction value for the gap needed for optical calculations. The infinite chain model is used to calculate the optical absorption of PTV. The predicted optical absorption spectrum is in good agreement with experiment. The combination of the cluster TDDFT and infinite-chain PP-DFT methods is a rather simple and promising first-principles approach for realistic modeling and simulation of conjugated polymer optics. Acknowledgments. This work is supported by NSF STC DMR-0120967, NSF PREM DRM-0611430, NSF NCN EEC-0228390, and NASA CREAM NCC31035.
References 1. Dhanabalan, A., van Duren, J.K.J., van Hal, P.A., van Dongen, J.L.J., Janssen, R.A.J.: Synthesis and characterization of a low bandgap conjugated polymer for bulk heterojunction photovoltaic cells. Adv. Funct. Mater. 11, 255–262 (2001) 2. Sun, S.S., Sariciftci, N.S.: Organic Photovoltaics: Mechanisms, Materials, and Devices. CRC Press, Boca Raton (2005) 3. Patil, A.O., Heeger, A.J., Wudl, F.: Optical properties of conducting polymers. Chem. Rev. 88, 183–200 (1988) 4. Skotheim, T.A. (ed.): Handbook of Conducting Polymers, 2nd edn. CRC Press, Boca Raton (1997) 5. Eckhardt, H., Shacklette, L.W., Jen, K.Y., Elsenbaumer, R.L.: The electronic and electrochemical properties of poly(phenylene vinylenes) and poly(thienylene vinylenes): An experimental and theoretical study. J. Chem. Phys. 91, 1303–1315 (1989) 6. Blohm, M.L., Pickett, J.E., Van Dort, P.C.: Synthesis, characterization, and stability of poly(3,4-dibutoxythiophenevinylene) copolymers. Macromol. 26, 2704–2710 (1993) 7. Toussaint, J.M., Bredas, J.L.: Theoretical analysis of the geometric and electronic structure of small-band-gap polythiophenes: poly(5,5’-bithiophene methine) and its derivatives. Macromol. 26, 5240–5248 (1993) 8. Ortmann, F., Schmidt, W.G., Bechstedt, F.: Attracted by long-range electron correlation: Adenine on graphite. Phys. Rev. Lett. 95, 186101–186105 (2005) 9. Gavrilenko, V.I.: Ab initio modeling of optical properties of organic molecules and molecular complexes. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3993, pp. 89–96. Springer, Heidelberg (2006) 10. Rohlfing, M., Tiago, M.L., Louie, S.G.: First-principles calculation of optical absorption spectra in conjugated polymers: role of electron-hole interaction. Synth. Met. 116, 101 (2001) 11. Ma, J., Li, S., Jiang, Y.: A time-dependent DFT study on band gaps and effective conjugation lengths of polyacetylene, polyphenylene, polypentafulvene, polycyclopentadiene, polypyrrole, polyfuran, polysilole, polyphosphole, and polythiophene. Macromol. 35, 1109–1115 (2002) 12. Kohn, W., Becke, A.D., Parr, R.G.: Density functional theory of electronic structure. J. Phys. Chem. 100, 12974–12980 (1996) 13. Perdew, J.P., Wang, Y.: Accurate and simple analytic representation of the electron-gas correlation energy. Phys. Rev. B 45, 13244–13249 (1992) 14. Perdew, J.P., Burke, K., Ernzerhof, M.: Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996) 15. Miller, W.H., Hernandez, R., Handy, N.C., Jayatilaka, D., Willets, A.: Ab initio calculation of anharmonic constants for a transition state, with application to semiclassical transition state tunneling probabilities. Chem. Phys. Lett. 172, 62 (1990) 16. Delley, B.: An all-electron numerical method for solving the local density functional for polyatomic molecules. J. Chem. Phys. 92, 508–517 (1990) 17. Delley, B.: From molecules to solids with the DMol3 approach. J. Chem. Phys. 113, 7756–7764 (2000) 18. Segall, M.D., Lindan, P.J.D., Probert, M.J., Pickard, C.J., Hasnip, P.J., Clark, S.J., Payne, M.C.: First-principles simulation: ideas, illustrations and the CASTEP code. J. Phys.Cond. Matt. 14, 2717–2744 (2002)
19. Gavrilenko, A.V., Matos, T., Bonner, C.E., Sun, S.S., Zhang, C., Gavrilenko, V.I.: Optical absorption of poly(thiophene vinylene) conjugated polymers. Experiment and first principle theory. J. Phys. Chem. B (in press) (2008) 20. Gavrilenko, V.I., Bechstedt, F.: Optical functions of semiconductors beyond density-functional theory and random-phase approximation. Phys. Rev. B 55, 4343–4352 (1997) 21. Bonner, C.E., Charter, S., Lorts, A., Adebolu, O.I., Zhang, C., Sun, S.S., Gavrilenko, V.I.: Luminescence and optical absorption of conjugated polyphenylene-vinylene polymers. In: Proc. of SPIE, vol. 6320, 63200J (2006) 22. Seo, K., Choi, S., Zhang, C., Sun, S.S., Bonner, C.: Processing and molecular packing of a derivatized PPV -donor-bridge-acceptor-bridge- type block copolymer for potential photovoltaic applications. Adv. Mater (in press) (2008) 23. Onida, G., Reining, L., Rubio, A.: Electronic excitations: density-functional versus many-body Green’s-function approaches. Rev. Mod. Phys. 74, 601–633 (2002) 24. Matos, T.D.: Design, synthesis, and characterization of 3-dodedyl-2,5-poly(thienylene vinylene) polymer for optoelectronics and solar energy conversions. Master’s thesis, Norfolk State University (2007)
Workshop on Computational Finance and Business Intelligence Yong Shi1,2, Shouyang Wang3, and Xiaotie Deng4 1
Research Center on Fictitious Economy and Data Sciences, Chinese Academy of Sciences 100080 Beijing, China 2 College of Information Science and Technology, University of Nebraska at Omaha Omaha NE 68182, USA 3 Academy of Mathematical and System Sciences, Chinese Academy of Sciences Beijing 100080, China 4 Department of Computer Science, City University of Hong Kong Tat Chee Avenue , Kowloon , Hong Kong
[email protected],
[email protected],
[email protected]
1 Introduction
The workshop focuses on computational science aspects of asset/derivatives pricing and financial risk management that relate to business intelligence. It includes, but is not limited to, modeling, numeric computation, algorithmic and complexity issues in arbitrage, asset pricing, future and option pricing, risk management, credit assessment, interest rate determination, insurance, foreign exchange rate forecasting, online auctions, cooperative game theory, general equilibrium, information pricing, network bandwidth pricing, rational expectations, repeated games, etc.
Acknowledgments This workshop was partially supported by National Natural Science Foundation of China (Grant No.70621001, 70531040, 70501030, 10601064), National Natural Science Foundation of Beijing (Grant No.9073020), 973 Project of Chinese Ministry of Science and Technology (Grant No.2004CB720103), and BHP Billiton Cooperation of Australia.
Parallelization of Pricing Path-Dependent Financial Instruments on Bounded Trinomial Lattices Hannes Schabauer1 , Ronald Hochreiter2 , and Georg Ch. Pflug2 1
Research Lab Computational Technologies and Applications, University of Vienna
[email protected] 2 Department of Statistics and Decision Support Systems, University of Vienna {ronald.hochreiter,georg.pflug}@univie.ac.at
Abstract. Complex financial instruments are a central concept for the survival of financial enterprises in liberalized markets. The need for fast pricing of more complex and exotic financial products led to the development of new algorithms, and to the parallelization of existing algorithms. In this paper, we present a parallelization scheme for pricing path-dependent interest rate products on bounded trinomial lattices. The basic building block presented in this paper can be used to build more complex pricing schemes. The paper is concluded by a set of numerical results concerning the speedup of the proposed parallelization scheme. Keywords: computational finance, pricing, bounded lattice, cluster computing, parallelization.
1 Introduction
Pricing a financial contract is based on the calculation of its respective present value, which is done by computing the expected, discounted cash-flow given a cash-flow function. The underlying interest rate process, which is the basis of the calculations, is most commonly given as a continuous-time stochastic process. In order to numerically cope with such processes, one can either apply Monte Carlo methods or use some sort of time and state discretization, see e.g. [1]. If a stochastic process is simulated and discretized, then this discretization process leads to (non-recombining) trees or recombining trees (lattices) – see Figure 1 for two-stage, binomial examples of these structures. Using a lattice model results in problems of polynomial order if the lattice has polynomial growth and cash-flows may be represented on the lattice. However, exotic contracts, which are of growing importance for practical applications, define the payable cash-flows as a function of the whole history of the underlying interest rate process. The tree covering the history grows exponentially and therefore the complexity of pricing such contracts is non-polynomial. For certain subclasses of cash-flow functions, however, a polynomial pricing algorithm exists, see [2]. Algorithms of this subclass are an essential component of the AURORA
Fig. 1. Tree-type structures
Financial Management System, a modular decision support tool for asset and liability management under development at the University of Vienna, see [3]. The main aim of this paper is to present a parallelization of the algorithms presented there for the special structure of bounded trinomial lattices. For an algorithmic and implementational overview of the general pricing methodology on trees, see [4] and [1]. Some foundations of the work presented in this paper have also been laid by [5]. Parallelization approaches for binomial trees (BT) have been proposed e.g. in [6] and [7], and parallelizations of trinomial trees (TT) in [8]. In [9], two parallel algorithms – shared memory and distributed memory – for implementing lattice methods for pricing European and American style equity options are discussed. Special parallelization schemes for Asian option structures are presented in [10]. A summary of the topic of financial pricing related to high-performance computing is presented by [11]. The idea is to modify pricing algorithms to run optimally on clusters of workstations and on Grid middleware (see [12] for a summary of the importance for the whole financial industry). Both extensions provide an interesting infrastructure for pricing complex financial instruments, especially with a proper porting of the underlying data format of the pricing process (trees) to specialized hardware structures. This paper is organized as follows. Section 2 describes the lattice approach to pricing path-dependent interest rate instruments. Section 3 summarizes parallelization issues and the chosen parallelization approach, Section 4 contains numerical results, and Section 5 concludes the paper.
2 Path-Dependent Financial Instruments
2.1 Interest Rate Path Dependency
Financial instruments tend to become increasingly complex, due to ever changing regulatory situations in liberalized markets. Many interest rate products do not only depend on the interest rate at the time of exercise of the instrument, but on an interest rate, that depends on the path of the underlying interest rate. The calculation of the effective interest rate may be based on the following modifiers.
– Mean-type. The effective interest rate is the arithmetic mean or geometric mean of the interest rate path.
– Barrier-type. The effective interest rate is modified if the underlying interest rate reaches certain lower or upper barriers.
– Extremum-type. The effective interest rate is bound to some upper and/or lower level.
Furthermore, the effective interest rate may depend on the whole path of the interest rate process (full path dependency) or only on the last t_l stages (limited path dependency). While the effective interest rate of path-independent interest instruments depends on the current market interest rate, the contracted interest rates of path-dependent interest rate instruments are contingent on the historical path of the underlying interest rate. We assume that a model for discretizing the process is available. Many other instruments are available or can be generated out of some basic building blocks. The strategy for creating complex products out of simpler building blocks is shown by [2]. The underlying time-discretized stochastic process has a finite set of T stages (points in time), which constitutes the complexity parameter for the calculations. It might be useful to use a fine-grained time discretization, i.e. hourly steps. Such a fine time discretization results in up to tens of thousands of stages for long-term financial products. For most algorithms on lattices, a backward and a forward recursive algorithm can be defined. Introduce the conditional present values PV_{t,j}(c) for a future state (t, j) as the conditional expectation shown in Eq. 1:

PV_{t,j}(c) = E[ Σ_{s=t}^{T} c_s(ξ_s) f_{s−1}(ξ_{s−1}) ··· f_t(ξ_t) | ξ_t = j ]
            = c_t(j) + E[ Σ_{s=t+1}^{T} c_s(ξ_s) f_{s−1}(ξ_{s−1}) ··· f_t(ξ_t) | ξ_t = j ]      (1)
The conditional present values PV_{t,j} satisfy a recursion. The basic backward recursion to calculate the present value of a path-independent product is shown in Table 1. The vectors C and F contain the cash-flows and the discount factors, which are defined for all nodes in the lattice. S_t denotes the set of possible states at time t.

Table 1. Basic backward recursion algorithm
1  BackwardRecursion(C, F) returns PV_0
2    forall j ∈ S_T
3      PV_{T,j} ← c_T(j) endforall
4    for t ← T − 1 downto 0
5      forall j ∈ S_t
6        PV_{t,j} ← c_t(j) + f_t(j) Σ_k PV_{t+1,k} p_{t,j,k}
7      endforall
8    endfor
This algorithm is the basic building block for pricing path-dependent present values, such as both mean-type cash-flow structures; see [2].
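For illustration, a minimal Python sketch of the backward recursion of Table 1 is given below. The lattice layout (per-stage state counts, transition probabilities p[t][j][k], discount factors f and cash-flows c as nested lists) is an assumption made for this sketch, not the data structures of the AURORA system.

```python
def backward_recursion(c, f, p, states):
    """Present value PV_0 on a lattice.

    c[t][j]    -- cash-flow in state j at stage t
    f[t][j]    -- one-period discount factor in state j at stage t
    p[t][j][k] -- transition probability from (t, j) to (t+1, k)
    states[t]  -- number of states at stage t
    """
    T = len(states) - 1
    # terminal stage: PV_{T,j} = c_T(j)
    pv = [c[T][j] for j in range(states[T])]
    # roll back from T-1 down to 0
    for t in range(T - 1, -1, -1):
        pv = [
            c[t][j] + f[t][j] * sum(p[t][j][k] * pv[k]
                                    for k in range(states[t + 1]))
            for j in range(states[t])
        ]
    return pv[0]
```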
2.2 Bounded Trinomial Lattices
Table 2 compares binary and ternary lattices and trees of height T. However, instead of working with plain trinomial lattices (TL), whose number of nodes is (T + 1)², it is more common to use bounded trinomial lattices (BTL). A bounded trinomial lattice is bounded in a stage t_b < T. In all succeeding stages, starting in t_b, the highest-value as well as the lowest-value node do not branch into new additional lower and upper value nodes, i.e. the number of nodes per stage is constant ((2·t_b) − 1). A methodology to calibrate such lattices has been proposed by [13]. A degenerated bounded trinomial lattice (DBTL) has the additional feature that it is only binomial at the highest-value and lowest-value nodes. The choice of the bound structure (degenerated or not) is determined by the underlying calibration methodology.

Table 2. Complexity of lattices and trees

Type of lattice        Binary                Ternary
Nodes (Lattice)        (1/2)(T + 2)(T + 1)   (T + 1)²
Arcs (Lattice)         T(T + 1)              3T²
Nodes (History Tree)   2^{T+1} − 1           (1/2)(3^{T+1} − 1)
Arcs (History Tree)    2^{T+1} − 2           (1/2)(3^{T+1} − 3)
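The closed forms in Table 2 can be verified by direct counting; the following brief sketch is an illustration only and not part of the original implementation.

```python
def lattice_counts(T, branching):
    """Node and arc counts of a recombining lattice of height T."""
    width = (lambda t: t + 1) if branching == 2 else (lambda t: 2 * t + 1)
    nodes = sum(width(t) for t in range(T + 1))
    arcs = sum(branching * width(t) for t in range(T))
    return nodes, arcs

def tree_counts(T, branching):
    """Node and arc counts of the full (non-recombining) history tree."""
    nodes = sum(branching ** t for t in range(T + 1))
    return nodes, nodes - 1

T = 7
assert lattice_counts(T, 2) == ((T + 1) * (T + 2) // 2, T * (T + 1))
assert lattice_counts(T, 3) == ((T + 1) ** 2, 3 * T ** 2)
assert tree_counts(T, 2) == (2 ** (T + 1) - 1, 2 ** (T + 1) - 2)
assert tree_counts(T, 3) == ((3 ** (T + 1) - 1) // 2, (3 ** (T + 1) - 3) // 2)
```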
Bounded lattices consist of a triangular part and a rectangular part as shown in Figure 2. The triangular part has an increasing number of nodes and edges at each stage. The present values of both parts can be calculated in parallel. As expected, parallelization of the triangular part has turned out to be slower than parallelization of the bounded, rectangular part. The triangular part was parallelized by dynamic arrays for its nodes and edges. However, parallelization of this part is not that important, as the rectangular part is considerably larger than the triangular one in practice.
3 Parallelization
Our parallel approach introduces the pricing of path-dependent contracts on distributed-memory machines (clusters of PCs or workstations) [14].
3.1 Iterative and Recursive Problem Pattern
A very general algorithm for path-dependent pricing can be described as follows. The algorithm includes both bounded and non-bounded parts of a lattice. Our parallelization approach combines a divide-and-conquer algorithm with data decomposition. The structure of a path-dependent lattice can be viewed as a bundle of independent parts of the whole lattice. Therefore it is possible to calculate these parts in parallel, without communication during the computation. Each part is called a sweep here. For example, if a lattice extends over 5 periods (T = 5), this results in sweeps 1, 2, 3, 4, and 5. Communication between the nodes is only needed before and after performing the respective sweeps. Note, however, that each sweep has a different runtime. Our parallel algorithm therefore groups the subproblems into parts that together require the same amount of computation time. The lattice is divided into parts as follows. Consider the example of three nested loops. The innermost loop calculates all nodes in the same state, in the same time period, within the same sweep. Each node needs the values of all adjacent edges and vertices, which have already been calculated in the previous step. Thus, in the inner loop all vertices in the same period of the same sweep are calculated. A covering loop iterates over all time periods, starting at the period of the sweep and ending at period 0. The outer loop iterates over all sweeps.

Fig. 2. Triangle and rectangle part (ternary bounded lattice, T = 4, NB = 2: 19 nodes, 42 arcs)
3.2 Parallel Code and Data Structure
In principle, all three loops have to be examined in order to find the most efficient way of parallelization. Parallelization of the interior loop has been tried earlier. It turned out not to work well, as the ratio of computation to communication is not high enough. The covering loop cannot be parallelized by any means, because of its recursive calculation. As described in [2], a path-dependent lattice carries two different types of data: path-dependent and path-independent data. Path independence occurs for discount factors and probabilities, whereas path dependency occurs for cash flows and calculated present values. From the algorithmic point of view, this means that discount factors
and probabilities are the same for all respective sweeps; we just have to copy them before calculating. Cash flows, and obviously also the calculated present values, are different for each iteration at each state and in each sweep. Calculating a net present value over T periods yields T different subproblems (sweeps) from 1 . . . T. Their respective runtime complexity equals the number of the sweep; e.g., sweep 100 takes 50 times as long as sweep 2. Notice that after distributing all jobs to the nodes, we want to obtain all results at the same time. Therefore we combine sweeps which pairwise sum up to the same value.
3.3 Simplified Numerical Example
Let us consider an example of a ternary bounded lattice of height 99, lasting over 20 periods, running on a cluster with 4 nodes. Such a lattice consists of 20 distinct sweeps, starting at 1, 2, 3, . . . , 19, 20. In each sweep, we need sweep × height cash flows. In order to obtain the same amount of data for each combined job, we combine two sweeps: here, we combine sweep 20 with sweep 1, 19 with 2, and so on. Thus we obtain 10 (= 20/2) columns of 21 · 99 cash flows. Now we distribute these cash flows in equal parts to our worker nodes. Ten columns for 3 worker nodes results in 3 columns per node. The remaining column is processed by the master node. The pairing and distribution are sketched below.
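A minimal sketch of this pairing and distribution logic; the node counts and the master/worker split follow the example above, while the function names and data layout are illustrative assumptions.

```python
def pair_sweeps(T):
    """Combine sweeps so every pair has the same total workload:
    (T, 1), (T-1, 2), ..., each summing to T + 1 periods of work."""
    return [(T - i, 1 + i) for i in range(T // 2)]

def distribute(columns, n_workers):
    """Deal the paired columns out to the worker nodes in equal chunks;
    whatever does not divide evenly stays with the master node."""
    per_worker = len(columns) // n_workers
    jobs = [columns[i * per_worker:(i + 1) * per_worker]
            for i in range(n_workers)]
    master = columns[n_workers * per_worker:]
    return jobs, master

columns = pair_sweeps(T=20)            # 10 columns: (20, 1), (19, 2), ...
jobs, master = distribute(columns, n_workers=3)
print(jobs)     # 3 columns per worker node
print(master)   # remaining column handled by the master node
```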
Fig. 3. Speedups for different bounded heights j (j = 25, 49, 75, 99), compared with linear speedup, as a function of the number of nodes (1–16)
4 Numerical Results
Parallel codes have been tested on the Schroedinger III cluster (http://www.univie.ac.at/ZID/schroedinger/) at the Vienna University Computer Center at the University of Vienna. Calculations were done on single-core processors; therefore all communication utilizes network communication, which takes significantly more time than local communication on SMP or multi-core
Table 3. Runtime for T = 3000 (top) and T = 2000 (bottom) for n = 1 to 16 parallel computing nodes with a bounded height j of 25, 49, 75, 99

T = 3000
n      1      2      3      4      5      6      7      8      9      10     11     12     13     14     15     16
j=25   9,487  16,980 4,484  2,960  2,289  1,859  1,609  1,320  1,203  1,089  0,976  0,917  0,871  0,824  0,796  0,746
j=49   15,693 29,457 9,566  6,371  4,726  3,878  3,273  2,886  2,531  2,292  2,070  1,929  1,769  1,695  1,566  1,535
j=75   26,778 32,722 15,828 10,511 7,890  6,378  5,230  4,613  4,128  3,628  3,324  3,085  2,867  2,671  2,515  2,433
j=99   35,203 43,640 21,339 14,082 10,707 8,511  7,070  6,113  5,449  4,816  4,375  4,003  3,773  3,464  3,253  3,089

T = 2000
n      1      2      3      4      5      6      7      8      9      10     11     12     13     14     15     16
j=25   3,389  5,792  3,542  1,316  1,003  0,761  0,640  0,535  0,476  0,437  0,402  0,421  0,390  0,390  0,378  0,363
j=49   6,781  14,789 10,476 2,902  2,175  1,722  1,398  1,226  0,121  0,988  0,922  0,882  0,832  0,785  0,742  0,726
j=75   10,627 24,421 15,917 4,539  3,417  2,683  2,234  1,914  1,718  1,558  1,433  1,386  1,273  1,214  1,312  1,558
j=99   13,997 33,812 9,050  5,980  4,535  3,574  2,910  2,488  2,230  2,003  1,832  1,738  1,601  1,519  1,593  1,824
processors. The structure of the cluster at the time of the calculation of the results below consisted of 240 nodes. Each node has a Pentium 4 Prescott CPU, operating at 3.2 GHz, and 2 GB RAM. The runtime environment is operating on a Gigabit network, interconnected by a Cisco 6500 switch. The total performance is approximately 1.14 Teraflops per second (linpack). All program codes were written in Fortran 90, using mpich 1.27. The used compiler was Portland Group pgf90 6.1-1 (x86-64 Linux). The resulting runtime for T = 3000 and T = 2000 is summarized in Tab. 3. The speedup is shown graphically for T = 3000 for all four heights in Fig. 3. The speedup results indicate, that the algorithm scales well.
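As a concrete reading of Table 3, the speedup plotted in Fig. 3 is simply T(1)/T(n). The sketch below recomputes it from a few values of the j = 25, T = 3000 column (transcribed from Table 3; units as reported there); the interpretation is illustrative only.

```python
# Runtimes from Table 3 (T = 3000, bounded height j = 25).
runtime = {1: 9.487, 2: 16.980, 4: 2.960, 8: 1.320, 16: 0.746}

for n, t in runtime.items():
    print(f"n = {n:2d}: speedup = {runtime[1] / t:.2f}")
# Note that Table 3 reports n = 2 running longer than n = 1,
# so the measured speedup drops below 1 at that point.
```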
5 Conclusion
In this paper, we presented a parallelization scheme for pricing path-dependent interest rate instruments defined on discrete-time bounded trinomial lattices. Generally, tree- and lattice-based data structures are not perfectly suited to cluster architectures. However, the results look promising enough to continue the work and extend the underlying financial problems from pricing path-dependent interest rate products to option pricing or large-scale multi-stage stochastic tree-based portfolio optimization.
References 1. J¨ ackel, P.: Monte Carlo Methods in Finance. John Wiley & Sons, Chichester (2002) 2. Hochreiter, R., Pflug, G.C.: Polynomial algorithms for pricing path-dependent interest rate instruments. Computational Economics 28(3), 291–309 (2006)
´ etanowski, A., Dockner, E., Moritsch, H.: The Aurora financial 3. Pflug, G.C., Swi¸ management system: model and parallel implementation design. Annals of Operations Research 99, 189–206 (2000) 4. Clewlow, L., Strickland, C.: Implementing Derivative Models. John Wiley & Sons, Chichester (1998) ´ etanowski, A.: Selected parallel optimization methods for financial 5. Pflug, G.C., Swi¸ management under uncertainty. Parallel Computing 26(1), 3–25 (2000) 6. Thulasiram, R., Bondarenko, D.: Performance evaluation of parallel algorithms for pricing multidimensional financial derivatives. In: Fourth International Workshop on High Performance Scientific and Engineering Computing with Applications (HPSECA 2002), pp. 306–313. IEEE Computer Society, Los Alamitos (2002) 7. Gerbessiotis, A.V.: Architecture independent parallel binomial tree option price valuations. Parallel Computing 30(2), 303–318 (2004) 8. Gerbessiotis, A.V.: Trinomial-tree based parallel option price valuations. Parallel Algorithms and Applications 18(3), 181–196 (2003) 9. Campolieti, G., Makarov, R.: Parallel lattice implementation for option pricing under mixed state-dependent volatility models. In: 19th International Symposium on High Performance Computing Systems and Applications (HPCS 2005), pp. 170– 176. IEEE Computer Society, Los Alamitos (2005) 10. Huang, K., Thulasiram, R.K.: Parallel algorithm for pricing american asian options with multi-dimensional assets. In: 19th International Symposium on High Performance Computing Systems and Applications (HPCS 2005), pp. 177–185. IEEE Computer Society Press, Los Alamitos (2005) 11. Moritsch, H., Benkner, S.: High-performance numerical pricing methods. Concurrency and Computation: Practice and Experience 14(8-9), 665–678 (2002) 12. Harris, D.: Will financial services drive Grid adoption? Grid Today 3(40) (2004) 13. Hull, J., White, A.: The general Hull-White model and super calibration. Financial Analysts Journal 57(6) (2001) 14. Dongarra, J., Sterling, T., Simon, H., Strohmaier, E.: High-performance computing: Clusters, constellations, MPPs, and future directions. Computing in Science and Engineering 7(2), 51–59 (2005)
Heterogeneity and Endogenous Nonlinearity in an Artificial Stock Model Hongquan Li1,2,*1, Wei Shang1, and Shouyang Wang1 1
Laboratory of Management, Decision and Information Systems, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100080, China 2 Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan {Lhquan, shangwei ,sywang}@amss.ac.cn Tel.: +86-10-62651381; +81-75-753-3568; Fax: +81-75-753-4756
Abstract. We present a nonlinear structural stock market model which is a nonlinear deterministic process buffeted by dynamic noise. The market is composed of two typical trader types, the rational fundamentalists believing that the price of an asset is determined solely by its fundamental value and the boundedly rational noise traders governed by greed and fear. The interaction among heterogeneous investors determines the dynamics and the statistical properties of the system. We find the model is able to generate time series that exhibit dynamical and statistical properties closely resembling those of the S&P500 index, such as volatility clustering, fat tails (leptokurtosis), autocorrelation in square and absolute return, larger amplitude, crashes and bubbles. We also investigate the nonlinear dependence structure in our data. The results indicate that the GARCH-type model cannot completely account for all nonlinearity in our simulated market, which is thus consistent with the results from real markets. It seems that the nonlinear structural model is more powerful to give a satisfied explanation to market behavior than the traditional stochastic approach. Keywords: Computational finance; Nonlinearity; Heterogeneous agents; Endogenous fluctuations.
1 Introduction
Modern finance is based on the standard paradigm of efficient markets and rational expectations. The efficient market hypothesis postulates that the current price contains all available information and past prices cannot help in predicting future prices. Sources of risk and market fluctuations are exogenous. Therefore, in the absence of external shocks, prices would converge to a steady-state path which is completely determined by fundamentals and there are no opportunities for consistent speculative profits. In real markets, however, traders have different information on traded assets and process information differently, therefore the assumption of homogeneous ra-
tional traders may not be appropriate. The efficient market hypothesis motivates the use of random walk increments in financial time series modeling: if news about fundamentals is normally distributed, the returns on an asset will be normal as well. However, the random walk assumption does not allow the replication of some stylized facts of real financial markets, such as volatility clustering, fat tails (leptokurtosis), autocorrelation in square and absolute return, larger amplitude, crashes and bubbles. Recently, finance literature has been searching for structural models that can explain such observed patterns in financial data. A number of models were developed which build on boundedly rational, non-identical agents [1-4]. Financial markets are considered as systems of interacting agents which continually adapt to new information. Heterogeneity in expectations can lead to market instability and complicated dynamics. As a result, prices and returns in such markets may deviate significantly from fundamentals. In these heterogeneous agent models, different groups of traders coexist, having different beliefs or expectations about future prices of risky assets. Two typical trader types can be distinguished. The first type is the rational fundamentalists, believing that the price of an asset is determined solely by its fundamental value. The second typical trader type is the noise traders, chartists, or even technical analysts, believing that asset prices are not completely determined by fundamentals but that they may be predicted by simple technical trading rules. The literature on behavioral finance (for surveys see [5]) emphasizes the role of quasi-rational, overreacting, and other psychology factors including investor’s emotions. This paper builds on the model of [6], which is a deterministic behavioral stock market model with agents influenced by their emotions. In [6], the trading activity of the agents is characterized by greed and fear. They optimistically believe in booming markets, but panic if prices change too abruptly. Although the model is deterministic, it replicates several aspects of actual stock market fluctuations quite well. This paper is to extend this model in two ways. We introduce fundamentalists into the model to analyze how the interaction of different types of investors determines the dynamics and the statistical properties of the system. Further, an exogenous noise is added to the law of motion in order to mimic real market because noise and uncertainty play an important role in real financial markets. This paper is divided into five sections. Section 2 presents our model with heterogeneous traders. The third section brings forward the simulation results and discusses the behavior of nonlinear dynamics. The final section provides a brief summary and some conclusions.
2 The Model
Let us consider a security with price P_{t−1} (the closing price) in the last trading period. Assume that this security is in fixed supply, so that the price is only driven by excess demand. Let us assume that the excess demand D_t is a function of the last price P_{t−1} and the present fundamental value P_{f,t}. A market maker takes a long position whenever the excess demand is negative and a short position whenever the excess demand
is positive so as to clear the market. The market maker adjusts the price in the direction of the excess demand with speed equal to λM . Accordingly, the log of the price at the end of period t is given as
p_t = p_{t−1} + λ_M D_t(p_{t−1}, p_{f,t})                                      (1)

where p_f denotes the log of the fundamental value. In order to introduce an exogenous
news arrival process as a benchmark for the analysis of the resulting price dynamics, we make the assumption that p f follows a Wiener process, and hence,
p_{f,t} = p_{f,t−1} + ε_t                                                       (2)

with ε_t ∼ N(0, σ_ε²). This specification makes sure that neither fat tails, volatility clustering, nor any kind of nonlinear dependence is brought about by the exogenous news arrival process. Hence, the emergence of these characteristics in market prices would not be driven by similar characteristics of the news but would rather have to be attributed to the trading process itself. Agents' interactions magnify and transform exogenous noise into fat-tailed returns with clustered volatility. The market is composed of two typical trader types. One type is the rational fundamentalists, believing that the price of an asset is determined solely by its fundamental value, and the other type is the boundedly rational traders whose behavior is influenced by their greed and fear. Let us assume that a fraction α of investors follows a fundamentalist strategy and a fraction (1 − α) follows a boundedly rational strategy. Let
DtF and DtN be, respectively, the demands of fundamentalists and boundedly rational traders. The excess demand for the security is thus given by
D_t = α D_t^F + (1 − α) D_t^N,     0 ≤ α ≤ 1                                    (3)
Fundamentalists react to the difference between price and fundamental value. The demand of fundamentalists in period t is
D_t^F = λ_t^F (p_{f,t} − p_{t−1}),     λ_t^F > 0                                (4)
where λ_t^F is a parameter that measures the speed of reaction of fundamentalist traders; we will assume that λ_t^F = λ^F throughout the paper. This demand function implies that the fundamentalists believe that the price tends to the fundamental value in the long run, and that they react to the percentage mispricing of the asset in a symmetric way with respect to underpricing and overpricing. Following [6], the actions of the boundedly rational traders in our model are governed by greed and fear. The agents greedily take long positions because they know that the stock market increases in the long run. However, they also know that stock prices always fluctuate, and such behavior is risky. The larger the risk, the more the agents reduce their investments. If prices change too strongly, they even panic and go short. For simplicity, it is assumed that greed- and fear-based behavior only occurs
for two activity levels. As long as the market evolves stably, the agents are rather calm. However, in turbulent times both greed and fear increase. The emotional regime switching process may be formalized as follows:
D_t^N = λ_t^N − (2 / λ_t^N)(p_{t−1} − p_{t−2})²                                 (5)

where λ_t^N is a parameter that measures the activity level of the emotional reaction to market volatility. If (1/5) Σ_{i=1}^{5} |p_{t−i} − p_{t−i−1}| ≤ 0.045 and |p_{t−1} − p_{t−2}| ≤ 0.05, then λ_t^N = 0.05; otherwise, λ_t^N = 0.10. In other words, the agents are rather calm if the average volatility in the last 5 trading periods is below 4.5 percent and if the most recent absolute log price change is below 5 percent. Otherwise the activity level increases from 0.05 to 0.10. Similar to the explanation in Ref. [20], the first term of (5) reflects the greedy autonomous buying behavior of the agents whereas the second term of (5) captures the fear of the agents. Note that in the turbulent regime, extreme log price changes can be as large as ±10 percent. In the calm regime, extreme returns are restricted to ±5 percent.
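A minimal sketch of one possible implementation of the model defined by Eqs. (1)–(5), using the parameter values quoted in Sect. 3. The constant warm-up history and the variable names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def simulate(T=2000, lam_M=1.0, lam_F=1.0, alpha=0.5, sigma_eps=0.03,
             p0=6.12, seed=0):
    rng = np.random.default_rng(seed)
    p = [p0] * 6            # log prices; constant warm-up history (assumption)
    p_f = p0                # log fundamental value, updated by Eq. (2)
    for _ in range(T):
        p_f += rng.normal(0.0, sigma_eps)
        # activity level of the boundedly rational traders (calm vs. turbulent)
        avg_vol = np.mean(np.abs(np.diff(p[-6:])))
        lam_N = 0.05 if (avg_vol <= 0.045 and abs(p[-1] - p[-2]) <= 0.05) else 0.10
        d_F = lam_F * (p_f - p[-1])                            # Eq. (4)
        d_N = lam_N - (2.0 / lam_N) * (p[-1] - p[-2]) ** 2     # Eq. (5)
        excess = alpha * d_F + (1.0 - alpha) * d_N             # Eq. (3)
        p.append(p[-1] + lam_M * excess)                       # Eq. (1)
    return np.array(p)

returns = np.diff(simulate())
print(returns.std(), returns.min(), returns.max())
```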
3 Simulation Results and Statistical Properties
In this section, we analyze the statistical properties of the simulated time series, which have been generated with 2000 observations in each stochastic simulation in order to allow the system to get sufficiently close to the asymptotic dynamics and to have time series as long as the daily time series of the S&P 500 index between 6 October 1999 and 14 September 2007. Fig. 1 reports the time series plot of the S&P500 and the simulation series generated by our model with parameters λ_M = 1, λ_F = 1, α = 0.5, σ_ε = 0.03, and
initial value p0 = p f ,0 = 6.12 , [ p1 , p 2 , p3 , p 4 , p5 ] = [6.07,6.08,6.14,6.18,6.20] . Table 1 reports the statistics of the daily returns on the S&P500 and the model–generated time series. The Ljung-Box Q statistics for up to 30 lags for returns (Q(30)) and squared returns (QS(30)) are also presented. The Q(30) statistic for testing the hypothesis that all autocorrelations up to lag 30 are jointly equal to zero in the stock markets is greater than the value of χ 2 distribution with 30 degrees of freedom at the 5% level, suggesting that the null hypothesis of the independence of returns should be rejected. Thus, linear serial dependencies seem to play a significant role in the dynamics of stock returns. The next and the most important question for the study of the behavior of nonlinear dependencies in stock returns, is: do these returns also exhibit nonlinear serial dependencies? The easiest way to answer this question is by examining the autocorrelation behavior of squared daily returns. The values of QS(30) (see Table 1) provide strong evidence of nonlinear dependence, indicating that the conditional distributions of the daily returns are changing through time. This is a symptom of ARCH effects. The results from Fig.1 and Table 1 indicate that the model displays statistical properties similar to those of the S&P500 index and can replicate the stylized facts of real financial markets, such as volatility clustering,
Fig. 1. Time series of the S&P500 index (a), daily returns series of the S&P500 (b), simulated price series (c), and simulated returns series (d)
Table 1. Summary statistics of daily returns on the S&P500 index and the simulated time series

Sample   Mean    Variance  Skewness  Kurtosis  Jar.Bra  Q(30)  QS(30)
S&P500   0.0001  0.0111    0.05      5.62      573      51     1434
Model    0.0001  0.0203    -0.20     3.51      36       392    181
excess kurtosis, autocorrelation in square return, crashes and bubbles. Furthermore, all these interesting features in our model arise endogenously from the trading process and the interactions of our agents. With the assumption of IID Normal innovations of the fundamental value, none of these characteristics can be attributed to exogenous influences. The significant deviations from normality (larger Jar.Bra) and the significant QS(30) statistic in Table 1 suggest that there is very strong evidence of nonlinear structure in the simulated return series. In order to examine the nonlinear dynamics more deeply, we use the correlation dimension method and the BDS test. The method of the correlation dimension introduced by Grassberger and Procaccia [7] provides an important diagnostic procedure for distinguishing between deterministic chaos and stochasticity in a time series. If the correlation integral C(ε, m) measures the fraction of the total number of pairs (x_i, x_{i+1}, ..., x_{i+m−1}), (x_j, x_{j+1}, ..., x_{j+m−1}) such that the distance between them is no more than ε, then the correlation dimension is defined as:

d_c = lim_{ε→0} ln C(ε, m) / ln ε                                               (6)

where

C_{m,n}(ε) = (1 / (n(n − 1))) Σ_{i≠j} H(ε − ||X_i − X_j||)                       (7)

H(u) is the Heaviside step function, H(u) = 1 if u ≥ 0, 0 otherwise; n = the number of observations, ε = distance, C(ε, m) = correlation integral for dimension m, X = the time series. It is necessary to notice that when the embedding dimension m increases, a dimension d_m is reached such that d_c* is the estimate of the true correlation dimension:

d_c* = lim_{m→∞} d_c(m)                                                         (8)

If d_m tends to a constant as m increases, then d_m yields an estimate of the correlation dimension of the attractor, namely d_c*. In this case, the data are consistent with deterministic behaviour. If d_m increases without bound as m increases, this suggests that the data are either stochastic or noisy chaotic.
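A brute-force sketch of the correlation integral (7) and the slope estimate (6). It computes all O(n²) pairwise distances (maximum norm is assumed here), so it is only practical for short series; this is an illustration, not the implementation behind Table 2.

```python
import numpy as np

def correlation_integral(x, m, eps):
    """C(eps, m): fraction of pairs of m-histories closer than eps (max norm)."""
    X = np.array([x[i:i + m] for i in range(len(x) - m + 1)])
    d = np.max(np.abs(X[:, None, :] - X[None, :, :]), axis=2)
    mask = ~np.eye(len(X), dtype=bool)          # exclude i = j
    return np.mean(d[mask] < eps)

def correlation_dimension(x, m, eps_values):
    """Slope of ln C(eps, m) versus ln eps over the chosen range of eps."""
    logC = [np.log(correlation_integral(x, m, e)) for e in eps_values]
    slope, _ = np.polyfit(np.log(eps_values), logC, 1)
    return slope

x = np.random.default_rng(1).normal(size=500)   # placeholder series
eps = x.std() * np.array([0.5, 1.0, 1.5, 2.0])
print([round(correlation_dimension(x, m, eps), 2) for m in (2, 3, 4)])
```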
The limitation of correlation dimension procedure is that no formal hypothesis testing is possible. To deal with the problems of using the correlation dimension test, Brock et al. [8] devised a new statistical test which is known as the BDS test. The BDS tests the null hypothesis of whiteness (independent and identically distributed observations) against an unspecified alternative using a nonparametric technique. The BDS test is a powerful test for distinguishing random system from deterministic chaos or from nonlinear stochastic systems. However, it does not distinguish between a nonlinear deterministic system and a nonlinear stochastic system. Essentially, the BDS test provides a direct (formal) statistical test for whiteness against general dependence, which includes nonwhite linear and nonwhite nonlinear dependence. Therefore, the null hypothesis of i.i.d. may be rejected because of non-stationarity, linear or nonlinear dependence or chaotic structure. The BDS statistic measures the statistical significance of the correlation dimension calculations. Brock et al. demonstrated that
√n [C_{m,n}(ε) − C_{1,n}(ε)^m]                                                  (9)
is normally distributed with a mean of zero. The BDS statistic, W, that follows is normally distributed and is given by
W = √n [C_{m,n}(ε) − C_{1,n}(ε)^m] / σ_{m,n}(ε)                                 (10)
where C_{m,n}(ε) and C_{1,n}(ε) are given in (7) and σ_{m,n}(ε) is an estimate of the standard deviation. W converges in distribution to N(0, 1). The BDS statistic can be used to test the residuals of GARCH-type models for independence. If the null model is indeed GARCH, then the standardized residuals of the fitted GARCH model should be independent. However, it was demonstrated by Hsieh [9] that the distribution of the W statistic changes when it is applied to the residuals of ARCH- and GARCH-type filters. Therefore, we have to use the simulated distribution of the BDS statistic obtained by a bootstrap method (which can be easily handled with Eviews 5.0) in such cases. Before proceeding any further, we should first test the stationarity of our model-generated data. The augmented Dickey-Fuller unit-root test (ADF) shows that one unit root exists in the simulated price series (p_t). The ADF value is -1.02, which is greater than the critical value at 5%. However, the ADF value for the returns series (the difference of p_t) is -22.44, which is less than the critical value at 5%. The unit root test strongly rejects a unit root for the simulated returns series and we can conclude that the returns series is stationary. Given stationarity in returns, none of the non-IID behaviour of stock returns can be attributed to non-stationarity. The correlation dimension estimates for the simulated returns series as well as the S&P500 index returns series are presented in Table 2. Clearly, our model has correlation dimension estimates similar to those of the S&P500. For the two series, the correlation dimension estimates are lower than the embedding dimensions; however, they do not converge to a stable value. The results indicate that the underlying process is neither purely random nor completely chaotic.
Table 2. Estimates of the correlation dimension for the embedding dimension m

m        2     3     4     5     6     7     8     9     10
S&P500   1.80  2.63  3.26  3.98  4.35  5.03  5.44  5.68  6.09
Model    1.68  2.38  3.19  3.96  4.33  5.04  5.60  6.04  6.48
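The text above suggests judging significance against a simulated null distribution rather than the analytic σ_{m,n}(ε). The sketch below illustrates that idea: it computes only the un-normalized numerator √n[C_{m,n}(ε) − C_{1,n}(ε)^m] of Eqs. (9)–(10) and compares it with shuffled surrogates; it reuses correlation_integral from the previous sketch and is illustrative only.

```python
import numpy as np

def bds_core(x, m, eps):
    # sqrt(n) * [C_{m,n}(eps) - C_{1,n}(eps)**m]; numerator of Eq. (10).
    # correlation_integral(x, m, eps) as defined in the sketch above.
    n = len(x)
    return np.sqrt(n) * (correlation_integral(x, m, eps)
                         - correlation_integral(x, 1, eps) ** m)

def bootstrap_pvalue(x, m, eps, n_boot=200, seed=0):
    """Two-sided p-value against an i.i.d. null: shuffling the series
    destroys any serial dependence while keeping the marginal distribution."""
    rng = np.random.default_rng(seed)
    observed = bds_core(x, m, eps)
    null = [bds_core(rng.permutation(x), m, eps) for _ in range(n_boot)]
    return np.mean(np.abs(null) >= abs(observed))

# e.g. residuals = standardized residuals of the fitted GARCH model
# print(bootstrap_pvalue(residuals, m=2, eps=0.5 * residuals.std()))
```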
We now apply the BDS test to the simulated returns series. In order to eliminate linear structure in our model, the returns data were first filtered by an autoregressive moving average (ARMA) method whose lag length was determined by the Akaike information criterion. Table 3 gives the results of the BDS tests. They strongly reject the hypothesis that simulated stock returns are IID. The rejection of IID is consistent with the view that stock returns are generated by nonlinear stochastic systems, e.g. ARCH and GARCH-type models, or by a nonlinear deterministic process such as chaos or a noisy chaotic model. ARCH-type models have been widely used to describe conditional heteroskedasticity and are deemed to closely resemble the typical behavior of stock market volatility. Our interesting question is: does the conditional heteroskedasticity captured by ARCH-type models account for all the nonlinearity in our model, which is a nonlinear deterministic process disturbed by dynamic noise? To answer this question, we ran the BDS procedures on the standardized residuals of the fitted GARCH model to test whether GARCH captures all nonlinear dependence in stock returns. The best GARCH model is determined by the AIC criterion. The model is:

Mean equation:
r_t = 0.2279 r_{t−2} + 0.3397 ε_{t−1} + ε_t                                     (11)
      (10.73)          (13.31)

Variance equation:
σ_t² = 0.0001 + 0.2565 ε²_{t−1} + 0.4126 σ²_{t−1}                               (12)
       (6.58)    (6.72)            (6.43)

Q(10) = 4.61,  QS(10) = 13.55
The parentheses contain the z-statistics of the estimated coefficients. The Q(10) and QS(10) statistics of the standardized residuals are significantly smaller than the critical value 18.31 at 5%, indicating the absent of linear autocorrelation and heteroskedasticity. The standardized residuals from GARCH model serve as the filtered data. Table 4 shows that the BDS statistics on the standardized residuals are much smaller than those of the ARMA filtered data. However, most BDS statistics are still outside the 5% critical range. There is sufficient evidence to indicate that the GARCH model cannot completely account for all nonlinearity in the simulated returns. This evidence is thus consistent with the results reported by Hsieh [9] in which Hsieh found that the popular GARCH-type models couldn’t capture all nonlinear dependence in 10 stock returns including the weekly S&P500 index and the daily S&P500 index. This suggests that the nonlinear deterministic model disturbed by dynamic noise may be more powerful to mimic and explain the observed fluctuations in real economic and financial time series than the traditional nonlinear stochastic models. Finally, it should be noted that we only display one simulation run here for different parameter values. Because of random element added to our model, one simulated
Table 3. BDS statistics for ARMA filtered data^a

Epsilons/sigma   0.5    1      1.5    2
m=2              12.2   11.6   10.4   8.36
m=3              16.9   15.4   13.8   11.5
m=4              18.4   15.8   13.7   11.3
m=5              18.5   15.7   13.5   11.2
^a all significant at the 5% (two-tailed) level.

Table 4. BDS statistics for GARCH standardized residuals

Epsilons/sigma   0.5    1      1.5    2
m=2              3.70*  3.00*  1.83   0.21
m=3              6.25*  4.99*  3.48*  1.53
m=4              6.42*  4.63*  2.91*  0.66
m=5              6.73*  4.32*  2.60*  0.51
* significant at the 5% (two-tailed) level.
data may differ from each other. However, the nonlinear dynamical behavior and statistical properties for the fixed parameter can be relatively stable. As revealed in further simulations, our conclusions are quite robust (which may easily be checked).
4 Conclusion
In this paper we have outlined a nonlinear deterministic model disturbed by dynamic noise. The dynamical system is able to generate some stylized facts present in real markets: excess kurtosis, volatility clustering, autocorrelation in square and absolute returns, crashes and bubbles. The market is composed of two typical trader types, the rational fundamentalists believing that the price of an asset is determined solely by its fundamental value and the boundedly rational traders governed by greed and fear. The interaction of different types of investors determines the dynamics and the statistical properties of the system. It is worth emphasizing that, in our model, all these interesting qualitative features arise endogenously from the trading process and the interactions of heterogeneous agents. With the assumption of IID normal innovations of the fundamental value, none of these characteristics can be attributed to exogenous impacts. Taking together the complex behavior in real stock markets and the academic achievements (see [1-2], [10]), it seems more robust to model the observed data by a nonlinear structural model buffeted by dynamic noise than by the traditional stochastic approach. Acknowledgments. This research was supported by the MEXT Global COE Program on Informatics Education and Research Center for Knowledge-Circulating society (Kyoto University), the China Postdoctoral Science Foundation, the National Natural Science Foundation of China and the Scientific Research Fund of Hunan Provincial Education Department. The authors would like to thank Prof. Masao Fukushima for his helpful comments and suggestions. All errors remain, of course, our own responsibility.
References 1. Brock, W.A., Hommes, C.H.: Heterogeneous beliefs and routes to chaos in a simple asset pricing model. Journal of Economic Dynamics and Control 22, 1235–1274 (1998) 2. Hommes, C.H., Manzan, S.: Comments on Testing for nonlinear structure and chaos in economic time series. Journal of Macroeconomic 28, 169–174 (2006) 3. Gaunersdorfer, A.: Endogenous fluctuations in a simple asset pricing model with heterogeneous agents. Journal of Economic Dynamics and Control 24, 799–831 (2000) 4. Malliaris, A.G., Stein, J.L.: Methodological issues in asset pricing: random walk or chaotic dynamics. Journal of Banking and Finance 23, 1605–1635 (1999) 5. Shiller, R.J.: The irrationality of markets. The Journal of Psychology and Financial Markets 3, 87–93 (2002) 6. Westerhoff, F.: Greed, fear and stock market dynamics. Physica A 343, 635–642 (2004) 7. Grassberger, P., Procaccia, I.: Characterization of strange attractors. Physical Review Letters 50, 346–349 (1983) 8. Brock, W.A., Dechert, W.D., Scheinkman, J.A., LeBaron, B.: A test for independence based on the correlation dimension. Econometric Reviews 15, 197–235 (1996) 9. Hsieh, D.A.: Chaos and nonlinear dynamics: application to financial markets. Journal of Finance 46, 1839–1877 (1991) 10. Kyrtsou, C., Terraza, M.: Stochastic chaos or ARCH effects in stock series? A comparative study. International Review of Financial Analysis 11, 407–431 (2002)
Bound for the L2 Norm of Random Matrix and Succinct Matrix Approximation Rong Liu, Nian Yan, Yong Shi, and Zhengxin Chen Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100080, China School of Mathematical Science, Graduate University of Chinese Academy of Sciences, Beijing 100049, China College of Information Science and Technology, University of Nebraska at Omaha, Omaha NE 68182, USA
[email protected],
[email protected], {nyan,zchen}@mail.unomaha.edu
Abstract. This work furnishes a sharper bound of exponential form for the L2 norm of an arbitrarily shaped random matrix. Based on the newly elaborated bound, a non-uniform sampling method is developed to succinctly approximate a matrix with a sparse binary one and thereby relieve the computational load in both time and storage. The method is not only pass-efficient but also query-efficient, since the whole process can be completed in one pass over the input matrix and the sampling and quantizing are naturally combined in a single step. Keywords: Sampling, Matrix Approximation, Data Reduction, Random Matrix, L2 Norm.
1 Introduction
The computation of matrix-by-vector products is the most time- and storage-consuming step in many numeric algorithms. The typical complexity of this computation is O(mn) in both time and storage for a matrix of size m × n. In real-world problems such as data mining, machine learning, computer vision, and information retrieval, computations soon become infeasible as the size of matrices gets large, since there are many operations of this type involved. In data mining and machine learning, low rank approximations provide compact representations of the data with limited loss of information and hereby relieve the curse of dimensionality. A well-known technique for low rank approximations is the Singular Value Decomposition (SVD), also called Latent Semantic Indexing (LSI) in information retrieval [7] [4], which minimizes the residual among all approximations with the same rank. Traditional SVD computation requires O(mn min{m, n}) time and O(mn) space. To speed up the SVD-based low rank approximations and save storage, sampling methods are often used to sparsify the input matrices, because the time and space complexity for a sparsified matrix of N non-zero entries are reduced to O(N min{m, n}) and O(N) respectively. If a quantizing (rounding) approach is combined, the computational load can be relieved further.
Attracted by the computational benefits of sparsifying matrices, numerous algorithms for massive data problems adopt sampling methods, and different error bounds are presented along with them. Frieze et al. (2004) [12] developed a fast Monte-Carlo algorithm which finds an approximation D̃ in the vector space spanned by some sampled rows of the input matrix D such that P(‖D − D̃‖_F ≤ ε‖D‖_F) ≥ 1 − δ in two passes through D. Since many rows are discarded in the random sampling process, the output matrix can be interpreted as a sparse representation of the input one. In low rank approximations, Deshpande and Vempala (2006) [8] further presented an adaptive sampling method to approximate the volume sampling such that, with probability at least 3/4, ‖D − D̃_k‖_F ≤ (1 + ε)‖D − D_k‖_F (D_k consists of the first k terms in the SVD of D). This algorithm requires more passes over the input matrix. Passes through a large matrix can be extremely costly if the matrix cannot be fed into Random Access Memory and must be stored in external storage such as a hard disk. An alternative entrywise sampling method which uses only one pass was given by Achlioptas and McSherry (2001) [1], where the reconstruction error has a significantly better bound in the L2 norm, a more effective indicator for the trends in a matrix than the Frobenius norm: P(‖D − D̃‖_2 ≥ 5b√(s(m + n))) ≤ 1/(m + n) under certain regularity conditions. The theorem [2] which bounds the deviations of the eigenvalues of a random symmetric matrix from their medians is used to prove this bound. In [1] they also include detailed comparisons of their application to fast SVD with the methods in [9] and [11]. There are also some achievements in non-sampling based methods. GLRAM, proposed by Ye (2005) [16], is one such method. Instead of the linear transformation used in SVD, GLRAM applies a bilinear mapping to the data because each point is represented by a matrix rather than a vector. The algorithm intends to minimize Σ_{i=1}^{n} ‖A_i − L M_i R^T‖²_F with the same L and R for all A_i (the data points) and hereby relieve computation loads. Their result is appealing but does not admit a closed-form solution in general, hence no error bound is presented. From another point of view, Bar-Yossef (2003) [5] pointed out, based on information theory, that at least O(mn) queries are necessary to produce an (ε, δ) approximation for a given matrix. The review paper by Mannila (2002) [14] mentioned that the methods of [1] worked very well in low rank approximation. However, [14] also suggested the necessity of further research on their bounds. Besides, the following drawbacks were found in these methods under our own investigation. Firstly, the prior (uniform sampling) in [1] is flat, which incorporates no knowledge about the data. Secondly, a zero entry can even be rounded to a non-zero value. Thirdly, the rounding may change the sign of an entry. Among them, the last two are extremely harmful to data mining tasks. The main weakness of the method in [3] is that matrices are required to be symmetric. In addition, this method still demands large amounts of computation and storage. As a continuing study of data reduction, the objective of this paper is to develop an entrywise non-uniform sampling based data reduction (matrix approximation) method with lower reconstruction error.
2 Method and Analysis
The data matrix under consideration in this paper is D ∈ R^{m×n}. Here, m is the number of data points, and n the number of attributes, i.e. the dimension of the points. The method for sparsifying D is presented and analyzed in the following.

2.1 Sampling Method and Its Bound
The sampling method we present to approximate a matrix is

d̃_ij = { 0               w.p. 1 − |d_ij|/c
        { c · sgn(d_ij)   w.p. |d_ij|/c                                         (1)
Here, w.p. means "with the probability of", $c = b \cdot s$, $b = \max_{i,j}|d_{ij}|$, and $s \ge 1$ is a controller for the sparsity. This method directly omits an entry or rounds it to ±c in one query, unlike the method in [1] where an extra quantizing step is needed. Moreover, the matrix obtained can be succinctly represented by a binary matrix corresponding to the signs of the entries, enabling addition in place of multiplication, with a final scaling of the result by c.

One of the most important capability indicators is the residual, or so-called reconstruction error. There are different metrics for the residual; among them, matrix norms are widely used. Because of the insensitivity of the Frobenius norm to the matrix structure, the L2 norm is the preferred metric in classification, LSI, and other data mining tasks [1]. Therefore, the error of the sampling method is estimated in the L2 norm in the following theorem.

Theorem 1. If $\tilde{D}$ is the matrix obtained from the sampling process in Equation (1), then $P(\|D - \tilde{D}\|_2 \le \epsilon) \ge 1 - e^{-2[(1-t)^4\epsilon^2/c^2 - (2+1/t)(m+n)]}$ for all $\epsilon > 0$ and some fixed $t \in (0, 1)$.

The proof of Theorem 1 is given below, along with the motivation of this method.
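As a rough illustration only (not the authors' code; the function and parameter names are our own), the sampling rule in Equation (1) can be sketched as follows.

```python
import numpy as np

def sparsify(D, s=4.0, rng=None):
    """Sketch of the sampling scheme in Equation (1): keep entry d_ij as
    c*sgn(d_ij) with probability |d_ij|/c, otherwise set it to zero,
    where c = b*s and b = max_ij |d_ij|.  Entrywise, E[output] = D."""
    rng = np.random.default_rng() if rng is None else rng
    b = np.abs(D).max()
    c = s * b                                   # s >= 1 controls the sparsity
    keep = rng.random(D.shape) < np.abs(D) / c  # P(keep d_ij) = |d_ij| / c
    return np.where(keep, c * np.sign(D), 0.0)
```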
2.2 Motivation and Analysis
Computing the L2 norm of an arbitrary matrix is quite a difficult task. The known results for this problem mainly involve symmetric matrices. For an arbitrary matrix $B \in \mathbb{R}^{m\times n}$, the following lemma can be used to associate $\|B\|_2$ with the L2 norm of the symmetric matrix
$$A = \begin{bmatrix} 0 & B^T \\ B & 0 \end{bmatrix} \in \mathbb{R}^{(m+n)\times(m+n)} \quad (2)$$
Lemma 1. Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{m+n}$ be the eigenvalues of A; then $\|B\|_2 = \|A\|_2 = \lambda_1 = -\lambda_{m+n}$.
Proof: Let x be an eigenvector belonging to the eigenvalue λ. We decompose the vector into two parts as $x = [x_1^T, x_2^T]^T$ according to the structure of A. Then $Ax = \lambda x$ can be written as
$$\begin{bmatrix} 0 & B^T \\ B & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \lambda \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.$$
Letting $y_1 = x_1$, $y_2 = -x_2$, this becomes
$$\begin{bmatrix} 0 & B^T \\ B & 0 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = -\lambda \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \quad \text{i.e. } Ay = -\lambda y,$$
so $-\lambda$ is also an eigenvalue of A. This means that the eigenvalues of A appear in pairs. Consequently, the L2 norm of the symmetric matrix A is given by $\|A\|_2 = \max_i|\lambda_i| = \lambda_1 = -\lambda_{m+n}$. To prove that the L2 norms of B and A are identical, consider the characteristic polynomial of $A^TA$:
$$|A^TA - \lambda I| = \begin{vmatrix} B^TB - \lambda I & 0 \\ 0 & BB^T - \lambda I \end{vmatrix} = |B^TB - \lambda I|\,|BB^T - \lambda I|.$$
According to Sylvester's theorem [6], $B^TB$ and $BB^T$ have the same non-zero eigenvalues. The equation above further reveals that $A^TA$, $B^TB$, and $BB^T$ all share the same non-zero eigenvalues. This leads us to the claim that $\|B\|_2 = \|A\|_2$.

Now we focus on the L2 norm of a symmetric matrix, since the lemma above shows that one can always construct a symmetric matrix with the same L2 norm as a given matrix of arbitrary shape. Symmetric matrices possess many appealing properties, some of which are very helpful in estimating the L2 norm. The Rayleigh quotient defined below is especially important among them, as it provides information about the eigenvalues of a symmetric matrix:
$$R(x) = \frac{x^TAx}{x^Tx}, \quad x \ne 0.$$
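As a quick numerical check of Lemma 1 (a sketch under our own variable names, not part of the paper), the spectral norm of an arbitrary B coincides with the largest eigenvalue of the symmetric dilation A of Equation (2).

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 3))             # arbitrary rectangular matrix
m, n = B.shape
A = np.block([[np.zeros((n, n)), B.T],      # symmetric dilation of Eq. (2)
              [B, np.zeros((m, m))]])
eigs = np.linalg.eigvalsh(A)                # eigenvalues come in +/- pairs
print(np.linalg.norm(B, 2), eigs[-1], -eigs[0])   # all three values coincide
```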
The next lemma improves the theorem in [6] which associates the L2 norm of a symmetric matrix with the Rayleigh quotient.

Lemma 2. If A is a symmetric matrix, then $\|A\|_2 = \max_x |R(x)|$; furthermore, if the matrix takes the form given in Equation (2), then $\|A\|_2 = \lambda_1 = \max_x R(x)$.

Proof:
The best-known property of the Rayleigh quotient (cited in [6]) is
$$\max_{x\ne 0} R(x) = \lambda_1, \qquad \min_{x\ne 0} R(x) = \lambda_n.$$
Here, $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ are the eigenvalues of A. This property gives the first part of the lemma straightforwardly, since $\|A\|_2 = \max\{|\lambda_1|, |\lambda_n|\} = \max_x|R(x)|$.
According to Lemma 1, $\|A\|_2 = \lambda_1 = \max_x R(x)$ if A takes the form in Equation (2).

The Rayleigh quotient can be defined equivalently as $R(x) = \sum_{i,j} A_{ij}x_ix_j$ on a unit vector $\|x\|_2 = 1$. If A is a random symmetric matrix, R(x) is a random variable. Inequalities are of great use in bounding quantities that might otherwise be hard to compute. But directly using the Rayleigh quotient combined with a probability inequality to bound the L2 norm of A encounters great difficulty, because $P(\|A\|_2 \ge \epsilon) \ge P(|x^TAx| > \epsilon)$ (recall that $\|A\|_2 = \max_{\|x\|_2=1}|x^TAx|$). Fortunately, [10] introduced a method that makes it sufficient to consider only vectors in a discrete space. We strengthen their result ([10], [3]) here as the following lemma.

Lemma 3 (Reduction to Discrete Space). Let $S = \{z : z \in \frac{t}{\sqrt{n}}\mathbb{Z}^n, \|z\|_2 \le 1\}$ for some fixed $t \in (0, 1)$. If for every $v_i, v_j \in S$ we have $v_i^TAv_j \le \epsilon$, then $u^TAu \le \epsilon/(1-t)^2$ for every unit vector u. Moreover, the size of S (denoted by |S|) is at most $e^{(2+1/t)n}$.

Proof: Let y be a vector of length at most $(1-t)$ and C be the specific hypercube of the grid $\frac{t}{\sqrt{n}}\mathbb{Z}^n$ in which y lies. Any two points inside C (including y) are within distance t of each other, since the side length of C is $\frac{t}{\sqrt{n}}$. As the maximum length of y is assumed to be $(1-t)$, all vertices of C are within distance 1 from the origin. Therefore, all vertices of C belong to the set S. The representability of y as a convex combination of the vertices of C can then be extended to S; namely, $y = \sum_i a_iv_i$ ($a_i \ge 0$, $\sum_i a_i = 1$). In this way,
$$y^TAy = \Big(\sum_i a_iv_i\Big)^T A \Big(\sum_j a_jv_j\Big) = \sum_i\sum_j a_ia_j\,v_i^TAv_j \le \epsilon\sum_i\sum_j a_ia_j = \epsilon.$$
The inequality is obtained because we assumed that $v_i^TAv_j \le \epsilon$ for every $v_i, v_j \in S$. As a result,
$$u^TAu = \frac{y^TAy}{(1-t)^2} \le \frac{\epsilon}{(1-t)^2}$$
for any unit vector u.

Map every point $z \in S$ in a 1-1 correspondence to an n-dimensional hypercube of side length $\frac{t}{\sqrt{n}}$ centered on itself, $z \to C_z$ $(= \{z \pm w : \|w\|_\infty \le \frac{t}{2\sqrt{n}}\})$. The length of w satisfies the following inequality:
$$\|w\|_2^2 = \sum_{i=1}^n w_i^2 \le \sum_{i=1}^n \frac{t^2}{4n} = \frac{t^2}{4}.$$
Then, $\|z + w\|_2 \le \|z\|_2 + \|w\|_2 \le 1 + t/2$.
The inequality above indicates that the length of any vector in $C_z$ is bounded by $1 + t/2$. The union of these cubes is thus contained in an n-dimensional ball B of radius $(1 + t/2)$. According to the volume relationship between the union and the ball,
$$|S|\cdot\Big(\frac{t}{\sqrt{n}}\Big)^n = \sum_{z\in S}\mathrm{Vol}(C_z) \le \mathrm{Vol}(B) = \frac{\pi^{n/2}}{\Gamma(n/2+1)}(1+t/2)^n.$$
Consequently,
$$|S| \le \frac{\pi^{n/2}}{\Gamma(n/2+1)}\Big(\frac{(1+t/2)\sqrt{n}}{t}\Big)^n \le e^{(2+1/t)n}.$$
This completes the proof.

Since the probabilistic relationship between $P(\|A\|_2 \le \epsilon)$, $P(x^TAx \le \epsilon \mid \|x\|_2 = 1)$, and $P(v_i^TAv_j \ge (1-t)^2\epsilon \mid v_i, v_j \in S)$ is rather involved, we give an additional lemma to describe it.

Lemma 4. If A takes the form in Equation (2) and $P(v_i^TAv_j \ge (1-t)^2\epsilon \mid v_i, v_j \in S) \le \xi$, then $P(\|A\|_2 \le \epsilon) \ge 1 - \xi e^{2(2+1/t)(m+n)}$.

Proof: The logical relationship among the propositions above is
$$\forall v_i \in S,\ v_i^TAv_j \le (1-t)^2\epsilon \;\Rightarrow\; \forall \|x\|_2 = 1,\ x^TAx \le \epsilon \;\Rightarrow\; \|A\|_2 \le \epsilon.$$
As a result,
$$P(\|A\|_2 \le \epsilon) \ge P(x^TAx \le \epsilon \mid \|x\|_2 = 1) \ge P\Big(\bigcap_{i=1}^{|S|}\bigcap_{j=1}^{|S|}\ v_i^TAv_j \le (1-t)^2\epsilon \mid v_i, v_j \in S\Big).$$
Note that
$$P\Big(\bigcap_{i=1}^{|S|}\bigcap_{j=1}^{|S|}\ v_i^TAv_j \le (1-t)^2\epsilon \mid v_i, v_j \in S\Big) = 1 - P\Big(\bigcup_{i=1}^{|S|}\bigcup_{j=1}^{|S|}\ v_i^TAv_j \ge (1-t)^2\epsilon \mid v_i, v_j \in S\Big)$$
$$\ge 1 - \sum_{i=1}^{|S|}\sum_{j=1}^{|S|} P(v_i^TAv_j \ge (1-t)^2\epsilon \mid v_i, v_j \in S) \ge 1 - \sum_{i=1}^{|S|}\sum_{j=1}^{|S|}\xi = 1 - |S|^2\xi \ge 1 - \xi e^{2(2+1/t)(m+n)}.$$
The last inequality results from the fact that Lemma 3 bounds the maximum size of S by $e^{(2+1/t)(m+n)}$ (note that the dimension of A is (m+n)). Then the lemma follows.

Now we focus on furnishing $P(v_i^TAv_j \ge \epsilon \mid v_i, v_j \in S)$ with an upper bound via probability inequalities. Markov's inequality, obtained by bounding the tail of a random variable by its expectation, is widely used in real-world applications. Hoeffding's inequality is similar in spirit to Markov's inequality, but it is sharper because it makes use of a second-order Taylor expansion. Furthermore, when there is a large number of samples, the bound from Hoeffding's inequality is smaller than the bound from Chebyshev's inequality. Before applying Hoeffding's inequality to get a bound for the sampling method, we give a refinement of it:

Lemma 5. Let $Y_1, \ldots, Y_n$ be independent observations such that $E(Y_i) = 0$ and $\alpha_i \le Y_i \le \beta_i$. Then, for any $\epsilon > 0$,
$$P\Big(\sum_{i=1}^n Y_i \ge \epsilon\Big) \le e^{-2\epsilon^2/\sum_{i=1}^n(\beta_i - \alpha_i)^2}.$$

Proof: The original Hoeffding's inequality given in [15] under the same conditions is
$$P\Big(\sum_{i=1}^n Y_i \ge \epsilon\Big) \le e^{-t\epsilon}\prod_{i=1}^n e^{t^2(\beta_i-\alpha_i)^2/8}, \quad \forall t > 0.$$
When $t = 4\epsilon/\sum_{i=1}^n(\beta_i-\alpha_i)^2$, the right-hand side of the inequality above achieves its minimum $e^{-2\epsilon^2/\sum_{i=1}^n(\beta_i-\alpha_i)^2}$. Thus the claim.
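As a sanity check (a sketch with arbitrary numbers, not part of the proof), the bound of Lemma 5 can be compared against a Monte-Carlo estimate of the tail probability for bounded zero-mean variables.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, eps = 50, 200_000, 10.0
# independent zero-mean Y_i, uniform on [alpha_i, beta_i] = [-1, 1]
Y = rng.uniform(-1.0, 1.0, size=(trials, n))
empirical = np.mean(Y.sum(axis=1) >= eps)
hoeffding = np.exp(-2 * eps**2 / (n * 2.0**2))   # Lemma 5 with beta_i - alpha_i = 2
print(empirical, hoeffding)                      # empirical tail stays below the bound
```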
A bound for $P(v_i^TAv_j \ge \epsilon \mid v_i, v_j \in S)$ can then be obtained by applying the results above. We state it as one of our main results in the following theorem.

Theorem 2. Let $B \in \mathbb{R}^{m\times n}$ be a random matrix with independent entries of zero mean. If $\alpha_{ij} \le b_{ij} \le \beta_{ij}$, then $P(\|B\|_2 \le \epsilon) \ge 1 - \delta$ for all $\epsilon > 0$, where
$$\delta = e^{-2[(1-t)^4\epsilon^2/\sum_{i=1}^n\sum_{j=1}^m(x_iw_j+u_iy_j)^2(\beta_{ij}-\alpha_{ij})^2 - (2+1/t)(m+n)]}$$
for some fixed $t \in (0, 1)$, $[x^T, y^T]^T = v_i \in S$, and $[u^T, w^T]^T = v_j \in S$.
Proof: We construct a symmetric matrix A from B as in Equation (2); then $\|B\|_2 = \|A\|_2$ according to Lemma 1 and Lemma 2. Lemma 4 further reveals that $P(\|B\|_2 \le \epsilon) = P(\|A\|_2 \le \epsilon) \ge 1 - \xi e^{2(2+1/t)(m+n)}$ if $P(v_i^TAv_j \ge (1-t)^2\epsilon \mid v_i, v_j \in S) \le \xi$. For the matrix A,
$$v_i^TAv_j = [x^T\ y^T]\begin{bmatrix}0 & B^T\\ B & 0\end{bmatrix}\begin{bmatrix}u\\ w\end{bmatrix} = y^TBu + x^TB^Tw = \sum_{i=1}^n\sum_{j=1}^m (x_iw_j + u_iy_j)\,b_{ij}.$$
Since the $b_{ij}$ are independent, the $(x_iw_j + u_iy_j)b_{ij}$ are also independent and zero mean. The requirements of Lemma 5 are thus satisfied, and the range of the random variable $(x_iw_j+u_iy_j)b_{ij}$ is from $(x_iw_j+u_iy_j)\alpha_{ij}$ to $(x_iw_j+u_iy_j)\beta_{ij}$ or conversely. As a result,
$$P(v_i^TAv_j \ge (1-t)^2\epsilon \mid v_i, v_j \in S) \le e^{-2(1-t)^4\epsilon^2/\sum_{i=1}^n\sum_{j=1}^m(x_iw_j+u_iy_j)^2(\beta_{ij}-\alpha_{ij})^2} = \xi.$$
Then the theorem follows.
We are now prepared to prove Theorem 1. Before going further, we present the motivation of the sampling method. A general approach for sparsifying a dense matrix is
$$\tilde{d}_{ij} = \begin{cases}0 & \text{w.p. } 1 - p_{ij}\\ \gamma_{ij} & \text{w.p. } p_{ij}\end{cases}$$
In many cases, making the expectation of the sampled matrix equal to the input one is a reasonable strategy. We follow this too, that is, $d_{ij} = E(\tilde{d}_{ij}) = 0\cdot(1-p_{ij}) + \gamma_{ij}\cdot p_{ij} = \gamma_{ij}\cdot p_{ij}$. So our method is
$$\tilde{d}_{ij} = \begin{cases}0 & \text{w.p. } 1 - p_{ij}\\ d_{ij}/p_{ij} & \text{w.p. } p_{ij}\end{cases}$$
Define $B = D - \tilde{D}$ as the error or difference matrix of the sampling. Obviously, $E(b_{ij}) = 0$, and the $b_{ij}$ are independent because the $\tilde{d}_{ij}$ are sampled independently. The requirements of Theorem 2 are hereby satisfied. The range of the random variable $b_{ij}$ is from $d_{ij}$ to $(1-1/p_{ij})d_{ij}$ or the converse. According to Theorem 2, the components that contain the ranges then become $\sum_{i=1}^n\sum_{j=1}^m(x_iw_j + u_iy_j)^2(d_{ij}/p_{ij})^2$. Now the last free parameter to be settled is the probability $p_{ij}$, which is essential for the success of many sampling-based methods. From the analysis above, we know that as $p_{ij}$ increases, the $\delta$ in the bound of $\|B\|_2$ decreases but the density of the matrix $\tilde{D}$ increases; in applications, a compromise between them should be taken into consideration. Noting that there is no a priori reason to omit each entry of the input matrix D with the same probability, we set $p_{ij} = |d_{ij}|/c$ (c a constant positive real number) so that all entries of B are contained in segments of equal length c and the bound for $\|B\|_2$ is simplified further. In addition, the property that entries with larger absolute values are more likely to be retained is desired in many real-world applications. To guarantee that $p_{ij}$ is a well-defined probability contained in the segment [0, 1], the scale factor c is set to $s \cdot b$. As a result, sparsifying (omitting) and quantizing (rounding) are combined naturally without any extra steps. This is how the sampling method in Equation (1) comes into being. Having explained the motivation, Theorem 1 can be proved directly.
Proof of Theorem 1: For $p_{ij} = |d_{ij}|/c$, $\sum_{i=1}^n\sum_{j=1}^m(x_iw_j+u_iy_j)^2(\beta_{ij}-\alpha_{ij})^2 = c^2\sum_{i=1}^n\sum_{j=1}^m(x_iw_j+u_iy_j)^2$. And we have
$$\begin{aligned}
\sum_{i=1}^n\sum_{j=1}^m(x_iw_j+u_iy_j)^2 &= \sum_{i=1}^n\sum_{j=1}^m(x_i^2w_j^2+u_i^2y_j^2) + 2\sum_{i=1}^n x_iu_i\sum_{j=1}^m w_jy_j\\
&\le \sum_{i=1}^n\sum_{j=1}^m(x_i^2w_j^2+u_i^2y_j^2) + \Big(\sum_{i=1}^n x_iu_i\Big)^2 + \Big(\sum_{j=1}^m w_jy_j\Big)^2\\
&\le \sum_{i=1}^n\sum_{j=1}^m(x_i^2w_j^2+u_i^2y_j^2) + \Big(\sum_{i=1}^n x_i^2\Big)\Big(\sum_{i=1}^n u_i^2\Big) + \Big(\sum_{j=1}^m w_j^2\Big)\Big(\sum_{j=1}^m y_j^2\Big)\\
&= \Big(\sum_{i=1}^n x_i^2 + \sum_{j=1}^m y_j^2\Big)\Big(\sum_{i=1}^n u_i^2 + \sum_{j=1}^m w_j^2\Big) \le 1.
\end{aligned}$$
The second inequality is obtained from Cauchy's inequality, and the last one follows from the fact that $[x^T, y^T]^T = v_i \in S$ and $[u^T, w^T]^T = v_j \in S$. Accordingly, $\delta \le e^{-2[(1-t)^4\epsilon^2/c^2 - (2+1/t)(m+n)]}$. Applying Theorem 2, $P(\|B\|_2 \le \epsilon) \ge 1 - e^{-2[(1-t)^4\epsilon^2/c^2 - (2+1/t)(m+n)]}$ follows. Our method hereby has a much sharper error bound, with a larger compression ratio, than that of [1]. The bound of [3] is restricted to symmetric data matrices; moreover, their δ is two times worse than ours.
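To illustrate the behaviour described above (a small sketch of our own, not the authors' experiments), one can sparsify a random matrix with Equation (1) and observe how the spectral error and the density of the output trade off as the sparsity controller s varies.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 400, 300
D = rng.standard_normal((m, n))
b = np.abs(D).max()
for s in (2.0, 4.0, 8.0):
    c = s * b
    keep = rng.random(D.shape) < np.abs(D) / c
    D_tilde = np.where(keep, c * np.sign(D), 0.0)   # Equation (1)
    err = np.linalg.norm(D - D_tilde, 2)            # observed ||D - D~||_2
    print(f"s={s:4.1f}  density={keep.mean():.3f}  spectral error={err:.1f}")
```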
3 Discussion
In this paper, we furnish a bound of exponential form for the L2 norm of a random matrix of arbitrary shape. The method we use differs from the one that bounds the deviations of the eigenvalues of a random symmetric matrix from their medians and thus gives a bound of polynomial form. Through the Rayleigh quotient of the matrix, our work associates the L2 norm of an input matrix with an inner product weighted by a constructed symmetric matrix over a reduced discrete space. As a result, an exponential bound for the input matrix is obtained by applying Hoeffding's inequality to the weighted inner product. Moreover, the upper probability bound is further improved by a factor of two by exploiting the special structure of the constructed symmetric matrix. These findings shed new light on the understanding of the L2 norm of random matrices and bound it more tightly. Motivated by the exponential bound we found, a non-uniform sampling based sparse binary matrix approximation is presented to accelerate computations and save storage for massive data matrices. In the sampling method, omitting and rounding are naturally combined without the need for extra steps, because the sampling probabilities we choose tailor the segments containing the retained entries to equal length. Consequently, the
implementation requires only one pass over the input matrix and the bound is simplified further. Acknowledgments. This work was partially supported by 973 Project of Chinese Ministry of Science and Technology (Grant No.2004CB720103), National Natural Science Foundation of China (Grant No.70621001, 70531040, 70501030, 10601064, 70472074), National Natural Science Foundation of Beijing (Grant No.9073020), and BHP Billiton Cooperation of Australia.
References 1. Achlioptas, D., McSherry, F.: Fast Computation of Low Rank Matrix Approximations. In: STOC 2001: Proceedings of the 32nd annual ACM symposium on Theory of computing, pp. 611–618 (2001) 2. Alon, N., Krivelevich, M., Vu, V.H.: On the Concentration of Eigenvalues of Random Symmetric Matrices. Tech. Report 60, Microsoft Research (2000) 3. Arora, S., Hazan, E., Kale, S.: A Fast Random Sampling Algorithm for Sparsifying Matrices. In: D´ıaz, J., Jansen, K., Rolim, J.D.P., Zwick, U. (eds.) APPROX 2006 and RANDOM 2006. LNCS, vol. 4110, pp. 272–279. Springer, Heidelberg (2006) 4. Berry, M., Dumais, S., O’Brie, G.: Using Linear Algebra for Intelligent Information Retrieval. SIAM Review 37, 573–595 (1995) 5. Bar-Yossef, Z.: Sampling Lower Bounds via Information Theory. In: Proceedings of the 35th annual ACM symposium on Theory of computing, pp. 335–344 (2003) 6. Chen, Y.P.: Matrix Theories, 2nd edn. NorthWestern Polytechnical University Press, Xian (1998) 7. Deerwester, S., Dumais, S., Furnas, G., Landauer, T., Harshman, R.: Indexing by Latent Semantic Anlysis. J. American Society for information science 41, 391–407 (1990) 8. Deshpande, A., Vempala, S.: Adaptive Sampling and Fast Low-Rank Matrix Approximation. In: D´ıaz, J., Jansen, K., Rolim, J.D.P., Zwick, U. (eds.) APPROX 2006 and RANDOM 2006. LNCS, vol. 4110, pp. 292–303. Springer, Heidelberg (2006) 9. Drineas, P., Frieze, A., Kannan, R., Vempala, S., Vinay, V.: Clustering in large graphs and matrices. In: Proceedings of the 10th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 291–299 (1999) 10. Feige, U., Ofek, E.: Spectral Techniques Applied to Sparse random graphs. Random Structures and Algorithms 27(2), 251–275 (2005) 11. Frieze, A., Kannan, R., Vempala, S.: Fast Monte-Carlo algorithms for finding lowrank approximations. In: 39th Annual Symposium on Foundations of Computer Science, pp. 370–378 (1998) 12. Frieze, A., Kannan, R., Vempala, S.: Fast Monte-Carlo algorithms for finding lowrank approximations. Journal of the ACM 51(6), 1025–1041 (2004) 13. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Maryland (1996) 14. Mannila, H.: Local and Global Methods in Data Mining: Basic Techniques and Open Problems. In: Proceedings of the 29th International Colloquium on Automata, Languages and Programming, pp. 57–68 (2002) 15. Wasserman, L.: All of Statistics: A Concise Course in Statistical Inference. Springer, New York (2004) 16. Ye, J.P.: Generalized Low Rank Matrix Approximations of Matrices. Machine Learning 61(1-3), 167–191 (2005)
Select Representative Samples for Regularized Multiple-Criteria Linear Programming Classification Peng Zhang1, Yingjie Tian1, Xingsen Li1, Zhiwang Zhang1, and Yong Shi1,2,* 1
Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing, China, 100080 2 College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA
[email protected], {tyj,xsli,zzw,yshi}@gucas.ac.cn
Abstract. The regularized multiple-criteria linear programming (RMCLP) model is a new and powerful method for classification in data mining. Because it takes account of every training instance, RMCLP is sensitive to outliers. In this paper, we propose a sample selection method that seeks the representative points for the RMCLP model, just as one finds the support vectors for the support vector machine (SVM). This sample selection method can also exclude the outliers in the training set and reduce the quantity of training samples, which can significantly save costs in the business world because labeling training samples is usually expensive and sometimes impossible. Experimental results show that our method not only reduces the quantity of training instances, but also improves the performance of RMCLP. Keywords: RMCLP, clustering, outlier detection, sample reduction.
1 Introduction

Recently, multiple-criteria mathematical programming models have been widely used for classification in data mining and business intelligence [1]. The original idea of multiple-criteria linear classification is to simultaneously maximize the distances between classes and minimize the overlapping of the groups [2]. Assume there are two groups G1 and G2. Multiple-criteria mathematical models try to find a projection direction x and a boundary b such that all the training records $a_i$ satisfy the constraints $a_ix \le b$ for $a_i \in G_1$ and $a_ix \ge b$ for $a_i \in G_2$. This formulation, which involves all the training samples, performs pretty well when there are not many outliers, but the disadvantage is also obvious: if there are outliers, it needs many more samples to remain stable. The biggest difference between RMCLP and the well-known support vector machine (SVM) is that RMCLP takes account of all the training instances when drawing the classification boundary, whereas SVM draws the classification boundary only according to a few support vectors. If we can find the representative instances for multiple-criteria models, just like the support vectors of SVM, RMCLP can robustly avoid the effect of noise and the training samples can be reduced dramatically. This will benefit business applications significantly, because in the business
*
Corresponding author.
world, acquiring training samples is always costly and sometimes impossible; reducing the training samples can save a lot of money in data mining. That is to say, if we find the representative instances, multi-criteria models will be more powerful in business data mining. Clustering is an effective method to remove outliers. With the clustering centers, it can also reduce the training samples for classification. In this paper, we first introduce the RMCLP model, and then we give the algorithm for selecting the representative instances for RMCLP. The following experiment on a synthetic dataset demonstrates that finding the representative instances is capable of improving the performance of RMCLP. In the fourth section, we show an empirical study on a real-life credit card dataset. Finally, we conclude this paper.
2 Two-Group RMCLP Model

In 2007, Shi et al. proposed a new regularized multiple-criteria linear programming (RMCLP) model for classification. We give a brief introduction here without demonstrating its mathematical solution; a more theoretical explanation of this model is in [3]. Given an r-dimensional attribute vector $a = (a_1, \ldots, a_r)$, let $A_i = (A_{i1}, \ldots, A_{ir}) \in \mathbb{R}^r$ be one of the sample records of these attributes, where $i = 1, \ldots, n$; n represents the total number of records in the dataset. Suppose two groups, G1 and G2, are predefined. A boundary scalar b can be set to separate these two groups. We then give the RMCLP model as follows:
$$\min_z\ \frac{1}{2}x^THx + \frac{1}{2}\alpha^TQ\alpha + d^T\alpha - c^T\beta$$
s.t.
$$A_ix - \alpha_i + \beta_i = b,\ \forall A_i \in G_1; \qquad A_ix + \alpha_i - \beta_i = b,\ \forall A_i \in G_2; \qquad \alpha_i, \beta_i \ge 0. \quad (1)$$
In this model,
α and β are n-dimensional vectors. α denotes the misclassification distance, while β denotes the correct classification distance. As far as the constraints are concerned, when we misclassify $A_i \in G_1$ into $G_2$ or vice versa, there is a distance $\alpha_i$ whose value equals $|A_ix - b|$; when we correctly classify $A_i$, there is a distance $\beta_i$ whose value equals $|A_ix - b^*|$, where $b^* = b + \alpha_i$ or $b^* = b - \alpha_i$. If we let the objective function be a linear combination of α and β, that is to say, minimize $\sum_i\alpha_i$ and $-\sum_i\beta_i$ simultaneously, we get the original MCLP model. To ensure that the MCLP model always has a solution, we add the $\frac{1}{2}x^THx$ and $\frac{1}{2}\alpha^TQ\alpha$ terms to the objective function to formulate the two-group RMCLP model. $H \in \mathbb{R}^{r\times r}$ and $Q \in \mathbb{R}^{n\times n}$ are
symmetric positive definite matrices, and $d, c \in \mathbb{R}^n$. The RMCLP model is a convex quadratic program.
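A minimal sketch of how model (1) could be set up with a generic QP solver (CVXPY here) is given below. Treating b as a fixed boundary scalar, taking H and Q as identity matrices, and using uniform scalar weights for d and c are our own assumptions for illustration, not the authors' settings.

```python
import cvxpy as cp
import numpy as np

def rmclp(A1, A2, b=1.0, d=1.0, c=0.5):
    """Rough CVXPY sketch of the two-group RMCLP model (1); H = Q = I assumed."""
    r = A1.shape[1]
    n1, n2 = A1.shape[0], A2.shape[0]
    x = cp.Variable(r)
    alpha = cp.Variable(n1 + n2, nonneg=True)   # misclassification distances
    beta = cp.Variable(n1 + n2, nonneg=True)    # correct-classification distances
    obj = (0.5 * cp.sum_squares(x) + 0.5 * cp.sum_squares(alpha)
           + d * cp.sum(alpha) - c * cp.sum(beta))
    cons = [A1 @ x - alpha[:n1] + beta[:n1] == b,
            A2 @ x + alpha[n1:] - beta[n1:] == b]
    cp.Problem(cp.Minimize(obj), cons).solve()
    return x.value

# usage sketch: x = rmclp(np.random.randn(20, 3) + 1, np.random.randn(20, 3) + 5)
```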
3 Algorithm for Sample Selection

Data preprocessing is an important step in data mining. Without a pure and correct data source, data mining models lose their power to discover useful knowledge. Traditionally, bagging and boosting methods are popular in sample selection. In this section, a clustering-based sample selection algorithm (Algorithm 1) is introduced as follows:
-------------------------------------------------------------------------------------------
Input: training samples Tr, testing samples Ts, parameter ε, exclusion percentage s
Output: selected samples Tr'
Begin
1. Set Tr' = Tr
2. While ( |PrevClusteringCenter - CurrClusteringCenter| < ε ) {
   2.1 Calculate the current clustering center: cent = (1/|Tr'|) Σ_i x_i
   2.2 For each instance i ∈ Tr do {
       2.2.1 Calculate the squared Euclidean distance to the clustering center:
             dis_i = Σ_{r∈R} (cent_r − Tr_{i,r})²
       2.2.2 Get the s% of the instances which are farthest from the center, denoted as the set {P}
       2.2.3 Exclude the noisy instances, Tr' = Tr' \ {P}.
   }
}
3. Return the selected samples Tr'
-------------------------------------------------------------------------------------------
Algorithm 1. Clustering method to get the representative samples

We now create a synthetic dataset and show the effectiveness of our method on RMCLP. Assume a two-group classification problem in which all of the samples follow the distribution x ~ N(μ, Σ); more specifically, G1 ~ N(1, Σ) and G2 ~ N(5, Σ). There are 6 instances in G1 and 6 instances in G2; additionally, a noisy instance -100 is added to G1, and a noisy point -200 is added to G2. We will see how these two noisy instances affect the boundary of RMCLP: G1 = {1.11, 1.20, 1.19, 0.82, 0.90, -50}, G2 = {5.20, 5.22, 4.76, 4.99, -100}.

Case 1: We directly build the RMCLP model on this dataset; the optimal parameters of RMCLP are H=105, Q=1, d=105, c=100. The projection direction is x=-0.798034, and
boundary b=7.33516. The objective value of each element in G1 is {-0.886, -0.958, 0.950, -0.654, -0.718, 39.902}, and in G2 it is {-4.150, -4.166, -3.799, -3.982, -4.086, 79.803}. We can see that the accuracy is 83% on G1 and 16% on G2.

Case 2: Now we first call Algorithm 1 to take out the noisy -50 from G1 and -100 from G2, and then we build the RMCLP model on the purified dataset; the optimal parameters are H=105, Q=1, d=105, c=104. Here x=1.98336 and b=5.50112. The objective value of G1 is {2.202, 2.380, 2.360, 1.626, 1.785} and that of G2 is {10.313, 10.353, 9.441, 9.897, 10.155}. So the accuracies of the two groups are both 100%. This means we have obtained a good representative training sample. Besides this synthetic dataset, we will test the effectiveness of our clustering-based sample selection method on a real-life dataset in the next section.
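As a rough illustration of Algorithm 1 (our own simplification: a fixed number of exclusion rounds replaces the ε-based stopping test, and the data are assumed to be a 2-D array of instances), the iterative exclusion of the farthest points can be sketched as follows.

```python
import numpy as np

def select_samples(Tr, s=1.0, rounds=10):
    """Iteratively drop the s% of instances farthest from the current mean."""
    Tr = np.asarray(Tr, dtype=float)
    keep = np.ones(len(Tr), dtype=bool)
    for _ in range(rounds):
        cent = Tr[keep].mean(axis=0)                       # current cluster centre
        dist = ((Tr - cent) ** 2).sum(axis=1)              # squared Euclidean distances
        k = max(1, int(np.ceil(keep.sum() * s / 100.0)))   # s% farthest points
        far = np.argsort(np.where(keep, dist, -np.inf))[-k:]
        keep[far] = False                                  # exclude them
    return Tr[keep]

# usage sketch for the 1-D synthetic G1 above:
# select_samples(np.array([1.11, 1.20, 1.19, 0.82, 0.90, -50]).reshape(-1, 1), s=10, rounds=1)
```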
4 Empirical Study on Real Life Dataset

In this section, we do experiments on a real-life credit card dataset; a more detailed introduction to this dataset is in [4]. Following previous research work on this dataset, we first select a benchmark training set of 700 randomly selected bankruptcy records and 700 normal records, and the remaining 4600 records are used to test the performance. Now what we need to do is examine three questions: First, are the randomly selected 1400 points suitable for building the model? Second, are there any noisy points in this randomly selected dataset? Third, can we reduce the 1400 points to a much smaller set and improve the accuracy at the same time?

Table 1. Comparison of different percentages of training samples

  Percentage (number)     Training Samples              Testing Samples (4600 instances)
  of training samples     Right instances  Accuracy     Right instances  Accuracy
  100% (1400)                  1096        78.29%            3394        73.78%
  90%  (1260)                   998        79.20%            3295        71.63%
  80%  (1120)                   912        81.43%            3292        71.57%
  70%  (980)                    789        80.51%            3571        77.63%
  60%  (840)                    667        79.40%            3761        81.76%
  50%  (700)                    559        79.86%            3881        84.37%
  40%  (560)                    449        80.18%            3964        86.17%
  30%  (420)                    331        78.81%            4050        88.04%
  20%  (280)                    232        82.86%            4073        88.54%
  10%  (140)                    116        82.86%            1971        42.85%

The experimental results in Table 1 answer these questions. The first column is the current training sample size, from 1400 instances down to 140 instances; columns two and three give the performance on the training samples, and columns four and five give the performance on the 4600 testing records. The experiment is conducted as follows: first, we build the RMCLP model on all 1400 samples; the result is 73.78% and we take it as the benchmark accuracy. Then
we call Algorithm 1 with s=1, so the clustering method drops points after each iteration. We mark 9 special sets, 10%, 20%, …, up to 90% of the original 1400 samples respectively, and build the RMCLP model on these selected samples. We finally list the performance of RMCLP with each different sample set. Intuitively, we thought that the larger the training sample, the more information we could get and the more accurate the prediction. However, from the results we can see that the 1400 randomly selected samples are not the best training set for the RMCLP model; there exist noisy and useless points. The clustering method reduces the training samples continuously. When we reach 20%, that is, 280 samples out of the benchmark 1400 samples, we can build an RMCLP model with 88.54% accuracy on the testing set, which is the best.
5 Conclusions

As a promising new data mining tool, RMCLP has been extensively used in data mining and business intelligence. Labeling a large number of training samples before building an RMCLP model is expensive and sometimes impossible. Even if we could label all of them, there would be a lot of noise that impacts the performance of RMCLP. So we should find the representative instances for RMCLP, just as one finds the support vectors of SVM. In this paper, we have proposed a clustering-based method to remove the noisy data and, furthermore, to reduce the training samples. From the empirical study, we can see that our method is able to improve the performance of RMCLP on the credit card dataset. Whether this clustering method obtains the optimal representative samples for RMCLP has not been discussed here and remains a topic for our further research.
Acknowledgments This research has been partially supported by a grant from National Natural Science Foundation of China (#70621001, #70531040, #70501030, #10601064, #70472074), National Natural Science Foundation of Beijing #9073020, 973 Project #2004CB720103, Ministry of Science and Technology, China and BHP Billiton Co., Australia.
References 1. Olson, D., Shi, Y.: Introduction to Business Data Mining. McGraw-Hill, New York (2007) 2. Shi, Y., Peng, Y., Kou, G., Chen, Z.: Classifying Credit Card Accounts for Business Intelligence and Decision Making: A Multiple-Criteria Quadratic Programming Approach. International Journal of Information Technology and Decision Making 4, 581–600 (2005) 3. Shi, Y., Chen, X., Tian, Y., Zhang, P.: A Regularized Multiple Criteria Linear Program for Classification. In: IEEE ICDM 2007 (2007). 4. Peng, Y., Kou, G., Chen, Z., Shi, Y.: Cross-Validation and Ensemble Analyses on MultipleCriteria Linear Programming Classification for Credit Cardholder Behavior. In: Bubak, M., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2004. LNCS, vol. 3039, pp. 931–939. Springer, Heidelberg (2004)
A Kernel-Based Technique for Direction-of-Change Financial Time Series Forecasting Andrew Skabar Department of Computer Science and Computer Engineering La Trobe University, Victoria 3086 Australia [email protected]
Abstract. This paper presents a generative approach to direction-of-change time series forecasting. Kernel methods are used to estimate densities for the distribution of positive and negative returns, and these distributions are then combined to produce probability estimates for return forecasts. An advantage of the technique is that it involves very few parameters compared to regressionbased approaches, the only free parameters being those that control the shape of the windowing kernel. A special form is proposed for the kernel covariance matrix. This allows recent data more influence than less recent data in determining the densities, and is important in preventing overfitting. The technique is applied to predicting the direction of change on the Australian All Ordinaries Index over a 15 year out-of-sample period. Key words: Financial time series forecasting, kernel methods
1 Introduction Financial time series forecasting has almost invariably been approached as an autoregression problem in which future values of a time series are predicted on the basis of past values. The parameters of the prediction model are first optimized using insample data; the model is then used to make forecasts on a set of out-of-sample data, and the accuracy of the forecasts is measured by comparing the forecast values with the realized (i.e., actual) values. In evaluating forecast accuracy, the mainstream literature has tended to focus almost exclusively on measuring error magnitudes. For example, in their review of time series forecasting covering a 25 year period, De Gooijer and Hyndman (2006) include a section on forecast evaluation and accuracy measures in which they list 17 commonly used forecast accuracy measures, all of which are based on the magnitude of the error between the predicted and realized values [1]. This is interesting because in many cases the magnitude of the error is not as important as whether the direction of the prediction (i.e., up or down) is correct. For example, if one is going to make stock trading decisions based on forecast values, it is more important to be able to correctly predict the direction of change than it is to achieve, say, a small mean squared error. We call this direction-of-change forecasting. The importance of correctly predicting the direction of change has been acknowledged in several recent papers [2-5]. M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 441 – 449, 2008. © Springer-Verlag Berlin Heidelberg 2008
One approach to direction-of-change forecasting is simply to use a regression model to forecast the next value of the time series, and to then convert this to a direction prediction by comparing the forecast value with the current value: if the forecast value is greater than the current value of the series, predict up; otherwise predict down. That is, direction-of-change forecasting can be seen as simply a regression problem involving this extra step. Many regression models have been proposed and tested over the years. Many of these models are linear, but in recent years non-linear models have also become popular. For example, there are many reports in the literature of the application of neural networks to financial time series forecasting [2][6-8]. There is, however, considerable debate about whether non-linear models are able to provide better forecasts than linear or random walk models when applied to financial time-series data [9-11]. One of the main problems with non-linear models such as neural networks is that they involve a large number of parameters. A typical neural network architecture will have in the order of 100s of weights which must be optimized, usually using some gradient-descent based technique. This can lead to severe over-fitting problems, especially on noisy data such as financial commodity prices. Preventing over-fitting requires careful selection of number of hidden units, regularization coefficients, and early stopping point, and these usually require an expensive cross-validation procedure. Furthermore, there is no guarantee that training from a different set of initial weights will results in the same predictions. In this paper we take a conceptually different approach to direction-of-change forecasting. Rather than treating the problem as a regression problem, we conceptualize the problem as a binary classification problem, and predict the direction of change directly. There are several reasons as to why such an approach might be advantageous. Firstly, traders base their trading decisions primarily on their opinion of whether the price of a commodity will rise or fall, and to a lesser extent on their opinion of the degree with which it will rise or fall. This may create in financial systems an underlying dynamic that allows the direction of change to be predicted more reliably than the actual value of the series. Secondly, conceptualizing the problem as a classification problem allows us to apply a different family of algorithms. In this paper, we are specifically interested in applying generative classification models. That is, rather than discovering a function which maps directly from a set of inputs onto a prediction of the direction of change (as would be the case with a neural network approach, for example), the approach we present in this paper is based on estimating probability densities and combining these under Bayes' Theorem to arrive at a posteriori estimates of the probabilities of upward and downward movements. The main advantage of this approach over discriminative approaches is that it leads to more parsimonious models that involve few parameters. The remainder of the paper is structured as follows. Section 2 reviews the basic framework for density estimation-based classification and outlines how this may be applied to direction-of-change time series forecasting. Section 3 describes the kernel covariance matrix parameterization, together with a cross-validation procedure for optimizing the covariance matrix parameters. 
Section 4 presents and discusses empirical results of applying the techniques to the Australian All Ordinaries (AORD) Index. Section 5 concludes the paper.
2 Generative Models for Classification Generative classification models are based on estimating the probability distributions from which the data was generated. Typically, the probability density functions are estimated; these densities are then combined using Bayes' Theorem to produce posterior probabilities; and these probabilities are then used to perform classification. In this section we describe how density estimation-based methods can be applied to direction-of-change time series forecasting. 2.1 Preliminaries Because financial commodity price series are usually highly non-stationary it is common to apply some type of transformation to the raw price data, thus obtaining transformed variables. Transformed variables are typically based either on absolute or relative changes in price, and in this paper we use returns. The return on day t is defined as rt = ( pt − pt −1 ) / pt −1 , where pt and pt−1 are the prices respectively on days t
and t−1. Our objective is to predict the direction of the change in price on day t; that is, we wish to predict the sign of rt. More specifically, we wish to predict the sign of rt on the basis of the returns observed on the D days preceding day t. That is, we wish to predict sign(rt) on the basis of the vector (rt-1, rt-2, … , rt-D), which we refer to as the delayed-return vector for day t. For notational convenience, we will represent the delayed-return vector as a column vector xt = (rt-1, rt-2, … , rt-D)T, which we refer to as a data point. If there are N data points, then we represent the collection of these using the set X = {x1, x2, …, xN}, where |X| = N. 2.2 Density Estimation
There are three general approaches to explicitly estimating probability densities: the parametric, non-parametric, and semi-parametric (or mixture-model) approaches. The parametric approach assumes the form of the distribution (e.g., Gaussian), and the task is to estimate the values of the parameters for that distribution (e.g., the mean and covariance in the Gaussian case). A problem with this approach is that many datasets do not follow a standard distribution, and attempts to model them in this way may lead to very poor estimates of the distribution. A second approach—the kernel, or Parzen, method—is a non-parametric approach that involves modelling the distribution using a series of probability windows (usually Gaussian) centred at each sample [12][13]. The overall probability density function is the average of all of the individual distributions centred at each point: i.e.,
$$p(x) = \frac{1}{|X|}\sum_{n=1}^{|X|}\frac{1}{(2\pi)^{D/2}|\Sigma|^{1/2}}\exp\Big(-\frac{1}{2}(x-x_n)^T\Sigma^{-1}(x-x_n)\Big) \quad (1)$$
where x is the point at which the density is estimated, X = {x1, x2, …, xN} is the set of points from which the density is estimated, the factor $1/((2\pi)^{D/2}|\Sigma|^{1/2})$ is a constant which ensures that the area under each Gaussian sums to one, and Σ is the covariance matrix for the Gaussian kernel. The main decision to be made in using this approach
is the selection of Σ, which defines the shape of the windowing kernel function and acts as the smoothing parameter. If the kernel is too narrow then the distribution will be peaked around each of the sample points; if the kernel is too wide then the distribution will be overly smoothed. A third approach is the semi-parametric, or mixture model, approach, which can be seen as a compromise between the parametric and non-parametric approaches. In this case K distributions are used to model the data, where K is much smaller than the number of sample points. Only the non-parametric approach is used in this paper.

2.3 Classification

Bayes' Theorem states that the posterior probability that an observation x belongs to class Ck is given by
$$P(C_k \mid x) = \frac{p(x\mid C_k)\,P(C_k)}{p(x)}, \quad (2)$$
where $p(x\mid C_k)$ is the class-conditional probability density function for examples belonging to class $C_k$, $P(C_k)$ is the prior probability that an example belongs to class $C_k$ (and can be estimated from the training data), and $p(x)$ is the unconditional probability density function for x. Note that $p(x)$ can be determined using
$$p(x) = \sum_{k=1}^K p(x\mid C_k)P(C_k), \quad (3)$$
and can thus be seen as a normalizing factor that ensures that the sum of probabilities over all classes is unity. An example x is classified into the class $C_k$ for which $P(C_k\mid x)$ is a maximum.

Direction-of-change forecasting is a binary classification problem in which an example belongs either to the class $C_+$ (upward movement) or $C_-$ (downward movement). Assuming that the data points X are partitioned into two sets, $X_+$ and $X_-$, containing respectively the data points corresponding to upward and downward movements, the Parzen estimate for $p(x\mid C_+)$ is
$$P(x\mid C_+, \Sigma) = \frac{1}{|X_+|}\sum_{n=1}^{|X_+|}\frac{1}{(2\pi)^{D/2}|\Sigma|^{1/2}}\exp\Big(-\frac{1}{2}(x-x_n^+)^T\Sigma^{-1}(x-x_n^+)\Big) \quad (4)$$
with a corresponding expression for P(x | C− , Σ) . The posterior probabilities P(C+ | x n , Σ) and P(C− | x n , Σ) are estimated using Bayes' Theorem.
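A compact sketch of Equations (1), (2) and (4) is given below (our own code, not the author's; a shared covariance Σ is used for both classes and the priors are estimated from class frequencies).

```python
import numpy as np

def parzen_posterior(x, X_pos, X_neg, Sigma):
    """Return P(C+ | x) from Gaussian-kernel class-conditional densities (Eq. 4)
    combined through Bayes' theorem (Eq. 2)."""
    inv, det = np.linalg.inv(Sigma), np.linalg.det(Sigma)
    D = len(x)
    norm = 1.0 / ((2 * np.pi) ** (D / 2) * np.sqrt(det))

    def density(X):
        diffs = X - x
        expo = -0.5 * np.einsum('ij,jk,ik->i', diffs, inv, diffs)
        return norm * np.mean(np.exp(expo))      # average of the per-point kernels

    p_pos, p_neg = density(X_pos), density(X_neg)
    prior_pos = len(X_pos) / (len(X_pos) + len(X_neg))
    num = p_pos * prior_pos
    den = num + p_neg * (1.0 - prior_pos)
    return num / den if den > 0 else prior_pos   # fall back to the prior if both densities vanish
```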
3 Covariance Matrix Optimization In this paper we estimate densities using the non-parametric method described above. Therefore the only parameter to be determined is the covariance matrix defining the
Gaussian kernel used in the density estimation process. We first discuss how the covariance matrix may itself be parameterized, and we then discuss how these parameters may be optimized. 3.1 Covariance Matrix Parameterization There are three general forms that the covariance matrix Σ may take: spherical, in which the variance along each dimension is identical; diagonal, in which case the variances along different dimensions differ, but principal directions are aligned with the coordinate axes; and full, in which case the principal direction of the variances can be aligned in arbitrary directions. Because Σ is a symmetric matrix, in the most general case it has D( D+1) /2 independent components, where D is the dimensionality of the input space (i.e., the number of delayed returns). In the diagonal case all of the non-diagonal elements of the covariance matrix are zero, and thus the number of independent components of the covariance matrix reduces to D. In the spherical case the diagonal components of the covariance matrix are all equal, and hence there is only one non-zero component [14]. Data points in a time series are temporally ordered, and intuitively we would expect recent values of the time series to be more important than less recent values in predicting future values; that is, we expect the prediction to be more sensitive to recent values. This suggests that the kernel used to estimate a density should not be symmetrical (i.e., spherical), but of a form such that its width along dimensions corresponding to recent returns is smaller than its width along dimensions corresponding to less recent returns. To capture the requirement that the width of the kernel increases with the delay of the return, we use the scaling Σt − n = a e k ( n −1) ,
(5)
where Σt − n ( n ∈ {1, 2, ..., D} ) is the variance of the kernel in the direction parallel to the axis corresponding to delay rt − n , k is an exponential scaling factor, and a is the variance parallel to the first delay1. Thus the kernel covariance matrix Σ has the form:
$$\Sigma = \begin{bmatrix} a & 0 & 0 & \cdots & 0\\ 0 & ae^{k} & 0 & \cdots & 0\\ 0 & 0 & ae^{2k} & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & ae^{(D-1)k} \end{bmatrix}$$
Note that this form of covariance reduces sensitivity to the value of D, as additional returns will have increasingly less influence on the estimate of the density. 1
In our experiments we have defined a slightly differently as Σt − n = a ⋅ v ⋅ ek ( n −1) , where v is the variance of the returns in the training period. This rescaling makes a less variable across different datasets, and makes parameter optimization easier.
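Constructing this kernel covariance (including the rescaling by the training-return variance v mentioned in the footnote) is straightforward; the sketch below uses our own function name.

```python
import numpy as np

def kernel_covariance(a, k, D, v=1.0):
    """Diagonal covariance with variances a*v*exp(k*(n-1)), n = 1..D:
    recent delays get narrower kernel widths than older ones."""
    return np.diag(a * v * np.exp(k * np.arange(D)))
```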
3.2 MAP Optimization of Covariance Matrix Parameters
We assume that our examples have been partitioned into two sets: a training set X, and a holdout set $X_h$. Under the assumption that the observations on our holdout set are independent and identically distributed (i.i.d.), the optimal Σ is that for which the probability of correctly predicting the direction of change on all holdout examples is a maximum. We refer to this value as $\Sigma_{MAP}$ (max a posteriori), and calculate it as
$$\Sigma_{MAP} = \arg\max_{\Sigma} \prod_{n=1}^{|X_h|} P(C_+\mid x_n, \Sigma)^{y_n}\, P(C_-\mid x_n, \Sigma)^{1-y_n} = \arg\max_{\Sigma} \sum_{n=1}^{|X_h|} y_n\ln\big(P(C_+\mid x_n, \Sigma)\big) + (1-y_n)\ln\big(P(C_-\mid x_n, \Sigma)\big) \quad (6)$$
where P(C+ | x n , Σ) and P(C− | x n , Σ) are the probabilities respectively of an upward and downward movement on day n, and yn is the realized direction of change on day n; i.e., yn = 1 if rn > rn−1 , and 0 otherwise. The problem is to find the optimal values of a and k which parameterize Σ , and in our experiments we have used a grid search optimization.
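The grid search over (a, k) might look as follows. This is a sketch only; it reuses the hypothetical parzen_posterior and kernel_covariance helpers shown earlier, and the grid values are arbitrary rather than those used in the paper.

```python
import numpy as np

def fit_covariance(X_pos, X_neg, X_hold, y_hold, v, D,
                   a_grid=np.arange(0.5, 5.01, 0.5),
                   k_grid=np.arange(0.0, 1.01, 0.1)):
    """Return the (a, k) pair maximizing the holdout log-probability of Eq. (6)."""
    best, best_ll = None, -np.inf
    eps = 1e-12                                  # guard against log(0)
    for a in a_grid:
        for k in k_grid:
            Sigma = kernel_covariance(a, k, D, v)
            ll = 0.0
            for x, y in zip(X_hold, y_hold):
                p = parzen_posterior(x, X_pos, X_neg, Sigma)
                ll += y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)
            if ll > best_ll:
                best, best_ll = (a, k), ll
    return best
```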
4 Empirical Results Assuming that a return series is stationary, then a coin-flip decision procedure for predicting the direction of change would be expected to result in 50% of the predictions being correct. We would like to know whether our model can produce predictions which are statistically better than 50%. However, a problem is that many financial return series are not stationary, as evidenced by the tendency for commodity prices to rise over the long term. Thus it may be possible to achieve an accuracy significantly better than 50% by simply biasing the model to always predict up. A better approach is to compensate for this non-stationarity, and this can be done as follows. Let xa represent the fraction of days in an out-of-sample test period for which the actual movement is up, and let xp represent the fraction of days in the test period for which the predicted movement is up. Therefore under a coin-flip model the expected fraction of days corresponding to a correct upward prediction is (xa × xp), and the expected fraction of days corresponding to a correct downward prediction is (1−xa) × (1−xp). Thus the expected fraction of correct predictions is aexp = (xa × xp) + ((1−xa) × (1−xp)) .
(7)
We wish to test whether amod (the accuracy of the predictions of our model) is significantly greater than aexp (the compensated coin-flip accuracy). Thus, our null hypothesis may be expressed as follows: Null Hypothesis:
H0 : amod ≤ aexp
H1 : amod > aexp
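Both quantities involved in this test are easy to compute from the per-period predictions; a minimal sketch of Eq. (7) and of the one-sided paired comparison used in this section is given below (our own code; the `alternative` argument requires SciPy 1.6 or later).

```python
import numpy as np
from scipy import stats

def expected_accuracy(y_true, y_pred):
    """Compensated coin-flip accuracy a_exp of Eq. (7) for one test period."""
    xa, xp = np.mean(y_true), np.mean(y_pred)     # realized / predicted up-fractions
    return xa * xp + (1 - xa) * (1 - xp)

def one_sided_paired_test(a_mod, a_exp):
    """H0: a_mod <= a_exp vs H1: a_mod > a_exp, paired over the test periods."""
    return stats.ttest_rel(a_mod, a_exp, alternative='greater')
```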
We test this hypothesis by performing a paired one-tailed t-test of accuracies obtained using a collection of out-of-sample test sets from the Australian All Ordinaries (AORD) Index. Specifically, we take the period from 1 January 1992 to 31 December
2006, and divide this into 202 20-day test periods. Predictions for each of the 20-day test periods are based on a model constructed using the 250 trading days immediately preceding this test period. The number of delayed returns used for each data point was 10. For each 20-day prediction period we calculate amod and aexp. We then use a paired t-test to determine whether the means of these values differ statistically. We are particularly interested in observing how the significance of the results depends on the parameters a and k from Eq. 5. Table 1 shows the t-test p-values corresponding to various values for these parameters, and Table 2 shows additional information corresponding to a selection of cases from Table 1.

Table 1. p-values for one-sided paired t-test comparing amod and aexp. Numbers in bold are significant at the 0.01 level. Asterisked values are the minimums for each row.

  a\k     0.000    0.100    0.200    0.300    0.400    0.500    0.600    0.800    1.000    5.000
  0.10    0.8197   0.8462   0.8783   0.5735   0.3471   0.4935   0.2260  *0.0989   0.2169   0.3987
  0.50    0.4952   0.2041   0.0212   0.0074  *0.0050   0.0065   0.0110   0.0016   0.0141   0.0399
  1.00    0.2430   0.1192   0.0196   0.0309   0.0089  *0.0038   0.0062   0.0203   0.0087   0.0160
  1.50    0.3488   0.0282   0.0313   0.0061  *0.0019   0.0016   0.0052   0.0055   0.0110   0.0106
  2.00    0.1036   0.0281   0.0219   0.0006  *0.0005   0.0019   0.0109   0.0091   0.0199   0.0127
  2.50    0.1009   0.1095   0.0166   0.0087  *0.0009   0.0092   0.0186   0.0115   0.0403   0.0205
  3.00    0.1012   0.0511   0.0053   0.0029  *0.0026   0.0071   0.0189   0.0164   0.0672   0.0083
  3.50    0.0913   0.0591  *0.0039   0.0040   0.0090   0.0292   0.0486   0.0296   0.0676   0.0094
  4.00    0.0901   0.0306   0.0084  *0.0056   0.0126   0.0293   0.0841   0.0337   0.0610   0.0059
  4.50    0.1087   0.0659   0.0081  *0.0037   0.0114   0.0244   0.0479   0.0423   0.0401   0.0073
  5.00    0.1624   0.0366  *0.0041   0.0058   0.0151   0.0306   0.0502   0.0484   0.0297   0.0060
Table 2. Mean training accuracy, mean test accuracy, and confusion matrices corresponding to a selection of cases from Table 1. Upper/lower row of confusion matrix corresponds to upward/downward predictions; left/right column corresponds to realized upward/downward movements. Figures in parentheses are totals for rows of the confusion matrix.

                       Mean Train Acc.   Mean Test Acc.   Confusion Matrix
  a = 2.0, k = 0.0         0.7830            0.5141       [1143  946] (2089)
                                                          [1025  941] (1966)
  a = 2.0, k = 5.0         0.5369            0.5263       [1005  757] (1762)
                                                          [1163 1130] (2293)
  a = 0.5, k = 0.4         0.7465            0.5245       [1191  951] (2142)
                                                          [ 977  936] (1913)
  a = 2.0, k = 0.4         0.5958            0.5343       [1127  847] (1974)
                                                          [1041 1040] (2081)
  a = 5.0, k = 0.4         0.5628            0.5224       [1009  777] (1786)
                                                          [1159 1110] (2269)
Of the values listed in Table 1, the smallest p-value is 0.0005, corresponding to parameter values a = 2.00 and k = 0.40. This means that the probability that the observed difference between amod and aexp is due to chance is 0.05%, which is well
below the 0.01 level commonly used to measure statistical significance. From Table 2 it can be seen that the mean test accuracy (i.e., the mean accuracy over the 202 testperiods) for this case is 0.5343, and that the mean training accuracy (i.e., the mean of the accuracy on the 202 250-day training sets) is 0.5958. To see the effect of the use of a non-spherical covariance matrix, consider the first column of values in Table 1 (k = 0.0), and note that these values are much higher than the lowest p-value in the corresponding row, and yield results which are far from statistically significant (i.e., all p-values are well over 0.01). To shed further light on the role of k, consider the case a = 2.0, k = 0.0, and note that the mean training accuracy for this case is 0.7830, which is much higher than the value 0.5958 observed for a = 2.0, k = 0.4, and suggests that the model has been overfitted to the training data. Now consider right-most column of Table 1, which corresponds to k = 5.0. This is a large value for the scaling factor, and results in very similar p-values to what would be obtained if only a 1-dimensional delayed-return vector were used. Specifically, consider the case a = 2.0, k = 5.0. The mean training accuracy for this case is 0.5369, which is lower than that observed for case a = 2.0, k = 0.4, and suggests that underfitting is occurring. We can conclude from this that the proposed form for the kernel covariance matrix is successful in allowing recent data more influence than less recent data in the construction of the model. To observe the effect of the parameter a, consider the last three rows of Table 2, all of which correspond to k = 0.4. Note that as the value of a increases, the mean accuracy on training data decreases. This can be explained by the fact that small values of a correspond to narrow kernels, which produce spiky density estimates, resulting in overfitting. Conversely, large a values result in overly smoothed densities, and thus an inability accurately model the training data. Finally, note from Table 2 that for the cases in which a low training accuracy was achieved (e.g., a = 2.0, k = 5.0 and a = 5.0, k = 0.4), the total number of downward predicted movements is noticeably larger than the number of upward predicted movements. This can be explained by the fact that we have assumed that the priors for upward and downwards movements are equal, when in reality the priors for upward movements are higher than those for downward movements (i.e., the return series is non-stationary). When the densities are overly smoothed, the resulting posterior probability estimates are very close to the priors. If the priors for upward/downward movements were increased/decreased to reflect the fact that prices tend to rise, then the total number of upward predicted movements would increase. In fact, by pre-specifying priors it may indeed be possible to achieve out-of-sample accuracies significantly better than the value of approximately 53.4% that we have been able to achieve here. For example, if we believe that the market is displaying strong bull or bear behaviour, then we may wish to reflect this through setting the priors correspondingly.
5 Conclusions The paper has presented a density estimation-based technique which can be used to make direction-of-change forecasts on financial time series data. A distinct advantage of the technique is that it involves very few parameters compared to discriminative models such as neural networks, and these parameters can easily be optimized using
cross-validation. Also, the use of non-spherical kernels allows recent data to have more influence than less recent data in the construction of the model, reducing the degree to which the model is sensitive to the dimensionality of the input space, thereby reducing the risk of overfitting. Results on the AORD Index show that the technique is capable of yielding out-of-sample prediction accuracies which are statistically higher than those of a coin-flip procedure.
References 1. De Gooijer, J.G., Hyndman, R.J.: 25 years of time series forecasting. International. Journal of Forecasting 22, 443–473 (2006) 2. Kajitani, Y., Mcleod, A.I., Hipel, K.W.: Forecasting nonlinear time series with feedforward neural networks: a case study of Canadian lynx data. Journal of Forecasting 24, 105–117 (2005) 3. Chung, J., Hong, Y.: Model-free evaluation of directional predictability in foreign exchange markets. Journal of Applied Econometrics 22, 855–889 (2007) 4. Christoffersen, P.F., Diebold, X., Financial, F.: asset returns, direction-of-change forecasting, and volatility dynamics. PIER Working Paper Archive 04-009, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania (2003) 5. Walczak, S.: An empirical analysis of data requirements for financial forecasting with neural networks. Journal of Management Information Systems 17, 203–222 (2001) 6. Swanson, N.R., White, H.: A model selection approach to real-time macroeconomic forecasting using linear models and artificial neural networks. The Review of Economics and Statistics 79, 540–550 (1997) 7. Teräsvirta, T., Medeiros, M.C., Rech, G.: Building neural network models for time series: a statistical approach. Journal of Forecasting 25, 49–75 (2006) 8. Kaastra, I., Boyd, M.S.: Designing a neural network for forecasting financial and economic time series. Neurocomputing 10, 215–236 (1996) 9. Adya, M., Collopy, F.: How effective are neural networks at forecasting and prediction? a review and evaluation. Journal of Forecasting 17, 481–495 (1998) 10. Chatfield, C.: Positive or negative? International Journal of Forecasting 11, 501–502 (1995) 11. Tkacz, G.: Neural network forecasting of canadian gdp growth. International Journal of Forecasting 17, 57–69 (2001) 12. Parzen, E.: On the estimation of a probability density function and mode. Annals of Mathematical Statistics 33, 1065–1076 (1962) 13. Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. John Wiley and Sons, New York (1974) 14. Bishop, C.M.: Neural Networks for Pattern Recognition. Oxford University Press, Oxford (1995)
An Optimization-Based Classification Approach with the Non-additive Measure Nian Yan1, Zhengxin Chen1, Rong Liu2, and Yong Shi1,2
College of Information Science and Technology, University of Nebraska at Omaha, NE 68182, USA {nyan, zchen, yshi}@mail.unomaha.edu 2 Chinese Academy of Sciences Research Center on Fictitious Economy and Data Science, Graduate University of Chinese Academy of Sciences, Beijing 100080, China {[email protected], [email protected]}
Abstract. Optimization-based classification approaches have well been used for decision making problems, such as classification in data mining. It considers that the contributions from all the attributes for the classification model equals to the joint individual contribution from each attribute. However, the impact from the interactions among attributes is ignored because of linearly or equally aggregation of attributes. Thus, we introduce the generalized Choquet integral with respect to the non-additive measure as the attributes aggregation tool to the optimization-based approaches in classification problem. Also, the boundary for classification is optimized in our proposed model compared with previous optimization-based models. The experimental result of two real life data sets shows the significant improvement of using the non-additive measure in data mining. Keywords: Data Mining, Classification, Non-additive Measure, Optimization.
1 Introduction In data mining, the classification is the task that aims to construct a model that could most efficiently distinguish different groups in a dataset. The optimization-based classification approaches formalize the data separation problem as the mathematical programming problems. Since Fisher’s linear classification model [1], the groups are described as AX ± b, where A, X, b are representing parameters to be learned, observations, and the constant boundary respectively. There are numerous optimizationbased classification models, from the classical linear classification to the popular Support Vector Machine (SVM). The common feature of these methods is the use of optimization techniques. Such technique as linear programming (LP) has already been widely used in the early studies of classification problem, e.g. Freed and Glover [2],[3] introduced two classification approaches based on the idea of reducing the misclassification through minimizing the overlaps or maximizing the distance of two objectives in a linear system, i.e. maximizing the minimum distances (MMD) of data from the critical boundary and minimizing the sum of the distances (MSD) of the data M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 450 – 458, 2008. © Springer-Verlag Berlin Heidelberg 2008
An Optimization-Based Classification Approach with the Non-additive Measure
451
from the critical boundary. The linear SVM formulizes the bounding planes with soft margins to separate the data [4],[5]. An important indicator for the quality of the classifier is classification accuracy, as most existing researches on optimization-based classification approaches concern. Yet there are a number of other important issues for classification, such as speed, robustness, scalability, and interpretability [6]. Since interpretability refers to the level of understanding and insight that is provided by the classifier or predictor, it is related to the important aspect of handling the data. Unfortunately, the classic linear optimization-based classification approaches are with weak interpretability because it only considers the contributions from the attributes towards the classification equal to the sum of contributions from each individual attribute. Even with non-linear approach, the attributes are equally treated in the modeling stage. When interactions exist among attributes, the classification model does not correctly interpret the nature of the data. Thus, in this paper, we introduce an optimization-based classification approach with respect to the non-additive measure which is able to identify the interaction among the attributes and improve the classification performance in both accuracy and interpretability.
2 Non-additive Measures and Integrals The concept of non-additive measures (also referred as fuzzy measure theory) and the ways of aggregation, nonlinear integrals, were proposed in the 1970s and have been well developed [7],[8],[9]. Non-additive measures have been successfully used as a data aggregation tool for many applications such as information fusion, multiple regressions and classifications [9],[10],[11],[12]. 2.1 Definitions 0B
Let finite set X = {x1 ,..., x n } denote the attributes in a multidimensional data set. The non-additive measures are defined as following [8],[13]: Definition 1. A non-additive measure μ defined on X is a set function μ :
P ( X ) → [0, ∞) satisfying (1) μ (φ ) = 0 (2) μ ( E ) ≤ μ ( F ) if E ⊆ F The property (2) is called monotonicity.
P ( X ) denotes the power set of X , which means the set of all subsets of X , μ i denote the values of set function μ and i = 1,...,2 n − 1 . Definition 2. A signed non-additive measure μ defined on X is a set function μ :
P ( X ) → (−∞, ∞) satisfying (1) in definition 1. 2.2 Nonlinear Integrals 1B
Nonlinear integrals are regarded as the methods of aggregating μ i in the set function μ . The studies of non-additive measures and the corresponding nonlinear integrals could be
452
N. Yan et al.
found in the literatures [7],[8],[9],[13] from additive Lebesgue-like integral to the classic nonlinear integrals, i.e. Sugeno integral and Choquet integral. The Lebesgue-like integral is exactly the weighted sum of all the attributes and is widely used in forms of the linear models. However, considering the nonlinear relationships particularly the interactions among attributes, Sugeno integral and Choquet integral are the necessary data aggregation tools to be applied. Choquet integral is more appropriate to be chosen for data mining applications because it provides very important information in the interaction among attributes in the database compared with Sugeno integral [9]. Thus, in this paper, we choose the Choquet integral as the representation of the non-additive measure. Now let the values of f , { f ( x1 ), f ( x 2 ),..., f ( x n )}, denote the values of each attribute in the data set; let μ be a non-additive measure. The general definition of Choquet integral, with function f : X → (−∞,+∞) , based on signed non-additive measure μ is defined by the formula (1) 0
+∞
−∞
0
(c)∫ f du = ∫ [μ( Fα ) − μ( X )]dα + ∫ μ( Fα )dα
(1)
Where Fα = {x | f ( x ) ≥ α } for α ∈ [0, ∞) and is called α -cut set of f , for
α ∈ [0, ∞) , n is the number of attributes in the database. When μ is additive, it coincides with the Lebesgue-like integral. Therefore, the Lebesgue-like integral is a special case of Choquet integral. The linear relationship of the attributes is still able to be identified when Choquet integral is used in data modeling. The general algorithm used to calculate the Choquet integral is shown below. General algorithm for calculating Choquet integral Step 1: Let f = { f ( x1 ), f ( x 2 ),..., f ( x n )} denote the weighted values of each attribU
ute for one given record. Then, we rearrange those values into a non-decreasing order: f * = { f ( x1* ), f ( x 2* ),..., f ( x n* )},
where f ( x1* ) ≤ f ( x2* ) ≤ ... ≤ f ( x n* ) . The sequence of ( x1* , x2* ,..., x n* ) is one of the possibilities from the permutation of ( x1 , x2 ,..., x n ) . Step 2: Create the variables of μ , where μ = { μ1 , μ 2 ,..., μ 2 }, and μ1 = μ (φ ) =0. n
Each of them represents the interaction of several attributes, e.g. μ 2 = μ ({x2* , x3* }) . Step 3: The value of the Choquet integral is calculated by: n
(c) ∫ fdμ = ∑ [ f ( xi* ) − f ( xi*−1 )] × μ ({xi* , xi*+1 ,..., x n* }) , where f ( x0* ) = 0 . i =1
The above 3-step algorithm is easy to understand but hard to implement with computer program. In this paper, we apply and modify the method proposed in [11] to calculating the Choquet integral. The method is illustrated in formula (2) and (3): 2 n −1
(c)
∫ fdμ = ∑ z μ j =1
j
j
(2)
An Optimization-Based Classification Approach with the Non-additive Measure
⎧ min( f ( xi )) − max( f ( xi ) )if > 0 or j = 2 n − 1 ⎪i: frc( j / 2i )∈[0.5,1) i: frc( j / 2i )∈[ 0,0.5) where z j = ⎨ ⎪ 0 otherwise ⎩
453
(3)
frc( j / 2 i ) is the fractional part of j / 2i and the maximum operation on the empty set is zero. Also the criteria of judging i equals to the following, if we transform j into binary form j n j n − 1 ... j 1
{i | frc( j / 2 ) ∈ [0.5,1)} = {i | j = 1} and {i | frc( j / 2 ) ∈ [0,0.5)} = {i | j = 0} i
i
i
i
Generally, the Choquet integral does not satisfy the additivity, which is defined as: (c ) ∫ gdμ + (c) ∫ fdμ = (c) ∫ ( g + f )dμ , g and f are functions defined as f , g : X → (−∞,+∞) From the above two versions of algorithms to the Choquet integral according to the general definition in formula (1), we observe that the pre-ordering of the values of attributes are required in both versions. However, the order of the attributes is sometimes only with only one ordering situation because different attributes may have different levels of scales. Thus, data normalization is needed to map the data into the same scale before the calculation of the Choquet integral. Two different types of data normalization are considered: the first is to perform the traditional data normalization process such as min-max normalization, z-score normalization and etc. [6]. In this way, each attribute is mapped into a certain range, e.g. [0, 1] for a typical min-max normalization. The real world situation is that you never know which data normalization is better for a given dataset unless to test all of them. The reason is that those different data normalization approaches treat every attributes equally. With this concern, the second type of data normalization is to set weights and bias on each attributes. In this way, we have to extend the definition of the Choquet integral, as follows: (c)∫ (a + bf )du where a = {a1 , a 2 ,..., a n } , b = {b1 , b2 ,..., bn } representing bias and weights to the attributes respectively. However, with this scenario, a and b are not pre-determined and need to be learned during the training process of data mining. A typical searching algorithm such as genetic algorithm can be used to obtain those bias and weights according to the performance of the data mining task [11].
3 Optimization-Based Classification Model with the Non-additive Measure In this section, we introduce the classical optimized-based classification models and extend the model with the signed non-additive measure.
454
N. Yan et al.
3.1 Classical Optimization-Based Classification Models 2B
Freed and Glover [2],[3] introduced two classification approaches based on the idea of reducing the misclassification through minimizing the overlaps or maximizing the distance of two objectives in a linear system. The simple MSD classification model for two group classification is described as follows [2]: m
Minimize
∑β i =1
i
Subject to:
y i ( AX − b) ≤ β i βi ≥ 0 where y i = {1,−1} denotes the two different groups. The model is such a simple and efficient method that separates the data by searching a linear cut AX and a suitable boundary b. Before we introduce the use of signed non-additive measure in classification, the concept of separating data by hyperplanes is briefly described in section 3.2. 3.2 Concept of Linearly Separable and SVM 3B
“Linearly separable” means that data is able to be perfectly separated by linear classification model. Data sets are sometimes linearly separable, most of the time not. The theorem of linearly separable is defined as: two groups of objects are linearly separable if and only if the optimal value of LP is zero. The proof of the theorem is from Bosch and Smith [14]. A linearly inseparable data set can not be perfectly separated by linear classification model due to the unreachable to the objective of LP. The linear classification models contain such constrains as “AX” (or “ x T w ” in SVM). The efforts have been made to achieve the better separation in classification through constructing different objectives in optimization. For example, the very typical objective for SVM [4],[5] is l 1 Minimize || w || 2 +C ∑ ξ i 2 i Subject to: y i ( K ( x, w) − b) ≥ 1 − ξ i ξi ≥ 0 where xi denotes the parameters which need to be determined and are even possibly mapped into a even higher dimensional space by the kernel function K ( xi , x j ) = φ ( xi )T φ ( x j ) . φ : X → H is a map from low dimension to high dimension, e.g. φ ( x1 , x 2 ) = ( x12 , 2 x1 x 2 , x 22 ) is a map from two dimensional to three dimensional space. C is a positive constant and is regarded as the penalty parameter. Given a linear kernel, the training procedure of SVM is to find linearly separating hyperplanes with
An Optimization-Based Classification Approach with the Non-additive Measure
455
the maximal margin in this higher dimensional space. The linear kernel function K, e.g. K ( x, w) = x T w , makes the SVM classification coincide with the linear classification models. 3.3 Optimization-Based Non-additive Classification 4B
The optimization-based signed non-additive measure classification approach was developed by minimizing the SVM like objective and creating “Choquet hyperplanes” to separate the groups is as follows [15]: Minimize
m 1 || μ || 2 +C ∑ β i 2 i =1
Model 1
Subject to: y i (( c ) ∫ fd μ − b )) ≤ β i
βi ≥ 0 From programming perspective, the boundary value b is hardly to be optimized because of the degeneracy issue of the programming. It is even much more difficult to solve with the above non-linear programming. The solution is to predetermine the value of b or implement the learning scheme in the iterations, such as updating b with the average of the lowest and largest predicted scores [10],[12]. The famous Platt’s SMO algorithm [16] for SVM classifier utilized the similar idea by choosing the average of the lower and upper multipliers in the dual problem corresponding to the boundary parameter b in the primal problem. Keerthi et al. [17] proposed the modified SMO with two boundary parameters to achieve a better and even faster solution. The boundary b in MSD becomes b ± 1 in standard form of linear SVM, in which it is called the soft margin. The idea is to construct the separation belt in stead of a single cutting line. This variation of soft margin actually made the achievement that: the using of formation of b ± 1 in optimization constrains coincidently solved the degeneracy issue in mathematical programming. The value of b could be optimized by a simple linear programming technique. Thus we simplify the model in [12] into a linear programming solvable problem with optimized b and non-additive measure with respect to generalized definition of the Choquet integral (formula 2, 3): m
Minimize C ∑ β i
Model 2
i =1
Subject to: y i (( c ) ∫ ( a + b f ) d μ − b )) ≤ 1 + β i
βi ≥ 0
In this model, the generalized “Choquet hyperplanes” separates the data with more flexibility and its geometric meaning could be found in [12]. We utilized the linear programming technique for solving the non-additive measure μ . The parameters from a and b are optimized by genetic algorithm as we mentioned earlier.
456
N. Yan et al.
4 Experimental Results We conduct the proposed approach (Model 2) on two datasets with comparisons to other popular approaches, i.e. Decision Tree (C4.5) and SVMs. 4.1 US Credit Card Dataset 5B
The credit card dataset is obtained from a major US bank. The data set consists of 65 attributes and 5000 records. There are two groups: current customer (4185 records) and bankrupt customer (815 records). The task is to predict the risk of bankruptcy of customers. We regard the current customers as good customers and bankruptcy as bad. In order to reduce the curse of dimensionality, we use hierarchical Choquet integral [12],[18], that calculates the Choquet integral hierarchically, to Model 2 for decision makings on the new applicants. The way of hierarchical is determined by human experts. We use randomly sub-sampling 10-fold cross-validation (90% for training and 10% for testing). The results are summarized in Table 1. Table 1. Classification accuracy (%) on US credit card dataset
Training Testing Training Testing Training Testing Training Testing Training Testing
Sensitivity Specificity (Good) (Bad) Linear_SVM 69.8 87.2 69.1 84.6 Polynomial_SVM 69.9 88.1 68.7 85.5 RBF_SVM 71.5 88.4 69.1 85.6 See5.0 (C4.5) 74.8 88.4 69.5 81.8 Model 2 75.0 85.8 70.9 81.6
Accuracy (Overall) 78.5 69.4 79.0 69.1 80.0 69.5 81.6 69.8 80.5 71.1
There is no such significant difference in terms of classification accuracy among those different approaches. Our approach performs best on testing dataset while See 5.0 performs best on training. One advantage of our approach is reliability which refers to as the classification model achieves similar accuracy on both training and testing dataset. 4.2 UCI Liver Disorder Dataset 6B
The Liver Disorder dataset is obtained from UCI Machine Learning Repository (www.ics.uci.edu/~mlearn/MLRepository.html). This is a two-group classification problem and data consists of 6 attributes and 345 samples. We perform model 2 with
An Optimization-Based Classification Approach with the Non-additive Measure
457
Table 2. Classification accuracy (%) on liver disorder dataset Training
Testing
Linear_SVM
Classification Methods
63.9
55.6
Polynomial_SVM
65.1
60.0
RBF_SVM See5.0 (C4.5)
100.0 88.9
75.4 69.1
Model 2
79.3
75.9
a 10-fold cross-validation (90% for training and 10% for testing) on the dataset. The comparison results on average classification accuracy are summarized in Table 2. We observe that the linear SVM performs worst that shows the linearly inseparable property of the dataset. The RBF SVM model classifies the dataset with 100% for training and 75.44% for testing. The proposed Model 2, which is based on the signed non-additive measure, performs 79.34% and 75.86% on training and testing respectively. This result is comparable to the RBF kernel SVM.
5 Conclusions In this paper, we proposed a new optimization-based approach for classification when we believe the attributes have interactions. The new approach achieved higher accuracy of classification on two real life datasets compared with traditional approaches. We introduced the concept of non-additive measures as the data integration approach to optimization-based classification in data mining. The use of the Choquet integral with respect to the signed non-additive measure identified more hidden interactions among attributes and contributes to construct more reliable classification models. The learned non-additive measure shows some impact from the joint effect from several certain attributes towards classification and provides a potentially better interpretability of classification model. In the future, more experiments will be conducted for a better illustration both in classification performance and model representation by describing the relationship of learned μ and the interactions. Acknowledgments. This research has been partially supported by a grant from National Natural Science Foundation of China (#70621001, #70531040, #70501030, #70472074), National Natural Science Foundation of Beijing #9073020, 973 Project #2004CB720103, National Technology Support Program #2006BAF01A02, Ministry of Science and Technology, China, and BHP Billion Co., Australia.
References 1. Fisher, R.A.: The Use of Multiple Measurements in Taxonomic Problems. Annals of Eugenics 7, 179–188 (1936) 2. Freed, N., Glover, F.: Simple but powerful goal programming models for discriminant problems. European Journal of Operational Research 7, 44–60 (1981)
458
N. Yan et al.
3. Freed, N., Glover, F.: Evaluating alternative linear, programming models to solve the twogroup discriminant problem. Decision Science 17, 151–162 (1986) 4. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995) 5. Smola, A.J., Scholkopf, B.: A tutorial on support vector regression. Statistics and Computing 14, 199–222 (2004) 6. Han, J., Kamber, M.: Data Mining Concepts and Techniques, 2nd edn., pp. 286–289. Morgan Kaufmann Publishers, Inc., San Francisco (2002) 7. Choquet, G.: Theory of capacities. Annales de l’Institut Fourier 5, 131–295 (1954) 8. Wang, Z., Klir, G.J.: Fuzzy Measure Theory. Plenum, NewYork (1992) 9. Wang, Z., Leung, K.-S., Klir, G.J.: Applying fuzzy measures and nonlinear integrals in data mining. Fuzzy Sets and Systems 156(3), 371–380 (2005) 10. Xu, K., Wang, Z., Heng, P., Leung, K.: Classification by Nonlinear Integral Projections. IEEE Transactions on Fuzzy Systems 11(2), 187–201 (2003) 11. Wang, Z., Guo, H.: A New Genetic Algorithm for Nonlinear Multiregressions Based on Generalized Choquet Integrals. In: The Proc. of Fuzz/IEEE, pp. 819–821 (2003) 12. Yan, N., Wang, Z., Shi, Y., Chen, Z.: Nonlinear Classification by Linear Programming with Signed Fuzzy Measures. In: 2006 IEEE International Conference on Fuzzy Systems (July 2006) 13. Grabisch, M.: A new algorithm for identifying fuzzy measures and its application to pattern recognition. In: Proceedings of 1995 IEEE International Conference on Fuzzy Systems (March 1995) 14. Bosch, R.A., Smith, J.A.: Separating Hyperplanes and the Authorship of the Disputed Federalist Papers. American Mathematical Monthly 105(7), 601–608 (1998) 15. Yan, N., Wang, Z., Chen, Z.: Classification with Choquet Integral with Respect to Signed Non-Additive Measure. In: 2007 IEEE International Conference on Data Mining, Omaha, USA (October 2007) 16. Platt, J.C.: Fast Training of Support Vector Machines using Sequential Minimal Optimization, Microsoft Research (1998), http://research.microsoft.com/jplatt/ smo-book.pdf 17. Keerthi, S., Shevade, S., Bhattacharyya, C., Murthy, K.: Improvements to Platt’s SMO algorithm for SVM classifier design. Tech Report, Dept. of CSA, Banglore, India (1999) 18. Murofushi, T., Sugeno, M., Fujimoto, K.: Separated hierarchical decomposition of the Choquet integral. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 5(5) (1997)
A Selection Method of ETF’s Credit Risk Evaluation Indicators∗ Ying Zhang 1, Zongfang Zhou 1, and Yong Shi 2 1
School of Management, University of Electronic Science & Technology of China, P.R. China, 610054 2 Research Center on Fictitious Economy & Data Science, CAS, Beijing 100080, China and College of Information Science and Technology, University of Nebraska at Omaha, Omaha, NE 68182, U.S.A [email protected], [email protected]
Abstract. This work applied data analysis method of attribute reduction to the credit risk evaluation indicators selection of the emerging technical firms (ETF), and constructed the ETF’s credit risk evaluation indicators system. Furthermore, the utilization the distinct matrix method was carried out the confirmation to the reduction result. Finally the 7 ETF's data was used to carry out the analysis in the western of China by attribute reduction and their core attribute was obtained. Keywords: ETF, credit risk, evaluation indicators, attribute reduction.
1 Introduction The Rough Set theory was proposed by Pawley Z in 1982. In 1991 Pawley Z published a monograph which elaborated the RS theory comprehensively and established a strict mathematical foundation for it. However, there is little information available in literature about the application of attribute reduction method to the research for the credit risk of emerging technology firms. Our work is thus devoted to adopting the attribute reduction in the ETF's credit risk indicator selection as an application of RS. The work was carried out on reducing the condition attribute of the ETF’s credit risk evaluation indicator. Firstly, the credit risk evaluation indicator of the common firms was referenced. Then, in view of ETF, its condition attribute value was separated, and the redundant ingredient was removed by the attribute reduction algorithm. Finally, group of ETF’s credit risk evaluation indicator was obtained. The indicator attributes provide a support for the scientific management, the forecast, and the decision-making for the ETF. ∗
This research is partially supported by grants from National Natural Science Foundation of China (#70671017, #70621001, #70501030, #70472074), 973 Project #2004CB720103, BHP Billiton Co., Australia.
M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 459–465, 2008. © Springer-Verlag Berlin Heidelberg 2008
460
Y. Zhang, Z. Zhou, and Y. Shi
2 Attribute Reduction 2.1 Data Discretization This work employed the impoved greedy algorithm which does not have the policy-making condition ([1]). An information system S can be represented as: S = (U, R, V, F). Here, U = {x1, x2,…,xn} which denotes the universe of discourse is a non-empty limited set. R=C D is the attribute set, subset C and D is called the condition attribute and the policy-making attribute respectively. Va ∈ [la, ra] indicates the value territory of attribute a, la= c1a c2a c3a…… cna=ra. Va=[ c1a, c2a [ c2a, c3a …… [ cn-1a, cna . the c1a, a a c2 , ……, cn are called the breakpoint. The arbitrary breakpoint in Va is assembled in the interest of {(a,c1a),(a,c2a),…,(a,cna)}. And for a, cia R. Let S*=(U*,R*,V*,F*) be a new information system. In it, U*={(xi,xj) U×U} R*={Pra| a C}. Here, the Pra is the r-th breakpoint of a. As for arbitrary Pra, in case (Cra, Cr+1a) ⊆{min [a(xi), a(xj)], max [a(xi), a(xj)]}, a(xi) expresses the i-th attribute value of a. In that way Pra((xi,xj))=1 otherwise Pra((xi,xj))=0. Pra((xi,xj)) denotes acing according to the element (xi,xj) with the r-th breakpoint substituting of attribute a. Step as follows:
∪
< <
<
∈
)∪
)∪ ∪ ∈
)
∈
,
1. Information table S on the basis of former constitutes new information table S*; 2. The initial breakpoint set is a empty set; 3. When only one breakpoint the row 1 value integer many, select all rows 1 value integer most breakpoints join to the breakpoint, remove the row and in this breakpoint which this breakpoint is at the value are 1 line; When has above same the breakpoint row 1 value integer, corresponds a row institute the break point row value is 1 line 1 integer adds together, takes is the breakpoint which sum of the line value is smallest. 4. If recent information table is not empty, go to third step, otherwise stop. 2.2 The Attribute Reduction Information system S can be expressed by S = (U,R,V,F). Here the F: U ×R →V denotes a mapping from U×R to V. Suppose R is a equal in value relational bunch on U, then IND(R) expressed on U are all equal in value kinds which derives by R. If r R and IND(R)=IND(R-{r}), then r is called in R the condition attribute which can be removed; If IND(R)≠IND(R-{r}), then r is called the condition attribute which should not drop out from R. The removable condition attribute in the indicator system is unnecessary if removing them from the indicator system will not change this indicator system in classified ability. On the contrary, if in the indicator system removes the condition attribute which should not be droped out, then the classified ability of the indicator system will certainly change. If any r R is not droped out from R, then equal value relates bunch of R is independent; otherwise R is related. The condition attribute in which R should not be droped out is called the core attribute.
∈
∈
A Selection Method of ETF’s Credit Risk Evaluation Indicators
461
In the ordinary status, an information system reduction continues one kind. This reduction has approximately maintained same classification ability to the original condition attribute. The simplification distinct matrix method is introduced for computating the reduction more effectively. Step as follows: 1. The initial data discretization produced the knowledge information table; 2. Observe object whether there is the same attribute condition and can't be distinguished; delete object which has the same attribute condition and cannot be distinguished; 3. Directly pick up the distinct HFS[3]of the first object from the information system to be opposite to other objects, and reduction; 4. Is opposite front in turn the object to other objects withdraws distinct HFS and adds on to be distinct HFS, and reduction; 5. Finally, obtain the reduction of information system.
3 ETF’s Credit Risk Evaluation Indicator System The emerging technology which refers to these recently to appear or to develop, has an important influence on the economic structure or the profession development. The ETF with high degree of concentration manpower, the intelligence and the technological resource complex compound, has the high risk, high incertitude and the high defeat rate characteristic. In previous literature, the majority conerned emerging technical aspect and so on firm’s concept.While management conducted the research to ETF’s credit risk research is extremely few. The [6] mainly induced the influence of emerging technology firms credit risk some important characteristics from the system risk and the non- system risk two aspects. This research is based on the indicator selection systematic characteristic, the scientific nature, may be operational, objective as well as the quota and decides the disposition union the principle, through to the ETF investigation and the analysis, induces from the multitudinous indicator distinguishes between the traditional profession, the prominent characteristic of ETF’s credit risk evaluation indicator, it including the profit ability, pays off a debt the ability, the business capacity, the innovation ability and the market competition strength. Where the profit ability is reflected by the indicator such as transport business profit margin, the total assets reward rate, the net assets income rate, the new product income increase ratio, and the firms goodwill. The pay-off ability of a debt is reflected by the indicator like property debt rate, liquidity ratio, cash current capacity ratio, and Quick assets ratio. The business capacity is reflected by the indicators as the total assets velocity, the account receivable velocity, income rate of increment, the new product sale personnel ratio, the superintendent manage the level. The innovation ability is reflected by the indicators such as the R&D expense accounts for the sales income ratio, researches and develops the personnel ratio, the new equipment rate, master above cultural staff population, invisible asset possessing rate. The firms goodwill is reflected by the indicator like customer quantity, new product life cycle, product strain strength,
462
Y. Zhang, Z. Zhou, and Y. Shi
new product quantity. The significance of these indicators are different. There inevitably be some relevance and duplication between the indicators. This work adopts the data analysis method of the attribute reduction and the distinct matrix method, and seeks the simple and effective attribute set. The ETF credit risk evaluation indicator system is constructed as thus.
4 A Demonstration The case below used 7 ETF's data to carry out the analysis in the western of China, with A, B, C, D, E and F denoted them respectively. Table 1. The 7 ETF's Data in the Western of China Indicator
Profit Capability
Pays off a debt the ability
Business capacity
Innovation Capability
Sub- indicator Operation profit margin % Total assets reward rate % Net assets income rate % The new product income increase ratio % Business goodwill(Ten thousand Yuan) Property debt rate % liquidity ratio % Cash flow ratio % Quick assets ratio % Assets velocity % Account receivable velocity Income rate of increment % New product sale personnel ratio % The superintendent manages level The R&D expense accounts for the sales income ratio % Researches and develops the personnel ratio % New equipment rate % Above master accounts for the total staff population Invisible asset possessing rate %
( ) ( )
( ) ( ) ( ) ( ) ( ) ( ) ( )
( ) ( )
( ) ( ) ( )
( )
A
B
C
23.34
63.64
29.78
9.71
28.85
12.99
34.23
25.71
12.43
10.42
4.35
7.56
8.26
3.36
4.52
9.22 7.07 21.70 4.23 41.61 5.36
34.04 8.42 27.05 4.58 44.52 6.89
45.82 4.34 13.24 2.35 43.62 4.52
-3.22
1.11
-0.16
25.68
45.62
26.35
3.8
3.25
3.57
10.37
9.78
11.25
10.36
6.98
8.45
37.49
63.44
29.32
20.48
12.61
15.64
19.7
9.86
24.98
A Selection Method of ETF’s Credit Risk Evaluation Indicators
463
Table 1. (continued)
Market competition strength Indicator
Profit Capability
Pays off a debt the ability Business capacity
Innovation Capability
Client's amount (hundred million ) New product life cycle Year Product strain strength New produce amount Sub- indicator Operation profit margin % Total assets reward rate % Net assets income rate % The new product income increase ratio % Business goodwill(Ten thousand Yuan) Property debt rate % liquidity ratio % Cash flow ratio % Quick assets ratio % Assets velocity % Account receivable velocity Income rate of increment % New product sale personnel ratio % The superintendent manages level The R&D expense accounts for the sales income ratio % Researches and develops the personnel ratio % New equipment rate % Above master accounts for the total staff population Invisible asset possessing rate % Client's amount (hundred million ) New product life cycle Year Product strain strength New produce amount
(
)
( ) ( ) ( ) ( )
( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
( )
Market competition strength
(
)
2.3
1
1
1
3
3
middle 3 D 34.89 20.33 14.27
Good 2 E 29.20 16.81 16.42
middle 2 F -28.86 -5.15 -5.23
8.16
5.42
3.21
1.15
1.03
5.09
8.30 12.08 17.13 6.52 59.76 6.92 2.67
8.37 11.49 18.02 5.46 57.56 7.06 8.86
8.37 11.35 -4.40 5.36 17.83 3.65 -2.32
40.56
30.48
20.31
4.11
4.00
3.67
8.71
2.43
3.56
15.23
5.62
6.83
153.82
25.48
115.89
30.56
25.68
18.97
10.24
1.06
3.27
2.7
0.8
0.76
2
2
2
Good 2
Good 2
bad 3
The attribute reduction was removed through the attribute one by one. The profit ability indicator concreted the reduction operation as follows: First step: Attribute value discretization. The result is in table 2. Second step: Delete the redundant attribute condition.
464
Y. Zhang, Z. Zhou, and Y. Shi
In table 2, new product income increase ratio and the firms goodwill income attribute value quite same not less than, explained their classified ability is same, deletes a attribute condition not to affect the classified result, deletes the firms goodwill income attribute. Third step: Attribute value reduction. Universe of discourse U= {A, B, C, D, E, F}, attribute condition set R= {a, b, c, d, e}; IND(R)={{A} {B} {C} {D} {E} {F}}, IND(R-{a})={{A} {B, E} {C} {D} {F}≠IND(R). The attribute a was removed to change the information table classified ability. Therefore attribute a is the core attribute which cannot be dropt out. Table 2. The Result of Discretization
Operation profit margin a Assets remuneration ratio b Net assets income rate c New produce income seizes increase ratio d Business goodwill e
A 1 1 2
B 2 2 2
C 1 1 1
D 2 2 1
E 1 2 2
F 1 1 1
2
2
2
1
2
1
2
2
2
1
2
1
Table 3. The Deleting Result of the Redundant Attribute Condition
Operation profit margin a Assets remuneration ratio b Net assets income rate c New produce income seizes increase ratio d
A 1 1 2
B 2 2 2
C 1 1 1
D 2 2 1
E 1 2 2
F 1 1 1
2
2
2
1
2
1
According to that U={A,B,C,D,E,F} is attribute condition set R={a,b,c,d,e}, IND(R)={{A}{B}{C}{D}{E}{F}}, IND{R-{a}}={{A}{B,E}{C}{D}{F}}≠IND{R}. Getting rid of attribute a revised the information table classification capability. The same principle may result in IND(R-{b})={{A, E} {B} {C} {D} {F}}≠IND{R}. Here b is the core attribute which cannot be dropt out. IND{R-{c}}={{A, C} {B} {D} {E} {F}}≠IND{R}, c also can not be dropt out. IND{R-{d}}={{A}{B}{C,F}{D} {E}}≠IND{R}, d is also the core attribute, and the profit capability core attribute set is {a,b,c,d}. Using the distinct matrix to the profit ability carries on attribute reduction resulted the same and confirmed the above method results. Utilizing the method above to the business capacity, the innovation ability and the market competition strength separately carried out attribute reduction. Finally their core attribute was obtained. The business capacity including the total assets velocity, the account receivable velocity, the superintendent manages level; the innovation ability accounts for the sales income ratio including the R&D expense, researches and develops the personnel ratio, the new equipment rate, invisible asset possessing rate; market competition strength including customer quantity, new product life cycle, product strain strength.
A Selection Method of ETF’s Credit Risk Evaluation Indicators
465
5 Concluding This article proposed an attribute reduction method for ETF’s credit risk evaluation indicator. Besides, a simplification and exercisable evaluation indicator system which respond the credit characteristic of ETF was obtained. It can be effectively employed by evaluation the credit risk of the ETF scientifically.
References 1. Wang, G.: Rough’s set theory and knowledge acquisition [M]. Xian Jiao tong University Press (2001) 2. Han, Z., Zhang, F., Wen, F.: Summarize of the Rough’s set theory with its applications [J]. Control Theory with Application 16 (1999) 3. Liu, Q.: Rough’s theory and Rough consequence. Scientific Press [M] (2001) 4. Zhang, W., Wu, W.: Introduction of Rough’s theory and research summarize [J]. Fuzzy System and Mathematics (14) (2000) 5. Shi, X.: Theory and means of credit evaluation [M]. The Administered and Economy Press (2002) 6. Chen, L., Zhou, Z.F.: To Analyze Risks of Credit for Enterprises of New Technique on the Internet Development from among the China [J]. Value Engineering (4) (2005) 7. Zhang, Y., Zhou, Z.: Commercial bank credit risk evaluation index entropy power choice method [J]. Journal of University of Electronic Science and Technology of China 35(5), 857–860 (2006) 8. Zhang, Y., Zhou, Z.: Based on fuzzy entropy commercial bank credit risk evaluation index choice method [J]. Management Review 18(7), 27–31 (2006)
Estimation of Market Share by Using Discretization Technology: An Application in China Mobile Xiaohang Zhang, Jun Wu, Xuecheng Yang, and Tingjie Lu Economics and Management School, Beijing University of Posts and Telecommunications, Beijing China {zhangxiaohang, junwu, yangxuecheng, lutingjie}@bupt.edu.cn
Abstract. The mobile market is becoming more competitive. Mobile operators having been focusing on the market share of high quality customers. In this paper, we propose a new method to help mobile operator to estimate the share in high quality customers market based on the available data, inter-network calling detail records. The core of our method is a discretization algorithm which adopts the Gini criterion as discretization measure and is supervised, global and static. In order to evaluate the model, we use the real life data come from one mobile operator in China mainland. The results prove that our method is effective. And also our method is simple and easy to be incorporated into operation support system to predict periodically.
1 Introduction Due to deregulation, new technologies, and new competitors, the telecommunication industry becomes more competitive than ever. Two biggest mobile operators in China mainland, China Mobile and China Unicom, are struggling for acquiring the customers. As well as attracting the incoming customers, retention of the own high quality customers and acquisition of competitors’ customers are becoming the core tasks of each operator. The reason is that these customers are not only the source of the revenue but also the foundation of maintaining an advantage in such a competitive environment. For each operator, the pre-step of retention and acquisition is to estimate the share in high quality customers market and be control of changes of share. This prework is very important because it can help to duly find the potential problems during the process of managing the high quality customers and can be guideline to supervise the operation of sub operators to guarantee that they can be kept in sustaining developments. In this paper, we propose a new method to help one China Mobile operator in China mainland to estimate the share in high quality customers market. Due to the lack of the foundational data, the estimation can only be based on the inter-network calling detail records (CDRs) which are stored in the operation support systems (OSS). The CDRs describe the calling behaviors between the competitors’ customers and the operators’ internal customers, which are composed of many fields including calling part, called part, starting time and duration of the call. However, the calling M. Bubak et al. (Eds.): ICCS 2008, Part II, LNCS 5102, pp. 466 – 475, 2008. © Springer-Verlag Berlin Heidelberg 2008
Estimation of Market Share by Using Discretization Technology
467
types, which include local call, long-distance call and roam call, are difficult to identify from the inter-network CDRs. In order to decrease the costs of data process, the estimation can only depend on the total call duration of each customer. In the traditional method, the estimation process can be decomposed of three steps. Firstly, the distribution of inter-network call duration of China Unicom’s customers is computed. Secondly, a cut point is chosen based on the distribution. Finally, the market share can be estimated by using the scale of the customers whose calling duration is greater than the chosen cut point. These customers are considered as high quality ones. The foundational hypothesis of the traditional method is that the customers who have higher inter-network calling duration are more possible to be high quality customers. Although this method is adopted universally by operators, it has some drawbacks and can result in estimation error.
-
-
Due to the difference of the customers’ behavioral structure and the difference of the prices of telecom services, the hypothesis of the traditional method can’t be satisfied. For example, for some customers, the long-distance calls and roam calls occupy bigger proportion in the total duration than others’. Because the prices of these two services are higher, although they may have shorter total calling duration, they also can be high quality customers. The China Unicom customers can give call to many customers including not only China Mobile but also China Telecom and China Netcom ones. However, the OSS of China Mobile only records the CDRs between China Mobile and China Unicom. So if some customers have long calling duration, most of which are not directional to China Mobile, then these customers are possible to be considered as low quality ones.
The most current research of market share estimation focus on the sales prediction [1, 2]. The research on customers’ market share estimation in telecom is rare. And because of the disadvantages of traditional method, we propose a new method which is a process of inference from China mobile customers to China Unicom customers. During the process our method adopts supervised discretization technology which has good characters compared with other ones. We estimate the market share based on the real life data which come from one China Mobile operator. And the data are customers’ CDRs from January to November in 2007. The results show that our method is simple and comparatively accurate.
2 Discretization Method The discretization methods can be classified according to three axes [3]: supervised versus unsupervised, global versus local, and static versus dynamic. A supervised method would use the classification information during the discretization process, while the unsupervised method would not depend on class information. The popular supervised discretization algorithms contain many categories, such as entropy based algorithms including Ent-MDLP [4, 5], D2 [6], Mantaras distance [7], dependence based algorithms including ChiMerge, Chi2 [8], modified Chi2 [9], Zeta [10], and binning based algorithms including 1R, Marginal Ent. The unsupervised algorithms contain equal width, equal frequency and some other recently proposed algorithms such as PCA-based algorithm [11] and an algorithm using tree-based density estimation [12].
468
X. Zhang et al.
Local methods produce partitions that are applied to localized regions of the instance space. Global methods, such as binning, produce a mesh over the entire continuous instances space, where each feature is partitioned into regions independent of the other attributes. Many discretization methods require a parameter, , indicating the maximum number of partition intervals in discretizing a feature. Static methods, such as EntMDLP, perform the discretization on each feature and determine the value of n for each feature independent of the other features. However, the dynamic methods search through the space of possible n values for all features simultaneously, thereby capturing interdependencies in feature discretization. Most of the popular methods are classified according to the three axes in [13]. The Fig. 1 describes the basic process of discretization. In our application, we use the Gini-criterion based discretization which is supervised, global and static.
Fig. 1. Process of discretization
2.1 Notations Suppose for a supervised classification task with k class labels, the training data set consists of | | instances, where each instance belongs to only one of classes. Let be one continuous attribute. Next, there exists a discretization scheme on the attribute A, which discretizes the attribute into discrete intervals bounded by the pairs of numbers: :
,
,
,
,
,
,
Estimation of Market Share by Using Discretization Technology
469
where and , respectively, are the minimal and the maximal values of the attribute , and the values in are arranged in ascending order. The discretization scheme is called an -scheme. These values constitute the cut points set , , , of the discretization scheme. The discretization algorithm is to determine the value of and the cut points set. 2.2 Discretization Measure For the ith interval , , we can get a conditional class probability , , where is the th class probability in th interval and satisfies ∑ 1. The Gini [14] is defined as follows: (1) In our algorithm we use Gini gain as the discretization measure. The cut point is chosen based on the criterion, whose Gini gain value is the biggest on attribute . The Gini gain ∆ is defined as: ∆
| | | |
| | | |
, ;
,
(2)
where · is the Gini measure defined in Eq. (1), and are the subsets of partitioned by the cut point , |·| denotes the number of instances. 2.3 Stopping Criterion The training set is split into two subsets by the cut point which is chosen using Gini measure. Subsequent cut points are selected by recursively applying the same binary discretization method to one of the generated subsets, which has biggest Gini gain value, until the stopping criterion is achieved. Because the quality of discretization methods involves a tradeoff between simplicity and predictive accuracy, the stopping criterion of our algorithm is defined by ,
(3)
where denotes the current number of intervals, is a positive integer determined by the user, is the Gini value with intervals, defined by | | | |
.
(4)
Eq. (3) can be easily returned as follows: /
/
.
(5)
470
X. Zhang et al.
From the Eq. (5), we can know that the parameter can affect the number of partition intervals. The smaller the right part of the Eq. (5) is, the more chances the algorithm has to discretize the continuous attribute further. In general, higher value can result in more intervals. 2.4 Comparison with Other Discretization Measures The following are two other discretization measures. ∑
Entropy measure:
(6) ,
Minimal measure:
,
,
(7)
These two measures and Gini measure have common characters. Their values are high for lower probable events and low otherwise. Hence, they are the highest when 1/ for each j; and they are the lowest when each event is equi-probable, i.e., 1 for one event and 0 for all other events. As known, entropy is one of the most commonly used discretization measures in the discretization literature. When there are only two classes in a classification task, entropy undoubtedly is an excellent measure of class homogeneity. However, when there are more than two classes in a classification problem, entropy sometimes cannot accurately reflect the class homogeneity. For example, see the Table 1. Table 1. Example for comparison of three measures Case 1
P1 1/2
P2 1/2
P3 0
Entropy 1
Gini 0.5
Minimal 0
2
1/6
2/3
1/6
1.25
0.5
1/6
3
1/8
3/4
1/8
1.06
0.406
1/8
In the example, the entropy value for case 1 is the lowest. Because the smaller value is preferred, the case 1 is the best of all. But the case 3 has higher prediction accuracy, it is 3/4 higher than 1/2 in case 1. Likewise, the minimal measure has the same ordered value as entropy. However, Gini measure gives the case 3 the lowest value. So in this example, Gini measure shows the better ability in terms of classification than the other ones. We compare the contours of three measures, entropy, Gini, and minimal, in Fig. 2. The vertex of the triangle denotes the event in which only one class label occurs. The center 1/3, 1/3, 1/3 denotes that each class label occurs at the equal probability. The farther away the point moves from , the higher the degree of the class heterogeneity. The minimal measure only consider the minimum class information in an interval, whereas the Gini measure takes into account all the class information and evaluates the interval according to the whole class distribution. The shape of contour of entropy
Estimation of Market Share by Using Discretization Technology
471
O O
O
O
Fig. 2. Comparison of the contours of three measures. (a) The contour of entropy. (b) The contour of minimal. (c) The contour of Gini.
lies between minimal and Gini. Entropy and minimal measure seem to prefer the points that are close to the boundary of the triangle. So they prefer (1/2, 1/2, 0) than (1/8, 3/4, 1/8).
3 Estimation Method 3.1 Notations All telecom operators pay more attention to average revenue per user (ARPU). The customers with high ARPU mean that they can provide more revenue to operators and they are the high quality customers. The ARPU depends on the calling minutes per user in one month (MOU) and the price of services. In our application, the customers which satisfy the following conditions can be defined as high quality customers in th month.
- The average ARPU of th, 1th, 2th months is greater than 100 Yuan. - During th, 1th, 2th months, the calling behaviors all exist. In order to give the basic definitions, the customers’ average MOU of th, 1th, 2th months are sorted by ascending order. Then the MOU values are discretized into many intervals by cut points. The number of customers and high quality customers falling into these intervals can be computed. The Fig. 3 shows the process. In Fig. 3, 1,2, , represents the th cut point, is the number of the is number of the high China Mobile customers falling into the th MOU interval,
Fig. 3. Discretization of MOU
472
X. Zhang et al.
quality customers falling into th MOU interval. And is the number of the China Unicom customers falling into the th MOU interval, is number of the high quality customers falling into th MOU interval. MOU means the average MOU of the continuous three months. 3.2 Basic Hypothesis The basic hypothesis of our method is that in all discretized intervals, proportions of the high quality customers for two operators are similar, which can be represented by ,
, ,
, ,
.
(8)
The meaning of the hypothesis is concluded in the following.
-
In each MOU interval, the distribution of the proportion between the China Unicom customers’ inter-network call MOU and their total call duration is similar to the China Mobiles’ customers. - For China Mobile and China Unicom customers, the call behavior structure is similar in each MOU interval. Because of the similarity of service structure, service quality and service price for the two operators, the hypothesis can be approximately satisfied. And to some extent, the hypothesis depends on the chosen cut points which discretize the MOU into intervals. We adopt the Gini discretization method to choose the cut points. 3.3 Estimation Based on the basic hypothesis, the formula of estimating the number of China Unicom’s high quality customers is as follows: ,
(9)
and can be computed simply based on internal CDRs and can be where computed based on inter-network CDRs. In order to estimate the market share of high quality customers for each sub mobile operators, we also need to estimate the number of high quality customers of each sub China Unicom operator. ,
(10)
where represents the number of high quality customer in th sub China Unicom represents the number of high quality customers who fall into th operator, MOU interval in th sub China Mobile operator, is the number of customers falling into th MOU interval in th sub China Mobile operator, and is the number of customers who fall into th MOU interval in th sub China Unicom operator.
Estimation of Market Share by Using Discretization Technology
473
3.4 Model Evaluation Due to lack of the real market share data, we can’t directly evaluate the model. We adopted the self-validation method. In other words, we use our method to estimate the proportion of high quality customers to all customers in China Mobile which can be compared to the real proportion. We use the data from nth to n+2th months to build model and estimate the proportion of n+3th month. The process is shown in Fig. 4.
Fig. 4. Process of evaluation of model
4 Results In the application, we extract the internal CDRs and inter-network CDRs from January to November in 2007. The cut points obtained by using supervised discretization technology are shown in Table 2. Here the best number of cut points is seven, so there are eight discretization intervals. Because of totally eleven months, we can get nine groups of cut points. In order to get robust results, we adopt the median of all groups as the final cut points. Table 2. The cut points obtain by using supervised discretization method cutP
G1
G2
G3
G4
G5
G6
G7
G8
G9
Median
1 2 3 4 5 6 7
64 115 160 210 268 318 440
70 120 165 217 264 338 436
74 126 165 208 254 341 465
78 132 182 229 277 367 527
80 123 173 224 284 353 476
90 143 191 242 302 383 501
84 141 194 255 326 407 524
87 144 199 246 301 384 512
90 140 190 245 312 400 481
80 132 182 229 284 367 481
The evaluation results are shown in Table 3, in which Ci (i=1, 2, …, 12) represents the ith sub China Mobile operator. Totally there are twelve sub operators. And because of eleven months, we can obtain eight groups of errors. We can see that the average error of each sub operator is less than 6%, which proves that our estimation method can obtain good accuracy. The Table 4 shows the results of market share
474
X. Zhang et al.
estimation. From the description, we can see that our estimation method is so simple that it can be easily incorporated into operation support system to predict periodically. Table 3. The estimation errors
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 AVE
E1 2.0% 3.7% 3.7% -7.6% 6.2% 4.8% 3.5% 6.1% 3.5% 3.9% 5.1% 1.6% 3.0%
E2 2.7% 10.5% 3.7% 3.9% 8.5% 5.0% 2.7% 3.8% 1.8% 6.3% 6.3% 5.9% 5.1%
E3 3.6% 6.5% 4.2% 3.1% 10.2% 6.0% 5.4% 4.6% 4.3% 7.7% 8.5% 4.8% 5.7%
E4 2.5% 4.4% 3.1% 2.1% 7.7% 3.0% 2.0% 4.2% 4.0% 5.1 6.2% 3.2% 3.9%
E5 1.6% 3.1% 2.2% 1.9% 3.4% 2.1% 0.1% 4.3% 3.3% 3.7% 4.9% 0.6% 2.6%
E6 2.1% 3.5% 1.7% 1.0% 3.8% 0.4% 0.3% 2.6% 1.9% 1.9% 4.4% 1.1% 2.1%
E7 3.5% 7.1% 2.3% 2.3% 3.4% 1.8% 1.9% 2.1% 3.4% 1.4% 4.5% 1.6% 3.0%
E8 3.4% 6.6% 2.8% 3.3% 2.6% 1.7% 0.7% -0.3% 3.8% 1.4% 5.5% 1.8% 2.8%
AVE 2.7% 5.7% 3.0% 1.3% 5.7% 3.1% 2.0% 3.4% 3.3% 3.9% 5.7% 2.6%
STD 0.7% 2.5% 0.9% 3.7% 2.8% 2.0% 1.8% 1.9% 0.9% 2.3% 1.3% 1.9%
Table 4. Estimation of share in high quality customers market C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12
Mar 83.92% 74.86% 78.54% 80.06% 75.28% 86.30% 87.89% 79.32% 76.65% 86.51% 82.96% 86.38%
Apr 84.12% 75.10% 78.89% 78.39% 75.83% 86.62% 88.02% 79.54% 77.15% 86.65% 83.29% 86.33%
May 84.52% 76.71% 79.16% 78.63% 77.21% 86.91% 87.85% 80.02% 77.41% 87.11% 83.88% 86.99%
Jun 84.82% 77.41% 79.59% 79.08% 78.54% 87.36% 88.07% 80.74% 78.01% 87.72% 84.66% 87.53%
July 84.97% 77.62% 80.00% 79.58% 79.20% 87.78% 88.05% 81.44% 78.45% 88.36% 85.40% 87.82%
Aug 85.12% 77.54% 80.43% 80.10% 79.72% 88.03% 88.03% 81.92% 78.65% 88.78% 86.06% 87.87%
Sep 85.49% 77.86% 80.80% 80.62% 80.34% 88.20% 88.09% 82.29% 79.14% 89.33% 86.62% 88.31%
Oct 86.22% 78.75% 81.56% 81.46% 81.30% 88.37% 88.47% 82.78% 79.74% 89.91% 87.33% 88.99%
Nov 86.93% 79.67% 82.31% 82.23% 82.02% 88.67% 88.90% 83.34% 80.64% 90.52% 87.92% 89.52%
5 Conclusions In this paper, we propose a new method to estimate the market share of high quality customers for mobile operators. This method is based on a discretization technology which adopts the Gini criterion as discretization measure. The Gini-based discretization method is supervised, static and global, which is compared with other methods and has its’ good characters. We describe the complete process of estimating market share. And based on real life data come from one China mobile operator, the estimation method is implemented. The results prove that our method is effective. Our method is also simple and can be easily incorporated into the OSS to predict periodically.
Estimation of Market Share by Using Discretization Technology
475
References 1. Kumar, V., Anish, N., Rajkumar, V.: Forecasting category sales and market share for wireless telephone subscribers: a combined approach. International Journal of Forecasting 18(4), 583–603 (2002) 2. Fok, D., Franses, P.H.: Forecasting market shares from models for sales. International Journal of Forecasting 17(1), 121–128 (2001) 3. Dougherty, J., Kohavi, R., Sahami, M.: Supervised and Unsupervised Discretization of Continuous Features. In: Proc. 12th Int’l Conf. Machine Learning, pp. 194–202 (1995) 4. Fayyad, U., Irani, K.: Multi-interval discretization of continuous-valued attributes for classification learning. In: Proc. Thirteenth International Joint Conference on Artificial Intelligence, pp. 1022–1027. Morgan Kaufmann, San Francisco (1993) 5. Fayyad, U., Irani, K.: Discretizing continuous attributes while learning bayesian networks. In: Proc. Thirteenth International Conference on Machine Learning, pp. 157–165. Morgan Kaufmann, San Francisco (1996) 6. Catlett, J.: On changing continuous attributes into ordered discrete attributes. In: Kodratoff, Y. (ed.) EWSL 1991. LNCS, vol. 482, pp. 164–177. Springer, Heidelberg (1991) 7. Cerquides, J., Mantaras, R.L.: Proposal and empirical comparison of a parallelizable distance-based discretization method. In: KDD 1997: Third International Conference on Knowledge Discovery and Data Mining, pp. 139–142 (1997) 8. Liu, H., Setiono, R.: Chi2: Feature selection and discretization of numeric attributes. In: Vassilopoulos, J.F. (ed.) Proceedings of the Seventh IEEE International Conference on Tools with Artificial Intelligence, Herndon, Virginia, November 5-8, 1995, vol. 1995, pp. 388–391. IEEE Computer Society Press, Los Alamitos (1995) 9. Tay, F.E.H., Shen, L.X.: A Modified Chi2 Algorithm for Discretization. IEEE Trans. Knowledge and Data Eng. 14(3), 666–670 (2002) 10. Ho, K.M., Scott, P.D.: Zeta: A global method for discretization of continuous variables. In: KDD 1997: 3rd International Conference of Knowledge Discovery and Data Mining. Newport Beach, CA, pp. 191–194 (1997) 11. Sameep, M., Srinivasan, P., Hui, Y.: Toward Unsupervised Correlation Preserving Discretization. IEEE Transaction on Knowledge and Data Engineering 17(8) (August 2005) 12. Gbi, S., Eibe, F.: Unsupervised Discretization using Tree-based Density Estimation. Lecture Notes in Computer Science (2006) 13. Huan, L., Farhad, H., Lim, T.C., Manoranjan, D.: Discretization: An Enabling Technique. Data Mining and Knowledge Discovery 6, 393–423 (2002) 14. Leo, B., Jerome, F., Charles, J.S., Olshen, R.A.: Classification and Regression Trees. Wadsworth International Group (1984) 15. Xiao-Hang, Z., Jun, W., Ting-Jie, L., Yuan, J.: A Discretization Algorithm Based on Gini Criterion. In: Machine Learning and Cybernetics, 2007 International Conference, August 19-22, 2007, vol. 5, pp. 2557–2561 (2007)
A Rough Set-Based Multiple Criteria Linear Programming Approach for Classification Zhiwang Zhang1, Yong Shi2, Peng Zhang3, and Guangxia Gao4 1
School of Information of Graduate University of Chinese Academy of Sciences, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100080, China [email protected] 2 Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100080, China; College of Information Science and Technology, University of Nebraska at Omaha, Omaha NE 68182, USA [email protected] 3 School of Information of Graduate University of Chinese Academy of Sciences, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100080, China [email protected] 4 Foreign Language Department, Shandong Institute of Business and Technology, Yantai, Shandong 264005, China [email protected]
Abstract. It is well known that data mining is the process of discovering unknown, hidden information in a large amount of data, extracting valuable information, and using that information to make important business decisions. Data mining has developed into a new information technology, including regression, decision trees, neural networks, fuzzy sets, rough sets, support vector machines and so on. This paper puts forward a rough set-based multiple criteria linear programming (RS-MCLP) approach for solving classification problems in data mining. Firstly, we describe the basic theory and models of rough sets and multiple criteria linear programming (MCLP) and analyse their characteristics and advantages in practical applications. Secondly, a detailed analysis of their respective deficiencies is provided. Because of the mutual complementarity between them, we then build the RS-MCLP method and models, which integrate their virtues and overcome their adverse factors simultaneously. In addition, we develop and implement these algorithms and models on the SAS and Windows platforms. Finally, many experiments show that the RS-MCLP approach is superior to the single MCLP model and other traditional classification methods in data mining. Keywords: Data mining, Rough Set, MCLP, Classification.
1 Introduction
Data mining has been used by many organizations to extract information or knowledge from large volumes of data and then use the valuable information to make critical business decisions. Consequently, analysis of the historical data collected in a data
warehouse or data mart can give better insight into customers, help evaluate a firm's position in its industry, improve the quality of decision making, and effectively increase market competitiveness. From a methodological point of view, data mining can be performed through association, classification, clustering, prediction, sequential patterns, and similar time sequences [Han and Kamber, 2001]. For classification, data mining algorithms use the existing data to learn decision functions that map each case of the selected data into a set of predefined classes. Among various mathematical tools, including statistics, decision trees, fuzzy sets, rough sets and neural networks, linear programming was introduced to classification more than twenty years ago [Freed and Glover, 1981]. Given a set of classes and a set of attribute variables, one can use a linear programming model to define a boundary value separating the classes. Each class is then represented by a group of constraints with respect to a boundary in the linear program. The objective function minimizes the overlapping rate of the classes or maximizes the distance between the classes. The linear programming approach results in an optimal classification, and it is also flexible enough to construct effective models for multi-class problems. However, the MCLP model is not good at dimensionality reduction or at removing information redundancy, especially when facing many attributes and a large amount of data. Fortunately, rough sets can find the minimal attribute set and efficiently remove redundant information [Z. Pawlak, 1982]. Consequently, developing an RS-MCLP approach to data mining is a promising way to overcome these disadvantages. In this paper, we give a full description of the rough set-based MCLP method and model for classification in data mining. First, a detailed introduction to the MCLP model and rough sets is given in the related work section, including the MCLP algorithms, rough sets for feature selection, and their respective virtues for classification. Then we put forth the methodology of the rough set-based MCLP model after analyzing their respective deficiencies, and implement the combined model on the SAS and Windows platforms. We then describe the advantages of the RS-MCLP model. Finally, we present comprehensive experiments on different data sets and draw conclusions.
2 Related Work

2.1 MCLP Approach for Classification
A general problem of data classification using multiple criteria linear programming can be described as follows. Given a set of n variables or attributes in a database A = (a_1, a_2, ..., a_n), let A_i = (a_{i1}, a_{i2}, ..., a_{in}) ∈ R^n be the sample observations of the data for the variables, where i = 1, 2, ..., l and l is the sample size. If a given problem can be predefined as s different classes C_1, C_2, ..., C_s, then the boundary between the j-th and (j+1)-th classes can be b_j, j = 1, 2, ..., s-1. We then determine the coefficients of an appropriate subset of the variables that can represent the whole decision space, denoted by X = (x_1, x_2, ..., x_m) ∈ R^m (m ≤ n), and scalars b_j, such that the separation of these classes can be described as follows:
(1) A_i X \le b_1, \forall A_i \in C_1; \quad b_{k-1} \le A_i X \le b_k, \forall A_i \in C_k, k = 2, \ldots, s-1; \quad A_i X \ge b_{s-1}, \forall A_i \in C_s,

where \forall A_i \in C_j, j = 1, 2, \ldots, s, means that the data case A_i belongs to class C_j. For binary classification we need to choose a boundary b to separate the two classes G (Goods) and B (Bads). For simplicity we present only the binary case, which can easily be extended to multi-class settings. That is:

(1′) A_i X \le b, A_i \in G and A_i X \ge b, A_i \in B,

where A_i is the vector of values of the subset of the variables. For a better separation of Goods and Bads, two measurements were proposed: the overlapping degree with respect to A_i, and the distance by which A_i departs from its adjusted boundary b [Freed and Glover, 1981]; Glover later introduced both factors into the models [Glover, 1990]. Consequently, we have the following formulations. Let \alpha_i be the overlapping degree described above; to minimize the sum of the \alpha_i, the primal linear program can be written as:

(2) Minimize \sum_i \alpha_i, subject to: A_i X \le b + \alpha_i, A_i \in G and A_i X \ge b - \alpha_i, A_i \in B.

Let \beta_i be the distance defined above; to maximize the sum of the \beta_i, the primal linear program can be expressed as:

(3) Maximize \sum_i \beta_i, subject to: A_i X \ge b - \beta_i, A_i \in G and A_i X \le b + \beta_i, A_i \in B.

Considering the two measurements simultaneously, we obtain the hybrid multiple criteria linear programming model:

(4) Minimize \sum_i \alpha_i and Maximize \sum_i \beta_i, subject to: A_i X \le b + \alpha_i - \beta_i, A_i \in G and A_i X \ge b - \alpha_i + \beta_i, A_i \in B,

where the A_i are given, X and b are unrestricted, and \alpha_i, \beta_i \ge 0. Furthermore, the compromise solution approach has been used to improve model (4) in business practice [Shi and Yu, 1989]. Let the ideal value of -\sum_i \alpha_i be \alpha^* (\alpha^* > 0) and the ideal value of \sum_i \beta_i be \beta^* (\beta^* > 0). Then, if -\sum_i \alpha_i > \alpha^*, the regret measure is defined as -d_\alpha^+ = \alpha^* + \sum_i \alpha_i (d_\alpha^+ \ge 0); otherwise it is 0. If -\sum_i \alpha_i < \alpha^*, the regret measure is written as d_\alpha^- = \alpha^* + \sum_i \alpha_i (d_\alpha^- \ge 0); otherwise it is 0. Thus we have \alpha^* + \sum_i \alpha_i = d_\alpha^- - d_\alpha^+ and |\alpha^* + \sum_i \alpha_i| = d_\alpha^- + d_\alpha^+. Similarly, \beta^* + \sum_i \beta_i = d_\beta^- - d_\beta^+ and |\beta^* - \sum_i \beta_i| = d_\beta^- + d_\beta^+, with d_\beta^+ \ge 0, d_\beta^- \ge 0. To sum up, the improved MCLP model used for modeling in this paper may be expressed as:

(5) Minimize d_\alpha^- + d_\alpha^+ + d_\beta^- + d_\beta^+, subject to: \alpha^* + \sum_i \alpha_i = d_\alpha^- - d_\alpha^+, \beta^* + \sum_i \beta_i = d_\beta^- - d_\beta^+, A_i X = b + \alpha_i - \beta_i, A_i \in G and A_i X = b - \alpha_i + \beta_i, A_i \in B,

where A_i, \alpha^* and \beta^* are given, X and b are unrestricted, and \alpha_i, \beta_i, d_\alpha^-, d_\alpha^+, d_\beta^-, d_\beta^+ \ge 0.

Owing to the following characteristics, MCLP models are more popular than traditional nonlinear models: a) simplicity — from the algorithm to the model results, MCLP is easy to understand and explain; b) flexibility — the user may freely input different parameters to adjust model performance and obtain better effects; c) generalization — by systematically considering the best trade-off between minimizing the overlapping degree and maximizing the distance from the boundary, the model achieves better classification accuracy and generalization on training and test sets.
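Since model (5) is an ordinary linear program, it can be assembled and solved with any LP solver. The following Python sketch is only illustrative (the paper's own implementation is in SAS): it encodes model (5) for a binary Goods/Bads problem, and the function name, the ±1 class coding and the treatment of the boundary b as a given input are our own assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def mclp_classify(A, y, alpha_star=0.01, beta_star=300000.0, b=1.0):
    """Compromise MCLP model (5) for two classes.
    A: (l, m) observation matrix; y: +1 for Goods (G), -1 for Bads (B)."""
    l, m = A.shape
    # variable layout: [X (m, free) | alpha (l) | beta (l) | d_a-, d_a+, d_b-, d_b+]
    n_var = m + 2 * l + 4
    c = np.zeros(n_var)
    c[-4:] = 1.0                              # minimise d_a- + d_a+ + d_b- + d_b+

    A_eq = np.zeros((l + 2, n_var))
    b_eq = np.zeros(l + 2)
    for i in range(l):                        # Ai.X = b + alpha_i - beta_i (Goods)
        A_eq[i, :m] = A[i]                    # Ai.X = b - alpha_i + beta_i (Bads)
        A_eq[i, m + i] = -y[i]                # coefficient of alpha_i
        A_eq[i, m + l + i] = y[i]             # coefficient of beta_i
        b_eq[i] = b
    A_eq[l, m:m + l] = 1.0                    # alpha* + sum(alpha) = d_a- - d_a+
    A_eq[l, -4], A_eq[l, -3] = -1.0, 1.0
    b_eq[l] = -alpha_star
    A_eq[l + 1, m + l:m + 2 * l] = 1.0        # beta* + sum(beta) = d_b- - d_b+
    A_eq[l + 1, -2], A_eq[l + 1, -1] = -1.0, 1.0
    b_eq[l + 1] = -beta_star

    bounds = [(None, None)] * m + [(0, None)] * (2 * l + 4)
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    return res.x[:m], b                       # weights X and the boundary used
```

A new case A_new would then be scored as A_new·X and assigned to G when the score does not exceed b, and to B otherwise.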
2.2 Rough Sets-Based Feature Selection Method
The MCLP model cannot identify and remove redundancy in the variable or attribute set; that is, it is not good at judging which attributes are useful and important and which are relatively unnecessary and unimportant. Rough set methods, however, have an advantage in this respect. Rough set theory, developed by Z. Pawlak, is a mathematical method for dealing with fuzzy and uncertain information and for discovering knowledge and rules hidden in data [Z. Pawlak, 1982]. Knowledge or attribute reduction is one of the kernel parts of rough sets, and it can efficiently reduce redundancy in a knowledge base or attribute set. For supervised learning, a decision system or decision table often has the form A = (U, A ∪ {d}), where U is a nonempty finite set of objects called the universe, A is a nonempty finite set of attributes, and d ∉ A is the decision attribute. The elements of A are called conditional attributes or simply conditions. A binary relation R ⊆ X × X which is reflexive (an object is in relation with itself, xRx), symmetric (if xRy then yRx) and transitive (if xRy and yRz then xRz) is called an equivalence relation. The equivalence class of an element x ∈ X consists of all objects y ∈ X such that xRy. Let A = (U, A) be an information system; then with any B ⊆ A there is an associated equivalence relation IND_A(B): IND_A(B) = {(x, x′) ∈ U² | ∀a ∈ B, a(x) = a(x′)}, where IND_A(B) is called the B-indiscernibility relation. If (x, x′) ∈ IND_A(B), the objects x and x′ are indiscernible from each other by the attributes in B. The equivalence classes of the B-indiscernibility relation are denoted [x]_B. An equivalence relation induces a partitioning of the universe U, and these partitions can be used to build new subsets of the universe; the subsets most often of interest are those whose objects have the same value of the outcome attribute [Komorowski, Polkowski and Skowron, 1998]. In a word, rough set theory is a powerful data analysis tool with the following virtues:
a) no prior knowledge required: traditional analysis methods (e.g., fuzzy sets, probability and statistics) can also process uncertain information, but they need additional information or prior knowledge, whereas rough sets only make use of the information in the data set; b) effective expression and processing of uncertain information: on the basis of equivalence and indiscernibility relations it can remove redundant information, obtain the minimal reduction of knowledge or attributes, and discover simplified knowledge and rules; c) robustness to missing values: rough sets can avoid the effects of missing values in the data set; d) high performance: it can rapidly and efficiently process large amounts of data with many variables or attributes.
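To illustrate how the indiscernibility relation supports attribute reduction, here is a small Python sketch; the greedy elimination heuristic is only one simple way to search for a reduct and is not claimed to be the exact procedure used by the authors.

```python
from collections import defaultdict

def indiscernibility_classes(rows, attrs):
    """Equivalence classes of IND_A(B): objects agreeing on every attribute in attrs."""
    classes = defaultdict(list)
    for idx, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(idx)
    return list(classes.values())

def is_consistent(rows, decisions, attrs):
    """The decision table stays consistent if no equivalence class mixes decisions."""
    return all(len({decisions[i] for i in cls}) == 1
               for cls in indiscernibility_classes(rows, attrs))

def greedy_reduct(rows, decisions, all_attrs):
    """Drop attributes one by one while consistency is preserved (simple heuristic,
    not guaranteed to return the minimal reduct)."""
    reduct = list(all_attrs)
    for a in list(all_attrs):
        trial = [x for x in reduct if x != a]
        if trial and is_consistent(rows, decisions, trial):
            reduct = trial
    return reduct
```

Here rows is a list of dictionaries mapping attribute names to (discretized) values, and decisions holds the decision attribute d.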
3 A Rough Set-Based MCLP Approach for Classification

3.1 The Limitation of Rough Sets and MCLP Methods
Although rough sets have the advantages mentioned above, they lack fault tolerance and generalization to new data cases, and they only deal with discrete data. The MCLP model, however, is good at those aspects. The MCLP model can obtain a good compromise solution provided that it controls well the trade-off between minimizing the overlapping degree and maximizing the distance from the boundary. That is, it does not attempt to find an optimal solution, but rather seeks a non-inferior solution with better generalization by using regret measures. Nevertheless, the MCLP model can only deal with continuous or discrete numeric data. In addition, it cannot handle missing values or produce a reduced conditional attribute set by itself.

3.2 Rough Set-Based MCLP Approach for Classification
According to the above analysis, we can identify the differences and the mutual complementarity between them, and combine rough sets and the MCLP model in data mining. In general, the MCLP model cannot reduce the dimensionality of the input information space; moreover, it results in excessively long training times when that dimensionality is too large, and to some extent the model may fail to find a solution to the primal problem. Rough set theory, however, can discover hidden relations among the data, remove redundant information, and achieve good dimensionality reduction. In practical applications, because rough sets are sensitive to noise in the data, model performance deteriorates when results learned from a noise-free data set are applied to a noisy data set; that is, rough sets generalize poorly. The MCLP model, in contrast, provides good noise suppression and generalization. Therefore, according to these complementary characteristics, the integration of rough sets and the MCLP model produces a new hybrid model or system: rough sets are used as a prefixion part responsible for data preprocessing, and MCLP is regarded as the classification and prediction system that uses the information reduced by rough sets. Consequently, the system structure of the rough set-based MCLP model for classification and prediction is presented in Figure 1.
[Figure 1 depicts the system structure: a prefixion system based on rough sets (organizing and reducing the decision table to obtain the minimal conditional attribute set and the corresponding training and testing samples) feeds a rear-mounted MCLP classification and prediction system, followed by results evaluation, explanation and application.]
Fig. 1. Rough set-based MCLP model for classification
Firstly, we derive the attributes or variables from the source data sets collected according to the classification requirements and quantify these attributes. A data set composed of quantified attributes or variables may then be represented as a table, where each row represents a case, an event, an observation, or simply an object, and each column represents an attribute or a variable that can be measured for each object; this table is therefore called a decision table (or decision system, information system). Attributes are categorized into conditional attributes and decision attributes. Reduction of the decision table includes both reducing conditional attributes and reducing decision rules. For conditional attribute reduction, we check the consistency of the decision table after removing an attribute; if the decision table remains consistent, we remove that attribute, and finally obtain the minimal attribute set. For decision rule reduction, after deleting repeated information in the sample set we examine the remaining training set to find which attributes are redundant, and the minimal decision table or system is then produced by removing redundant and repetitious information. Of course, decision rule reduction may also be completed ahead of conditional attribute reduction to obtain the minimal decision table. Subsequently, we need to create a new training set
based on the minimal attribute set and the corresponding primal data; this set preserves the important attributes that affect the performance of the classification model. We then use this data set to train the MCLP model. Similarly, we create a new testing set based on the minimal attribute set and the corresponding primal data. Finally, we use the testing set to evaluate the classifier learned from the training set, obtain the prediction results, and give an evaluation and explanation of these results.

3.3 The Characteristics of the Rough Set-Based MCLP Approach
In the rough set-based MCLP approach, data preprocessing with the rough set method yields the minimal attribute set and removes redundant information. Accordingly, the quantity of data used by the MCLP model is reduced and the speed of the system is increased remarkably. On the other hand, the MCLP model, regarded as the rear-mounted system in this approach, possesses better fault tolerance and interference suppression capabilities, because the rough set method has already removed the correlated attributes and the overlapping information. In other words, the dimensionality reduction leads to a good comprehensive evaluation function and useful assessment results.
4 Algorithm of the Rough Set-Based MCLP Approach
In this part, we give the algorithm that implements the rough set-based MCLP model for classification (RS-MCLP), described as follows.

Input: A = (a_1, a_2, ..., a_n), // sample observations of the variables
Y = {0, 1} ∈ R, // target variable
b, // initial boundary value
α*, // ideal value of the negative sum of the overlapping degrees
β* // ideal value of the sum of the distances by which A_i departs from its adjusted boundary b

Output: X = (x_1, x_2, ..., x_m) ∈ R^m (m ≤ n), // weights of the minimal attribute set
α_i (1 ≤ i ≤ l), // the overlapping degree of each observation or case
β_i (1 ≤ i ≤ l) // the distance of each observation from the boundary

Processing flow: // rough set-based MCLP model for classification

Step one: preprocess the data and use rough set theory to compute the minimal attribute set:
(1) Discretize all continuous variables and merge unreasonable intervals of discrete variables using ChiMerge or other related methods [Liu and Motoda, 1998].
(2) Compute the minimal attribute set A′ = (a_1, a_2, ..., a_m) (m ≤ n) using rough set theory and methods on the basis of the discretized attribute set.

Step two: partition the data set, then compute the weights of the variables using the MCLP model and obtain the ordering value of each observation:
(1) Divide the data set with the minimal attribute set into independent training and testing sets.
(2) Train the MCLP model described in model (5) above on the training set.
(3) Check and validate the classification performance. If the result is satisfactory, the flow continues; otherwise, change the model parameters and return to step (2).
(4) Finally, compute the classification results on the testing set.

Step three: transform the above results into probabilities and evaluate the classification accuracy:
(1) From the results, use logistic regression to compute the probability of each instance, that is, log(p / (1 − p)) = A′X + ε, where p is the probability of the target class, A′ is the minimal attribute set obtained by the rough set methods, X is the vector of weights or coefficients of the variables obtained by the MCLP approach, and ε is an intercept term.
(2) Compute and report the evaluation of the classification, namely the type I and type II error rates, the total misclassification rate, and the KS and Gini indices.
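A compact sketch of this flow is shown below; it reuses the hypothetical greedy_reduct() and mclp_classify() helpers sketched in Sects. 2.1-2.2, uses scikit-learn's logistic regression purely for the probability calibration of step three, and assumes the attributes have already been discretized (e.g. by ChiMerge), so it is an illustration of the algorithm rather than the authors' SAS implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def rs_mclp(train_rows, train_y, test_rows, test_y, all_attrs, b=1.0):
    """End-to-end RS-MCLP flow: reduct -> MCLP weights -> logistic calibration."""
    # Step one: minimal attribute set via rough set reduction (hypothetical helper)
    reduct = greedy_reduct(train_rows, train_y, all_attrs)

    # Step two: MCLP weights on the reduced attribute set (hypothetical helper)
    A_train = np.array([[row[a] for a in reduct] for row in train_rows], float)
    A_test = np.array([[row[a] for a in reduct] for row in test_rows], float)
    y_signed = np.where(np.asarray(train_y) == 1, 1, -1)   # 1 = Good, 0 = Bad
    X, b = mclp_classify(A_train, y_signed, b=b)

    # Step three: log(p/(1-p)) = A'X + eps, fitted on the MCLP scores
    calib = LogisticRegression().fit((A_train @ X).reshape(-1, 1), train_y)
    prob = calib.predict_proba((A_test @ X).reshape(-1, 1))[:, 1]
    pred = (prob >= 0.5).astype(int)
    misclassification_rate = float(np.mean(pred != np.asarray(test_y)))
    return reduct, X, prob, misclassification_rate
```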
5 Experimentation and Comparison of Results
In these experiments, the data sets come from the UCI Knowledge Discovery in Databases Archive, an online repository of large data sets encompassing a wide variety of data types. From the repository we select five data sets related to medical diagnosis and treatment: 1) breast cancer data: 286 instances, 9 attributes (e.g. age, menopause, tumor size, node caps, degree of malignancy, left or right, etc.) and 1 class attribute indicating whether the breast cancer is recurrent or not; 2) heart disease data: 270 instances, 13 attributes (e.g. age, sex, chest pain type, blood pressure, blood sugar, heart rate, exercise-induced angina, etc.) and 1 target attribute; 3) lung cancer data: 32 instances, 56 variables and 1 class attribute with 3 types; 4) Wisconsin breast cancer data: 699 instances, 9 attributes (e.g. clump thickness, cell size, cell shape, marginal adhesion, bare nuclei, bland chromatin, normal nucleoli, mitoses, etc.) and 1 target variable indicating whether the cancer is benign or malignant; 5) SPECTF heart data: this data set describes diagnosing cardiac Single Photon Emission Computed Tomography (SPECT) images; each patient is classified into one of two categories, normal or abnormal, and the database of 267 SPECT image sets (patients) was processed to extract features that summarize the original SPECT images. After choosing the above data sets, each table is first divided into two parts: a training set and a testing set. We then train the MCLP model on each training set, implemented in the SAS development environment, and the system rapidly gives the solutions of the models. Afterwards, the models are applied to the testing sets to obtain the classification results and the corresponding evaluation indexes. Table 1 provides the detailed results: the number of attributes used by each model, the type I error rate, the type II error rate, and the total misclassification rate.
Table 1. The results and comparison of the MCLP model and the Rough Set-based MCLP model

Model    Data Set         Number of Attributes   Type I Error Rate (%)   Type II Error Rate (%)   Misclassification Rate (%)
MCLP     Breast Cancer     9                     29                      36                       35
MCLP     Heart Disease    13                     66                      17                       39
MCLP     Lung Cancer      56                     67                      15                       36
MCLP     WBreast Cancer    9                      3                      19                       11
MCLP     SPECTF Heart     44                     41                      56                       24
RS-MCLP  Breast Cancer     7                     28                      36                       32
RS-MCLP  Heart Disease     5                     36                      20                       27
RS-MCLP  Lung Cancer       3                     22                      30                       26
RS-MCLP  WBreast Cancer    5                      2                      20                       11
RS-MCLP  SPECTF Heart      3                     33                      29                       23
For the sake of comparing and validating our new model, we also train the Rough Set-based MCLP (RS-MCLP) model in a similar way. The difference is that we first discretize the related attributes in data sets that include continuous attributes, and then obtain the reduced decision tables using rough set theory and methods. We then train the MCLP model on the reduced data sets; the classification results and evaluation indexes are listed in Table 1, and for convenient comparison the number of variables is provided as well. In addition, the MCLP parameters are: the ideal overlapping degree α* = 0.01, the ideal distance from the adjusted boundary β* = 300000, and the class boundary b = 1. In a word, comparing the classification results of the different models, we find that the classification accuracy does not decrease, while the RS-MCLP approach significantly reduces the number of variables in the model and the computational complexity.
6 Conclusions and Future Work
This paper provides a new data mining model and its applications to different decision systems or tables, and experiments show that the rough set-based MCLP model for classification is superior to the single MCLP model and to the rough set method alone. That is, after rough set attribute reduction and the removal of redundant information, the speed and performance of the MCLP model are considerably improved. In the future, we plan to implement a rough set-based fuzzy MCLP model for classification and attempt to extend it to regression methods or unsupervised learning approaches.

Acknowledgements. This research has been partially supported by a grant from the National Natural Science Foundation of China (#70621001, #70531040, #70501030, #70472074, #10601064), 973 Project #2004CB720103, Ministry of Science and Technology, China, and BHP Billiton Co., Australia.
References
1. Komorowski, J., Polkowski, L., Skowron, A.: Rough Sets: A Tutorial. Springer, Heidelberg (1998)
2. Maimon, O., Rokach, L.: Data Mining and Knowledge Discovery Handbook, pp. 535–558. Springer, Heidelberg (2005)
3. Sawaragi, Y., Nakayama, H., Tanino, T.: Theory of Multiobjective Optimization, Mathematics Science and Engineering, vol. 176. Academic Press, London (1985)
4. Freed, N., Glover, F.: Simple but Powerful Goal Programming Models for Discriminant Problems. European Journal of Operational Research 7, 44–60 (1981)
5. Shi, Y., Wise, M., Luo, M., Lin, Y.: Data Mining in Credit Card Portfolio Management: A Multiple Criteria Decision Making Approach. In: Advances in Multiple Criteria Decision Making in the New Millennium, pp. 427–436. Springer, Berlin (2001)
6. Kou, G., Xiantao, L., Peng, Y., Shi, Y., Morganwise, Xu, W.: Multiple criteria linear programming approach to data mining: Models, algorithm designs and software development. Optimization Methods and Software 18(4), 453–473 (2003)
7. Swiniarski, R.W., Skowron, A.: Rough set methods in feature selection and recognition. Pattern Recognition Letters 24, 833–849 (2003)
8. Pawlak, Z.: Rough sets. International Journal of Computation and Information Sciences 11, 341–356 (1982)
Predictive Modeling of Large-Scale Sequential Curves Based on Clustering Wen Long1 and Huiwen Wang2 1
Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing 100080, China [email protected] 2 School of Economics & Management, Beijing University of Aeronautics & Astronautics, Beijing 100083, China [email protected]
Abstract. The traditional approach to predicting large-scale sequential curves is to build a separate model for every curve, which inevitably causes a heavy and complicated modeling workload. A new method is proposed in this paper to solve this problem. By reducing the types of curve models, clustering the curves and modeling by cluster, the new method simplifies the modeling work to a large extent while preserving the original information as far as possible. This paper specifies the theory and algorithm and applies them to predicting the GDP curves of multiple regions, which confirms the practicability and validity of the presented approach. Keywords: curves clustering, predictive modeling, large-scale curves, SOM.
1 Introduction
A series of sequential data can be drawn as a curve, which depicts dynamic information clearly. Nowadays the amount of data stored in databases increases very fast, and large sets of sequential curves often arise in analysis work, such as GDP curves of multiple regions, sales volume curves of products, or observed clinical data of large samples. How to analyze and predict a large set of curves is worth further study. The traditional solution to the problem is to build a model separately for every curve, that is, one curve needs one model. This method can obtain accurate predictive results; however, the modeling work inevitably becomes rather heavy and complicated when the analysis objects are large scale. Some illuminating work seems to have opened a new prospect. G. Hebrail [1, 2] performed clustering on 2,665 electric load curves in order to distinguish diverse electricity consumption patterns, with pleasant results. Other related work [3, 4] also showed that clustering is an effective approach to describing and analyzing huge data sets or sequential curves. However, this paper focuses on predictive modeling of a large set of curves, which differs from their work. It is therefore expected that the shape of the curves should display some visible regularity so that an appropriate predictive model can be chosen conveniently. Yet in fact, original data usually exhibit abundant curve configurations, which makes it difficult to choose a predictive model.
Consequently, this paper proposes a new approach to the problem of predictive modeling of large-scale curves. The basic idea of the new approach is to transform the original data in order to eliminate scale and reduce the number of model types, to cluster the preprocessed curves with a suitable clustering technique, to build predictive models by cluster, and finally to calculate the predictive values of the original data through an inverse algorithm. This paper is organized as follows. In Sect. 2, the curves clustering modeling method is introduced in three parts: reduction of model types, curves clustering, and predictive modeling by cluster. Sect. 3 briefly states the Self-Organizing Map method, which is used in the curves clustering. In Sect. 4, applying the new approach, a case study on predicting the GDP of 133 countries and regions is presented. Finally, Sect. 5 summarizes the results.
2 Predictive Modeling of Large-Scale Curves Based on Clustering Method

2.1 Reducing Types of Models
One problem with the existing approach to predictive modeling of large-scale curves is that the modeling work is too heavy for practical applications. Since the number of model types determines the amount of modeling, the number of model types must be decreased. In the problem of predictive modeling of curves, the model type depends on two factors: the scale of the original data and the shape of the original curves. Therefore, the reduction of model types must be based on unifying the scale of the original data and simplifying the shape of the original curves. In social and economic cases the data sequences are usually positive, so this paper only discusses the situation where x_{it} > 0 (i = 1, ..., n; t = 0, 1, ..., T).
Given original data sequences described by a set of vectors X_i = {x_{i0}, x_{i1}, ..., x_{iT}}, i = 1, ..., n, the development speed with the link relative method of individual i at time t is defined as
a_{it} = \frac{x_{it}}{x_{i(t-1)}}, \quad t = 1, \ldots, T.   (1)

Accumulating the development speeds with the link relative method, we obtain b_{i1} = a_{i1} and

b_{it} = b_{i(t-1)} + a_{it}, \quad t = 2, \ldots, T.   (2)

Here the accumulated development speed curve b_i = (b_{i1}, b_{i2}, ..., b_{iT}) is an increasing curve whose scale has been eliminated and whose tendency grows steadily. After calculating the development speeds with the link relative method a_i = {a_{i1}, ..., a_{iT}} and the accumulated development speeds b_i = {b_{i1}, ..., b_{iT}}, the diversity of the original curves due to scale has been obviated, and the accumulated curves b_i show a good configuration, increasing steadily. As a result, the number of curve types is largely decreased, and the accumulated curves b_i are more convenient to cluster.
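In code, the preprocessing of (1)-(2) is a two-line transformation; the following numpy sketch (with an illustrative function name and array layout) makes this explicit:

```python
import numpy as np

def accumulated_development_speeds(X):
    """Eqs. (1)-(2): X is an (n, T+1) array of positive observations x_i0, ..., x_iT.
    Returns the link-relative speeds a_it and their cumulative sums b_it."""
    a = X[:, 1:] / X[:, :-1]      # a_it = x_it / x_i(t-1),  t = 1, ..., T
    b = np.cumsum(a, axis=1)      # b_i1 = a_i1,  b_it = b_i(t-1) + a_it
    return a, b
```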
2.2 Curves Clustering
Clustering can be loosely defined as the process of organizing objects into groups whose members are similar in some way. The standard way of analyzing a large set of curves is to perform a clustering of the curves, so that experts look at a small number of classes (i.e. clusters) of similar curves instead of the whole set of curves [2]. The direct objective of curves clustering in this paper is to build models by cluster. Therefore, what is clustered is not the original curves x_{it} (t = 0, 1, ..., T) but the accumulated development speed curves b_{it} (t = 1, ..., T), which are better suited to modeling. If the curves belonging to the same cluster display similar features and an evident cluster effect, the clustering result is regarded as satisfactory. In this paper, the Self-Organizing Map is applied to perform the clustering; it is introduced in Sect. 3.

2.3 Modeling by Cluster

2.3.1 Basic Idea of Modeling by Cluster
A hypothesis must be stated before modeling by cluster: curves belonging to one cluster have the same or similar dynamic trends, that is, they grow in the same or a similar pattern. We perform curves clustering on the accumulated development speed curves b_{it} (t = 1, ..., T). If the clustering result is not satisfactory and the curves belonging to a certain cluster are dispersed, a second clustering should be performed on them to lower the loss of original information, until a satisfactory clustering has been obtained. For clusters with a good clustering result, modeling by cluster can be performed. Provided that the n curves are sorted into l classes (l ≤ n), and cluster k includes n_k curves, define

y_t^k = \bar{b}_t^k = \frac{1}{n_k} \sum_{i=1}^{n_k} b_{it}^k, \quad t = 1, \ldots, T.   (3)

Build the predictive model y^k = f^k(t) according to y^k = {y_1^k, y_2^k, ..., y_T^k}, k = 1, ..., l. Based on the hypothesis of modeling by cluster stated above, this model can explain all the curves contained in cluster k. Consequently, the amount of modeling is reduced from the number of individuals n to the number of classes l, which simplifies the modeling work to a large extent while preserving the original information as far as possible.

2.3.2 Algorithm of Returning to Original Data
If ŷ_{T+1}^j denotes the predictive value of y^j of cluster j at future time T+1, it can be calculated by the predictive model of cluster j, that is, y^j = f^j(t).
Calculate the development speed of cluster j at time T+1 from ŷ_{T+1}^j:

a_{T+1}^j = \hat{y}_{T+1}^j - y_T^j, \quad j = 1, \ldots, l.   (4)

Then the predictive value of individual i in cluster j at time T+1 can be obtained as follows:

\hat{x}_{i(T+1)}^j = x_{iT}^j \times a_{T+1}^j, \quad j = 1, \ldots, l; \; i = 1, \ldots, n_j.   (5)
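The whole of Sect. 2.3 can be condensed into a short routine: average the accumulated curves per cluster, fit one model per cluster, and map the one-step forecast back to the original scale via (4)-(5). The sketch below is illustrative; in particular the default quadratic trend in t is only a placeholder for whatever predictive model f^k(t) the analyst chooses.

```python
import numpy as np

def predict_next_by_cluster(X, B, labels):
    """X: (n, T+1) raw observations, B: (n, T) accumulated speeds from eq. (2),
    labels: cluster index of each curve. Returns x_hat at time T+1 for every curve."""
    n, T = B.shape
    t = np.arange(1, T + 1)
    x_next = np.empty(n)
    for k in np.unique(labels):
        members = np.where(labels == k)[0]
        y_k = B[members].mean(axis=0)              # eq. (3): cluster-mean curve
        coef = np.polyfit(t, y_k, deg=2)           # y^k = f^k(t), illustrative choice
        y_hat_next = np.polyval(coef, T + 1)
        a_next = y_hat_next - y_k[-1]              # eq. (4): implied development speed
        x_next[members] = X[members, -1] * a_next  # eq. (5): back to the original scale
    return x_next
```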
3 Neural Network of Self-Organizing Map
In the method introduced in Sect. 2, the Self-Organizing Map neural network is applied to perform the curves clustering, which can exhibit visualized results [5]. The Self-Organizing Map (SOM) is an artificial neural network method proposed by Prof. Teuvo Kohonen of Finland in 1981. The network simulates the self-organizing map function of the nervous system of the brain. It is a competitive learning network that performs unsupervised self-organizing learning, learning from complex, multi-dimensional data and transforming them into visually decipherable clusters. The main function of SOM networks is to map the input data from an n-dimensional space to a lower-dimensional (usually one- or two-dimensional) plot while maintaining the original topological relations [6]. The SOM network typically has two layers of nodes, the input layer and the competitive layer. The neurons in the input layer are one-dimensional, those in the competitive layer are two-dimensional, and all neurons in the two layers are connected to each other. As the training process proceeds, the nodes adjust their weight values according to the topological relations in the input data. The node with the minimum distance is the winner and adjusts its weights to be closer to the value of the input pattern [7, 8]. The adjustment of the weight vectors is given by the following formula:
W_i(t+1) = W_i(t) + h_{c(x),i} \, (X(t) - W_i(t)).   (6)

Here t denotes the iteration number of the input vector; W_i(t) is the weight vector and X(t) is the observed vector X at the t-th iteration; h_{c(x),i} is the neighborhood function and c(x) represents the winner. The neighborhood function h_{c(x),i} is defined as

h_{c(x),i} = \alpha(t) \exp\left(-\frac{\|r_i - r_c\|^2}{2\delta^2(t)}\right).   (7)

Here \alpha(t) denotes the learning rate, varying in [0, 1] and decreasing as the number of iterations increases; r_i and r_c are the position vectors corresponding to W_i and W_c, with r_i \in R^2, r_c \in R^2; and \delta(t), also decreasing with the number of iterations, corresponds to the width of the neighborhood N_c(t) of the function h_{c(x),i}.
The SOM algorithm can be briefly described as follows.
1. Initialize the weight values at random and give an initial neighborhood radius.
2. Input a new vector X_j ∈ R^p, j = 1, 2, ..., n.
3. Calculate the distances between X_j and all the output nodes.
4. Find the node c whose weight vector is closest to the current input vector X_j.
5. Train node c and all nodes in some neighborhood of c, modifying the weight vectors using formula (6).
6. Return to step 2 for the next input vector, repeating for t = 1, ..., T.
In the original development of SOM, the initial weight vectors are chosen at random, which indicates that SOM can self-organize even when the initial situation is completely disordered. In practice, however, if the initial weight vectors are selected by PCA rules, the convergence of the algorithm is accelerated.
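For concreteness, a minimal numpy implementation of the update rule (6)-(7) on a two-dimensional grid is sketched below; the grid size, decay schedules and random initialization are illustrative defaults, not the settings used for the case study in Sect. 4.

```python
import numpy as np

def train_som(data, grid=(6, 6), n_iter=2000, alpha0=0.5, delta0=3.0, seed=0):
    """Minimal SOM following eqs. (6)-(7): winner search plus Gaussian
    neighbourhood update on a rows x cols grid of nodes."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    W = rng.normal(size=(rows * cols, data.shape[1]))          # weight vectors
    r = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        c = np.argmin(np.linalg.norm(W - x, axis=1))           # winner c(x)
        frac = 1.0 - t / n_iter
        alpha, delta = alpha0 * frac, max(delta0 * frac, 0.5)  # decaying rates
        h = alpha * np.exp(-np.sum((r - r[c]) ** 2, axis=1) / (2 * delta ** 2))
        W += h[:, None] * (x - W)                              # eq. (6)
    return W, r

def som_labels(data, W):
    """Assign each curve to the node with the closest weight vector."""
    return np.argmin(np.linalg.norm(data[:, None, :] - W[None, :, :], axis=2), axis=1)
```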
4 Case: Predicting GDP Curves of 133 Countries and Regions
Based on the new approach proposed in this paper, a case study is carried out to predict the future GDP of 133 countries and regions according to historical GDP data from 1990 to 2002 [9]. We calculate the accumulated GDP development speed curves of the 133 countries (or regions) and then cluster the accumulated curves using the Kohonen method. Considering the two factors of decreasing the amount of modeling and preserving the original information, this paper classifies the 133 accumulated curves into 10 clusters (Fig. 1).
Fig. 1. Clustering result of the 133 accumulated GDP development speed curves (cluster sizes: (1) 3, (2) 11, (3) 22, (4) 10, (5) 7, (6) 16, (7) 17, (8) 17, (9) 17, (10) 13)
An example, cluster 3, which contains the most curves among the clusters, is illustrated. Cluster 3 includes 22 countries and regions, and there is a great disparity in the size of their GDP. As seen from Fig. 2, because the GDPs of the U.S.A. and Britain are counted in thousands of billions of dollars, much greater than the others, and especially because of the influence of the U.S.A., the curves of the other 20 countries and regions pile up near the abscissa axis. In fact, even among these 20 countries and regions there is a wide gap in GDP. For instance, the GDPs of India, Norway and Saudi Arabia are above $100 billion; those of Bangladesh, Hungary and Nigeria are counted in tens of billions of dollars; and Haiti, Nepal and Malta produce GDPs counted in billions. In addition, the configuration of the curves does not show evident regularity.

Fig. 2. 22 observed GDP curves of cluster 3 (in USD 100 million)

However, we find that the curves display visible regularity after calculating the accumulated development speeds of GDP (Fig. 3). The 22 curves congregate closely, not only avoiding the influence of scale but also presenting evident increasing trends. Obviously it becomes much easier to model in this situation.

Fig. 3. 22 accumulated development speed curves of cluster 3

Models are built separately for the 10 clusters of accumulated curves, and the fitting values of every cluster can be obtained. Then the predictive GDP of each country (or region) can be worked out by the means described in Sect. 2.3. In order to test the precision of the results calculated using the new method, this paper measures the error by comparing the fitting values of GDP with the observed ones. Table 1 gives the relative errors of the GDP fitting values in 2001 and 2002.
Table 1. Relative error of GDP fitting values of 133 countries and regions in 2001 and 2002

Range of relative error                                ≤10%    10~15%   15~20%   >20%
Proportion of countries (or regions) included, 2001    80.5%   13.5%    3.0%     3.0%
Proportion of countries (or regions) included, 2002    73.7%   15.0%    7.5%     3.8%
As seen from Table 1, the accuracy of the GDP fitting values in 2001 and 2002 is satisfactory, and the relative error for most countries (or regions) lies within 10%; for example, for the U.S.A. (Fig. 4) the observed data and the fitted values are very close. This further confirms that the new approach obtains pleasant results while decreasing the amount of modeling from 133 models to 10 and keeping the original information as far as possible.

Fig. 4. Comparison between observed GDP data and fitted values for the U.S.A. (in USD 100 million)
5 Conclusions
This paper introduces a new approach to predicting a large set of timing curves. The predictive precision of this method depends on two factors: the number of classes used in clustering, and the choice of model in predictive modeling. In general, the more classes the accumulated curves are divided into during clustering, the more information is preserved during modeling. If the number of clusters equals the number of individuals, the new approach reduces to building a model for every curve, which is just the traditional method. Consequently there is a trade-off between preserving the original information and diminishing the modeling workload, and the analyst must decide comprehensively how many clusters to use based on the specific data and the clustering results. The models chosen when modeling by cluster also influence the final predictive precision: the larger the error of the model for the accumulated development speeds, the more the error of the predictions derived from that model is magnified. So the model must be chosen deliberately when modeling by cluster. To sum up, the approach proposed in this paper offers an efficient and effective solution to predictive modeling of a large set of timing curves, which is also applicable to related problems.

Acknowledgments. This research was supported by the National Science Fund of China (NSFC). The authors wish to thank Prof. Georges Hébrail of ENST and Prof. Ruoen Ren of BUAA for helpful comments.
References
1. Chantelou, D., Hébrail, G., Muller, C.: Visualizing 2,665 electric power load curves on a single A4 sheet of paper. In: International Conference on Intelligent Systems Applications to Power Systems (ISAP 1996), Orlando, USA (1996)
2. Debrégeas, A., Hébrail, G.: Interactive interpretation of Kohonen maps applied to curves. In: International Conference on Knowledge Discovery and Data Mining (KDD 1998), New York (1998)
3. Guo, H., Renaut, R., Chen, K., Reiman, E.: Clustering huge data sets for parametric PET imaging. BioSystems 71, 81–92 (2003)
4. Jank, W.: Ascent EM for fast and global solutions to finite mixtures: An application to curve clustering of online auctions. Computational Statistics & Data Analysis 51, 747–761 (2006)
5. Mingoti, S.A., Lima, J.O.: Comparing SOM neural network with Fuzzy c-means, K-means and traditional hierarchical clustering algorithms. European Journal of Operational Research 174, 1742–1759 (2006)
6. Kiang, M.Y.: Extending the Kohonen Self-Organizing Map Networks for Clustering Analysis. Computational Statistics and Data Analysis 38, 161–180 (2001)
7. Kohonen, T.: Self Organized Formation of Topologically Correct Feature Maps. Biological Cybernetics 43(1), 59–69 (1982)
8. Kohonen, T., Oja, E., Simula, O., Visa, A., et al.: Engineering application of the self-organizing map. Proceedings of the IEEE 84, 1358–1384 (1996)
9. International Statistical Yearbook. China Statistics Press, Beijing (1997–2004)
Estimating Real Estate Value-at-Risk Using Wavelet Denoising and Time Series Model Kaijian He1,2 , Chi Xie1 , and Kin Keung Lai2,1 1
College of Business Administration, Hunan University, Changsha, Hunan, 410082, P.R. China [email protected], [email protected] 2 Department of Management Sciences, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong [email protected]
Abstract. As the real estate market develops rapidly and is increasingly securitized, it has become an important investment asset in portfolio design. The measurement of its market risk exposure has therefore attracted attention from academics and industry, owing to its peculiar behavior and unique characteristics such as heteroscedasticity and multi-scale heterogeneity in its risk and noise evolution. This paper proposes a wavelet denoising ARMA-GARCH approach for measuring the market risk level in the real estate sector. The multi-scale heterogeneous noise level is determined in a level-dependent manner in the wavelet analysis, and the autocorrelation and heteroscedasticity characteristics of both data and noise are modeled in the ARMA-GARCH framework. Experimental results in the Chinese real estate market suggest that the proposed methodology achieves superior performance, improving the reliability of the VaR estimates over those from the traditional ARMA-GARCH approach. Keywords: Value at Risk, Real Estate Market, Wavelet Analysis, ARMA-GARCH Model.
1 Introduction
Traditionally, direct investment in the real estate market has been seen as a low-risk/low-return means of investment and has been allocated only a small proportion of portfolio value [1]. In recent years, the increasing securitization of the real estate industry has offered an important alternative to direct investment in property. These indirect investment vehicles offer clear advantages over their direct counterpart, such as a significantly higher liquidity level, lower unit costs, lower management and transaction costs, good levels of within-property portfolio diversification, and transparent, real-time prices. Empirical research also confirms that the prices of these securitized real estate vehicles are highly correlated with the value of the underlying properties, so they serve as an alternative to traditional direct property investment [2].
However, the securitization of property investment also introduces additional risks into the investment process. Empirical research suggests that the prices of these vehicles are exposed not only to the performance of the underlying property values but also to the volatility of the stock market, so they are more volatile than the underlying property market. With the advent of these new features in property investment, a higher level of risk accompanies the higher level of returns [2]. Therefore, as the new features of securitized real estate investment attract large allocations of funds to real estate in portfolio design, appropriate risk measurement and management are demanded during the investment process. The more volatile market environment requires more precise measurement and management of the market risk level. Traditional risk measurement techniques no longer suffice, however, as property shares increasingly exhibit unique characteristics. On the one hand, their movement reflects the underlying property market behavior and is typically autocorrelated and mean reverting. On the other hand, they are also influenced by speculative forces in the stock market and incorporate a multi-scale nonlinear dynamic structure in their risk evolution [3]. Therefore, new risk measurement techniques capable of analyzing risk in the multi-scale domain are desired to deal with the particular characteristics of the market. Despite its significance, the literature on risk measurement for real estate investment is surprisingly scarce. This paper attempts to measure the market risk in the real estate market using the Value at Risk methodology. Although numerous risk measures have emerged over the years, none has received the same level of recognition as VaR at both the industrial and the academic level. As a single summarizing statistic measuring the maximum possible loss over a certain investment horizon at a given confidence level, VaR is simple and concise to understand and implement. To estimate real estate VaR with the higher accuracy and reliability demanded by investors, this paper proposes the wavelet denoising ARMA-GARCH approach to analyze the multi-scale structure of risk evolution in the real estate market and to estimate VaR in the finer time scale domain. The contribution of the present paper is twofold. Firstly, a wavelet denoising ARMA-GARCH algorithm is proposed for VaR estimation (WDNVaR). This approach is unique in that the wavelet denoising algorithm is treated here as a blind source separation tool that extracts two time series with distinct features, i.e. data and noise, from the original time series for further modeling. Moreover, the noise is taken into account when measuring the market risk level, since it contributes significantly to the overall variation. Secondly, the proposed approach is applied to the Chinese real estate market to measure the significant market risk there, an issue largely overlooked in the current literature, to the best of our knowledge. The rest of this paper is organized as follows. The second section reviews the relevant theories. The third section proposes the wavelet denoising ARMA-GARCH VaR estimation methodology. The fourth section conducts empirical studies and analyzes the results. The fifth section concludes.
2 Relevant Theories

2.1 Value at Risk and Backtesting Theories
VaR measures the maximum level of downside risk of the portfolio at a certain confidence level over the given investment time horizon. It computes the maximum loss level as in (1):

P\{r_t \le -r_{VaR}\} = 1 - cl   (1)
Where r_t denotes the return of the portfolio over the time period t and cl denotes the associated confidence level. There are mainly three approaches to calculating VaR based on historical returns: parametric, nonparametric and semi-parametric. The parametric approach derives the analytical form of the risk measure based on an assumed density function. The nonparametric approach infers patterns and the risk exposure level from the historical data using empirical, data-driven methods. As the parametric and nonparametric approaches both have their advantages and disadvantages, semi-parametric approaches, which complement one with the other, have attracted significant attention recently [4]. Meanwhile, the adequacy of a VaR model is checked using formal statistical procedures such as backtesting. Different types of backtesting procedures have been developed over the years, mainly concentrating on three aspects: the frequency of exceedances, the temporal behavior of exceedances, and the size of losses [4]. Among them, the Kupiec backtesting procedure is the most widely accepted one and is adopted in this paper to evaluate model performance. The backtesting process is treated as a series of Bernoulli trials; the frequency of exceedances in these trials should converge to the binomial distribution asymptotically. Thus the likelihood ratio test statistic takes the mathematical form in (2):

LR_{uc} = -2\log\left[(1-\rho)^{T-N}\rho^{N}\right] + 2\log\left[\left(1-\frac{N}{T}\right)^{T-N}\left(\frac{N}{T}\right)^{N}\right]   (2)

Where LR_{uc} denotes the test statistic, which has an asymptotic \chi^2(1) distribution, N is the number of VaR exceedances, \rho is the probability of a VaR exceedance occurring, and T is the number of trials carried out.
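The statistic in (2) is straightforward to compute; the following is a minimal Python sketch (the function name and edge-case handling are our own illustrative choices, not part of the original paper):

```python
import numpy as np
from scipy.stats import chi2

def kupiec_pof(exceedances, T, cl):
    """Kupiec unconditional coverage test, eq. (2).
    exceedances: number N of VaR violations over T trials;
    cl: VaR confidence level (e.g. 0.99), so the expected violation rate is 1 - cl."""
    N = exceedances
    rho = 1.0 - cl
    pi_hat = N / T                                            # observed violation rate
    log_l0 = (T - N) * np.log(1 - rho) + N * np.log(rho)      # likelihood under the null
    log_l1 = ((T - N) * np.log(1 - pi_hat) + N * np.log(pi_hat)
              if 0 < N < T else 0.0)                          # likelihood at pi_hat
    lr_uc = -2.0 * (log_l0 - log_l1)
    p_value = 1.0 - chi2.cdf(lr_uc, df=1)
    return lr_uc, p_value
```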
2.2 Wavelet Denoising Theory
Wavelet analysis is a tool designed for multi-scale analysis of signals in the time scale domain. Compared to the frequency-only signal projection of traditional Fourier analysis, wavelet analysis projects signals into the frequency-time domain, thus enabling more subtle and detailed data analysis. Mathematically, wavelet analysis is an orthogonal transform that utilizes a special family of functions called wavelets to approximate and extract the features of interest from the original data [5]. Wavelets are basis functions that satisfy the admissibility and unit energy conditions as in (3):
C_{\varphi} = \int_0^{\infty} \frac{|\varphi(f)|}{f}\, df < \infty, \qquad \int_{-\infty}^{\infty} |\psi(t)|^2\, dt = 1   (3)
Where \varphi(f) is the Fourier transform of the wavelet \psi(t) in the frequency domain. The original data series can be orthogonally transformed using wavelet basis functions. The decomposition process projects the original data into sub data series in the wavelet domain, as in (4):

W(u,s) = \int_{-\infty}^{\infty} x(t) \frac{1}{\sqrt{s}} \psi\left(\frac{t-u}{s}\right) dt   (4)

Where u is the translation parameter of the wavelet function and s is the scale (dilation) parameter. The wavelet synthesis process reconstructs the data series from the wavelet coefficients in the time scale domain, as in (5):

x(t) = \frac{1}{C_{\psi}} \int_0^{\infty} \int_{-\infty}^{\infty} W(u,s)\, \psi_{u,s}(t)\, du\, \frac{ds}{s^2}   (5)
As data are often contaminated by irrelevant noise, research on recovering the data of interest from noisy observations has attracted significant attention. Traditional approaches assume a linear structure of the data during the denoising process and achieve only limited success. Recently, nonlinear denoising algorithms that recognize the nonlinear nature of data and noise have been explored to further improve the quality of the recovered data. Among the numerous denoising algorithms proposed, wavelet analysis is a promising one [6]. It starts by decomposing the original data series in the time scale domain. A threshold level is selected for each scale following certain threshold selection rules. Wavelet coefficients smaller than the calculated threshold level are then removed, while the remaining coefficients are treated according to a chosen thresholding strategy, and finally the denoised data series are reconstructed from the denoised sub data series at different scales. In the end, the original data are decomposed into a data part and an irrelevant noise part. During the denoising process, both the thresholding rules and the threshold selection rules are crucial to the appropriate separation of data from noise. There are mainly two thresholding rules in the literature. The hard thresholding rule suppresses only those insignificant coefficients that are less than the chosen thresholds and retains the rest, and thus retains spikes and abrupt changes. The soft thresholding rule subtracts the threshold value from all coefficients while suppressing the insignificant ones smaller than the chosen thresholds, and thus leaves smoother and more continuous data series for model fitting. Significant efforts have been devoted to developing algorithms for determining the optimal thresholds, including threshold selection rules such as universal thresholding, minimax thresholding and Stein's Unbiased Risk Estimate (SURE) [6]. The universal thresholding rule sets the upper ceiling boundary for independently and identically distributed noise.
The minimax thresholding rule adopts a more conservative approach by choosing the optimal threshold that minimizes the overall mean square error, so the abrupt changes in the original data are retained [7]. SURE chooses the optimal threshold by minimizing Stein's unbiased estimate of risk and is data adaptive; however, it can be narrowly banded around the minimum value range.
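To make the procedure concrete, the following is a minimal sketch of level-dependent wavelet denoising with soft thresholding and the universal threshold, using the PyWavelets library; the wavelet choice, decomposition level and helper name are illustrative assumptions rather than the authors' exact configuration.

```python
import numpy as np
import pywt

def wavelet_denoise(x, wavelet="db4", level=4, mode="soft"):
    """Level-dependent soft thresholding with the universal threshold
    sigma * sqrt(2 log n) applied to each detail level."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    denoised = [coeffs[0]]                      # keep the approximation untouched
    for d in coeffs[1:]:
        sigma = np.median(np.abs(d)) / 0.6745   # robust noise estimate per level
        thr = sigma * np.sqrt(2 * np.log(len(x)))
        denoised.append(pywt.threshold(d, thr, mode=mode))
    return pywt.waverec(denoised, wavelet)[: len(x)]
```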
3 Wavelet Denoising ARMA-GARCH Approach for Value at Risk Estimates
The traditional ARMA-GARCH approach fits conditional models directly to the original data to obtain parameter estimates for both the mean and the volatility. It achieves only a moderate level of accuracy because it ignores the underlying risk structure. Besides, different parts of the risk may follow different evolution processes, requiring the same model with different parameters or different model specifications altogether. The traditional ARMA-GARCH approach is thus only an approximation and offers limited insight into the underlying risk structure and evolution. To tackle these issues, the wavelet denoising ARMA-GARCH approach is proposed to model the risk evolution at finer scales. Risks are separated into data and noise parts, which are modeled separately with different model specifications and parameters. The rationale behind the proposed wavelet denoising ARMA-GARCH approach is as follows: since the fundamental and speculative parts of real estate risk are driven by completely different factors, they evolve independently and require separate treatment. By separating these two parts with the wavelet denoising algorithm and fitting models to them independently, the fitting accuracy can be improved and more insight can be gained. The WDNVaR algorithm involves the following steps:
(1) Wavelet analysis is used to separate the data from the irrelevant noise:

r_t = r_{data,t} + r_{noise,t}   (6)
(2) The means of both the data and the noise are modeled by ARMA processes with different parameters:
$\hat{\mu}_t = a_0 + \sum_{i=1}^{m} a_i r_{t-i} + \sum_{j=1}^{n} b_j \varepsilon_{t-j}$  (7)
where $\hat{\mu}_t$ is the estimate of the conditional mean at time $t$, $r_{t-i}$ ($i = 1,\dots,m$) are the lag-$m$ returns with parameters $a_i$, and $\varepsilon_{t-j}$ ($j = 1,\dots,n$) are the lag-$n$ residuals of the previous periods with parameters $b_j$.
(3) The conditional mean of the original data series is aggregated from the estimates for the data and noise parts:
$\hat{\mu}_{aggregated} = \hat{\mu}_{data} + \hat{\mu}_{noise}$  (8)
(4) The volatilities of both the data and the noise are modeled by GARCH processes with different parameters:
$\hat{\sigma}_t^2 = \omega + \sum_{i=1}^{p} a_i \sigma_{t-i}^2 + \sum_{j=1}^{q} \beta_j \varepsilon_{t-j}^2$  (9)
where $\hat{\sigma}_t^2$ is the estimate of the conditional variance at time $t$, $\sigma_{t-i}^2$ ($i = 1,\dots,p$) are the lag-$p$ variances with parameters $a_i$, and $\varepsilon_{t-j}^2$ ($j = 1,\dots,q$) are the lag-$q$ squared errors of the previous periods with parameters $\beta_j$.
(5) The conditional volatility of the original data series is aggregated from the estimates for the data and noise parts:
$\hat{\sigma}^2_{aggregated} = \hat{\sigma}^2_{data} + \hat{\sigma}^2_{noise}$  (10)
The model orders during the fitting process are determined under the AIC and BIC minimization principle.
(6) Although the empirical sample distribution can take an arbitrary form, it converges to the normal distribution for sufficiently large sample sizes and time horizons. Thus, when the sample is sufficiently large and covers a sufficiently long period, VaR can be estimated parametrically as in (11):
$WDNVaR = -F(a)\,\hat{\sigma}_{t+1|t} - \hat{\mu}$  (11)
where $F(a)$ is the quantile of the normal distribution corresponding to the chosen confidence level (i.e., 95%, 97.5%, or 99%), $\hat{\sigma}_{t+1|t}$ is the estimate of the conditional standard deviation at time $t+1$, and $\hat{\mu}$ is the estimate of the sample mean.
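A compact sketch of steps (2)-(6) is given below. It assumes the arch and SciPy packages and that the data and noise series have already been separated in step (1), for example with the wavelet-denoising sketch in Section 2. Because arch_model exposes only an AR mean, the ARMA mean of Eq. (7) is simplified here to an AR(1) mean, and the GARCH orders are placeholders rather than the orders selected in the paper.

```python
# A simplified WDNVaR sketch: fit mean/volatility models to the data and noise
# parts separately, aggregate the one-step forecasts, and compute parametric VaR.
import numpy as np
from scipy.stats import norm
from arch import arch_model

def wdn_var(data, noise, alpha=0.99):
    mu, var = 0.0, 0.0
    for part in (np.asarray(data), np.asarray(noise)):
        # Steps (2) and (4): separate AR(1)-GARCH(1,1) fits (series scaled by 100 for stability).
        res = arch_model(part * 100, mean="AR", lags=1, vol="GARCH", p=1, q=1).fit(disp="off")
        fc = res.forecast(horizon=1)
        mu += fc.mean.values[-1, 0] / 100            # step (3): aggregate conditional means
        var += fc.variance.values[-1, 0] / 100 ** 2  # step (5): aggregate conditional variances
    # Step (6): parametric VaR from the normal quantile, cf. Eq. (11).
    return -(norm.ppf(1.0 - alpha) * np.sqrt(var) + mu)
```

In the paper the model orders are instead selected per series under the AIC/BIC minimization principle (cf. Table 3).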
4 Empirical Studies
The data analyzed in this paper are the closing prices of the Shanghai SE real estate index, obtained from Global Financial Data, one of the major data vendors. The data set contains 3568 observations covering the period from 30 April 1993 to 14 December 2007. This range covers several periods of significant changes and events, such as the residential reform in 1998, the volatile market environment of the transitional period, and the turbulence of late 2007, so the proposed model is put to a strict test under different market scenarios. 60% of the data set is used as the training set, while the remaining 40% is reserved as the test set for model backtesting.
The Augmented Dickey-Fuller (ADF) test of stationarity shows that the original price series is not stationary, and the autocorrelation and partial autocorrelation functions suggest the existence of significant trend factors. The price series is therefore log-differenced at the first order to eliminate the trend factors, i.e., $r_t = \ln(p_t / p_{t-1})$.
Some stylized facts can be observed in Table 1. There are considerable fluctuations in the Chinese real estate market, as suggested by the significant standard deviation. The market is also prone to a high probability of extreme events, as indicated by the significant excess kurtosis. This is further confirmed by the rejection of the Jarque-Bera test of normality, which shows that the fat-tail phenomenon is statistically significant. Meanwhile, the rejection of the BDS test of independence suggests that there is nonlinear autocorrelation in the market returns. The risks in the Chinese real estate market are thus statistically significant, and accurate, reliable risk measures are needed to protect investors' interests in this risky market environment.

Table 1. Descriptive Statistics and Statistical Tests

  Mean     Standard Deviation   Skewness   Kurtosis   JB Test   BDS Test
  0.0001   0.0277               1.3826     17.3717    0.001     0

First, VaR is estimated using the traditional ARMA-GARCH approach. Lag orders up to 5 are tried for both the ARMA and GARCH processes on the training data, and the model order is determined as ARMA(1,1)-GARCH(4,2) under the AIC and BIC minimization principle. The rolling-window method is used to include new observations as they become available at each trial; the window length is set to 2139 to cover the relevant information set.

Table 2. ARMA-GARCH VaR for Shanghai SE Real Estate Index

  Confidence Level   Exceedances   Kupiec Test   P Value
  99.0%              17            0.4933        0.4825
  97.5%              38            0.1489        0.6996
  95.0%              64            0.8352        0.3608
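The Kupiec statistics and p-values reported in Tables 2 and 4 come from the standard proportion-of-failures likelihood-ratio backtest; a possible implementation is sketched below, assuming SciPy, with n the number of backtesting days, x the observed number of exceedances, and p equal to one minus the confidence level. The exact backtest sample size used in the paper is not restated here.

```python
# Kupiec proportion-of-failures (POF) likelihood-ratio test for VaR backtesting.
import numpy as np
from scipy.stats import chi2

def kupiec_pof(n, x, p):
    phat = x / n
    lr = -2.0 * (x * np.log(p) + (n - x) * np.log(1.0 - p)
                 - x * np.log(phat) - (n - x) * np.log(1.0 - phat))
    return lr, chi2.sf(lr, df=1)   # LR statistic and its chi-square(1) p-value
```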
As indicated by the lower p-value, the VaR estimated at the 97.5% confidence level is too aggressive, while the estimate at the 95% confidence level is too conservative, leading to too few exceedances. Moreover, this approach leaves little room for further performance improvement. The wavelet denoising ARMA-GARCH model is therefore proposed to investigate the underlying risk structure and to further improve the accuracy and reliability of the estimated VaR.
Four parameters need to be set when estimating VaR with the wavelet denoising approach: the decomposition level, the wavelet family, the thresholding rule, and the threshold selection rule. The decomposition level is set to 1. The wavelet families tried are Haar, Daubechies 2, and Coiflets 2; the thresholding rules tried are the hard and soft rules; and the threshold selection algorithms attempted are SURE, minimax, and the universal rule.
First, the wavelet denoising algorithm is used to denoise the original data series and analyze the underlying risk structure. The standard deviations of both the denoised data and the noise are calculated to investigate their respective contributions to the total risk level, and the ARMA-GARCH model is fitted to the denoised data and the noise separately to determine the model order. In Table 3, the standard deviation is shown as the number outside the parentheses.
Table 3. ARMA-GARCH Model Order with different Thresholding and Threshold Selection Rules

Wavelets        Hard        Soft        Hard        Soft        Hard        Soft
                Heursure    Heursure    Minimaxi    Minimaxi    Universal   Universal
Data (Haar)     (2,3,1,1)   (5,4,1,4)   (5,4,1,5)   (3,4,4,4)   (5,5,4,4)   (4,5,4,4)
                0.0273      0.0240      0.0255      0.0216      0.0238      0.0205
Noise (Haar)    (1,2,5,4)   (1,1,4,4)   (1,2,5,4)   (1,1,4,4)   (1,1,4,4)   (1,1,4,4)
                0.0046      0.0082      0.0108      0.0140      0.0141      0.0164
Data (Db2)      (1,3,5,2)   (1,1,1,1)   (5,5,2,2)   (2,4,1,4)   (2,5,4,3)   (5,4,4,4)
                0.0271      0.0237      0.0255      0.0216      0.0239      0.0206
Noise (Db2)     (1,4,4,5)   (4,5,3,2)   (2,2,4,5)   (4,5,2,2)   (4,5,4,4)   (2,4,3,4)
                0.0055      0.0092      0.0107      0.0140      0.0140      0.0163
Data (Coif2)    (3,4,5,2)   (4,5,1,2)   (2,5,1,4)   (4,4,1,2)   (3,5,3,1)   (4,1,4,2)
                0.0273      0.0237      0.0253      0.0211      0.0235      0.0201
Noise (Coif2)   (5,3,1,1)   (1,5,4,2)   (4,5,1,1)   (1,5,4,2)   (3,5,1,5)   (1,5,2,1)
                0.0049      0.0087      0.0113      0.0146      0.0146      0.0172
The number pair (r,m,p,q) shown within the parentheses corresponds to the optimized model order of the ARMA(r,m)-GARCH(p,q) specification.
The experiment results in Table 3 confirm that the standard deviation of the noise series is significant and cannot be neglected during the modeling process. It can be as large as half the standard deviation of the data series in some circumstances, which means that the noise contributes up to 20% of the total variation in the data. Previous approaches that ignore the significant impact of the noise therefore lead to significantly downward-biased risk estimates; in contrast, the risk estimates for the noise constitute an important part of the total risk estimate.
Since the typical physical real estate market is usually more stable than the securitized market, the fundamental part of the risk is expected to be smaller than the speculative part [?]. It can thus be inferred from the experiment results that the noise corresponds to the fundamental part of the market risk, since it contributes less to the total variation in the long term, while the denoised part corresponds to the speculative part of the market risk, since it contributes most of the total variation. The securitized real estate market is much more volatile than the traditional one; based on this analysis, it is at least 50% more volatile. These findings offer potentially useful information to investors when estimating the market price and risk level.
The experiment results in Table 3 also show that the model order varies when WDNVaR is applied with different parameters. Thus, to determine the best set of parameters for improving the model's forecasting accuracy, the AIC and BIC information criteria are introduced into the modeling process.
The experiment results in Table 3 also show clearly that the data and noise follow their own evolution paths and that both are non-negligible during the modeling process. This finding supports the modeling methodology proposed in this paper, which treats data and noise separately with an emphasis on their individual characteristics. It also implies that current denoising algorithms are not reliable on their own: they ignore the potentially significant impact of the noise part of the data, determine the separation of data and noise in a largely arbitrary manner, and lack theoretical guidance on the suitability of the chosen parameters.
Based on the previous findings, WDNVaR is used to estimate VaR in the Chinese real estate market. Through trial and error, the decomposition level is set to 1, as it shows the best performance among the decomposition levels tested (up to 6). The wavelet family chosen is Haar. The universal threshold selection rule is used during the denoising process, and the noise is processed with the hard thresholding method. The model order for the denoised data is determined as ARMA(5,5)-GARCH(4,4), and the model order for the noise as ARMA(1,1)-GARCH(4,4).

Table 4. WDNVaR(Haar,1) for Shanghai SE Real Estate Index

  Confidence Level   Exceedances   Kupiec Test   P Value
  99.0%              17            0.4933        0.4825
  97.5%              36            0.0026        0.9595
  95.0%              67            0.2912        0.5895
As shown in Table 4, the reliability of the VaR estimated with the wavelet denoising approach improves considerably: during backtesting, the p-value improves at both the 95% and 97.5% confidence levels and stays the same at the 99% level. The performance improvement is attributed to the separation of data from noise during model fitting, so that the models fitted to the data and noise series can better capture the essential factors governing the risk evolution and achieve a higher goodness of fit.
5 Conclusions
This paper proposes the wavelet denoising ARMA-GARCH approach for estimating VaR. Compared to traditional approaches, which recover signals from noisy data, the wavelet denoising ARMA-GARCH model recognizes the contributions of both data and noise to the overall market risk level. The denoised data are linked to the speculative part of the real estate risk, while the noise part is linked to the fundamental part of the market risk. The data and noise are separated using wavelet analysis for further model fitting and, recognizing their distinct features, are then modeled using the ARMA-GARCH specification
with different model specifications and parameters. Since empirical studies suggest that the Chinese real estate market represents a considerably risky environment, the proposed wavelet denoising ARMA-GARCH approach is applied to the Chinese real estate market for market risk measurement. The experiment results confirm the superior performance of the proposed model over the traditional ARMA-GARCH approach: the reliability of the estimated VaR improves considerably upon that of the traditional ARMA-GARCH approach.
Acknowledgements The work described in this paper was supported by a grant from the National Social Science Foundation of China (SSFC No.07AJL005) and a Research Grant of City University of Hong Kong (No. 9610058).
References
1. Glascock, J.: Market conditions, risk, and real estate portfolio returns: Some empirical evidence. The Journal of Real Estate Finance and Economics 4(4), 367-373 (1991)
2. Ball, M., Lizieri, C., MacGregor, B.D.: The Economics of Commercial Property Markets, pp. 367-384. Routledge, London (1998)
3. Scott, L.O.: Do prices reflect market fundamentals in real estate markets? The Journal of Real Estate Finance and Economics 3(1), 5-23 (1990)
4. Dowd, K.: Measuring Market Risk. Wiley Finance Series, pp. 341-354. Wiley, Chichester (2002)
5. Percival, D.B., Walden, A.T.: Wavelet Methods for Time Series Analysis. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge (2000)
6. Donoho, D.L., Johnstone, I.M.: Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association 90(432), 1200-1224 (1995)
7. Donoho, D.L., Johnstone, I.M.: Minimax estimation via wavelet shrinkage. Annals of Statistics 26(3), 879-921 (1998)
The Impact of Taxes on Intra-week Stock Return Seasonality Virgilijus Sakalauskas and Dalia Kriksciuniene Department of Informatics, Vilnius University, Muitines 8, 44280 Kaunas, Lithuania {virgilijus.sakalauskas, dalia.kriksciuniene}@vukhf.lt
Abstract. In this paper we explore the impact of trading taxes (commissions) on the day-of-the-week effect in the Lithuanian stock market. We apply a computational model that executes trading activities only on a particular day of the week. The suggested share-trading algorithm not only reveals the presence of the day-of-the-week anomaly, but also allows comparing it with the influence of trading taxes by estimating the final return of the selected shares. As the taxes on each transaction depend on the invested sum, the suggested algorithm also has to optimize the number of operations to ensure the biggest gain. The research revealed significant intra-week stock return seasonality for the majority of shares (17 out of 24). The advantages of the suggested method include its ability to better specify the shares suitable for intra-week seasonality-based transactions, even though including the trading commissions reduces the visibility of the effect. Keywords: stock return, day-of-the-week effect, seasonality, operation taxes, trading strategy.
1 Introduction Together with the internationalization of economic and financial activities and the increasing on-line possibilities for trading in various financial markets of the world, investors acquire wider powers to manage their investment portfolios. The variety of trading conditions urges them to look for markets with a bigger return on investment and lower risk. Most developed markets are highly integrated and similar in profitability, so the attention of investors is increasingly aimed at the emerging stock markets. The ability to indicate and exploit various trading anomalies can be helpful for gaining higher returns. One such anomaly is the day-of-the-week effect, which presumes that significantly different returns regularly occur on some particular trading day of the week. Mondays and Fridays are indicated as the trading days most affected by this anomaly. Indicating the days of the week whose return is significantly different from the other days could allow us to design an algorithm for a profitable strategy. Therefore, documenting and testing intra-week seasonality anomalies across various markets is an interesting and important task both for researchers and traders. The research literature on the day-of-the-week stock return anomaly focuses
mainly on the United States and other advanced economies [1-6], with inadequately little attention to the emerging markets, including those of Eastern Europe. Recently, the emerging stock markets have become more interesting for researchers, as the day-of-the-week effect in the developed stock markets has been reported as fading since the 1990s [7-11]. Basher and Sadorsky [11] studied the day-of-the-week effect in 21 emerging stock markets using both unconditional and conditional risk analysis and applying different analysis models. They concluded that "while the day-of-the-week effect is not present in the majority of emerging stock markets studied, some emerging stock markets do exhibit strong day-of-the-week effect". An empirical investigation of the day-of-the-week stock return anomaly was made by Ajayi et al. using the major stock market indices of eleven Eastern European emerging markets [12]. The research outcomes indicate negative Monday returns in six of the Eastern European markets and positive Monday returns in the remaining five; two of the six negative Monday returns and one of the five positive Monday returns were statistically significant. The day-of-the-week effect in the emerging Lithuanian stock market was investigated by the authors in [13], [14] using traditional statistical analysis methods. The research led to the conclusion that the day-of-the-week effect had no significant influence on the Vilnius Stock OMX Index return, and only a few shares (not more than 4 of 24, as confirmed by applying different analysis methods) were affected by this anomaly. Further investigation results on the day-of-the-week effect were obtained by using artificial neural networks [15]. Two standard types of neural networks were applied: MLP (Multilayer Perceptrons) and RBF (Radial Basis Function Networks). The research outcomes revealed better sensitivity of the neural networks: the Monday effect was present for 11 shares, and the Friday effect for 9 out of the total list of 24 analysed stocks.
In this article we analyse the impact of taxes on intra-week stock return seasonality. We suggest a computational model for calculating the accumulated return by making trading activities only on particular days of the week. The proposed trading strategy allows revealing the presence of the day-of-the-week anomaly and estimating the influence of trading taxes on the final return of the selected shares. The experimental research of the computational method employs the financial trading data of 24 shares of the Vilnius Stock Exchange (from 2003-01-01 to 2008-01-11). In the following sections we present the computational model designed for implementing the stock trading algorithm, define the organization of the research data set, and present the results of the experimental research. The research conclusions address the impact of the day-of-the-week effect, the influence of the sales commissions, and the influence of the number of trading operations on the accumulated return of investment.
2 Computational Model for Stock Trading Strategy

In this part we present the computational model for evaluating the possible accumulated return of stock trading that exploits the day-of-the-week effect. The model is designed to take into account the influence of commission taxes, trading volume, and the number of transactions executed when applying the designed trading algorithm. Such a trading strategy makes it possible to define the days of the week that can be confirmed as significantly different from the other days of the week according to the estimated accumulated return (either higher or lower). To implement this strategy, the trading operations are executed only on a particular day of the week, and the accumulated return is estimated with and without trading commissions.

Fig. 1. Trading strategy algorithm

Most research works neglect the impact of the taxes charged for trading operations, or do not check the influence of this condition on intra-week
seasonality. As each operation bears taxes that are quite large compared to the stock price fluctuations, the trading frequency is also a very important parameter, which can lower the return considerably. In order to imitate a real trading situation, a 0.3% commission rate is applied to the experimental data described in Section 3. This rate is quite high and requires a sufficiently conservative trading strategy so that the total return is not dissolved by the trading taxes.
The stock trading strategy implemented in the suggested algorithm aims to reveal the presence of the day-of-the-week effect and to check whether it is financially worthwhile to take into account the day of the week of the transaction and the frequency of trading operations. At the same time, our task is not to create an optimal investment strategy with the maximal return on investment, but to indicate exceptional trading outcomes obtained by making transactions on particular days of the week.
According to the algorithm, we calculate the accumulated return of the selected stocks for each day of the week. We make a buying transaction only when the market is rising and a selling transaction on a day of falling prices; in each case the trading transaction is made for the whole available sum of money. The accumulated return is calculated in two ways: including and not including taxes. Therefore, a positive return on any trading day cannot be obtained simply by trade operations made at the opening and closing moments of that day, as the trading taxes could exceed the generated profit. For the experimental research of the trading algorithm we selected an initial investment sum of 100 LTL; in this case the final earned return denotes the percentage rate of return. The trading algorithm is presented in Fig. 1. Using this strategy, we can also investigate the influence of the number of trading operations on the final accumulated return, as this determines the total impact of the taxes as well. The day-of-the-week-based strategy gives us the possibility to identify exclusive values of return.
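A minimal sketch of this simulation is given below, assuming a pandas Series close of daily closing prices indexed by date. The exact buy/sell conditions of Fig. 1 (for example, how a rising or falling market is detected) are not fully restated in the text, so the conditions and parameter names here are illustrative assumptions.

```python
# Day-of-the-week trading simulation: trade only on `weekday` (0 = Monday),
# investing the whole available sum and charging a commission on each operation.
import pandas as pd  # the input is assumed to be a pandas Series with a DatetimeIndex

def weekday_return(close, weekday, start=100.0, fee=0.003):
    cash, shares = start, 0.0
    prev = close.iloc[0]
    for date, price in close.items():
        if date.weekday() == weekday:
            if price > prev and cash > 0:        # rising market: buy with all available cash
                shares = cash * (1.0 - fee) / price
                cash = 0.0
            elif price < prev and shares > 0:    # falling market: sell the whole position
                cash = shares * price * (1.0 - fee)
                shares = 0.0
        prev = price
    return cash + shares * close.iloc[-1]        # final accumulated value

# Example: accumulated value for Friday-only trading, with and without commissions.
# friday_with_tax = weekday_return(close, weekday=4, fee=0.003)
# friday_no_tax   = weekday_return(close, weekday=4, fee=0.0)
```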
3 Experimental Data and Research Results

The performance of the algorithm was analysed by processing the financial data of the Vilnius Stock Exchange. The Vilnius Stock Exchange belongs to the category of small emerging securities markets characterized by comparatively low volume, low liquidity, and high volatility. It can be described by the following characteristics: a market value of 7 billion EUR, about 2 million EUR of share trading value per business day, approximately 600 trading transactions per business day, and 45 shares in the equity list [16]. The experimental data set was created from the financial data of the 24 most actively traded shares listed on the Vilnius Stock Exchange during the time interval from 2003-01-01 to 2008-01-11, on a daily basis. The trading data were assigned to variables named according to their acronyms in the Vilnius Stock Exchange [16], which are further used for presenting the research results in this article. The selected set of stock data represents the whole variety of the list in terms of capitalization, number of shares, daily turnover, profitability, and risk. The dynamics of the stock prices during the period of analysis is well reflected by the profile of the OMX Vilnius Stock Index (Fig. 2), which is a capitalization-weighted chain-linked total-return index.
Fig. 2. OMX Index values of the Vilnius Stock Exchange
In Fig. 2, the period from 2003.01 to 2005.10 can be characterized by an average increase of stock prices of about 5 times, with moderate fluctuations. From 2005.10 until 2006.08 there was a period of quite harsh decline, followed by a significant rise in the price level during the whole following year; at the same time the price volatility increased as well. The stock market price crisis of the end of 2007 also influenced the trading situation of the Vilnius Stock Exchange in Lithuania, and the recent period of more stable stock prices still shows quite large price fluctuations. These features of the historical stock trading data substantiate that the designed database covers a sufficient amount of financial data and a wide variety of real trading situations in the financial market, and can ensure the validity of the experimental research.
The data cleansing procedures for the stock information time series included the removal of non-trading records during holidays and weekends and of the records of trading days with zero deals. After processing the data set, the average number of daily trading records for each share was approximately 1100, ensuring the necessary amount of experimental data for obtaining significant findings. For mining the data and for the calculations we used STATISTICA and MS EXCEL software.
Using the share trading strategy (Fig. 1) outlined in Section 2, we calculated the final return of the selected shares for all days of the week. Table 1 presents the final accumulated return of share trading according to the designed computational model and algorithm, including the trading commissions of 0.3% of the invested sum.
In Table 1, the final return values of the stocks that are significantly different (p<0.05) from the values of the other days of the week are marked in bold font. These values therefore indicate the days of the week affected by the day-of-the-week anomaly. Table 1. Final trading return after investing 100 Lt, by days of the week (including taxes)
The average profitability of all days of the week is close to the total average value of 167.5 (as presented in the last row of Table 1). This result supports the conclusion of [14] that the mean return value of the Vilnius Stock OMX Index is not impacted by the day-of-the-week effect. Only 7 stocks of the 24 analysed, namely MNF, KNF, SAN, ZMP, LDJ, LNS, and LEL, have no significantly different days in terms of profitability. Therefore, the majority of the stocks of the Vilnius Stock Exchange are impacted by the day-of-the-week
effect. The most exceptional trading day is Friday, as the day-of-the-week anomaly is confirmed for 6 shares, while the Monday effect is noticed for five shares. The other days were only occasionally marked as significantly different; thus the Monday and Friday effects had an important trading impact in the observed market, similarly to the developed markets. On the other hand, our experiment did not confirm the traditional understanding that the lowest return occurs on Mondays and the highest on Fridays. The results showed that in most cases the presence of the day-of-the-week effect meant the highest values of return, and only in 5 cases did the significant results denote the lowest values of accumulated return.
To analyse the influence of the commissions on the final return and on the significance of the day-of-the-week anomaly, we compared the results including taxes (Table 1) and without them (Table 2); in both cases the same trading strategy (Fig. 1) was applied. Comparing the results of Table 2 and Table 1, there is an obvious difference in the accumulated values of the final return. From Tables 1 and 2 we can notice a good match in most cases of the exceptional values, which indicate the presence of intra-week stock return seasonality. In Table 2 several additional stocks are marked as affected by the day-of-the-week anomaly (ZMP, LDJ, KNF), and the effect itself is more visible, as the exceptional values differ more clearly from the others. Thus we can conclude that by not taking the trading commissions into account we can falsely identify shares as affected by the day-of-the-week anomaly.
The importance of the number of trading operations for identifying significant return values in Tables 1 and 2 is suggested by the research outcome that omitting the taxes (0.3%) increased the expected accumulated return by 38.5%. The biggest difference was estimated for the LJL shares, as much as 118.6%; this result was driven by the maximal value of 178 sell/buy operations performed by the trading algorithm. The average number of trading operations per share was 115, which means that the investment portfolio is changed approximately every 10 days. Applying the algorithm, we make on average about 100 buy/sell operations with one share during the five trading years. This number of operations is comparatively low and allows expecting a profitable outcome of trading stocks. The standard deviation of the number of operations across the particular days of the week is quite low, 3.4, which means that the number of trading operations does not affect the occurrence of the day-of-the-week effect. It also points to the stability of the suggested computational model, as the day-of-the-week effect is the outcome of differences in return for particular days of the week, and not of the parameters of the trading algorithm.
One of the research aspects was to analyse whether shares grouped according to different trading volumes differed in their profitability, and whether the day-of-the-week effect was related to the trading volume. In Table 1 the list of shares is sorted according to their average daily trading volumes in ascending order. The difference in the trading volumes of the stocks is quite evident, as the lowest daily turnover (stocks LEL and KBL) was about 15 thousand LTL (1 EUR = 3.45 LTL), while the highest turnover (the TEO stock) reached 700 thousand LTL.
From Tables 1 and 2, no dependence between the day-of-the-week effect and the trading volume could be confirmed. This conclusion conforms to the research results obtained by using traditional statistical methods in [15]. Table 2. Final trading return after investing 100 Lt, by days of the week (without taxes)
In the articles [13], [14] the day-of-the-week anomaly was explored by applying traditional statistical research methods to the same share trading data of the Vilnius Stock Exchange. Applying the t-test, one-way ANOVA, and the Levene and Brown-Forsythe tests of homogeneity of variances, a statistically significant difference between Monday/Friday and the other days of the week was observed only for some shares with medium trading volume.
4 Conclusions

In most scientific research works the day-of-the-week effect is revealed by applying traditional statistical research methods or the classification power of neural networks or discriminant analysis. In this work we explore the impact of this effect in the Vilnius Stock Exchange by using a computational method that realizes a trading strategy and highlights the dependency of the accumulated return on the weekday selected for making the trading transactions (buy/sell). The suggested trading strategy allowed revealing the presence of the day-of-the-week anomaly and evaluating the influence of trading commissions on the accumulated return of the selected shares. The experimental research was performed using the financial trading data set of 24 shares of the Vilnius Stock Exchange (from 2003-01-01 to 2008-01-11). The suggested algorithm was applied to calculate the return accumulated by making investment operations only on a fixed trading day of the week in the Vilnius Stock Exchange. The effect was explored with and without commissions (equal to 0.3% of every buy/sell operation); therefore, the designed algorithm also had to minimize the number of operations.
In the calculation without transaction taxes, the application of the suggested method revealed a significant impact of this effect for 20 out of 24 shares of the Vilnius Stock Exchange. The inclusion of the buy/sell commissions weakened the visibility of the day-of-the-week effect, and its significant presence could be confirmed for only 17 shares. Thus the research confirmed the assumption that the trading taxes must be included when exploring intra-week stock return seasonality; otherwise, the significant presence of the effect could be indicated for shares for which the effect has no impact on the final accumulated return.
The other conclusion of the research relates to the high importance of the number of buy/sell transactions for a reliable identification of the presence of the day-of-the-week effect. When the trading algorithm is applied with a very intensive trading strategy, most of the return is lost due to the commissions applied to the increasing sales volume. Our research showed that in the designed trading strategy the buy/sell activities were quite evenly distributed over all weekdays and could not by themselves imply a significant occurrence of exclusive stock returns.
The research of the financial series of the Vilnius Stock Exchange did not confirm the common understanding that the return is lowest on Mondays and biggest on Fridays. Although the strongest effect was indeed indicated for Fridays and Mondays, on Mondays the return value was not significantly lower; in contrast, the biggest value of return was recognized for 5 shares out of 6 (without taxes). The research findings also showed the absence of a significant impact of the trading volume on the occurrence of the day-of-the-week effect.
Compared to the research [13], [14] elaborated with traditional statistical methods, we conclude that the direct method of exploring the presence of the day-of-the-week effect, as applied in this work, is considerably more efficient. The forthcoming directions of the research will deal with exploring the stability of the obtained results by analysing the financial data set updated with new stock trading data, and by implementing modifications to the stock trading algorithm.
References 1. Connolly, R.A.: An examination of the robustness of the weekend effect. Journal of Financial and Quantitative Analysis 24, 133–169 (1989) 2. Cross, F.: The behaviour of stock prices on Fridays and Mondays. Financ. Anal. J. 29, 67–69 (1973) 3. Ho, Y.K., Cheung, Y.L.: Seasonal pattern in volatility in Asian stock markets. Appl. Finance Econom. 4, 61–67 (1994) 4. Jaffe, J.F., Westerfield, R., Ma, C.: A twist on the Monday effect in stock prices: evidence from the U.S. and foreign stock markets. Journal of Banking and finance 13, 641–650 (1989) 5. Steeley, J.M.: A note on information seasonality and the disappearance of the weekend effect in the UK stock market. Journal of Banking and Finance 25, 1941–1956 (2001) 6. Tong, W.: International evidence on weekend Anomalies. Journal of financial research 23(4), 495–522 (2000) 7. Brooks, C., Persand, G.: Seasonality in Southeast Asian stock markets: some new evidence on the day-of-the-week effects. Applied Economics Letters 8, 155–158 (2001) 8. Kamath, R., Chusanachoti, J.: An investigation of the day-of-the-week effect in Korea: has the anomalous effect vanished in the 1990s? International Journal of Business 7, 47–62 (2002) 9. Kohers, G., Kohers, N., Pandey, V., Kohers, T.: The disappearing day-of-the-week effect in the world’s largest equity markets. Applied Economics Letters 11, 167–171 (2004) 10. Mills, T.C., Coutts, J.A.: Calendar Effects in the London Stock Exchange FTSE Indices. The European Journal of Finance 1, 79–93 (1995) 11. Basher, S.A., Sadorsky, P.: Day-of-the-week effects in emerging stock markets. Applied Economics Letters 13, 621–628 (2006) 12. Ajayi, R.A., Mehdian, S., Perry, M.J.: The Day-of-the-Week Effect in Stock Returns: Further Evidence from Eastern European Emerging Markets. Emerging Markets Finance and Trade, M.E. Sharpe, Inc. 40(4), 53–62 (2004) 13. Sakalauskas, V., Kriksciuniene, D.: Analysis of the Day-of-the-Week Anomaly for the Case of Emerging Stock Market. In: Neves, J., Santos, M.F., Machado, J.M. (eds.) EPIA 2007. LNCS (LNAI), vol. 4874, pp. 371–383. Springer, Heidelberg (2007) 14. Sakalauskas, V., Kriksciuniene, D.: Day of the week effect in small securities markets. In: Proceedings volume AIDSS of 9th International Conference on Enterprise Information Systems, Funchal, Portugal, 12-16, June 2007, pp. 432–436 (2007) 15. Sakalauskas, V., Kriksciuniene, D.: The impact of daily trade volume on the day-of-theweek effect in emerging stock markets. Information technology and control 36(1A), 152–158 (2007) 16. The Nordic Exchange, http://www.baltic.omxgroup.com/
A Survey of Formal Verification for Business Process Modeling Shoichi Morimoto School of Industrial Technology, Advanced Institute of Industrial Technology 1-10-40, Higashi-oi, Shinagawa-ku, Tokyo, 140-0011, Japan [email protected]
Abstract. Information systems have to respond well to the changing business environment; thus, they must have an architecture that withstands change. Business process modeling is effective for designing such systems; however, the models often include abstractness and arbitrariness. Therefore, there have been efforts to validate the rigorousness of the models, which define the semantics of the models and apply various logics and formal methods to verify their rigorousness. This paper focuses on formal verification of the models and surveys these efforts. We also discuss the prospects of the solutions. The establishment of such verification will surely be helpful for solving problems in business process reengineering, business process management, service-oriented architecture, and so on.
1 Introduction
Recently, enterprise information systems have been designed based on service-oriented architecture. The solution to the changing business environment is the construction of flexible business processes, which is the core of enterprise information systems development. It is common knowledge that business process modeling (BPM) is effective for such development. Developers can generally model business processes with a modeling notation, e.g., BPMN [38] or the activity diagrams of UML [22]. A diagram modeled with such notation is simple and intuitively understandable at a glance, and the notation is designed so that anyone can easily model with it. Moreover, the notation is closely relevant to web services; a diagram can be converted into the BPEL XML format [40]. However, general modeling work includes arbitrariness and lacks strictness: a diagram modeled with the notation may have various interpretations, and one or more different diagrams may denote the same process. Thus, before utilizing BPM, we must define strict semantics for the models and verify them formally. There have been many efforts to validate the strictness of the diagrams: automation tools which can debug grammatical errors of BPMN and convert diagrams into BPEL [23], formal methods for verifying diagrams based on the π calculus [34] or Petri Nets [42], techniques proving consistency with model checking [5], and so on. In this paper we present a survey of existing proposals for formal verification techniques of business process diagrams and compare them among each other
with respect to motivations, methods, and logics. We also give some concluding considerations and our direction for future work. We hope the survey helps designers and developers of enterprise information systems to solve the issues in their area and to satisfy the industrial needs.
2 Formal Verification of Business Process Models

2.1 Basic Logics of the Verification
To formally verify business process models, it is first required to give the models a formal semantics. The logical bases and the research efforts using them can be roughly classified as follows.
Automata. Automata are a common base model for formal system specifications [21]. An automaton consists of a set of states, actions, transitions between states, and an initial state; labels denote the transitions from one state to another. Many specification models that express system behavior derive from automata. In reference [16], the authors propose a framework to analyze and verify properties of BPMN diagrams converted into the BPEL format that communicate via asynchronous XML messages. The framework first converts the processes to a particular type of automata in which every transition is equipped with a guard in XPath format, after which these guarded automata are translated into Promela (Process or Protocol Meta Language) for the SPIN model checker [20]. Consequently, SPIN can be used to verify whether the business process models satisfy properties formalized in LTL (Linear Time Temporal Logic). In reference [8], the authors show a case study that automatically converts business processes written in BPEL-WSCDL to timed automata and subsequently verifies them with the UPPAAL model checker [49]; the authors are currently implementing a tool for the automatic translation that utilizes UPPAAL. In reference [10], the authors propose a framework to automatically verify business processes that are modelled in Orc [35]. They define a formal timed-automata semantics for Orc expressions that conforms to Orc's operational semantics; accordingly, one can formally verify Orc models with UPPAAL. The paper also shows a simple case study.
Thus, to verify business process diagrams, the efforts utilizing automata currently convert the diagrams to XML formats (e.g., BPEL, XPDL, WS-CDL, Orc). After that, they map the XML formats to automata, which model-checking tools can then verify (Fig. 1). Besides the above automata models, team automata [12] and I/O automata [30] may be helpful for the verification. Team automata allow one to specify the components of a system separately, to describe their interactions, and to reuse the components. Their advantage is a flexible description of communication services among distributed systems, extending I/O automata. This advantage enables team automata to describe a formal model of secure web service compositions.
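For illustration only (this property is not taken from [16]), the kind of LTL requirement that SPIN would check over such guarded automata can be written as a response formula; the proposition names below are hypothetical.

```latex
% "Every received order is eventually answered by an invoice" (illustrative).
\mathbf{G}\bigl(\mathit{orderReceived} \rightarrow \mathbf{F}\,\mathit{invoiceSent}\bigr)
```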
Fig. 1. Verification of business process models with automata
Petri Net. Petri Net is a framework for modeling concurrent systems. It captures many basic aspects of concurrent systems simply, mathematically, and conceptually; therefore, many theories of concurrent systems derive from Petri Net. Moreover, because Petri Net has an easily understandable, graphical notation, it has been widely applied. Petri Net is often a topic in BPM and is related to capturing process control flows [50]. In particular, Petri Net can detect the dead paths of business process models whose preconditions are not satisfied. The paper [9] shows how to map all BPMN diagram constructs onto labeled Petri Nets. This output can subsequently be used to verify BPEL processes with the open-source tools BPEL2PNML and WofBPEL [41]. In reference [36], the authors define the semantics of the relation between BPEL and OWL-S [51] in terms of first-order logic. Based on this semantics they formalize business processes in Petri Nets, complete with an operational semantics. They also develop a tool to describe and automatically verify the composition of business processes. In reference [17], the authors apply a Petri-net-based algebra to modeling business processes, based on control flows. The paper [52] proposes a Petri-net-based design and verification tool for web service composition. The tool can visualize, create, and verify business processes; the authors are now improving the graphical user interface, which can be used to aid business process modeling and to edit Petri Nets and BPEL together. The paper [53] introduces a Petri-net-based architectural description language, named WS-Net, in which web-service-oriented systems can be modeled, and presents a simple example. To handle real applications and to detect errors in business processes, the authors are currently developing an automatic translation tool from WSDL to WS-Net. The paper [18] proposes a formal Petri Net semantics for BPEL which covers exception handling and compensations. Moreover, the authors present a parser which can automatically convert business process diagrams into Petri Nets. Consequently, this semantics enables many Petri Net verification tools to automatically analyze business processes. In reference [45], the authors propose a framework which can translate Orc into colored Petri Nets. Colored Petri Nets have been proposed to model large-scale systems more effectively. The framework and tool deal with recursion and data
handling. Moreover, because the framework and tool can simulate and verify the behavior of process models at the design phase of information systems, their users can detect and correct errors beforehand; they thus contribute to raising the reliability of business process diagrams.
Petri Net is a traditional and well-established technique, so there are many verification methods and tools. The essence of the above efforts is how to translate business process diagrams into Petri Nets; after that, a rich variety of tools is available for the verification. However, not all components of business process modeling notation can be translated into Petri Nets. For instance, BPMN has various gateways, event triggers, loop activities, control flows, and nested/embedded sub-processes, and it is difficult to define the correspondence of these objects to Petri Nets. There is room for argument on the translation.
Process Algebras. Process algebras are a diverse family of related approaches to formally modeling concurrent systems. Their semantic foundation is based on automata. Many derivative algebras have been defined, and many references about them exist. The representative process algebras are Milner's Calculus of Communicating Systems (CCS [33]), Hoare's Communicating Sequential Processes (CSP [19]), the Algebra of Communicating Processes (ACP [1]) by Bergstra and Klop, and the Language of Temporal Ordered Systems (LOTOS [2]) ISO standard. Process algebras are strict and well-established theories that support the automatic verification of properties of system behavior, as does Petri Net. They also provide a rich theory of bisimulation analysis, which is helpful for verifying whether a service can substitute another service in a composition, or the redundancy of a service [3]. Among these, the π calculus is a process algebra that has influenced business process description languages, e.g., XLANG and BPEL. With respect to automatic verification, the π calculus is far superior to Petri Net. From a compositional perspective, the π calculus offers constructs to compose activities in terms of sequential, parallel, and conditional execution, combinations of which can lead to compositions of arbitrary complexity.
The following are process-algebraic approaches to specify and verify secure compositions of business processes. In reference [46], the authors discuss the application of process algebras to describe, compose, and verify business processes, with a particular focus on their interactions. They show an example in which they use CCS to specify and compose business processes, and they use the Concurrency Workbench [6] to validate properties such as correct business process composition. If the π calculus is used instead of CCS, this approach can be useful in practical applications; it may solve real issues, e.g., the exchange of messages during business process interactions. In reference [14], the authors define a correspondence between BPEL and LOTOS. The advantage of this proposal is that it includes compensations and exception handling; it thus enables the verification of temporal properties with the CADP model checker [13].
Thus, process-algebraic approaches, like Petri-net-based verification, are suitable for verifying the reliability of information systems, because they can simulate the behavior of business process models and correct their errors at the design phase.

2.2 Aims of the Verification
We have also classified the research efforts from the perspective of the aim of the verification. The following properties are elicited from the above works.
Connectivity. Connectivity must be guaranteed by the verification. With reliable connectivity, developers can determine which processes are composed and reason about their interactions. On the other hand, they must also satisfy non-functional requirements, e.g., timeliness, security, and dependability. In a B2B system it is important to define the consensus between the involved stakeholders; business consensus defines the contract between two or more stakeholders on such requirements, and it is necessary to describe and verify these requirements in composed business processes.
Correctness. With the growing scale and diversification of information systems, the concurrency of large systems is increasingly important. The correctness of such large systems corresponds to their temporal behavior. Behavioral properties can be classified into safety and liveness: a safety property stipulates that "nothing bad" will ever happen during the execution of a system, and a liveness property stipulates that "something good" will eventually happen during the execution of a system. Formal verification provides a rigorous, mathematical way of guaranteeing that large systems comply with their specification.
Compensation and Scalability. In real-world business, end users generally want to interact with many services, so enterprise information systems must be connected with possibly many services. Therefore, one of the important verification properties is how many services the business process models can implement.
Compatibility. Business process modeling notation intends to bridge the gap between business process design and the implementation of enterprise information systems. The diagrams should be verified with respect to the process interactions for business collaboration described in such notation; business collaboration must satisfy compatibility between the business participants in the collaboration.
3 Discussion
With the above verification techniques, verifiers must mathematically formalize the business process models and the verification criteria, considering the semantics and
various issues of their business. This is difficult for verifiers who are not specialists in formal methods or conversant with them: they may make mistakes in the formal descriptions of the verification criteria, or may not know how to describe the criteria formally. Moreover, they must define the verification criteria themselves, but such criteria lack objectivity. It is therefore desirable to establish objective criteria for business process models, e.g., an ISO standard.
The essential purpose of BPM is to construct processes which yield a profit for the enterprise. Thus, we should verify not only the above properties of the diagrams but also the profit generated by the model; that is, we must verify whether the process model certainly yields a profit. We are now discussing whether the chance discovery process [39] can be applied to such verification from the financial and business administration viewpoint. Moreover, various logics introducing probability into first-order predicate logic have been proposed [7][11][15][24][26][27][32][37][43][44]. These studies make it possible to generate Bayesian networks [25] from predicate logic expressions based on knowledge-based model construction [4]. Since some business processes flow with non-programmable decisions [48], business process diagrams with uncertainty have to be verified with such logics. If the correspondence of the business process modeling notations to Bayesian networks is defined (Fig. 2), we can verify the diagrams based on probabilistic inference.
Fig. 2. An example of the correspondence of business process models to Bayesian networks
4 Conclusion
In this paper, we have presented formal verification techniques which simulate and verify business process models at the design phase of enterprise information systems. These techniques can detect and correct errors in the models as early as possible, and in any case before implementation. We have also outlined future work on formal verification for BPM. However, the comparison in this paper surveyed only the basic logics and the aims of the verification; we must also define other, quantitative information in order to choose which logics or methods better suit the formal verification of BPM. Moreover, there are well-established practices which verify UML state machine diagrams for
behavior with model checking [28][29][31][47]; we should compare BPM verification with such studies. A research direction that we would like to pursue in future work is to determine the financial characteristics that each of the languages and models is able to describe, in order to define a valuable business process.
References 1. Bergstra, J.A., Klop, J.W.: Algebra of Communicating Processes with Abstraction. In: Theor. Comput. Sci., vol. 37, pp. 77–121. Elsevier, Amsterdam (1985) 2. Bolognesi, T., Brinksma, E.: Introduction to the ISO Specification Language LOTOS. Computer Networks 14, 25–59 (1987) 3. Bordeaux, L., Sala¨ un, G., Berardi, D., Mecella, M.: When are Two Web Services Compatible? In: Shan, M.-C., Dayal, U., Hsu, M. (eds.) TES 2004. LNCS, vol. 3324, pp. 15–28. Springer, Heidelberg (2005) 4. Breese, J.S.: Construction of Belief and Decision Networks. Computational Intelligence 8(4), 624–647 (1992) 5. Clarke, E., Grumberg, O., Peled, D.: Model Checking. MIT Press, Cambridge (2000) 6. Cleaveland, R., Li, T., Sims, S.: The Concurrency Workbench of the New Century (2000), http://www.cs.sunysb.edu/∼ cwb/ 7. Cussens, J.: Parameter Estimation in Stochastic Logic Programs. Machine Learning 44(3), 245–271 (2001) 8. D´ıaz, G., Pardo, J.J., Cambronero, M.-E., Valero, V., Cuartero, F.: Automatic Translation of WS-CDL Choreographies to Timed Automata. In: Bravetti, M., Kloul, L., Zavattaro, G. (eds.) EPEW/WS-EM 2005. LNCS, vol. 3670, pp. 230– 242. Springer, Heidelberg (2005) 9. Dijkman, R.M., Dumas, M., Ouyang, C.: Formal Semantics and Analysis of BPMN Process Models using Petri Nets, Queensland University of Technology (2007), http://eprints.qut.edu.au/archive/00007115/ 10. Dong, J.S., Liu, Y., Sun, J., Zhang, X.: Verification of Computation Orchestration via Timed Automata. In: Liu, Z., He, J. (eds.) ICFEM 2006. LNCS, vol. 4260, pp. 226–245. Springer, Heidelberg (2006) 11. Eisner, J., Goldlust, E., Smith, N.A.: Compiling Comp Ling: Weighted Dynamic Programming and the Dyna Language, Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing. In: Proc. of the Conference (HLT/EMNLP 2005). The Association for Computational Linguistics, pp. 281–290 (2005) 12. Ellis, C.: Team Automata for Groupware Systems. In: Proc. of the international ACM SIGGROUP conference on Supporting group work (GROUP 1997), pp. 415– 424. ACM Press, New York (1997) 13. Fernandez, J.-C., Garavel, H., Kerbrat, A., Mounier, L., Mateescu, R., Sighireanu, M.: CADP - A Protocol Validation and Verification Toolbox. In: Alur, R., Henzinger, T.A. (eds.) CAV 1996. LNCS, vol. 1102, pp. 437–440. Springer, Heidelberg (1996) 14. Ferrara, A.: Web Services: A Process Algebra Approach. In: Proc. of the 2nd International Conference on Service-Oriented Computing (ICSOC 2004), pp. 242– 251. ACM Press, New York (2004)
15. Friedman, N., Getoor, L., Koller, D., Pfeffer, A.: Learning Probabilistic Relational Models. In: Proc. of the 16th International Joint Conference on Artificial Intelligence (IJCAI 1999), pp. 1300–1309. Morgan Kaufmann, San Francisco (1999) 16. Fu, X., Bultan, T., Su, J.: Analysis of Interacting BPEL Web Services. In: Proc. of the 13th International Conference on the World Wide Web (WWW 2004), pp. 621–630. ACM Press, New York (2004) 17. Hamadi, R., Benatallah, B.: A Petri Net-based Model for Web Service Composition. In: Proc. of the 14th Australasian Database Conference (ADC 2003), Conferences in Research and Practice in Information Technology, vol. 17, pp. 191–200. Australian Computer Society (2003) 18. Hinz, S., Schmidt, K., Stahl, C.: Transforming BPEL to Petri Nets. In: van der Aalst, W.M.P., Benatallah, B., Casati, F., Curbera, F. (eds.) BPM 2005. LNCS, vol. 3649, pp. 220–235. Springer, Heidelberg (2005) 19. Hoare., C.: Communicating Sequential Processes. Prentice-Hall, Englewood Cliffs (1985) 20. Holzmann, G.J.: The SPIN Model Checker –Primer and Reference Manual. Addison-Wesley, Reading (2003) 21. Hopcroft, J.E., Motwani, R., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation, 3rd edn. Addison-Wesley, Reading (2006) 22. ISO/IEC 19501 Standard: Information Technology – Open Distributed Processing – Unified Modeling Language (UML) Version 1.4.2 (2005) 23. ITP Commerce, Process Modeler for Microsoft VisioT M , http://www.itp-commerce.com/ 24. Jaeger, M.: Relational Bayesian Networks. In: Proc. of the 13th Conference on Uncertainty in Artificial Intelligence (UAI 1997), pp. 266–273. Morgan Kaufmann, San Francisco (1997) 25. Jensen, F.V.: Bayesian Networks and Decision Graphs. Springer, Heidelberg (2001) 26. Kersting, K., Raedt, L.D.: Bayesian Logic Programs. In: Cussens, J., Frisch, A.M. (eds.) ILP 2000. LNCS (LNAI), vol. 1866, Springer, Heidelberg (2000) 27. Koller, D., Pfeffer, A.: Learning Probabilities for Noisy First-Order Rules. In: Proc. of the 15th International Joint Conference on Artificial Intelligence (IJCAI 1997), pp. 1316–1323. Morgan Kaufmann, San Francisco (1997) 28. Latella, D., Majzik, I., Massink, M.: Automatic Verification of a Behavioural Subset of UML Statechart Diagrams using the SPIN Model Checker. Formal Asp. Comput. 11(6), 637–664 (1999) 29. Lilius, J., Paltor, I.: vUML: A Tool for Verifying UML Models. In: Proc. of the 14th IEEE International Conference on Automated Software Engineering (ASE 1999), pp. 255–258. IEEE Computer Society, Los Alamitos (1999) 30. Lynch, N.A., Tuttle, M.R.: An Introduction to Input/Output Automata. CWI Quarterly 2(3), 219–246 (1989) 31. Mikk, E., Lakhnech, Y., Siegel, M., Holzmann, G.J.: Implementing Statecharts in Promela/SPIN. In: Proc. of the 2nd Workshop on Industrial-Strength Formal Specification Techniques (WIFT 1998), pp. 90–101. IEEE Computer Society, Los Alamitos (1998) 32. Milch, B., Marthi, B., Russell, S.J., Sontag, D., Ong, D.L., Kolobov, A.: BLOG: Probabilistic Models with Unknown Objects. In: Proc. of the 19th International Joint Conference on Artificial Intelligence (IJCAI 2005), pp. 1352–1359. Professional Book Center (2005) 33. Milner, R.: Communication and Concurrency. Prentice-Hall, Englewood Cliffs (1989)
522
S. Morimoto
34. Milner, R.: Communicating and Mobile Systems: The Pi-Calculus. Cambridge University Press, Cambridge (1999) 35. Misra, J., Cook, W.R.: Orc - An Orchestration Language, http://www.cs.utexas.edu/∼ wcook/projects/orc/ 36. Narayanan, S., McIlraith, S.A.: Simulation, Verification and Automated Composition of Web Services. In: Proc. of the 11th International World Wide Web Conference (WWW 2002), pp. 77–88. ACM Press, New York (2002) 37. Ngo, L., Haddawy, P.: Answering Queries from Context-Sensitive Probabilistic Knowledge Bases. Theor. Comput. Sci 171(1-2), 147–177 (1997) 38. Object Management Group: Business Process Modeling Notation Specification, Final Adopted Specification dtc/06-02-01 (2006) 39. Ohsawa, Y.: Chance Discoveries for Making Decisions in Complex Real World. New Generation Computing 20(2), 143–163 (2002) 40. Organization for the Advancement of Structured Information Standards: Web Services Business Process Execution Language Version 2.0 (2007) 41. Ouyang, C., Verbeek, E., van der Aalst, W.M.P., Breutel, S., Dumas, M., ter Hofstede, A.H.M.: WofBPEL: A Tool for Automated Analysis of BPEL Processes. In: Benatallah, B., Casati, F., Traverso, P. (eds.) ICSOC 2005. LNCS, vol. 3826, pp. 484–489. Springer, Heidelberg (2005) 42. Petri, C.A.: Kommunikation mit Automaten. PhD thesis, Rheinisch-Westf¨ alisches Institut f¨ ur Instrumentelle Mathematik an der Universit¨ at Bonn (1962) 43. Poole, D.: The Independent Choice Logic for Modelling Multiple Agents under Uncertainty. Artif. Intell. 94(1-2), 7–56 (1997) 44. Richardson, M., Domingos, P.: Markov Logic Networks. Machine Learning 62(1-2), 107–136 (2006) 45. Rosario, S., Benveniste, A., Haar, S., Jard, C.: Net System Semantics of Web Services Orchestrations Modeled in Orc, Research Report IRISA, No. 1780, Istitut de Rechercheen Informatique et Syst`emes Al´eatoires (2006) 46. Sala¨ un, G., Bordeaux, L., Schaerf, M.: Describing and Reasoning on Web Services using Process Algebra. In: Proc. of the International Conference on Web Services (ICWS 2004), pp. 43–50. IEEE Computer Society, Los Alamitos (2004) 47. Sch¨ afer, T., Knapp, A.,, Merz, S.: Model Checking UML State Machines and Collaborations. Electr. Notes Theor. Comput. Sci. 55(3), 357–369 (2001) 48. Simon, H.A.: The New Science of Management Decision. Prentice-Hall, Englewood Cliffs (1977) 49. UPPAAL: http://www.uppaal.com/ 50. Wohed, P., van der Aalst, W.M.P., Dumas, M., ter Hofstede, A.H.M.: Analysis of Web Services Composition Languages: The Case of BPEL4WS. In: Song, I.-Y., Liddle, S.W., Ling, T.-W., Scheuermann, P. (eds.) ER 2003. LNCS, vol. 2813, pp. 200–215. Springer, Heidelberg (2003) 51. World Wide Web Consortium: OWL-S: Semantic Markup for Web Services, http://www.w3.org/Submission/OWL-S/ 52. Yi, X., Kochut, K.: A CP-nets-based Design and Verification Framework for Web Services Composition. In: Proc. of the International Conference on Web Services (ICWS 2004), pp. 756–760. IEEE Computer Society, Los Alamitos (2004) 53. Zhang, J., Chung, J.-Y., Chang, C.K., Kim, S.: WS-Net: A Petri-net Based Specification Model for Web Services. In: Proc. of the International Conference on Web Services (ICWS 2004), pp. 420–427. IEEE Computer Society, Los Alamitos (2004)
Network Modeling of Complex Dynamic Systems Bosiljka Tadić Department of Theoretical Physics, Jožef Stefan Institute, Ljubljana, Slovenia [email protected]
Abstract. Networks representing complex dynamical systems often have a multiscale structure and additional properties of nodes and links, which may vary in time. We briefly summarize current trends in statistical physics research on network structure and dynamics and its applications to physical, biological, and social systems discussed in the Workshop. Keywords: Networks, Complexity, Dynamics, Subgraphs.
1 Mapping Complex Dynamical Systems to Graphs
The idea of mapping a complex dynamical system onto a graph, e.g., by identifying nodes and their connections, is to provide grounds for a quantitative analysis of these systems, particularly in view of graph theory and the statistical physics of complex networks. The approach is particularly useful in areas traditionally lacking formal theories and in newly emerging disciplines, specifically in the
Fig. 1. Network generated from the correlations of gene expressions for the selected set of genes in the genome of yeast [2,3]. Nodes represent different genes and each color corresponds to a gene function (grouped into 7 classes). A threshold was applied to the correlations.
study of biological networks, and networks related to socio-technological interactions and the economy. Science has just begun to recognize the dynamically nonlinear and strongly correlated nature of these systems [1]. Furthermore, modeling
the collective dynamics on networks [4] provides the way towards quantitative study, scalability and universality in the emergent dynamical phenomena. Such dynamical effects play an important role in systems of different scales and natures of interaction. Two striking examples are technology-based social communications [1] and functional materials in nano-technology [5]. Emergent graphs representing complex dynamical systems, both in data-driven modeling and in theoretical approaches, are often sparse, scale-free, and modular. These structural properties are closely related to the system's evolution and function. Accordingly, there is a constant need to develop new or adjusted methods and algorithms to appropriately analyze and visualize these graphs. The goals of the Workshop are to (partially) review current work, both in theoretical and algorithmic approaches, on one side, and in applications that primarily use the methods of statistical physics, on the other. Of no lesser importance is the constant intention to develop communication across different disciplines and to improve converging research standards in the field.
2 Current Trends in Network Research
A brief summary of the research to be discussed at the Workshop includes:
– Networks from empirical data, especially in the social sciences and bioinformatics, can be constructed in different ways (an example is shown in Fig. 1). Depending on the concept, careful filtering of the data is necessary, as well as awareness of the parameters, missing data, etc., which may affect the results.
– Topology & Relevant Subgraphs in multiscale networks are of great importance, particularly the subgraphs related to conserved evolutionary or functional units (communities, motifs, trees, paths, cycles) in biological networks.
– More Structure-Dynamics Interdependences, readily manifested in diffusive processes on networks [4]. Spin dynamics, synchronization and coupled chaotic maps are some of the currently studied examples, where new dynamical effects occur due to the network structure and vice versa.
– Standard & Renewed Methods, such as spectral analysis, weighted maximum-likelihood adapted to a different environment, and M-distance matrices, reveal additional robust properties not seen immediately from the connectivity matrix.
References
1. Boccaletti, S., Latora, V., et al.: Complex Networks: Structure and Dynamics. Physics Reports 424, 175–308 (2006)
2. Cho, R.J., et al.: A Genome-Wide Transcriptional Analysis of the Mitotic Cell Cycle. Molecular Cell 2, 65–73 (1998)
3. Živković, J., et al.: Statistical Indicators of Collective Behavior and Functional Clusters in Gene Expression Network of Yeast. European Phys. J. B 50, 255 (2006)
4. Tadić, B., Rodgers, G.J., Thurner, S.: Transport on Complex Networks: Flow, Jamming & Optimization. Int. J. Bifurcation and Chaos 17, 2363–2385 (2007)
5. Virtual Journal of Nanoscale Science & Technology, http://www.vjnano.org/
Clustering Organisms Using Metabolic Networks Tomasz Arodź Institute of Computer Science, AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Kraków, Poland [email protected]
Abstract. Topological properties of metabolic networks may reflect systematic differences between evolutionarily distinct groups of organisms. Indeed, the mean shortest path length between metabolites is, on average, longer in eukaryotes than in bacteria. We show that not only do the group averages differ, but the organisms can be successfully clustered, based on network properties, into categories corresponding to taxonomic groups. We use the fact that in metabolic networks of different organisms, the correspondence between vertices is available. We compare our approach with several graph indices employed previously to analyse metabolic networks, and show that they fail to achieve a level of clustering similar to ours. Finally, we show that the phylogenetic tree constructed using the network-based approach agrees in most cases with gene-based phylogeny. Keywords: Metabolic networks, Phylogenetic tree, Graph descriptors.
1 Introduction
Nature provides us with a stunning diversity of life forms. The first taxonomic efforts aimed at classifying and bringing order to this heterogeneity relied on apparent phenotypic differences between organisms. With the advances in molecular biology, the study of the evolution of species found ground in the conservation and divergence of their corresponding gene or protein sequences. Phylogenetic analyses of evolutionarily conserved genome units, such as 16S rRNA, allowed for constructing trees that capture the evolutionary closeness of organisms, with more distant species occupying nodes in distant branches. Still, relying on a specific sequence in the construction of a phylogenetic tree can be unstable, due to events such as horizontal gene transfer. As genome sequencing efforts progress, a whole-genome phylogenetic approach emerged to circumvent this issue [1]. Analysing sequences of some or even all genes or proteins does not make explicit the system-level interactions between these entities. Recently, network-based approaches gained popularity in modelling and analysing the linked and entangled structure that turns these individual units into a living cell [2]. In particular, metabolic networks, linking metabolites through enzymes that catalyse reactions between them, were shown to belong to a class of complex networks with scale-free connectivity [3]. Such networks are resilient to small random changes, which may accumulate through generations and give hints to the evolutionary distance between organisms.
The goal of the paper is to cluster organisms according to the similarity of the topological properties of their metabolic networks. Using such networks to aid the construction of phylogenetic trees has been introduced recently. One approach for evaluating the distance between organisms' networks relies on information other than topological. Measures of overlap between sets of metabolites [4,5] or enzymes [6,4] present in different organisms were used. Other methods use information on enzyme similarity instead of, or in addition to, topological properties, which makes them more akin to classical phylogenetic algorithms that operate on genetic closeness. For example, enzyme functional class similarity was employed together with similarities of enzyme neighbourhoods [7]. The extent of sequence agreement of enzymes, and of those substrates that are proteins, was also used [8]. Another group of methods relies only on the topological properties of the graph, once the network structure of enzymatic connections linking metabolites has been inferred from the genome. The average shortest path length of metabolic networks was shown to be shorter for bacteria than for eukaryotes and archaea [9]. Average clustering coefficient and average betweenness centrality were proposed as means to discriminate archaea from bacteria, but this was validated using only 11 organisms [10]. In the same study, network diameter and average shortest paths were deemed not suitable for clustering. Phylogenetic trees were also constructed based on graph kernels [11]. Rank order of metabolites or enzymes was also used [4], revealing clustering of parasitic and non-parasitic bacteria, eukaryotes and archaea, but the networks involved spurious connections through abundant current metabolites, for example cofactors such as ATP. We show here that analysing metabolic networks using only simple graph descriptors to infer similarity allows for obtaining meaningful phylogenetic results. We use the fact that network edges are spanned on a common set of vertices with known identities as metabolites. We calculate and compare vertex-specific descriptors, that is, topological properties of nodes common to the networks of all organisms. Our approach does not involve quantitative measures of genetic or functional similarity of enzymes in the network. Nor does it use the differences in presence or absence of certain elements of the network, as we use descriptors for vertices that are present in all networks. Despite the simplicity of the topological descriptors used in calculating graph similarity, the resulting phylogenetic tree is in good accord with current phylogenetic knowledge.
2 Methods
Evaluation of similarity of two graphs is in most applications hindered by the lack of correspondence between elements of the network. Techniques to circumvent this involve dissimilarity measures based on the cost of edit operations required to transform one graph into another or rely on estimated vertex matching in graphs [12]. Other methods compare graph indices, either global properties, such as graph diameter, or aggregated statistics of node- or edge-specific properties, such as average clustering coefficient [13]. Also, more sophisticated functions of
vertex-specific properties are in use, such as symmetric polynomials, which are invariant to the permutation of nodes [14]. In the metabolic networks made publicly available by Ma and Zeng [9], which we analyse here, vertices represent well-defined chemical entities, such as glucose, while edges correspond to enzymatic reactions that transform the metabolites. Thus, the correspondence between vertices in the network is readily available. Permutation-invariant measures or indices based on aggregate statistics are no longer necessary to obtain the similarity between two graphs. Instead, we evaluate here an approach in which the values of vertex-specific topological properties are used directly to form a feature vector. Each feature then corresponds to a specific metabolite. Dissimilarities between metabolic networks are quantified by distances between the corresponding feature vectors. Metabolic networks were shown to be composed of a single giant connected component and a set of isolated small components [15]. In further analysis, we have discarded these small components and retained only the giant connected component of the network. One should note that in different organisms the whole networks, as well as the giant connected components, are of different sizes, and overlap only partially. We have analysed two vertex-specific topological descriptors as the basis for our feature vectors. A descriptor provides a single numerical value for each vertex in the network's giant connected component. One analysed property is the vertex degree, defined as the number of edges incident with the vertex. Alternatively, we analysed the betweenness centrality of a vertex v, defined as the fraction of the number of shortest paths between two vertices s and t passing through v in the overall number of shortest paths between s and t, averaged over all source-target pairs in the giant connected component. Values of these descriptors are available only for nodes present in the network of a given organism, and also present in its giant connected component. To obtain feature vectors with no missing entries for all organisms, we identified vertices that are present in the giant connected components of all the analysed metabolic networks. This resulted in a subset containing 100 nodes. We utilised the values of node degrees or node betweenness centrality only from these vertices. Thus, we arrive at 100-dimensional feature vectors that are well-defined across all the networks. Also, while the values are taken from only a subset of the connected component, they contain information from outside of the subset, as the calculation of the vertex properties involves the whole giant connected component.
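As an illustration of this construction, the sketch below builds such feature vectors with the networkx and numpy libraries; it assumes (hypothetically) that the metabolic networks are already loaded as networkx graphs whose node labels are metabolite identifiers, so that the common subset of vertices can be found by a simple set intersection.

```python
import networkx as nx
import numpy as np

def giant_component(g):
    # Largest connected component, kept as a separate graph
    nodes = max(nx.connected_components(g), key=len)
    return g.subgraph(nodes).copy()

def feature_vectors(networks, descriptor="degree"):
    """networks: dict organism -> nx.Graph with metabolite identifiers as nodes."""
    giants = {org: giant_component(g) for org, g in networks.items()}
    # Metabolites present in the giant connected component of every organism
    common = sorted(set.intersection(*(set(g.nodes()) for g in giants.values())))
    vectors = {}
    for org, g in giants.items():
        if descriptor == "degree":
            values = dict(g.degree())
        else:
            # Betweenness centrality is computed on the whole giant component,
            # so the restricted features still carry global information
            values = nx.betweenness_centrality(g)
        vectors[org] = np.array([values[m] for m in common], dtype=float)
    return common, vectors
```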
3 Results and Discussion
3.1 Clustering Organisms Using Vertex-Specific Descriptors
We have analysed metabolic networks of 44 organisms – 3 eukaryotes and 41 bacteria. The latter consist of 25 members of Proteobacteria, 12 Firmicutes and 4 Actinobacteria (see Fig. 4 for detailed information about the species analysed). For each organism, we obtained a feature vector of 100 values describing the vertex degrees of the nodes in its metabolic network.
Fig. 1. First two principal components of the 100-dimensional feature space, formed by values of the vertex degrees of 100 vertices common to all giant connected components of the 44 analysed metabolic networks
In Fig. 1, we show the projection of the vertex-specific descriptors into the principal component space (PCA) [16], spanned by the highest-variance, mutually uncorrelated directions in the dataset. The figure indicates that metabolic networks of the three animals, as captured by the descriptors, are distinct from the networks of bacteria. Within the cluster of the latter, Proteobacteria form a region separated from Firmicutes. Actinobacteria are grouped between Firmicutes and Proteobacteria. The distances between the three animals are relatively large compared to distances between neighbouring bacterial organisms. This is consistent with the fact that the animals analysed, H. sapiens, D. melanogaster and C. elegans, represent different phyla, while within the phyla of Proteobacteria, Actinobacteria and Firmicutes several examples of each bacterial class are present.
3.2 Comparison with Network Indices
To compare the effectiveness of the vertex-specific approach with the one based on graph indices, we grouped the descriptors employed in previous studies of metabolic networks to form a feature vector and analysed its clustering and discriminatory ability. The feature vector comprises the average shortest path length, maximum path length, average clustering coefficient, number of vertices, average vertex degree and average vertex betweenness centrality, each normalised to zero mean and unit variance. Next, we conducted principal component analysis (PCA). In contrast to the results for our vertex-specific descriptors in Fig. 1, the results depicted in the left panel of Fig. 2 do not show meaningful clusters.
Fig. 2. Principal component space (left) and optimal discriminant space obtained with LDA (right) for a feature set composed of global descriptors and averages of local descriptors of metabolic networks. See text for details.
The lack of clusters corresponding to groups of organisms in the principal component space indicates that the main variance captured by graph indices is not associated with inter-group diversity. It does not mean that differences between groups are not captured at all. We evaluated the extent to which the analysed graph indices contain information allowing discrimination between groups of organisms. To this end, we employed a supervised projection method, Fisher's Linear Discriminant Analysis (LDA) [16], to find an optimal discriminant space. LDA finds a projection optimal in terms of separation between groups, knowing on input the assignment of organisms to groups. The result, depicted in the right panel of Fig. 2, indicates that some discriminatory information is present in the descriptors, at least allowing eukaryotes to be separated from bacteria. Unlike for the vertex-specific descriptors, as comparison with the PCA results shows, this information does not dominate the variance but is overwhelmed by the variance within organism groups. It has to be extracted with supervised learning methods that know the group assignment from the start, such as LDA. However, such supervised methods are unsuitable for phylogenetic analysis, which is inherently an unsupervised task. Similarly unsatisfactory clustering was obtained when graph indices were computed using only the 100 vertices common to all giant connected components (data not shown), which rules out the possibility that the advantage of our method stems from using only a specific subset of vertices. For comparison, if the same supervised LDA method is applied to our vertex-specific descriptors based on vertex degrees, as shown in the left panel of Fig. 3, discrimination between groups of organisms is much more evident than for graph indices in Fig. 2. Moreover, the clustering in PCA space is not much different from the supervised results in LDA space, which further supports the claim that for vertex-specific descriptors, the main source of differences between organisms is related to their evolutionary closeness. As shown in the right panel of Fig. 3, the effectiveness of the vertex-specific approach is not confined to the vertex degree only; vertex-specific descriptors based on betweenness centralities of the nodes can also be used.
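A minimal sketch of the two projections used above, written with scikit-learn; the array shapes and label encoding are our assumptions, not part of the original study.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def project(X, y):
    """X: (44, 100) matrix of vertex-specific descriptors, y: taxonomic group labels."""
    X = (X - X.mean(axis=0)) / X.std(axis=0)            # normalise each feature
    pca_coords = PCA(n_components=2).fit_transform(X)   # unsupervised projection (Fig. 1 style)
    lda_coords = LinearDiscriminantAnalysis(
        n_components=2).fit_transform(X, y)             # supervised projection (Fig. 3 style)
    return pca_coords, lda_coords
```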
Fig. 3. Optimal discriminant space obtained with LDA for the vertex-specific descriptors of metabolic networks, based on vertex degree (left) or vertex centrality (right)
3.3 Phylogenetic Tree from Vertex-Specific Descriptors of Metabolic Networks
We have used feature vectors defined by vertex-specific descriptors based on vertex degrees to construct the phylogenetic tree of the 44 analysed organisms. To this end, we calculated Euclidean distances between the normalised feature vectors representing the metabolic networks of the organisms. The distances were then supplied to the NJ phylogenetic tree construction algorithm [17] implemented in the PHYLIP tool. The resulting tree is depicted in Fig. 4. In the tree, bacteria form a subtree separate from eukaryotes. Within bacteria, a certain amount of taxonomic order is preserved, the most notable exception being the placement of N. meningitidis, a β-proteobacterium, along with M. leprae, a member of Actinobacteria, and C. crescentus, an α-proteobacterium, in a separate subtree. One reason for this may be that C. crescentus is a species of Caulobacterales, while all the other α-proteobacteria, separated from it in the tree, are Rhizobiales. Furthermore, a recent study using a genome-wide phylogeny indicated that N. meningitidis is quite distinct [18]. Depending on the tree-construction method, it was situated near α- or γ-proteobacteria, not along the other β-proteobacterium with a sequenced genome, R. solanacearum. In whole-genome phylogeny, R. solanacearum aligned itself consistently with a γ-proteobacterium, P. aeruginosa, a finding that is also supported by the results of our study. The rest of the organisation of the bacteria tree follows taxonomy, in some groups down to below the level of species class. A subtree of α-proteobacteria forms a separate group within Proteobacteria, as do γ-proteobacteria. Within a common tree of the class of γ-proteobacteria, a branch corresponding to the order Enterobacteriales is present. Actinobacteria, a phylum of high-GC Gram-positive bacteria, and Firmicutes, low-GC Gram-positive bacteria, form a tree isolated from Gram-negative Proteobacteria, with further grouping of Firmicutes into a single branch separated from Actinobacteria.
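The distance matrix fed to PHYLIP can be produced with a few lines of code. The sketch below assumes the feature vectors from the earlier sketch and writes the pairwise Euclidean distances in the square PHYLIP format read by the neighbor program; the file name and the 10-character name padding follow PHYLIP conventions, the rest is our assumption.

```python
import numpy as np

def write_phylip_distances(names, vectors, path="infile"):
    """Pairwise Euclidean distances between normalised feature vectors, PHYLIP square format."""
    X = np.array([vectors[n] for n in names], dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)                 # normalise features
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    with open(path, "w") as f:
        f.write(f"{len(names)}\n")
        for name, row in zip(names, D):
            # PHYLIP expects taxon names padded to 10 characters
            f.write(name[:10].ljust(10) + " ".join(f"{d:.6f}" for d in row) + "\n")
```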
[Fig. 4: the leaves of the tree are labelled with the 44 analysed organisms, grouped into Proteobacteria, Firmicutes, Actinobacteria and Eukaryotes.]
Fig. 4. Phylogenetic tree obtained with NJ tree construction method [17] using the vertex-specific descriptors of metabolic networks based on vertex degrees
4 Conclusion
We have shown that the evolutionary distances between organisms can be successfully inferred from simple properties of their metabolic networks. We used the fact that the correspondence between vertices in metabolic networks is readily available, as they represent well-defined chemical entities, to construct vertex-specific topological graph descriptors. This network-centred approach allows for capturing the evolutionary closeness of organisms at the functional, system level. The resulting clustering of organisms is in most aspects consistent with the well-established view that relies on genome content.
Acknowledgements. The author acknowledges Polish Ministry of Science and Higher Education research projects 3T11F01030 and N519315535 and KI AGH internal grant 11.11.120.777. Graph properties were calculated with Matlab BGL library. The author thanks Prof. Dr. Witold Dzwinel for mentoring, and Mr. Marcin Kurdziel and Mr. Wojciech Czech for discussions.
References 1. Wolf, Y., Rogozin, I., Grishin, N., Koonin, E.: Genome trees and the tree of life. Trends in Genetics 18, 472–479 (2002) 2. Barabasi, A., Oltvai, Z.: Network biology: understanding the cell’s functional organization. Nature Reviews Genetics 5, 101–113 (2004) 3. Jeong, H., Tombor, B., Albert, R., Oltvai, Z., Barab´asi, A.: The large-scale organization of metabolic networks. Nature 407, 651–654 (2000) 4. Podani, J., Oltvai, Z., Jeong, H., Tombor, B., Barab´ asi, A., Szathm´ ary, E.: Comparable system-level organization of Archaea and Eukaryotes. Nature Genetics 29, 54–56 (2001) 5. Forst, C., Flamm, C., Hofacker, I., Stadler, P.: Algebraic comparison of metabolic networks, phylogenetic inference, and metabolic innovation. BMC Bioinformatics 7, 67 (2006) 6. Ma, H., Zeng, A.: Phylogenetic comparison of metabolic capacities of organisms at genome level. Molecular Phylogenetics and Evolution 31, 204–213 (2004) 7. Heymans, M., Singh, A.: Deriving phylogenetic trees from the similarity analysis of metabolic pathways. Bioinformatics 19, i138–i146 (2003) 8. Forst, C., Schulten, K.: Phylogenetic analysis of metabolic pathways. Journal of Molecular Evolution 52, 471–489 (2001) 9. Ma, H., Zeng, A.: Reconstruction of metabolic networks from genome data and analysis of their global structure for various organisms. Bioinformatics 19, 270–277 (2003) 10. Zhu, D., Qin, Z.: Structural comparison of metabolic networks in selected single cell organisms. BMC Bioinformatics 6, 8 (2005) 11. Oh, S., Joung, J., Chang, J., Zhang, B.: Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks. BMC Bioinformatics 7, 284 (2006) 12. Luo, B., Hancock, E.: Structural graph matching using the EM algorithm and singular value decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence 23, 1120–1136 (2001) 13. Costa, L., Rodrigues, F., Travieso, G., Boas, P.: Characterization of complex networks: A survey of measurements. Advances in Physics 56, 167–242 (2007) 14. Wilson, R., Luo, E.: Pattern vectors from algebraic graph theory. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1112–1124 (2005) 15. Ma, H., Zeng, A.: The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19, 1423–1430 (2003) 16. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, Heidelberg (2006) 17. Saitou, N., Nei, M.: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4, 406–425 (1987) 18. Henz, S., Huson, D., Auch, A., Nieselt-Struwe, K., Schuster, S.: Whole-genome prokaryotic phylogeny. Bioinformatics 21, 2329–2335 (2005)
Influence of Network Structure on Market Share in Complex Market Structures
Makoto Uchida1 and Susumu Shirayama2
1 School of Engineering, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8568, Japan, [email protected]
2 Research into Artifacts, Center for Engineering, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8568, Japan, [email protected]
Abstract. We study a dynamical model on complex networks. The model is based on multi-agent modeling and is aimed at representing the 'Network Effect' in a market of personal communication services. Through a series of numerical simulations using various models of complex networks, we find two classes of resulting dynamics, which depend on the underlying network structure as well as on the parameter settings. We also apply the model to real social network data and discuss the simulation results in relation to the network models.
1 Introduction
The study of complex networks has been attracting wide attention in physics, engineering and other fields as a basic representation for the analysis and understanding of a variety of complex systems. In such systems, complex network structures are considered to be responsible for the characteristic dynamical phenomena that take place in them. Under this idea, a number of examples of complex networks, including physical, technological, biological and social systems, have been studied [1–3]. Many statistical properties that are commonly seen across many network instances have been revealed, and theoretical models of complex networks which realize these properties have been proposed [4–7]. In our preceding study, we reported that even a simple spin-like model of a dynamical process on complex networks is strongly affected by the underlying network structure, and that inherent characteristics of network structures that are not fully represented by the statistical properties can be classified in relation to the dynamical process [8]. On the other hand, real-world examples of dynamical processes are more complicated than such simple physical models. For example, human interaction and decision making in a real-world market are much more complicated, with various constraints and uncertainties, which often lead to specific real-world phenomena, such as the 'Network Effect' that is often referred to in the business and marketing science literature [9]. In such processes of human interaction, it is not obvious whether the structural complexity of the network structures of interaction patterns will be associated with, and accountable for, the resulting phenomena, even with relatively complicated models of
dynamical processes. In this paper, we investigate the relationship between a more complicated dynamical model and the underlying network structures, extending our previous works [8, 9]. The dynamical model is based on multi-agent modeling for an artificial market simulation. Through a number of numerical simulations using complex network models and the structural data of a real social network, we study the resulting dynamics of the system and their classification.
2 The Model: Dynamics of Network Effect
We consider a dynamical model inspired by the so-called 'Network Effect', a market characteristic that often arises in a market where two or more products or services are competing. Generally, the network effect, or network externalities, is defined as an effect whereby the benefit of a product or a service increases with the number of users who also use the same product or service [10]. This has been considered to lead to the emergence of market phenomena such as 'winner-takes-all'. So far, a number of studies have examined the network effect, mainly for the purpose of understanding market dynamics and constructing an efficient market or effective market strategies [11–16]. Above all, Wendt and Westarp [14], and Weitzel et al. [16], reported that the network structures, or topologies, of the interaction patterns of users have a strong influence on the magnitude of the network effect. In the present paper, a multi-agent simulation model that represents a market of a personal communication service (e.g., mobile phone or short messaging service) is constructed. The basic concept of the modeling is as follows. Individual users are represented by agents. Each user agent has a contact list of friends with whom to communicate. Users pay a charge for the service, which is proportional to the amount of communication they do. In the market, there are assumed to be two competing service providers. Users choose one of the providers so as to minimize their overall costs. Consequently, the communication pattern of users can be represented by a network
Fig. 1. Schematic diagram of the network
structure. Figure 1 shows a schematic representation of this model. Each circle denotes a user. For example, the user of provider S in the center of the figure has six friends to communicate with, four of whom use the different provider D while the rest use the same provider S.

2.1 Model of User

We consider that users can select either service provider S or D. Users pay for the service in proportion to the amount of communication. Here, we assume that the charge within the same provider is cheaper than that to the different provider. We also assume that an additional cost (typically an initial setup cost) is required if a user wants to switch to the other provider. We can formulate this model as follows. The overall cost c that user i pays in a unit period is

  c(x_i, \alpha^S, \alpha^D, k_i^S, k_i^D, \delta) = \min\bigl( (\alpha^S k_i^S + \alpha^D k_i^D)\, x_i ,\ (\alpha^S k_i^D + \alpha^D k_i^S)\, x_i + \delta \bigr),   (1)

where x_i is the amount of communication per friend in a unit period of user i, k_i^S and k_i^D are the numbers of friends of user i who are using the same provider and the different provider as user i, respectively, and \alpha^S and \alpha^D are the unit charges of communication to users of the same provider and the different provider, respectively. We assume \alpha^S and \alpha^D to be constant values that satisfy \alpha^S < \alpha^D. Here, \delta is the cost to switch providers, which is also assumed to be a constant. In this modeling, we assume that the total amount of communication X_{total} that users make in a unit period follows a normal distribution N(\mu, \sigma^2) for each user. Therefore, x_i in Eqn. (1) can be rewritten as x_i = X_{total}/k_i, where k_i = k_i^S + k_i^D is the number of friends of user i. Note that because of this assumption, the amount of communication is asymmetric; the amount of communication from user i to user j and from j to i differs if their numbers of friends differ. Consequently, Eqn. (1) can be rewritten as

  c(k_i^S, k_i^D, X_{total}) = \min\bigl( (\alpha^S k_i^S + \alpha^D k_i^D)\, X_{total}/(k_i^S + k_i^D) ,\ (\alpha^S k_i^D + \alpha^D k_i^S)\, X_{total}/(k_i^S + k_i^D) + \delta \bigr).   (2)

At each step t, a user is randomly selected and a choice is made as to whether the user should switch providers in order to minimize the overall cost according to Eqn. (2). In other words, user i will switch providers, and pay the switch cost, if

  (\alpha^S k_i^S + \alpha^D k_i^D)\, X_{total}/(k_i^S + k_i^D) \ge (\alpha^S k_i^D + \alpha^D k_i^S)\, X_{total}/(k_i^S + k_i^D) + \delta.
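A minimal sketch of this cost rule in code; the function and variable names are ours, and x_total would be drawn from the truncated normal distribution described above.

```python
def overall_cost(k_same, k_diff, x_total, alpha_s, alpha_d, delta):
    """Eqn. (2): the cheaper of staying with the current provider or switching and paying delta."""
    k = k_same + k_diff
    stay = (alpha_s * k_same + alpha_d * k_diff) * x_total / k
    switch = (alpha_s * k_diff + alpha_d * k_same) * x_total / k + delta
    return min(stay, switch)

def should_switch(k_same, k_diff, x_total, alpha_s, alpha_d, delta):
    """Switching condition: the cost of staying is not lower than the cost of switching."""
    k = k_same + k_diff
    stay = (alpha_s * k_same + alpha_d * k_diff) * x_total / k
    switch = (alpha_s * k_diff + alpha_d * k_same) * x_total / k + delta
    return stay >= switch
```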
Table 1. Structural properties of networks

                   P(k)           L      C      r
  lattice          uniform        large  large  no
  ER random graph  Poisson        small  small  no
  WS model         Poisson-like   small  large  no
  BA model         power-law      small  small  no
  KE model         power-law      small  large  negative
  CNN model        power-law      small  large  positive
2.2 Model of Interaction Pattern by Complex Network Models In the present modeling, communication pattern among users in the market is modeled on a kind of social networks. It has been revealed by recent studies that social networks in the real world have a unique characteristics in their structures [2,3]. Examples include power-law degree distribution, short average path length, high clustering coefficients and positive degree correlations. These characteristics are surely considered to give an effect on dynamical processes on such social networks, which we assume are not be reproduced using simple structural models, such as lattices or random graphs. We assume to be able to realize the characteristics of real social networks on the market dynamics by considering network models as an interaction patterns of the multi agent modeling. In order to investigate into the effect, the main purpose of this paper is to analyze the dynamics driven by the above mentioned dynamical model under various structures of complex networks, and to find out suitable structural models for an analysis on real world problems. In the present paper, we use four complex network models: the WS model [4], the BA model [5], the KE model [6] and the CNN model [7]. The WS model is a socalled ‘small-world’ model that is based on a random rewiring procedure of the edges from a one-dimensional lattice. The remaining three models realize a power-law degree distribution. They are models of a mechanism of network growth, in which different types of growth algorithms produce different characteristics in other structural properties. The characteristics of the models are briefly summarized in Table 1. See related references for the details of the models. For comparison, we also test the relatively simple structures of the one-dimensional lattice and the Erd¨os-Reny´ı (ER) random graph.
3 Numerical Studies 3.1 Network Models First, we perform numerical simulations using the network models described in the previous section. The total amount of communication Xtotal is defined as a random value that follows normal distribution N (μ, σ 2 ), where μ = Xk and Xk/3, k is
Fig. 2. Temporal evolutions of a typical realization. k = 10, X = 100, αS = 20, αD = 25 and δ = 100. (A) regular lattice, (B) ER random graph, (C) WS network, (D) BA network, (E) KE network, (F) CNN network.
average degree, and X is a constant value. X_{total} = 0 if the probabilistic random variable N(\mu, \sigma^2) is less than zero. The number of agents (users) is set to 3,600. Initial conditions are given randomly. In the initial state, each of the two competing providers has a randomly selected 50% share of the users. Here, all of the parameters of the two providers are the same, which means that neither provider is advantageous. However, we found that the system reaches an equilibrium state as time progresses, in which one provider becomes advantageous (becomes the 'winner') at around a particular share. Examples of the temporal evolution of the shares in typical realizations are shown in Fig. 2. We focus on the winner's share after the equilibrium is reached. Figs. 3 and 4 show the average share of the winner as a function of the switch cost δ, compared for different average degrees and different amounts of communication. The values are averaged over the last 50 steps after t = 450, and averaged over 50 simulations with different initial conditions. Here we can observe a number of characteristics. From Fig. 3, the resulting phenomena can be divided into two classes. One is 'ordered' dynamics, which is observed in the ER random graph, the BA network and the WS network. In this class, the shares of the winner converge well to particular values, with little deviation. The convergence values are an increasing function of the switch cost δ. This class is also observed in the CNN network with large k; however, the convergence values are smaller than in the other three networks. The other is the 'fluctuated' class, in which the shares of the winner fluctuate with a large deviation. The average value of the share is a constant, or gradually decreasing, function of δ. In the regular lattice, the dynamics belongs to this class and the values are around 50%, which was the initial share. This is considered to be an inherent characteristic of the interaction model itself, without the effects of a complex network structure. In the KE network, the deviation is larger than in the regular lattice, so that the average values are rather large.
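For readers who want to reproduce this kind of run, the sketch below is one possible realization of the procedure described above, using networkx; the interpretation of a time step as N single-user updates, the per-update truncated normal draw, and the default parameters are our assumptions rather than details fixed by the paper.

```python
import random
import networkx as nx

def winner_share(g, alpha_s=20.0, alpha_d=25.0, delta=100.0, X=100.0, steps=500, seed=0):
    """One realization of the market model on graph g; returns the winner's final share."""
    rng = random.Random(seed)
    provider = {v: rng.choice("SD") for v in g}         # random initial shares, 50% on average
    k_mean = 2.0 * g.number_of_edges() / g.number_of_nodes()
    nodes = list(g.nodes())
    for _ in range(steps * len(nodes)):                 # one time step taken as N single-user updates
        i = rng.choice(nodes)
        k_i = g.degree(i)
        if k_i == 0:
            continue
        x_total = max(0.0, rng.gauss(X * k_mean, X * k_mean / 3.0))
        k_same = sum(1 for j in g[i] if provider[j] == provider[i])
        k_diff = k_i - k_same
        stay = (alpha_s * k_same + alpha_d * k_diff) * x_total / k_i
        switch = (alpha_s * k_diff + alpha_d * k_same) * x_total / k_i + delta
        if stay >= switch:
            provider[i] = "D" if provider[i] == "S" else "S"
    n_s = sum(1 for v in g if provider[v] == "S")
    return max(n_s, len(nodes) - n_s) / len(nodes)

# Example on one of the model networks, e.g. a WS graph with N = 3,600 and average degree 10:
# g = nx.watts_strogatz_graph(3600, 10, 0.1)
# print(winner_share(g))
```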
Fig. 3. Average share of the winner as a function of switch cost δ, comparison on average degree. X = 100, αS = 20 and αD = 25. (A) k = 8, (B) k = 10, (C) k = 14.
Fig. 4. Average share of the winner as a function of switch cost δ, comparison on amounts of communication. k = 10, αS = 20 and αD = 25. (A) X = 50, (B) X = 150.
From Fig. 4, the system tends to be 'fluctuated' if the amount of communication X is small and δ is large. However, which class occurs at a particular X and δ depends strongly on the network structure.

3.2 Real Social Network

We then perform simulations using the real structure of an e-mail correspondence network at a Spanish university [17]¹, which we assume reflects the characteristics of a human communication pattern in the real world. The data contain 1,133 users and 5,451 friendships (edges), so that k = 9.62. The clustering coefficient is C = 0.254, which is 10 times larger than that of a random graph with the same number of agents, and the average
¹ The dataset can be obtained from http://deim.urv.cat/%7Eaarenas/data/welcome.htm.
Fig. 5. Temporal evolutions of a typical realization on a real e-mail correspondence network. (A) δ = 0, (B) δ = 100, (C) δ = 2900.
Fig. 6. Average share of the winner as a function of switch cost δ on a real e-mail correspondence network, comparison on amounts of communication
path length is L = 3.606. The degrees of the network follow an exponential distribution [17]. Fig. 5 shows examples of the temporal evolution of shares in typical realizations with different δ. We can confirm that the system reaches an equilibrium state with the real data, as it does with the network models. Fig. 6 shows the convergence values of the average share of the winner for different values of X. From Fig. 6, we can still observe the 'ordered' and the 'fluctuated' classes, as in the network models. With large X, the system is 'ordered'; the convergence shares of the winner are an increasing function of δ, with little deviation. With smaller X, we can still observe 'ordered' dynamics as long as δ is small. For such values of X, the convergence shares of the winner are ordered and increase as δ increases, then reach a maximum at a particular value δ*. As δ increases further, δ > δ*, the system makes a transition to the 'fluctuated' class; the convergence
share decreases, with large deviations. With X ≪ δ, the convergence share reaches 50%, which was the initial share.
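The structural statistics quoted above for the e-mail network (k = 9.62, C = 0.254, L = 3.606) can be reproduced with a few networkx calls once the published edge list has been downloaded; the file name used below is a placeholder of ours, not the original distribution name.

```python
import networkx as nx

# Placeholder file name for the downloaded edge list of the URV e-mail network
g = nx.read_edgelist("email_urv_edges.txt", nodetype=int)

n, m = g.number_of_nodes(), g.number_of_edges()
k_mean = 2.0 * m / n                       # expected to give about 9.62 for 1,133 nodes and 5,451 edges
C = nx.average_clustering(g)               # reported as 0.254
# average_shortest_path_length requires a connected graph; the e-mail network is connected
L = nx.average_shortest_path_length(g)     # reported as 3.606
print(n, m, k_mean, C, L)
```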
4 Discussion and Conclusion
In the series of numerical studies, we found two classes, 'ordered' and 'fluctuated', in the dynamics driven by the proposed model. From the simulations, we can say that 'ordered' dynamics is likely to emerge for small δ, large k and large X. The classes depend on the network structure as well. The ER, BA and WS networks tend to exhibit 'ordered' dynamics in most cases, while the KE network and the lattice are always 'fluctuated'. The CNN network is intermediate between the two, which is also seen in the real data of the e-mail correspondence network. The convergence values of the winner's share also depend on the network structure. In practice, the winner's share can be interpreted as the magnitude of the 'Network Effect': if a stronger network effect works on the market, then the winner can consequently take a larger share. Therefore, from the simulation results we can analyze which network structures enhance or decrease the network effect, and which network model is suitable for studying the real social network. In summary, we investigated the influence of network structures on a process of market dynamics. The dynamical model is based on multi-agent modeling for an artificial market simulation, which is intended to represent the dynamics of the 'Network Effect'. A series of numerical studies using the simulation model showed two classes of dynamics, which are strongly dependent on the network structure. The simulation with empirical data shows a pattern like that of the CNN model, which implies that the CNN model reflects the characteristic structure of the real interaction patterns of users. In conclusion, it is confirmed that the structure of interaction patterns has a strong effect on the resulting dynamics. By using appropriate models of the real interaction patterns of users, it becomes possible to analyze market dynamics more precisely.
References 1. Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of Networks: From Biological Nets to the INternet and WWW. Oxford University Press, Oxford (2003) 2. Newman, M.E.J., Barab´asi, A.L., Watts, D.J.: The Structure and Dynamics of Networks. Princeton University Press, Princeton (2006) 3. Boccaletti, S., Latora, Y., Moreno, Y., Chavez, M., Hwang, D.U.: Complex networks: Structure and dynamics. Physics Report 424, 175–308 (2006) 4. Watts, D.J., Strogatz, S.H.: Collective dynamics of ’small-world’ networks. Nature 393, 440– 442 (1998) 5. Barab´asi, A.L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999) 6. Klemm, K., Egu´ıluz, V.M.: Highly clustered scale-free networks. Phys. Rev. E 65, 036123 (2002) 7. V´azquez, A.: Growing network with local rules: Preferential attachment, clustering hierarchy, and degree correlations. Phys. Rev. E 67, 056104 (2003)
8. Uchida, M., Shirayama, S.: Classification of networks using network functions. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4488, pp. 649– 656. Springer, Heidelberg (2007) 9. Uchida, M., Shirayama, S.: Network effect in complex market structures. In: Proceedings of Web Intelligence and Intelligent Agent Technology (WI-IAT 2007) (Workshops), Workshop on Multiagent Systems in E-Business: Concepts, Technologies and Applications (MASeB 2007), Sillicon Valley, November 2007, pp. 449–453 (2007) 10. Katz, M.L., Shapiro, C.: Network externalities, competition, and compatibility. American Economic Reviews 75(3), 424–440 (1985) 11. Church, J., Gandal, N.: Network effects, software provision and standardization. The Journal of Industrial Economics 40(1), 85–103 (1992) 12. Katz, M.L., Shapiro, C.: Product introduction with network externalities. The Journal of Industrial Economics 40(1), 55–83 (1992) 13. Arthur, W.B., Lane, D.A.: Information contagion. Structural Change and Economic Dynamics 4, 81–104 (1993) 14. Wendt, O., Westarp, F.: Determinants of diffusion in network effect markets. SFB 403 Research Report (2000) 15. Frels, J.K., Heisler, D., Geggia, J.A.: Standard-scope: an agent-based model of adoption with incomplete information and network externalities. In: Proceedings of 3rd International Workshop on CIEF, pp. 1219–1222 (2003) 16. Weitzel, T., Wendt, O., Westarp, F.: Reconsidering network effect theory. In: Proceedings of the 8th European Conference of Information Systems, pp. 484–491 (2002) 17. Guimer´a, R., Danon, L., D´ıaz-Guilera, A., Giralt, F., Arenas, A.: Self-similar community structure in a network of human interactions. Physical Review E 68, 065103 (2003)
When the Spatial Networks Split? Joanna Natkaniec and Krzysztof Kulakowski Faculty of Physics and Applied Computer Science, AGH–UST, al. Mickiewicza 30, PL-30059 Kraków, Poland {kulakowski,3natkani}@novell.ftj.agh.edu.pl http://www.zis.agh.edu.pl/
Abstract. We consider a three-dimensional spatial network, where N nodes are randomly distributed within a cube L × L × L. Each two nodes are connected if their mutual distance does not exceed a given cutoff a. We analyse numerically the probability distribution of the critical density ρc = N(ac/L)^3 at which one or more nodes become separated; ρc is found to increase with N as N^{0.105}, for N between 20 and 300. The results can be useful for the design of protocols to control sets of wearable sensors. Keywords: random graphs; spatial networks; extreme values.
1 Introduction
Recent interest in abstract networks is at least partially due to the fact that they are not bound by geometry. However, in many applications the networks are embedded in a metric space; then we speak of spatial networks [1], geographical networks [2], ad-hoc networks [3] or random geometric graphs [4]. If the connections between nodes are determined by their mutual distance, this embedding appears to be important for the properties of the system. The list of examples of spatial networks includes the Internet, the electricity power grid, transportation and communication networks and neuronal networks. We consider a three-dimensional spatial network, where N nodes are randomly distributed within a cube L × L × L. Each two nodes are connected if their mutual distance does not exceed a given cutoff a [1]. In the case of a uniform density of nodes the small-world property is absent, because the average shortest path increases linearly with the system size L. Here the small-world effect could appear in the case of unlimited dimensionality D of the network; once D is fixed, the effect disappears. In the literature, most papers are devoted to percolation problems; references can be found in [1],[4,5,6]. The percolation threshold can be identified with the critical density where the size of the largest connected component becomes of the order of the number of all nodes. Here we search for another critical density, where the size of the largest connected component becomes equal to the number of all nodes. In other words, we investigate the critical spatial density where at least one node is unconnected. This kind of threshold is different from the one in conventional percolation. In our opinion, the problem can be relevant for control sets of wearable sensors [7]. To give an example, a group of divers wants to keep contact, operating in
dark water. Their equipment secures communication between two divers only if the distance between them is short enough. In this example, it is crucial to maintain communication with all divers; no one can be lost. Having N divers, how large a volume of water can be safely penetrated? The problem has its physical counterpart; we can ask about the largest fluctuations of the density of an ideal gas. The probability distribution of this density is usually assumed to be Gaussian [8]. Then we should ask about the probability that the minimal density in a gas of N particles is not lower than some critical value proportional to a^{-3}. This question belongs to the statistics of extremes [9],[10]. However, the derivation of the Gaussian distribution itself relies on the assumption that different areas in the gas are statistically independent; in real systems this is not true. Then it makes sense to investigate the problem numerically. In the next section we describe the details of our calculations and the results. A short discussion in the last section closes the text.
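As a hedged illustration of the extreme-value estimate that the independence assumption mentioned above would give (the partition into M regions and the notation Φ are ours, not part of the original argument): if the cube were divided into M regions with statistically independent Gaussian densities ρ_m drawn from N(ρ̄, σ²), the probability that the minimal density stays above the critical value would simply factorize,

```latex
P\!\left(\min_{1\le m\le M}\rho_m \ge \rho_c\right)
  \;=\;\left[\,1-\Phi\!\left(\frac{\rho_c-\bar\rho}{\sigma}\right)\right]^{M},
\qquad
\Phi(z)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-t^{2}/2}\,\mathrm{d}t ,
```

and it is precisely the failure of this independence in a real spatial network that motivates the numerical approach taken below.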
2 Calculations and Results
A set of N points is randomly distributed with uniform probability in a cube L × L × L. We set L = 4 and we vary a and N; then the density is ρ = N(a/L)^3. A link is set between each two nodes if their mutual distance is less than a. The simplest method is to generate the positions of the nodes and to vary the radius a; for each a, the connectivity matrix is created and investigated. We are interested in the critical density ρc, where some nodes become unconnected from the others. The simplest way is to calculate the percentage p of isolated nodes per N; if there is a phase transition, p could play the role of an order parameter. The problem is that in this way, splittings of the network into
Fig. 1. The number of unconnected nodes against the radius a for N = 10^4
Fig. 2. The probability distribution of the critical density ρc for various N
Fig. 3. The averaged critical density ρc against the system size N. The results can be fitted equally well with the functions f(N) = 1.56755 × N^{0.105309} and g(N) = 0.264698 × ln(N) + 1.33571.
larger pieces are disregarded; the advantage is that the code works quickly and larger networks can be investigated. In Fig. 1 we show Np as dependent on a for N = 10^4; as we see, the variation is not sharp. Therefore we cannot decide if there is a phase transition or just a crossover. Other results are obtained for smaller systems, but in each case the algorithm detects the splitting of the whole network into pieces of any size. In Fig. 2 we present the probability distribution of the critical density ρc for selected sizes of the system. For each N, these results were obtained from 10^4 randomly generated
Fig. 4. The average value of the network diameter d against the radius a for various N
Fig. 5. The mean value of the network diameter d at the critical radius ac as dependent on the system size N. The results can be fitted with f(N) = 1.18834 × N^{0.250106}. This curve does not depend on the model value of L.
networks. In Fig. 3 we show the mean value of ρc, as it increases with the system size N. This dependence appears to be very slow; it can be fitted as proportional to N^{0.105} or, alternatively, as 1.336 + 0.265 × ln(N). We also calculate the network diameter d; this is the mean shortest path between nodes, calculated as the number of links between them. Obviously, d decreases with a, and it becomes infinite when the network is disconnected. The
Fig. 6. The degree distribution averaged over 10^4 networks of N = 170 nodes at a_c = 1.005 (crosses), compared with the log-normal distribution (f(k)) and the Poisson distribution (g(k)) with the same mean degree ⟨k⟩ = 8.322
calculations are done with the Floyd algorithm [11], for a > a_c. The results are shown in Fig. 4. Fig. 5 reproduces the values of d at the threshold, where the network splits. In Fig. 6 we show the degree distribution of the network for N = 170 and a = a_c = 1.005. As we see, the distribution differs from both the Poisson distribution and the log-normal distribution with the same mean degree. This difference may be due to correlations between the numbers of nodes in different spheres.
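The numerical procedure described above (scattering N points uniformly in an L × L × L cube, linking pairs closer than a, counting isolated nodes and computing the mean shortest path with the Floyd algorithm) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the parameter values in the example call are chosen only for demonstration.

```python
import numpy as np

def spatial_network(N, L, a, rng):
    """Scatter N points uniformly in an L x L x L cube; link pairs closer than a."""
    pos = rng.uniform(0.0, L, size=(N, 3))
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    return (dist < a) & ~np.eye(N, dtype=bool)

def isolated_fraction(adj):
    """Percentage p of isolated nodes, the candidate order parameter of the text."""
    return np.mean(adj.sum(axis=1) == 0)

def mean_shortest_path(adj):
    """Network diameter d of the text (mean shortest path) via Floyd-Warshall;
    returns np.inf when the network is disconnected."""
    N = adj.shape[0]
    d = np.where(adj, 1.0, np.inf)
    np.fill_diagonal(d, 0.0)
    for k in range(N):
        d = np.minimum(d, d[:, k:k + 1] + d[k:k + 1, :])
    off_diag = d[~np.eye(N, dtype=bool)]
    return off_diag.mean() if np.isfinite(off_diag).all() else np.inf

rng = np.random.default_rng(0)
adj = spatial_network(N=300, L=4.0, a=1.0, rng=rng)
print(isolated_fraction(adj), mean_shortest_path(adj))
```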
3 Discussion
It is obvious that the probability that at some point the density will be lower than the critical value increases with the system size. This increase is compensated by the decrease of the critical density and, subsequently, an increase of the critical cutoff with N. The question is how this cutoff increases. The results shown in Fig. 3 indicate that this increase is rather slow. Our numerical method does not allow us to distinguish between a power law with a small exponent and a logarithmic law. The data on the mean shortest path d as a function of a can be used to design protocols for communication between sensors. An example of such a protocol could be that a signal 'zero' detected by one sensor is sent to its neighbors, which after a time τ should reproduce it once, adding one to its content. The number received by the last sensor is just the shortest path between this sensor and the one which initialized the series. At the threshold density the communication is partially broken; a high value of d near the threshold should activate the message 'go back to the others' at those sensors which receive a high value of the signal. Here,
the data shown in Fig. 5 can be useful. How to design routing protocols in ad hoc networks is a separate branch of computer science [12]. In fact, our Monte Carlo simulations can be seen as an attempt to sample the phase space of the system. The probability that the contact is broken can be interpreted dynamically as the percentage of time during which the communication is incomplete. If the trajectory wanders randomly, all nodes can happen to be connected again. Acknowledgments. The authors are grateful to Sergei N. Dorogovtsev and Paul Lukowicz for helpful suggestions.
References 1. Herrmann, C., Barth´elemy, M., Provero, P.: Connectivity distribution of spatial networks. Phys. Rev. E 68, 026128 (2003) 2. Huang, L., Yang, L., Yang, K.: Geographical effects on cascading breakdowns of scale-free networks. Phys. Rev. E 73, 036102 (2006) 3. Stepanov, I., Rothermel, K.: Simulating mobile ad hoc networks in city scenarios. Computer Commun 30, 1466–1475 (2007) 4. Dall, J., Christensen, M.: Random geometric graphs. Phys. Rev. E 66, 016121 (2002) 5. Sen, P.: Phase transitions in Euclidean networks: A mini-review. Phys. Scripta. T 106, 55–58 (2003) 6. Manna, S.S., Mukherjee, G., Sen, P.: Scale-free network on a vertical plane. Phys. Rev. E 69, 17102 (2004) 7. Kunze, K., Lukowicz, P., Junker, H., Troster, G.: Where am I: Recognizing on-body positions of wearable sensors. In: Strang, T., Linnhoff-Popien, C. (eds.) LoCA 2005. LNCS, vol. 3479, pp. 264–275. Springer, Heidelberg (2005) 8. Reichl, L.E.: A Modern Course in Statistical Physic, p. 353. John Wiley and Sons Inc., Chichester (1998) 9. Gumbel, E.J.: Statistics of Extremes. Columbia UP, New York (1958) 10. Coles, S.: An Introduction to Statistical Modeling of Extreme Values. Series in Statistics. Springer, London (2001) 11. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms. MIT Press, Cambridge (2001) 12. Royer, E.M., Toh, C.K.: A review of current routing protocols for ad hoc mobile wireless networks. IEEE Personal Communications, 46–55 (1999)
Search of Weighted Subgraphs on Complex Networks with Maximum Likelihood Methods Marija Mitrović1 and Bosiljka Tadić2 1
Institute of Physics, Belgrade, Serbia 2 Jožef Stefan Institute, Ljubljana, Slovenia [email protected], [email protected] http://scl.phy.bg.ac.yu, http://www-f1.ijs.si/~tadic
Abstract. Real-data networks often appear to have strong modularity, or network-of-networks structure, in which subgraphs of various size and consistency occur. Finding the respective subgraph structure is of great importance, in particular for understanding the dynamics on these networks. Here we study modular networks using a generalized method of maximum likelihood. We first demonstrate how the method works on computer-generated networks with subgraphs of controlled connection strengths and clustering. We then implement the algorithm based on the weights of links and show its efficiency in finding weighted subgraphs on a fully connected graph and on a real-data network of yeast. Keywords: modular networks, subgraphs, maximum likelihood method.
1 Introduction
Complex dynamical systems can be adequately represented by networks with a diversity of structural and dynamical characteristics [1], [2] and [3]. Often such networks appear to have a multiscale structure with subgraphs of different sizes and topological consistency. Some well known examples include gene modules on genetic networks [4], social community structures [5], topological clusters [6] or dynamical aggregation on the Internet, to mention only a few. It has been understood that in evolving networks some functional units may have emerged as modules or communities that can be topologically recognized by better or tighter connections. Finding such substructures is therefore of great importance, primarily for understanding a network's evolution and function. In recent years great attention has been devoted to the problem of community structure in social and other networks, where a community is topologically defined as a subgraph of nodes with better connections among its members than between the subgraphs, [5] and [7]. A variety of algorithms has been developed and tested; a comparative analysis of many such algorithms can be found in [5]. Most such algorithms are based on the maximal-flow–minimal-cut theorem [8], where naturally, maximum topological flow falls on the links between the communities. Recently a new approach was proposed based on the maximum-likelihood method [9]. In the maximum-likelihood method an assumed mixture model is fit to a given data set. Assuming
that the network nodes can be split into G groups, where the group memberships are unknown, the expectation-maximization algorithm is used in order to find the maximum of the likelihood that suits the model. As a result, a set of probabilities that a node belongs to a certain group is obtained. The probabilities corresponding to the global maximum of the likelihood are expected to give the best split of the network into the given number of groups. In complex dynamical networks, however, other types of substructures may occur that are not necessarily related to a "better connectivity" measure. Generally, the substructures may be differentiable with respect to certain functional (dynamical) constraints, such as limited path length (or cost), weighted subgraphs, or subgraphs that are synchronizable at a given time scale. The search for such types of substructures may require new algorithms adapted to the respective dynamical constraints. In this work we adapt the maximum-likelihood methods to study subgraphs with weighted links in real and computer-generated networks. We first introduce a new model to generate a network-of-networks with a controlled subgraph structure and implement the algorithm of [9] to test its limits and its ability to find the a priori known substructures. We then generalize the algorithm to incorporate the weights on the links and apply it to find the weighted subgraphs on an almost fully connected random graph with known weighted subgraphs and on a real-data network of yeast gene-expression correlations.
2 Network of Networks: Growth Algorithm and Structure
We introduce an algorithm for network growth with controlled modularity. As a basis, we use the model with preferential attachment and preferential rewiring [10], which captures the statistical features of the World Wide Web. Two parameters α̃ and α control the emergent structure of the Webgraph when the average number of links per node M is fixed. For instance, for M = 1: when α̃ < 1 the emergent structure is a scale-free clustered and correlated network; in particular the case α̃ = α = 1/4 corresponds to the properties measured in the WWW [10]; when α̃ = 1 a scale-free tree structure emerges with the exponents depending on the parameter α. Here we generalize the model in a nontrivial manner to permit the development of distinct subnetworks or modules. The number of different groups of nodes is controlled by an additional parameter Po. Each subgroup evolves according to the rules of the Webgraph. At each time step t we add a new node i and M new links. With probability Po a new group is started. The added node is assigned the current group index. (The first node belongs to the first group.) The group index plays a crucial role in linking the node to the rest of the network. The links are created by attaching the added node inside the group, with probability α̃, or else by rewiring within the entire network. The target node k is selected preferentially with respect to the current situation in the group, which determines the linking probability p_in(k, t).
Similarly, the node n which rewires a link is selected according to its current number of outgoing links, which determines the probability p_out(n, t):

$$p_{in}(k,t) = \frac{M\alpha + q_{in}(k,t)}{tM\alpha + L_{g_k}(t)}\,, \qquad p_{out}(n,t) = \frac{M\alpha + q_{out}(n,t)}{tM(\alpha + 1)}\,, \qquad (1)$$
where q_in(k, t) and q_out(n, t) are the in- and out-degrees of the respective nodes at time step t, tM(α + 1) is the number of links in the whole network, while L_{g_k}(t) is the number of links between nodes in the group of node k. It is assumed that q_in(i, i) = q_out(i, i) = 0. The suggested linking rules ensure the existence of modules in the network. Each group has a central node (hub), in terms of in-degree connectivity, and a set of nodes along which it is connected with the other groups. The number of groups G in the network depends on the number of nodes N and the parameter Po as G ∼ N Po. Some emergent modular structures are shown in Fig. 1. For the purpose of this work we only mention that the networks grown using the above rules are scale-free with both incoming and outgoing links distributed as (κ = "in" or "out"):

$$P(q_\kappa) \sim q_\kappa^{-\tau_\kappa}. \qquad (2)$$

The scaling exponents τ_in and τ_out vary with the parameters α and Po. In Fig. 2 we show the cumulative distribution of in- and out-links in the case of N = 25000 nodes, M = 5, α̃ = α = 0.9 and number of groups G = 6. The slopes are τ_in − 1 = 1.616 ± 0.006 and τ_out − 1 = 7.6 ± 0.3.
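A schematic implementation of this growth rule, under our reading of Eq. (1), might look as follows. It is only a sketch: the handling of a freshly opened (still empty) group and the normalization of the in-group attachment probabilities are our own simplifying assumptions, and the default parameter values are illustrative.

```python
import numpy as np

def grow_modular_network(N, M=5, alpha=0.9, alpha_tilde=0.9, Po=0.002, seed=0):
    """Grow a directed modular network: new node + M links per step, a new group
    opened with probability Po, preferential linking as in Eq. (1)."""
    rng = np.random.default_rng(seed)
    group = [0]                # group index of every node; the first node opens group 0
    q_in, q_out = [0], [0]     # in- and out-degrees
    links = []                 # directed links (source, target)
    n_groups = 1
    for t in range(1, N):
        if rng.random() < Po:                     # start a new group
            n_groups += 1
        g = n_groups - 1
        group.append(g); q_in.append(0); q_out.append(0)
        members = [j for j in range(t) if group[j] == g] or list(range(t))
        for _ in range(M):
            # target k chosen preferentially within the group (p_in of Eq. (1))
            w_in = np.array([M * alpha + q_in[k] for k in members], dtype=float)
            k = members[int(rng.choice(len(members), p=w_in / w_in.sum()))]
            if rng.random() < alpha_tilde:        # attach the new node directly ...
                src = t
            else:                                 # ... or rewire from an existing node (p_out)
                w_out = np.array([M * alpha + q_out[n] for n in range(t)], dtype=float)
                src = int(rng.choice(t, p=w_out / w_out.sum()))
            links.append((src, k)); q_out[src] += 1; q_in[k] += 1
    return links, group
```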
Fig. 1. Network of networks generated by the algorithm described above for N = 1000 nodes and different combinations of the control parameters α, α̃, Po and M. Different colors represent topological subgraphs as found by the maximum-likelihood method.
These networks with controlled modularity will be considered in the next section to test the maximum-likelihood algorithm for finding subgraphs. In addition, we will apply the method to a gene network, in which the modular structure is not known. The network is based on the empirical data of gene expressions for a set of 1216 cell-cycle genes of yeast measured at several points along the cell cycle, selected from [11]. The pairs of genes are connected with weighted links
according to their expression correlation coefficient. In such a network, for correlations exceeding a critical value W0c a percolation-like transition occurs where functionally related clusters of genes join the giant cluster [12]. However, below that point the network is too dense and separation of the modules becomes difficult. The topological betweenness-centrality measures for both nodes and links in the gene network, shown in Fig. 2 (right), exhibit broad distributions, suggesting a nontrivial topology of the network.
Fig. 2. (left) Cumulative distribution of incoming and outgoing links for the grown modular network. (right) Distribution of betweenness-centrality for nodes and links in the gene-expression network of yeast.
3 Maximum Likelihood Methods for Weighted Graphs
The method is based on a mixture model and a numerical technique known as the expectation-maximization algorithm. We first describe the basic idea for unweighted networks [9], and then generalize the method for weighted graphs.

3.1 Theoretical Background
A network of N nodes, directed or undirected, is represented mathematically by an adjacency matrix A with N × N elements. The elements A_ij, taking values (1,0), represent the presence/absence of a link between nodes. The idea is to construct a mixture model network partitioned into G groups, where members of the groups are similar in some sense and the numbers g_i denote the group to which vertex i belongs [9]. Group memberships are unknown; they are commonly referred to as "hidden" data. The basic idea is to vary the parameters of a suitable mixture model to find the best fit to the observed network. The model parameters are: θ_ri, defined as the probability that a link from some node in group r connects to node i, and π_r, representing the probability that a randomly chosen vertex falls in group r. The normalization conditions

$$\sum_r \pi_r = 1\,, \qquad \sum_i \theta_{ri} = 1\,, \qquad (3)$$
are required. The parameters can be estimated by the maximum likelihood criterion using the expectation-maximization algorithm. In the present case the problem reduces to the maximization of the likelihood Pr(A, g|π, θ) with respect to π and θ. Using the factorization rule, we can write Pr(A, g|π, θ) in the following form:

$$\Pr(A, g|\pi, \theta) = \Pr(A|g, \pi, \theta)\Pr(g|\pi, \theta)\,, \qquad (4)$$

where

$$\Pr(A|g, \pi, \theta) = \prod_{ij} \theta_{g_i,j}^{A_{ij}}\,, \qquad \Pr(g|\pi, \theta) = \prod_{i} \pi_{g_i}\,. \qquad (5)$$

Combining Eqs. (4) and (5) we obtain

$$\Pr(A, g|\pi, \theta) = \prod_{i} \pi_{g_i} \prod_{j} \theta_{g_i,j}^{A_{ij}}\,. \qquad (6)$$
It is common to use the logarithm of the likelihood instead of the likelihood itself [9]. In addition, the log-likelihood has to be averaged over the distribution of group memberships g, Pr(g|A, π, θ), leading to

$$\overline{L} = \sum_{g_1=1}^{G} \cdots \sum_{g_n=1}^{G} \Pr(g|A,\pi,\theta) \sum_{i} \Big[\ln \pi_{g_i} + \sum_{j} A_{ij}\ln\theta_{g_i,j}\Big] = \sum_{ir} q_{ir}\Big[\ln \pi_r + \sum_{j} A_{ij}\ln\theta_{r,j}\Big]\,, \qquad (7)$$
where q_ir = Pr(g_i = r|A, π, θ) represents the probability that node i belongs to group r. Using again the factorization rule for Pr(A, g_i = r|π, θ), in the case when A represents the missing data, we find the expression for q_ir:

$$q_{ir} = \Pr(g_i = r|A,\pi,\theta) = \frac{\Pr(A, g_i = r|\pi,\theta)}{\Pr(A|\pi,\theta)} = \frac{\pi_r \prod_j \theta_{rj}^{A_{ij}}}{\sum_s \pi_s \prod_j \theta_{sj}^{A_{ij}}}\,. \qquad (8)$$
Now we can use q_ir given by Eq. (8) to evaluate the expected value of the log-likelihood and to find π_r and θ_ri which maximize it. The maximization can be carried out analytically. Using the method of Lagrange multipliers to enforce the normalization conditions in (3), we find relations for the parameters π_r and θ_ri:

$$\pi_r = \frac{\sum_i q_{ir}}{n}\,, \qquad \theta_{ri} = \frac{\sum_j A_{ji} q_{jr}}{\sum_j q_{out}(j)\, q_{jr}}\,. \qquad (9)$$

In the numerical implementation, starting from an initial partitioning and iterating Eqs. (8) and (9) towards convergence, we determine the modular structure, which is defined by the quantities q_ir. In practice, the runtime to convergence depends on the number of nodes and the number of groups G. In practical calculations, the clustering algorithm converges to one out of many local maxima, which is sensitive to the initial conditions. Hence the choice
of initial values of the model parameters is a non-trivial step. The obvious unbiased choice of the initial values is the symmetric point with π_r^0 = 1/G and θ_ri^0 = 1/N, which is consistent with the normalization conditions for π^0 and θ^0. Unfortunately, this represents a trivial fixed point of the iterations. Instead, we find that starting values that are perturbed randomly at a small distance from the fixed point are better in terms of convergence of the algorithm to the right maxima of the expected value of the log-likelihood. After initialization of the parameters, we compute q_ir^0 using Eq. (8) and then π_r^1 and θ_ri^1 according to Eq. (9), etc. After some number of iterative steps, the algorithm will converge to a local maximum of the likelihood. In order to find the global maximum, it is recommendable to perform several runs with different initial conditions. We test the MLM algorithm on the networks grown by the model presented in Section 2, for a wide range of values of the parameters M, P0, α and α̃. As we expected, the algorithm works well on networks in which the modules with respect to connectivity (most of the links are between vertices inside the group) are well defined, which is the case for large parameter α̃. The partitions are shown in Fig. 1: the clusters of the network found by the algorithm correspond perfectly to the division derived from the growth. The group membership suggested by the algorithm and the original one are the same for 98% of the nodes in the network. The algorithm even finds the "connector" nodes that play a special role in each group. However, it is less efficient for networks with a large density of links between groups, as for instance for networks with α̃ ≈ 0.6 or lower. Some other observations: the number and size of different groups do not affect the efficiency of the algorithm; the size and sparseness of the network affect the convergence time. As a weak point, the number of groups G is an input parameter.
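For illustration, the E-step of Eq. (8) and the M-step of Eq. (9), together with the randomly perturbed symmetric starting point discussed above, can be coded compactly. This is a sketch of the procedure as we read it, not the authors' implementation; the clipping of θ and the number of iterations are our own practical choices.

```python
import numpy as np

def em_partition(A, G, n_iter=200, eps=1e-3, seed=0):
    """Mixture-model partition of an adjacency matrix A into G groups, Eqs. (8)-(9)."""
    rng = np.random.default_rng(seed)
    N = A.shape[0]
    # symmetric point 1/G, 1/N perturbed at a small random distance (see text)
    pi = np.full(G, 1.0 / G) + eps * rng.random(G)
    pi /= pi.sum()
    theta = np.full((G, N), 1.0 / N) + eps * rng.random((G, N))
    theta /= theta.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step, Eq. (8), computed in log space for numerical stability
        log_q = np.log(pi)[None, :] + A @ np.log(theta).T        # shape (N, G)
        log_q -= log_q.max(axis=1, keepdims=True)
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)
        # M-step, Eq. (9); normalizing the rows of theta realizes the denominator
        pi = q.sum(axis=0) / N
        theta = np.clip((A.T @ q).T, 1e-12, None)                # theta_ri ~ sum_j A_ji q_jr
        theta /= theta.sum(axis=1, keepdims=True)
    return q, pi, theta
```

Since only a local maximum of the likelihood is reached, several runs with different seeds should be compared, and the q_ir of the best run define the partition.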
3.2 Generalization of the Algorithm for Weighted Networks
The topological structure of networks, usually expressed by the presence or absence of links, can be considerably altered when links or nodes acquire different weights. Here we modify the algorithm presented in Section 3.1 in order to take into account weighted networks with modular structure. The main idea is based on the fact that a weighted link between a pair of nodes on the network can be considered as multiple links between that pair of nodes. Then a straightforward generalization of the MLM is to apply the mixture model and expectation-maximization algorithm described above to the multigraph constructed with the appropriate number of links between pairs of nodes. The quantities to be considered are: W_ij, the measured matrix of weights; g_i, the missing data; and the model parameters {π_r, θ_ri}. Following the same steps as above leads to the following expressions which are relevant for the algorithm:

$$q_{ir} = \frac{\pi_r \prod_j \theta_{rj}^{W_{ij}}}{\sum_s \pi_s \prod_j \theta_{sj}^{W_{ij}}}\,, \qquad (10)$$
$$\pi_r = \frac{\sum_i q_{ir}}{n}\,, \qquad \theta_{ri} = \frac{\sum_j W_{ji} q_{jr}}{\sum_j l_j q_{jr}}\,, \qquad (11)$$
where l_j = ∑_i W_ji is the sum over all weights of links emanating from the node j. The implementation of the algorithm and the choice of initial values of the parameters are as described in the previous section. Although the formal analogy between the expressions in Eqs. (9) and (11), and also between Eqs. (8) and (10), occurs with W_ij → A_ij, an important difference occurs in the quantity l_j in Eq. (11), which is a measure of the strength of a node rather than of its connectivity. Therefore, within the weighted algorithm, nodes of the same strength appear to belong to the same community. The weighted communities may have important effects in the dynamics; for instance, clusters of nodes with the same strength tend to synchronize at the same time scale. Such properties of the weighted networks remain elusive for the classical community structure analysis based on the max-flow–min-cut theorem mentioned in the introduction. In the remaining part of the section we apply the algorithm to two weighted networks: first we demonstrate how it finds weighted subgraphs on a computer-generated random graph with a large density of links, and then on a gene expression network of yeast generated from the empirical expression data. The results are given in Fig. 3. A random graph of N = 100 nodes with a link between each pair of nodes occurring with probability p = 0.5 is generated. In this homogeneously connected graph we create G = 4 groups and assign different weights to the links in each group of nodes. In the network of yeast gene expressions, as described in Section 2, the weights of links appear through the correlation coefficient of the gene expressions. As Fig. 3 shows, the algorithm accurately retrieves the four a priori known weighted groups of nodes on the random graph. Similarly, in the case of the gene network, we tried several partitions with
Fig. 3. Weighted clusters found by the extended ML algorithm in the correlation network of gene expressions of yeast (left), and in the weighted random graph (right)
potentially different numbers of groups. One such partition with G = 5 groups is shown in Fig. 3. Each weighted group on the gene network actually represents a set of genes which are closely co-expressed during the entire cell cycle.
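The weighted version amounts to replacing A_ij by W_ij in the E- and M-steps and normalizing with the strengths l_j, cf. Eqs. (10)-(11). A minimal sketch, with the same caveats as the unweighted one given in Sect. 3.1, is shown below.

```python
import numpy as np

def em_partition_weighted(W, G, n_iter=200, eps=1e-3, seed=0):
    """Weighted mixture-model partition, Eqs. (10)-(11); W is the matrix of weights."""
    rng = np.random.default_rng(seed)
    N = W.shape[0]
    pi = np.full(G, 1.0 / G) + eps * rng.random(G)
    pi /= pi.sum()
    theta = np.full((G, N), 1.0 / N) + eps * rng.random((G, N))
    theta /= theta.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        log_q = np.log(pi)[None, :] + W @ np.log(theta).T         # Eq. (10) in log space
        log_q -= log_q.max(axis=1, keepdims=True)
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)
        pi = q.sum(axis=0) / N                                    # Eq. (11), first relation
        theta = np.clip((W.T @ q).T, 1e-12, None)                 # numerator: sum_j W_ji q_jr
        theta /= theta.sum(axis=1, keepdims=True)                 # denominator: sum_j l_j q_jr
    return q
```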
4 Conclusions
We have extended the maximum-likelihood method of community analysis to incorporate multigraphs (wMLM) and analysed several types of networks with mesoscopic inhomogeneity. Our results show that the extended wMLM can be efficiently applied to search for a variety of subgraphs, from a clear topological inhomogeneity with network-of-networks structure on one end, to hidden subgraphs of nodes with the same strength on the other. Acknowledgments. Research supported in part by national projects P1-0044 (Slovenia) and OI141035 (Serbia), bilateral project BI-RS/08-09-047 and COST-STSM-P10-02987 mission. The numerical results were obtained on the AEGIS e-Infrastructure, supported in part by EU FP6 projects EGEE-II, SEE-GRID-2, and CX-CMCS.
References 1. Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of Networks: From Biology to the Internet and the WWW, ch. 27, p. 543. Oxford University Press, Oxford (2003) 2. Boccaletti, S., Latora, V., et al.: Complex Networks: Structure and Dynamics. Physics Reports 424, 175–308 (2007) 3. Tadić, B., Rodgers, G.J., Thurner, S.: Transport on Complex Networks: Flow, Jamming & Optimization. Int. J. Bifurcation and Chaos 17, 2363–2385 (2007) 4. Ravasz, E., Somera, A., Mongru, D.A., Oltvai, Z.N., Barabási, A.L.: Hierarchical Organization of Modularity in Metabolic Networks. Science 297, 1551 (2002) 5. Danon, L., Diaz-Guilera, A., Arenas, A.: Effect of size heterogeneity on community identification in complex networks. J. Stat. Mechanics: Theory & Experiment, P11010 (2006) 6. Flake, G.W., Lawrence, S.R., Giles, C.L., Coetzee, F.M.: Self-organization and identification of Web communities. IEEE Computer 35, 66–71 (2002) 7. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004) 8. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn., ch. 26, pp. 643–700. MIT Press and McGraw-Hill (2001) 9. Newman, M.E.J., Leicht, E.A.: Mixture models and exploratory analysis in networks. PNAS 104, 9564 (2007) 10. Tadić, B.: Dynamics of directed graphs: the world-wide Web. Physica A 293, 273–284 (2001) 11. Cho, R.J., et al.: A Genome-Wide Transcriptional Analysis of the Mitotic Cell Cycle. Molecular Cell 2, 65–73 (1998), http://arep.med.harvard.edu/cgi-bin/ExpressDByeas 12. Živković, J., Tadić, B., Wick, N., Thurner, S.: Statistical Indicators of Collective Behavior and Functional Clusters in Gene Expression Network of Yeast. European Physical Journal B 50, 255 (2006)
Spectral Properties of Adjacency and Distance Matrices for Various Networks Krzysztof Malarz AGH University of Science and Technology, Faculty of Physics and Applied Computer Science, al. Mickiewicza 30, PL-30059 Kraków, Poland [email protected], http://home.agh.edu.pl/malarz/
Abstract. The spectral properties of the adjacency (connectivity) and distance matrices for various types of networks: exponential, scale-free (Albert–Barabási) and classical random ones (Erdős–Rényi) are evaluated. The graph spectra for dense graphs in the Erdős–Rényi model are derived analytically. Keywords: Eigensystem; growing networks; classical random graphs; computer simulations.
1 Introduction
Studies of the network structure seem to be essential for a better understanding of many real-world complex systems [1–3]. Among these systems are social [4–15], economic [16–20], biological [21–24] systems or networks sensu stricto [25–37] like the Internet or the World Wide Web. In the latter case effective algorithms for WWW content search are particularly desired. The Google search engine is based on the eigenvector centrality [3, 38–40], which is well known in social network analysis and is not different from the Brin and Page algorithm [3, 41]. In this algorithm each vertex i of the network is characterized by a positive weight w_i proportional to the sum ∑_j w_j of the weights of all vertexes which point to i, where w_i are elements of the i-th eigenvector w of the graph adjacency matrix A:

$$A\mathbf{w} = \lambda\mathbf{w}. \qquad (1)$$

The concept of eigenvector centrality allows one to distinguish between the different importance of the links and thus is much richer than degree or node centrality [42]. The adjacency matrix A of a network with N nodes is a square N × N matrix whose elements a(i, j) show the number of (directed) links from node i to j. For an undirected network this matrix is symmetric. For simple graphs (where no multiple edges are possible) this matrix is binary: a(i, j) = 1 when nodes i–j are linked together, else a(i, j) = 0. The set of eigenvalues (or its density ρ_A(λ)) of the adjacency matrix A is called a graph/network spectrum. The graph spectrum was examined [43, 44] for classical random graphs (Erdős–Rényi, ER) [45, 46] and investigated numerically for scale-free networks [47] by Farkas et al. [48, 49]. The spectra of complex networks were derived exactly for infinite
random uncorrelated and correlated tree-like graphs by Dorogovtsev et al. [50]. Several other examples of network properties obtained by studies of graph spectra are given in Refs. [51–56]. While many papers refer to the eigenvalues of adjacency matrices A, less is known about the spectra of distance matrices D. In the distance matrix D the element d(i, j) is the length of the shortest path between nodes i and j. On the other hand, a whole branch of topological organic chemistry for alkenes was developed for small graphs which symbolize the alkenes' structural formulas [57–63]. There, the spectral properties not only of the adjacency matrix A and the distance matrix D but also of their sum A + D were investigated. A detailed description of the distance matrix construction during the network growth for the various network types is given in Ref. [64]. Other solutions of this problem are also known; an example is the Floyd algorithm [65]. During the network growth nodes may be attached to the so-far existing nodes randomly or according to some preference P. When this preference is based on node connectivity k, P(k) ∝ k, the scale-free Albert–Barabási (AB) [47] networks appear. Purely random attachment (P(k) = const) leads to an exponential node degree distribution. New nodes may bring with them one (M = 1) or more (M ≥ 2) edges which serve as links to the pre-existing graph. For M = 1 a tree-like structure appears, while for M > 1 cyclic paths are available. Let us recall that the degree distributions π(k) are π(k) ∝ k^{-γ}, π(k) ∝ exp(−k) and Poissonian for AB, exponential and ER networks, respectively [1–3]. Here we study numerically¹ the graph spectra ρ_A(λ) for growing networks with exponential degree distribution for M = 1 and M = 2. We check the eigenvalue density ρ_D(λ) of the distance matrix D for AB, exponential and ER graphs. In the literature known to us these spectra have not been examined before. The graph spectrum ρ_A(λ) for dense graphs in the ER model is derived analytically in Sec. 2.1 as well. Here we profit much from Ref. [66].
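Returning to the eigenvector centrality of Eq. (1): the weights w can be obtained by simple power iteration on the adjacency matrix. The sketch below is only illustrative; it assumes a connected, non-negative adjacency matrix so that the iteration converges to the principal eigenvector.

```python
import numpy as np

def eigenvector_centrality(A, n_iter=1000, tol=1e-10):
    """Principal eigenvector w of Eq. (1) by power iteration;
    w_i is proportional to the summed weights of vertices pointing to i."""
    N = A.shape[0]
    w = np.ones(N) / N
    for _ in range(n_iter):
        w_new = A.T @ w                    # sum over vertices that point to i
        w_new /= np.linalg.norm(w_new)
        if np.linalg.norm(w_new - w) < tol:
            break
        w = w_new
    return w
```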
2 Results and Discussion
Here we show the densities of eigenvalues ρ(λ) for the matrices A and D for various kinds of networks. Results are averaged over N_run = 10^2 realizations of networks of N = 10^3 nodes.

2.1 Spectral Properties of Adjacency Matrix
For the adjacency matrix of ER graphs, the density of eigenvalues consists of two separated parts: the Wigner semicircle centered over λ = 0 with radius approximately equal to 2√(Np(1 − p)), and the single Frobenius–Perron principal eigenvalue near Np [43, 44, 67, 68] (see Fig. 1(a)). A detailed study of the graph spectrum for AB graphs may be found in Refs. [48, 49] by Farkas et al. There, a deviation from the semicircular law was observed
¹ With the LAPACK procedure http://www.netlib.org/lapack/double/dsyev.f
Fig. 1. Density of eigenvalues ρ_A(λ) for adjacency matrices A for (a) ER, (b) AB and (c) exponential networks with N = 10^3. The results are averaged over N_run = 100 simulations and binned (Δλ = 0.1). The isolated peaks in Fig. 1(a) correspond to the principal eigenvalue.
and ρ_A(λ) has a triangle-like shape with a power-law decay [48]. A very similar situation occurs for the exponential networks, but ρ_A(λ) at the top of the "triangle" is
now more rounded. Separated eigenvalues are not observed for this kind of network (see Fig. 1(b-c)). Let us discuss the spectrum of eigenvalues of adjacency matrices of dense graphs in the ER model [66]. The diagonal elements of these matrices are equal to zero, a(i, i) = 0, while the off-diagonal elements a(i, j) assume the value 1 with probability p or 0 with probability 1 − p. The elements a(i, j) above the diagonal are independent identically distributed random numbers with the probability distribution P(a(i, j)) = (1 − p)δ(a(i, j)) + pδ(1 − a(i, j)). This probability distribution of a(i, j) ≡ x has the mean value x_0 = ⟨x⟩ = p and the variance σ² = ⟨x²⟩ − ⟨x⟩² = p(1 − p). Universality tells us that the spectrum of random matrices does not depend on the details of the probability distribution but only on its mean value and variance:² the eigenvalue spectrum in the limit N → ∞ is identical for different distributions as long as they have the same mean and variance. In particular one can take a Gaussian distribution: $\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-(x - x_0)^2/2\sigma^2\right]$. Thus one can expect that the spectrum of adjacency matrices of ER graphs can be approximated for large N by the spectrum of matrices with continuous random variables which have the following probability distribution:

$$\prod_i \frac{da(i,i)}{\sqrt{2\pi}}\exp\left(-\frac{a(i,i)^2}{2\sigma^2}\right) \cdot \prod_{i<j} \frac{da(i,j)}{\sqrt{2\pi}}\exp\left(-\frac{(a(i,j)-p)^2}{2\sigma^2}\right). \qquad (2)$$

For the diagonal elements the distribution has mean equal to zero to reflect the fact that the corresponding adjacency matrix elements a(i, i) = 0. The last expression can be written in a compact form:

$$\mathcal{D}A\,\exp\left(-\frac{1}{2\sigma^2}\,\mathrm{tr}(A - pC)^2\right) = \mathcal{D}A\,\exp\left(-\frac{1}{2\sigma^2}\,\mathrm{tr}\,B^2\right), \qquad (3)$$

where
$$\mathcal{D}A \equiv (\sqrt{2\pi})^{-N(N+1)/2}\prod_i da(i,i)\prod_{i<j} da(i,j)$$

is the standard measure in the set of symmetric matrices. The matrix B is obtained from A by a shift B = A − pC, where C has the form:

$$C = \begin{pmatrix} 0 & 1 & 1 & \cdots & 1 & 1 & 1\\ 1 & 0 & 1 & \cdots & 1 & 1 & 1\\ 1 & 1 & 0 & \cdots & 1 & 1 & 1\\ & & & \ddots & & & \\ 1 & 1 & 1 & \cdots & 0 & 1 & 1\\ 1 & 1 & 1 & \cdots & 1 & 0 & 1\\ 1 & 1 & 1 & \cdots & 1 & 1 & 0 \end{pmatrix}. \qquad (4)$$
² If the variance is finite.
The spectrum of the matrix B is given by the Wigner semi-circle law [69–71]:

$$\rho_B(\lambda) = \frac{1}{2\pi N\sigma^2}\sqrt{4N\sigma^2 - \lambda^2}. \qquad (5)$$

It has a support $[-2\sqrt{N}\sigma,\, 2\sqrt{N}\sigma]$, where $\sigma = \sqrt{p(1-p)}$ as we calculated above. We want to determine the spectrum of A. It is a sum A = B + pC of the matrix B, for which we already know the spectrum (5), and of the matrix pC, whose spectrum consists of an (N − 1)-degenerate eigenvalue −p and one eigenvalue p(N − 1). The low eigenvalue −p mixes with the eigenvalues of B, leaving the bulk of the distribution (5) intact, while the eigenvalue p(N − 1) adds to the distribution a well separated peak in the position p(N − 1), far beyond the support $[-2\sqrt{Np(1-p)},\, 2\sqrt{Np(1-p)}]$ of the main part of the distribution:

$$\rho_A(\lambda) \approx \rho_B(\lambda) + \frac{1}{N}\delta(\lambda - p(N-1)). \qquad (6)$$
The considerations hold as long as p is finite. For sparse graphs, p ∼ 1/N → 0, one sees modifications to the presented picture [66]. We note that the matrix C is both the adjacency and the distance matrix for a complete graph. Thus two very sharp peaks at λ = −1 and λ = N − 1 constitute the complete graph spectrum:

$$\rho_C(\lambda) = \frac{N-1}{N}\delta(\lambda + 1) + \frac{1}{N}\delta(\lambda - (N-1)).$$

2.2 Spectral Properties of Distance Matrix
Spectra of the distance matrix, ρ_D(λ), of growing networks for trees (M = 1) and other graphs (M > 1) are quantitatively different. For trees the part of the spectrum for λ > 0 is wide and flat. Moreover, the positive and negative eigenvalues are well separated by a wide gap (see Fig. 2(b-c)) which increases with the network size N, as presented in Fig. 3. On the other hand, we do not observe any finite-size effect for the negative part of the spectrum. The density of negative eigenvalues of D (see Fig. 2) is very similar for the considered networks. The positive part of the spectrum for growing networks does not depend on the growth rule and is roughly the same for AB and exponential networks. For the complete graph D = A = C and the graph spectrum consists of the two sharp peaks mentioned earlier.
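As a numerical illustration of Sec. 2.1, the spectrum of an ER adjacency matrix and of the corresponding distance matrix can be obtained with a dense symmetric eigensolver (the footnote mentions the LAPACK routine dsyev; NumPy's eigvalsh calls an equivalent LAPACK routine). This sketch uses example parameters and assumes the generated graph is connected, since otherwise the distance matrix contains infinite entries.

```python
import numpy as np

def er_adjacency(N, p, rng):
    """Symmetric ER adjacency matrix with empty diagonal."""
    upper = np.triu(rng.random((N, N)) < p, k=1)
    return (upper | upper.T).astype(float)

def distance_matrix(A):
    """All shortest-path lengths d(i, j) from the Floyd algorithm."""
    N = A.shape[0]
    D = np.where(A > 0, 1.0, np.inf)
    np.fill_diagonal(D, 0.0)
    for k in range(N):
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D

rng = np.random.default_rng(1)
N, p = 1000, 0.05
A = er_adjacency(N, p, rng)
eig_A = np.sort(np.linalg.eigvalsh(A))
sigma = np.sqrt(p * (1 - p))
print("principal eigenvalue vs Np:", eig_A[-1], N * p)            # isolated peak, Eq. (6)
print("bulk edge vs 2*sqrt(N)*sigma:", eig_A[-2], 2 * np.sqrt(N) * sigma)
eig_D = np.linalg.eigvalsh(distance_matrix(A))                     # spectrum rho_D(lambda)
```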
3 Summary
In this paper the spectral properties of the adjacency matrices A and distance matrices D were investigated for various networks. For ER and AB networks the well known densities of eigenvalues ρ_A(λ) were reproduced. For the growing networks with attachment kernel P(k) = const
Fig. 2. Density of eigenvalues ρ_D for distance matrices D for (a) ER, (b) AB and (c) exponential networks with N = 10^3. The results are averaged over N_run = 100 simulations and binned (Δλ = 0.1). The graphs are horizontally shifted by M or p for better view.
the graph spectra are similar to those of the AB networks except for the center of the spectrum. For the complete graph two well separated peaks constitute the graph spectrum.
Fig. 3. Density of eigenvalues ρD (λ) for distance matrices D for AB trees with various network size N
The spectra of the distance matrix D differ quantitatively for trees and other graphs. In the case of trees (M = 1) the density of positive eigenvalues is very well separated from the part of the spectrum for λ < 0 and is extremely flat. Thus the specific shape of the distance matrix spectrum may be a signature of the absence of loops and cyclic paths in the network. Acknowledgments. The author is grateful to Zdzislaw Burda for valuable scientific discussion and to Krzysztof Kulakowski for critical reading of the manuscript. Part of the calculations was carried out in ACK CYFRONET AGH. The machine time on the HP Integrity Superdome is financed by the Polish Ministry of Science and Information Technology under Grant No. MNiI/HP I SD/AGH/047/2004.
References 1. Albert, R., Barabási, A.L.: Rev. Mod. Phys. 286, 47 (2002) 2. Dorogovtsev, S.N., Mendes, J.F.F.: Adv. Phys. 51, 1079 (2002) 3. Newman, M.E.J.: SIAM Rev. 45, 167 (2003) 4. Newman, M.E.J.: Phys. Rev. E64, 016131 (2001) 5. Newman, M.E.J.: Phys. Rev. E64, 016132 (2001) 6. Simkin, M.V., Roychowdhury, V.P.: Complex Syst. 14, 269 (2003) 7. Simkin, M.V., Roychowdhury, V.P.: Annals Improb. Res. 11, 24 (2005) 8. Erez, T., Moldovan, S., Solomon, S.: arXiv:cond-mat/0406695v2 9. Galam, S., Mauger, A.: Physica A323, 695 (2003) 10. Galam, S.: Physica A336, 49 (2004) 11. Galam, S.: Eur. Phys. J. B26, 269 (2002) 12. Stauffer, D.: arXiv:cond-mat/0204099v1 13. Galam, S.: Eur. Phys. J. B26, 269 (2002) 14. Proykova, A., Stauffer, D.: Physica A312, 300 (2002) 15. Solomon, S., Weisbuch, G., de Arcangelis, L., Jan, N., Stauffer, D.: Physica A277, 239 (2000)
16. Mantegna, R.N., Stanley, H.E.: Introduction to Econophysics. Cambridge University Press, Cambridge (2000) 17. Barra˜ no ´n, A.: arXiv:nlin/0404009v1 18. Hohnisch, M., Pittnauer, S., Stauffer, D.: arXiv:cond-mat/0308358v1 19. Makowiec, D., Gnaci´ nski, P., Miklaszeski, W.: arXiv:cond-mat/0307290v1 20. Goldenberg, J., Libai, B., Solomon, S., Jan, N., Stauffer, D.: Physica A284, 335 (2000) 21. Liljeros, F., Edling, C.R., Amaral, L.A.N., Stanley, H.E., Aberg, Y.: Nature 411, 907 (2001) 22. L¨ assig, M., Bastolla, A.-L., Manrubia, S.C., Valleriani, A.: Phys. Rev. Lett. 86, 4418 (2001) 23. Camacho, J., Guimer` a, R., Amaral, L.A.N.: Phys. Rev. E65, 030901(R) (2002) 24. Camacho, J., Guimer` a, R., Amaral, L.A.N.: Phys. Rev. Lett. 88, 228102 (2002) 25. Shargel, B., Sayama, H., Epstein, I.R., Bar-Yam, Y.: Phys. Rev. Lett. 90, 068701 (2003) 26. Magoni, D.: IEEE J. Selected Areas Commun. 21, 949 (2003) 27. Crucitti, P., Latora, V., Marchiori, M., Rapisarda, A.: Physica A320, 622 (2003) 28. Motter, A.E., Nishikawa, T., Ying-Cheng, L.: Phys. Rev. E66, 65103 (2002) 29. Lin, G.-J., Cheng, X., Ou-Yang, Q.: Chinese Phys. Lett. 20, 22 (2003) 30. Zonghua, L., Ying-Cheng, L., Nong, Y.: Phys. Rev. E66, 36112 (2002) 31. Zonghua, L., Ying-Cheng, L., Nong, Y., Dasgupta, P.: Phys. Lett. A303, 337 (2002) 32. Dorogovtsev, S.N., Mendes, J.F.F., Cohen, R., Erez, K., ben-Avraham, D., Havlin, S.: Phys. Rev. Lett. 87, 219801 (2001) 33. Dorogovtsev, S.N., Mendes, J.F.F., Cohen, R., Erez, K., ben-Avraham, D., Havlin, S.: Phys. Rev. Lett. 87, 219802 (2001) 34. Cohen, R., Erez, K., Ben-Avraham, D., Havlin, S.: Phys. Rev. Lett. 86, 3682 (2001) 35. Barab´ asi, A.-L., Albert, R., Jeong, H.: Physica A281, 69 (2000) 36. King, K.M.: Educom Bulletin 23, 5 (1988) 37. Cunningham, W.H.: J. Assoc. Comput. Machinery 32, 549 (1985) 38. Scott, J.: Social Network Analysis: A Handbook, 2nd edn. Sage Publications, London (2000) 39. Wasserman, S., Faust, K.: Social Network Analysis. Cambridge University Press, Cambridge (1994) 40. Bonacich, P.F.: Am. J. Sociol. 92, 1170 (1987) 41. Brin, S., Page, L.: Computer Networks 30, 107 (1998) 42. Newman, M.E.J.: Mathematics of networks. In: Blume, L.E., Durlauf, S.N. (eds.) The New Palgrave Encyclopedia of Economics, 2nd edn., Palgrave Macmillan, Basingstoke (2008) 43. Mehta, M.L.: Random Matrix Theory. Academic Press, New York (1995) 44. Cvetkovi´c, D., Rowlinson, P., Simi´c, S.: Eigenspaces of graphs. Cambridge University Press, Cambridge (1997) 45. Erd˝ os P., R´enyi, A.: Publications Mathematicae 6, 290 (1959) 46. Erd˝ os P., R´enyi, A.: Publ. Math. Inst. Hung. Acad. Sci. 5, 17 (1960) 47. Barab´ asi, A.-L., Albert, R.: Science 286, 509 (1999) 48. Farkas, I.J., Der´enyi, I., Barab´ asi, A.-L., Vicsek, T.: Phys. Rev. E64, 026704 (2001) 49. Farkas, I.J., Der´enyi, I., Jeong, H., Neda, Z., Oltvai, Z.N., Ravasz, E., Schubert, A., Barab´ asi, A.-L., Vicsek, T.: Physica A314, 25 (2002) 50. Dorogovstev, S.N., Goltsev, A.V., Mendes, J.F.F., Samukhin, A.N.: Phys. Rev. E68, 046109 (2003) 51. Faloutsos, M., Faloutsos, P., Faloutsos, C.: Comput. Commun. Rev. 29, 251 (1999)
52. Monasson, R.: Eur. Phys. J. B12, 555 (1999) 53. Graovac, A., Plavsic, D., Kaufman, M., Pisanski, T., Kirby, E.C.: J. Chem. Phys. 113, 1925 (2000) 54. Eriksen, K.A., Simonsen, I., Maslov, S., Sneppen, K.: arXiv:cond-mat/0212001v1 55. Vukadinovic, D., Huang, P., Erlebach, T.: In: Unger, H., B¨ ohme, T., Mikler, A.R. (eds.) IICS 2002. LNCS, vol. 2346, pp. 83–95. Springer, Heidelberg (2002) 56. Golinelli, O.: arXiv:cond-mat/0301437v1 57. Schultz, H.P., Schultz, T.P.: J. Chem. Inf. Comput. Sci. 40, 107 (2000) 58. Schultz, H.P., Schultz, E.B., Schultz, T.P.: J. Chem. Inf. Comput. Sci. 35, 864 (1995) 59. Schultz, H.P., Schultz, E.B., Schultz, T.P.: J. Chem. Inf. Comput. Sci. 34, 1151 (1994) 60. Schultz, H.P., Schultz, T.P.: J. Chem. Inf. Comput. Sci. 33, 240 (1993) 61. Schultz, H.P., Schultz, T.P.: J. Chem. Inf. Comput. Sci. 31, 144 (1991) 62. Schultz, H.P., Schultz, E.B., Schultz, T.P.: J. Chem. Inf. Comput. Sci. 30, 27 (1990) 63. Schultz, H.P.: J. Chem. Inf. Comput. Sci. 29, 227 (1989) 64. Malarz, K., Kulakowski, K.: Acta Phys. Pol. B36, 2523 (2005) 65. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms. MIT Press, Cambridge (2001) 66. Burda, Z.: unpublished 67. Goh, K.-I., Kahng, B., Kim, D.: Phys. Rev. E64, 051903 (2001) 68. Dorogovtsev, S.N., Goltsev, A.V., Mendes, J.F.F., Samukhin, A.N.: Physica A338, 76 (2004) 69. Wigner, E.P.: Ann. Math. 62, 548 (1955) 70. Wigner, E.P.: Ann. Math. 65, 203 (1957) 71. Wigner, E.P.: Ann. Math. 67, 325 (1958)
Simplicial Complexes of Networks and Their Statistical Properties Slobodan Maletić, Milan Rajković*, and Danijela Vasiljević Institute of Nuclear Sciences Vinča, Belgrade, Serbia *[email protected]
Abstract. Topological, algebraic and combinatorial properties of simplicial complexes which are constructed from networks (graphs) are examined from the statistical point of view. We show that the basic statistical features of scale-free networks are preserved by topological invariants of simplicial complexes and, similarly, that statistical properties pertaining to topological invariants of other types of networks are preserved as well. Implications and advantages of such an approach for various research areas involving network concepts are discussed. Keywords: Networks, statistical mechanics, complex systems, topology, simplicial complexes, homology, Betti numbers.
1 Topological Features of Simplicial Complexes
In this section we give a short introduction to the subject of simplicial complexes and the related topological terminology [8]. Any subset of V, {v_{α0}, v_{α1}, ..., v_{αn}}, determines an n-simplex denoted by ⟨v_{α0}, v_{α1}, ..., v_{αn}⟩. The elements v_{αi} of V are the vertices of the simplex, denoted by ⟨v_{αi}⟩, and n is the dimension of the simplex. Any set of simplices with vertices in V is called a simplicial family and its dimension is the largest dimension of its simplices. A q-simplex σ_q is a q-face of an n-simplex σ_n, denoted by σ_q ≤ σ_n, if every vertex of σ_q is also a vertex of σ_n. A simplicial complex represents a collection of simplices. More formally, a simplicial complex K on a finite set V = {v_1, ..., v_n} of vertices is a nonempty subset of the power set of V, such that the simplicial complex K is closed under the formation of subsets. Hence, if σ ∈ K and ρ ⊆ σ, then ρ ∈ K. Two simplices σ and ρ are q-connected if there is a sequence of simplices σ, σ_1, σ_2, ..., σ_n, ρ, such that any two consecutive ones share a q-face, implying that they have at least q+1 vertices in common. Such a chain is called a q-chain. The complex K is q-connected if any two simplices in K of dimensionality greater than or equal to q are q-connected. The dimension of a simplex σ is equal to the number of vertices defining it minus one. The dimension of the simplicial complex K is the maximum of the dimensions of the simplices comprising K. In Fig. 1 we show an example of a simplicial complex and its matrix representation. In this example V = {1, 2, ..., 11}, and the simplicial complex K consists of the subsets {1, 2, 3, 4, 5}, {2, 3, 6, 7}, {6, 7, 8, 9} and {10, 11}. Its dimension is
Fig. 1. An example of a simplicial complex and its representation
4, as there is a 4-dimensional simplex, in addition to two 3-dimensional ones attached to it and one 1-dimensional simplex. A convenient way to represent a simplicial complex is via a so-called incidence matrix, whose columns are labeled by its vertices and whose rows are labeled by its simplices, as also shown in Fig. 1. The multifaceted (algebraic, topological and combinatorial) nature of simplicial complexes makes them particularly convenient for modelling complex structures and connectedness between different substructures.
1.1 The Method
Our approach is to encode a network into a simplicial complex, construct vector-valued quantities representing topological or algebraic invariants, and examine the statistical properties of these vector-valued measures. More precisely, we inspect the distribution of vector components as a function of dimension and compare these distributions for different types of networks, which should reflect the basic statistical properties of the networks under study (scale-free, random, etc.). Furthermore, by extracting the topological properties of substructures the characterization of networks may go beyond the degree distribution and give insight into higher levels of connectedness of networks. This implies that some topological properties of networks may be distinct even if they have the same degree distribution.
1.2 Construction of Simplicial Complexes from Graphs
Simplicial complexes may be constructed from directed graphs (digraphs) in several different ways. Here we only consider the construction of the so-called neighborhood complex N(G) from the graph G, with vertices {v_1, ..., v_n}. For each vertex v of G there is a simplex containing the vertex v, along with all vertices w corresponding to directed edges v → w. The neighborhood complex is obtained by including all faces of those simplices; in terms of the matrix representation, the incidence matrix is obtained from the adjacency matrix of G by increasing all diagonal entries by 1.
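A minimal sketch of this construction, assuming the graph is given by its adjacency matrix, is shown below; the list of maximal simplices (one per vertex) is all that is needed for the dimension and clustering statistics discussed later.

```python
import numpy as np

def neighborhood_complex(A):
    """Maximal simplices of N(G): for each vertex v, the set {v} plus all w with v -> w."""
    return [frozenset([v]) | frozenset(np.nonzero(A[v])[0].tolist())
            for v in range(A.shape[0])]

def incidence_from_adjacency(A):
    """Incidence matrix of N(G): the adjacency matrix with every diagonal entry raised by 1."""
    return A + np.eye(A.shape[0], dtype=A.dtype)
```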
The second method, associated with digraphs (or undirected graphs), has the complete subgraphs as simplices. The complete subgraph complex C(G) has the vertices of G as its vertices, while the maximal simplices are given by the collections of vertices that make up the cliques of G. Naturally, these two methods are not the only ones that may be used for constructing simplicial complexes from graphs. Actually, any property of the graph G that is preserved under deletion of vertices or edges may be used for construction purposes. A detailed account of the methods for obtaining simplicial complexes from graphs may be found in [9].
1.3 Invariants of Simplicial Complexes
Simplicial complexes may be considered from three different aspects: (1) a combinatorial model of a topological space; (2) a combinatorial object; (3) an algebraic model. Consequently, the invariants of simplicial complexes may be defined based on their different aspects, and each aspect provides completely different measures of the complex and, by extension, of the graph from which the complex was constructed. In the first case various algebraic topological measures may be associated, such as homotopy and homology groups [6]. In the second case several invariants may be defined and numerically evaluated. The first is the dimension of the complex. The next one is the so-called f-vector (also known as the second structure vector) [2], [3], [4], [5], which is an integer vector with dim(K) + 1 components, the i-th one being equal to the number of i-dimensional simplices in K. Another invariant is the Q-vector (first structure vector), an integer vector of the same length as the f-vector, whose i-th component is equal to the number of i-connectivity classes. The structure vector, illustrated in Fig. 2, provides information about connected components at each level of connectivity, with the initial level equal to the dimension of the complex. In terms of the topological framework of Q-analysis, we denote the q-level at which a simplex first appears its top q (symbol q̂), and the q-level at which it first joins another simplex in a local component its bottom q (symbol q̌). These symbols are then used to define the eccentricity of a particular simplex as

$$ecc = \frac{\hat{q} - \check{q}}{\hat{q} + 1}\,,$$
so that this expression scales eccentricity from 0 to 1 [7]. Eccentricity quantifies the way the simplex is integrated into the complex, such that high values reflect low levels of integration, while low values result from high integration levels. Also important is the so-called vertex significance, defined as

$$\rho = \frac{\Delta_i}{\max(\Delta_k)}\,,$$

where Δ_i represents the sum of the node weights which the simplex shares with other simplices, max(Δ_k) represents the maximal value of Δ_i, and the node weight is the number of simplices formed by that node.
Fig. 2. Structure vector (Q-vector) disclosing connectivity of the complex at various levels
1.4 The Betti Numbers
Betti numbers may be associated with a simplicial complex when the abstract-algebraic aspect is applied. In simple terms, the Betti numbers are a topological invariant that measures either the number of holes (simplices representing holes) of various dimensions present in a simplicial complex or, equivalently, the number of times the complex loops back upon itself. Hence the Betti numbers form an integer vector where each component corresponds to a distinct dimension.
1.5 Clustering Coefficients of Simplicial Complexes
In analogy to networks [1], one may define clustering coefficients of simplices. We can define this variable between a reference simplex and its neighbors in the following way. Suppose we have two simplices σ^i and σ^j which share a face of dimension f_ij. The dimensions of the simplices σ^i and σ^j are q_i and q_j,
respectively. Since they share a face whose dimension is f_ij, they also share faces with dimensions f_ij − 1, f_ij − 2, ..., 0. The clustering coefficient is defined as

$$C^{ij} = \frac{\text{Number of shared faces}}{\frac{1}{2}(\text{Overall number of faces that each of the simplices can have})}\,,$$

or

$$C^{ij} = \frac{\sum_{f=0}^{f_{ij}} \alpha_{f_{ij},f}}{\frac{1}{2}\left(\sum_{f=0}^{q_i}\alpha_{q_i,f} + \sum_{f=0}^{q_j}\alpha_{q_j,f}\right)}\,, \qquad (1)$$

where $\alpha_{x,f} = \frac{(x+1)!}{(f+1)!(x-f)!}$ and x = f_ij, q_i, q_j. If we sum over all neighbors of the simplex, we get the variable which characterizes the simplex, i.e. $C^i = \sum_{j\in nn} C^{ij}$, where nn stands for nearest neighbors. Simplifying equation (1) one gets

$$C^{ij} = \frac{2(-1 + 2^{1+f_{ij}})(1 + f_{ij})!}{\Gamma(2 + f_{ij})\left(\frac{(-1+2^{1+q_i})(1+q_i)!}{\Gamma(2+q_i)} + \frac{(-1+2^{1+q_j})(1+q_j)!}{\Gamma(2+q_j)}\right)}\,, \qquad (2)$$

where Γ(z) is the gamma function. We can further simplify equation (2) and get

$$C^{ij} = \frac{2^{1+f_{ij}} - 1}{2^{q_i} + 2^{q_j} - 1}\,, \qquad (3)$$

and for the simplex

$$C^i = \sum_{j\in nn} \frac{2^{1+f_{ij}} - 1}{2^{q_i} + 2^{q_j} - 1}\,. \qquad (4)$$
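The simplified forms (3) and (4) are straightforward to evaluate once the simplices are stored as vertex sets, as in the neighborhood-complex sketch above; the following lines are an illustrative implementation.

```python
def clustering_pair(s_i, s_j):
    """C^{ij} of Eq. (3) for two simplices given as vertex sets."""
    f_ij = len(s_i & s_j) - 1            # dimension of the shared face
    if f_ij < 0:
        return 0.0                       # no shared face
    q_i, q_j = len(s_i) - 1, len(s_j) - 1
    return (2 ** (1 + f_ij) - 1) / (2 ** q_i + 2 ** q_j - 1)

def clustering_simplex(s_i, neighbours):
    """C^i of Eq. (4): sum of C^{ij} over the neighbouring simplices of s_i."""
    return sum(clustering_pair(s_i, s_j) for s_j in neighbours)
```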
2 Random (Erdős–Rényi) Network
The network constructed in order to illustrate the concepts mentioned previously consists of 2000 nodes, with the probability of two nodes having a link equal to p = 0.05. As is well known, a random network has a characteristic scale in its node connectivity, reflected by the peak of the distribution which corresponds to the number of nodes with the average number of links. A careful reader has certainly noticed that the number of links connecting a specific node in a network corresponds to the dimension of the encoded simplex. Hence, the distribution with respect to the dimension of simplices, shown in Fig. 3 (left), is equivalent to the degree distribution of random networks and follows a bell-shaped curve. The distribution of vector-valued measures is illustrated by the distributions of the first and second structure vectors (Q-vector and f-vector), shown in Fig. 3 (right). The distribution in each case assumes dimension dependence. As in the dimension (network degree) distribution, a characteristic dimension of the simplicial complex may be noticed, and both vectors have a similar Poissonian-like shape, as expected for a random network.
Fig. 3. Dimension distribution of a random simplicial complex (left); Q-vector and f-vector as a function of dimension (right)
3 A Network with Exponential Degree Distribution
The power grid network of the western United States consists of 4941 nodes [11]. As an illustration of vector-valued network measures and topological invariants of the corresponding simplicial complexes, we present in Fig. 4 the number of simplices of a given dimension as a function of dimension, together with the first and second structure vectors. The distribution of vertex significance also follows an exponential dependence (not shown).
Fig. 4. Degree distribution and distributions of first and second structure vector for the US power grid network
4 Scale-Free Network
The absence of a peak in the power-law degree distribution of scale-free networks implies the absence of a characteristic scale (characteristic node). This property is
Fig. 5. Distribution of simplex dimensions. Fitting parameters to the q-exponential function are shown in the inset.
Fig. 6. Distribution of vertex significance with respect to dimension. The parameters of the fit to the q-exponential function are in the inset.
Fig. 7. Distribution of normalized Betti numbers
reflected in the distribution of dimensions of simplices forming the simplicial complex obtained from the scale-free network (Fig. 5). The distribution fits very well to the q-exponential function, with parameters shown in the inset (q = 1.25, γ = 4). The distribution of the vertex significance along with the q-exponential fit is presented in Fig. 6, while in Fig. 7 the distribution of the Betti numbers exhibits a very good fit to the q-exponential function.
5 Final Remarks
Based on the results pertaining to random and scale-free networks, and to networks showing exponential degree distribution, it is clear that the simplicial complexes encoded from these networks generate remarkable properties characterizing their topological, algebraic and combinatorial features. This short exposition also makes it clear that conventional network approaches lack the diversity and abundance of information offered by the simplicial complex approach. Moreover, advanced algebraic topology methods enable dynamic analysis of simplicial complexes along with dynamic updating of topological properties, a topic to be discussed elsewhere [10]. Route-link structure, so important in many applications of network models, may obtain a more suitable description using the simplicial complex approach, and it is evident that this method may be able to address a wider class of problems than network theory. In a straightforward manner, traffic (in the most general sense) may be developed for these structures, so that a theory and modeling of multidimensional traffic on a multidimensional topological background may be realized.
References 1. Albert, R., Barabasi, A.L.: Statistical mechanics of complex networks. Reviews of Modern Physics 74, 47–97 (2002) 2. Atkin, R.: From cohomology in physics to q-connectivity in social sciences. Int. J. Man-Machines Studies 4, 341–362 (1972) 3. Atkin, R.: Mathematical Structure in Human Affairs, London, Heinemann (1974) 4. Atkin, R.: An Algebra of Patterns on a Complex I. Int. J. Man-Machine Studies 6, 285–307 (1974) 5. Atkin, R.: An Algebra of Patterns on a Complex II. Int. J. Man-Machine Studies 8, 483–498 (1974) 6. Barcelo, H., Kramer, X., Laubenbacher, R., Weaver, C.: Foundations of connectivity theory for simplicial complexes. Adv. Appl. Math. 26, 97–128 (2001) 7. Gould, P., Johnson, J., Chapman, G.: The Structure of Television. Pion Limited, London (1984) 8. Hatcher, A.: Algebraic Topology. Cambridge University Press, Cambridge (2002) 9. Jonsson, J.: Simplicial Complexes from Graphs. Lecture Notes in Mathematics. Springer, Heidelberg (2008) 10. Vasiljevi´c, D., Maleti´c, S., Rajkovi´c, M.: (in preparation) 11. Watts, D.J., Strogatz, S.H.: Collective dynamics of ’small-world’ networks. Nature 393, 440–442 (1998)
Movies Recommendation Networks as Bipartite Graphs

Jelena Grujić

Scientific Computing Laboratory, Institute of Physics Belgrade, Pregrevica 118, 11080 Belgrade, Serbia
[email protected], http://www.phy.bg.ac.yu
Abstract. In this paper we investigate users' recommendation networks based on a large data set from the Internet Movie Database. We study networks based on two types of input: the first (monopartite) generated directly from the recommendation lists on the website, and the second (bipartite) generated through the users' habits. Using a threshold number of votes per movie to filter the data, we effectively introduce a control parameter, and by tuning this parameter we study its effect on the network structure. From the detailed analysis of both networks we find that certain robust topological features occur independently of the value of the control parameter. We also present a comparison of the network clustering and shortest paths on the graphs with a randomized network model based on the same data.
Keywords: recommendation networks, bipartite graphs, topology.
1 Introduction
Social networks, representing interactions and relationships between humans or groups, have recently become a subject of broad research interest [1]. One of the reasons behind this is rapidly evolving electronic technology, which has created e-social networks representing social communication through the Internet. They can be seen as social networks or as technological communication networks [2]. E-mail, chats, web portals, etc., give us the huge amount of information needed to investigate social structures, but also add a new dimension to them. In contrast to the typical communication between pairs of users on a network, such as an e-mail network, where a message is sent directly to one known user [3], we can also investigate social structures where users communicate through common interests, like books, music, movies, etc. In these user-based dynamical systems, subtle correlations between users develop through hidden feedback mechanisms, in which users share currently available opinions and take actions which, in turn, contribute to the further evolution of the network. Recommendation systems help to overcome information overload by providing personalized suggestions. On many Internet portals, when a user selects a product, a certain number of alternative products are suggested. These products are nodes of a recommendation network, and links point toward the products in their respective recommendation lists.
Fig. 1. Dependence of the number of movies and users in the networks on the control parameter Vmin, the minimal number of votes for a movie in the network
For example, a network where the products are music groups is studied in Ref. [6]. Recommendations can be generated through collaborative filtering, using content-based methods, or by a combination of the two. If collaborative filtering is used, the generated networks are actually one-mode projections of bipartite networks, where one type of node is a product and the other type is a user. Links go only between nodes of different types; in this case a link is created if the user selects the product. Another example connects users with the music groups they have collected in their music sharing libraries [4,7]. The loss of information in the transition from a bipartite network to its one-mode projection is obvious. In Ref. [5] the authors discussed a way of obtaining the one-mode projection with minimal loss of information and tested their method on a movies-users network. Here we examine the movies-users network using data from the largest and most comprehensive on-line movie database, IMDb [8]. The generated networks are based on data on more than 43,000 movies and 350,000 users. Furthermore, we introduce a control parameter and test the universality of our conclusions for networks of different sizes. We combine two approaches. As a starting point we perform an analysis of the empirical data collected from the IMDb website. Then we investigate the properties of the naturally emerging bipartite network based on the users' behavior.
2 Movie Networks
The investigated database, IMDb, holds information on more than 1,000,000 films, TV series and shows, direct-to-video products and video games.
The database also contains user comments, ratings and message boards, which characterize users' habits. For each movie we collect the following information: ID number, number of votes V, ID numbers of the movies in its recommendation list and ID numbers of the users who commented on that movie. The collected information belongs to two types of data. The first type is the IMDb recommendations, based on the recommendation system of the site. The second is users' habits, from which, using collaborative filtering, we generate our own recommendation networks. For computational purposes, we concentrated our research only on USA theatrically released movies. This choice was also motivated by the fact that USA theatrical releases had the most comprehensive data. In order to analyze the universality of our conclusions, we introduced the number of votes as a control parameter. Users have the opportunity to give their rating for any movie in the database. The number of votes varies from movie to movie, with more popular ones having more votes. The largest network we analyzed contains movies with more than 10 votes and consists of more than 43,000 movies and 300,000 users. We investigated five different sizes of networks according to the minimal number of votes cast for movies, Vmin ∈ {10^1, 10^2, 10^3, 10^4, 10^5}. As presumed, the number of movies quickly decreases when the minimal number of votes increases. On the other hand, the number of users does not change drastically when Vmin is increased, which is also expected: if a small number of people voted for some movie, then also a small number of users commented on it, so by cutting off a huge number of less popular movies one does not cut off many users. Users usually commented on more than one movie, so by cutting off some movie one does not necessarily cut off its users as well. From the gathered information we constructed three different networks:

The IMDb recommendations (IMDb) monopartite directed network is made by using the recommendation system provided by the website [8]. Nodes are movies and links point towards the movies in their respective recommendation lists. The rules for generating these recommendation lists are not publicly available (due to the IMDb privacy policy). It is only known that the system uses factors such as user votes, genre, title, keywords, and, most importantly, user recommendations themselves to generate an automatic response.

The user driven bipartite network (UD-BP) is constructed using users' comments. One type of node is a movie, and the other type is a user. A movie and a user are linked if the specific user left a comment on the webpage of the specific movie. We do not distinguish between positive and negative comments. As before, we made a large bipartite network of almost 43,000 movies and almost 400,000 users. The average number of users per movie was 27, with a maximum of 4,763 comments left for the movie "The Lord of the Rings: The Fellowship of the Ring". The average number of movies per user is much smaller (around three), but the maximal number was 3,040.

The one-mode projection of the user driven network (UD-OM) is generated from the previous bipartite network. For each movie we generate a recommendation list similar to the list provided by the website, but this time based on the users' comments. As in the usual one-mode projection, two movies will be connected if they have
a user who left a comment on both movies, but in the recommendation list we put only the ten movies with the highest numbers of common users. Among movies with the same number of common users, we choose randomly. We note that this network is also directed: although having common users is a symmetric relation, if movie i is in the top ten movies of movie j, that does not imply that movie j is in the top ten movies of movie i.

The users-preferential random network (UP-RD) is generated by connecting each movie to ten other movies chosen randomly with probability proportional to the number of users who left a comment on those movies. This way we expect to obtain a degree distribution similar to the distribution of the number of users per movie. However, since the linked movies are chosen randomly, not by their similarity, we expect this to create significant differences in the other properties of such a network. This network thus tests whether the networks' properties are just a consequence of favoring more popular movies in the recommendation system, or whether there is some other underlying mechanism which actually connects movies with movies of their kind.
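As a concrete illustration, a minimal sketch of the UD-OM construction from the bipartite comment data might look as follows; the dictionary format, the function name and the toy input are our own choices (and ties are broken deterministically here rather than randomly, for brevity), not part of the original data pipeline.

```python
from collections import defaultdict
from itertools import combinations

# Toy bipartite data: movie id -> set of users who commented on that movie.
comments = {
    "m1": {"u1", "u2", "u3"},
    "m2": {"u2", "u3", "u4"},
    "m3": {"u1", "u4"},
    "m4": {"u5"},
}

def one_mode_projection(comments, top_k=10):
    """Directed UD-OM lists: for each movie keep the top_k movies with the
    largest number of common users (ties broken here by movie id)."""
    common = defaultdict(dict)
    for mi, mj in combinations(comments, 2):
        overlap = len(comments[mi] & comments[mj])
        if overlap:
            common[mi][mj] = overlap
            common[mj][mi] = overlap
    recommendations = {}
    for movie, neighbours in common.items():
        ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        recommendations[movie] = [other for other, _ in ranked[:top_k]]
    return recommendations

print(one_mode_projection(comments))
```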
3 Investigated Properties
In order to determine the topology of the networks, we focus on the properties which we consider most important for network searchability. The observed properties could lead to possible optimizations of movie recommendation systems. The investigated properties are similar to those already studied for real networks.

The degree ki of node i is the number of links incident to that node. The average degree k is ki averaged over all nodes [1]. The degree distribution P(k) is the probability that a node selected uniformly at random has degree k. For a directed network we can calculate the in-degree distribution P(k^in) as the distribution of incoming links and the out-degree distribution P(k^out) as the distribution of outgoing links. Since the number of outgoing links is limited to 10, only the degree of incoming links is nontrivial, so we present only those results. In the bipartite network we calculate the degree distribution for each type of node separately.

The clustering coefficient, introduced in Ref. [10], expresses how likely it is for two first neighbors j and k of node i to be connected. The clustering coefficient ci of node i is the ratio between the total number ei of links between its nearest neighbors and the total number of all possible links between these nearest neighbors:

ci = 2ei / [ki (ki − 1)].    (1)

The clustering coefficient of the whole network, c, is the average of ci over all nodes. In the directed monopartite networks we did not distinguish between the directions of the links. In bipartite networks there are two types of nodes and links go only between different types, so the above definition of clustering does not apply because triangles do not exist.

A measure of the typical separation between two nodes in the graph is given by the average shortest path length D, defined as the mean of all shortest path lengths dij [10]. A problem with this definition is that D diverges if the network is not connected, as in our example.
Fig. 2. Degree distributions for networks with different minimal number of votes Vmin. On the left, directed monopartite networks based on IMDb recommendations; the tail is fitted with the power law k^-1.8. On the right, the user driven one-mode projection (UD-OM), fitted to the power law k^-1.6.
Here we calculate D as the average over all existing shortest paths. We also use an alternative approach and consider the harmonic mean [11] of the shortest paths, the so-called topological efficiency E:

D = (1/(N(N − 1))) Σ_{i≠j} dij ,    E = (1/(N(N − 1))) Σ_{i≠j} 1/dij .    (2)
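A direct way to evaluate the clustering coefficient of Eq. (1) and the two quantities of Eq. (2) on a possibly disconnected, undirected graph is sketched below. The use of networkx and the random test graph are our own illustrative choices, not the pipeline used for the reported results.

```python
import networkx as nx

def average_path_and_efficiency(G):
    """D averaged over existing shortest paths and topological efficiency E (Eq. 2)."""
    n = G.number_of_nodes()
    total_d, pairs, total_inv = 0.0, 0, 0.0
    for source, lengths in nx.all_pairs_shortest_path_length(G):
        for target, d in lengths.items():
            if target == source:
                continue
            total_d += d
            pairs += 1
            total_inv += 1.0 / d
    D = total_d / pairs            # only existing (finite) shortest paths contribute
    E = total_inv / (n * (n - 1))  # disconnected pairs contribute 1/d = 0
    return D, E

G = nx.erdos_renyi_graph(500, 0.01, seed=1)   # stand-in for a movie network
c = nx.average_clustering(G)                   # average of Eq. (1) over all nodes
D, E = average_path_and_efficiency(G)
print(f"c = {c:.3f}, D = {D:.3f}, E = {E:.3f}")
```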
4 Results and Discussion
A common feature which appears in all studied networks is a degree distribution with a power-law tail. Even more important is the fact that the degree distributions are universal for networks of different sizes, obtained by changing the control parameter. This suggests that even though we studied a large but limited number of movies, our main results do not depend on the number of movies, i.e., on finite-size effects. For the user driven bipartite network, the distribution of the number of movies per user is a very robust power law, universal for all sizes of networks. The distribution fits a power law with exponent 2.16. This exponent occurs in most of the studied real-world networks [1]. The distribution of the number of users per movie (Fig. 3) is well described by a power law for the largest investigated networks (Vmin > 10); for the smaller networks this is not the case, as the number of movies per user decreases below 20. This is expected, since the smaller networks do not contain the less popular movies, and those are the movies which usually have a small number of users. Even though the IMDb recommendations network has degree distributions which are not power laws for degrees less than 11, the distributions can be rescaled so as to fit the same curve even for the small degrees.
Fig. 3. Degree distributions for bipartite networks for Vmin = 10^1 and Vmin = 10^3. On the left graph, the number of movies per user fits k_u^-2.19, while on the right graph the number of users per movie fits k_m^-1.58. Distributions are logarithmically binned to reduce the noise.
Like their bipartite counterparts, the one-mode projections have power-law degree distributions through the whole range of degrees (Fig. 2). The exponent of the power law is close to the exponent of the distribution of the number of users per movie. We emphasize that we did not perform separate one-mode projections of the networks of different sizes. Rather, we constructed the one-mode projection for the largest network and then constructed the smaller networks by eliminating movies with fewer than Vmin votes, and all of their links, from the largest network. All distributions are logarithmically binned in order to decrease the noise. Both networks based on the real data show the small-world property. Clustering coefficients are high and increase when the size of the network decreases. Since the smaller networks are missing the less popular movies, the networks of more popular movies are more clustered. Average path lengths are small and decrease with the size of the network. The topological efficiency increases for smaller networks (Fig. 4). As expected, the degree distribution of the users-preferential random network is a power law with an exponent similar to those of the IMDb and UD-OM networks. But apart from the degree distribution, its other properties are significantly different. The most obvious difference is in the clustering coefficient: as expected, random networks have a clustering coefficient a few orders of magnitude smaller than those of the real networks. The average shortest paths are significantly smaller, although they exhibit similar behavior. Also, we see that the efficiencies are proportionally larger. We note that some properties of the IMDb network are closer to those of the UP-RD networks. The power-law degree distribution of UD-OM could be a consequence of the preferential attachment to more popular movies. During the construction of the network we connected movies with more common users; movies with more users also have a greater probability of having more common users with some other movie. However, we see that if we connect movies only by favoring more popular movies, the other properties are different. The smallest network, with Vmin = 10^5, is so sparse that we eliminated it from the investigation.
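Logarithmic binning of a degree sequence, used for all the distributions reported here, can be done along the following lines; the bin growth factor of 2 and the synthetic Zipf sample are illustrative choices, not the ones used for the figures.

```python
import numpy as np

def log_binned_distribution(values, factor=2.0):
    """Histogram positive integer data in bins whose width grows geometrically."""
    values = np.asarray(values, dtype=float)
    edges = [1.0]
    while edges[-1] <= values.max():
        edges.append(edges[-1] * factor)
    edges = np.asarray(edges)
    counts, _ = np.histogram(values, bins=edges)
    centers = np.sqrt(edges[:-1] * edges[1:])           # geometric bin centers
    density = counts / (np.diff(edges) * values.size)   # normalized per unit degree
    keep = counts > 0
    return centers[keep], density[keep]

degrees = np.random.default_rng(0).zipf(2.2, size=5000)  # toy power-law data
k, pk = log_binned_distribution(degrees)
print(np.column_stack((k, pk))[:5])
```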
Fig. 4. Comparison of the IMDb recommendations (IMDb), user driven one-mode projection (UD-OM) and users-preferential random (UP-RD) networks: degree distribution for Vmin = 10 (top left), clustering coefficient (top right), average shortest path length (bottom left), and topological efficiency (bottom right) as a function of Vmin.
5 Conclusion and Future Directions
Using the data from the largest and most comprehensive movie database, IMDb, we considered two types of networks: one based on the IMDb recommendations and one based on collaborative filtering of user habits. As a starting point we investigated the properties of the directed movie network following directly from the IMDb recommendations data. We then generated bipartite networks by connecting users with the movies they commented on. In order to compare the two approaches, we made one-mode projections of the bipartite networks. We introduced the minimal number of votes as a control parameter and constructed networks of different sizes according to the number of votes of the movies. All networks show high clustering coefficients and the small-world property, although some variations in behavior are noticed for different network sizes. The degree distributions for both types of networks are universal. Networks obtained through collaborative filtering exhibit robust power-law distributions, seemingly universal for all network sizes. Networks based on IMDb recommendations, although not power-law distributed for small degrees, still have a power-law tail. Since the properties of the random vote-preferential networks, most noticeably the clustering coefficient, are significantly different from the one-mode projection, we believe
that the power-law distribution is not just a consequence of favoring more popular movies, but the result of some self-organizing mechanism. In the future, we plan to generalize the presented approach. By investigating community structures and user clustering, and by further theoretical modeling, we will try to understand the natural mechanisms behind these properties.

Acknowledgments. The author thanks B. Tadić for numerous useful comments and suggestions. This cooperation is supported by COST-STSM-P10-02988 and the PATTERNS project MRTN-CT-2004-005728. This work was financed in part by the Ministry of Science of the Republic of Serbia under project no. OI141035 and the FP6 project CX-CMCS. The presented numerical results were obtained on the AEGIS GRID e-infrastructure, whose operation is supported in part by the FP6 projects EGEE-II and SEE-GRID-2.
References
1. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.-U.: Complex networks: Structure and dynamics. Phys. Rep. 424, 175–308 (2006)
2. Tadić, B., Rodgers, G.J., Thurner, S.: Transport on complex networks: flow, jamming and optimization. Int. J. Bifurcation and Chaos (IJBC) 17(7), 2363–2385 (2007)
3. Guimera, R., Danon, L., Diaz-Guilera, A., Giralt, F., Arenas, A.: Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103 (2003)
4. Lambiotte, R., Ausloos, M.: Uncovering collective listening habits and music genres in bipartite networks. Phys. Rev. E 72, 066107 (2005)
5. Zhou, T., Ren, J., Medo, M., Zhang, Y.C.: Bipartite network projection and personal recommendation. Phys. Rev. E 76, 046115 (2007)
6. Cano, P., Celma, O., Koppenberger, M.: Topology of music recommendation networks. Chaos 16, 013107 (2006)
7. Lambiotte, R., Ausloos, M.: On the genre-fication of Music: a percolation approach. Eur. Phys. J. B 50, 183–188 (2006)
8. Internet Movie Database, http://www.imdb.com
9. Lind, P., González, M., Herrmann, H.: Cycles and clustering in bipartite networks. Phys. Rev. E 72, 056127 (2005)
10. Watts, D.J., Strogatz, S.H.: Collective dynamics of 'small-world' networks. Nature 393, 440–442 (1998)
11. Marchiori, M., Latora, V.: Harmony in the Small-World. Physica A 285, 198701 (2000)
Dynamical Regularization in Scalefree-Trees of Coupled 2D Chaotic Maps

Zoran Levnajić

Department for Theoretical Physics, Jožef Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia
[email protected]
Abstract. The dynamics of coupled 2D chaotic maps with time-delay on a scalefree-tree is studied, with different types of collective behavior already reported for various values of the coupling strength [1]. In this work we focus on the time-evolution of the dynamics at the coupling strength of the stability threshold and examine the properties of the regularization process. The time-scales involved in the appearance of the regular state and the periodic state are determined. We find an unexpected regularity in the system's final steady state: all the period values turn out to be integer multiples of one among a small set of numbers. Moreover, the period value distribution follows a power law with a slope of -2.24.
Keywords: complex networks, coupled maps systems, emergent behaviour, self-organization in complex systems.
1 Introduction
Complex networks are overwhelmingly present in nature: a variety of systems from technology, biology or sociology can be seen as networks of interacting units that behave collectively [2,3]. The paradigm of many basic elements that generate complex emergent behavior by interacting through the network links has received a lot of attention as a convenient way to model complex systems. Coupled Maps Systems (CMS) are networks of interacting dynamical systems (maps) that can be easily modeled computationally, allowing the study of complex phenomena like synchronization, self-organization or phase transitions [4,5,6,7]. The emergent behavior of a CMS can be investigated in relation to the architecture/topology of the network, the type/strength of the coupling or the properties of the uncoupled units, often representing a model of applied interest [8,9]. The discovery of the intrinsic modularity of many naturally occurring networks [10] triggered investigations of the dynamical properties of small graph structures (termed motifs) [11], and of their role in the global behavior of the network [12,13]. Recent works emphasized the physical importance of time-delayed coupling due to its ability to mimic realistic network communication [14,15]. Also, following the extensive studies of collective behavior in 1D CMS, the relevance of the still poorly explored case of 2D CMS was recently indicated [16,1], having special relevance in the context of scalefree networks [2].
In the previous works [1,17] we studied the self-organization of coupled 2D standard maps on scalefree trees and motifs, investigating different types of collective behavior as a function of the network coupling strength. After transients, the CMS dynamics was found to exhibit two main collective effects: dynamical localization, acting at all non-zero coupling strengths, which inhibits the chaotic phase-space diffusion, followed by regularization, acting at larger coupling strengths, which produces groups of quasi-periodic emergent orbits with common oscillation properties. For specific coupling values the network interaction eventually turns the quasi-periodic orbits into periodic orbits or even more complex structures like strange attractors, some of which unexpectedly appear to have characteristics of strange non-chaotic attractors, generally known to arise in driven systems [18]. As we revealed, there exists a critical network coupling strength μc, relatively common to all scalefree-trees and motifs, at which the collective motion becomes regular even after a very short transient. In this work we further examine the dynamics at μc for a fixed scalefree-tree. We are concerned with the time-evolution of our CMS and the appearance of the collective effects as a function of the initial conditions, seeking to reveal the relevant time-scales and the properties of the final emergent motion. The paper is organized as follows: after defining our CMS on a scalefree-tree in the next section, we study its regularization process using the node-average and the time-average orbit approach in Sect. 3. In Sect. 4 we study the appearance and the properties of the emergent periodic states, and we conclude with Sect. 5, outlining the results and discussing the open questions.
2 Coupled Map System Set-Up on a Scalefree Tree
A scalefree network with the tree topology is grown using the standard procedure of preferential attachment [2] by 1 link/node for N = 1000 nodes. Every node is assumed to be a dynamical system given by the Chirikov standard map:

x' = x + y + ε sin(2πx)   [mod 1]
y' = y + ε sin(2πx).    (1)

The nodes (i.e., dynamical systems) interact through the network links by a one-step time-delay difference in the angle (x) coordinate, so that a complete time-step of node [i] for our Coupled Map System (CMS) is given by:

x[i]_{t+1} = (1 − μ) x'[i]_t + (μ/k_i) Σ_{j∈K_i} (x[j]_t − x'[i]_t)
y[i]_{t+1} = (1 − μ) y'[i]_t ,    (2)

as in [1,17]. Here, (') denotes the next iterate of the (uncoupled) standard map (1), t is the global discrete time and [i] indexes the nodes (k_i being a node's degree). The update of each node is the sum of a contribution given by the update of the node itself (the ' part) plus a coupling contribution given by the sum of differences between the node's x-value and the x-values of its neighboring
nodes in the previous iteration (for the motivation behind this coupling form, see [1]). We set the standard map chaotic parameter to ε = 0.9 and study the dynamics of CMS (2) on a fixed scalefree-tree (visualized in Fig. 1b) for the fixed network coupling strength μc = 0.012. The μc-value corresponds to the initial stability phase transition for this CMS, with all the orbits becoming periodic after a sufficiently long transient, as shown in Fig. 1a for a shorter transient (see [1] for details). As opposed to the previous works, the focus will be maintained on the time-evolution of CMS (2) in relation to the initial conditions for a fixed μc, rather than on the motion properties after transients for various μ-values. We therefore investigate the time-sequence of 2 × N-dimensional vectors
i = 1, · · · , N
(3)
in function of the discrete time t ≥ 0. The initial conditions are set by randomly selecting the values (x[i]t=0 , y[i]t=0 ) for each node [i] from (x, y) ∈ [0, 1]× [−1, 1]. Future iterations are computed according to (2) with the parameter constraints described above. Updates of all the nodes are computed simultaneously for all the network nodes. Besides considering the properties of a single node/single initial
1 0.8
fr(np)
0.6 0.4 0.2 0
μc 0.01 0.02 0.03 0.04 0.05 0.06 0.07
μ
(a)
(b)
Fig. 1. (a) Fraction of non-periodic orbits for an outer tree’s node averaged over the initial conditions after a transient of 105 iterations, (b) visualization of the tree studied in this paper. Bright nodes reached the periodic orbits after a transient of 5 × 106 iterations for a random set of initial conditions (see Sect. 4.).
Besides considering the properties of a single-node/single-initial-condition time-evolution sequence, we will also consider quantities obtained by averaging over the tree's nodes, over the initial conditions or over the time-evolution, as they can provide alternative qualitative/quantitative insights into the time-evolution of the dynamics.
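The set-up of Eqs. (1)-(2) translates directly into code. The sketch below uses the parameter values quoted in the text (ε = 0.9, μ = 0.012, N = 1000); the simple roulette-wheel preferential attachment and all variable names are our own illustrative choices, and re-applying the mod 1 to x after the coupled update is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
N, EPS, MU, STEPS = 1000, 0.9, 0.012, 1000

# Grow a scalefree tree by preferential attachment, one link per new node.
neighbors = [[] for _ in range(N)]
stubs = [0]                                   # node list weighted by degree
for new in range(1, N):
    old = stubs[rng.integers(len(stubs))]
    neighbors[new].append(old)
    neighbors[old].append(new)
    stubs.extend([new, old])

x = rng.uniform(0.0, 1.0, N)
y = rng.uniform(-1.0, 1.0, N)

def standard_map(x, y, eps=EPS):
    """One iterate of the uncoupled Chirikov standard map, Eq. (1)."""
    xp = np.mod(x + y + eps * np.sin(2.0 * np.pi * x), 1.0)
    yp = y + eps * np.sin(2.0 * np.pi * x)
    return xp, yp

for t in range(STEPS):
    xp, yp = standard_map(x, y)               # primed (uncoupled) values
    x_next = np.empty(N)
    for i in range(N):
        coupling = sum(x[j] - xp[i] for j in neighbors[i]) / len(neighbors[i])
        x_next[i] = (1.0 - MU) * xp[i] + MU * coupling
    x = np.mod(x_next, 1.0)                   # keep x on the unit interval (assumption)
    y = (1.0 - MU) * yp                       # Eq. (2), synchronous update with delay
```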
3 The Average Orbit and the Dynamics Regularization
We begin the investigation of the general regularization properties of CMS (2) by examining the time-evolution of the node-average orbit defined as:

(x̂_t, ŷ_t) = (1/N) Σ_{i=1}^{N} (x[i]_t, y[i]_t),    (4)
representing the mean node position (averaged over the whole network) at each time-step. In Fig. 2 we show three stages of the time-evolution of a generic node-average orbit of CMS (2).
Fig. 2. 1000 iterations of a node-average orbit of the CMS (2) for a random set of initial conditions: (a) after 0, (b) after 5 × 10^4 and (c) after 5 × 10^5 iterations.
The initial chaotic cluster is divided into two smaller clusters that shrink and eventually become a quasi-periodic orbit (in the sense of not repeating itself while being spatially localized) oscillating between the clusters, with even and odd iterations belonging to opposite clusters. The qualitative structure of this process is invariant under changes of the initial conditions. We will use the clusterization of this motion to design a more quantitative approach. Let us consider the even and odd iterations of the node-average orbit, (x̂_t, ŷ_t)_even and (x̂_t, ŷ_t)_odd, as two separate sequences. They evolve in the phase space from initially being mixed together towards each forming a separate cluster that shrinks in size. The sum of the cluster sizes divided by their distance gives a good time-evolving measure of the motion's 'regularity'; one expects this quantity to decrease with the time-evolution, eventually stabilizing at some small value (or possibly zero). To that end we fix a certain number of iterations τ and coarse-grain the sequences by dividing them into intervals of length τ, considering two sequences of clusters (given as the points within each interval for each sequence) instead. We compute the 2-dimensional standard deviations for each pair of clusters, σ_even(T) and σ_odd(T), and measure the distance d_c(T) between their centers. These quantities depend on the coarse-grained discrete time T, given as the integer part of (t/τ). We define Σ(T):
Σ(T) = [σ_even(T) + σ_odd(T)] / d_c(T),    (5)
which quantifies the evolution of the cluster separation process as a function of T. In Fig. 3a we show the behavior of Σ(T) for three different sets of initial conditions: for each plot there exists a critical time tc at which the node-average dynamics suddenly 'regularizes' into a stable steady state.
Fig. 3. (a) Evolution of Σ(T) for three sets of initial conditions (in normal time t), with the time-window set to τ = 1000 iterations; (b) distribution of the values of tc averaged over 1000 random initial conditions, with a log-normal tail fit.
This steady state is characterized by a small Σ(T)-value that shows no further change with T. We term this collective dynamical state regular, in the sense of the constancy of Σ(T). (Note: we use the term 'regular' only to make a clear distinction with the chaotic dynamics preceding this state; a discussion of the precise properties of the regular state is given later.) We emphasize that, as this consideration is done using the node-average orbit, the result is a global property of the scalefree-tree CMS (2). The value of tc depends only on the initial conditions and is given as the value of T after which Σ(T) remains constant. In Fig. 3b we examine the distribution of tc-values over the initial conditions. The distribution P(tc) presents a log-normal tail with a prominent peak at tc ≅ 2.79 × 10^5 iterations. Note that the peak value of tc, being obtained by averaging over many initial conditions, is by construction independent of any particular initial condition and hence refers only to the network structure and the coupling strength μc.

3.1 Properties of the Regular Dynamical State
The regular dynamics after tc is characterized by a quasi-periodic oscillatory motion of each node between two clusters, in analogy with the node-average orbit. Not all nodes share the same clusters, however; instead, every node settles to oscillate between a certain pair of clusters.
Fig. 4. A regular state of CMS (2) for a set of initial conditions after a transient of 500,000 iterations: (a) 10 iterations of all the nodes plotted in the same phase space; (b) distribution of ȳ[i]-values for all the nodes, computed over t1 − t0 = 1000 iterations.
The clusters in each pair correspond to each other horizontally, as illustrated in Fig. 4a. Every node therefore belongs to one group of nodes oscillating between a common pair of clusters, with a phase space organization as in Fig. 4a (see [1]). Let the time-average orbit of a node, averaged over the time-interval (t0, t1), be defined as:

(x̄[i], ȳ[i]) = (1/(t1 − t0)) Σ_{t=t0}^{t1} (x[i]_t, y[i]_t),    (6)
which is a single phase space point that represents the node's average position during that time interval. In Fig. 4b we show the distribution of ȳ[i]-values for all the nodes of a single regular state: sharp peaks indicate groups of nodes sharing the same pairs of clusters and hence sharing the same ȳ[i]-value. Note that there are 11 pairs of horizontally linked clusters in Fig. 4a, in correspondence with the 11 peaks visible in Fig. 4b. The symmetry properties of this distribution are invariant under changes of the initial conditions.
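For completeness, the quantities of Eqs. (4)-(6) are easy to evaluate from a stored trajectory. The sketch below assumes the trajectory is kept as an array of shape (T, N, 2) and takes τ = 1000 as in the text; the precise definition of the two-dimensional standard deviation entering Eq. (5) is not spelled out above, so the per-coordinate sum used here is an assumption.

```python
import numpy as np

def node_average_orbit(traj):
    """Eq. (4): average over all nodes at every time step; traj has shape (T, N, 2)."""
    return traj.mean(axis=1)

def sigma_of_T(node_avg, tau=1000):
    """Eq. (5): cluster spread of even/odd iterates within windows of length tau."""
    windows = len(node_avg) // tau
    sigma = np.empty(windows)
    for w in range(windows):
        chunk = node_avg[w * tau:(w + 1) * tau]
        even, odd = chunk[0::2], chunk[1::2]
        spread = even.std(axis=0).sum() + odd.std(axis=0).sum()
        d_c = np.linalg.norm(even.mean(axis=0) - odd.mean(axis=0))
        sigma[w] = spread / d_c
    return sigma

def time_average_orbit(traj, t0, t1):
    """Eq. (6): average position of every node over the interval (t0, t1)."""
    return traj[t0:t1].mean(axis=0)            # shape (N, 2): (x_bar[i], y_bar[i])
```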
4 Properties of the Periodic Dynamical State
Further evolution of any regular state eventually generates periodic orbits on all the tree's nodes under the dynamics of CMS (2). This is the final equilibrium steady state of the system that undergoes no further changes, and it will be called the periodic state. While the regular state is defined only qualitatively by the constancy of Σ(T), the periodic state is precisely defined in terms of the periodicity of the orbits on all the nodes. The group phase-space organization of the nodes and their common oscillation properties described earlier (ȳ[i]-values, see Fig. 4) still remain, with the periodic orbits replacing the quasi-periodic ones.
Fig. 5. (a) Number of nodes with periodic orbits as a function of time for three sets of initial conditions; (b) histogram of period values (up to π = 3000) for a single periodic state; (c) period distribution averaged over 20 periodic states, with a slope of -2.24.
We describe the appearance of a periodic state in Fig. 5a: every τ = 1000 iterations we check for nodes with periodic orbits (for periods up to π = 10^5) and report their number. As opposed to the case of a regular state (Fig. 3a), the nature of the transition to a periodic state is not unique. As is clear from Fig. 5a, besides a typical phase transition (red), one can observe other phenomena, including a slowly evolving non-equilibrium steady state (light blue) that does not reach an equilibrium state within 5 × 10^5 iterations (we analysed states of this sort, concluding that they do not reach equilibrium even within 10^7 iterations). Note that a sort of double phase transition is also possible (dark blue), as the architecture of the tree may induce time scales into the regularization process. Fig. 1b shows a typical situation during the time-evolution: nodes achieve periodic orbits starting from those with fewer links, away from the hub, but not necessarily connected among themselves. Once the hub-node becomes periodic the whole tree is in a dynamical equilibrium, but the path of this transition can be diverse, depending on the initial conditions.

4.1 Properties of the Periodic Dynamical State
The key property of a periodic state is that almost all the period values turn out to be integer multiples of a given number. These base multiple numbers are predominantly 240 (73% of initial conditions) or 48 (18% of initial conditions), with others being 96, 480, 720, etc. This is illustrated in Fig. 5b, where we show a histogram of period values: almost all the nodes have periods that are multiples of 240, with the remaining nodes having periods that are multiples of 48. It is to be observed that all the base multiple numbers mentioned above are themselves multiples of 48. Furthermore, we computed the averaged distribution of period values (for 20 cases with base multiple 240), shown in Fig. 5c. The distribution exhibits a power-law tail with an exponent of about -2.24, for periods up to π = 10^6. These properties clearly indicate the presence of a self-organizational mechanism behind the creation of the periodic state of CMS (2). The same base multiple
numbers are involved in the periodic states obtained for this CMS with other μ-values, but with a different ratio of their presence depending on the initial conditions (generally, the smaller the μ-value, the bigger the presence of larger base multiple numbers). Note that the periodicity of the mentioned node orbits has no similarity with the periodic and quasi-periodic orbits known to exist in the standard map's dynamics, as for this ε-value the standard map shows only very little isolated regularity (otherwise being strongly chaotic).
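A node's period and the base-multiple structure described above can be checked with a routine of the following kind; the tolerance, the maximum period searched and the example period list are illustrative, not measured data.

```python
import numpy as np
from functools import reduce
from math import gcd

def orbit_period(samples, max_period=100_000, tol=1e-9):
    """Smallest p such that the stored orbit segment repeats with period p, else None."""
    samples = np.asarray(samples)
    for p in range(1, min(max_period, len(samples) // 2) + 1):
        if np.all(np.abs(samples[p:] - samples[:-p]) < tol):
            return p
    return None

# Checking the "integer multiple" structure on a list of detected periods.
periods = [240, 480, 720, 240, 1200]          # illustrative values only
base = reduce(gcd, periods)
print(base, all(p % base == 0 for p in periods))
```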
5 Conclusions
We examined the nature of the collective dynamics of CMS (2) with the fixed coupling parameters ε = 0.9 and μc = 0.012 realized on a scalefree-tree, in terms of its time-evolution and the properties of its emerging dynamical states. We showed that for all initial conditions the dynamics becomes regular (in the sense defined above) after a critical time tc. Also, for a majority of the initial conditions the dynamics reaches a final steady state characterized by the periodicity of each node's orbit. Curiously, almost all the period values happen to be integer multiples of a given number that (depending on the initial conditions) varies within a given set of numbers. Also, for the periodic states having the base multiple number 240 (the most frequent one), the period value distribution follows a power law with a tail slope of roughly -2.24. Despite the oscillatory nature of the emergent dynamics in many CMS known so far (mainly 1D), a period-value structure of this sort does not seem to have been observed yet. We speculate that this self-organization feature might owe its origin to the time-delayed structure of the coupling and to the nature of the standard map: once a node achieves a periodic orbit, its neighbours ought to do the same, inducing the correlation in the period values. Open questions include the network organization of the periodic states in terms of the period-node relationship, as well as the investigation of the steady states of this CMS realized on different networks. It might also be interesting to study similar CMS with time-delays that vary throughout the network in a way that models a given naturally occurring behaviour. Of particular interest might also be the investigation of the non-equilibrium steady state mentioned in Fig. 5a (light blue) from a statistical point of view.

Acknowledgments. This work was supported by the Program P1-0044 of the Ministry of Higher Education, Science and Technology of the Republic of Slovenia. Many thanks to Prof. Bosiljka Tadić for her guidance and useful comments.
References
1. Levnajić, Z., Tadić, B.: Self-organization in trees and motifs of two-dimensional chaotic maps with time delay. Journal of Statistical Mechanics: Theory and Experiment (P03003) (2008)
2. Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of Networks: From Biological Nets to the Internet and WWW. Oxford University Press, Oxford (2003)
3. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.U.: Complex networks: Structure and dynamics. Physics Reports 424, 175 (2006)
4. Baroni, L., Livi, R., Torcini, A.: Transition to stochastic synchronization in spatially extended systems. Physical Review E 63, 036226 (2001)
5. Ahlers, V., Pikovsky, A.: Critical properties of the synchronization transition in space-time chaos. Physical Review Letters 88, 254101 (2002)
6. Jabeen, Z., Gupte, N.: Spatiotemporal intermittency and scaling laws in the coupled sine circle map lattice. Physical Review E 74, 016210 (2006)
7. Tadić, B., Rodgers, G.J., Thurner, S.: Transport on complex networks: Flow, jamming & optimization. International Journal of Bifurcation and Chaos 17(7), 2363 (2007)
8. Coutinho, R., Fernandez, B., Lima, R., Meyroneinc, A.: Discrete time piecewise affine models of genetic regulatory networks. Journal of Mathematical Biology 52, 524 (2006)
9. Rajesh, S., Sinha, S., Sinha, S.: Synchronization in coupled cells with activator-inhibitor pathways. Physical Review E 75, 011906 (2007)
10. Milo, R., Shen-Orr, S.S., Itzkovitz, S., Kashtan, N., Chklovskii, D., Alon, U.: Network motifs: Simple building blocks of complex networks. Science 298, 824 (2002)
11. Vega, Y.M., Vázquez-Prada, M., Pacheco, A.F.: Fitness for synchronization of network motifs. Physica A 343, 279–287 (2004)
12. Oh, E., Rho, K., Hong, H., Kahng, B.: Modular synchronization in complex networks. Physical Review E 72, 047101 (2005)
13. Arenas, A., Díaz-Guilera, A., Pérez-Vicente, C.J.: Synchronization reveals topological scales in complex networks. Physical Review Letters 96, 114102 (2006)
14. Masoller, C., Martí, A.C.: Random delays and the synchronization of chaotic maps. Physical Review Letters 94, 134102 (2005)
15. Li, C.P., Sun, W.G., Kurths, J.: Synchronization of complex dynamical networks with time delays. Physica A 361(1), 24 (2006)
16. Altmann, E.G., Kantz, H.: Hypothesis of strong chaos and anomalous diffusion in coupled symplectic maps. EPL 78, 10008 (2007)
17. Levnajić, Z., Tadić, B.: Dynamical patterns in scalefree trees of coupled 2D chaotic maps. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4488, pp. 633–640. Springer, Heidelberg (2007)
18. Ramaswamy, R.: Synchronization of strange nonchaotic attractors. Physical Review E 56(6), 7294 (1997)
Physics Based Algorithms for Sparse Graph Visualization

Milovan Šuvakov

Department for Theoretical Physics, Jožef Stefan Institute, Box 3000, 1001 Ljubljana, Slovenia
[email protected]
http://www-f1.ijs.si/~suvakov/
Abstract. Graph visualization represents an important computational tool in the analysis of complex networks. Recently, a variety of network structures in complex dynamical systems have been found which require appropriately adjusted visualization algorithms. We quantitatively test the performance of two visualization algorithms based on the energy minimization principle on a variety of complex networks, from cell-aggregated planar graphs to highly clustered scale-free networks. We find that fairly large structures with high clustering can be efficiently visualized with a spring energy model with truncated interaction.
Keywords: graph layout, complex networks, energy minimization.
1 Introduction
Networks which represent complex dynamical systems in physical, biological and social systems usually appear to have strongly inhomogeneous structures [1], which require improved methods for analysis and visualization. Some unusual properties of these networks which may affect the visualization procedure are that the networks are sparsely connected, large, strongly inhomogeneous, and/or modular. In addition, constraints such as planarity and absence of clustering, on one side, or small-worldness, scale-free organization and strong clustering, on the other, often cause difficulties for the standard visualization procedures [2]. Recent examples include physical models such as the spring algorithm [3], where the links are replaced by springs with unit natural length and the nodes interact with a repulsive force, and the Kamada-Kawai algorithm [4], where one looks for the optimal potential that makes the distances between the nodes tend to their corresponding topological distances. Another important ingredient in the visualization algorithm is the actual minimization procedure. Here we study these two "energy" models [3,4], implementing two different algorithms for finding the energy minimum: time integration and a Metropolis-based algorithm. We compare the efficiency of these two algorithms and analyze the emergent structures for different kinds of networks: trees, scale-free graphs, cellular networks which are planar graphs introduced in [5], and a gene interaction network [7]. We also suggest a modification of the spring algorithm with a truncated interaction and demonstrate its efficiency on these networks.
Terminology and Definition of the Problem: A graph G is an ordered pair (V, E) given by a nonempty set of vertices V and a set of edges E. If for each edge (u, v) ∈ E there is an edge (v, u) ∈ E, the graph is undirected (symmetrical). Graph visualization is a mapping between the vertices and two-dimensional or three-dimensional coordinates, f : V → R^n (where n ∈ {2, 3}). The quality of the mapping is not well defined, but several criteria are generally accepted. One example is the minimization of the number of edge crossings. While some graphs cannot be drawn without edge crossings, some graphs can; planar graphs belong to this group. To test the crossing-avoidance property of the visualization algorithms, in this paper we will use the model of cell-aggregated planar graphs which we presented in [5]. Other criteria for satisfactory visualization often used are: minimization of the variance in the edge lengths, maximization of the minimal angle between neighboring edges, minimization of the number of bends in the edges, minimization of the number of different slopes used in edge visualization, and others.
Fig. 1. Minimal energy configuration in the case of a small graph (panels: Kamada-Kawai and spring models)
In this paper we concentrate on the two-dimensional (planar) visualization of undirected graphs. We focus on physical models with a given Hamiltonian (energy) function which depends on the coordinates of the vertices and on the graph topology given by the adjacency matrix. The main idea of this approach is that the final planar embedding is given by a minimum of the energy function. Two different "energy" models will be studied, supplemented by two different energy minimization algorithms.
2 "Energy" Models and Minimization Algorithms
The first model is known as the Kamada-Kawai model [4] and is based on the idea of placing the vertices in the plane so that the geometric distances are correlated with the topological distances. The energy in this model is given by:

E = Σ_{i,j} k_ij (|r_i − r_j| − l_ij)^2 ,    (1)
Fig. 2. The same cellular planar graph [5] of N = 256 nodes shown with four different visualization methods (Kamada-Kawai and spring models, each minimized by time integration and by the Metropolis algorithm). Relative numbers of edge crossings χ are: 0.01%, 0.31%, 0.05%, 0.06%.
where r_i are the position vectors of the vertices, k_ij are parameters (in the algorithm used in this paper set to the constant value k_ij = 1), and l_ij is the topological distance between the vertices, i.e., the shortest path length. The shortest path between two vertices is defined as the path between them with the smallest number of edges [9]. In this case the value of l_ij is a positive integer. The physical interpretation is a harmonic interaction between every two vertices, with a minimum at the distance given by the shortest path length.

The second model is known as the spring model [3], in which the energy is given by:

E = Σ_{(i,j)∈E} k_ij (|r_i − r_j| − l_0)^2 + Σ_{i,j} g / |r_i − r_j|^η ,    (2)
where k_ij, l_0, g and η are parameters, and r_i are the position vectors of the vertices. In the physical interpretation, the first term represents springs of natural length l_0 between neighboring vertices and the second term represents an "electrostatic repulsion" between each pair of vertices.
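Both energy functions can be written down compactly; in the sketch below pos is an N × 2 array of vertex coordinates, topo_dist holds the shortest path lengths, edges is the set of undirected links, and the default parameter values (k_ij = 1, l_0 = 1, g = 1, η = 2) are illustrative rather than the ones used for the figures.

```python
import numpy as np
from itertools import combinations

def kamada_kawai_energy(pos, topo_dist, k=1.0):
    """Eq. (1): harmonic terms between every (unordered) pair of vertices."""
    E = 0.0
    for i, j in combinations(range(len(pos)), 2):
        d = np.linalg.norm(pos[i] - pos[j])
        E += k * (d - topo_dist[i][j]) ** 2
    return E

def spring_energy(pos, edges, l0=1.0, k=1.0, g=1.0, eta=2.0):
    """Eq. (2): springs on the edges plus repulsion between every pair of vertices."""
    E = 0.0
    for i, j in edges:                                    # spring term
        d = np.linalg.norm(pos[i] - pos[j])
        E += k * (d - l0) ** 2
    for i, j in combinations(range(len(pos)), 2):         # repulsive term
        E += g / np.linalg.norm(pos[i] - pos[j]) ** eta
    return E
```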
An illustration of the global energy minima for a small graph for both models is given in Fig. 1. For larger graph sizes there are a lot of local minima in the energy function, which makes the visualization procedure increasingly complex.
Fig. 3. Relaxation of the two energy models for the same planar graph shown in Fig. 2 and the two minimization algorithms. Different lines correspond to different initial conditions.
We have presented two models for graph visualization, given by an energy as a function of the positions of the vertices; the final planar embedding is given as a minimum of the energy function. In order to find the energy minimum we use two different algorithms: a Metropolis and a time-integration algorithm. In the Metropolis algorithm we calculate the energy change for a random node movement, usually drawn from a Gaussian distribution. The movement is accepted with a probability given by p = min(1, e^(-ΔE/T)), where ΔE is the energy change and T is a "temperature" parameter. We implement this algorithm according to the following steps:

Initial node positions: randomly given
Do:
  For i=1 to N:
    e0 = energy of the system
    Move node i randomly
    e1 = energy of the system
    Accept movement with probability: min(1, exp(-(e1-e0)/T))
  End loop i
Until maximal movement is smaller than a threshold value

As the second algorithm we use the numerical time-integration of the following set of equations:

dr_i/dt = −ν ∂E/∂r_i ,    (3)

where ν is a "viscosity" parameter. It is implemented as follows:
Initial node positions: randomly given
Do:
  For i=1 to N:
    Calculate the force on node i as the gradient of the energy
    Move the node
  End loop i
Until maximal movement is smaller than a threshold value
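Both minimization loops can be realized with a few lines of numpy on top of either energy function (for example the spring_energy sketch given after Eq. (2)); the temperature, step sizes, numerical-gradient step and stopping threshold below are illustrative choices, not the values used for the reported layouts.

```python
import numpy as np

def metropolis_layout(pos, energy, T=0.1, sigma=0.1, threshold=1e-3,
                      max_sweeps=1000, rng=None):
    """Metropolis minimization: Gaussian node moves accepted with
    probability min(1, exp(-dE/T)); stops when the largest accepted move is small."""
    if rng is None:
        rng = np.random.default_rng()
    pos = pos.copy()
    for _ in range(max_sweeps):
        max_move = 0.0
        for i in range(len(pos)):
            old = pos[i].copy()
            e0 = energy(pos)
            pos[i] = old + rng.normal(0.0, sigma, 2)
            e1 = energy(pos)
            if e1 <= e0 or rng.random() < np.exp(-(e1 - e0) / T):
                max_move = max(max_move, np.linalg.norm(pos[i] - old))
            else:
                pos[i] = old                              # reject the move
        if max_move < threshold:
            break
    return pos

def time_integration_step(pos, energy, nu=1.0, dt=0.01, h=1e-5):
    """One explicit Euler step of Eq. (3), dr_i/dt = -nu * dE/dr_i (numerical gradient)."""
    grad = np.zeros_like(pos)
    for i in range(len(pos)):
        for c in range(pos.shape[1]):
            shifted = pos.copy()
            shifted[i, c] += h
            grad[i, c] = (energy(shifted) - energy(pos)) / h
    return pos - dt * nu * grad
```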
Fig. 4. Emergent layouts for two different models (left column: Kamada-Kawai, right column: spring model) for a scale-free tree, m = 1 (top panels), and a clustered scale-free graph [6], m = 2 (bottom panels). In all cases the Metropolis minimization algorithm was used. Relative numbers of edge crossings χ are: 0.70%, 0.25%, 23.5%, 17.0%.
3 Algorithm Efficiency and the Graph Layouts
In Fig. 2 we show different graph layouts for the same planar graph with cellular structure [5]. The graph is grown using the algorithm given in [5] with control parameters μ2 = 1.0 and ν = 1.0. For reasons of clarity we use here a small network size of N = 256. To compare the results we will use the relative number of
edge crossings, defined as χ = Ncs/Ncs,0, where Ncs is the number of edge crossings for a given layout and Ncs,0 is the average number of edge crossings for the same graph in a random layout. In all results we calculated χ using the value of Ncs,0 averaged over 10 random layouts. The Kamada-Kawai method gives the better result (see the left column of Fig. 2). Quantitatively, this difference can be related to the faster energy relaxation and the absence of many minima shown in Fig. 3. In both cases there are a lot of local minima, but in the Kamada-Kawai case a global minimum is always found with the Metropolis algorithm. For the planar graph the difference between the minimization procedures is less pronounced in the case of the spring algorithm. The comparison between the two energy models for scale-free graphs is shown in Fig. 4. In both cases the spring algorithm gives better results.
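The quantity χ can be computed directly from a layout by counting crossing pairs of straight-line edges and dividing by the average count over random layouts; the standard orientation test below ignores degenerate collinear cases, and the choice of 10 random layouts matches the text.

```python
import numpy as np
from itertools import combinations

def _ccw(a, b, c):
    return (c[1] - a[1]) * (b[0] - a[0]) > (b[1] - a[1]) * (c[0] - a[0])

def _cross(p1, p2, p3, p4):
    return (_ccw(p1, p3, p4) != _ccw(p2, p3, p4)) and (_ccw(p1, p2, p3) != _ccw(p1, p2, p4))

def count_crossings(pos, edges):
    count = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if {a, b} & {c, d}:          # edges sharing a vertex cannot properly cross
            continue
        count += _cross(pos[a], pos[b], pos[c], pos[d])
    return count

def relative_crossings(pos, edges, samples=10, rng=None):
    """chi = N_cs / N_cs,0 with the baseline averaged over random layouts."""
    if rng is None:
        rng = np.random.default_rng()
    baseline = np.mean([count_crossings(rng.random((len(pos), 2)), edges)
                        for _ in range(samples)])
    return count_crossings(pos, edges) / baseline
```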
Fig. 5. Emergent layouts in the case of next-neighbor spring model approximation for different graphs: (a) planar graph of Fig. 2; (b) scale-free tree of Fig. 4; (c) clustered scale-free graph of Fig. 4; (d) gene network from Ref. [7]. Relative numbers of edge crossings χ are: 3.7%, 6.0%, 25.6%, 45.0%.
4 Spring Model with Next-Neighbor Interaction
A considerable computational acceleration of the algorithm is possible if the range of the interactions in Eqs. 1 and 2 can be restricted. For instance, here we use next-neighbor interactions in the sums in Eqs. 1 and 2 and apply this to different graphs, studying when the approximation gives satisfactory results. For this purpose we applied the next-neighbor spring model to four different networks shown in Fig. 5: a planar cellular graph, a scale-free tree, a clustered sparse scale-free graph, and a gene network.
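The truncation simply restricts the repulsive sum to topologically nearby pairs. Exactly which neighborhood is meant by "next-neighbor" is not fully specified above, so the sketch below, which keeps repulsion only between pairs within topological distance two, is one plausible reading rather than the implementation used for the figures.

```python
import numpy as np

def truncated_spring_energy(pos, edges, neighbors, l0=1.0, k=1.0, g=1.0, eta=2.0):
    """Spring model of Eq. (2) with repulsion kept only for topologically close pairs."""
    E = 0.0
    for i, j in edges:                                     # spring term, unchanged
        E += k * (np.linalg.norm(pos[i] - pos[j]) - l0) ** 2
    close_pairs = set()                                    # pairs at distance <= 2
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            close_pairs.add((min(i, j), max(i, j)))
            for l in neighbors[j]:
                if l != i:
                    close_pairs.add((min(i, l), max(i, l)))
    for i, j in close_pairs:                               # truncated repulsion
        E += g / np.linalg.norm(pos[i] - pos[j]) ** eta
    return E
```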
Fig. 6. Algorithm complexity N × <knn> versus the network size N: (a) the growing network model from [6] for different model parameters; (b) the gene networks from [7]
In the case when the clustering is too small, the emergent layouts are not well outstretched. However, in the case of sparse networks with a higher clustering coefficient the algorithm performs much better (see Fig. 5(c)). In the case of the non-scale-free gene network (see the example in Fig. 5(d)), the next-neighbor approximation does not give a good layout. The complexity of the energy calculation is proportional to the mean number of next neighbors <knn>. One computational step (a loop through all nodes) in both energy minimization algorithms takes N × <knn> calculations. Since the mean value <knn> increases with the system size and saturates at a threshold, the complexity of the energy calculation for all nodes appears to be proportional to the system size N instead of N^2. For network sizes around the threshold value, the exponent is between one and two (see Fig. 6).
5 Conclusions and Future Work
We tested three energy models for the two-dimensional visualization of graphs with a given size and structure defined by the adjacency matrix Cij. We gave quantitative measures of the efficiency of the energy minimization algorithms, and presented layouts for different types of graphs, in particular the graphs which appear in applications of statistical physics to real-world networks.
Table 1. For different graph structures, the methods of choice with Metropolis minimization

Graph structure          Visualization Method of Choice               χ
Planar graph with loops  Kamada-Kawai model                           0.01%
Scale-free trees         Spring model - infinite range interaction    0.26%
Scale-free clustered     Spring model - short range interaction       26%
Non-SF clustered         ??                                           45%
In summary, we find that different algorithms work differently depending on the graph structure. For each particular graph type we summarize the method of choice in Table 1. Another idea for the visualization of complex networks is to use growth models, starting with a small sub-graph and minimizing the energy while adding nodes one by one with known patterns of links (see the online Java applet on the author's web page [12]).

Acknowledgments. M.S. thanks the Marie Curie Research and Training Network MRTN-CT-2004-005728 project for financial support. Many thanks to my PhD supervisor Prof. Bosiljka Tadić for her useful help and guidance.
References
1. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.U.: Complex networks: Structure and dynamics. Physics Reports 424, 175–308 (2006)
2. Batagelj, V., Mrvar, A.: Pajek, http://vlado.fmf.uni-lj.si/pub/networks/pajek/
3. Eades, P.: A heuristic for graph drawing. Congressus Numerantium 42, 149–160 (1984)
4. Kamada, T., Kawai, S.: An algorithm for drawing general undirected graphs. Inf. Process. Lett. 31, 7–15 (1989)
5. Šuvakov, M., Tadić, B.: Topology of Cell-Aggregated Planar Graphs. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3993, pp. 1098–1105. Springer, Heidelberg (2006)
6. Tadić, B.: Dynamics of directed graphs: the world-wide Web. Physica A 293, 273 (2001)
7. Živković, J., Tadić, B., Wick, N., Thurner, S.: Statistical indicators of collective behavior and functional clusters in gene networks of yeast. The European Physical Journal B - Condensed Matter and Complex Systems 50(1), 255–258 (2006)
8. Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of Networks. Oxford University Press, Oxford (2003)
9. Bollobás, B.: Modern Graph Theory. Springer, New York (1998)
10. Tadić, B.: From Microscopic Rules to Emergent Cooperativity in Large-Scale Patterns. In: Krasnogor, N., Gustafson, S., Pelta, D., Verdegay, J.L. (eds.) Systems Self-Assembly: multidisciplinary snapshots. Elsevier, Amsterdam (2005)
11. Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows: Theory, Algorithms, and Applications. Prentice-Hall, New Jersey (1993)
12. Šuvakov, M.: http://www-f1.ijs.si/~suvakov/
High Performance Geocomputation - Preface

Yong Xue 1,2,*, Dingsheng Liu 3, Jianwen Ai 1,4, and Wei Wan 1,4

1 State Key Laboratory of Remote Sensing Science, Jointly Sponsored by the Institute of Remote Sensing Applications of Chinese Academy of Sciences and Beijing Normal University, Institute of Remote Sensing Applications, CAS, P.O. Box 9718, Beijing 100101, China
2 Department of Computing, London Metropolitan University, 166-220 Holloway Road, London N7 8DB, UK
3 Center for Earth Observation and Digital Earth, Chinese Academy of Sciences, Beijing 100080, P.R. China
4 Graduate University of Chinese Academy of Science, Yuquan Road, Beijing 100049, China
[email protected]
* Corresponding author.
Abstract. This paper presents an introduction to the Geocomputation workshop at ICCS 2008. The Workshop on Geocomputation continues the series held at the ICCS conferences in Amsterdam (2002), St. Petersburg (2003), Krakow (2004), Atlanta (2005), Reading (2006), and Beijing (2007).
1 Preface
High Performance Geo-Computation (HPGC) is the application of high-performance computational science to the Earth sciences. It applies high-performance computational resources to various types of Earth science data, information, and models for solving Earth science problems, and it develops discipline-specific theories, algorithms, architectures, systems, supporting tools, and infrastructure within the overall context of computational science. HPGC is concerned with new computational techniques, algorithms, and paradigms that depend upon and can take advantage of high-performance computing, distributed computing, and high-throughput computing. The areas of application for HPGC include, but are not limited to, spatial data analysis, dynamic modelling, simulation, space-time dynamics and visualization, virtual reality, and applications employing non-conventional data clustering and analysis techniques. HPGC is an integrated computing environment. It is driven by advances in computing technologies, such as cluster computing, pervasive and ubiquitous computing, and Grid computing, as parts of a common framework that offers the best immersion of users and applications in the global environment. However, the key technologies for a Geo-grid differ from those for general-purpose Grid computing in many respects; they refer to the methods and solutions used to implement geoscientific applications on top of a grid infrastructure. These technologies span three levels:
1. Grid adaptation to geoscience. Incorporating general-purpose Grid middleware and third-party components into Earth science missions aims to change the Grid to better meet the needs of the geo community. For example, traditional Grid services for data management need to be modified to cope with the complexity and volume of spatial data processing, by adapting the data transfer protocol and improving the data replica mechanism.
2. Grid-enabled geo resources. Builders and administrators adapt diverse resources of geo-spatial data, models, and algorithms, with special emphasis on encapsulating, deploying, and registering them in the grid environment so that users can access these resources. Grid-enabled geo resources are one of the most important parts of a Geo-Grid.
3. Grid-enabled geographic applications. A Grid-enabled geographic application is a set of mutually dependent operation invocations together with the contexts in which those operations run. Grid applications in the geoscience field include resource description mechanisms, platform building tools, visualization tools, and development tools. These toolboxes provide the essential foundation for building large-scale Geo-science application systems.
This workshop consists of six papers. All papers have gone through the normal refereeing process of the ICCS conference. They discuss recent results on several aspects of geocomputation, including theories and applications. The first paper, by Jankowski, deals with the numerical simulation of the threshold-crossing problem, which allows us to assess the probability that a random field of contamination does not exceed a fixed level in a certain two-dimensional (2-D) spatial domain. The numerical evaluation of the Legendre functions is especially challenging for the very high degrees and orders required for advanced geocomputations; the computational aspects of SHTs and their inverses using both quadrature and least-squares estimation methods are discussed in the paper by Blais et al., with special emphasis on numerical preconditioning that guarantees reliable results for degrees and orders up to 3800 in REAL*8 or double-precision arithmetic. In the paper by Tawfik et al., a prototype which uses context information and a range of design and user-related criteria to analyse the accessibility of road layouts and plan routes is developed; a case study is used to validate the prototype. In the paper by Cope et al., the authors examine emerging urgent Earth and geoscience workflows, the data services used by these workflows, and their proposed urgent data management framework for managing urgent data services. The design, analysis and implementation of the InterCondor system are presented in the paper by Xue et al.; the InterCondor system is an implementation of the concept of InterGrid. Finally, Liu et al. in their paper propose a new architecture of SIG, upon which the constituted GIS nodes can provide GIServices based on the existing SIG platform. Last but not least, we are grateful to the authors and referees for their hard work.
Study on Implementation of High-Performance GIServices in Spatial Information Grid

Fang Huang1,2,4, Dingsheng Liu1, Guoqing Li1, Yi Zeng1,3,4, and Yunxuan Yan1,4

1 Center for Earth Observation and Digital Earth, Chinese Academy of Sciences, Beijing 100086, P.R. China
2 Institute of Remote Sensing Applications, Chinese Academy of Sciences, Beijing 100101, P.R. China
3 Institute of Electronics, Chinese Academy of Sciences, Beijing 100090, P.R. China
4 Graduate University of Chinese Academy of Sciences, Beijing 100049, P.R. China
{fhuang, dsliu, gqli, yzeng, yxyan}@ceode.ac.cn
Abstract. Providing geo-spatial data services (GDS) and processing functionality services (PFS) are the key issues in a spatial information grid (SIG). In particular, it is crucial for SIG to offer PFS related to Geographic Information Science (GIS) rather than focusing only on the Remote Sensing (RS) field, and implementing high-performance GIServices is the main task of SIG in offering PFS for GIS. The lack of high-performance GIServices mainly results from limitations of the architecture as well as from the complexity of service implementation and encapsulation. Based on the existing SIG platform, we propose a new SIG architecture, upon which the constituted GIS nodes can provide GIServices. Within the improved architecture, parallel GRASS GIS algorithm programs, which are built with different parallelization patterns and run in a cluster with good efficiency, are encapsulated into high-performance GIServices guided by a generic mode. Finally, the analysis of a test case demonstrates that the approach achieves our aims. Keywords: Spatial Information Grid (SIG); GIServices; Cluster; GRASS GIS.
1 Introduction
Generally, a spatial information grid (SIG) is a fundamental infrastructure that can collect and share all types of geospatial information rapidly and effectively, with powerful capabilities for service on demand, geospatial data management and information processing. In addition, SIG is a distributed environment that combines resources such as geospatial data, computing, storage and processing tools to supply services to geospatial applications [1]. The current SIG platform has been supported by the Hi-Tech Research and Development Program of China [2]. It comprises several grid nodes, such as a data grid node, a computing grid node, and a controlling and management node, which are built on a basic grid container. Moreover, the platform integrates software such as Titan (a commercial package for RS image processing) [3] and PIPS (a parallel RS image processing package for clusters developed by CEODE, CAS) [4]. In all, the platform can provide both geo-spatial data services (GDS) and processing functionality services (PFS).
The former can handle terabytes of data; the latter not only provides normal RS services (derived from sequential RS algorithms), but also offers some high-performance RS services (encapsulated from parallel RS algorithms running in a cluster with high performance and better efficiency). However, the current PFS in SIG have not yet been extended to the GIS field. Providing high-performance GIServices in SIG is crucial but challenging, mainly because: (1) the existing architecture is not flexible enough to support extending a GIS node to provide GIServices, since node construction and the service procedure for GIS differ considerably from those for RS; (2) it is difficult to obtain parallel GIS programs running on a Linux cluster with common commercial GIS packages; and (3) there are no guidelines for GIServices implementation. Owing to these limitations, most geospatial end users cannot access GIS-related services through SIG. The challenges that arise are thus how to overcome the architectural limitations and how to implement GIServices, especially algorithm processing functionality, in an easy and convenient way. This paper puts forward a new SIG layout, which can support GIS nodes and provide GIServices. Within the improved architecture, parallel GRASS GIS (Geographic Resources Analysis Support System) [5] algorithm programs, which are reconstructed with different parallelization patterns and show good speed-up and efficiency, are encapsulated into high-performance GIServices on a cluster with the assistance of SIG tools. The paper is organized as follows. Section 2 gives a brief introduction to SIG's improved architecture. Based on this, Section 3 proposes several parallelization patterns for extracting parallel programs from the GRASS GIS package. The subsequent section concentrates on the encapsulation mode for those parallel programs and explains the service invocation flow. In Section 5, a practical way to evaluate the efficiency of the high-performance GIServices is put forward, based on a test example. Finally, Section 6 gives some conclusions.
2 Improved Architecture and Analyses for Implementation
2.1 Improved Architecture
Vector data, one of the data structure types of GIS, has different characteristics from RS data, which makes the algorithms based on it relatively complicated. This difference makes it much more difficult to extract parallel programs from a GIS package on a Linux cluster and to wrap them into GIServices, especially into high-performance GIServices. Meanwhile, the existing SIG architecture paid little attention to these aspects, which makes it difficult to provide GIServices on the SIG platform directly. Thus, the overall layout of SIG needs to be improved with these factors in mind. The new architecture is illustrated in Fig. 1. In Fig. 1, the SIG container is the fundamental part of SIG; it is a middleware combination of several grid tools that is well suited to applying grid technology in the geospatial field. Through the container, the different grid nodes can easily communicate and accomplish a task collaboratively in the distributed, heterogeneous environment. In the overall arrangement, there are four types of nodes in SIG: the SIG management & controlling node (SIG MC Node), the geo-spatial data Grid Service node (GDS Node), the processing functionality service node (PFS Node), and the SIG Web portal.
Fig. 1. Improved architecture of SIG. In the initial stage, there are only some RS nodes. In the new arrangement, some GIS nodes are added and can provide some GDS and PFS of GIS. Especially, the new architecture facilitates providing high-performance GIServices.
The SIG Web portal is the only entrance to SIG. Through authentication and authorization, the user can select the appropriate services. The SIG MC Node is the controlling and management centre of SIG. It not only manages operations such as authentication, authorization and transaction control in the SIG Web portal, but also takes responsibility for the Grid service nodes, including updating of service registry information, version management, node maintenance, state control, resource scheduling and service control. The GDS Node publishes the stored RS/GIS data in the form of services through SIG; those services are called GDS for RS and GDS for GIS, respectively, and users can share or download them through the SIG Web portal. Meanwhile, the PFS Node serves PFS related to RS and GIS, called PFS for RS and PFS for GIS respectively. Each of these two types of PFS is further divided into normal PFS (encapsulated from sequential programs) and high-performance PFS (encapsulated from parallel programs running in a cluster) according to the computing environment and other factors. Thus, we can provide four kinds of PFS, namely S-PFS for RS/GIS and HP-PFS for RS/GIS.
2.2 Analyses for High-Performance GIServices Implementation
From the above, high-performance GIServices belong to HP-PFS for GIS. The procedure for implementing high-performance GIServices can be deduced from that of HP-PFS for RS, and mainly includes:
Step 1: obtaining parallel GIS programs with high speed-up and good efficiency on a Linux cluster; and
Step 2: encapsulating those programs into SIG services, guided by an encapsulation mode.
Step 1 deserves particular attention for implementing high-performance GIServices, because: (1) using commercial GIS packages, we cannot obtain parallel programs for the required GIS algorithms,
since we cannot access their source code; and (2) those commercial packages mostly run on Windows, while our computing platform is a Linux system, which, from the point of view of performance and convenience, adds extra difficulty to the first step. Because of these factors, the subsequent step cannot proceed without the parallel GIS algorithm programs produced in Step 1.
2.3 Our Approach
After careful study, we selected an open-source GIS package for Linux, GRASS GIS, as our research object, which overcomes the difficulties mentioned above. The following section discusses several parallelization patterns for it, through which parallel GRASS GIS modules with good speed-up can be reconstructed with relative ease.
3 Reconstructed Parallel GIS Programs Based on GRASS GIS
Our cluster is a Linux-based system built from commodity PCs and belongs to the shared-disk architecture. In general, special libraries such as MPI (Message Passing Interface) [6] are needed to develop parallel programs on it.
3.1 Several Parallel Patterns for GRASS GIS
Parallel programming involves developing a single computer program in such a way that it can be executed by more than one processor simultaneously. Data partitioning and function partitioning are effective parallel programming techniques for most applications [7, 8]. Taking into account the characteristics of the cluster, the GRASS GIS geo-database and other factors, we tentatively put forward several parallelization patterns for GRASS GIS. These patterns comprise the multi-user data parallelization pattern (MUDPP), the GRASS GIS algorithm parallelization pattern (GGAPP) and the duple parallelization pattern (DPP). MUDPP is a data-partitioning method based on the multi-user runtime environment (MURE) and geo-database of GRASS [9]; in fact, MUDPP follows the SPMD (single program, multiple data) development mode. GGAPP is dedicated to developing parallel programs with the function-partitioning technique; its independent modules concentrate on v.path and v.buffer. DPP is the integration of MUDPP and GGAPP. Owing to space limitations, we do not specify the last two patterns further; they deserve further study and will be introduced in dedicated articles. Here we focus only on MUDPP.
3.2 Working Principle and Generic Mode of MUDPP
3 Reconstructed Parallel GIS Programs Based on GRASS GIS Our cluster is Linux based system established by commodity PCs, and belongs to shared disk architecture. In general, it needs some special libraries such as MPI (Message Passing Interface) [6] to develop parallel programs on it. 3.1 Several Parallel Patterns for GRASS GIS Parallel programming involves developing a single computer program in such a way that it can be executed by more than one processor simultaneously. Data partitioning and function partitioning are effective parallel programming techniques for most applications [7, 8]. Taken account of the characteristics of cluster, the database and other factors of GRASS GIS, we tentatively put forward several parallel patterns for GRASS GIS. Those patterns mainly comprise multi-user data paralleling pattern (MUDPP), GRASS GIS algorithm parallel pattern (GGAPP) and duple parallel pattern (DPP). MUDPP is the method of data partitioning based on the multi-user runtime environment (MURE) and geo-database of GRASS [9]. In fact, MUDPP is the development mode of SPMD (single program multi data). GGAPP dedicates to develop some parallel programs with function partitioning technique. Those independent modules concentrated on v.path and v.buffer. DPP is the integration of MUDPP and GGAPP. For the limitation of paper space, we don’t further to specify the last 2 patterns, which deserve further study and will be introduced in certain special articles. Namely, here we only focus on MUDPP. 3.2 Working Principle, Generic Mode of MUDPP In MUDPP, some GRASS mapsets belong to one LOCATION are established both in master and slave nodes of cluster. The working principle of MUDPP is described as follows. Firstly, the mapset belongs to master node will partition the input map into litter subparts and send instructions to the corresponding mapsets located in slave nodes. When the slave mapsets finished their own subtasks by receiving and invoking the same processing instructions concurrently, the master will merge the sub-results as the entire output map. Of course, the output must be identical to the processing result when the same task processed in sequential.
Thus, the problems of data partitioning and merging during input and output deserve sufficient attention. Since GRASS has only two data types, raster and vector [10], if we can solve their partitioning and merging problems, some GRASS modules can be parallelized in the cluster without many changes to their code. This is exactly what our generic model [9] is dedicated to achieving.
3.3 Realization of MUDPP
To accomplish MUDPP, some modules must first be established. They fall into two kinds: partitioning & merging modules, and a universal module. The former partition and stitch the input/output raster/vector datasets; the latter parallelizes several GRASS GIS functionalities with one universal program by invoking the other modules. The functionalities of those modules are listed in Table 1.

Table 1. In MUDPP, p.universal obtains the parallelized GRASS modules by invoking the remaining modules

Module name            | Functionality
RunModuelInOneUser.sh  | Starts GRASS GIS in one mapset, either on the master or on a slave, so that the functionality modules can run in the active mapset.
p.universal            | The implementation of MUDPP. Within this pattern, the parallel GRASS GIS modules are reconstructed with the SPMD method.
p.r.in.partition       | Performs the partitioning processing for a raster map.
p.r.out.merge          | Performs the merging procedure for a raster map.
p.v.in.partition       | Performs the partitioning processing for a vector map.
p.v.out.merge          | Performs the merging procedure for a vector map.
Fig. 2. Flow chart of MUDPP development. The universal module invokes the partitioning and merging modules with MPI under the fundamental GRASS GIS environment.
Fig.2 illustrates the development of MUDPP, which invokes the fundamental modules of GRASS GIS, and MPI.
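To illustrate the partition–process–merge cycle behind MUDPP in Fig. 2, here is a small stand-alone sketch that mimics it with a local thread pool rather than the MPI/GRASS mapset machinery actually used; the row-band tiling and the doubling operation are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

/** Illustrative SPMD-style flow: the "master" splits a raster into row bands,
 *  the "slaves" (worker threads) apply the same operation to each band
 *  concurrently, and the master merges the partial results. */
public class PartitionProcessMerge {
    public static double[][] run(double[][] raster, int workers) throws Exception {
        int rows = raster.length, cols = raster[0].length;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<Future<double[][]>> parts = new ArrayList<>();

        int band = (rows + workers - 1) / workers;
        for (int start = 0; start < rows; start += band) {      // partition
            final int lo = start, hi = Math.min(start + band, rows);
            parts.add(pool.submit(() -> {
                double[][] out = new double[hi - lo][cols];     // process one band
                for (int r = lo; r < hi; r++)
                    for (int c = 0; c < cols; c++)
                        out[r - lo][c] = raster[r][c] * 2.0;    // stand-in operation
                return out;
            }));
        }

        double[][] merged = new double[rows][cols];             // merge
        int r = 0;
        for (Future<double[][]> f : parts)
            for (double[] row : f.get())
                merged[r++] = row;
        pool.shutdown();
        return merged;
    }

    public static void main(String[] args) throws Exception {
        double[][] demo = {{1, 2}, {3, 4}, {5, 6}};
        System.out.println(run(demo, 2)[2][1]);   // prints 12.0
    }
}
```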
4 GIServices Encapsulation Mode and Invoking Flow in SIG
Through these patterns, parallel executable programs for GRASS modules become available in the cluster. Using the relevant tools, we can wrap the parallel modules into high-performance GIServices following the encapsulation mode below.
4.1 High-Performance GIServices Encapsulation Mode
Since these parallel programs run in a Linux cluster and still need the support of the fundamental GRASS environment, their encapsulation is more involved than that of RS PFS. Integrating with the existing SIG platform, four steps are followed (Fig. 3):
Step 1. Extract the executable parallel programs (C programs) from the GIS package; these programs are reconstructed with the patterns mentioned above.
Step 2. Encapsulate those executables into .class files with the help of the Java JNI (Java Native Interface). When the Java program runs successfully locally, the service entity has been implemented successfully.
Step 3. Publish the service entities (Java class files) as SIG services with Tomcat. As a result, the WSDL (Web Services Description Language) file is produced.
Step 4. Register the published high-performance GIServices with the SIG MC Node using the corresponding tools.
Fig. 3. The high-performance GIServices encapsulation mode. The details for each step are illustrated under the figure.
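As a rough illustration of Step 2 above, the fragment below shows the general shape of a JNI wrapper that exposes a native routine to Java; the class name, the native method signature and the library name are hypothetical and are not taken from the actual SIG service entities.

```java
/** Hypothetical JNI wrapper around a native (C) GRASS-based routine.
 *  The native side would be compiled into libgisproc.so (assumed to exist)
 *  and would launch the parallel module on the cluster. */
public class GisProcService {
    static {
        // Loads libgisproc.so from java.library.path.
        System.loadLibrary("gisproc");
    }

    /** Declared in Java, implemented in C; returns 0 on success. */
    public native int runModule(String moduleName, String inputMap, String outputMap);

    public static void main(String[] args) {
        int status = new GisProcService().runModule("p.universal", "elevation", "contours");
        System.out.println("native module exit status: " + status);
    }
}
```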
4.2 High-Performance GIServices Invoking Flow in SIG
When a published GIService is needed, the user selects the corresponding data for processing located in SIG. After the processing is completed, the results can either be viewed online or downloaded to the user's computer. Fig. 4 shows the whole workflow of invoking the services in the SIG platform in detail.
Fig. 4. Invoking workflow of the published high-performance GIServices in SIG. Follow the figure, some explanations for the invoked steps are listed.
5 Analysis of the High-Performance GIServices Efficiency
A suitable way is also required to validate the efficiency of the high-performance GIServices obtained through these parallelization patterns. The most direct method would be to compare a high-performance GIService with the corresponding normal GIService under identical conditions. In practice, however, a sequential GIService may be developed from a commercial GIS package whose computing efficiency differs from that of GRASS GIS, and uncertain factors such as the state of the network may affect each service invocation. These factors must be taken into account when evaluating service efficiency. We therefore propose the following formula:
$T_{\mathrm{GIServices}} = T_{\mathrm{Data\_Acquisition}} + T_{\mathrm{Data\_Processing}} + T_{\mathrm{Communication}} + T_{\mathrm{Result\_Download}} + T_{\mathrm{Others}}$    (1)

Here, T_GIServices represents the total elapsed time of the whole service; T_Data_Acquisition is the time consumed by data acquisition, by means of downloading or sharing; T_Data_Processing is the processing time of the program with the same dataset on a given computing platform; T_Communication is the communication time, related to the network; T_Result_Download is the time for the user to download the results; and the last part, T_Others, is the elapsed time of the remainder not covered by the terms above. The equation describes the time consumed by a GIService in a dynamic environment and can be used to represent the efficiency of the GIService indirectly. If we assume that the two kinds of GIServices operate under the same conditions, all terms in (1) except T_Data_Processing have identical values. The overall service efficiency therefore depends on T_Data_Processing, i.e., we can use it to represent the corresponding GIService's efficiency. To illustrate the efficiency of high-performance GIServices, we select sequential and parallel programs both developed from GRASS GIS, which avoids differences in computing capability between different GIS packages. Table 2 shows the values of T_Data_Processing for r.example and r.contour in sequential and parallel form, respectively.

Table 2. Approximate consuming time (s) of r.example and r.contour in sequential and parallel forms for different numbers of processors on the same computing platform

Module name            | 1   | 2   | 4   | 6   | 8   | 10  | 12  | 20
r.example              | 158 | /   | /   | /   | /   | /   | /   | /
p.universal/r.example  | /   | 129 | 119 | 91  | 101 | 105 | 106 | 115
r.contour              | 148 | /   | /   | /   | /   | /   | /   | /
p.universal/r.contour  | /   | 157 | 63  | 45  | 38  | 37  | 34  | 39

From the comparison we see that the values of T_Data_Processing differ considerably. For suitable numbers of processors (more than two), the parallel modules have better efficiency than the ordinary modules. We can therefore deduce that, under the same conditions (the same dataset, network environment, computing platform and so on), the high-performance GIServices have better efficiency than the corresponding normal GIServices, especially for large datasets whose processing is computation intensive.
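As a quick reading of Table 2, the usual parallel speedup can be computed from these timings; the definition below is a standard one added here for convenience, not a metric reported by the authors.

```latex
% Parallel speedup relative to the sequential module (values from Table 2)
\[
  S(p) = \frac{T_{\text{seq}}}{T_{\text{par}}(p)}, \qquad
  S_{\texttt{r.contour}}(6) = \frac{148\ \text{s}}{45\ \text{s}} \approx 3.3, \qquad
  S_{\texttt{r.example}}(6) = \frac{158\ \text{s}}{91\ \text{s}} \approx 1.7 .
\]
```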
6 Conclusions
Much work is still needed to explore efficient approaches to parallelizing GRASS GIS algorithms in a cluster beyond the three parallelization patterns mentioned above. Moreover, more high-performance GIServices based on the new architecture with GRASS GIS still need to be constructed. Nevertheless, the test examples and the analysis of the experimental GIServices lead to some useful conclusions: (1) the new architecture is practicable for constructing GIServices; (2) the parallelization patterns, especially MUDPP, are suitable for producing parallel GIS algorithms in a cluster; and (3) the constructed high-performance GIServices have better efficiency than the corresponding normal GIServices.
References
1. Jin, J.J.: The applications of grids in geosciences (in Chinese), http://support.iap.ac.cn/bbs/viewthread.php?tid=176&extra=page%3D1
2. http://www.863.org.cn
3. http://www.otitan.com/index.shtml
4. http://159.226.224.52:8021/showdetail.asp?id=2
5. http://grass.itc.it/
6. http://www-unix.mcs.anl.gov/mpi/mpich
7. Brawer, S.: Introduction to Parallel Programming. Academic Press, San Diego (1989)
8. Wang, F.J.: A Parallel GIS-Remote Sensing System for Environmental Modeling. In: IGARSS 1992, pp. 15–17 (1992)
9. Huang, F., Liu, D.S., Liu, P., et al.: Research on Cluster-Based Parallel GIS with the Example of Parallelization on GRASS GIS. In: GCC 2007, pp. 642–649 (2007)
10. Blazek, R., Neteler, M., Micarelli, R.: The new GRASS 5.1 vector architecture. In: Proceedings of the Open Source GIS–GRASS User Conference 2002 (2002)
Numerical Simulation of Threshold-Crossing Problem for Random Fields of Environmental Contamination

Robert Jankowski

Faculty of Civil and Environmental Engineering, Gdańsk University of Technology, ul. Narutowicza 11/12, 80-952 Gdańsk, Poland
[email protected]

Abstract. The present paper deals with the numerical simulation of the threshold-crossing problem, which allows us to assess the probability that a random field of contamination does not exceed a fixed level in a certain two-dimensional (2-D) spatial domain. A real-valued, homogeneous random field described by the mean value and the (differentiable) covariance function is assumed as the basic theoretical model of the contamination field. In the numerical simulation, a suitable discrete model defined on a regular or irregular grid has been developed and tested by the conditional simulation method. The practical example concerns a case study of heavy metal concentration in soil in the northern part of Poland. The results of the study indicate that theoretical modelling of the level crossings in 2-D random fields with a continuous parameter shows good agreement with the numerical simulations of the fields with a discretised parameter. Keywords: random fields, environmental contamination, numerical simulation, threshold-crossing.
1 Introduction In recent years, the theory of random fields has been intensively studied and applied to a number of randomly occurring geographical and environmental processes [1-3]. In particular, numerical methods of modelling of random fields of contamination have been shown to be very useful for the purposes of monitoring contamination level as well as for predicting unknown contamination values (see, for example, [3-5]). The present paper is dedicated to the development of a new approach to analyse and control 2-D fields of environmental contamination. It deals with the numerical method of assessing the probability, that a random field of contamination does not exceed a fixed level in a certain two-dimensional (2-D) spatial domain. The main concept of this work is the mean number of upcrossings of the field level, calculated for some intervals in one-dimensional subspaces of the 2-D domain. Theoretical model and numerical simulations are treated jointly, which leads to a deeper understanding of the random phenomena in terms of covariance functions, optimal sampling and probabilities of threshold crossings.
2 Model of Random Fields of Environmental Contamination
The study presented in this paper concerns the spatial behaviour of some natural geographical or environmental phenomena, e.g. contamination of soil. The approach is
based on the interpretation of the natural process by means of spatial random field models (see [1-3]). It is assumed that $X(\mathbf{r})$ represents a scalar, in general space-nonhomogeneous random field, where $\mathbf{r} \in R^2$ denotes a two-dimensional position vector. The so-called second-order field is characterised in terms of its mean value function:

$m_x(\mathbf{r}) = E\big(X(\mathbf{r})\big)$    (1)

and spatial covariance function:

$K_x(\mathbf{r}_1, \mathbf{r}_2) = E\big( (X(\mathbf{r}_1) - m_x(\mathbf{r}_1)) \cdot (X(\mathbf{r}_2) - m_x(\mathbf{r}_2)) \big)$,    (2)

where $E(\cdot)$ denotes the expectation operator and $\mathbf{r}_1, \mathbf{r}_2 \in R^2$. The following three concepts: homogeneity, ergodicity and isotropy serve as useful hypotheses. The second-order field $X(\mathbf{r})$ is called space-homogeneous if its mean and covariance functions do not change under a shift of the vector arguments:

$m_x(\mathbf{r}) = \mathrm{const}$,    (3)

$K_x(\mathbf{r}_1, \mathbf{r}_2) = K_x(\boldsymbol{\rho})$,    (4)

where $\boldsymbol{\rho} = \mathbf{r}_2 - \mathbf{r}_1$ is the distance vector. The homogeneous field is ergodic if the statistical information is included in the single realisation available. A special case of a homogeneous random field is an isotropic field. In this case, the covariance function depends only on the length $\rho$ of the distance vector:

$K_x(\boldsymbol{\rho}) = K_x(|\boldsymbol{\rho}|) \equiv K_x(\rho)$.    (5)
Very important for our analysis is the fact that, for a homogeneous random field, the behaviour of its covariance function in the neighbourhood of $\rho = 0$ may be a determining factor with regard to differentiability (in the so-called mean-square sense) of the field. For example, a homogeneous 1-D field is differentiable (in the mean-square sense) if, and only if, its covariance function $K_x$ has a second derivative at $\rho = 0$. Moreover, the covariance function of the first derivative of the process equals:

$K_{x'}(\rho) = -\dfrac{d^2}{d\rho^2} K_x(\rho) \equiv -K_x''(\rho)$.    (6)

The following covariance function describes a differentiable, homogeneous, isotropic, 2-D random field (named the Shinozuka field, see [5]):

$K_x(\mathbf{r}_2 - \mathbf{r}_1) = \sigma_x^2 \cdot \exp\!\big(-\alpha\big((r_{2x} - r_{1x})^2 + (r_{2y} - r_{1y})^2\big)\big) \equiv \sigma_x^2 \exp\!\big(-\alpha(\rho_x^2 + \rho_y^2)\big)$,    (7)
where $\sigma_x$ is the standard deviation of the field, $\alpha > 0$ is a scale parameter describing the degree of space correlation, and the indices $x, y$ on the right-hand side denote the orthogonal axes. The process along the line $x = 0$ has the covariance function:

$K_x(\rho_y) \equiv K_x(\rho) = \sigma_x^2 \cdot \exp(-\alpha\rho^2)$.    (8)

According to Equation (6) one obtains:

$K_{x'}(\rho) = -\dfrac{d}{d\rho}\big(-2\alpha\rho\,\sigma_x^2 \exp(-\alpha\rho^2)\big) = -\big(-2\alpha\sigma_x^2 \exp(-\alpha\rho^2) + 4\alpha^2\rho^2\sigma_x^2 \exp(-\alpha\rho^2)\big)$    (9)

and finally:

$K_{x'}(0) = 2\alpha\sigma_x^2$.    (10)

If the covariance function of the 2-D field is of the form:

$K_x(\boldsymbol{\rho}) = \sigma_x^2 \cdot \exp(-\alpha\rho_x^2 - \beta\rho_y^2)$,    (11)

where $\alpha \neq \beta$ ($\alpha > 0$, $\beta > 0$), then the field is anisotropic but homogeneous and differentiable (in the mean-square sense). An example of a non-differentiable (in the mean-square sense) although homogeneous and isotropic 2-D field is the so-called white-noise field, defined by the covariance function:

$K_x(\boldsymbol{\rho}) = \begin{cases} \sigma_x^2 & \text{for } \boldsymbol{\rho} = 0 \\ 0 & \text{for } \boldsymbol{\rho} \neq 0 \end{cases}$    (12)

and the zero-mean value function.
3 Upcrossing Problem in 2-D Contamination Fields with Continuous Arguments
First principles allow us to derive an upper bound on the probability $P_u$ of upcrossing some deterministic level $u(r)$ in a 1-D homogeneous random field $X(r)$, where $r \in R$. Let $N_u(S)$ denote the number of upcrossings in the space interval $[0, S]$. The probability is expressed as:

$P_u\big(X(r) \ge u(r) \text{ for some } r \in [0, S]\big) = P_u\big(\text{upcrossing at } r = 0 \text{ or } N_u(S) \ge 1\big) = P_u\big(\text{upcrossing at } r = 0\big) + P_u\big(N_u(S) \ge 1\big) - P_u\big(\text{upcrossing at } r = 0 \text{ and } N_u(S) \ge 1\big)$.    (13)

It should be noticed that the last, negative term of Equation (13) is smaller than the smallest positive one. Therefore, from the theorem of the total event, the upper bound on $P_u$ is found:
$P_u\big(X(r) \ge u(r) \text{ for some } r \in [0, S]\big) \le P_u\big(X(0) \ge u(0)\big) + P_u\big(N_u(S) \ge 1\big)$.    (14)

This upper bound is further developed as:

$P_u\big(X(0) \ge u(0)\big) + P_u\big(N_u(S) \ge 1\big) \le P_u\big(X(0) \ge u(0)\big) + \sum_{n=1}^{\infty} P_u\big(N_u(S) = n\big) \le P_u\big(X(0) \ge u(0)\big) + \sum_{n=1}^{\infty} n\,P_u\big(N_u(S) = n\big) = P_u\big(X(0) \ge u(0)\big) + E\big(N_u(S)\big)$.    (15)

The last approximation is appropriate if:

$P_u\big(N_u(S) = 1\big) \gg \sum_{n=2}^{\infty} n\,P_u\big(N_u(S) = n\big)$.    (16)

The above inequality is valid in practical situations with a high level $u$, when clustering of crossings can be neglected. The classical theory of Rice (see [1]) gives the mean value of $N_u(S)$ in terms of the covariance function $K_x$ of the underlying process. If $X(r)$ is a zero-mean, homogeneous, differentiable (in the mean-square sense) Gaussian process on $[0, S]$, then:

$E\big(N_u(S)\big) = \dfrac{S}{2\pi} \sqrt{\dfrac{-K_x''(0)}{K_x(0)}}\, \exp\!\left(-\dfrac{u^2}{2K_x(0)}\right)$.    (17)

For example, in the case of covariance function (8), from Equations (9)–(10) one may obtain:

$E\big(N_u(S)\big) = \dfrac{S}{2\pi} \sqrt{2\alpha}\, \exp\!\left(-\dfrac{u^2}{2\sigma_x^2}\right)$.    (18)
It is useful to evaluate the numerical values of the two terms on the right-hand side of the basic formula (15). Let us consider a practical environmental random field of soil contamination by a heavy metal (chromium) in the northern part of Poland, the Gdańsk region (see also [5,8,9]). The measured chromium concentration values in the soil are as follows: lower bound a = 11.3 ppm, upper bound b = 26.1 ppm, mean value m = 18.7 ppm. The upper bound of the space interval and the scale parameter are equal to S = 92 km and α = 0.12 km⁻², respectively. The three constant u levels above the mean value, u = m + 2σx, u = m + 3σx, u = m + 4σx (σx = 1.48 ppm), for a Gaussian 1-D random field with covariance function (8) are taken into consideration. The probability values for different levels of u, obtained using Equation (15) and Equation (18), are presented in Table 1. It can be clearly seen from Table 1 that the second term is the dominating one in the probability upper bound.
Table 1. Theoretical probability of upcrossing for different levels of u

u level   | Pu(X(0) ≥ u) | E(Nu(S)) | Pu(X(0) ≥ u) + E(Nu(S))
m + 2σx   | 0.0228       | 0.9684   | 0.9912
m + 3σx   | 0.0014       | 0.0796   | 0.0810
m + 4σx   | 0.0001       | 0.0024   | 0.0025
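As a check added here (not in the original text), substituting the case-study parameters and the level u = m + 2σx, i.e. a normalised exceedance of 2σx, into Equation (18) reproduces the second column of Table 1 up to rounding of the inputs:

```latex
\[
  E\big(N_u(S)\big) = \frac{S}{2\pi}\sqrt{2\alpha}\,
    \exp\!\left(-\frac{(2\sigma_x)^2}{2\sigma_x^2}\right)
  = \frac{92}{2\pi}\sqrt{2 \cdot 0.12}\; e^{-2}
  \approx 14.64 \cdot 0.490 \cdot 0.135 \approx 0.97 .
\]
```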
The next model connected with the upcrossing level in 2-D fields deals with certain non-differentiable covariances. We make use of the so-called Slepian inequality [10]. If $X(\mathbf{r})$ and $Y(\mathbf{r})$ are two zero-mean Gaussian fields such that for all $\mathbf{r}_1, \mathbf{r}_2 \in C \subset R^2$, where $C$ is a compact domain:

$E\big(X(\mathbf{r}_1) X(\mathbf{r}_2)\big) \ge E\big(Y(\mathbf{r}_1) Y(\mathbf{r}_2)\big)$    (19)

and

$E\big(X^2(\mathbf{r})\big) = E\big(Y^2(\mathbf{r})\big)$,    (20)

then for any level u:

$P\Big(\sup_{C} X(\mathbf{r}) \ge u\Big) \le P\Big(\sup_{C} Y(\mathbf{r}) \ge u\Big)$,    (21)

where "sup" denotes the least upper bound. In the special 1-D case described by the covariance of the white-noise field type (see Equation (12)), it follows from inequality (21) that the probability of the upcrossing is greater than in the case of the covariance (8), since conditions (19)–(20) are fulfilled. The same conclusion is valid in the 2-D case described by Equation (7) and Equation (12).
4 Model of Conditional Simulations of Discretised Random Fields
For the numerical simulation of the threshold-crossing problem we have to consider the discrete-parameter random field in the form of multi-dimensional continuous random variables. The variables are defined at every node of a regular or irregular spatial grid. An important question that arises is: at what points in the parameter space should we sample a random field? Starting from the informational-theoretic approach, Vanmarcke [1] concludes that the length of the optimal sampling interval Δr may be expected to be proportional to the scale of fluctuation θ, and he proposes:

$\Delta r = \tfrac{1}{2}\theta$,    (22)

where:

$\theta = \dfrac{2}{\sigma_x^2} \int_0^{\infty} K_x(\rho)\, d\rho$,    (23)

for a 1-D homogeneous field, ergodic in the mean. In the case of the squared exponential covariance function (8), one obtains:

$\theta = 2 \int_0^{\infty} \exp(-\alpha\rho^2)\, d\rho = \sqrt{\dfrac{\pi}{\alpha}}$.    (24)

Therefore, the length of the sampling interval should be equal to:

$\Delta r = \dfrac{1}{2}\sqrt{\dfrac{\pi}{\alpha}}$.    (25)
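For the case-study parameters used below, this evaluates as follows (a numerical check added here, not part of the original text):

```latex
\[
  \Delta r = \frac{1}{2}\sqrt{\frac{\pi}{0.12\ \text{km}^{-2}}} \approx 2.56\ \text{km}
  \quad\Rightarrow\quad
  \frac{S}{\Delta r} = \frac{92\ \text{km}}{2.6\ \text{km}} \approx 35 \ \text{intervals (36 grid points)} .
\]
```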
As the first step of the presented approach, exploratory (experimental) data such as lower bounds, upper bounds, mean values, standard deviations and correlation coefficients are collected from a small number of places. From the assumed theoretical form of the covariance function we select the best function in the mean-square sense. As an example, let us consider once again the random field of soil contamination by a heavy metal (chromium) in the northern part of Poland (Gdańsk region) with the properties specified above. Treating a generated value at a chosen point of the field as known, let us numerically generate values of contamination at other locations along one axis of the space interval [0, S]. For α = 0.12 km⁻², using Equation (25), the length of the optimal sampling interval is calculated as Δr ≈ 2.6 km, which means that the distance of S = 92 km should be divided into 35 intervals. For the generation purposes, the method of conditional random field simulation with the acceptance-rejection algorithm is used (see [5,8,9] for details).
Fig. 1. Example of generated contamination values (chromium contamination in ppm versus distance in km), with the mean level and the three u levels of 21.66 ppm, 23.14 ppm and 24.62 ppm marked
In the analysis, the simplified cumulative simulation procedure (see [7]) is chosen. In this procedure, the field value at every next location is generated independently, based on all field values generated so far at previous locations. An example of the numerically generated contamination values at all 36 field points is presented in Fig. 1. In the graph, the lines indicating the mean level (18.7 ppm) and three different u levels above the mean, u = m + 2σx = 21.66 ppm, u = m + 3σx = 23.14 ppm and u = m + 4σx = 24.62 ppm, are also plotted. The probability of upcrossing of the different u levels, calculated from 100 numerical realisations, is shown in Table 2.

Table 2. Probability of upcrossing for different levels of u based on numerical simulations

u level   | Probability
m + 2σx   | 0.9444
m + 3σx   | 0.0015
m + 4σx   | 0.0001
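For readers who want to reproduce numbers of this kind, the sketch below runs a generic Monte Carlo experiment with the same grid spacing, covariance (8) and threshold; it uses a plain Cholesky factorisation of the covariance matrix rather than the paper's conditional acceptance-rejection scheme, so it is an independent illustration, not the authors' algorithm, and all names are ours.

```java
import java.util.Random;

/** Rough Monte Carlo cross-check: simulate a zero-mean Gaussian profile with
 *  covariance K(rho) = sigma^2 exp(-alpha rho^2) at 36 points spaced 2.6 km
 *  apart and count realisations that exceed the level u = m + 2*sigma. */
public class UpcrossingMonteCarlo {
    public static void main(String[] args) {
        int n = 36, realisations = 100_000;
        double sigma = 1.48, alpha = 0.12, dr = 2.6;   // ppm, km^-2, km
        double uMinusMean = 2 * sigma;                 // level above the mean

        // Covariance matrix with a tiny diagonal jitter for numerical stability
        double[][] cov = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                double rho = (i - j) * dr;
                cov[i][j] = sigma * sigma * Math.exp(-alpha * rho * rho)
                          + (i == j ? 1e-10 : 0.0);
            }
        double[][] chol = cholesky(cov);

        Random rng = new Random(42);
        int exceed = 0;
        double[] z = new double[n];
        for (int k = 0; k < realisations; k++) {
            for (int i = 0; i < n; i++) z[i] = rng.nextGaussian();
            boolean over = false;
            for (int i = 0; i < n && !over; i++) {
                double s = 0;
                for (int j = 0; j <= i; j++) s += chol[i][j] * z[j];  // correlated sample
                if (s >= uMinusMean) over = true;
            }
            if (over) exceed++;
        }
        System.out.printf("P(exceed m + 2 sigma) ~ %.4f%n", (double) exceed / realisations);
    }

    /** Plain Cholesky factorisation: cov = L L^T (lower-triangular L). */
    static double[][] cholesky(double[][] a) {
        int n = a.length;
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j <= i; j++) {
                double s = a[i][j];
                for (int k = 0; k < j; k++) s -= l[i][k] * l[j][k];
                l[i][j] = (i == j) ? Math.sqrt(s) : s / l[j][j];
            }
        return l;
    }
}
```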
5 Conclusions
The method of numerical simulation of the threshold-crossing problem for random fields of environmental contamination has been considered in this paper. The method allows us to assess the probability that a random field of contamination does not exceed a fixed level in a certain two-dimensional (2-D) spatial domain. In the numerical simulation, a suitable discrete model defined on a regular or irregular grid has been developed and tested by the conditional simulation approach. The results of the study indicate that theoretical modelling of the level crossings in 2-D random fields with a continuous parameter (see Table 1) shows good agreement with the numerical simulations of the fields with a discretised parameter (see Table 2). This fact allows us to consider the method described in this paper as a useful practical tool in the theory of geographical and environmental random fields. In methods of random field simulation of various geographical or environmental phenomena, the theoretical model and stochastic simulations are often considered as two distinct problems. In the presented modelling approach, the theoretical model and numerical simulations are treated jointly, which leads to a deeper understanding of the random phenomena in terms of covariance functions, optimal sampling and probabilities of threshold crossings.
References
1. Vanmarcke, E.H.: Random Fields: Analysis and Synthesis. MIT Press, Cambridge (1983)
2. Shinozuka, M.: Stochastic fields and their digital simulation. In: Stochastic Methods of Structural Dynamics, pp. 93–133. Martinus Nijhoff Publ., Boston (1987)
3. Christakos, G.: Random Field Models in Earth Sciences. Academic Press Inc., San Diego (1992)
4. Namieśnik, J., Chrzanowski, W., Żmijewska, P. (eds.): New Horizons and Challenges in Environmental Analysis and Monitoring. Centre of Excellence in Environmental Analysis and Monitoring (CEEAM), Gdańsk University of Technology, Gdańsk, Poland (2003)
5. Jankowski, R., Walukiewicz, H.: Modeling of two-dimensional random fields. Probabilistic Engineering Mechanics 12, 115–121 (1997)
6. Walukiewicz, H., Bielewicz, E., Górski, J.: Simulation of nonhomogeneous random fields for structural applications. Computers & Structures 64, 491–498 (1997)
7. Jankowski, R., Wilde, K.: A simple method of conditional random field simulation of ground motions for long structures. Engineering Structures 22, 552–561 (2000)
8. Jankowski, R., Walukiewicz, H.: Modelling of conditional spatiotemporal contamination fields. In: Proceedings of 5th International Symposium & Exhibition on Environmental Contamination in Central & Eastern Europe, ID 382, Prague, Czech Republic, September 12-14 (2000)
9. Jankowski, R., Walukiewicz, H.: Conditional simulation of spatiotemporal random fields of environmental contamination. TASK Quarterly 10, 21–26 (2006)
10. Slepian, D.: The one-sided barrier problem for Gaussian noise. Bell System Tech. Journal 41, 463–501 (1962)
A Context-Driven Approach to Route Planning Hissam Tawfik, Atulya Nagar, and Obinna Anya Intelligent and Distributed Systems Lab, Deanery of Business and Computer Sciences, Liverpool Hope University, Liverpool, United Kingdom L16 9JD {tawfikh, nagara, 05008721}@hope.ac.uk
Abstract. Prototyping urban road network routes improves the accessibility of road layouts and enhances people's use of road networks by providing a framework for the analysis and evaluation of routes based on multiple criteria, such as spatial quality, transportation cost and aesthetics. However, features identified in this way often do not incorporate information about the current road condition in order to proactively provide real-time contextual information to users. Context-aware computing has the potential to provide useful information about the current road condition by leveraging contextual information about people, places and things. A prototype, which uses context information and a range of design and user-related criteria to analyse the accessibility of road layouts and plan routes, is developed. A case study is used to validate the prototype. Keywords: road networks, route planning, context-awareness, multi-criteria, simulation.
1 Introduction
Space layout and route planning presents a critical challenge as a result of increasing levels of urbanisation and road traffic. Urban planners have always aimed at optimizing road network design to meet transportation cost, safety, land use, aesthetic and environmental considerations. With the rapid growth in traffic patterns and space utilisation, there is a growing need for a tool to design and evaluate urban road networks based on a number of context-driven criteria. This can be accomplished by efficient computer simulation techniques that aid the analysis of routes based on relevant contextual information about a road network as well as various design and user-related objectives. Multi-criteria analysis allows the use of various factors in a route planning scenario. A relevant definition of multi-criteria analysis [1] views it as a decision aid and a mathematical tool that allows the comparison of different alternatives or scenarios according to many, often contradictory, criteria in order to guide towards a 'good' decision. The decision maker has to choose from several options or alternatives and will generally have to be content with a compromise solution. Context-awareness leverages contextual information about people, places and things to provide useful services necessary to enable real-time determination of the optimum route based on multiple objectives. The difficulty in incorporating context-awareness information in route planning arises from the fact that data models for supporting context information are not well suited to managing geographic information such as proximity
relationships; similarly, data models supported by current GIS (Geographic Information Systems) are ill-suited to representing context information [3]. This research work aims to investigate the application of computer modelling and graphical simulation techniques for supporting the evaluation of routes in road networks from a range of design and user-related perspectives, and to present a context-driven approach to route planning. In an earlier part of this work [4], which focused on road network analysis of Liverpool City, it was found that a multi-criteria based analysis of urban road networks and spatial layouts enables the local and global accessibility of the road network to be determined for the set criteria, and the optimum path between two points in a network to be found.
2 Related Work
Route planning has been performed using different technologies and algorithms, one of which is Space Syntax [2], a set of analytical and computational tools for the analysis of urban systems. Space syntax can be considered a form of study of geometrical accessibility in which a functional relationship is made, or attempted, between the structure of a city and its social, economic and environmental dimensions. A number of space syntax empirical studies have been devoted to exploring the complexity and configuration of urban space by searching for a computational representation of a form of geometric order [5]. Space syntax has been successfully applied to the relationship between the structure of a city and criminality, way-finding and pollution [6], [7]. According to space syntax concepts, the accessibility of spatial layouts is based on the idea that some places or streets are more accessible than others, though the terms used for accessibility, e.g. proximity, integration, connectivity, cost or centrality, differ [8]. The concept of network analysis uses Dijkstra's algorithm [9] to visualize the network pattern of the street structure. Space syntax's Axial Map has been applied to evaluate syntactic intelligibility and the degree of natural movement in both towns. Nophaket and Fujii [10] proposed graph geometry for detecting minimum path structures. However, most researchers argue that space syntax cannot be generalized across different cultures, archetypes and scales of street pattern. Cheverst et al. [11] presented the GUIDE system, a context-aware tourist guide that provides information about a city to a tourist based on the visitor's profile and contextual information, and is relevant to route planning. Lee and Lee [12] implemented inexact strategy and planning in route planning in Taipei City. Chakhar and Martel [13] presented a strategy for integrating GIS and multi-criteria analysis, and also proposed a design for a spatial decision support system. However, very little research has been carried out to formulate prototypes that aid route planning based on the contextual information of a road network.
3 Road Network Analysis and Route Planning We designed a graphical simulation prototype to support multi-objective route evaluation of traffic networks. Our prototype uses a graphical user interface system to enable a user to load, initialise and modify an existing road traffic layout. It then determines
the best route between two points on the network, designated as the start and destination points, based on a combination of distance, safety, road context, comfort and view criteria. Ultimately it allows the level of accessibility of the local and global road network to be evaluated. The prototype consists of three layers: the Visualisation Layer, the Road Layout Generation Layer and the Road Layout Evaluation Layer, which are described below.
− Visualisation Layer: This layer is the interface between the user and the system, where the user inputs the requirements, which consist of loading the map, setting the weights for the criteria, and setting the starting and destination points for a given path.
− Road Layout Generation Layer: This layer converts the road map into a connected graph by using junctions as nodes and roads as links, and stores information about distance, safety, comfort and view for all the roads in the network.
− Road Layout Evaluation Layer: This layer is used to obtain the optimum paths in the road network. It uses Dijkstra's algorithm [9] to obtain the optimum accessibility of the road network, using the data obtained from the road layout generation layer and the cost function to evaluate accessibility based on multi-objective criteria. The outcome of the multi-criteria analysis is represented visually in this layer, and is used to analyse the local and global accessibility of the road network. The local accessibility of a given road or path can be determined by calculating the mean value of the cost of all roads in the network from the given road, while the global accessibility of the whole road network results from averaging the local accessibility values of all roads in the network.
The application is developed using the Java programming language and uses applets to construct the user interface (the visualization layer). A typical prototyping scenario starts by loading the road map of the location whose road network is to be evaluated into the prototype. The road map is converted into a connected graph using the nodes and links on the map. The junctions are represented as nodes and roads as links. A data file is used to store information for every road in the road map, such as the distance between the connecting node pair, the safety of the road, comfort, and view. This information is heuristically determined based on information about the number of turns, number of signals, number of accidents or thefts, etc. The attributes of the road, such as safety, comfort and view, are provided on a scale ranging from 1 to 10, where 1 represents the best case and 10 the worst. Fig. 1a and fig. 1b show a graphical representation of a road map and the node-arc matrix format in which the data for the map is stored. In this matrix, the columns represent the heuristically generated values for the start node, the destination node, the journey's distance, the level of safety, the level of comfort, and the view (aesthetics) level, respectively. The cost function is designed with the objective of determining the cost of a road according to the multi-objective criteria. It is applied to all possible roads to reach the destination from a starting point. The selection of the optimum path is based on the cost of the path, denoted as Cf.
$C_f = \sum_{i=1}^{n} C_i \cdot W_i$    (1)

where n is the number of criteria (safety, comfort, view), Ci is the cost of the road for criterion i, and Wi is the weight of criterion i provided by the user.
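A minimal sketch of how such a weighted edge cost might be computed in the prototype's language (Java); the class and variable names are illustrative assumptions, not code from the actual system.

```java
/** Illustrative weighted multi-criteria edge cost: Cf = sum_i Ci * Wi. */
public class EdgeCost {
    /** criteria[i] is the cost of this road for criterion i (e.g. distance,
     *  safety, comfort, view); weights[i] is the user-supplied weight. */
    public static double combined(double[] criteria, double[] weights) {
        double cf = 0.0;
        for (int i = 0; i < criteria.length; i++) {
            cf += criteria[i] * weights[i];
        }
        return cf;
    }

    public static void main(String[] args) {
        // distance, safety, comfort, view on the 1 (best) to 10 (worst) scale
        double[] road = {3.0, 2.0, 5.0, 7.0};
        double[] equalWeights = {0.25, 0.25, 0.25, 0.25};
        System.out.println(combined(road, equalWeights));   // prints 4.25
    }
}
```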
Fig. 1a. Link-node representation of a road map
Fig. 1b. Node-arc Matrix
Dijkstra's algorithm [9] is used to choose the optimum path, i.e. the path with minimum cost. Dijkstra's algorithm solves the single-source shortest path problem on a weighted, directed graph in which all edge weights are non-negative. The input to the algorithm is a weighted directed graph G, the road network converted into a connected graph, which takes the roads as edges and the junctions as nodes. Let s be the source node in G from which the user wants to travel, and t the destination point. Let V denote the set of all nodes in graph G. Each edge of the graph is an ordered pair of vertices (u, v) representing a connection from node u to node v. The set of all edges is denoted by E. Weights of edges are given by a weight function w: E → [0, ∞); therefore w(u, v) is the non-negative cost of moving from node u to node v. The cost of an edge is determined using the cost function. The cost of a path between two nodes is the sum of the costs of the edges in that path. For a given pair of nodes s and t in V, the algorithm finds the path from s to t with lowest cost. It can also be used to find the costs of minimum-cost paths from a single node s to all other nodes in the graph. Dijkstra's algorithm on the directed graph G = (V, E) maintains a set S of vertices whose final shortest-path weights from the source s have already been determined. The algorithm repeatedly selects the node u ∈ V − S with the minimum cost-path estimate, adds u to S and relaxes all edges leaving u. Hence an optimum path is determined by calculating the cost of every road using the cost function and finding the path with lowest cost using Dijkstra's algorithm.
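For concreteness, the following is a textbook priority-queue implementation of Dijkstra's algorithm over edges weighted by the multi-criteria cost Cf; it is a generic sketch under an assumed adjacency-map representation, not the prototype's actual code.

```java
import java.util.*;

/** Textbook Dijkstra over non-negative edge costs; adj.get(u) maps each
 *  neighbour v to the multi-criteria cost Cf of edge (u, v). */
public class ShortestRoute {
    public static double[] dijkstra(List<Map<Integer, Double>> adj, int source) {
        int n = adj.size();
        double[] dist = new double[n];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        dist[source] = 0.0;

        // Priority queue of {node, tentative distance}, cheapest first
        PriorityQueue<double[]> pq = new PriorityQueue<>(Comparator.comparingDouble(e -> e[1]));
        pq.add(new double[]{source, 0.0});

        while (!pq.isEmpty()) {
            double[] top = pq.poll();
            int u = (int) top[0];
            if (top[1] > dist[u]) continue;               // stale queue entry
            for (Map.Entry<Integer, Double> e : adj.get(u).entrySet()) {
                int v = e.getKey();
                double alt = dist[u] + e.getValue();      // relax edge (u, v)
                if (alt < dist[v]) {
                    dist[v] = alt;
                    pq.add(new double[]{v, alt});
                }
            }
        }
        return dist;   // dist[t] is the minimum total cost from source to t
    }

    public static void main(String[] args) {
        // Tiny 3-node example: 0 -> 1 (cost 4.25), 0 -> 2 (6.0), 1 -> 2 (1.0)
        List<Map<Integer, Double>> adj = new ArrayList<>();
        for (int i = 0; i < 3; i++) adj.add(new HashMap<>());
        adj.get(0).put(1, 4.25);
        adj.get(0).put(2, 6.0);
        adj.get(1).put(2, 1.0);
        System.out.println(dijkstra(adj, 0)[2]);   // prints 5.25
    }
}
```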
4 Role of Context-Awareness in Route Planning Context is defined as any measurable and relevant information that can be used to characterise the situation of an entity, such as a road network route [14]. Techniques in context-awareness computing employ temporal context information to proactively inform a user of relevant information about current road condition so as to guide them in making decisions about the best route to take for a particular trip. Context is highly dynamic in space and time [15], and as such can be used to provide real-time information about the condition of a road network route and increase the probability of a road user selecting the optimum route to a destination. The use of context-awareness in our simulation prototype enables us to build a route planning prototype that improves the local and global accessibility of the road network, and allows users to consider current road condition in making decisions about the route to take by: (1) assessing the situation, (2) identifying the available options, (3) determining the costs and benefits of each option, and (4) selecting the
option with the lowest and highest benefits [16]. More specifically, the use of context services enables a road user to extract and model: − − − −
Information reflecting current road context, which can be used to react to current events, such as “it is raining”, “road repair work going on 5km ahead, there is likelihood of delay”, or “high traffic density expected.” Critical safety information, such as “an accident just occurred 2km ahead beside Albert Dock.” Information relating to the wider environmental context, such as “route C is the most aesthetic route to your destination, and has the following attractions: St Johns park and front view Liverpool Cathedral.” Information about a business interest or a particular user interest, such as “Hotel A situated 3km ahead has available rooms.
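The paper does not specify how such context events feed back into the criteria weights, so the fragment below is a purely hypothetical illustration of one way to do it; the event names and adjustment values are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical mapping from detected context events to adjusted criteria
 *  weights (order assumed: distance, safety, comfort, view). */
public class ContextWeights {
    public static double[] adjust(double[] baseWeights, Iterable<String> events) {
        double[] w = baseWeights.clone();
        for (String event : events) {
            switch (event) {
                case "HIGH_TRAFFIC_DENSITY": w[0] += 0.2; break;  // favour short routes
                case "ACCIDENT_REPORTED":    w[1] += 0.3; break;  // favour safer routes
                case "RAIN":                 w[2] += 0.1; break;  // favour comfort
                default: break;   // unknown events leave the weights unchanged
            }
        }
        return w;
    }

    public static void main(String[] args) {
        double[] equal = {0.25, 0.25, 0.25, 0.25};
        double[] adjusted = adjust(equal, List.of("RAIN", "HIGH_TRAFFIC_DENSITY"));
        System.out.println(Arrays.toString(adjusted));  // distance and comfort weights increased
    }
}
```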
5 Route Planning and Prototyping Scenario
In [4], a road network within a construction site was used to demonstrate our approach. In this work, we incorporate context as a criterion and illustrate this using the Liverpool city centre road network. The road network layouts were analysed in terms of their accessibility based on different criteria as well as the prevailing road context. To achieve this, weights are assigned to each of the criteria, such as distance, safety, comfort and aesthetics. Based on these criteria the following are obtained: (1) the local accessibility of the road network for the set criteria, (2) the global accessibility of the road network and (3) the optimum path between two points based on multiple criteria.
Fig. 2a. Liverpool city centre map
Fig. 2b. Link-node Graph for Liverpool city centre road network
Various scenarios are described to demonstrate the use of the prototype. The road network map of Liverpool city centre is converted into a connected graph using the nodes and links on the map. The junctions are represented as nodes and roads as links. Fig. 2a shows the map of Liverpool city centre, and fig. 2b shows the corresponding connectivity graph. The values for the criteria are predefined for different roads. See fig. 1. The map highlights four locations – Hotel, the Albert Dock, St Johns and the Royal Hospital – chosen to illustrate context-driven route planning. There is more than one path from a source location to chosen destination location. The optimum path obtained by the network analysis is based on the criteria and weights specified. Using equal weights for the various criteria – distance, safety, traffic density, view
Fig. 3a. Path from Hotel to Albert Dock with equal weights
Fig. 3b. Local accessibility for multi-criteria analysis with equal weights
Fig. 3c. Global accessibility for multi-criteria analysis with equal weights
Fig. 4a. Path from Hotel to Albert Dock with high aesthetic appeal
Fig. 4b. Local accessibility of Albert Dock with high aesthetic appeal
Fig. 4c. Global accessibility of Albert Dock with high aesthetic appeal
Using equal weights for the various criteria – distance, safety, traffic density, view and comfort – a context-driven analysis for a path from Hotel to Albert Dock is obtained. Fig. 3a to fig. 3c show the optimum path, local accessibility and global accessibility using equal weights. In the second scenario the same set of criteria – distance, safety, view and comfort – are assigned different weights to account for preferences. To analyse the road network based on high aesthetic appeal from Hotel to Albert Dock, the weight for the view criterion is set to high. The optimum path after analysis is shown in fig. 4a to fig. 4c. The local accessibility and global accessibility based on high aesthetic appeal are indicated in fig. 4b and fig. 4c. In the third scenario, contextual information represented by the current road condition is used as a criterion. The contextual data
employed is the varying density of traffic on the route from Hotel to Albert Dock, which is assigned different weights based on the time of the journey. To analyse the road network based on the current road condition from Hotel to Albert Dock, the weight for high traffic density is set to high. The case study demonstrates the concept of accommodating more than one aspect of road context, considered as criteria, in route planning. The study considered the quality and aesthetics of the journey, the current road condition and the levels of safety and comfort in addition to the conventional travel time, distance and cost. The results demonstrate that, for the path chosen from source to destination, local accessibility and global accessibility vary based on the weights assigned to the criteria. The criteria or contextual information emphasised depends on the type of user. For example, new visitors to a city would typically like to take the safest route to their destination. Tourists are generally more interested in aesthetics, and would go through routes that present the most beautiful views of a city. Habitual road users are interested in saving time and cost as well as comfort, and would usually take the shortest routes and those with fewer turns and lower traffic density. Emergency service providers, such as ambulance drivers and the police, are more likely to prioritise short routes over other criteria.
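These user-type preferences can be captured as weight profiles, as in the following purely illustrative sketch; the numeric weights are assumptions, not values from the study.

    # Illustrative weight profiles per user type (values are assumptions).
    PROFILES = {
        "new_visitor": {"distance": 0.2, "safety": 0.5, "comfort": 0.1, "aesthetics": 0.2},
        "tourist":     {"distance": 0.1, "safety": 0.2, "comfort": 0.2, "aesthetics": 0.5},
        "commuter":    {"distance": 0.5, "safety": 0.1, "comfort": 0.3, "aesthetics": 0.1},
        "emergency":   {"distance": 0.7, "safety": 0.2, "comfort": 0.1, "aesthetics": 0.0},
    }

    def weights_for(user_type):
        # Normalise so that the weights of the selected profile sum to one.
        profile = PROFILES[user_type]
        total = sum(profile.values())
        return {criterion: w / total for criterion, w in profile.items()}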
6 Conclusion This work investigates the application of computer simulation and modeling techniques to the analysis, evaluation and optimization of road networks based on context information about the road network. Our prototype uses multi-criteria evaluation to determine the optimum path based on the current road condition, such as traffic density, and a range of criteria, such as distance, safety, comfort and view. A context-driven prototype could potentially support route planners in the early evaluation of road network layouts in terms of their time, cost, road context and safety implications for road users and pedestrians, and assist with the provision of enhanced road network layouts. Our future work will focus on research to accommodate complex road scenarios and traffic restrictions. Acknowledgement. The authors would like to thank S. Dasari for the contribution to the programming phase of this work.
References 1. Roy, B.: Multicriteria Methodology for Decision Analysis. Kluwer, Dordrecht (1996) 2. Hillier, B.: Space is the machine. Cambridge University Press, Cambridge (1996) 3. Coyle, M., Shekhar, S., Liu, D., Sarkar, S.: Experiences with Object Data Models in Geographic Information Systems. Internal Technical Report, Department of Computer Science, University of Minnesota, US (1997) 4. Soltani, A.R., Tawfik, H., Fernando, T.: A multi-criteria based path-finding application for construction site layouts. In: Proc. of 6th Int’l Conf on Information Visualisation, London, UK (July 2002) 5. Jiang, B., Claramunt, C.: Topological Analysis of Urban Street Networks. Environment, Planning & Design 31(1), 151–162 (2003)
6. Peponis, J., Zimring, C., Choi, Y.: Finding the building in way finding. Environment and Behavior 22, 555–590 (1990) 7. Ortega, A., Jiménez, E., Jiménez, C., Mercado, S., Estrada, C.: Sintaxis espacial: una herramienta para la evaluación de escenarios. XV meeting of the Mexican Society of Behavioral Analysis, México (2001) 8. Porta, S., Crucitti, P., Latora, V.: The Network Analysis of Urban Streets: A Primal Approach. Environment and Planning B: Planning and Design 33(5), 705–725 (2005) 9. Dijkstra, E.W.: A note on two problems in connection with graphs. Numerische Mathematik 1, 269–271 (1959) 10. Nophaket, N., Fujii, A.: Syntactic and Network Pattern Structures of City – Comparison of Grid and Meandering Street Patterns in Kyojima and Honjo. Journal of Asian Architecture and Building Engineering, JAABE 3(2), 349–356 (2004) 11. Cheverst, K., Davies, N., Mitchell, K., Friday, A.: Experiences of Developing and Deploying a Context-Aware Tourist Guide: The GUIDE Project. In: Proc of MOBICOM. ACM Press, Boston (2000) 12. Lee, H., Lee, C.: Inexact strategy and planning-the implementation of route planning in Taipei city. In: Proc. of the 1996 Asian Soft Computing in Intelligent Systems and Information Processing, Kenting, Taiwan, vol. 1996, pp. 308–313 (1996) 13. Chakhar, S., Martel, J.: Enhancing Geographical Information Systems Capabilities with Multi-Criteria Evaluation Functions. Journal of Geographic Information and Decision Analysis 7(2) (2003) 14. Dey, A., Abowd, G.: Towards a better understanding of context and context-awareness. In: Conference on Human Factors in Computing Systems (CHI 2000): Workshop on the What,Who,Where,When and How of Context-Awareness, The Hague (April 2000) 15. Rakotonirainy, A.: Design of context-aware systems for vehicles using complex system paradigms. In: de Lavalette, B.C., Tijus, C. (eds.) Proc. of the CONTEXT 2005 Workshop on Safety and Context, Paris, France (July 2005) 16. Strauch, B.: Investigating human error: Incidents, accidents and complex systems, Ashgate Publishing Limited (2002)
InterCondor: A Prototype High Throughput Computing Middleware for Geocomputation Yong Xue1,2,*, Yanguang Wang1, Ying Luo1,4, Jianping Guo1, Jianqin Wang3, Yincui Hu1, and Chaolin Wu1 1
State Key Laboratory of Remote Sensing Science, Jointly Sponsored by the Institute of Remote Sensing Applications of Chinese Academy of Sciences and Beijing Normal University, Institute of Remote Sensing Applications, CAS, P.O.Box 9718, Beijing 100101, China 2 Department of Computing, London Metropolitan University, 166-220 Holloway Road, London N7 8DB, UK 3 College of Information and EEng., China Agricultural University, Beijing 100083, China 4 Graduate University of Chinese Academy of Science, Yuquan Road, Beijing 100049, China [email protected] Abstract. This paper presents the design, analysis and implementation of the InterCondor system. The InterCondor system is an implementation of the concept of InterGrid. It uses Condor as a basic local Grid computing engine. It utilizes a series of Grid services, including a registration service, data transfer service, task scheduling service, security authentication service and status monitoring service, to manage resources such as remote sensing algorithms, remote sensing data, and computing resources under the management of the Condor engine. We aim at integrating Grid Service data management, task scheduling, and the computing power of Condor into remote sensing data processing and analysis to reduce the processing time of large-volume, long-running remote sensing tasks by algorithm issuance, data division, and the utilization of unused computing resources on the Internet.
1 Introduction The InterCondor system is an implementation of the concept of InterGrid. The term InterGrid [5][7] comes from an analogy between the Internet and the Grid. By analogy with the Internet, which integrates different local networks into a global network with a common protocol, InterGrid utilizes a set of Grid services to manage resources such as data, algorithms, and local Grid pools. There are several projects on how to implement an InterGrid; for example, the IVDGL working group studied the interoperability of USATLAS, EDG and NorduGrid [6]. InterCondor is a Grid system that uses Condor pools as local Grids. According to the three-point checklist of what constitutes a Grid [4], the size of a Grid does not matter. The key characteristics of a Grid are the coordination of resources, standard protocols and interfaces, and nontrivial qualities of service. For example, the well-known Cactus project, which won a 2001 Gordon Bell Prize, integrated four supercomputers to *
Corresponding author.
solve Grand Challenge problems in physics which require substantially more resources than can be provided by a single machine (http://www.cactuscode.org). There could be office Grids, enterprise Grids, certain functional Grids, etc.; we can call them local Grid pools. Correspondingly, there is a global Grid, whose components may be local Grid pools. InterGrid is one kind of global Grid, which has at least two local Grids in it. Although Globus seems to be a de facto standard for Grids, there is no real standard protocol for the Grid up to now. There are many different Grids implemented with different technologies, such as Condor, LSF (http://www.platform.com/products/LSF/), or the SRB system (http://www.sdsc.edu/srb/). Condor, LSF, and SRB are independent Grid systems, and it could be a problem to interconnect them into a global Grid. For example, although Condor has the glide-in mechanism to communicate among different Condor pools, it cannot go through firewalls and its protocol is private (http://www.cs.wisc.edu/condor/glidein/). The Condor Project has performed research in distributed high-throughput computing for the past 18 years, and maintains the Condor High Throughput Computing resource and job management software originally designed to harness idle CPU cycles on heterogeneous pools of computers [1]. In essence a workload management system for compute-intensive jobs, it provides means for users to submit jobs to a local scheduler and manage the remote execution of these jobs on suitably selected resources in a pool. Boasting features such as check-pointing (the state of a remote job is regularly saved on the client machine), file transfer and I/O redirection (i.e., remote system calls performed by the job can be captured and performed on the client machine, hence ensuring that there is no need for a shared file system), and fair-share priority management (users are guaranteed a fair share of the resources according to pre-assigned priorities), Condor proves to be a very complete and sophisticated package. While providing functionality similar to that of any traditional batch queuing system, Condor's architecture allows it to succeed in areas where traditional scheduling systems fail. As a result, Condor can be used to combine seamlessly all the computational power in a community. In this paper, we present a Grid system (the InterCondor system) that interconnects Condor pools as local Grids so that Condor can go through firewalls and integrate with other Grid systems. It was developed at the Institute of Remote Sensing Applications, Chinese Academy of Sciences, China. We aim at integrating Grid Service data management, task scheduling, and the computing power of Condor into geocomputation to reduce the processing time of large-volume, long-running remote sensing tasks by algorithm issuance, data division, and the utilization of unused computing resources on the Internet. The design of the InterCondor system is introduced in detail in Section 2. Its implementation and analysis are demonstrated in Section 3. Finally, conclusions and further development are addressed in Section 4.
2 Design of InterCondor The Grid and the Internet are both built from the bottom up [8]. In fact, the Grid is a series of standards and protocols in the application layer of the Internet, but its inner hierarchy is highly
similar to that of the Internet. In designing the InterCondor system, we use the same divide-and-conquer idea. Figure 1 shows the layered architecture of InterCondor.
Fig. 1. InterCondor layer architecture
InterCondor utilizes a series of Grid services, including a registration service, data transfer service, task scheduling service, security authentication service and status monitoring service, to manage resources such as remote sensing algorithms, remote sensing data, and computing resources under the management of the Condor engine. Users choose services and submit tasks to InterCondor through the user interface. When InterCondor receives a task from a user, it triggers the IAST (Intelligent Analyzer & Synthesizer of Tasks). The IAST then divides the task into several CPE (Condor Pool Entry) tasks according to the status of the Condor pools currently in InterCondor, and submits the CPE tasks to the Grid Service layer. The Grid Service layer is the key of InterCondor. It is responsible for dividing the CPE tasks into Condor tasks, submitting the Condor tasks to machines that satisfy the computing conditions required by the applications, merging the computing results of Condor into CPE results, and returning the CPE results to the IAST. As a computing engine, Condor is not responsible for any higher-level management. Finally, the IAST merges the CPE results returned by the Grid Service layer into a final result corresponding to the division strategy, and returns the final result to the user.
2.1 User Interface Layer The user interface includes a graphical user interface, a console interface, and an API for advanced users such as developers. There is no need for users to know the whole system; the user interface makes the users' operations transparent. The InterCondor software is ostensibly no different from traditional software. To achieve this, we need an Intelligent Analyzer & Synthesizer of Tasks (IAST).
2.2 Client Layer A client is mainly an IAST. An IAST is responsible for shielding the difference between the networked nature of InterCondor and software running on a single machine. Figure 2 shows the IAST work flow chart.
2.3 Grid Service Layer The Grid Service layer is responsible for system management, task scheduling, and data transfer. It is the key layer of InterCondor. In the architecture of InterCondor, the Grid services are responsible for management and Condor is responsible for computing. The Grid Service layer includes many kinds of services: a registration service, a status service, a transferring service, an invoking service that receives calls from users, a security authentication service and other special services.
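The two-level division and merging performed by the IAST and the Grid Service layer can be pictured with the following sketch. InterCondor itself is implemented in Java (see Section 3); this Python fragment only illustrates the flow, and all names in it are assumed.

    # A minimal sketch of the two-level division: a task is first split into CPE tasks
    # (one per Condor pool), and each CPE task is later split into Condor jobs.
    def divide(work_items, n_parts):
        # Split a list of work items into roughly equal chunks (at most n_parts of them).
        size = max(1, -(-len(work_items) // n_parts))  # ceiling division
        return [work_items[i:i + size] for i in range(0, len(work_items), size)]

    def run_intercondor_task(work_items, condor_pools, machines_per_pool, run_condor_job):
        # First-level division: one CPE task per available Condor pool.
        cpe_tasks = divide(work_items, len(condor_pools))
        cpe_results = []
        for pool, cpe_task in zip(condor_pools, cpe_tasks):
            # Second-level division: one Condor job per machine in the pool.
            condor_jobs = divide(cpe_task, machines_per_pool[pool])
            # Each pool merges its per-job results into one CPE result.
            cpe_results.append([run_condor_job(pool, job) for job in condor_jobs])
        # Final merge corresponds to the IAST combining the CPE results.
        return cpe_results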
Fig. 2. IAST work flow chart
2.4 Condor Layer The Condor layer is composed of the many machines managed by CPEs in the InterCondor system. As a computing engine, the Condor layer does not need to care about the external architecture or status. The only things the Condor layer needs to do are: execute the tasks submitted by CPEs, respond to calls inquiring about the status of a Condor pool, and report errors that occur during the execution of the tasks to the CPEs. These functions can be accomplished by configuring Condor.
2.5 Resource Layer The resource layer is a necessary, bottom-level entity of the InterCondor system. It comprises the resources of data, algorithm modules and computing power. It is not the focus of this paper.
3 Implementation of InterCondor Version 1.0 As shown in Figure 3, we use a peer-to-peer structure to implement the design described in Section 2. The management of the InterCondor system is organised in different levels: the InterCondor server does not manage each machine in the system directly. It manages the Condor pools instead, and the CPEs manage the machines in the Condor pools. The CPE is the most important element in the InterCondor system and has the most functional modules. The InterCondor server hosts only the registration service. It does not actually manage the whole system; instead, it only collects information about the whole system and provides it to the CPEs, so that the CPEs know which CPEs to cooperate with. It is up to the CPEs to receive tasks submitted by users, execute the first-level division, push CPE tasks to other CPEs, execute the second-level division, submit the Condor tasks to Condor pools, merge the results, and return the final results to users. Each CPE is both a client and a server. Each CPE can start an InterCondor task and can also cooperate on a task started by other CPEs. InterCondor version 1.0 has been implemented in Java. The user interface of InterCondor should shield the Grid computing character of InterCondor in order to make users feel as if they are using the traditional remote sensing software they are familiar with. A Java GUI developed with the Java toolkits AWT and
SWING suffers from slow execution speed and a non-native look and feel, but using the Standard Widget Toolkit (SWT) developed by the Eclipse organization, one can develop an efficient GUI with a standard appearance. Furthermore, a GUI developed with SWT is platform-independent.
Fig. 3. Peer-to-peer structure of InterCondor version 1.0
We use Web service technologies such as SOAP, XML, and WSDL to implement the system. Globus is a new technology that is progressing fast. From the release of the Globus toolkit version 1.0 in 1999 to version 4.0 in 2005, its protocols and technology changed considerably, and their reliability and practicality still need to be verified; it is therefore risky to follow it. Web services are a mature technology that has been extensively tested in industry. Globus is aligning itself with Web services, so as long as our node supports Web services, it will be compatible with Globus. Web services ensure the compatibility and extensibility of our node. Developing with Globus is more complex and difficult than developing with Web services, and there are convenient tools available for Web services. Figure 4 shows the relationship of different kinds of tasks.
3.1 Condor Pool Entry (CPE) The CPE is the connection between a Condor pool and the InterCondor server. From a software point of view, it acts as middleware between the Grid services and the Condor commands. The development work of InterCondor is mostly focused on the CPE. It is up to the CPE
Fig. 4. The relationship of different kinds of tasks
to accept commands from clients, turn the clients' commands into DOS commands that Condor can understand, submit tasks to Condor, manage the Condor pool, and return the results. Most of the InterCondor toolkit is installed on the CPEs, including the user interface, the client, and the Grid services. The Grid services currently include the status service, transferring service, invoking service and security authentication service. There are three kinds of machines in a Condor pool: submit machines, a manager machine, and execute machines. The manager machine is the manager defined by the Condor software itself. It can schedule, monitor, manage, submit, and execute tasks within the scope of a Condor pool. The submit machine is responsible for submitting tasks and inquiring about the status of tasks; it can submit and execute tasks. The execute machine can only execute tasks. Only the manager machine or a submit machine can be a CPE.
3.2 Module A module is a collection of executable programs and related files owned by users. The executable programs are algorithm codes developed by users to solve certain problems such as NDVI computation, road extraction, etc. If users are willing to allow other InterCondor users to use their algorithms, they can issue their algorithms to InterCondor. There is no need for the providers to open their source code or the details of their algorithms; they can provide just the executable programs. Each module has an ID to identify itself and some description files that present its hardware and software requirements, its function, its parameters, its command format, etc. There is a module manager in each CPE. It is responsible for the query, add, delete, and update operations on modules. In InterCondor 1.0, the default module repository is $INTERCONDOR_HOME/depot/code. The information about all modules is saved in an XML file named Modules.xml (Figure 5), and each module has a corresponding configuration file named module.xml. Only the module manager can operate on modules, and only the module can operate on its module.xml. Analogously, there is a task manager on each CPE. It is responsible for task receipt, initialization, deletion, triggering, stopping, division, merging, and status query. Only the task manager has the right to maintain the configuration files (task.xml). Figure 6 presents the relationship between the module manager and the task manager.
3.3 PUSH Technology The transferring service is a necessary piece of infrastructure in InterCondor. It must satisfy the following requirements: it must be active, be able to go through firewalls, be safe and reliable, and be fast. The most common transfer protocols today are FTP and HTTP, but both are passive from the server's point of view: it is up to the client to decide what should be transferred from the server. In the InterCondor system, however, the server decides what should be transferred to the client. This must happen in an active way, which we call PUSH. The server (the CPE that starts an instance) decides which clients (other CPEs) to cooperate with according to the query result from the InterCondor server. The server CPE in this instance then divides the task into several CPE tasks and pushes each CPE task to a cooperative CPE. On receiving a CPE task, each CPE divides
it into several Condor tasks and submits them to Condor. When the CPE task is finished, each CPE pushes its result to the server CPE of this instance. Finally, the server CPE integrates these results and returns the final result to the user. FTP is prohibited by many firewalls. GridFTP from Globus adds security, parallel transfer, and checkpoint functions to general FTP, but GridFTP cannot go through many firewalls either. Since firewalls do not prohibit HTTP, and HTTP/1.1 offers the parallel transfer and checkpoint functions provided by GridFTP, we chose to build on HTTP/1.1 to implement our InterCondor system.
Fig. 5. Module and module manager
Fig. 6. Relations of module and task
The core idea of using HTTP download technology to implement our PUSH technology is to configure a Grid service on each CPE. When the server CPE wants to transfer something to cooperative CPEs, it calls the Grid service on those CPEs. The information about what the server CPE wants to transfer to a cooperative CPE is contained in the call command. On receiving the call command, the cooperative CPE analyses it so that it knows what should be downloaded from the server CPE; the rest is accomplished by general HTTP download technology. HTTP is not fast and not safe in some cases, so we use some other technologies to make up for this. To obtain faster transfer speeds, we use:
- Data compression technology
- HTTP multi-thread download technology
- Break-point resuming technology
To make the transfer safer, we use CRC32 checksum codes. If the checksum is wrong, the data are transferred again.
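The receiving side of the PUSH idea can be sketched as follows. The actual system is written in Java and its service interface is not specified in this paper, so the URL scheme, the compression choice and the function names below are assumptions used only to illustrate the download-plus-checksum step.

    import urllib.request, zlib, gzip

    # Sketch: after the server CPE's call names a file, the cooperative CPE downloads it
    # over HTTP and verifies a CRC32 checksum before using it.
    def handle_push(server_cpe_url, file_name, expected_crc32, compressed=True):
        with urllib.request.urlopen(f"{server_cpe_url}/{file_name}") as response:
            payload = response.read()
        data = gzip.decompress(payload) if compressed else payload
        if zlib.crc32(data) != expected_crc32:
            # Checksum mismatch: ask for the data to be transferred again.
            raise IOError(f"CRC32 mismatch for {file_name}; retransfer required")
        return data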
4 Conclusions and Future Work In this paper, we introduce our ongoing research on the InterCondor system. It has a peer-to-peer architecture, so that even if some CPE fails, the whole InterCondor system can still work. There is no exclusive entry point to InterCondor: each CPE can be an entry point, and each CPE can drive other CPEs to cooperate on one task. In implementing InterCondor version 1.0 we mainly use Web service and Java technology. We use Condor as a computing engine, and we developed a PUSH technology
to transfer data actively. The InterCondor system functions like a broker that pushes as many machines on the Internet as possible to cooperate on one task by dividing the task into sub-tasks at different levels. We try to build a large virtual supercomputer by integrating cheap personal computers on the Internet and utilizing them when they are left unused. Theoretically, InterCondor can provide enough processing ability to anyone at any time and anywhere if there are enough unused PCs in it. The InterCondor system seems to work well, but it is actually far from complete. There are still many problems to deal with. One problem is the lack of management of physically distributed data. Remote sensing applications are data-intensive. The long file transfer times between physically distributed PCs weaken the reduction in total processing time gained by using more PCs for one task. Although data compression and multi-threading are used, in some cases using InterCondor cannot decrease the total processing time of a task. The current version of InterCondor only supports Java programs, IDL programs, and programs that can run on the Windows operating system, and its security mechanism is SimpleCA. More support should be developed in the future. Acknowledgement. This publication is an output from the research projects “Multi-scale Aerosol Optical Thickness Quantitative Retrieval from Remotely Sensing Data at Urban Area” (40671142) and “Grid platform based aerosol monitoring modelling using MODIS data and middlewares development” (40471091) funded by NSFC, China, “Multi-sources quantitative remote sensing retrieval and fusion” (KZCX2-YW-313) funded by CAS, “973 Project - Active and passive remote sensing of land surface ecological and environmental parameters” (2007CB714407) funded by MOST, China, and the “Research Fund for Talent Program” funded by China Agricultural University.
References 1. Basney, J., Livny, M., Tannenbaum, T.: High Throughput Computing with Condor. HPCU news 1(2) (1997) 2. Cactus project, http://www.cactuscode.org 3. Condor glide mechanism, http://www.cs.wisc.edu/condor/glidein/ 4. Foster, I.: What is the Grid? A Three Point Checklist (2002), http://www. globus.org 5. Foster, I., Kesselman, C., Tuecke, S.: The anatomy of the grid: Enabling scalable virtual organizations. International Journal of High Performance Computing Applications 15(3), 200–222 (2001) 6. Gieralttowsk, J.: US-ATLAS, EDG, NorduGrid Interoperability: A Focus on ATLAS and GRAPPA (2002), http://www.hep.anl.gov/gfg/us-edg-interconnect/ jerryg-intergrid-comparison.ppt 7. Perez, J.M., Carretero, J., Garcia, J.D., Sanchez, L.M.: Grid data access architecture based on application I/O phases and I/O communities. In: PDPTA 2004: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, NV, USA, vol. 1–3, pp. 568–574. CSREA PRESS (2004) 8. Segal, B.: Grid Computing: The European Data Project. In: IEEE Nuclear Science Symposium and Medical Imaging Conference, Lyon, pp. 15–20 (October 2000)
Discrete Spherical Harmonic Transforms: Numerical Preconditioning and Optimization J.A. Rod Blais Department of Geomatics Engineering Pacific Institute for the Mathematical Sciences University of Calgary, Calgary, AB, T2N 1N4, Canada [email protected], www.ucalgary.ca/~blais
Abstract. Spherical Harmonic Transforms (SHTs) which are essentially Fourier transforms on the sphere are critical in global geopotential and related applications. Among the best known strategies for discrete SHTs are Chebychev quadratures and least squares. The numerical evaluation of the Legendre functions is especially challenging for the very high degrees and orders which are required for advanced geocomputations. The computational aspects of SHTs and their inverses using both quadrature and least-squares estimation methods are discussed with special emphasis on numerical preconditioning that guarantees reliable results for degrees and orders up to 3800 in REAL*8 or double precision arithmetic. These numerical results of spherical harmonic synthesis and analysis using simulated spectral coefficients are new and especially important for a number of geodetic, geophysical and related applications with ground resolutions approaching 5 km.
1 Introduction On the spherical Earth as on the celestial sphere, array computations can be done for regional and global domains using planar and spherical formulations. Spherical quadratures and least-squares estimation are used to convert continuous integral formulations into summations over data lattices. Spherical topologies are quite different from planar ones and these have important implications in the computational aspects of array data processing. Spherical geocomputations for regional domains of even continental extents can be reduced to planar computations and under assumptions of stationarity or shift invariance, discrete array computations can be optimized using Fast Fourier Transforms (FFTs). Specifically, convolution operations for filtering and other data processing applications thereby require only O(N log N) instead of O(N^2) operations for N data in one dimension, O(N^2 log N) instead of O(N^4) operations for NxN data in two dimensions, and so on. For global applications, Gaussian, equiangular and other similar regular grids can be used for spherical quadratures and discrete convolutions. Various quadrature strategies are available in the literature going back to Gauss and Neumann, in addition to least-squares estimation techniques (e.g. [8, 16]). Other approaches have also been
used for discretization and analysis of functions on the sphere using triangular and curvilinear tessellations based on inscribed regular polytopes (see e.g. [2]). Depending on the applications, these strategies may be preferable to the equiangular ones which will be discussed in the following. The associated Legendre functions for high degrees and orders are computationally very challenging. Without any normalization, one can hardly compute SHTs of degrees and orders over 50 or so in REAL*8 or double precision arithmetic. With proper normalization such as the geodetic one used in the following computations, one can achieve degrees and orders to around 1800 in REAL*8 or double precision arithmetic [4] and over 3600 in REAL*16 or quadruple precision arithmetic [5,6]. With latitude-based preconditioning, [13,14,19] achieved about 2700. With Clenshaw’s approach, similar results can be achieved for appropriately decreasing spectral coefficients such as with the EGM06 model coefficients of degree and order 2190 [7,13]. The following shows that with proper numerical preconditioning independent of the latitude, the Legendre functions can be evaluated reliably for degrees and orders over 3800 in REAL*8 or double precision arithmetic. This is demonstrated explicitly in synthesis and analysis computations using unit spectral coefficients with equiangular grids that do not include the poles. In other words, the previously published results [5,6] using Chebychev quadrature and least squares can be extended to degrees and orders over 3800 working in REAL*8 or double precision arithmetic. This is very important for numerous applications in geocomputations for ground resolutions of about 5 km.
2 Continuous and Discrete SHTs

The orthogonal or Fourier expansion of a function f(θ, λ) on the sphere S² is given by

f(\theta,\lambda) = \sum_{n=0}^{\infty} \sum_{|m| \le n} f_{n,m} \, Y_n^m(\theta,\lambda)    (1)

using colatitude θ and longitude λ, where the basis functions Y_n^m(θ, λ) are called the spherical harmonics of degree n and order m. In particular, the Fourier or spherical harmonic coefficients appearing in the preceding expansion are obtained as inner products

f_{n,m} = \int_{S^2} f(\theta,\lambda) \, \overline{Y_n^m(\theta,\lambda)} \, d\sigma    (2)

with the overbar denoting the complex conjugate and dσ denoting the standard rotation invariant measure dσ = sin θ dθ dλ on S². In most practical applications, the functions f(θ,λ) are band-limited in the sense that only a finite number of those coefficients are nonzero, i.e. f_{n,m} ≡ 0 for all degrees n > N and orders |m| ≤ n. Hence, using the regular equiangular grid θ_j = jπ/J and λ_k = 2πk/K, j = 0, …, J−1, k = 0, …, K−1, with J and K to be specified later on, spherical harmonic synthesis can be formulated as

f(\theta_j,\lambda_k) = \sum_{n=0}^{N-1} \sum_{|m| \le n} f_{n,m} \, Y_n^m(\theta_j,\lambda_k)    (3)

and, using some appropriate spherical quadrature, the corresponding spherical harmonic analysis can be formulated as

f_{n,m} = \sum_{j=0}^{J-1} \sum_{k=0}^{K-1} q_j \, f(\theta_j,\lambda_k) \, \overline{Y_n^m(\theta_j,\lambda_k)}    (4)

for quadrature weights q_j as discussed by various authors, e.g. [9,16,3]. The usual geodetic spherical harmonic formulation is given as

f(\theta,\lambda) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} [c_{nm} \cos m\lambda + s_{nm} \sin m\lambda] \, \tilde{P}_{nm}(\cos\theta)    (5)

where

\begin{Bmatrix} c_{nm} \\ s_{nm} \end{Bmatrix} = \frac{1}{4\pi} \int_{S^2} f(\theta,\lambda) \begin{Bmatrix} \cos m\lambda \\ \sin m\lambda \end{Bmatrix} \tilde{P}_{nm}(\cos\theta) \, d\sigma    (6)

with the geodetically normalized Legendre functions \tilde{P}_{nm}(\cos\theta) expressed in terms of the usual spherical harmonics Y_n^m(θ, λ) (see e.g. [11] and [3] for details). The tilde “~” will be used to indicate geodetic normalization in the following. Explicitly, using the geodetic formulation and convention, one has for synthesis

f(\theta,\lambda) = \sum_{n=0}^{N-1} \sum_{m=0}^{n} [c_{nm} \cos m\lambda + s_{nm} \sin m\lambda] \, \tilde{P}_{nm}(\cos\theta)    (7)

and for analysis, using complex analysis,

c_{nm} + i s_{nm} = \frac{1}{4\pi} \int_0^{2\pi} \int_0^{\pi} f(\theta,\lambda) (\cos m\lambda + i \sin m\lambda) \, \tilde{P}_{nm}(\cos\theta) \sin\theta \, d\theta \, d\lambda    (8)
              = \int_0^{\pi} [u_m(\theta) + i v_m(\theta)] \, \tilde{P}_{nm}(\cos\theta) \sin\theta \, d\theta

where

u_m(\theta) + i v_m(\theta) = \frac{1}{4\pi} \int_0^{2\pi} f(\theta,\lambda) (\cos m\lambda + i \sin m\lambda) \, d\lambda    (9)

which is simply the parallel-wise Fourier transform of the data. Hence, using data equispaced in longitude and the corresponding Discrete Fourier Transform (DFT) and Inverse DFT (IDFT), one can write for each parallel

\{f(\theta,\lambda_k)\} \xrightarrow{\;\mathrm{DFT}\;} \{u_m(\theta) + i v_m(\theta)\} \xrightarrow{\;\mathrm{IDFT}\;} \{f'(\theta,\lambda_k)\}    (10)

and correspondingly, for each meridian, with some appropriate Chebychev Quadrature (CQ) or Least Squares (LS) to be described explicitly below,

\{c_{nm} + i s_{nm}\} \xrightarrow{\;\mathrm{SYNTHESIS}\;} \{u_m(\theta) + i v_m(\theta)\} \xrightarrow{\;\mathrm{CQ\ or\ LS}\;} \{c'_{nm} + i s'_{nm}\}    (11)

in which the synthesis is only partial, i.e. in the Fourier domain. Notice that in general for band limit N, N rows of equilatitude data are required with LS while 2N rows of equispaced equilatitude data are required with CQ, and at least 2N equispaced data are required for the DFT per parallel (see e.g. [5,6] for more discussion).
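For illustration, the parallel-wise Fourier step of Eq. (9) can be sketched with an FFT library as follows. This is a sketch with assumed variable names and the 1/(4π) normalization of Eq. (9), not the author's code.

    import numpy as np

    # For each isolatitude row of K equispaced longitude samples, u_m + i v_m of Eq. (9)
    # is (1/4π)·Δλ times a Fourier sum, which equals one half of the inverse FFT of the row.
    def parallel_fourier(f_rows, band_limit):
        # f_rows: real array of shape (n_latitudes, K), samples f(theta_j, lambda_k);
        # K should be at least 2*band_limit to avoid aliasing.
        coeffs = np.fft.ifft(f_rows, axis=1)      # (1/K) * sum_k f * exp(+i m lambda_k)
        return 0.5 * coeffs[:, :band_limit]       # (1/(2K)) * sum_k ... = ifft / 2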
Furthermore, for data grids with Δθ = Δλ which are often required in practice, such as

{(θ_j, λ_k) | θ_j = jπ/N, λ_k = kπ/N; j = 0, 1, …, N−1, k = 0, 1, …, 2N−1} for LS, or
{(θ_j, λ_k) | θ_j = jπ/2N, λ_k = kπ/2N; j = 0, 1, …, 2N−1, k = 0, 1, …, 4N−1} for CQ,

the poles can be excluded with a shift in latitude

{(θ_j, λ_k) | θ_j = (j+½)π/N, λ_k = kπ/N; j = 0, 1, …, N−1, k = 0, 1, …, 2N−1}

and correspondingly

{(θ_j, λ_k) | θ_j = (j+½)π/2N, λ_k = kπ/2N; j = 0, 1, …, 2N−1, k = 0, 1, …, 4N−1},

which allow the use of hemispherical symmetries in the associated Legendre functions

P_{nm}(\cos(\pi - \theta)) = (-1)^{n+m} P_{nm}(\cos\theta).

Notice that these data grids with Δθ = Δλ have Nx2N and 2Nx4N quantities and N² spectral coefficients for band limit N. More details with simulated results can be found in [5,6].
3 Numerical Preconditioning and Optimization

The numerical evaluation of the associated Legendre functions Pnm(cos θ) for colatitude θ is very challenging for high degrees n and orders m. To see this, one only has to evaluate the diagonal terms Pnn(cos 1º) for n = 1, 2, …, 100, …, using different normalizations. Furthermore, there are several strategies with recursions in n and m and these are far from being numerically equivalent (see e.g. [1,3,14]). The geodetically normalized associated Legendre functions \tilde{P}_{nm}(\cos\theta) are computed as a lower triangular matrix with the rows corresponding to the degrees n and the columns corresponding to the orders m. With the initialization for degrees and orders 0 and 1,

\tilde{P}_{00}(\cos\theta) = 1, \quad \tilde{P}_{10}(\cos\theta) = \sqrt{3}\,\cos\theta, \quad \tilde{P}_{11}(\cos\theta) = \sqrt{3}\,\sin\theta,

the diagonal terms

\tilde{P}_{nn}(\cos\theta) = \sqrt{(2n+1)/(2n)} \, \sin\theta \, \tilde{P}_{n-1,n-1}(\cos\theta)

and the subdiagonal terms

\tilde{P}_{n,n-1}(\cos\theta) = \sqrt{2n+1} \, \cos\theta \, \tilde{P}_{n-1,n-1}(\cos\theta)

are computed recursively. The remaining terms for n ≥ 2 and n−2 ≥ m ≥ 0 are evaluated with the three-term formula

\cos\theta \, \tilde{P}_{nm}(\cos\theta) = \alpha_{nm} \, \tilde{P}_{n-1,m}(\cos\theta) + \alpha_{n+1,m} \, \tilde{P}_{n+1,m}(\cos\theta), \qquad \alpha_{nm} = \sqrt{(n-m)(n+m) / ((2n-1)(2n+1))},

as detailed in [18], based on [10, 15]. With this approach, degrees and orders up to 1800 or so have been achieved quite reliably with REAL*8 or double precision and over 3600 with REAL*16 or quadruple precision arithmetic [5, 6]. Using IEEE standards for floating point arithmetic on personal and similar computers, the EXPONENT limits for real, double and quadruple precision are as follows:

Variable Type    Minimum EXPONENT    Maximum EXPONENT
REAL*4           -125                128
REAL*8           -1021               1024
REAL*16          -16381              16384
Hence, to avoid underflow problems in computing \tilde{P}_{nm}(\cos\theta), a simple strategy is to bias the EXPONENT of the corresponding variable by a large number and remove the bias once the computations are done. Specifically, using personal and similar computers with REAL*8 or double precision arithmetic, a bias of 1000 was applied to the EXPONENT of the variable in question and the resulting numerical computations are very stable. More details are included in the next section. This strategy of biasing the EXPONENT is independent of the colatitude θ, which is very important in this context. The literature on computing very high degree spherical harmonics contains numerous strategies such as using the Clenshaw summation approach [13], preconditioning the variable with 10^280 sin θ in SHTools [19], modifying the recursion to avoid high powers of sin θ [14], and others. These modifications have been reported to achieve degrees around 2700-2800 in synthesis computations [13, 14] and in synthesis and analysis [19]. Notice that in synthesis-only computations, it is only necessary to avoid underflows and set the corresponding incremental contributions to zero, while in synthesis and analysis computations, the numerical recovery of the input spectral coefficients is expected in addition to avoiding possible underflows.
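The exponent-biasing idea can be sketched as follows for the sectoral terms, which are proportional to (sin θ)^n and hence the first to underflow. The Fortran implementation biases the EXPONENT of the variable directly; this Python analogue, which carries an explicit extra binary exponent via frexp/ldexp, is only an illustration of the same principle, not the author's code.

    import math

    def sectoral_terms(theta, n_max):
        # Returns pairs (mantissa, extra_exp) with P~_nn(cos theta) = mantissa * 2**extra_exp.
        sin_t = math.sin(theta)
        terms = [(1.0, 0), (math.sqrt(3.0) * sin_t, 0)]   # P~_00 and P~_11
        value, extra_exp = terms[1]
        for n in range(2, n_max + 1):
            value *= math.sqrt((2 * n + 1) / (2 * n)) * sin_t
            mantissa, exp = math.frexp(value)             # renormalise the mantissa
            value, extra_exp = mantissa, extra_exp + exp  # accumulate the exponent bias
            terms.append((value, extra_exp))
        return terms

    # When a term is finally needed, math.ldexp(mantissa, extra_exp) rebuilds it; it may
    # legitimately underflow to zero only at that final step (e.g. in synthesis-only use).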
4 Numerical Experimentation

To test the numerical preconditioning procedure, consider the following sequence

\{c_{nm} + i s_{nm}\} \xrightarrow{\;\mathrm{SYNTHESIS}\;} \{u_m(\theta) + i v_m(\theta)\} \xrightarrow{\;\mathrm{CQ\ or\ LS}\;} \{c'_{nm} + i s'_{nm}\}
with the partial SYNTHESIS in the Fourier domain as

u_m(\theta) + i v_m(\theta) = \sum_{n=m}^{N-1} (c_{nm} + i s_{nm}) \, \tilde{P}_{nm}(\cos\theta)    (12)
for 2N isolatitudes with Δθ = π/2N for CQ and N isolatitudes with Δθ = π/N for LS. In longitude, 2N equispaced points with Δλ = π/2N are used for both CQ and LS in the following experimentation. A shift in latitude of the grids by half Δθ has been implemented to exclude the poles. The Chebychev quadrature is as follows:

c''_{nm} + i s''_{nm} = \sum_{j=0}^{2N-1} q_j \, (u_{jm} + i v_{jm}) \, \tilde{P}_{nm}(\cos\theta_j)    (13)

with u_{jm} ≡ u_m(θ_j) and v_{jm} ≡ v_m(θ_j), and CQ weights

q_j = \frac{1}{N} \sin\big((j+\tfrac{1}{2})\pi/2N\big) \sum_{h=0}^{N-1} \frac{1}{2h+1} \sin\big((2h+1)(j+\tfrac{1}{2})\pi/2N\big)    (14)
with q_{2N-j} = q_j for j = 0, 1, …, N−1 by hemispherical symmetry. These computations are roughly O(N³) for degree N.
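Equation (14) can be transcribed directly; the following Python sketch (not the author's Fortran code) computes the weights on the pole-free grid θ_j = (j + ½)π/2N.

    import math

    def cq_weights(N):
        # Chebychev-type quadrature weights q_j of Eq. (14), j = 0, ..., 2N-1.
        weights = []
        for j in range(2 * N):
            theta_j = (j + 0.5) * math.pi / (2 * N)
            s = sum(math.sin((2 * h + 1) * theta_j) / (2 * h + 1) for h in range(N))
            weights.append(math.sin(theta_j) * s / N)
        return weights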
The least-squares formulation per degree m is as follows:

\sum_{n=m}^{N-1} \tilde{P}_{nm}(\cos\theta_j) \, (c'''_{nm} + i s'''_{nm}) = u_m(\theta_j) + i v_m(\theta_j)    (15)

with isolatitudes θ_j = (j + ½)π/N = (2j + 1)π/2N for j = 0, 1, …, N−1.
The least-squares computations for c'''_{nm} + i s'''_{nm} per degree m are obviously very demanding and roughly O(N⁴). The elements in the corresponding normal matrices can be evaluated using the Christoffel-Darboux formula, as shown in [18, Appendix B] based on [12], but this has not yet been implemented in the code. Starting with simulated unit spectral coefficients, the reconstructed coefficients are compared with the input coefficients and Root-Mean-Square (RMS) values are computed for various degrees and orders using both CQ and LS. The key results, all obtained in REAL*8 or double precision arithmetic, are shown in Table 1 for degrees and orders to 3800. The results for degrees and orders to 1800 or so agree exactly with the REAL*8 or double precision results published in [5,6]. The second column of Table 1 gives the RMS results on a Notebook (NB) while the third column gives the RMS results on a desktop PC, both with AMD Athlon64 Dual-Core Processors. The RMS degrades to E-04 for 3900 with CQ, but the situation is better using LS, with E-13 for 3900 and E-09 for 4000. The numerical stability exhibited in Table 1 for synthesis/analysis for degrees over 3000 has not previously been seen anywhere else in the literature. Table 1. Numerical RMS results for SYNTHESIS/ANALYSIS with simulated unit spectral coefficients on AMD64 NB and PC in REAL*8 or double precision arithmetic
Degrees N    CQ RMS on AMD64 NB (data grid: 2Nx2N)    LS RMS on AMD64 PC (data grid: Nx2N)
1000         1.24557E-12                              5.53768E-14
2000         3.16427E-12                              1.13533E-13
3000         6.72616E-12                              1.67988E-13
3200         2.59890E-12                              1.66504E-13
3400         3.86647E-12                              1.65382E-13
3600         3.54980E-12                              1.64626E-13
3800         5.63723E-11                              2.08633E-13
5 Concluding Remarks Considerable work has been done on solving the computational complexities, and enhancing the speed of calculation of spherical harmonic transforms for different equiangular grids. The numerical problems of evaluating the associated Legendre functions for very high degrees and orders have been solved using numerical preconditioning in terms of the EXPONENT of the corresponding variable. Explicitly, using
simulated unit spectral coefficients for degrees and orders up to 3800, partial synthesis and analysis lead to RMS errors of the order of 10^-12 to 10^-13. When starting with simulated spherical harmonic coefficients corresponding to 1/degree^2, the previous results can be expected to improve by a couple of orders of magnitude, as experienced in previous experimental work [4, 5, 6]. The latter simulations would perhaps be more indicative of the expected numerical accuracies in practice. As enormous quantities of data are involved in the intended gravity field applications, parallel and grid computations are imperative for these applications. Preliminary experimentation with parallel processing has already been done [17] and these REAL*8 or double precision results can readily be duplicated in parallel environments. Acknowledgement. The author would like to acknowledge the sponsorship of the Natural Science and Engineering Research Council in the form of a Research Grant on Computational Tools for the Geosciences. Special thanks are also expressed to Dr. M. Soofi, Post Doctoral Fellow (2005-2006) in Geomatics Engineering and Geoscience, University of Calgary, for helping with the optimization of the code for different computer platforms.
References 1. Adams, J.C., Swarztrauber, P.N.: SPHEREPACK 2.0: A Model Development Facility (1997), http://www.scd.ucar.edu/softlib/SPHERE.html 2. Blais, J.A.R.: Optimal Spherical Triangulation for Global Multiresolution Analysis and Synthesis. In: The 2007, Fall Meeting of the American Geophysical Union in San Francisco, CA (2007) 3. Blais, J.A.R., Provins, D.A.: Spherical Harmonic Analysis and Synthesis for Global Multiresolution Applications. Journal of Geodesy 76(1), 29–35 (2002) 4. Blais, J.A.R., Provins, D.A.: Optimization of Computations in Global Geopotential Field Applications. In: Sloot, P.M.A., Abramson, D., Bogdanov, A.V., Gorbachev, Y.E., Dongarra, J., Zomaya, A.Y. (eds.) ICCS 2003. LNCS, vol. 2658, pp. 610–618. Springer, Heidelberg (2003) 5. Blais, J.A.R., Provins, D.A., Soofi, M.A.: Spherical Harmonic Transforms for Discrete Multiresolution Applications. Journal of Supercomputing 38, 173–187 (2006) 6. Blais, J.A.R., Provins, D.A., Soofi, M.A.: Optimization of Spherical Harmonic Transform Computations. In: Sunderam, V.S., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2005. LNCS, vol. 3514, pp. 74–81. Springer, Heidelberg (2005) 7. Blais, J.A.R., Soofi, M.A.: Optimization of Discrete Spherical Harmonic Transforms and Applications. Poster Presentation at the 2006 Fall Meeting of the American Geophysical Union in San Francisco, CA (2006) 8. Colombo, O.: Numerical Methods for Harmonic Analysis on the Sphere. Report no. 310, Department of Geodetic Science and Surveying, The Ohio State University (1981) 9. Driscoll, J.R., Healy Jr., D.M.: Computing Fourier Transforms and Convolutions on the 2Sphere. Advances in Applied Mathematics 15, 202–250 (1994) 10. Gradshteyn, I.S., Ryzhik, I.M.: Tables of Integrals, Series and Products. Academic Press, London (1980) 11. Heiskanen, W.A., Moritz, H.: Physical Geodesy, p. 363. W.H. Freeman, San Francisco (1967)
12. Hildebrand, F.B.: Introduction to Numerical Analysis. McGraw-Hill, New York (1956) 13. Holmes, S.A., Featherstone, W.E.: A Unified Approach to the Clenshaw Summation and the Recursive Computation of Very High Degree and Order Normalised Associated Legendre Functions. Journal of Geodesy 76, 279–299 (2002) 14. Jekeli, C., Lee, J.K., Kwon, J.H.: On the Computation and Approximation of Ultra-HighDegree Spherical Harmonic Series. Journal of Geodesy 81(9), 603–615 (2007) 15. Paul, M.K.: Recurrence Relations for Integrals of Associated Legendre Functions. Bulletin Geodesique 52, 177–190 (1978) 16. Sneeuw, N.: Global Spherical Harmonic Analysis by Least-Squares and Numerical Quadrature Methods in Historical Perspective. Geophys. J. Int. 118, 707–716 (1994) 17. Soofi, M.A., Blais, J.A.R.: Parallel Computations of Spherical Harmonic Transforms, Oral Presentation at the 2005 Annual Meeting of the Canadian Geophysical Union, Banff, Alberta, Canada (2005) 18. Swarztrauber, P.N., Spotz, W.F.: Generalized Discrete Spherical Harmonic Transforms. J. Comp. Phys. 159, 213–230 (2000) 19. Wieczorek, M.: SHTOOLS: Tools for Working with Spherical Harmonics. Centre National de la Recherche Scientifique, Institut de Physique du Globe de Paris (2007), http://www. ipgp.jussieu.fr/~wieczor/SHTOOLS/www/accuracy.html
A Data Management Framework for Urgent Geoscience Workflows Jason Cope and Henry M. Tufo Department of Computer Science, University of Colorado at Boulder, 430 UCB, Boulder, CO, 80309-0430, USA {jason.cope, henry.tufo}@colorado.edu
Abstract. The emerging class of urgent geoscience workflows are capable of quickly allocating computational resources for time critical tasks. To date, no urgent computing capabilities for data services exist. Since urgent geoscience and Earth science workflows are typically data intensive, urgent data services are necessary so that these urgent workflows do not bottleneck on inappropriately managed or provisioned resources. In this paper we examine emerging urgent Earth and geoscience workflows, the data services used by these workflows, and our proposed urgent data management framework for managing urgent data services.
1 Introduction
The emergence of Grid computing as a viable high-performance computing (HPC) environment has provided several innovative technologies that enhance traditional scientific workflows. Dynamic data driven applications and workflows in particular benefit from improvements in data integration technology, distributed and dynamic computing resource integration, and wide-area network infrastructure. Recent research into urgent computing systems has further improved several of these workflows that perform emergency computations. Examples of these applications and workflows include Linked Environments for Atmospheric Discovery (LEAD) [1], the Southern California Earthquake Center’s (SCEC) TeraShake [2], the Southeastern Universities Research Association (SURA) Coastal Ocean Observing and Prediction (SCOOP) project [3], and the Data Dynamic Simulation for Disaster Management project which is developing a Coupled Atmosphere-Fire (CAF) workflow for wildfire prediction [4]. The LEAD and SCOOP projects successfully use the Special PRiority and Urgent Computing Environment (SPRUCE) to obtain high-priority access to the shared computing resources available on the TeraGrid [5]. SPRUCE provides project users with elevated and automated access to TeraGrid computational resources so that high-priority applications run immediately or as soon as possible. SPRUCE currently provides urgent computational resource allocation capabilities but does not yet support urgent storage or data management capabilities. Urgent storage and data management capabilities provide prioritized usage of storage resources, such as file systems, data streams, and data catalogs.
Since a common definition for a scientific workflow is the flow of data between computational processes [6], providing urgent storage and data management is an essential and currently absent capability for urgent computing workflows. Supporting end-to-end urgent computing workflows requires support for common data capabilities, such as data storage, access, search, and manipulation capabilities, required by these workflows. Our framework provides workflows and users with several urgent storage and data management capabilities, including the configuration of Service Level Agreements (SLAs) and Quality of Service (QoS) for data services, management of urgent and non-urgent data in shared computing environments, and autonomic management infrastructure that can adapt and tune data services without administrator intervention. These capabilities are designed as a series of shims that can integrate with existing data management infrastructure. In this paper, we present our proposed approach and framework in further detail. Section 2 describes the data requirements for urgent geoscience applications. Section 3 describes current urgent computing infrastructure. Section 4 describes common data services available to geoscience workflows. Section 5 describes our urgent data management framework. In the final sections, we present future work and conclusions.
2 Urgent Geoscience Applications and Grids
Advances in data and resource integration tools foster a computing environment capable of executing time critical workflows. These time-critical or urgent computing workflows harness distributed computational and data resources to quickly and reliably execute applications. The geoscience and Earth science community developed several applications with urgent computing use cases, such as earthquake, severe weather, flooding, and wildfire modeling applications. A typical characteristic of these applications is that they are I/O intensive. These applications often generate or ingest large amounts of data using various data resources, such as sensor networks, archival storage, and distributed storage systems. An example urgent application whose I/O requirements have been thoroughly analyzed in past work is SCEC TeraShake. The TeraShake simulations are constrained by data management resources because of the large amount of data produced [7]. A high-resolution SCEC simulation generated a total of 40TB of data on a 36TB storage resource. These limitations required developers to move the data as it was generated to other storage resources [2]. The integration of streaming data is demonstrated in two urgent computing workflows. LEAD uses Calder to integrate various data streams into the simulation environment. Calder can accommodate variable data sizes, data generation rates, and user access loads [8]. The CAF workflow has integrated sensor data and shown that prefetch of the data can improve application performance [9]. While the sensor data streams may not be high volume, their performance is limited by lack of network capacity and storage availability. In order to achieve urgent data management capabilities, the data requirements for the various applications
must be accounted for. Other services, such as data processing tasks, utilize both computational and storage resources. Conflict-free allocation of multiple resources that satisfy QoS requirements is necessary. Several of these urgent workflows utilize Web services and service-oriented architectures. Both the LEAD and SCOOP projects utilize Web services to interface with a suite of data services. These services include access to archival storage, distributed storage systems, and metadata catalogs. To complicate matters, these workflows are dynamic and are not limited to urgent computing use. Many of these services, such as UCAR’s Unidata services, are available for use by the general Earth science community. Access to these resources or services must be appropriately provisioned based on the need and urgency of the request.
3 Urgent Computing Infrastructure
The Special PRioirty and Urgent Computing Environment (SPRUCE) [5] enables on-demand resource allocation, authorization, and selection for urgent computing applications. This environment provides on-demand access to shared Grid computing resources with a token-based authorization framework. SPRUCE allows Virtual Organizations (VOs) to utilize existing computing infrastructure for time critical tasks instead of procuring dedicated resources for these tasks. Users submitting SPRUCE jobs specify a color-coded urgency parameter with their job description. SPRUCE authorizes the urgent job by verifying that a user is permitted to execute tasks with the specified urgency on the target resource. Each VO defines policies for how the urgent tasks are handled on a per-resource basis. For example, a resource provider may choose to preempt nonurgent jobs for high-priority tasks or to give the urgent tasks next-to-run privileges. The infrastructure is currently deployed on several TeraGrid resources, including the NCAR’s Frost Blue Gene/L, ANL’s DTF TeraGrid cluster, and the SDSC’s DataStar and DTF TeraGrid cluster. The LEAD and SCOOP projects use SPRUCE for urgent allocation of computational resources. While SPRUCE currently provides access to computational resources, in the future it could also be adapted to manage other resources common in workflows, including storage and network resources. To completely support end-to-end urgent computing workflows in Grids, the usage and performance of storage and network resources must be accounted for in urgent computing management infrastructure. Therefore, we proposed the development of an urgent data management framework and services to support data-related tasks in urgent computing workflows. These capabilities will provide the appropriate SLAs and QoS for data services used in urgent computations. Several components are required to adapt current urgent computing capabilities to support these new resource types and several new capabilities are required to support the data requirements of urgent computing workflows. Tools are necessary to integrate existing urgent computing authorization infrastructure with common data services. Additional resource management tasks and processes are necessary to manage data products for Grid resources, resource users, and Grid workflows executing on these resources.
Infrastructure is also required to coordinate access to the multiple urgent resources and to ensure that conflicts in usage do not occur. Our proposed framework will provide these capabilities for common data services used by urgent applications.
4 Common Data Services
Across the spectrum of geoscience and Earth science Grids, there are several common data services available. These services can be classified as data storage, management, and processing services. The capabilities provided by these services are meant for general use and most provide little or no support for QoS, SLAs, or prioritized access to the data resources. The management of storage and data resources for urgent computing workflows is not addressed by any of these common data services. In this section, we describe the data services available in most Grids and how these service types can be augmented to support urgent storage and data management. The most prevalent data services are data storage services. These services usually tightly couple to a computational resource, provide a staging area for transferring data between distributed resources, and provide a scratch work space for applications to store temporary data. The most common storage in Grids are file and archival systems accessible to one or more resource within a single VO. Transferring data between VOs requires the use of data transfer tools, such as GridFTP [10] and the Reliable File Transfer (RFT) service [11]. Recent developments and research in Grid storage systems have adapted cluster file systems to wide-area computing environments. Examples of wide-area file systems on the TeraGrid include the deployment of IBM’s GPFS file system and Sun Microsystem’s Lustre file system [12,13]. Recent work with Grid data transfers has begun to address providing quality of service in Grids with variable throughput links and resource availability using tools such TeraPaths and autonomic computing [14,15]. To date, none of these capabilities have addressed urgent computing data storage services. The capabilities of recent services, such as the GridFTP QoS provisioning [16], the Managed Object Placement Service (MOPS) [17], and the Data Placement Service (DPS)[18], provide the means for obtaining and sustaining QoS for a storage resource but cannot manage urgent data requirements without additional support. To adequately support urgent computations and workflows, an additional management layer is required to obtain the QoS best suited for the I/O footprint of an urgent computation, adjust the QoS of other concurrent workflows so that urgent workflows are not starved for resources, and negotiate end-to-end workflow scheduling for all urgent computing resources, such as storage, compute, or network resources through the utilization of available capabilities. The immense amount of data produced by some emerging computations has frequently been cited as a hurdle to scaling applications to larger systems. Numerous tools have been developed and integrated into geoscience workflows to support data management tasks. Example data management tools include the
Storage Resource Broker (SRB) [19], the Globus Replica Location Service (RLS) [20], and GriPhyN [21]. These services allow users to store and retrieve data from a variety of Grid storage systems or to build catalogs of data products. Several geoscience-specific data services also provide data access and management capabilities for users. Examples include the myLEAD metadata catalog [22], Unidata's Thematic Realtime Environmental Distributed Data Services (THREDDS) [23], and data stream services such as Calder [8]. There is no explicit support in these data management services for urgent computing applications. There is little or no support for indicating QoS or SLA requirements to these services. Data management features for urgent computing workflows are necessary. These required capabilities include data replica and lifetime management of urgent and non-urgent data. The last set of common data services are data processing services. These services provide various data integration or manipulation tasks. In this set of services, we include data discovery, assimilation, validation, and visualization services. Geoscience workflows, such as those in LEAD, SCOOP, and Grid-BGC [24], use several of these services. These services are generally computationally intensive, and urgent computing capabilities built from these services would benefit from coupling urgent data provisioning with computation. The interplay of provisioning the multiple resources required to support these services increases the management complexity of these data services. An additional resource management component is likely required to coordinate these provisioning tasks.
5 Urgent Data Management Capabilities and Framework
To support the common data services described above in urgent workflows, we propose an additional data management layer. This Urgent Data Management Framework (UDMF) will leverage existing QoS, SLA, and resource provisioning infrastructures to allocate data resources for urgent computing workflows. These capabilities will integrate into a data and resource management layer responsible for allocating the appropriate SLAs and QoS for data resources and for providing appropriate access levels and services. The UDMF will provide application and Web service interfaces so that workflows can invoke the urgent data management capabilities. Several challenges must be addressed by our framework so that it can provide urgent data services. First, our framework must interoperate with a variety of heterogeneous Grid resources managed by different VOs, such as the data services mentioned in the previous section. Since Grids are heterogeneous computing environments, UDMF must cope with differences in data service types, management policies, and characteristics. Another challenge that must be addressed is how to configure and manage these urgent data services. The urgent data services should require minimal human interaction for configuration, but should be intelligent enough to adapt to environmental changes. Another non-trivial integration issue for our framework is how much effort is required for software developers and users to interact with it.
[Figure: UDMF architecture diagram. Components shown: data client task request, Web services, network services, urgent data manager, data processing task, data manager, urgent multi-resource manager, urgent service access manager, urgent computing authorization infrastructure, urgent storage manager, urgent computational resource manager, storage resource, and computational resource.]
Fig. 1. The architecture and component interactions within UDMF
While configuring QoS for a data service will be required, the amount of effort to use our framework in existing tools should be minimized. Fig. 1 illustrates the proposed UDMF components and interactions. The components in boxes with solid lines represent existing infrastructure, while the components in boxes with dashed lines are the proposed UDMF components. A variety of data and computational resources interact within our proposed architecture. Data resources, computational resources, service providers, service consumers, and urgent resource managers interact to provide urgent data services and data management capabilities to users. UDMF consists of several shims that fit between existing layers of data, service, and computing infrastructure. These shims will intercept normal data operations through hooks into the existing resources and adapt these resources for urgent computing demands or requirements. For example, we have begun development of the urgent computing access and provisioning tools for Grid services. This urgent computing infrastructure hooks into the authorization framework of Grid services to negotiate consumer access to the service based on the applicable urgent computing policy for the consumer. Hooks into other existing tools are also available; examples include the Data Storage Interface (DSI) of GridFTP and the interception of I/O requests with overloaded operations, similar to Trickle [25]. UDMF addresses the requirements and challenges for supporting urgent data services that we have defined throughout this paper. To fulfill these urgent data requests at the user-interface level, we will provide several Grid or Web services
that allow users to define urgent computing requirements for a specific data service. These services will be accessible for use in existing workflow managers, such as Kepler [6]. A specific urgent data manager will be available for each data service type and these managers will coordinate through a general urgent resource manager. Resource administrators will define policies that these managers will follow. Autonomic computing tools will use the policies to adapt, tune, and manage these resources during urgent computations. The use of autonomic computing will relieve the need for human intervention during urgent computing events. The capabilities provided by each resource manager will vary based on the service. The urgent storage manager will negotiate access to storage and data resources, such as data transfer tools and scratch file systems. The urgent data manager will provide data replication, migration, and removal tasks to adapt existing data and resources for urgent computations. The multi-resource storage manager will negotiate access between coupled data and computational resources used by an urgent data task. This manager will leverage real-time computing and deadline scheduling techniques to schedule resource allocations and to identify potential scheduling conflicts. The UDMF architecture is intentionally small and designed to be as unobtrusive as possible. The managers will be easy to deploy and manage through modular integration or instrumentation of existing services. Workflows will only require minimal changes to properly allocate or access urgent data services by invoking the urgent data service APIs or Web service operations.
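To make the intended usage concrete, the fragment below sketches how a workflow task might invoke such an urgent data service; the udmf client object, its method names and their parameters are hypothetical placeholders, since this paper defines the required capabilities rather than a final API.

```python
# Hypothetical usage sketch of the proposed UDMF interfaces. The client object,
# method names and parameters are illustrative assumptions, not a defined API.

def stage_urgent_run(udmf, token, dataset, deadline):
    # Request an urgent staging area with the QoS needed for the run.
    storage = udmf.request_storage(token=token, size_gb=dataset.size_gb,
                                   qos="urgent", deadline=deadline)
    # Replicate the input data with urgent priority and bound its lifetime so
    # that the space is reclaimed once the urgent computation has finished.
    udmf.replicate(dataset, target=storage, priority="urgent")
    udmf.set_lifetime(storage, expires=deadline)
    return storage
```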
6 Conclusions
In this paper, we described our proposed approach to provisioning geoscience and Earth science data services for urgent computing applications and workflows. Our proposed approach is based on analyses of existing geoscience and Earth science data services and how these services are used in urgent computing workflows. Based on our studies, we devised an urgent data management framework that consists of additional layers and shims that can augment existing data management systems with urgent computing capabilities. This infrastructure is lightweight and is designed to not interfere with existing workflow execution, but to provide additional capabilities to urgent computing workflows. We have begun development of the autonomic management infrastructure, urgent data manager, and urgent service access components of our proposed framework. We expect to demonstrate the operation of these tools during the summer of 2008. Acknowledgments. University of Colorado computer time was provided by equipment purchased under DOE SciDAC Grant #DE-FG02-04ER63870, NSF ARI Grant #CDA-9601817, NSF MRI Grant #CNS-0421498, NSF sponsorship of the National Center for Atmospheric Research, and a grant from the IBM Shared University Research (SUR) program. We would like to thank the members of the SPRUCE project, including Pete Beckman, Suman Nadella, and Nick Trebon, for their guidance and support of this research.
References 1. Droegemeier, K., Chandrasekar, V., Clark, R., Gannon, D., Graves, S., Joesph, E., Ramamurthy, M., Wilhelmson, R., Brewster, K., Domenico, B., Leyton, T., Morris, V., Murray, D., Pale, B., Ramachandran, R., Reed, D., Rushing, J., Weber, D., Wilson, A., Xue, M., Yalda, S.: Linked Environments for atmospheric discovery (LEAD): A Cyberinfrastructure for Mesoscale Meteorology Research and Education. In: Proceedings of the 20th Conference on Interactive Information Processing Systems for Meteorology, Oceanography, and Hydrology, Seattle, WA, January 2004, American Meteorological Society (2004) 2. Cui, Y., Moore, R., Olsen, K., Chourasia, A., Maechling, P., Minster, B., Day, S., Hu, Y., Zhu, J., Majumdar, A., Jordan, T.: Enabling Very–Large Scale Earthquake Simulations on Parallel Machines. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4487, pp. 46–53. Springer, Heidelberg (2007) 3. Bogden, P., Gale, T., Allen, G., MacLaren, J., Almes, G., Creager, G., Bintz, J., Wright, L., Graber, H., Williams, N., Graves, S., Conover, H., Galluppi, K., Luettich, R., Perrie, W., Toulany, B., Sheng, Y., Davis, J., Wang, H., Forrest, D.: Architecture of a Community Infrastructure for Predicting and Analyzing Coastal Inundation. Marine Technology Society Journal 41(1), 53–71 (2007) 4. Mandel, J., Beezley, J., Bennethum, L., Chakraborty, S., Coen, J., Douglas, C., Hatcher, J., Kim, M., Vodacek, A.: A Dynamic Data Driven Wildland Fire Model. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4487, pp. 1042–1049. Springer, Heidelberg (2007) 5. Beckman, P., Beschatnikh, I., Nadella, S., Trebon, N.: Building an Infrastructure for Urgent Computing. High Performance Computing and Grids in Action (to appear, 2008) 6. Ludascher, B., Altintas, I., Berkley, C., Higgins, D., Jaeger, E., Jones, M., Lee, E., Tao, J., Zhao, Y.: Scientific Workflow Management and the Kepler System. IEEE Internet Computing 18(10), 1039–1065 (2005) 7. Faerman, M., Moore, R., Cui, Y., Hu, Y., Zhu, J., Minster, B., Maechling, P.: Managing Large Scale Data for Earthquake Simulations. Journal of Grid Computing 5(3), 295–302 (2007) 8. Liu, Y., Vijayakumar, N., Plale, B.: Stream Processing in Data-driven Computational Science. In: 7th IEEE/ACM International Conference on Grid Computing (Grid 2006) (September 2006) 9. Douglas, C., Beezley, J., Coen, J., Li, D., Li, W., Mandel, A., Mandel, J., Qin, G., Vodacek, A.: Demonstrating the Validity of Wildfire DDDAS. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3993, pp. 522–529. Springer, Heidelberg (2006) 10. Allcock, B., Bester, J., Bresnahan, J., Chervenak, A., Foster, I., Kesselman, C., Meder, S., Nefedova, V., Quesnal, D., Tuecke, S.: Data Management and Transfer in High Performance Computational Grid Environments. Parallel Computing Journal 28(5), 749–771 (2002) 11. Allcock, W., Foster, I., Madduri, R.: Reliable Data Transport: A Critical Service for the Grid. In: Global Grid Forum 11 (June 2004) 12. TeraGrid GPFS WAN (2008), http://www.teragrid.org/userinfo/data/gpfswan.php 13. Simms, S., Pike, G., Balog, D.: Wide Area Filesystem Performance using Lustre on the TeraGrid. In: Proceedings of the TeraGrid 2007 Conference, Madison, WI (June 2007)
14. Katramatos, D., Yu, D., Gibbard, B., McKee, S.: The TeraPaths Testbed: Exploring End-to-End Network QoS. In: 2007 3rd International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities (TridentCom 2007) (May 2007) 15. Bhat, V., Parashar, M., Khandekar, M., Kandasamy, N., Klasky, S.: A SelfManaging Wide-Area Data Streaming Service using Model-based Online Control. In: Proceedings of the IEEE Conference on Grid Computing 2006 (Grid 2006), Barcelona, Spain (September 2006) 16. Bresnahan, J., Link, M., Khanna, G., Imani, Z., Kettimuthu, R., Foster, I.: Globus GridFTP: What’s New in 2007. In: Proceedings of the First International Conference on Networks for Grid Applications (GridNets 2007) (October 2007) 17. Baranovski, A., Bharathi, S., Bresnahan, J., Chervenak, A., Foster, I., Fraser, D., Freeman, T., Gunter, D., Jackson, K., Keahey, K., Kesselman, C., Konerding, D., Leroy, N., Link, M., Livny, M., Miller, N., Miller, R., Oleynik, G., Pearlman, L., Schopf, J., Schuler, R., Tierney, B.: Enabling Distributed Petascale Science. In: Proceedings of SciDAC 2007, Boston, MA (June 2007) 18. Chervenak, A., Schuler, R.: A Data Placement Service for Petascale Applications. In: Petascale Data Storage Workshop, Supercomputing 2007, Reno, NV (November 2007) 19. Rajasekar, A., Wan, M., Moore, R., Schroeder, W., Kremenek, G., Jagatheesan, A., Cowart, C., Zhu, B., Chen, S., Olschanowsky, R.: Storage Resource Broker Managing Distributed Data in a Grid. Computer Society of India Journal 33(4), 42–54 (2003) 20. Chervenak, A.L., Palavalli, N., Bharathi, S., Kesselman, C., Schwartzkopf, R.: Performance and Scalability of a Replica Location Service. In: International IEEE Symposium on High Performance Distributed Computing (HPDC-13), Honolulu, HI (June 2004) 21. Nefedova, V., Jacob, R., Foster, I., Liu, Z., Liu, Y., Deelman, E., Mehta, G., Su, M., Vahi, K.: Automating Climate Science: Large Ensemble Simulations on the Teragrid with the GriPhyN Virtual Data System. In: 2nd International IEEE Conference on e-Science and Grid Computing (December 2006) 22. Plale, B., Gannon, D., Alameda, J., Wilhelmson, B., Hampton, S., Rossi, A., Droegemeier, K.: Active Management of Scientific Data. IEEE Internet Computing 9(1), 27–34 (2005) 23. Domenico, B., Caron, J., Davis, E., Kambic, R., Nativi, S.: Thematic Real-time Environmental Distributed Data Services (thredds): Incorporating Interactive Analysis Tools into NSDL. Journal of Digital Information 2(4) (May 2002) 24. Cope, J., Hartsough, C., Thornton, P., Tufo, H.M., Wilhelmi, N., Woitaszek, M.: Grid-BGC: A Grid-Enabled Terrestrial Carbon Cycle Modeling System. In: Cunha, J.C., Medeiros, P.D. (eds.) Euro-Par 2005. LNCS, vol. 3648. Springer, Heidelberg (2005) 25. Eriksen, M.A.: Trickle: A Userland Bandwidth Shaper for Unix-like Systems. In: Proceedings of USENIX 2005, Anaheim, CA (April 2005)
Second Workshop on Teaching Computational Science WTCS 2008
A. Tirado-Ramos (University of Amsterdam, Amsterdam, The Netherlands) and Q. Luo (Wuhan University of Science and Technology, Zhongnan, China)
[email protected], [email protected]
Abstract. The Second Workshop on Teaching Computational Science, within the International Conference on Computational Science, provides a platform for discussing innovations in teaching computational sciences at all levels and contexts of higher education. This editorial provides an introduction to the work presented during the sessions. Keywords: computational science, teaching, parallel computing, e-Learning, collaborative environments, higher education.
1 Introduction
Experience shows that students who have been trained in technology-based environments, such as computational science, tend to thrive in today's technology-driven societies. The interdisciplinary nature of computational science allows the integration of methods from computer science, mathematical modeling, and data visualization, among others, in order to create virtual laboratories for in-silico experimentation that just a few years ago would have proved costly and impractical for most academic institutions of higher learning. It is evident that the interaction of computational methods allows more intriguing questions to be posed by teachers and students in lower-cost experimental settings [1]. The field of higher education is therefore currently witnessing the rapid adoption of computational tools and methods by science teachers. A large majority of those teachers have steadily joined forces and shared experiences in the last few years on the use of high performance computing facilities in order to promote the benefits and importance of computational science instruction in science classrooms. The International Workshop on Teaching Computational Science (WTCS 2008), held in Krakow, Poland, in conjunction with the International Conference on Computational Science 2008 (ICCS 2008), offers a technical program consisting of presentations dealing with the state of the art in the field. The workshop includes presentations that describe innovations in the context of formal courses involving, for example, introductory programming, service courses and specialist undergraduate or postgraduate topics. During the workshop sessions, Gimenez et al. present their experiences with the use of metaheuristics in a parallel computing course, mapping problems in which processes are assigned to processors in a heterogeneous environment, with heterogeneity
in computation and in the network. Freitag et al. discuss how to introduce students to collaborative work in project-based learning courses on computer network applications. Hamada et al. describe their experiences with e-Learning, using active support tools to improve learning and evaluation. Hnatkowska et al. present an assessment approach to the software development process used within student team projects, based on the Process Assessment Model. Aracely et al. present their experiences teaching cryptography to engineering students, using Maple software in a graduate-level course. Iglesias et al. discuss teaching in the context of the European space of higher education, focusing on the problem of teaching computer graphics. Ramos-Quintana et al. elaborate on collaborative environments to encourage self-directed learning, focusing on their experiences with object-oriented programming and case-based reasoning. Shiflet et al. discuss their undergraduate computational science curriculum, which served as the basis for the first textbook designed specifically for an introductory course in the computational science and engineering curriculum. Finally, Gonzalez-Cinca et al. discuss innovative ways of teaching computational science in aeronautics, with a particular emphasis on Computational Fluid Dynamics, and the experiences derived from implementation. We feel that the breadth of topics presented at the workshop provides a glimpse of the current state of the art in the field, and a promising window onto the challenges and possibilities ahead. Acknowledgments. The workshop chairs would like to thank the scientific reviewing committee for all their work, as well as the ICCS chairs Dick van Albada, Marian Bubak, and Peter Sloot for their efforts.
Reference 1. Selwyn, N.: The use of computer technology in university teaching and learning: a critical perspective. Journal of Computer Assisted Learning 23(2), 83–94 (2007)
Using Metaheuristics in a Parallel Computing Course
Ángel-Luis Calvo, Ana Cortés, Domingo Giménez, and Carmela Pozuelo
Departamento de Informática y Sistemas, Universidad de Murcia, Spain
[email protected], [email protected], [email protected], [email protected]
This work has been funded in part by the Consejería de Educación de la Comunidad de Murcia, Fundación Séneca, project number 02973/PI/05.
Abstract. In this paper the use of metaheuristic techniques in a parallel computing course is explained. In the practicals of the course, different metaheuristics are used in the solution of a mapping problem in which processes are assigned to processors in a heterogeneous environment, with heterogeneity in computation and in the network. The parallelization of the metaheuristics is also considered.
1 Introduction
This paper presents a teaching experience in which metaheuristic and parallel computing studies are combined. A mapping problem is proposed to the students in the practicals of a course on “Algorithms and parallel programming” [1]. The problem consists of obtaining an optimum processes-to-processors mapping on a heterogeneous system. The simulated systems present heterogeneity both in the computational and network speeds, and the processes to map constitute a homogeneous set, which means a HoHe (Homogeneous processes in Heterogeneous system) model is represented [2]. The mapping problem is NP-hard [3]. Each student must propose the solution of the mapping problem with some metaheuristic. The paper is organized in the following way: Section 2 explains the course in which the experience has been carried out; Section 3 presents the mapping problem; in Section 4 the application of some of the metaheuristics is explained, including their parallelization; a test is given to the students to see how the teaching objectives have been fulfilled, and the results of the test are commented on in Section 5; finally, Section 6 summarizes the conclusions and outlines possible future studies.
2 Organization of the Course
The course is part of the fifth year of the studies in Computer Science, at the University of Murcia, in Spain. The students had studied Algorithms and Data
Structures, Computer Architecture, Concurrent Programming and Artificial Intelligence. The course is optional, so the students are high-level students who are interested in the subject. This, together with the fact that a reduced number of students (approximately fifteen per year) take the course, means that the teaching is personalized and focused on the work of the students. They do different studies and practicals: preparation of a presentation about some algorithmic technique, both sequential and parallel; solution and theoretical and experimental study of an algorithm to solve a challenging problem sequentially; and obtaining parallel versions (in shared memory with OpenMP and in message-passing with MPI) of the sequential algorithms. The course lasts one semester and it has sequential and parallel parts, which means that parallelism is studied in approximately two months. This reduced time, together with the difficulty of an initial approach to parallelism, means that the goal is to introduce the students to the problems and tools of parallelism; we do not expect them to be able to develop new algorithms and carry out detailed experiments, but they must study and program available algorithms, adjust them to the proposed problem, design significant experiments and draw valid conclusions. Thus, the topics of the course are: Introduction to complexity of problems, Tree traversal methods, Probabilistic algorithms, Metaheuristics, Matricial algorithms, Models of parallel programming, Analysis of parallel algorithms and Parallel algorithms. First, the difficulties of solving some problems in a reduced time are stated. Then, some approximate, heuristic or numerical sequential algorithms are studied, and finally, the basics of parallel programming are analysed. Each student will develop sequential and parallel algorithms for the solution of a challenging problem. The proposed problem is a mapping problem where a set of identical processes is assigned to processors in a heterogeneous system. So, the students tackle a challenging problem in the field of parallel programming, and they work with topics in two parts (sequential approximate methods and parallel computing) of the syllabus. The methods proposed to solve this problem are: Backtracking or Branch and Bound with pruning based on heuristics (possibly pruning nodes which would lead to the optimum solution), Backtracking with tree traversal guided by heuristics, Probabilistic algorithms, Hill climbing, Tabu search, Scatter search, Genetic algorithms, Ant colony, Simulated annealing and GRASP. There are many books on algorithms [5,6] and metaheuristics [7,8] which can be consulted by the students. Each student makes two presentations: one on the general ideas of the technique assigned, and the other on the parallelization with OpenMP and MPI of some algorithm which implements this technique. The presentations precede the practical work, so that the students can exchange ideas about some parts of the problem (the representation of solutions and nodes, the general scheme of the algorithms, schemes of metaheuristics, possible combinations of techniques, ...). The collaboration of the students is fostered. The experimental comparison of the different techniques developed by the students is positively valued in the final evaluation of the practical. Additionally, at least two individual tutorials with each student are organized, prior to each presentation.
3 The Assignation Problem
The problem proposed is a simplified version of a mapping problem in which the execution time of a parallel homogeneous algorithm (all the processes work with the same amount of data and have the same computational cost) is used to obtain the mapping in a heterogeneous system with which the lowest possible execution time is achieved. The method was proposed in [9]. It is explained (simplified) to the students after the study of the topics about problem complexity, probabilistic algorithms and metaheuristics, and the papers in which the method was introduced and applied, along with other related papers, are made available to the students. The method is summarized below. The execution time of a parallel algorithm is modelled as a function of some algorithmic and system parameters [10]:

t(s) = f(s, AP, SP)    (1)

where s represents the problem size. The system parameters (SP) represent the characteristics of the system, and can be the cost of an arithmetic operation, the start-up (t_s) and the word-sending (t_w) time of communications. The algorithmic parameters (AP) can be modified to obtain faster execution times. Some typical parameters in homogeneous systems are the number of processors to use from those available, or the number of rows and columns of processes. The execution time model considered has the form:

t(s, D) = t_c \, t_{comp}(s, D) + t_s \, t_{start}(s, D) + t_w \, t_{word}(s, D)    (2)

where D represents the number of processes used in the solution of the problem, t_c the cost of a basic arithmetic operation, t_{comp} the number of basic arithmetic operations, t_{start} the number of communications and t_{word} the number of data communicated. In a homogeneous system the values of t_c, t_s and t_w are the same in the different processors. In a heterogeneous system it is also necessary to select the number of processes to use and the number of processes assigned to each processor. These numbers are stored in d = (d_1, d_2, \ldots, d_P), with P being the number of processors. The costs of a basic arithmetic operation in each one of the processors in the system are stored in an array t_c with P components, where t_{c_i} is the cost in processor i. And the costs of t_s and t_w between each pair of processors are stored in two arrays t_s and t_w of sizes P \times P, where t_{s_{ij}} and t_{w_{ij}} are the start-up and word-sending times from processor i to processor j. The execution time model would be that of equation (2), but with the values of t_c, t_s and t_w obtained from the formulae:

t_c = \max_i \{ d_i \, t_{c_i} \}, \qquad t_s = \max_{d_i \neq 0,\, d_j \neq 0} \{ t_{s_{ij}} \}, \qquad t_w = \max_{d_i \neq 0,\, d_j \neq 0} \{ t_{w_{ij}} \}    (3)
In the model the cost of a basic operation in a processor is proportional to the number of processes in the processor, and no interferences are considered between processes in the same processor.
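As an illustration (assumed code, not part of the course materials), the effective parameters of Eq. (3) can be computed from an assignment d and the per-processor cost data as follows.

```python
# Sketch of the heterogeneous parameter adjustment of Eq. (3): given an assignment
# d = (d_1,...,d_P), take worst-case values over the processors that receive processes.

def effective_parameters(d, tc, ts, tw):
    """d[i]: processes on processor i; tc[i]: basic arithmetic cost on processor i;
    ts[i][j], tw[i][j]: start-up and word-sending times from processor i to j."""
    used = [i for i, di in enumerate(d) if di > 0]
    eff_tc = max(d[i] * tc[i] for i in used)            # tc = max_i { d_i * tc_i }
    eff_ts = max(ts[i][j] for i in used for j in used)  # worst start-up among used pairs
    eff_tw = max(tw[i][j] for i in used for j in used)  # worst word-sending time
    return eff_tc, eff_ts, eff_tw
```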
Obtaining an optimum mapping becomes a tree traversal problem if we consider the tree of all the possible mappings. Figure 1 shows one such tree, with P = 3. Each level represents the possible processors to which a process can be assigned. There is no limit to the height of the tree. Because the processes are all equal, the tree is combinatorial, and because more than one process can be assigned to a processor, it includes repetitions. The form of the logical tree, and the representation of the tree or the set to work with, must be decided by the student. Each node in the solutions tree could be represented in at least two forms. In a representation with a value for each level, the grey node in Figure 1 would be represented by (1, 2, 2, . . .). Because all the processes are equal, it is also possible to store the number of processes assigned to each processor. So, the grey node is represented by (1, 2, 0).
Fig. 1. Tree of the mappings of identical processes in a system with three processors
Sequential and parallel algorithms are developed and studied both theoretically and experimentally. The study would include the analysis of how the use of parallel computing contributes to reduce the execution time and/or the goodness of the solution. In order to have comparable results they must obtain experimental results with at least the functions:

t_c \frac{n^2}{5p} + t_s \frac{p(p-1)}{2} + t_w \frac{n(p-1)}{2}    (4)

which corresponds to a parallel dynamic programming scheme [9], and:

t_c \left( \frac{2n^3}{3p} + \frac{n^2}{\sqrt{p}} \right) + t_s \, 2n\sqrt{p} + t_w \frac{2n^2}{\sqrt{p}}    (5)

which corresponds to a parallel LU decomposition [11]. The experiments should be carried out with the values in the ranges: 1 < t_c < 5, 4 < t_w < 40 and 20 < t_s < 100. Small values of t_s and t_w would simulate the behaviour of shared memory multicomputers, medium values would correspond to distributed memory multicomputers, and large values to distributed systems.
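For concreteness, the two modelled times can be coded as simple functions of the problem size n, the number of processes p and the effective system parameters; the sketch below is an assumed illustration based on Eqs. (4) and (5), not the code used in the course.

```python
import math

# Illustrative sketch of the two execution-time models used in the practicals:
# Eq. (4) for the parallel dynamic programming scheme and Eq. (5) for the parallel
# LU decomposition (n: problem size, p: number of processes, tc/ts/tw: parameters).

def t_dynamic_programming(n, p, tc, ts, tw):
    return tc * n ** 2 / (5 * p) + ts * p * (p - 1) / 2 + tw * n * (p - 1) / 2

def t_lu(n, p, tc, ts, tw):
    sqrt_p = math.sqrt(p)
    return (tc * (2 * n ** 3 / (3 * p) + n ** 2 / sqrt_p)
            + ts * 2 * n * sqrt_p
            + tw * 2 * n ** 2 / sqrt_p)

# Example: a distributed-memory-like setting (medium ts and tw) with 16 processes.
print(t_dynamic_programming(n=1000, p=16, tc=2, ts=60, tw=20))
```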
4 Application of Metaheuristics to the Mapping Problem
In this section the results obtained with three of the methods are shown. Two of the methods are metaheuristic methods (genetic algorithms and tabu search) and the other is a backtracking with pruning based on heuristics. In the sequential algorithms the stress is put on the high-level algorithmic representation, which allows us to obtain different versions only by changing a routine in the scheme. For metaheuristic techniques a general scheme is studied [12]. One such scheme is shown in Algorithm 1.

Algorithm 1. General scheme of a metaheuristic method
  Initialize(S);
  while not EndCondition(S) do
    SS = ObtainSubset(S);
    if |SS| > 1 then
      SS1 = Combine(SS);
    else
      SS1 = SS;
    end
    SS2 = Improve(SS1);
    S = IncludeSolutions(SS2);
  end
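A direct Python rendering of this scheme is sketched below as an assumed illustration (it is not the course code); the routine names mirror Algorithm 1 and the concrete functions are placeholders supplied by each particular metaheuristic.

```python
# Generic sketch of Algorithm 1. The six routines are supplied by the concrete
# metaheuristic (e.g., a genetic algorithm or tabu search for the mapping problem).

def metaheuristic(initialize, end_condition, obtain_subset,
                  combine, improve, include_solutions):
    S = initialize()                       # Initialize(S)
    while not end_condition(S):            # while not EndCondition(S)
        SS = obtain_subset(S)              # SS = ObtainSubset(S)
        SS1 = combine(SS) if len(SS) > 1 else SS
        SS2 = improve(SS1)                 # SS2 = Improve(SS1)
        S = include_solutions(S, SS2)      # update the reference set S with SS2
    return S
```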
4.1 Backtracking with Node Pruning
Backtracking methods were used for this mapping problem in [9]. For the simulation of small systems, backtracking was satisfactory, but for large systems huge assignation times were necessary. So, the work of the student was:
– For the sequential method:
  • To understand the mapping problem and the increase of the assignation time when backtracking is used for large systems, which makes backtracking impractical in most cases.
  • To develop a backtracking scheme for the proposed mapping problem. The scheme should include a pruning routine which should be easy to substitute in order to experiment with different pruning techniques.
  • To identify possible techniques to eliminate nodes, which in some cases may discard nodes that would lead to the optimum mapping. The most representative techniques were:
    PT1 The tree is searched until a maximum level, and nodes are not pruned. This method is included as a reference which ensures the optimum mapping.
    PT2 At each step of the execution the lowest value (GLV) of the modelled execution time of the nodes generated is stored. To decide if a node is pruned, a "minimum value" (NMV) is associated to it.
    When NMV > GLV the node is pruned. In a node corresponding to p processes, NMV is obtained with a greedy method. From the execution time associated to the node, new values are obtained by substituting in the model p by p + 1, p + 2, ... while the value decreases. NMV is taken as the minimum of these values.
    PT3 NMV is calculated in a node by substituting in the formula the value of the number of processes for the maximum speed-up achievable. For example, in node (0,2,0), with a tree like that in Figure 1, the first processor will not participate in the computation, and with tc = (1, 2, 4), the relative speed-ups would be sr = (1, 0.5, 0.25), and the maximum achievable speed-up is 0.75.
    PT4 The same value as in the previous case is used for p in the computation part, and the communication part does not vary.
  • To carry out experiments to compare the results obtained with the different pruning techniques. Initially, experiments were carried out for small simulated systems (between 10 and 20 processors). The best results were obtained with PT3. This technique was used in successive experiments. The main conclusion was that for small systems backtracking with pruning can be used without a large execution time and obtaining a modelled time not far from the optimum. For big systems, the mapping time is too large to be applied in a real context. Parallelism could contribute to reduce the mapping time, thus making the technique applicable.
– Different schemes were considered to obtain parallel versions, and finally a master-slave scheme was used:
  • An OpenMP version is obtained in the following way: the master generates nodes until a certain level; slaves are generated and all the threads do backtracking from the nodes assigned cyclically to them.
  • The MPI version works in the same way, but in this case the master processor sends nodes to the slave processors and these send back the results to the master.
  • The sequential and parallel versions are compared. There is no important variation in the modelled time. The speed-up achieved is far from the optimum, and this is because independent backtrackings are carried out, which means fewer nodes are pruned with the parallel programs.
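The following sketch (an assumed illustration rather than the students' code) shows the shape of such a backtracking scheme over the counts d = (d_1, ..., d_P); the pruning routine is a pluggable parameter so that variants such as PT1-PT4 can be swapped in, and modelled_time stands for any execution-time model such as Eq. (4) evaluated with the effective parameters of Eq. (3).

```python
# Sketch of backtracking over the combinatorial tree of Fig. 1 with a pluggable
# pruning routine. A node is the tuple of processes per processor; prune(d, t, glv)
# returns True when the node's estimated minimum value (NMV) exceeds the best
# modelled time found so far (GLV).

def backtracking(P, max_processes, modelled_time, prune):
    best = {"d": None, "t": float("inf")}        # GLV and the mapping that achieves it

    def explore(d, start):
        if sum(d) > 0:
            t = modelled_time(d)
            if t < best["t"]:
                best["d"], best["t"] = tuple(d), t
            if prune(d, t, best["t"]):
                return
        if sum(d) >= max_processes:              # bound the depth (the tree has no height limit)
            return
        for i in range(start, P):                # only i >= start: avoids repeated mappings
            d[i] += 1
            explore(d, i)
            d[i] -= 1

    explore([0] * P, 0)
    return best["d"], best["t"]
```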
4.2 Genetic Algorithm
Genetic algorithms are possibly the most popular metaheuristic techniques. The students saw this technique in a previous course on Artificial Intelligence. The work of the student was:
– For the sequential method:
  • To understand the mapping problem and to identify population and individual representations to apply genetic algorithms to the problem, to identify the possible forms of the routines in Algorithm 1 for the genetic scheme, and to develop a genetic scheme at a high level. The scheme must allow easy changing of some parameters or routines.
  • To experimentally tune the values of the parameters and the routines to the mapping problem. The main conclusion was that to obtain a reduced assignation time it is necessary to reduce the number of individuals and the number of iterations, but on the other hand this reduction would produce a reduction in the goodness of the solution. Satisfactory results (both for the assignation time and the goodness of the solution) are obtained from experiments with 10 individuals and with convergence after 10 iterations without improvement. In any case, genetic algorithms do not seem to be the most adequate metaheuristic for this problem.
– About the parallel versions:
  • The OpenMP program works by simply parallelizing the combination of the population.
  • From different parallel genetic schemes [13], the island scheme was selected for the message-passing version. The number of generations to exchange information between the islands is one parameter to be tuned.
  • The sequential and parallel versions are compared. With OpenMP the same mappings are found, but with an important reduction in the assignation time. In MPI this time is not reduced substantially, but better mappings are normally obtained.
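To make the genetic scheme concrete, the following is a compact, assumed sketch (not the student's program): an individual is the distribution d = (d_1, ..., d_P), the fitness to be minimized is the modelled execution time, and the population size and convergence criterion follow the values reported above.

```python
import random

# Illustrative genetic algorithm for the mapping problem (assumes P >= 2). An
# individual is the distribution d = (d_1,...,d_P); modelled_time(d) is the fitness.

def genetic_mapping(P, max_processes, modelled_time,
                    pop_size=10, patience=10, mutation_rate=0.2):
    def random_individual():
        d = [0] * P
        for _ in range(random.randint(1, max_processes)):
            d[random.randrange(P)] += 1
        return tuple(d)

    def crossover(a, b):                         # one-point crossover on the distributions
        cut = random.randrange(1, P)
        child = a[:cut] + b[cut:]
        return child if sum(child) > 0 else a    # avoid the empty assignment

    def mutate(d):                               # move one process to another processor
        d = list(d)
        i = random.randrange(P)
        if d[i] > 0 and random.random() < mutation_rate:
            d[i] -= 1
            d[random.randrange(P)] += 1
        return tuple(d)

    population = [random_individual() for _ in range(pop_size)]
    best, stagnant = min(population, key=modelled_time), 0
    while stagnant < patience:                   # convergence: no improvement for `patience` iterations
        parents = sorted(population, key=modelled_time)[:pop_size // 2]
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size)]
        population = sorted(parents + children, key=modelled_time)[:pop_size]
        if modelled_time(population[0]) < modelled_time(best):
            best, stagnant = population[0], 0
        else:
            stagnant += 1
    return best, modelled_time(best)
```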
4.3 Tabu Search
Tabu search is a local search technique which uses memory structures to guide the search. The students saw this technique in a previous course on Artificial Intelligence. The work of the student was:
– For the sequential method:
  • To understand the mapping problem and to identify set and element representations to apply tabu search to the problem, to identify what the routines in Algorithm 1 would be for tabu search, and to develop a tabu search at a high level. The scheme must allow easy change of some parameters or routines.
  • To experimentally tune the values of the parameters and the routines to the mapping problem. Satisfactory results were obtained when: the number of iterations a movement is tabu is equal to half the number of simulated processors (P); the initial solution is obtained by assigning P processes to the fastest processors; the number of iterations to begin the diversification phase is three quarters of the maximum number of iterations.
– About the parallel versions:
  • The OpenMP program works by selecting a number of nodes to explore at each step equal to the number of available processors.
  • For the MPI version, different tabu techniques have been studied [14]. A pC/RS/MPDS technique was used: each process controls its own search; knowledge is not shared by the processes; multiple initial solutions and different search strategies are used. To diversify the search,
    some processes start with heuristic solutions and others with random solutions, and the number of iterations a movement is tabu depends on the process number.
  • The sequential and parallel versions are compared. In OpenMP the speed-up is satisfactory. In MPI the time is not reduced substantially, and only a small improvement in the mappings is obtained. A small reduction in the execution time can be achieved by reducing the number of iterations.
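A corresponding assumed sketch of the tabu search is given below (again, not the student's code): the tabu tenure is half the number of simulated processors, the neighbourhood moves one process between processors, and the search starts from a simple heuristic solution; the diversification phase and moves that change the total number of processes are omitted for brevity.

```python
from collections import deque

# Illustrative tabu search for the mapping problem. The starting point and the
# move-based neighbourhood are simplifications; the tabu tenure of P/2 iterations
# follows the value reported in the text.

def tabu_mapping(P, modelled_time, max_iters=200):
    current = tuple([1] * P)                      # simple heuristic start: one process per processor
    best, best_t = current, modelled_time(current)
    tabu = deque(maxlen=max(1, P // 2))           # recently forbidden moves (src, dst)

    for _ in range(max_iters):
        candidates = []
        for src in range(P):
            if current[src] == 0:
                continue
            for dst in range(P):
                if dst != src:                    # move one process from src to dst
                    d = list(current)
                    d[src] -= 1
                    d[dst] += 1
                    candidates.append(((src, dst), tuple(d)))
        # keep non-tabu moves, plus tabu moves that improve the best solution (aspiration)
        allowed = [(m, d) for m, d in candidates
                   if m not in tabu or modelled_time(d) < best_t]
        if not allowed:
            break
        move, current = min(allowed, key=lambda md: modelled_time(md[1]))
        tabu.append((move[1], move[0]))           # forbid undoing this move for a while
        t = modelled_time(current)
        if t < best_t:
            best, best_t = current, t
    return best, best_t
```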
5 Evaluating Teaching
In order to evaluate if the teaching objectives have been fulfilled, a test has been prepared. The test has seven statements. In some of them a high value is positive, but in others the answer is more positive when the value is lower. Thus, the test must be completed by reading it carefully. Each item is valued from 1 to 5, with 1 meaning total disagreement and 5 total agreement. Two questions are analyzed in the test: is the combination of the study of sequential approximation methods and of parallel programming appropriate? Is it interesting to use a problem to guide the teaching? The statements in the test are:
1. It is suitable to combine the study of sequential algorithms with the study of parallel computing because in that way two methods to solve high-cost problems are studied together.
2. The combination in a course of the study of sequential methods with parallel programming means the course is overloaded, and it would be preferable to have two different courses to study the two subjects.
3. The use of a mapping problem in parallel computing and tackling the problem with heuristic methods is useful to clarify ideas about some methods previously studied in other courses.
4. The mapping problem has made the study of parallel programming difficult, because the problem deals with heterogeneous systems and the parallel programming practicals have been done in homogeneous systems.
5. The proposal of the mapping problem at the beginning of the course has been useful because it has motivated and guided the study of parallel programming.
6. The combination of the study of sequential methods and parallel programming has made it more difficult to follow the course because it has not been possible to study a theme in depth before beginning the next one.
7. The use of a problem to guide the teaching is appropriate because it motivates the study of each one of the themes.
Table 1 shows the positive or negative rating of each item in the test in relation to the two questions. The number of students in the course is low. So, no significant conclusions can be drawn from the answers, but some indicators can be observed.
Table 1. Positive or negative rating of each item in the test in relation to the two questions

item                1  2  3  4  5  6  7
Heuristic-Parallel  +  -  +  -     -
Problem-Guided            +  -  +     +
Table 2 summarizes the answers. The mean value for each item is shown. Also the means of the questions about the joint study of approximation methods and parallel programming (H-P) and about the use of a problem to guide the course (P-G) are shown. To obtain the mean, in the negative items negative and positive values are interchanged (1 for 5 and 2 for 4). The conclusion is that the two teaching objectives have been successfully fulfilled.

Table 2. Mean answer for each item and question in the test

item  1    2    3    4    5    6    7    H-P  P-G
mean  3.57 3.14 4.43 2.71 3.57 3.14 4.00 3.40 3.82
6 Conclusions and Possible Future Studies
The paper presents a teaching experience using metaheuristics in combination with parallel computing in a course on “Algorithms and Parallel Programming”. With this combination the students work at the same time with two of the topics of the course, the importance of approximate methods and heuristics is better understood when working with a challenging problem like the one proposed, and the difficulty and importance of the mapping problem is better understood when working on the problem with a metaheuristic approach. Furthermore, parallel programming is introduced using the same metaheuristics with which the mapping problem is tackled, and parallelism at different levels and in shared memory and message-passing is considered. In addition, because all the students work with the same mapping problem, but each student works with a different mapping algorithm, collaboration between students and mutual enrichment are fostered. The success of the course organization has been evaluated through a test for the students. The preliminary experience seems to be very positive, and so it will be continued in successive courses. At the moment other mapping problems in the field of parallel computing are being considered.
References
1. Giménez, D.: Web page of the Algorithms and Parallel Programming course at the University of Murcia, http://dis.um.es/~domingo/app.html
2. Kalinov, A., Lastovetsky, A.: Heterogeneous distribution of computations while solving linear algebra problems on network of heterogeneous computers. Journal of Parallel and Distributed Computing 61(44), 520–535 (2001)
3. Lennerstad, H., Lundberg, L.: Optimal scheduling results for parallel computing. In: SIAM News, pp. 16–18. SIAM, Philadelphia (1994)
4. Brucher, P.: Scheduling Algorithms, 5th edn. Springer, Heidelberg (2007)
5. Brassard, G., Bratley, P.: Fundamentals of Algorithms. Prentice-Hall, Englewood Cliffs (1996)
6. Cormen, T.H., Leiserson, C.E., Rivest, R.L.: Introduction to Algorithms. MIT Press, Cambridge (1990)
7. Dréo, J., Pétrowski, A., Siarry, P., Taillard, E.: Metaheuristics for Hard Optimization. Springer, Heidelberg (2005)
8. Hromkovič, J.: Algorithmics for Hard Problems, 2nd edn. Springer, Heidelberg (2003)
9. Cuenca, J., Giménez, D., Martínez-Gallar, J.P.: Heuristics for work distribution of a homogeneous parallel dynamic programming scheme on heterogeneous systems. Parallel Computing 31, 717–735 (2005)
10. Cuenca, J., Giménez, D., González, J.: Architecture of an automatic tuned linear algebra library. Parallel Computing 30(2), 187–220 (2004)
11. Cuenca, J., García, L.P., Giménez, D., Dongarra, J.: Processes distribution of homogeneous parallel linear algebra routines on heterogeneous clusters. In: Proc. IEEE Int. Conf. on Cluster Computing. IEEE Computer Society Press, Los Alamitos (2005)
12. Raidl, G.R.: A unified view on hybrid metaheuristics. In: Almeida, F., Blesa Aguilera, M.J., Blum, C., Moreno Vega, J.M., Pérez Pérez, M., Roli, A., Sampels, M. (eds.) HM 2006. LNCS, vol. 4030, pp. 1–12. Springer, Heidelberg (2006)
13. Luque, G., Alba, E., Dorronsoro, B.: Parallel genetic algorithms. In: Alba, E. (ed.) Parallel Metaheuristics (2005)
14. Crainic, T.G., Gendreau, M., Potvin, J.Y.: Parallel tabu search. In: Alba, E. (ed.) Parallel Metaheuristics (2005)
Improving the Introduction to a Collaborative Project-Based Course on Computer Network Applications
Felix Freitag, Leandro Navarro (Computer Architecture Department, Polytechnic University of Catalonia, Spain), and Joan Manuel Marquès (Open University of Catalonia, Spain)
{felix,leandro}@ac.upc.edu, [email protected]
Abstract. In engineering studies, there is a shift to new teaching methodologies with a focus on student involvement, like project-based learning. Project-based learning courses, however, often rely on a previous course where the technical background to be used in the projects is taught, thus requiring two terms for one subject area. In this paper, we consider a project-based course on computer network applications, which has been designed to cover both the technical and non-technical content in only one term. In the three years we have been teaching this course, our observation based on questionnaires is that the organization of the course allows the course objectives to be successfully reached. We feel, however, that we do not fully exploit the learning potential the course could have in the first few weeks of the course. We describe how the course is organized and the problems we have identified. We propose a project demonstration tool and describe how our solution improves towards our goals. With the proposed tool, the students should obtain a clear vision of the projects already in the very first days of the course, allowing them to take full advantage of the opportunities which the course offers.
Keywords: Project-based learning, Course content, Tools for teaching.
1 Introduction
In 2003, the computer science curriculum of the Computer Science Faculty of Barcelona (FIB) of the Technical University of Catalonia (UPC) in Barcelona was revised. This revision stated that students of the computer science studies should acquire more non-technical competences (like working in groups, the capacity to manage projects, the oral and written presentation of work, the capacity to learn a new technical context independently and to be able to solve new problems). In order to acquire these new non-technical competences, a number of new courses based on projects were added to the computer science curriculum. In this paper, we consider one such project-based learning course, the course “Project on Computer Network Applications” [1] at the Computer Science Faculty of Barcelona of the Technical University of Catalonia. Courses on Computer Network Applications have recently gained substantial importance in many computer science curricula, being now often part of the core subjects.
We explain how the course “Project on Computer Network Applications” is organized in order to highlight the improvement we have identified. After three years of teaching experience in this course, our observation based on questionnaires is that both the technical and non-technical objectives of the course are successfully achieved. However, from the students' feedback obtained, we feel that we do not fully exploit the learning potential the course could offer to the students in the first weeks. We describe the potential for enhancement we have identified and explain our solution. Our solution should provide the students with a clear vision of the project in the course already in the first days, which allows them to take better advantage of the opportunities offered by the course. The rest of this paper is organized as follows. In Sec. 2 we explain how the course is organized and identify the problem. In Sec. 3 we evaluate possible solutions. Section 4 explains the design and implementation of our solution, and discusses the expected improvement. Finally, Sec. 5 presents our conclusions.
2 Problem Identification

2.1 Objectives of the Course
The course “Project on Computer Network Applications” in the Computer Science Curriculum at the Computer Science Faculty of Barcelona of the Technical University of Catalonia has the following objectives, ordered into technical and non-technical objectives [2]:
Technical objectives
• Choose the appropriate protocol and format for a certain application.
• Design and configure application components and services.
• Define and extend elements of an application to provide services taking into account interoperation, performance, scalability and security.
• Install and deploy applications necessary for a certain organization.
Non-technical objectives
• Collaborative work [3], oral and written communication, work planning, capability to find information, being able to evaluate alternatives, being able to defend a project.
Furthermore, the course is expected to have 5 ECTS credits [4], which corresponds to a total student dedication of 125–150 h.

2.2 Design of the Course
The technical context in which the project of this course is to be carried out, computer network applications, is new to the students, since the previous courses on computer networks teach up to the network level but do not address the application level. Within the technical context of computer network applications, the development of a complete project in a group is the main target of our course.
Taking into account this situation, the course is organized to introduce application layer technologies in the first part of the course, providing the basis by means of small applications developed in laboratory sessions. While these fundamental technologies are becoming known to the students, project groups are formed which choose a project idea and develop a project proposal. The project proposal is prepared in parallel to the laboratory sessions and finishes when the last laboratory sessions also finish. The proposed project is then developed in the group in the second part of the course. Fig. 1 shows the organization of the course. The first six weeks of the course, referring to part I, include the laboratory sessions, a few sessions on theory, and the project proposal. Weeks seven to fifteen, referring to part II, are dedicated to the project development.
[Figure: timeline of the course. Part 1 (weeks 1-6): laboratory sessions P1 Web Server, P2 Servlets, P3 RMI, P4 XML, P5 Web Services, P6 Security; theory sessions on XML, caches/CDN, security and Web services; project proposal and proposal presentation. Part 2 (weeks 7-15): project development, with a review presentation, a final project presentation, and additional activities on making project presentations, reading technical papers and writing technical documentation.]
Fig. 1. Schema of the organization of the course
In order to better explain the potential for improvement which we have identified, we describe both parts of the course in more detail in the following.
Course organization in the first 6 weeks
The first six weeks of the course focus on the technical foundations of application layer networks. In the first week, groups of two students are formed for the laboratory sessions. Six laboratory sessions are carried out, in which different application layer technologies are explored by means of simple implementations. At week three, groups of four students are formed as project groups. A project (different for each group but within the technical context of the course) is proposed by the group at week five by means of a written proposal and approved by the lecturer. This proposal includes,
among other points, the division of the project tasks into work packages and their assignment to members of the group. A Gantt chart is made to define the temporal distribution of the tasks.
Course organization in weeks 7 to 15
The second part of the course (part II in weeks 7-15) is dedicated to carrying out the proposed projects in groups. During this time the projects are developed. In order to achieve the non-technical objectives of the course (see Sec. 2.1), a number of project presentations are scheduled, periodic review meetings with the lecturer take place, project planning and coordination is done, decisions are taken, a demonstration of the project is run, a project presentation to the students is done, and a written project report is elaborated. Optionally, students can carry out additional activities concerning how to make the presentation of the project, and how to read and write technical documentation.

2.3 Problem Statement
In order to evaluate the course, we have been carrying out a questionnaire via the Web with the students at the end of each term, which provides us with feedback from the students' point of view. In order to get the point of view of the lecturers who teach the course, meetings between the lecturers have taken place. The information gathered has led to the identification of the following three main problems:
1. Insufficient knowledge of the technologies at the beginning of the course: Unlike other project-based learning courses, our course does not have a previous course in the computer science curriculum which provides the technical background on computer network applications. In the project proposal, due at week 5, this lack of sufficiently mature knowledge makes it difficult to estimate the technologies' impact and usage correctly.
2. In the first few weeks, a lack of a clear vision of what the project in this course is: The project proposal is made at week 5 without having seen other similar projects. The project proposal has to be made without being supported by this knowledge.
3. In the first few weeks, the expected results of the project are still not clear: At the time of writing the project proposal, students face the lack of illustrative examples of finished projects, which could allow them to assess better what their project should achieve. Currently, students make the project proposal at week 5 without having this information fully clear.
3 Our Solution
The problems identified indicate the need for more information in the first few weeks of the course concerning what a project is in the context of this course. It is clear, however, that a solution to this problem cannot be sought by increasing the number of hours of the course, since the student dedication to the course is already fixed and given by the ECTS credits.
In order to provide the project knowledge without increasing the course hours, two alternative solutions have been identified:
Alternative 1. Organize the laboratories in part I of the course as a project.
Alternative 2. Develop a project demonstration tool available at the beginning of the course.
Alternative 1, organizing the laboratories in the first part of the course as a project, could be reasonable in the sense that many project proposals include these technologies to implement different components. A particular context, as in a project, could be sought and used in the laboratories, where in each of the six laboratory sessions a component of the project is then implemented. The result would be that at the end of part I (when the laboratory sessions finish at week 6), these sessions have been carried out within the context of a project. This way, all students would have made a project before developing their own project from week 7 to 15.
Alternative 2, providing a project demonstration tool to the students, is based on the idea that information about projects would be made available by such a tool from the very first day of the course. This tool could support and provide more details about the information on projects given by the lecturer in the first sessions of the course.
Between these alternatives, our choice has been alternative 2. The main reason for deciding to develop a project demonstration tool is the early availability of the project information to the students. Already in the first days of the course, illustrative project information can be accessed. In contrast, alternative 1 would provide a more practical project experience; however, the full experience would only be available after finishing the laboratories in week 6, while the project proposal has to be written by the students already in weeks 3 to 5.
4 Design and Implementation

4.1 User Interface
The project demonstration tool was decided to be a web application. The user accesses the application server via a web browser, and the server provides different types of project information by means of the GUI. Referring to the content of the user interface, the problems identified in Sec. 2.3 need to be addressed. This means that the project demonstration tool needs to:
1. provide information about technologies for computer network applications,
2. provide a clear vision about what a project in this course is,
3. allow getting a clear picture of the results to be achieved by a project.
Fig. 2 shows a schema of the organization of the main page of the project demonstration tool. The highlighted frames illustrate the access provided to the three main information needs in the main page. In the frames, links to other pages addressing the particular issue are given.
Fig. 2. Organization of main page of the project demonstration tool (frames: generic course information, technologies, project demonstrations, case study)
The technologies field gives access to definitions and use cases of technologies usually applied in the projects developed in this course. Depending on the particular technology, information on software packages and comparisons with related technologies is also given. Currently, information is provided about the following technologies: Bluetooth, Peer-to-Peer, J2ME for mobile clients, Ajax, Streaming, the Spring Framework, Apache Struts, and Hibernate. The project demonstrations field gives access to a number of project demonstrations in the form of results of applications developed within real projects. These demonstrations are given as videos showing the functioning of the different components of a project, which new students of the course can download. Additionally, code developed within projects can be downloaded and installed, which allows the students to personally execute the applications. In the case study field, a link to a detailed project execution example is provided. The case study shows, by means of a concrete project example, how a project is carried out along the whole course: how a project was envisioned as first ideas and how these led to a project proposal. Access to the documents generated during the project execution shows the progress during the project development and the final result obtained. Seeing the whole project development in detail allows the students to better perceive what a project is in the context of this course. 4.2 Architecture and Implementation It was decided to provide the project demonstration tool as a web application. Among the different alternatives for implementing such an application, such as building on a Content Management System or taking advantage of Wiki
platforms, the requirement of easy maintenance made it necessary to use well-known technologies. Therefore, Apache Tomcat was chosen as the web server, the presentation of the application is generated by JSPs, servlets process the user requests, and MySQL is used as the database. Fig. 3 shows the architecture of the application. The users are the students who work with the application and the administrator. A PC hosts the application server (Apache Tomcat) and the database (MySQL). By design, an external FTP server can be used to store the videos of the project demos; it is implemented by means of the ProFTPd tool.
Fig. 3. Architecture (the users and the administrator access the application server over HTTP; the application server uses the database and an external FTP server, which stores the project demo videos)
4.3 Prototype and Evaluation The work plan for the development of the project demonstration tool is divided into two phases. The first phase started in September 2007. At the time of this writing, the user interfaces have been defined and the application has been designed. Three projects
have been selected as demonstrators of finished projects, and the technologies they use have been described. The implementation of the first prototype has been completed, and the prototype was deployed in January 2008 on a PC with a publicly available IP address. The second phase started in February 2008, when the new course started. The deployed prototype has been available to the new students since the beginning of this course. This second phase focuses first on gathering feedback from the students and then on a final version of the application which incorporates the suggested improvements. In order to evaluate the usefulness of the application, the following methods are pursued:
1. Interviews: informal conversations between the lecturers and the students during the project definition phase in the first weeks of the course will provide data on the intensity of use of the application and the usefulness perceived by the students.
2. Enquiry: during the course (after the project definition phase), an anonymous web-based enquiry will be launched to formally capture the students' feedback on different aspects of the application.
Concerning the first method, the students who have already used the application indicated its usefulness and suggested adding more project demos. On the other hand, we observed that not all students accessed the application in the first days of the course, which might be related to the fact that the project proposal is still due some six weeks ahead. 4.4 Discussion on Impact The success of the project demonstration tool should be measured by the following impacts:
1. By providing definitions and use cases of the technologies to be used in the projects, it gives the students an additional source of information that allows them to make technically sound project proposals already in the first weeks of the course. It improves on the current situation, in which the technologies are shown relatively late and mainly in the context of the laboratories of the course.
2. The demonstration of finished projects allows the students to make a better estimation of the project goals and the related effort, which will lead to more accurate project proposals.
3. A detailed example of a project execution helps the students both to better elaborate the management part of the project proposal and to actually carry out this management during the project execution.
4.5 Vision on Future Deployment Considering the need to store a growing number of video demos, as suggested by the students, we will need to dynamically add storage capacity to that currently available in the PC which hosts the FTP server. We therefore consider
adapting the hosting of the videos to the Grid storage infrastructure (VOFS), which is currently being developed within the Grid4All project [5]. This way, the FTP server will be able to dynamically increase its storage resources if needed. It could therefore adapt to a sudden increase in video demos, which might happen at the end of the course when a large number of student projects finish. Another aspect of integrating part of the application in Grid4All that is interesting from an instructor's point of view is that the application itself could become a demonstrator of a Grid-enabled application.
5 Conclusions Using a project-based computer network applications course as an example, the paper showed that such courses might not fully exploit the learning potential of the first few weeks, because students are not provided with sufficient practical and illustrative examples of projects. Revising the organization of this course, the benefits of providing such project information at the beginning of the course were identified. A project demonstration tool was chosen as a solution to support the lecturer in teaching project-related issues at the beginning of the course, and it was described how this tool could address the current lack of practical information. The information provided will allow the students to acquire more knowledge in the first weeks of the course and thus to better exploit the course's opportunities. The course presented, project-based computer network applications, succeeds in teaching both the technical and the non-technical content in one term. This experience could be interesting because other project-based courses rely on the technical background provided by a previous course. Our experience could motivate reflection on whether other project-based courses could also be enabled to cover an area in one term of the curriculum instead of two. Acknowledgements. This work has been partially financed by the Grid4All European project under contract FP6-IST-034567, the Spanish Ministry of Education and Science under contract TIN2007-68050-C03-01, FEDER funds, and the Technical University of Catalonia.
References
1. Course Project on Computer Networks at the Computer Science Faculty at UPC Barcelona, http://www.fib.upc.edu/en/infoAca/estudis/assignatures/PXC.html
2. Criteria for Accrediting Computing Programs. Accreditation Board for Engineering and Technology (ABET), http://www.abet.org
3. Johnson, D.W., Johnson, R.T., Smith, K.A.: Active Learning: Cooperation in the College Classroom (1998)
4. European Credit Transfer System, http://europa.eu.int/comm/education/socrates/ects.html
5. Grid4all project: http://grid4all.elibel.tm.fr/
Supporting Materials for Active e-Learning in Computational Models Mohamed Hamada Languages Processing Lab The University of Aizu, Aizuwakamatsu, Fukushima, Japan [email protected]
Abstract. In traditional lecture-driven learning, material to be learned is often transmitted to students by teachers. That is, learning is passive. In active learning, students are much more actively engaged in their own learning while educators take a more guiding role. This approach is thought to promote processing of skills and knowledge to a much deeper level than passive learning. In this paper, research using supporting materials for active e-learning in computational models and related fields is presented. The contributions of this paper are active tools that support and improve learning and an evaluation of their use in context.
1 Introduction Active and collaborative learning provide a powerful mechanism to enhance depth of learning, increase material retention, and get students involved with the material instead of passively listening to a lecture. Active learning is learning in which students are involved in the learning process as active partners: they are “doing”, “observing” and “communicating” instead of just “listening” as in the traditional (lecture-driven) learning style. Learning science research indicates that engineering students tend to have active and sensing learning preferences, and engineering-related educators are recognizing the need for more active and collaborative learning pedagogy [22]. So far several learning models have been developed (e.g. [4, 9, 13, 17]) for the realization of the learning preferences of science and engineering learners. Among these models, Felder-Silverman [4] is simpler and easier to implement through a web-based quiz system, as in Felder-Soloman [21]. The model classifies engineering learners into four axes: active vs. reflective, sensing vs. intuitive, visual vs. verbal, and sequential vs. global. Active learners gain information through a learning-by-doing style, while reflective learners gain information by thinking about it. Sensing learners tend to learn facts through their senses, while intuitive learners prefer discovering possibilities and relationships. Visual learners prefer images, diagrams, tables, movies, and demos, while verbal learners prefer written and spoken words. Sequential learners gain understanding from details and logical sequential steps, while global learners tend to learn a whole concept in large jumps. In Rosati [20] a study of this model was carried out to classify the learning style axes of engineering learners. The study showed that engineering learners tend to have strong active, sensing, visual, and sequential learning preferences.
The concepts of computational models have important uses in designing and analyzing several hardware and software applications. These concepts are abstract in nature and hence used to be taught in a traditional lecture-driven style, which is suitable for learners with reflective preferences. Since computer engineering learners tend to have strong active preferences, a lecture-driven teaching style is less motivating for them. In this paper, research using supporting materials for active e-learning in computational models and related fields is presented. The contributions of this paper are active tools that support and improve learning and an evaluation of their use in context. As the first contribution, we introduce an integrated environment that is designed to meet the active learning preferences of computer engineering learners. For the second contribution, several classroom experiments are carried out. The analysis of the experiments’ outcomes and the students’ feedback shows that our integrated environment is useful as a learning tool, in addition to enhancing learners’ motivation to seek more knowledge and information on their own. Our active materials can be used as a supporting tool for active (e-)learning not only in the computational models subject, but also in several other courses such as automata and formal languages, theory of computation, discrete mathematics, principles of programming languages, compiler design and other related courses. Such courses cover a variety of topics including finite state machines (automata), pushdown automata, and Turing machines, in addition to grammars and languages. We cover such topics in our active tools. The tools are written using the Java2D technology of Sun Microsystems [10]. This implies that our tools are portable, machine independent and web-enabled, which makes them useful as an interactive and online learning environment. The tools integrate several different materials to support the learners’ preferred styles. They include a movie-like welcome component, an animated hypertext introduction to the basic concepts, a finite state machine simulator with several operations, a set of visual examples for learners’ motivation, a Turing machine simulator, and an interactive set of exercises for self-assessment. To show the effectiveness of our tools as an interactive online collaborative learning tool, several classroom experiments were carried out. The preliminary results of these experiments showed that using our environment not only improved the learners’ performance but also improved their motivation to actively participate in the learning process of the related subjects and to seek more knowledge on their own. The paper is organized as follows. Following the introduction, section two introduces our active tools, including finite state machines, visual examples, and Turing machines. The performance evaluation of the environment is presented in section three. Finally, we conclude the paper and discuss results, related work, and possible future extensions in section four.
2 Active Materials Our active materials contain eight components which have been integrated into a single unit to make all topics easily accessible for learners. The components include the following: a movie-like welcome component, a hypertext introduction to the computational models topics, a finite state machine (FSM) simulator, a Turing machine
(TM) simulator, a set of self-assessment exercises, and three other components showing the visual examples of finite state machines. The welcome and introduction components use plain and animated text, which is suitable for learners with sequential learning preferences. The simulator and visual example components are best suited for learners with active and sensing learning preferences, which most computer engineering learners have. In the rest of this section, we describe these components. 2.1 FSM Simulator The finite state machine simulator is integrated as a basic component of the environment. It allows learners to draw an automaton visually and apply several operations to it. The possible operations include: NFA to DFA transformation, λ-NFA to NFA transformation, DFA to regular expression, and regular expression to λ-NFA. In addition to these transformations, learners can minimize the given automaton, check the acceptance/rejection of an input to the automaton, zoom in and out, and auto-layout the automaton. The simulator interface is shown in Fig. 1.
Fig. 1. The FSM simulator interface
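As a minimal illustration of the simplest of these operations, the sketch below checks whether a DFA accepts an input string. It is written for these notes and is not the simulator's source code; the transition-table representation is an assumption.

// Minimal sketch (not the simulator's code) of DFA acceptance checking.
import java.util.Map;
import java.util.Set;

public class DfaAcceptance {

    // delta.get(state).get(symbol) gives the next state; a missing entry means "reject".
    static boolean accepts(Map<String, Map<Character, String>> delta,
                           String start, Set<String> accepting, String input) {
        String state = start;
        for (char c : input.toCharArray()) {
            Map<Character, String> row = delta.get(state);
            if (row == null || !row.containsKey(c)) return false;  // no transition defined
            state = row.get(c);
        }
        return accepting.contains(state);
    }

    public static void main(String[] args) {
        // Example DFA over {0,1} accepting strings with an even number of 1s.
        Map<String, Map<Character, String>> delta = Map.of(
            "even", Map.of('0', "even", '1', "odd"),
            "odd",  Map.of('0', "odd",  '1', "even"));
        System.out.println(accepts(delta, "even", Set.of("even"), "1011")); // false
        System.out.println(accepts(delta, "even", Set.of("even"), "11"));   // true
    }
}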
2.2 TM Simulator The Turing machine simulator is integrated into the environment as well. This simulator is based on the work of [11]. Learners can write their machine in the input window, and then write the input of the machine on the (infinite) tape. After that, they can start to operate the machine on the input and observe how it works. For example, to add two positive integers m and n, the function add(m, n) = m+n, is represented by the Turing machine rules shown in Fig. 2(b). A rule in the form a b c > means that if the current state is a and the current input tape symbol is b, then the controller changes the current state to c and moves one step to the right (right is represented by > and left by <). A rule in the form a b c d means that if the current state is a and the current
input tape symbol is b, then the controller changes the current state to c and the current input tape symbol to d. If the learner wants to add, for example, 2 and 3, i.e. compute the function add(2,3)=2+3, then he/she must write 2 as 11 and 3 as 111 separated by 0 on the input tape, which means the input string will be 110111. Running the machine on this input by clicking the run button will result in the machine halting with the output string 11111 written on the tape, which means 5. Learners can see the machine running on the input symbol in a step-by-step manner which can help the learner to see how the Turing machine acts on input and how it can compute functions. All operations of the Turing machines can be simulated by this simulator. The component interface showing the Turing machine simulator is shown in Fig. 2(a). While the Turing machine operates on its input, a number of short comments also appear on the editor to give the learners more information about the theory of Turing machines. The Turing machine simulator also includes sound effects to make learning more fun and interesting to learners.
0 1 0 >
0 0 1 1
1 1 1 >
1 - 2 <
2 1 3 -
3 - 4 <
4 1 4 <
4 - 5 >
Fig. 2. (a) The TM simulator interface (b) TM example
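The rule format described above can be exercised with a small stand-alone sketch. It is not the course simulator; it assumes the unary-addition rule set of Fig. 2(b) and a tape that is simply pre-padded with blanks for this particular input.

// Minimal sketch of executing rules in the format described above:
// "a b c >" or "a b c <" moves the head, "a b c d" rewrites the symbol; '-' is the blank.
import java.util.HashMap;
import java.util.Map;

public class TinyTuringMachine {
    public static void main(String[] args) {
        // Unary-addition rules of Fig. 2(b); states and symbols are single characters here.
        String[] rules = { "010>", "0011", "111>", "1-2<", "213-", "3-4<", "414<", "4-5>" };
        Map<String, String> table = new HashMap<>();
        for (String r : rules) table.put(r.substring(0, 2), r.substring(2)); // key: state + symbol

        StringBuilder tape = new StringBuilder("-110111-");  // add(2,3): "11 0 111", padded with blanks
        int head = 1;                                        // head on the leftmost 1
        char state = '0';

        while (true) {
            String action = table.get("" + state + tape.charAt(head));
            if (action == null) break;                       // no applicable rule: halt
            state = action.charAt(0);
            char op = action.charAt(1);
            if (op == '>')      head++;                      // move right
            else if (op == '<') head--;                      // move left
            else                tape.setCharAt(head, op);    // write a symbol
        }
        System.out.println(tape.toString().replace("-", "")); // prints 11111
    }
}

Running it prints 11111, i.e. 5, in agreement with the add(2,3) walk-through above.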
2.3 Visual Examples Our tools contain a set of visual examples that we introduced with the aim of motivating learners in courses that include such topics. These selected examples represent useful daily-life machines, games, and a puzzle. We have created six examples: an elevator, a vending machine, a man, wolf, and goat puzzle, a video player, a rice cooker, and a tennis game. In this section, we will describe the last one as an example. 2.3.1 Tennis Game Simulator A popular game such as tennis can be modeled by automata. The following example shows such a model. The input set represents the two players a and b. In this DFA there are 20 states. The start state is located at the root and denoted by an in-arrowed circle.
Fig. 3. An automaton representing tennis game (the transition diagram on inputs a and b, including the Deuce, Adv. a, and Adv. b states and the two final states a win and b win)
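A compact sketch of the point-level automaton of Fig. 3 is given below as an illustration. The simulator itself builds its automata state by state during play, so this table-driven form is an assumption made for these notes (Deuce is represented here as the 40-40 state, and the players are labelled A and B instead of a and b). The 16 score states plus the two advantage states and the two final states give the 20 states mentioned above.

// Illustrative sketch of the tennis-point DFA: input alphabet {A, B} = player who wins the point.
import java.util.HashMap;
import java.util.Map;

public class TennisPointAutomaton {
    private String state = "0-0";                      // start state
    private int version = 0;                           // not used; kept minimal
    private static final Map<String, String[]> next = new HashMap<>();

    private static void rule(String from, String onA, String onB) {
        next.put(from, new String[] { onA, onB });
    }

    static {
        // Regular scoring grid (0, 15, 30, 40); "40-40" plays the role of Deuce.
        String[] p = { "0", "15", "30", "40" };
        for (int a = 0; a < 4; a++)
            for (int b = 0; b < 4; b++) {
                String s = p[a] + "-" + p[b];
                String onA = (a == 3) ? "A wins" : p[a + 1] + "-" + p[b];
                String onB = (b == 3) ? "B wins" : p[a] + "-" + p[b + 1];
                rule(s, onA, onB);
            }
        // Deuce / advantage cycle overrides the 40-40 grid entry.
        rule("40-40", "Adv A", "Adv B");
        rule("Adv A", "A wins", "40-40");
        rule("Adv B", "40-40", "B wins");
    }

    /** Consume one input symbol: 'A' or 'B', the player who won the point. */
    public void point(char winner) {
        if (isFinal()) return;                         // final states have no outgoing transitions
        state = next.get(state)[winner == 'A' ? 0 : 1];
    }

    public boolean isFinal() { return state.endsWith("wins"); }
    public String getState() { return state; }

    public static void main(String[] args) {
        TennisPointAutomaton game = new TennisPointAutomaton();
        for (char c : "ABAABBAA".toCharArray()) game.point(c);
        System.out.println(game.getState());           // prints "A wins"
    }
}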
There are two final states denoted by double circles. The automaton enters a final state when one of the players wins the game. The transition diagram is shown in Fig. 3. A typical tennis game can be represented by three finite automata: one for the points, one for the games, and one for the sets. Our tennis game simulator considers two players, A and B, who can be selected to play. It also allows auto play where the players play randomly. The simulator displays the score as well as the underlying three automata. The first automaton is related directly to the points; when a player wins a point, a new state is created. The second automaton is related to the game; when a player wins a game, a new state is created. The third automaton is related to sets; when a player wins a set, a new state is created. Fig. 4 shows a snapshot of the tennis game simulator interface. The game is completed when a final state of the
Fig. 4. The Tennis game simulator interface
automaton, representing the game set, is created. The winner is the player who reaches this final state first. The simulator also integrates animation of the two virtual players and various sound effects to make the learning more fun and interesting. In the video player and rice cooker examples, the underlying automata are given first, and then when an operation takes place, the corresponding state is highlighted. In the tennis game automata, however, we considered a different approach: the automata are created state by state in response to the corresponding operation. This is done intentionally to give the students a different taste of machine modeling by automata and to make it more interesting. 2.4 Learners Self-assessment A set of exercises with different levels is also integrated with the environment. There are various types of quizzes: some are multiple choice, some are fill in the blanks, and some test for Turing machines, finite automata or regular expressions. Learners can perform a pre-assessment, an in-assessment, or a post-assessment. First, the learner must select an exercise and then a description of the test and the evaluation method will be shown in the main window. Learners can navigate among the quizzes by using the navigation buttons at the bottom of the main window. Learners can check the score at any time by clicking on the ‘score’ button. While answering a quiz, learners can get hints or click on the introduction button on the top of the window to go to the introduction component and read more about the topics related to the quiz.
3 Tools Assessment We carried out two experiments in order to evaluate the effectiveness of our integrated environment tools on the learning process of engineering students. The first experiment evaluates the improvement in the students’ motivation. The second experiment evaluates the effectiveness of using the tools on the students’ performance. The purpose of introducing the visual automata examples is to enhance the students’ motivation. To measure the effectiveness of these visual examples, we performed two experiments in the automata and formal languages course. The first one was for students who already completed the course; the sample population included 52 students who studied the topics in different classrooms. The following question was asked: “If the course was an elective course, would you choose to study it? And, do you recommend other students to study it?” Five options were given for responses: (a) don’t know, (b) no, (c) maybe no, (d) maybe yes, and (e) yes. The responses were as follows: 3 answered a, 3 answered b, 6 answered c, 27 answered d, and 13 answered e. Then, we demonstrated our visual examples to the students and repeated the same question again. Their responses (after seeing the examples) were: 1 for a, 3 for b, 2 for c, 29 for d and 17 for e. Comparing the results from “Before” and “After” exposure to the examples, there was a slight improvement in motivation. For choices a, b, and c, if the number of responses decreased, it indicates a positive response, which is what occurred. While for the other choices d and e, the increasing number of responses indicates positive response, which also occurred.
We note that there was only a small improvement in the students’ motivation, which is natural in this case because the students had already completed the course. In the next experiment we noted a greater improvement in the motivation of students who were new to the course. In the second experiment, a total of 69 students were included, and they were all new to the course. The same steps as in the previous experiment were repeated with a slight modification of the question. The question was “If the course was an elective one, would you choose to study it?” As before, students were allowed to choose from among the five responses: a, b, c, d, and e. Their responses (before seeing the examples) were as follows: 22 answered a, 6 answered b, 10 answered c, 23 answered d, and 8 answered e. Next, we demonstrated our visual examples to the students and presented the same question to them again. Their responses (after seeing the examples) were as follows: 9 answered a, 4 answered b, 8 answered c, 34 answered d, and 14 answered e. Comparing the results “Before” and “After” exposure to the examples, we can see a greater improvement in their motivation. As with the previous experiment, for choices a, b, and c, a decreasing number of responses meant a positive response, which is what occurred, while for the other choices d and e, an increasing number of responses meant a positive response, which also occurred. We note that the motivation in the case of the junior students (second experiment) improved more than that of the senior students (first experiment). This result might be explained by the fact that the juniors had not studied the course before. A preliminary study shows that the integrated environment can improve the learning process of computer engineering students who study the automata theory course and related courses. Last semester, the students were divided into four groups, each group containing 20 students. A set of 40 randomly selected exercises was distributed among the groups, 10 for each group. Members of each group could collaborate within their group but not with members of other groups, and no group could see the exercises of another group. Two groups were asked to answer their assigned exercises using the integrated environment (IE) and the other two groups without using it. An equal time period was provided to all the groups. The result showed a better performance for the two groups using the IE. Then, the experiment was repeated by redistributing the exercises among the four groups. Again, the two groups with the IE showed better performance.
4 Conclusion Applications of technology can provide course content with multimedia systems, active learning opportunities and instructional technology to facilitate education in the area of computer engineering to a broad range of learners. Such interactive course materials have already been introduced for several topics in computer engineering courses; see for example [5, 6, 7, 14, 15, 18]. In this paper, we followed the same path and introduced a set of visual tools to support interactive (e)-learning for computational models concepts. It can also be used in other courses such as automata and formal languages, language processing, theory of computation, compiler design, discrete mathematics, and other similar courses.
A number of similar tools have been developed (e.g. [1, 2, 3, 8, 16, 19]) to enhance the learning of computational models topics. Most of them suffer from one or more flaws that make them less effective as learning tools, particularly for less advanced students. For example, JFLAP [19] is a comprehensive automata tool, but it requires skilled learners who already know the basics of automata to make full use of its rich operations. The automata tools in [16] are powerful, but do not provide a convenient mechanism for displaying and visually simulating finite state machines. The ASSIST automata tools in [8] are difficult to set up and use. The tools in [1] lack visual clarity and dynamic capability. Almost all have been designed as tools for advanced learners: they work on the assumption that the learners have already grasped the fundamental concepts, and they depend on advanced mathematical and idiosyncratic user interactions. In contrast, our tools are designed as an easy-to-use, easy-to-learn, stand-alone, and all-in-one integrated environment. Through the results of our experiments, we also showed that our visual tools can enhance learners’ motivation and performance. In addition, an opinion poll showed positive feedback from the students on the environment tools. In future work, we plan to enhance our visual tools by adding more features, more visual examples and games, and by performing more performance evaluation experiments.
References
1. Bergstrom, H.: Applications, Minimization, and Visualization of Finite State Machines. Master Thesis. Stockholm University (1998), http://www.dsv.su.se/~henrikbe/petc/
2. Bovet, J.: Visual Automata Simulator, a tool for simulating automata and Turing machines. University of San Francisco (2004), http://www.cs.usfca.edu/~jbovet/vas.html
3. Christin, N.: DFApplet, a deterministic finite automata simulator (1998), http://www.sims.berkeley.edu/~christin/dfa/
4. Felder, R., Silverman, L.: Learning and teaching styles in engineering education. Engineering Education 78(7), 674–681 (1988)
5. Hadjerrouit, S.: Learner-centered Web-based Instruction in Software Engineering. IEEE Transactions on Education 48(1), 99–104 (2005)
6. Hamada, M.: Web-based Tools for Active Learning in Information Theory. ACM SIGCSE 38 (2007)
7. Hamada, M.: Visual Tools and Examples to Support Active E-Learning and Motivation with Performance Evaluation. In: Pan, Z., Aylett, R.S., Diener, H., Jin, X., Göbel, S., Li, L. (eds.) Edutainment 2006. LNCS, vol. 3942, pp. 147–155. Springer, Heidelberg (2006)
8. Head, E.: ASSIST: A Simple Simulator for State Transitions. Master Thesis. State University of New York at Binghamton (1998), http://www.cs.binghamton.edu/~software/
9. Herrmann, N.: The Creative Brain. Brain Books, Lake Lure (1990)
10. Java2D of Sun Microsystems, http://www.sun.com
11. Java Team, Buena Vista University, http://sunsite.utk.edu/winners_circle/education/EDUHM01H/applet.html
12. Keller, J.: Development and use of the ARCS model of motivational design. Journal of Instructional Development 10(3), 2–10 (1987)
13. Kolb, D.: Experiential Learning: Experience as the Source of Learning and Development. Prentice-Hall, Englewood Cliffs (1984)
14. Li, S., Challoo, R.: Restructuring an Electric Machinery course with Integrative approach and computer-assisted Teach Methodology. IEEE Transactions on Education 49(1), 16–28 (2006)
15. Masters, J., Madhyastha, T.: Educational Applets for Active Learning in Properties of Electronic Materials. IEEE Transactions on Education 48(1) (2005)
16. Mohri, M., Pereria, F., Riley, M.: AT&T FSM Library Software tools (2003), http://www.research.att.com/sw/tools/fsm/
17. Myers: Gifts Differing. Consulting Psychologists Press, Palo Alto, CA (1980)
18. Nelson, R., Shariful Islam, A.: Mes- A Web-based design tool for microwave engineering. IEEE Transactions on Education 49(1), 67–73 (2006)
19. Rodger, S.: Visual and Interactive tools. Automata Theory tools at Duke University (2006), http://www.cs.duke.edu/~rodger/tools/
20. Rosati, P.: The learning preferences of engineering students from two perspectives. In: Proc. Frontiers in Education, Tempe, AZ, pp. 29–32 (1998)
21. Soloman, B., Felder, R.: Index of Learning Style Questionnaire, http://www.engr.ncsu.edu/learningstyle/ilsweb.html
22. Transforming undergraduate education in science, mathematics, engineering, and technology. Committee on Undergraduate Science Education, Center for Science, Mathematics, and Engineering Education, National Research Council. National Academy Press, Washington, DC (1999)
Improving Software Development Process Implemented in Team Project Course Iwona Dubielewicz and Bogumiła Hnatkowska Institute of Applied Informatics, Wroclaw University of Technology, Wyb. Wyspianskiego 27, 50-370 Wroclaw, Poland {Iwona.Dubielewicz, Bogumila.Hnatkowska}@pwr.wroc.pl
Abstract. The paper presents an approach to assessing the software development process used within a students’ team project. The assessment is based on the exemplar Process Assessment Model given in the ISO/IEC 15504-5 standard. The results of the assessment indicate the areas of improvement in our realization of the software development process. The history, context, and basic assumptions of the course are given, together with the improvements proposed for its future editions. Keywords: team project, software development, process assessment model.
1 Introduction Software engineering (SE) is a discipline of developing and maintaining software systems that behave reliably and efficiently, are prepared for future enhancement and modification, and satisfy the requirements that customers have formulated for them. During their program of study, students of SE should participate in the development of real-world software. To fulfill the industrial postulate of incorporating professional practice into the curriculum, since 1998 we have been offering a Software Development Course (SSD). During this course students work in teams on the design and implementation of projects that involve consideration of real-world issues including safety, efficiency, and suitability for the intended users. Usually every educational course is subject to change after several editions. This is especially true for the software area because of the rapid changes in the field of software engineering. Similarly to software development, course development can be divided into four phases: specification, design, implementation, and assessment. Even though a course specification is stable and is performed only once, the design and implementation of the course is not a one-time process but rather a "work in progress", i.e. it can be seen as an iterative process. In order to conduct the next course iteration one should know the results from the assessment of the previous one. The main aims of the paper are:
- to present an assessment carried out for the existing software development process implemented in the project course; the assessment is based on the assessment model proposed in the ISO/IEC 15504-5 [3] standard,
- to identify the elements which could be changed in order to reach a higher quality of this process, and
- to propose a technical tool aiding the new quality-oriented activities.
We expect that a more formal process assessment of the course implementation will give us a strong rationale for the course modification. The paper consists of four sections. Section 2 briefly presents a model of process assessment as defined in [2], [3]. Section 3 describes an assessment of the taught process and gives some suggestions on how it could be changed for better quality. Section 4 summarizes the paper.
2 Model of Process Assessment An assessment of a software development process can be conducted according to the ISO 15504-2 standard [2]. The Process Assessment Model (PAM) is used as a common basis for performing assessments of software engineering process capability. PAM is a two-dimensional model of process capability. The considered dimensions are: 1. the process dimension, where processes are defined and classified into process categories; these categories are further decomposed into groups; 2. the capability dimension, where a set of process attributes grouped into capability levels is defined; these process attributes provide the measurable characteristics of process capability. In the process dimension three categories of processes are distinguished: primary, supporting, and organizational. An individual process is described in terms of process name, purpose, and outcomes [1]. For example, Table 1 describes the Software construction process. In PAM every process is extended with additional information in the form of:
- a set of base practices for the process, providing a definition of the tasks and activities needed to accomplish the process purpose and fulfil the process outcomes; each base practice is explicitly associated with a process outcome (see the line named Base Practices in Table 1);
- a list of input and output work products associated with each process and related to one or more of its outcomes; each work product has some associated characteristics.
PAM is based on the principle that the capability of a process can be assessed by demonstrating the achievement of process attributes on the basis of evidence related to assessment indicators. There are two types of assessment indicators: process capability indicators, which can be applied to capability levels 1 to 5, and process performance indicators, which can be applied exclusively to capability level 1. The process capability indicators are as follows:
- Generic Practice (GP);
- Generic Resource (GR);
- Generic Work Product (GWP).
These indicators concern significant activities, resources and/or results associated with the process. The existence of these indicators provides evidence of process capability.
Table 1. Exemplary process description in PAM (the source: [3])
Process ID: ENG.6
Name: Software construction
Purpose: The purpose of the Software construction process is to produce executable software units that properly reflect the software design.
Outcomes: As a result of successful implementation of the Software construction process: 1) verification criteria are defined for all software units against their requirements; 2) software units defined by the design are produced; 3) consistency and traceability are established between software requirements and design and software units; and 4) verification of the software units against the requirements and the design is accomplished.
Base Practices: ENG.6.BP1: Develop unit verification procedures. Develop and document procedures and criteria for verifying that each software unit satisfies its design requirements. The verification procedure includes unit test cases, unit test data and code review. [Outcome: 1] ENG.6.BP2: Develop software units. Develop and document the executable representations of each software unit. Update test requirements and user documentation. [Outcome: 2] (…)
In Table 2 examples of the mapping from the defined generic practices (GP x.y.z) to the relevant process attributes (PA.x.y.z) are given. A capability level is a set of process attribute(s) that provide a major enhancement in the capability to perform a process. Six process capability levels are defined: 0 – incomplete, 1 – performed, 2 – managed, 3 – established, 4 – predictable, 5 – optimizing. Every capability level attribute is described in a structured, standardized form [3]. An example of such a description, for the process performance attribute of the 1st level, is presented below:
Level 1: Performed process
PA 1.1 Process performance attribute
The process performance attribute is a measure of the extent to which the process purpose is achieved. As a result of full achievement of this attribute:
a) the process achieves its defined outcomes.
Generic Practices for PA 1.1:
- Achieve the process outcomes;
- Perform the intent of the base practices;
- Produce work products that evidence the process outcomes;
Generic Resources for PA 1.1:
- Resources are used to perform the intent of process specific base practices [PA 1.1 Achievement a]
Table 2. Exemplary relationships between generic practices and process attributes ([3])
PA 1.1: Process performance attribute
- GP 1.1.1 Achieve the process outcomes. (maps to PA.1.1.a)
PA 2.1: Performance management attribute
- GP 2.1.1 Identify the objectives for the performance of the process. (maps to PA.2.1.a)
- GP 2.1.2 Plan and monitor the performance of the process to fulfill the identified objectives. (maps to PA.2.1.b)
- GP 2.1.3 Control the performance of the process. (maps to PA.2.1.c)
- GP 2.1.4 Define responsibilities and authorities for performing the process. (maps to PA.2.1.d)
- GP 2.1.5 Identify and make available resources to perform the process according to plan. (maps to PA.2.1.e)
- GP 2.1.6 Manage the interfaces between involved parties. (maps to PA.2.1.f)
PA 2.2: Work product management attribute
- GP 2.2.1 Define the requirements for the work products. (maps to PA.2.2.a)
- GP 2.2.2 Define requirements for documentation and control of work products. (maps to PA.2.2.b)
- GP 2.2.3 Identify, document and control the work products. (maps to PA.2.2.c)
- GP 2.2.4 Review and adjust work products to meet the defined requirements. (maps to PA.2.2.d)
Generic Work Products for PA 1.1:
- Work products exist that provide evidence of the achievement of the process outcomes [PA 1.1 Achievement a]
All process attributes for all capability levels are given in Table 3. The table also defines which attribute ratings are required for a process to be categorized at a given capability level.
Table 3. Process Capability levels and their attributes
Capability level | Process attribute | Attribute rating
Level 0: Incomplete | Incomplete | N/P
Level 1: Performed | Process Performance | L/F
Level 2: Managed | Level 1 + Performance Management, Work product Management | F, L/F
Level 3: Established | Level 2 + Process Definition, Process Deployment | F, L/F
Level 4: Predictable | Level 3 + Process Measurement, Process Control | F, L/F
Level 5: Optimizing | Level 4 + Process Innovation, Continuous Optimization | F, L/F
(for levels 2 to 5, F applies to the attributes of the lower levels and L/F to the attributes introduced at that level)
The symbols of attribute rating are interpreted as follows:
- N (Not achieved): 0 up to 15% attribute achievement
- P (Partly achieved): > 15% up to 50% attribute achievement
- L (Largely achieved): > 50% up to 85% attribute achievement
- F (Fully achieved): > 85% up to 100% attribute achievement
3 Improvement of Software Development Process Implemented in SSD Course 3.1 Historical Background The considered software development process has been implemented within team project classes offered since 1999. Initially students followed the USDP [6] methodology, but since 2004 we have used the RUP process [7]. The previous editions of the SSD course are described in [4] and [5]. As this project corresponds to the concept of a capstone course, it has the three common elements of capstone programs: 1) students are divided into teams of typically 4 to 6 students each; one of them plays the role of the project manager and the lecturer plays a supervisor/auditor role; 2) each team is given a real-world project or problem to solve; 3) the project takes 30 weeks to complete (it continues over two semesters – 60 class hours, offered in the 7th and 8th semesters for about 120 students every year). The project constitutes a part of a subject called System Software Development. Before attending the course, we require the students to have finished courses on object-oriented programming, UML [8], and fundamentals of databases. In parallel, the students are offered software project management and database design courses. Design and implementation of the software system belong to the important activities within the project. We suggest that students use a multi-tier architecture as well as component techniques. The target system architecture should be distributed over at least two tiers; client-server is the most often selected system architecture. 3.2 Assessment of Current Software System Development Process The students are obliged to use a customized RUP approach while developing the software system within the SSD project. Comparing the attributes of our implemented development process to those given in [3], we can recognize that our process is at least on the 1st capability level. According to [2], a process is performed when it fully achieves its performance attribute by achieving its defined outcomes. Table 4 presents selected implemented workflows (GP), required work products (GWP), and some of the resources used in the process (GR). To establish how far our process is from the 2nd capability level, we have to show that our performed process is implemented in a managed fashion (planned, monitored and adjusted) and that its work products are appropriately established, controlled and maintained (see Table 3). We have decided to perform such an assessment for a limited number of attributes which, in our opinion, need to be improved. The same approach to assessment might be applied to all attributes defined in [3] for the managed process.
Table 4. Software system development work products
- Requirements. Work products: Vision, Glossary, Supplementary Specification, Use Case Model, Use-Case specifications, User-interface prototype. Notes: sketches of the GUI, screen shots, or an executable GUI prototype. Tools: Requisite Pro, Rational Rose.
- Design. Work products: Design Model, Deployment Model, Data Model, SAD. Tools: Rational Rose, SoDA.
- Test. Work products: Test plan, Defects list, Test evaluation summary. Tools: Test Factory (option), ClearQuest.
- Project management. Work products: Iteration Plan (from the 2nd iteration), Iteration Assessment, Use-case priorities. Notes: the Software Development Plan was elaborated by a teacher; the risk list is limited to use-case priorities.
- Environment. Work products: Project-specific guidelines (option). Notes: design guidelines, programming guidelines.
Table 5 gathers the information about the performance management attributes needed for the assessment; they are described by the letters A to D. The particular indicators are evaluated subjectively, on a 0-5 point scale, based on our expectations and experience. Some comments on each element's rating are given below. The process in the considered area is partly achieved (42%). Ad. A) Better monitoring of the project progress and task realization is needed. Now each team prepares weekly reports for the team supervisor, and the control over the project from the supervisor's perspective is relatively small. Tasks to be performed within a concrete iteration should be communicated in a readable way. Ad. B) More flexible treatment of the iteration content is needed. The disciplines performed within the project should be carefully selected; e.g., in many cases full business modeling can be spared. Ad. C) Easier access to templates and examples of the artifacts used during software development is needed. The artifacts should be clearly described; now they are distributed as packed zip archives. Ad. D) Improved communication inside the development team as well as between the development team and the supervisor is needed. Now, the direct contact between the supervisor and a team is usually limited to 30 minutes a week; this is often too short, so additional information is exchanged by e-mails or printed documents, and the feedback is rather sparse and late. Table 6 gathers the information needed for the assessment of the work product management attributes (row E). Some comments on each element's rating are given below the table. The process in the considered area is partly achieved (30%).
Table 5. Assessment of performance management attributes (elements rated on the 0-5 scale)
A) Performance of the process is planned and monitored
- GP: a.1) Process performance is monitored to ensure that planned results are achieved: 2 (rarely, manual)
- GR: a.2) Workflow management system: 0 (none)
- GWP: a.3) Contains status information about corrective actions, schedule and work: 2 (incomplete, not up to date)
B) Performance of the process is adjusted to meet plans
- GP: b.1) The plan(s) are adjusted, as necessary: 3 (very rarely)
- GP: b.2) Rescheduling is done when necessary: 2 (very rarely)
- GR: b.3) Facilities and infrastructure resources are available: 0 (fully manual)
- GWP: b.4) Process definition describes the way of plan controlling and adjusting: 2 (not formalised)
C) Resources and information necessary for performing the process are identified, made available, allocated and used
- GP: c.1) The human and infrastructure resources necessary to perform the process are defined, made available, allocated and used: 4 (distributed as zip files)
- GR: c.2) Information and/or experiences repository: 3 (zip files)
- GWP: c.3) States results achieved or provides evidence of activities performed in a process: 2 (not up to date)
D) Interfaces between involved parties are managed (…)
- GP: d.1) Communication between the involved parties is effective: 3 (weekly meetings)
- GR: none
- GWP: none
Summary: 23/55 (42%)
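For clarity, the summary row can be reproduced by simple arithmetic: the eleven element ratings add up to 2 + 0 + 2 + 3 + 2 + 0 + 2 + 4 + 3 + 2 + 3 = 23 out of a maximum of 11 × 5 = 55 points, i.e. 23/55 ≈ 42%, which falls into the partly achieved band (> 15% up to 50%) of Table 3; the 12/40 = 30% summary of Table 6 below is obtained in the same way.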
Table 6. Assessment of work product management attributes (elements rated on the 0-5 scale)
E) Work products are appropriately identified, documented, and controlled
- GP: e.1) Change control is established for work products: 2 (manual)
- GP: e.2) The work products are made available through appropriate access mechanisms: 0 (not for supervisor)
- GP: e.3) The revision status of the work products may readily be ascertained: 3 (contact with supervisor is needed)
- GR: e.4) Configuration management system: 0 (none)
- GR: e.5) Document identification and control procedure: 3 (manual)
- GWP: e.6) Records the status of documentation or work product: 2 (not up to date)
- GWP: e.7) Contains and makes available work products and/or configuration items: 0 (not for supervisor)
- GWP: e.8) Supports monitoring of changes to work products: 2 (often not up to date)
Summary: 12/40 (30%)
Ad. E) There is a need to introduce commonly used change and version control mechanisms for artifacts (especially documents). Now, each team is responsible for providing both mechanisms. The supervisor has limited access and thus incomplete knowledge about the document versions, their authors, and the reasons for changes. 3.3 Proposed Solution This section presents how the elements identified and described in section 3.2 are to be instantiated within the software system development process. The solution will be implemented during the following semester. The main element of the solution is Office SharePoint Server 2007 [9], which is used within the Institute of Applied Informatics as an LMS e-learning platform and as a portal for document management. SharePoint supports the business life cycle of design documents. All documents prepared by a particular team are stored and versioned by the SharePoint server. Creation of new documents is simplified by the possibility of using templates. In SharePoint a document can be in one of four states: draft, pending, approved, rejected. After creation, a document is in the draft state; within this state the document can be modified and updated many times. The history of document revisions is available to both the team and the supervisor. The team can change the state of a document to pending – see Fig. 1. This means that the document is ready to be checked. If the document verification is successful, the supervisor changes its state to approved, otherwise to rejected. The team is obliged to prepare a new version of a rejected document – the life cycle of such a document begins once again. Information about the authors/updaters of each document can easily be obtained (realization of c.3, d.1, e.3–e.8 – Tables 5 and 6).
Fig. 1. Information about work product state
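The document life cycle described above can be summarized as a small state machine. The sketch below is an illustration written for these notes; it mirrors the workflow rather than SharePoint's API, and the method names and the version counter are assumptions.

// Illustrative sketch of the draft/pending/approved/rejected work-product life cycle.
public class WorkProductLifecycle {
    enum State { DRAFT, PENDING, APPROVED, REJECTED }

    private State state = State.DRAFT;   // a document starts as a draft
    private int version = 1;

    /** The team may update a document any number of times while it is a draft. */
    void update() {
        if (state != State.DRAFT) throw new IllegalStateException("only drafts can be edited");
        version++;
    }

    /** The team submits the document for the supervisor's verification. */
    void submit() { require(State.DRAFT); state = State.PENDING; }

    /** The supervisor verifies a pending document. */
    void review(boolean ok) { require(State.PENDING); state = ok ? State.APPROVED : State.REJECTED; }

    /** A rejected document has to be reworked; its life cycle begins once again. */
    void rework() { require(State.REJECTED); state = State.DRAFT; version++; }

    private void require(State expected) {
        if (state != expected) throw new IllegalStateException("expected " + expected);
    }

    public static void main(String[] args) {
        WorkProductLifecycle doc = new WorkProductLifecycle();
        doc.update();            // edit the draft
        doc.submit();            // hand it over for verification
        doc.review(false);       // the supervisor rejects it
        doc.rework();            // the team starts a new version
        doc.submit();
        doc.review(true);        // approved
        System.out.println(doc.state + ", version " + doc.version);
    }
}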
SharePoint supports monitoring of the design tasks and the design progress. The iteration plan for the 1st iteration is prepared by the team supervisor; plans of subsequent iterations are prepared by the team managers. The plan is visualized as a Gantt chart – see Fig. 2 (realization of a.1, a.2, a.3). Any plan rescheduling has to be agreed with the supervisor; after that the team manager is allowed to introduce changes into the iteration breakdown (realization of b.1, b.2, b.3). Each week the supervisor checks the project progress against the plan. Usually the supervisor asks for the reasons of any noticed delays and, if they are important, discusses possible corrective actions and/or a rearrangement of the schedule (realization of b.4). A discussion forum and wiki pages enable knowledge transfer between all stakeholders in the process, i.e. students, domain experts, IT experts, etc. (realization of c.2). Table 7 presents the assessment of the current process (CSA) and an a priori assessment of the proposed solution (FSA). We hope the process will move to a fully achieved 2nd capability level in the considered area.
Fig. 2. Iteration process breakdown
Table 7. A priori assessment of planned improvements
Element | CSA [0-5] | FSA [0-5]
a.1) | 2 | 4 (planned and documented)
a.2) | 0 | 5
a.3) | 2 | 4 (omits corrective actions)
b.1) | 3 | 5
b.2) | 2 | 5
b.3) | 0 | 5
b.4) | 2 | 5
c.1) | 4 | 5
c.2) | 3 | 4 (acceptance procedure is needed)
c.3) | 2 | 5
d.1) | 3 | 4 (communication not guaranteed)
Summary | | 49/55 (89%)
e.1) | 2 | 5
e.2) | 0 | 5
e.3) | 3 | 5
e.4) | 0 | 5
e.5) | 3 | 5
e.6) | 2 | 5
e.7) | 0 | 5
e.8) | 2 | 5
Summary | | 40/40 (100%)
4 Summary The aim of the paper was to show how to evaluate the capability level of a process and to point out those tasks and activities in the process implementation whose improvement would raise the capability level. The assumption is that the process should be capable in terms of the ISO/IEC 15504-2 assessment model. Up to now the software development process has been at least at the 1st (performed) capability level, which results from the fact that it is a customised version of the RUP methodology. In our opinion, the RUP disciplines Configuration and Change Management and Project Management need improvement. The elements of these two disciplines were assessed against the requirements defined for the managed level (according to the Process
696
I. Dubielewicz and B. Hnatkowska
Assessment Model described in [3]). A solution aiming at improving the software development process in the considered area and moving the process attributes from partially achieved to fully achieved was also described. The key element of the solution is based on the application of Office SharePoint Server 2007. The solution is fully developed by the students attending the course. It is planned for deployment during the following semester; its a posteriori assessment will be possible in six months.
References
1. ISO/IEC 12207:1995/Amd.1:2002; Amd.2:2004, Information technology – Software life cycle processes (2004)
2. ISO/IEC 15504-2:2003, Information technology – Process assessment – Part 2: Performing an assessment (2003)
3. ISO/IEC 15504-5:2006, Information technology – Process assessment – Part 5: An exemplar Process Assessment Model (2006)
4. Dubielewicz, I., Hnatkowska, B.: Teaching of Information system modeling with UML. In: Proceedings of the International Conference MOSIS 2000, Roznov, Czech Republic, pp. 149–159 (2000)
5. Dubielewicz, I., Hnatkowska, B.: RUP customization in teaching of system software development. In: Benes, M. (ed.) Proceedings of the 7th ISIM conference. Acta MOSIS, MARQ 2004, Ostrava, vol. 96, pp. 19–27 (2004)
6. Jacobson, I., Booch, G., Rumbaugh, J.: The Unified Software Development Process. Addison-Wesley, Reading (1999)
7. Rational Unified Process, Rational Software Corporation (2003)
8. OMG Unified Modeling Language Specification, Version 1.4 (2001)
9. Microsoft Office SharePoint Server (2007), http://office.microsoft.com/pl-pl/sharepointserver/FX100492001045.aspx
An Undergraduate Computational Science Curriculum Angela B. Shiflet and George W. Shiflet Wofford College Spartanburg, South Carolina, USA {shifletab, shifletgw}@wofford.edu, http://www.wofford.edu/ecs/
Abstract. Wofford College instituted one of the first undergraduate programs in computational science, the Emphasis in Computational Science (ECS). Besides programming, data structures, and calculus, ECS students take two computational science courses (Modeling and Simulation for the Sciences, Data and Visualization) and complete a summer internship involving computation in the sciences. Materials written for the modeling and simulation course and developed with funding from the National Science Foundation served as the basis for the first textbook designed specifically for an introductory course in the computational science and engineering curriculum. The successful ECS has attracted a higher percentage of females than is found in most computer science curricula. The SIAM Working Group on Undergraduate Computational Science and Engineering Education summarized features of Wofford’s ECS and other computational science programs. Besides its established curriculum, Wofford has incorporated computational science in other courses, such as in a sequence of three microbiology laboratories on modeling the spread of disease. Keywords: computational science, education, modeling, simulation, undergraduate, internships, females.
1 Introduction
Much scientific investigation now involves computing as well as theory and experimentation. Computing can often stimulate the insight and understanding that theory and experiment alone cannot achieve. With computers, scientists can study problems that previously would have been too difficult, time consuming, or hazardous; and, virtually instantaneously, they can share their data and results with scientists around the world. The increasing speed and memory of computers, the emergence of distributed processing, the explosion of information available through the World Wide Web, the maturing of the area of scientific visualization, and the availability of reasonably priced computational tools all contribute to the increasing importance of computation to scientists and of computational science in education. With funding from the National Science Foundation (NSF Grant No. 0087979), Wofford College developed one of the first undergraduate programs in computational science, an Emphasis in Computational Science (ECS), which the college’s faculty unanimously approved in 1998 [1]. One highly successful feature of the ECS is the requirement of a summer internship involving computation in the sciences. With its
emphasis on applications, the program has attracted women at a much higher percentage than the average computer science major.
1.1 Wofford College’s Emphasis in Computational Science (ECS)
Wofford College is a selective residential undergraduate liberal arts institution of 1350 students, where the sciences are particularly strong [2]. The SAT range (mid 50% of the class) for freshmen entering in the fall of 2007 was 1140-1340. Approximately one-third of the students major in science, and about two-thirds of those majoring in science and mathematics attend postgraduate or professional schools. For a year in 1997-1998, faculty members in biology, chemistry, mathematics, physics, psychology, and computer science met to discuss how better to prepare students to use computing in the sciences. The following general needs were identified:
− A balanced program for interested and qualified science majors through which they can expand their knowledge of and skills in computational science
− Increased opportunities for students to obtain internships, graduate work, and jobs in computational science
− Ready access for science and computer science students to modern computational software, such as scientific visualization tools and graphical computer algebra systems
− Familiarity for all computer science majors and many science majors with distributed processing and the UNIX environment because of their extensive use in the sciences
In response to these needs, with consultation from scientists at various laboratories, and with assistance from Dr. Bob Panoff and the Shodor Educational Foundation [3], the faculty developed the Emphasis in Computational Science (ECS), which has the following requirements:
− Complete a Bachelor of Science in a lab science or Mathematics, Physics, or Psychology
− Complete five courses: Programming and Problem Solving, Data Structures, Calculus I, and two computational science courses: Modeling and Simulation for the Sciences (COSC/MATH 175) and Data and Visualization (COSC 370)
− Complete a summer internship involving computation in the sciences
1.2 Modeling and Simulation for the Sciences (COSC/MATH 175)
Prerequisites for Modeling and Simulation for the Sciences (COSC/MATH 175), which does not require computer programming experience, are minimal. The course uses the concept of rate of change, or derivative, from a first course in calculus throughout, but students do not need to know derivative formulas to understand the material or develop the models. With a brief introduction, all students who have taken COSC/MATH 175 without having had calculus have successfully completed the course with above average grades.
Modeling and Simulation for the Sciences prepares the student to understand and utilize fundamental concepts of computational science, the modeling process, computer simulations, and scientific applications. The course considers two major approaches to computational science problems: system dynamics models and cellular automaton simulations. System dynamics models provide global views of major systems that change with time. For example, one such model considers changes over time in the numbers of predators and prey, such as hawks and squirrels. To develop such models, students employ a systems dynamics tool, such as STELLA®, Vensim®, or Berkeley Madonna®, to create pictorial representations of models, establish relationships, run simulations, and generate graphs and tables of the results. Typical applications include drug dosage, scuba diving and the ideal gas laws, enzyme kinetics, defibrillators and electrical circuits, cardiovascular system, global warming, carbohydrate metabolism, predator-prey, competition, radioactive chains, malaria, and other diseases. In contrast to system dynamics, cellular automaton simulations provide local views of individuals affecting individuals. The world under consideration consists of a rectangular grid of cells, and each cell has a state that can change with time according to rules. For example, the state of one cell could represent a squirrel and the state of an adjacent cell could correspond to a hawk. One rule could be that, when adjacent, a hawk gets a squirrel with a probability of 25%. Thus, on the average at the next time step, a 25% chance exists that the particular squirrel will be no more. Students employ a computational tool, such as Mathematica®, Maple®, or MATLAB®, to complete simulations, such as Brownian motion, movement of ants, spread of fire, HIV in body, foraging behavior, spread of disease, fish schooling, pit vipers and heat diffusion, and snow-flake solidification. 1.3 Data and Visualization (COSC 370) Because large Web-accessible databases are becoming prevalent for storing scientific information, Data and Visualization (COSC 370) covers the concepts and development of relational databases. With a prerequisite of the first programming course, currently in the Python programming language, students in the class learn Perl and HTML programming in the UNIX operating system environment. After learning to access and develop databases in MySQL, they create web pages with Perl CGI programs to interface between web pages and scientific databases. Additionally, they study a dynamic programming algorithm for alignment of genomic sequences. Interactive online modules, developed at Wofford with the help of its NSF grant, provide the textbook for this portion of the course [4]. The second half of the course covers scientific visualization. Effective visualization of data helps scientists extract information and communicate results. Thus, students learn fundamental concepts, tools, and algorithms of computer graphics and interactive scientific visualization animations using Steve Cunningham’s text on Computer Graphics: Programming, Problem Solving, and Visual Communication [5], which has all scientific applications. For example, some of the animations are of DNA and other molecules, diffusion across a membrane, movement of ocean waves, heat diffusion, spread of disease, and Lorenz equations.
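To make the cellular automaton approach of Sect. 1.2 concrete, the short Python sketch below is offered as an aside. It is not part of the course materials; the grid size, state encoding, and number of steps are illustrative assumptions, and only the 25% predation probability is taken from the hawk and squirrel example above. It applies the rule that a squirrel adjacent to a hawk disappears with probability 0.25:

import random

EMPTY, SQUIRREL, HAWK = 0, 1, 2
N = 10          # grid is N x N; the size is an illustrative assumption
P_CATCH = 0.25  # probability that an adjacent hawk gets a squirrel

def neighbors(grid, i, j):
    # states of the north/south/east/west neighbours of cell (i, j)
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < N and 0 <= nj < N:
            yield grid[ni][nj]

def step(grid):
    # one synchronous update: every squirrel next to a hawk survives with probability 0.75
    new = [row[:] for row in grid]
    for i in range(N):
        for j in range(N):
            if grid[i][j] == SQUIRREL and HAWK in neighbors(grid, i, j):
                if random.random() < P_CATCH:
                    new[i][j] = EMPTY
    return new

grid = [[random.choice([EMPTY, SQUIRREL, HAWK]) for _ in range(N)] for _ in range(N)]
for t in range(5):
    grid = step(grid)
    print("step", t + 1, "squirrels:", sum(row.count(SQUIRREL) for row in grid))

In the course itself, students would build this kind of rule with one of the computational tools named above, such as Mathematica, Maple, or MATLAB.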
1.4 Computational Science Internships Building on their classroom work, students obtaining the Emphasis in Computational Science have had exciting and meaningful summer internships involving computation in scientific research at such institutions as Los Alamos National Laboratory, the Jet Propulsion Laboratory, Oak Ridge National Laboratory, The Scripps Research Institute, Howard Hughes Medical Institute at the Wadsworth Center, The Shodor Education Foundation, the National Blood Data Resource Center, Greenwood Genetic Center, University of California at San Diego, Virginia Commonwealth University, Clemson University, University of South Carolina, and the Medical University of South Carolina. Examples of some of the projects are simulating the dynamics of the parasite that causes Chagas’ disease, developing software for the science operations interface of Mars Rovers, optimizing a program to simulate aspects of heart behavior, developing programs to study the evolution of bacterial genomes, performing a microgravity scaling theory experiment, analyzing the relationship of diet to birth defects, creating an extensible framework for the mathematical manipulation of music, modeling of biochemical pathways involved with cardiovascular disease, performing computer image processing of the ribosome, modeling metabolic pathways of a bacterium for bioremediation, implementing a text mining approach to evaluate terms for ontology development, and analyzing traumatic brain injuries computationally. After their internships, students have presented their results at Wofford and at conferences, such as The Society for Industrial and Applied Mathematics (SIAM) Annual Conference, the SIAM Computational Science and Engineering Conference, and the Consortium for Computing in Science in Colleges Southeastern Conference. ECS graduates have attended medical school to become physicians; pursued such graduate degrees as genetics at the University of North Carolina, biotechnology and biomedical engineering at the University of South Carolina, computational physics at the North Carolina A & T University, physics at the University of Tennessee and Oklahoma State University, and computer graphics at Columbia University; and have obtained positions, such as medical researcher at GlaxoSmithKline; researcher at the National Institutes of Health, Oak Ridge National Laboratory, and Vanderbilt Medical School; and computational science educator at the Shodor Foundation. 1.5 Attracting Female, Minority, and Biology Students A disturbing trend in recent years has been for a much smaller percentage of women to pursue undergraduate degrees in computer science. In 1984, women earned 37% of computer science bachelor’s degrees, but they obtained only 28% of such degrees in 2000 [6]. Educational research indicates that on the average women prefer applications of computer science and teamwork to such areas as game development working individually [7] and [8]. Encouraging women to take more computer science was not one of the goals of Wofford’s ECS, however, that certainly has been the effect. In the years 2002-2007, eighteen (18) students graduated with the ECS. Eight (8) of these, or 44%, were women. Perhaps emphases on applications to the sciences and working in teams have been two of the factors that have contributed to the higher percentage of interest by women in computational science at Wofford.
We also were pleased that minorities completed the ECS at a slightly higher percentage than their representation in Wofford’s general population. Three (3) ECS graduates (17%) were minorities. Another surprise is the number of biology majors who are attracted to computational science. Conventional “wisdom” is that biology majors do not like or do not excel in mathematics or other technical areas. That has not been our experience, and often biology majors are at the top of their computer science and mathematics classes. Since 2002, thirteen (13) of the eighteen ECS graduates (72%) have been biology majors.
2 Introductory Textbook While designing the two computational courses, it became evident that there were no suitable, available textbooks written for undergraduates. Arising from this need, the authors of this paper developed such a textbook. One of the authors is a mathematician/computer scientist, and the other is a biologist. The interdisciplinary nature of this area inspired collaboration. Each author had sufficient science and mathematics background to make the partnership possible and successful. Thus, with a foundation of the materials developed through the NSF grant, the authors wrote the first textbook designed specifically for an introductory course in the computational science and engineering curriculum, Introduction to Computational Science: Modeling and Simulation for the Sciences [9]. 2.1 Content Introduction to Computational Science: Modeling and Simulation for the Sciences prepares the student to understand and utilize fundamental concepts of computational science, the modeling process, computer simulations, and scientific applications. The text considers two major approaches to computational science problems: system dynamics models and cellular automaton simulations. One of the positive aspects and challenges of computational science is its interdisciplinary nature. This challenge is particularly acute with students who have not had extensive experience in computer science, mathematics, and all areas of the sciences. Thus, the text provides the background that is necessary for the student to understand the material and confidently succeed in the course. Each module involving a scientific application covers the prerequisite science without overwhelming the reader with excessive detail. The numerous application areas for examples, exercises, and projects include astronomy, biology, chemistry, economics, engineering, finance, earth science, medicine, physics, and psychology. Most sections of a module end with Quick Review Questions that provide fast checks of the student's comprehension of the material. Answers, often with explanations, at the end of the module give immediate feedback and reinforcement to the student. To further aid in understanding the material, most modules include a number of exercises that correlate directly to text examples and that the student usually is to
complete with pencil and paper. Answers to selected problems, whose exercise numbers are in color, appear in an appendix. A subsequent “Projects” section provides numerous project assignments for students to develop individually or in teams. While a module, such as “Modeling Malaria,” might develop one model for an application area, the projects section suggests many other refinements, approaches, and applications. The ability to work well with an interdisciplinary team is important for a computational scientist. Two chapters provide modules of additional, substantial projects from a variety of scientific areas that are particularly appropriate for teams of students. 2.2 Website The text’s website (linked from http://www.wofford.edu/ecs/) provides links to downloadable tutorials, models, pdf files, and datasets for various tool-dependent quick review questions and answers, examples, and projects. Moreover, an online Instructor’s Manual includes solutions to all text exercises, tutorials, and selected projects [4]. To model dynamic systems, students using the text can employ any one of several tools, such as STELLA®, Vensim® Personal Learning Edition (PLE) (free for personal and educational use), Berkeley Madonna®, the Python programming language, or Excel®. The text also employs a generic approach for cellular automaton simulations and scientific visualizations of the results, so that students can employ any one of a variety of computational tools, such as Maple®, Mathematica®, MATLAB®, the Python programming language, or Excel®. Typically, an instructor picks one system dynamics tool and one computational tool for class use during the term.
3 SIAM Working Group Report In 2006, a SIAM Working Group on Undergraduate Computational Science and Engineering Education issued a report [10]. The committee consisted of Peter Turner, Chair, Kirk Jordan, Linda Petzold, Angela Shiflet, and Ignatios Vakalis. To a large extent Wofford College’s Emphasis in Computational Science follows the working group’s recommendations. 3.1 The Report The SIAM Working Group Report noted, “Some content elements appear to be common in the emerging undergraduate CSE curriculum: scientific programming, numerical methods/scientific computing, linear algebra, differential equations, mathematical modeling, and statistics are common mathematics components; advanced programming, parallel and high performance computing, and scientific visualization are commonly added where the program has its home closer to computer science; simulation, optimization, computational fluid dynamics, image and signal processing are among the offerings from some of the applications areas…. By the nature of CSE, the successful undergraduate CSE student will have skills in applied mathematics, computing including some parallel or high performance computing, and at least one application field”.
The report continued, “It is absolutely essential that interdisciplinary collaboration be an integral part of the curriculum and the thesis research….Expressed in broad terms, the overall needs are a combination of disciplinary skills and cross-disciplinary skills, learning how to learn, ability to work in a team, adaptability, perseverance and an interest in solving problems that may be multi-faceted”. Topics of the report include “Nature of CSE Undergraduate Education”, “Models for CSE Programs”, “A Few Examples”, “The Value of Internships”, “Needs that Undergraduate CSE Education Must Address”, “CSE Careers”, and “Conclusion and Recommendations”.
4 Modeling in the Biology Classroom
Computational science education not only refers to established programs, but also can involve individual courses or projects in various science courses. For the past three years, George Shiflet has incorporated a three-laboratory sequence on the modeling of the spread of disease in Microbiology, a class with 30 to 40 students.
4.1 The Laboratories on Modeling
In the first week’s laboratory in the sequence on modeling, students are given an introduction to a systems dynamics modeling tool, in this case STELLA, with a model of the interactions of predators and prey.
Fig. 1. SIR model diagram (stocks: susceptibles, infecteds, recovereds; flows: get sick, recover; parameters: infection rate, recovery rate)
susceptibles(t) = susceptibles(t - dt) + (-get_sick) * dt
infecteds(t) = infecteds(t - dt) + (get_sick - recover) * dt
recovereds(t) = recovereds(t - dt) + (recover) * dt
get_sick = transmission_constant * susceptibles * infecteds
recover = recovery_rate * infecteds
Fig. 2. Difference equations for the SIR model generated by STELLA
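The difference equations of Fig. 2 can also be reproduced outside a system dynamics tool. A minimal Python sketch of the same Euler-style update is given below; the population size, rate constants, time step, and simulation length are illustrative values only, not those used in the laboratory:

# Euler integration of the SIR difference equations from Fig. 2
transmission_constant = 0.00005   # illustrative value
recovery_rate = 0.1               # illustrative value
dt = 0.5                          # time step, in days
susceptibles, infecteds, recovereds = 9999.0, 1.0, 0.0

for step in range(int(100 / dt)):             # simulate 100 days
    get_sick = transmission_constant * susceptibles * infecteds
    recover = recovery_rate * infecteds
    susceptibles += -get_sick * dt
    infecteds += (get_sick - recover) * dt
    recovereds += recover * dt

print(round(susceptibles), round(infecteds), round(recovereds))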
Then, students progress through a tutorial on developing a simple SIR (susceptibles-infecteds-recovereds) model of the spread of disease. Fig. 1 displays an SIR model diagram typical of those that the students create with a systems dynamics tool. After establishing the relationships among the systems of susceptibles (S), infecteds (I), and recovereds (R), students double click each component and enter initial values, constants, and differential equations. For example, with t indicating time and with dR/dt = cR for a positive constant c being the model for the rate of change of recovereds with respect to time, students enter the equation recovery_rate * infecteds into the recover flow from infecteds to recovereds. Similarly, the flow out of susceptibles, get_sick, gets the equation transmission_constant * susceptibles * infecteds. STELLA translates the constants and equations into corresponding difference equations, such as those in Fig. 2. Upon running the simulation, STELLA can display tables and graphs. After completing the tutorial, students select diseases, such as Typhus or Hepatitis C, at random. Each student is paired with another student to investigate the disease before the next laboratory. The student pairs ascertain as much as possible about the nature of their assigned diseases, including data, such as rates of change. In the next week’s laboratory, each pair develops a model of the spread of their disease using the system dynamics modeling tool. The professor and student assistants, who are obtaining the Emphasis in Computational Science, mentor the teams. In the final week of the lab sequence, each pair makes a presentation on their disease and model to the class. Individually, each student writes a report on his or her pair’s model and what they learned from the experience. Computational science students who assist in the laboratories also help to evaluate the models.
4.2 Student Perceptions
Student comments about their projects reveal a deeper understanding of the spread of their diseases, the modeling process, and the utility of models. Comments from two students are insightful and typical. One student wrote, “I understand the relationship between these factors better now that I have worked with the model and adjusted the formulas that determine these relationships. Using (a system dynamics tool) for this modeling project was also interesting because it allowed researched facts to be projected into likely outcomes. It was fascinating to see the trends that developed as the model ran.” Another student in the class commented, “I thoroughly enjoyed creating a model to better understand a biological situation and to determine the rates at which an infection, recovery, or death can occur. Developing the mumps model allowed me to better understand how this disease can actually infect a population by producing real numbers. The graphs, in particular, helped me visualize the results of the model itself….I felt as if I wanted to add more and more things to make a more complicated but more realistic model. The ability to work with simulations like this will allow the scientist or researcher to be able to understand better the disease and relate to it in a way that could possibly allow for a better prevention of the disease”.
4.3 Why Model in a Biology Lab?
Modeling in the microbiology lab has proved beneficial in several key areas:
− Understanding of fundamental concepts, such as rate of change
− Critical thinking skills, such as model construction, extension, and testing
− More effective problem-solving skills
− Communication skills
− Interactive learning experience.
5 Conclusion Like others, we have found that computers have become fast and cheap enough; networks have become sophisticated enough; scientific visualization has become mature enough; and the Internet has become pervasive and friendly enough so that a meaningful undergraduate computational science program is not only desirable but also now possible. Students who successfully complete such a program enter a variety of scientific fields, where they will be able to collaborate more effectively with others and help to transform the way science is done.
References
1. Swanson, C.: Computational Science Education. The Krell Institute, http://www.krellinst.org/services/technology/CSE_survey/
2. Wofford College, http://www.wofford.edu
3. Shodor Educational Foundation, Inc., http://www.shodor.org/
4. Computational Science, http://www.wofford.edu/ecs/
5. Cunningham, S.: Computer Graphics: Programming, Problem Solving, and Visual Communication. Prentice Hall, New York (2007)
6. Spertus, E.: What We Can Learn from Computer Science’s Differences from other Sciences. The Barnard Center for Research on Women, http://www.barnard.columbia.edu/bcrw/womenandwork/spertus.htm
7. Margolis, J., Fisher, A.: Unlocking the Clubhouse: Women in Computing. The MIT Press, Cambridge (2001)
8. Thom, M.: Balancing the Equation: Where Are Women and Girls in Science, Engineering and Technology? National Council for Research on Women, New York (2001)
9. Shiflet, A., Shiflet, G.: Introduction to Computational Science: Modeling and Simulation for the Sciences. Princeton University Press, Princeton (2006)
10. SIAM Working Group on CSE Undergraduate Education: Undergraduate Computational Science and Engineering Education. SIAM, http://www.siam.org/about/pdf/CSE_Report.pdf
Cryptography Adapted to the New European Area of Higher Education
A. Queiruga Dios (1), L. Hernández Encinas (2), and D. Queiruga (3)
(1) Department of Applied Mathematics, E.T.S.I.I., University of Salamanca, Avda. Fernández Ballesteros 2, 37700-Béjar, Salamanca, Spain, [email protected]
(2) Department of Information Processing and Coding, Applied Physics Institute, CSIC, C/ Serrano 144, 28006-Madrid, Spain, [email protected]
(3) Department of Business Administration, University of Salamanca, Campus Miguel de Unamuno, FES building, 37007-Salamanca, Spain, [email protected]
Abstract. A new experience in teaching Cryptography to engineering students is presented. The aim is to give them a better understanding of security and cryptographic algorithms by using Maple software in a graduate-level course. In this paper we discuss how to structure, define, and implement a web-based course as a part of the traditional classes, according to the convergence of the European Higher Education Project. The proposed course facilitates the use of new Information and Communication Technologies. Keywords: Public key cryptography, implementation, Maple software, information and communication technologies, European area of higher education.
1 Introduction
The Bologna declaration (1999) proposes the creation of a European Area of Higher Education (EAHE) to unify university studies in Europe. It emphasizes the creation of the European Area of Higher Education as a key to promote citizens’ mobility and employability and the Continent’s overall development [2]. Spain is one of the 46 countries involved in the Bologna Process. The cornerstones of such an open space are mutual recognition of degrees and other higher education qualifications, transparency (readable and comparable degrees organised in a three-cycle structure) and European cooperation in quality assurance. This earthquake in thinking about the new education process means that we must approach the design, development, and implementation of learning environments in such a way that the achievement and assessment of the intended competencies are made possible and facilitated. This new proposal implies a change in the instructional design process [12]. The use of Information and Communication Technologies (ICT) in higher education is considered a pre-requisite for the adaptation to the EAHE. University
studies must be adapted to the international European context and to technological development, facilitating new strategies of communication. This new situation forces universities to rethink practices that until now seemed stable, such as teaching methodologies, and to change their degrees and study programmes. ICT become more and more important in the higher education process, claiming new spaces and conditions of learning, and new professional roles for lecturers [8]. One of the fields of greatest projection into the future and greatest impact within the ICT is Cryptography. As is known, this science is closely related to Mathematics in general, and to Computer Science in particular. Its aim is the preservation of information, including confidentiality, integrity, and authentication. The goal of Cryptography is to provide safe communications over insecure channels, allowing people to send messages by means of a channel that can be intercepted by a third party (mail or e-mail, telephone, fax, etc.), so that only the authorized receiver can read the messages [7], [15]. The great importance of Cryptography today is due to the proliferation of personal computers and the ease of access to the Internet. These facilities have caused serious security problems, such as viruses, spam, phishing, publication of confidential information, etc. All of this makes it necessary for students and future professionals to be aware of the dangers that browsing the Internet without safety measures entails. In this paper, we present some educational tools to learn about Cryptography and how to implement different cryptosystems by using Maple software together with the Moodle environment. Moodle is an open source package, designed using pedagogical principles, in order to help educators create effective online learning communities. The rest of the paper is organized as follows: In section 2, we will comment on the changes that are happening in the Spanish universities to reach the European Area of Higher Education. The course Cryptography and information security will be detailed in section 3. Background and Maple concepts needed to follow this course are presented in section 4. In section 5 we will present the Moodle tools used at the University of Salamanca (http://www.usal.es), the detailed methods that we used in the course will be stated in section 6, and finally, the conclusions will be shown in section 7.
2 Changes in Higher Education
The knowledge society depends for its growth on the production of new knowledge, its transmission through education and training, and its dissemination through Information and Communication Technologies [1]. As mentioned in the Introduction, one of the means to achieve the convergence of European Higher Education and the common goal of the Bologna Declaration is the use of ICT in higher education. Universities face an imperative need to adapt and adjust to a whole series of profound changes, including increased demand, internationalisation and links with business.
Online education also refers to learning methods that, at least partly, utilize the ICT available through the Internet. What we propose to the students is to use online methods to get a more complete education in specific subjects. Online education is a new method of education, very different from traditional education, that takes advantage of new media, new ways to communicate, and the design of new educational experiences. Educators are thus utilizing the Internet for professional networking, regionally and globally; they learn from one another about the new media and their applications to education [21], and renew their knowledge in virtually all fields of enquiry. The ICT have changed from being considered a mere object of use to being an instrument of support for educational innovation [20]. They affect different aspects of traditional education, such as the role of the teacher, who has changed from a simple transmitter of knowledge to a mediator in the construction of the knowledge of the students; the role of the student has also changed, as the traditional educational models do not adjust to the processes of learning by means of the ICT [18]. Finally, it is important to take into account that the use of new technologies does not require the invention of new methodologies, but it does require a modification of the strategies for the continuous learning of the student [13]. Modern e-Learning technology may act as a bridge: On the one hand, computer systems make real experiments available over the Internet, any time, anywhere, and – even more important – make the measured data electronically available for further analysis [11]. On the other hand, students can access simulations within virtual laboratories.
3 Cryptography and Information Security Course
The Cryptography and information security course has been revised from previous years to give more emphasis to the practical elements of the course. It is an introduction to Cryptography. The viewpoint of this course is specifically the design and analysis of real world cryptographic schemes. We consider tasks like encryption and decryption processes, digital signatures, authentication, and key distribution. The goal is to instil understanding of fundamentals into the design of cryptographic protocols. Formally, the assessed course consists of 7 modules and a set of laboratory software practices with Maple. The modules are:
1. Introduction to Cryptography
2. Mathematical tools
3. Private Key Cryptosystems
4. RSA Cryptosystem
5. ElGamal and Elliptic Curve Cryptosystems
6. Chor-Rivest knapsack Cryptosystem
7. Biometric recognition systems
In the beginning, this course was purely theoretical; nevertheless, in recent years we have included laboratory practices using Maple software, and now
we make use of ICT to provide, rather than a mere online course, a set of tools that make it possible to work with students in class or at home.
3.1 A Brief Introduction to Cryptography
The objective of Cryptography is to assure the secrecy and confidentiality of communications between several users, and the goal of Cryptanalysis is to break the security and privacy of such communications [15], [16]. In particular, in Public Key Cryptography (PKC) each user has two keys: the public key, which is publicly known and is used by the sender to encrypt a message; and the private key, which is kept secret by the receiver and is used by him to decrypt the received encrypted messages. In general, PKC bases its security on the computational intractability of some Number Theory problems, such as the factorization problem, the discrete logarithm problem or the knapsack problem.
3.2 Modules of the Course
The course starts with an introduction to Cryptography and to the mathematical tools necessary for the correct understanding and development of the different modules. This introductory part includes the main mathematical problems on which the security of the different cryptosystems is based. Later, basic concepts of Cryptography will be approached, including the different types of cryptosystems and the digital signature schemes associated with them. The course includes the most important Secret Key or Symmetric Cryptosystems as well as Public Key or Asymmetric Cryptosystems. The RSA public key cryptosystem [5], [19] and its digital signature protocol will be detailed, including the most important attacks against their security. ElGamal and Elliptic Curve cryptosystems [6], [14] are also included; the security of these cryptosystems is based on the discrete logarithm problem. The most important characteristics of knapsack cryptosystems will also be analyzed, in particular the Chor-Rivest cryptosystem [4]. To exemplify in a practical way the schemes and protocols studied throughout the course, different laboratory practices will be carried out with the symbolic computation program Maple. In this way, the difficulty of the mathematical problems considered and the security of the cryptosystems will be shown.
4 Implementation and Procedures in Maple
In this section we present the second part of the course, which aims to show the students how to work with Maple software in order to implement the procedures, functions, and statements needed to transform the messages, to generate the keys, to encrypt, and to decrypt messages with the cryptosystems previously mentioned. Maple is a comprehensive environment for teaching and applying Mathematics which contains thousands of math procedures. It permits defining specific
procedures by using the Maple programming language. It contains several packages to help professors teach and students understand mathematical concepts. A package is a collection of routines gathered together; for example, numtheory, ListTools and LinearAlgebra are packages, each of which provides a range of functionality, through commands, for solving problems in some well-defined problem domain. Maple has excellent online help. In fact, it is possible to access this help by choosing Maple Help from the Windows menu, typing a question mark at the command prompt, or pressing the F1 key. After learning the Maple syntax, the students will practice with the main commands needed to implement the different cryptosystems. For example, for the RSA cryptosystem they will need commands like ifactors(n), which returns the complete integer factorization of the integer n; phi(n), which calculates Euler's phi function (or totient function) of n, that is, the number of positive integers not exceeding n and relatively prime to n; or Power(c, d) mod n, which computes c^d mod n. In the case of the Chor-Rivest knapsack cryptosystem [9], the GF(q,h,f) command returns a table of functions and constants for doing arithmetic in the finite field of q^h elements: GF(q^h) = GF(q)[T]/(f(T)), where f(T) is an irreducible monic polynomial of degree h over the integers modulo q. If f is not specified, Maple uses a random one. The Powmod(a,n,f,x) function computes the remainder of a^n in GF(q^h). Finally, if u = [u1, ..., un] is a list of integers and m = [m1, ..., mn] a list of moduli, pairwise relatively prime, the function chrem(u,m) solves the Chinese Remainder Theorem, i.e., it computes the unique positive integer a, 0 < a < M (M denotes the product of the moduli), such that a = u1 mod m1, a = u2 mod m2, ..., a = un mod mn.
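As an aside for readers who want to try the same computations without Maple, the sketch below shows a Python version of the RSA steps just described, with Python's built-in pow taking the role of Power(c, d) mod n and the totient computed directly from the prime factors. The primes and the message are purely illustrative, and this is only a sketch, not the course's Maple practice code:

from math import gcd

# Small illustrative primes; real keys use primes of hundreds of digits.
p, q = 61, 53
n = p * q
phi_n = (p - 1) * (q - 1)        # Euler's totient of n = p*q

e = 17                           # public exponent; must satisfy gcd(e, phi_n) == 1
assert gcd(e, phi_n) == 1
d = pow(e, -1, phi_n)            # private exponent (modular inverse, Python 3.8+)

m = 1234                         # message, encoded as an integer smaller than n
c = pow(m, e, n)                 # encryption:  c = m^e mod n
assert pow(c, d, n) == m         # decryption:  m = c^d mod n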
5 Working Environment
5.1 EUDORED
Although the implantation of modern models of complementary education, such as b-Learning or e-Learning, is not yet a reality in Spanish universities, their use has increased considerably in recent years. The University of Salamanca has a virtual environment to distribute its teaching: EUDORED (University of Salamanca environment for web learning). This tool is available for students and teachers to incorporate new educational technologies into the development of educational tasks, allowing virtual teaching. EUDORED is built on a technological structure that channels education through the Internet. In this way, it provides web tools that support the teacher-student interaction processes. The virtual campus, EUDORED (http://www.usal.es/eudored), is based on a web platform called Moodle (Modular Object Oriented Distance Learning Environment), a course management system designed to help educators create quality online courses. This platform is used by universities, schools, companies and independent teachers. Moodle is an open source software package and completely free to use (http://www.moodle.org).
5.2 Moodle Platform
Moodle is a virtual environment for education which allows contents and tasks to be placed on the web and provides online communication tools. The design and development of Moodle is guided by a particular philosophy of learning: a social constructionist philosophy. With this learning philosophy, people actively construct new knowledge as they interact with their environment, under the hypothesis that learning is more effective when you are constructing something. Constructivism is a philosophy of learning founded on the premise that, by reflecting on our experiences, we construct our own understanding of the world we live in. Each of us generates our own rules and mental models, which we use to make sense of our experiences. Learning, therefore, is simply the process of adjusting our mental models to accommodate new experiences. Constructivism calls for the elimination of a standardized curriculum. Instead, it promotes using curricula customized to the students’ prior knowledge. Also, it emphasizes hands-on problem solving [3]. Another important characteristic of Moodle is that it can be considered a set of Web 2.0 learning tools. It is known that Web 2.0 refers generally to web tools that, rather than serve as a forum for authorities to impart information to a passive or receptive audience, actually invite site visitors to comment, collaborate, and edit information, creating a more distributed form of authority in which the boundaries between site creator and visitor are blurred [17]. Web 2.0 is related to a perceived second generation of web-based communities and hosted services – such as social-networking sites, wikis and folksonomies – which aim to facilitate collaboration and information sharing between users. Although the term suggests a new version of the World Wide Web, it does not refer to an update to any technical specifications, but to changes in the ways software developers and end-users use the web.
5.3 Moodle Activities
One of the most important advantages of the Moodle environment is that it has implemented all the useful tools and activities needed for online classes and e-Learning in general. The following features are part of the learning environment:
1. Chat: The Chat module allows participants to have a real-time synchronous discussion via the web. This is a useful way to get a different understanding of each other and the topic being discussed.
2. Forums: It is in forums where most discussion takes place. Forums can be structured in different ways, and can include peer rating of each posting. The postings can be viewed in a variety of formats, and can include attachments.
3. Glossaries: This activity allows participants to create and maintain a list of definitions, like a dictionary. The entries can be searched or browsed in many different formats.
4. Hotpot: This module allows teachers to create multiple-choice, short-answer, jumbled-sentence, crossword, matching/ordering and gap-fill quizzes using Hot Potatoes software [10]. The Hot Potatoes suite is a set of six authoring
tools, created by the Research and Development team at the University of Victoria Humanities Computing and Media Centre. They enable you to create interactive Web-based exercises of several basic types. The exercises are standard web pages using XHTML code for display, and JavaScript for interactivity.
5. Lessons: A lesson delivers contents in an interesting and flexible way. It consists of a number of pages. Each page normally ends with a multiple choice question. Navigation through the lesson can be straightforward or complex.
6. Resources: Resources can be prepared files uploaded to the course server; pages edited directly in Moodle; or external web pages made to appear part of this course.
7. Wiki: A wiki is a web page that anyone can add to or edit. It enables documents to be authored collectively and supports collaborative learning. Old versions are not deleted and may be restored if required.
6 Course Objectives - Training in Cryptography
We have used Moodle to create a new interactive educational teacher-student context. Students need to construct their own understanding of each cryptographic concept, so the primary role of the teacher is not to explain, or attempt to ‘transfer’ knowledge, but to create situations for students that allow them to make the necessary mental constructions. In the 21st century, students are familiar with the Internet and with the new technologies. They usually use them to chat with friends, to send and receive e-mails, to meet people or to organize holidays, but they are not aware that these are also useful tools for their daily classes. Sometimes they do not consider it possible that personal computers and the Internet could be used effectively for classes about Mathematics or Cryptography. With the purpose of providing suitable training for the students, in each module we will give the students access to some interesting introductory documentation, and we will create a forum to discuss the current module. For example, with the RSA cryptosystem module we start a new Moodle activity, a questionnaire with different items related to the algorithms and the encryption and decryption processes. Other exercises will be proposed to the students so that they can comment on and debate them in the forums created for that purpose. Moreover, some theoretical questions or Hot Potatoes exercises, which enable the creation of interactive tests [10], will be proposed for the students’ assessment. Another interesting and practical exercise that we propose to the students is to generate their own keys, send encrypted messages to other students, and decrypt the messages that they receive from other classmates or from the teacher. All of this is possible using the cryptosystems studied during the course, because the students have developed some of them using Maple software. At the moment, the only way we check Maple source code is to write it in a text file and verify its functionality in Maple itself, but it could
be a good learning tool to have an API that allows students to write their own source code (in Maple, Matlab, Mathematica, ...) and to test it at the same time, without installing those programs on their PCs.
7 Conclusions
We have designed a new experience for teaching Cryptography at the University of Salamanca, Spain. The aim is to give the students a better understanding of security and cryptographic algorithms by using Maple software in a graduate-level course. In this paper we have proposed a web-based course, in line with the convergence of the European Higher Education Project, to increase the use of new Information and Communication Technologies. This course will be available for the students of the university in the virtual environment EUDORED, which is based on the Moodle platform and offers an accessible environment that is easy to work with. The course allows us to carry out a formative evaluation (focused on improving the security and Cryptography knowledge while the course is in progress) as well as a summative evaluation (focused on results or outcomes). To make both evaluations possible, students can access the online course via the Internet, including the theory, papers and links related to the topic of the subject. The students will e-mail all the questions, suggestions, or whatever they need to make this e-Learning possible. Moreover, they will have access to electronic chat rooms and forums to make online participation possible whenever they want. Acknowledgment. This work has been supported by Ministerio de Industria, Turismo y Comercio (Spain) in collaboration with Telefónica I+D (Project SEGUR@) with reference CENIT-2007 2004.
References
1. Blackstone, T.: Education and Training in the Europe of Knowledge (January 2008), http://www.uniroma3.it/downloads/297 Lezione%20Blackstone.doc
2. Bologna Declaration (January 2008), http://www.ond.vlaanderen.be/hogeronderwijs/bologna/documents/MDC/BOLOGNA DECLARATION1.pdf
3. Brooks, J., Brooks, M.: In Search of Understanding: The Case for Constructivist Classrooms, Revised Edition. ASCD (1999)
4. Chor, B., Rivest, R.L.: A knapsack-type public key cryptosystem based on arithmetic in finite fields. IEEE Trans. Inform. Theory 34(5), 901–909 (1988)
5. Durán Díaz, R., Hernández Encinas, L., Muñoz Masqué, J.: El criptosistema RSA. RA-MA, Madrid (2005)
6. ElGamal, T.: A public-key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inform. Theory 31, 469–472 (1985)
7. Fúster Sabater, A., de la Guía Martínez, D., Hernández Encinas, L., Montoya Vitini, F., Muñoz Masqué, J.: Técnicas criptográficas de protección de datos, 3rd edn. RA-MA, Madrid (2004)
8. García-Valcárcel Muñoz-Repiso, A., Tejedor Tejedor, F.J.: Current Developments in Technology-Assisted Education. In: Méndez-Vilas, A., Solano Martín, A., Mesa González, J.A., Mesa González, J. (eds.) FORMATEX (2006)
9. Hernández Encinas, L., Muñoz Masqué, J., Queiruga Dios, A.: Maple implementation of the Chor-Rivest cryptosystem. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3992, pp. 438–445. Springer, Heidelberg (2006)
10. Hot Potatoes Home page, http://hotpot.uvic.ca/
11. Jeschke, S., Richter, T., Scheel, H., Thomsen, C.: On Remote and Virtual Experiments in eLearning in Statistical Mechanics and Thermodynamics. In: Proceedings of the Fifth IEEE International Conference on Pervasive Computing and Communications Workshops, pp. 153–158. IEEE Computer Society, Los Alamitos (2007)
12. Kirschner, P.A.: Using integrated electronic environments for collaborative teaching/learning. Research Dialogue in Learning and Instruction 2(1), 1–10 (2001)
13. Mason, R.: Models of online courses. ALN Magazine 2, 2 (1998)
14. Menezes, A.: Elliptic Curve Public Key Cryptosystems. Kluwer Academic Publishers, Boston (1993)
15. Menezes, A., van Oorschot, P., Vanstone, S.: Handbook of applied cryptography. CRC Press, Boca Raton (1997)
16. Mollin, R.A.: An introduction to cryptography. Chapman & Hall/CRC, Boca Raton (2001)
17. Oberhelman, D.D.: Coming to terms with Web 2.0. Reference Reviews 21(7), 5–6 (2007)
18. Pérez i Garcías, A.: Nuevas estrategias didácticas en entornos digitales para la enseñanza superior. In: Salinas, J., Batista, A. (eds.) Didáctica y tecnología educativa para una universidad en un mundo digital. Universidad de Panamá, Imprenta universitaria (2002)
19. Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 21, 120–126 (1978)
20. Salinas, J.: Innovación docente y uso de las TIC en la enseñanza universitaria. Revista de Universidad y Sociedad del Conocimiento (RUSC) 1, 1 (2004)
21. Weiss, J., et al. (eds.): The International Handbook of Virtual Learning Environments, vol. 14. Springer, Heidelberg (2006)
An Introductory Computer Graphics Course in the Context of the European Space of Higher Education: A Curricular Approach
Akemi Gálvez, Andrés Iglesias, and Pedro Corcuera
Department of Applied Mathematics and Computational Sciences, University of Cantabria, Avda. de los Castros, s/n, E-39005, Santander, Spain
{galveza,iglesias,pcorc}@unican.es
Abstract. Currently, European countries are in the process of rethinking their higher education systems due to the harmonization efforts initiated by the so-called Bologna declaration. This process of reforms implies not only a re-evaluation of our way of teaching and learning but also a new curricular design in order to achieve the expected goals. In this context, the present paper focuses on the problem of teaching computer graphics, a discipline that is gaining increasing importance in the realm of Computational Science. The paper discusses some issues regarding an introductory course on computer graphics, taking into account its goals, contents, students’ profile and other factors derived from the new scenario given by the European Space of Higher Education.
1 Introduction
Bologna’s declaration - seen today as the well-known synonym for the whole process of reformation in the area of higher education - was signed in 1999 by 29 European countries with the objective of creating a European space for higher education in order to enhance the employability and mobility of citizens and to increase the international competitiveness of European higher education [4]. The utmost goal of this process is the commitment freely taken by each signatory country to reform its own higher education systems in order to create overall convergence at European level. Some major objectives of this approach are:
– the adoption of a common framework of readable and comparable degrees,
– the introduction of undergraduate and postgraduate levels in all countries along with ECTS (European Credit Transfer System) credit systems to ensure a smooth transition from one country’s system to another one and
– the promotion of free mobility of students, teachers and administrators among the European countries.
The ultimate goal is to ensure that the European higher education system acquires a worldwide degree of attractiveness equal to [Europe’s] extraordinary cultural and scientific traditions [4].
Unquestionably, Bologna’s declaration opened the door to a completely new scenario for higher education in Europe. Nowadays, the European countries are in the midst of the process of restructuring their higher education systems in order to attain the objectives of the declaration. At this time, the developments focus especially on academic aspects, such as the definition of the new curricula and grading systems. Although this issue represents a major concern for all areas of knowledge, it is particularly challenging for the field of Computational Science - understood in its most comprehensive meaning - because of its inherent dynamic nature and the continuous emergence of new topics and subfields within it. Among them, Computer Graphics (CG) has established itself as an integral part of Computational Science as well as a solid discipline in its own right. This paper concerns the problem of teaching Computer Graphics in the context of the European Space of Higher Education (ESHE). In particular, we present a proposal for a first (introductory) CG course, taking into account its goals, contents, students’ profile and other factors which will be discussed in this work. The structure of this paper is as follows: in Section 2 we describe the framework of CG over recent years (Section 2.1) as well as the profile of our students (Section 2.2), a key question for the adequate design of any course. Section 3.1 discusses some previous work on introductory CG courses. Then, we discuss our proposal, which consists of the course’s goals, syllabus, the teaching methods and some bibliography and complementary material used in this course (Sections 3.2 to 3.5). The main conclusions in Section 4 close the paper.
2 Framework
2.1 Computer Graphics in Recent Years
In recent years, CG has found remarkable applications in many other fields: Science, Engineering, Medicine, advertising, entertainment, etc., and the list is expanding rapidly. Current CG hardware is orders of magnitude faster and cheaper, and it is much more robust and powerful than earlier technology. Today, we are used to having powerful graphical cards in our desktop computers and laptops. These graphical cards are especially designed for high performance and often incorporate OpenGL or other APIs (Application Programmer Interfaces) in hardware. In addition, we have witnessed extraordinary advances in CG software:
– the appearance and acceptance of standardized APIs, such as OpenGL, Direct3D, Quick-Draw3D, Java3D, etc.
– extensive general-purpose libraries providing high level graphical capabilities and simplified GUIs (Graphical User Interfaces)
– powerful graphical programs (for instance, 3D Studio Max, LightWave or RenderMan) that provide the users with a large and very powerful collection of tools for design, rendering and animation
– the Web and its wealth of free demos, software tools (POVray, rayshade, VRML), data sets and examples, etc. (see, for instance, [19,29] or the nice ACM Siggraph repository at http://www.siggraph.org).
2.2 The Students
There is a general consensus that our current students are far different from their counterparts in the 1970s, 80s and 90s. From a CG education standpoint, two major features have been pointed out:
– on the one hand, today’s students exhibit less proficiency in Mathematics, Geometry and other related topics;
– on the other hand, they are much more accustomed to technology.
Regarding the first item, teachers and lecturers have noticed that today’s students encounter more problems than former ones in solving questions with mathematical content. In particular, their background (if any) on topics as valuable for CG as Linear Algebra, Analytic Geometry, Euclidean Geometry, Discrete Geometry or Differential Geometry is nowadays reduced to the minimum in most cases. Students find it difficult to solve logic problems in an axiomatic way, and are less skilled in deduction, mathematical intuition and scientific reasoning. By contrast, they have been surrounded by technology since their early days. Technology terms (pixel, navigation, interface) are thoroughly familiar to them; in fact, they are so used to all this technological stuff that they frequently cannot imagine the world without it. Today, devices like cell phones, notebook PCs, MP3 players, videogame controllers and fax machines are an essential part of their daily lives. Our Digital Age students have no trouble in using haptic devices such as joysticks or steering wheels and, more recently, Wii-like game controllers. They can readily identify aliasing artifacts in computer-generated imagery, in the same way only professionals did just a few years ago. They do not need to be trained about texturing, multiresolution or on-line multiplayer systems, concepts that can be easily captured from videogames. By the time students access the university they have spent thousands of hours watching TV, playing videogames, using computers and videocameras, surfing the web, talking by cell phone and sending and receiving e-mails and SMS messages. It is obvious that our approach to teaching CG subjects should take into account this new situation and adapt the course’s contents accordingly. An important factor to be considered is the new role Bologna’s declaration assigns to teachers and students. Although current worries focus on “external” topics such as the curricular design, the course’s contents and others, the upcoming changes go far beyond these structural changes, as the personal development of students and teachers is also at the root of this new concept of education. It is clear that the real change should come from the human parties. In what concerns the students, the new philosophy of self-learning and self-teaching demands a new approach to our course’s contents. Another issue is the need to generate a common core of knowledge for each subject so that students can freely move abroad and get their studies recognized. In order to do so, the curriculum must be clearly understood as being “somehow” similar, whatever that means. Of course, it does not mean that the course’s contents have to be the same; on the contrary, Bologna’s declaration encourages diversity as a means to enrich the global knowledge. The key word of the ESHE is convergence,
not uniformity. Because of that, the course contents need not be similar, but the skills acquired from learning those contents should be. In other words, the subjects must be equivalent rather than equal, and compatibility must therefore be established in these terms. Another issue is that, as this is their first course on CG, our students are still unaware of the topics and techniques involved. This fact dramatically restricts the course goals to simply offering a comprehensive overview of the fundamentals of CG as well as a gentle introduction to the main topics and techniques. Consequently, the tools used in the course must be carefully chosen in order to keep students from boredom or discouragement.
3 A First Computer Graphics Course

3.1 Previous Work
While courses on all computer science topics have evolved considerably in response to the impressive developments in hardware and software, this is particularly true for CG courses. See, for instance, the paper in [18] to appreciate how CG curricula have adapted to changes over the last three decades. Recently, many universities have included elective CG courses in their curricula, in response to new students' needs and interests. Although this is particularly the case for computer science students, it also holds for many other scientific studies, such as engineering. To quote just an example close to the authors, the education committees of some national chapters of Eurographics have contended that CG should be a compulsory subject in computer science studies. This view is also shared by many other professional and educational organizations worldwide. The new regulations for the ESHE have re-opened the discussion about what the CG curriculum should be. This has been a lively debate for years. For instance, the authors participated in a Joint Eurographics/ACM Siggraph panel on CG education in 1999 where the key topic was the definition of a new CG curriculum for the coming years [11]. Subsequent editions of this panel were held in 2002 [12] and 2004 [13]. We also mention the related event in [14]. The activities of those panels bore fruit, as new proposals followed up the fruitful discussions [5,7,17,26,30,31].
3.2 Course's Goals
Based on the previous considerations, we think that any proposal for an introductory CG course should restrict its objectives to:
– offer a comprehensive overview of the fundamentals of CG in terms of basic algorithms and techniques. Although "basic algorithms" can be understood in many different ways, it is almost unanimously recognized today that, in spite of their pedagogical value, many fundamental algorithms and procedures are no longer necessary. For instance, Bresenham's algorithms for lines and circles, scan conversion algorithms for polygons, clipping
procedures and the like are currently performed in very low-level hardware and, consequently, are of little use for learning graphics software techniques. On the other hand, the original algorithms are very often modified before being implemented in hardware. Lastly, many algorithms designed for optimal performance at the software level are today buried in hardware. Because current graphics cards are much more powerful than those of previous years, some "strategies" for software efficiency are not required anymore.
– develop the students' visual sense, which is not usually acquired from most traditional courses [31,32]. In our experience, this visual insight is better acquired when appropriate high-level 3D APIs are used. In particular, OpenGL [22] and Java3D [20] are excellent candidates: they are free and easily available to all students, require little (or no) effort to install, can be used with many standard compilers and tools, and support all the fundamental concepts needed for early work, as the short example below illustrates.
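For illustration, a minimal OpenGL/GLUT sketch of the kind of first program we have in mind (any GLUT implementation, such as freeglut, will do) opens a window and draws a color-interpolated triangle:

#include <GL/glut.h>

// Drawing callback: clear the window and draw one color-interpolated triangle.
void display() {
    glClear(GL_COLOR_BUFFER_BIT);
    glBegin(GL_TRIANGLES);
    glColor3f(1.0f, 0.0f, 0.0f); glVertex2f(-0.5f, -0.5f);
    glColor3f(0.0f, 1.0f, 0.0f); glVertex2f( 0.5f, -0.5f);
    glColor3f(0.0f, 0.0f, 1.0f); glVertex2f( 0.0f,  0.5f);
    glEnd();
    glFlush();                                    // force the drawing to the screen
}

int main(int argc, char** argv) {
    glutInit(&argc, argv);                        // initialize the toolkit
    glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB);  // single-buffered RGB window
    glutInitWindowSize(400, 400);
    glutCreateWindow("A first OpenGL program");
    glutDisplayFunc(display);                     // register the drawing callback
    glutMainLoop();                               // enter the event loop
    return 0;
}

A first interactive image is thus obtained in roughly twenty lines of code, the kind of early success that keeps beginners motivated.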
3.3 Syllabus
Table 1 shows a proposed syllabus for a first (introductory) CG course for undergraduate students. The course is intended to be as general as possible, in the sense that not only computer science students but also those from any other technical field (even from the arts!) can follow it. Differences lie in how deeply and to what extent a specific topic is covered. On the other hand, the course tries to gather as much material as possible (for an introductory course) so that teachers can select those parts they want to stress or that students are more interested in. Finally, the syllabus has been designed according to the principles and recommendations of the Bologna declaration, meaning that the topics are expected to be taught and learned in a participative and cooperative manner and that most of the load falls on the students, who should be able to carry out assignments and a very significant part of the educative process by themselves. The goal of Chapter 1 is to provide the students with a comprehensive overview of CG. Thus, students are given the basic bibliography so that they can freely study and learn by themselves, the teachers' role being rather that of a tutor and advisor, someone students can go to with questions, much like a personalized "Google". In addition, teachers can focus on tutoring the students with their assignments along with offering them guidance and complementary information that is not easily found in books and other sources. The chapter is also intended as a motivational chapter; interesting examples of real use of CG in several fields are given. Other sources of information (videotapes, web sites, etc.) are also presented. Finally, the history of CG is revisited by using some educational Siggraph clips and complementary material. Chapter 2 focuses on the hardware and software used for CG. From graphics cards and GPUs to haptic devices, virtual reality accessories and the most recent game controllers, students are presented with a complete gallery of hardware that accounts for the "machinery" of the field. The second part of the chapter focuses on the software. Special emphasis is placed upon shareware and freeware, with the twofold objective of reducing the institutional budget load required
Table 1. Proposed syllabus for an introductory Computer Graphics course

Chapter 1: Introduction to CG. Applications. Information sources. History of CG.
Chapter 2: Hardware and Software for CG and Virtual Reality.
Chapter 3: Basic Algorithms for CG.
Chapter 4: Geometric Modeling: Curves and Surfaces.
Chapter 5: Rendering I: Illumination Models. Ray tracing. Radiosity.
Chapter 6: Rendering II: Texturing and Color.
Chapter 7: Advanced Algorithms for CG.
Chapter 8: Computer Animation (Geometry, Physics, Kinematics, Behavior).
Chapter 9: Virtual characters and avatars. Virtual humans.
Chapter 10: Graphical User Interfaces. APIs. Development Toolkits.
Chapter 11: Multimedia. HD and SW for multimedia.
Chapter 12: Graphical formats for image, video, audio. Industrial formats.
Chapter 13: Fractals and L-systems. Applications.
Chapter 14: Graphics for the Web. Web3D.
Chapter 15: Recent trends. Final project.
for these courses, on the one hand, and encouraging students to use free software, thus lowering their educational costs, on the other. Fortunately, as pointed out by many authors [6,8,18,19,21,28], there is a wealth of very powerful freeware available, including several powerful APIs, such as the universally used and standard low-level OpenGL (evolved from SGI's proprietary GL and now available on nearly all platforms) or Java3D, graphical and virtual reality markup languages (VRML, X3D, XGL, XDML), visualizers (POVray, rayshade), etc. Other packages can be purchased at very reasonable rates. One striking thing is the lack of powerful open-source software for occasional CG users with limited (or no) experience in programming. The majority of freeware packages for CG, such as those listed above, force the user to write some source code in order to get something on the screen. In other words, they are well suited for computer science students, who are assumed to have programming abilities and know-how. This is not necessarily the case for students from other disciplines. For these reasons, a truly introductory course should include some pointers to commercial software for CG (3D Studio Max, Lightwave, Rhinoceros, Renderman and others), useful for end-users with very limited programming experience. This interplay between freeware and commercial software allows us to approach CG courses in two different (but partially overlapping) ways: one for students with limited programming abilities and/or a short background in related topics, based on the use of intuitive, user-friendly commercial software; and one for programming-skilled students, mostly relying on advanced APIs and their supporting frameworks for GUI development, such as GLUT (the OpenGL Utility Toolkit) for OpenGL. In fact, several initiatives support these approaches, from the Pixar University, based on their proprietary software Renderman, for the first approach, to courses based on OpenGL [2,31] or Java3D [18] for the second one.
Chapter 3 focuses on the basic algorithms for CG, including some background on vectors and matrices, 2D and 3D transformations, perspectives, cameras and projections. The main graphical primitives (points, lines, polygons, triangle fans and strips, etc.) are also studied. Based on our previous comments, Bresenham's line and scan conversion algorithms are only briefly mentioned (or completely omitted). The same applies to clipping techniques. By contrast, the visible line and surface algorithms (painter's algorithm, z-buffer) are analyzed. Chapter 4 deals with geometric modeling. As such, it is an essential ingredient of any CG course. Careful attention is paid to free-form parametric curves and surfaces, especially NURBS [23] (see the standard form recalled at the end of this section), because of their remarkable applications in computer design and industry. Some brief concepts of solid modeling might also be included in this chapter. Once the geometry of a graphical scene is generated, illuminating it is the next step. Chapter 5 concerns the illumination models used to tackle this issue. From the physics of light and color to sophisticated shading algorithms (ray tracing, radiosity), students are introduced to the fundamental principles of illumination techniques in an essentially visual way. This chapter is also important from the motivational viewpoint, as it is one of the chapters students enjoy most. The continuation is given by Chapter 6, where some texturing techniques (texture mapping, bump mapping, environment mapping) are learned. Even color itself is seen as a (flat) texture. Chapter 7 presents some advanced algorithms for CG. They range from special effects for movies to the simulation of natural phenomena (water, fire, smoke, fog, etc.). The chapter also introduces some ideas about computer animation, a topic addressed in Chapter 8. Starting with geometry and moving "up" through (forward and backward) kinematics and physics to behavior, the basic concepts of animation are covered in sequence, much like steps on a ladder. The top of the sequence is cognition, leading to the animation of virtual humans and avatars, covered in Chapter 9. This has been one of the most challenging tasks in CG for many years and the subject of great efforts very recently. Chapters 10 to 15 can be considered as part of a second course if needed, as they introduce supplementary (but still very important) material. Chapter 10 concerns CG software, introducing the students to the design of GUIs and the use of APIs and their associated development toolkits. Here instructors can choose an API of their choice or restrict themselves to some commercial software, according to the students' profile. The accompanying Chapters 11 and 12 are about multimedia and graphical formats. In these chapters students can relate their academic and real-life learning in a very smooth way, as the learning process happens naturally: the topics answer questions about their daily use of technology. Chapter 13 introduces the main topics about fractals and L-systems, two subfields that captured mass media interest during the 80s because of their outstanding applications to the simulation of rough, irregular objects and natural structures. In this sense, the chapter is linked with Chapters 7 and 8, so contents
can be swapped between chapters with little or no influence on the general picture of the course. Chapter 14 covers the interesting issue of graphics for the web. The recent appearance of powerful Web3D technologies is changing the way we approach the web, as web pages have traditionally been seen as unable to support 3D graphics technology. Note that this chapter can also be subsumed into Chapter 11, as the web itself is now regarded as a multimedia interface. Finally, Chapter 15 aims at reporting some recent trends of interest to the students. Media news might be analyzed in this chapter, thus reinforcing the applied, real-life character of the course. The chapter is also devoted to paving the way for the final course project.
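As an indication of the level of formalism intended for the geometric modeling chapter, the NURBS curves of Chapter 4 can be presented in their standard rational form (the usual notation, with control points P_i, weights w_i and B-spline basis functions N_{i,p} of degree p; see [23]):

C(u) = \frac{\sum_{i=0}^{n} N_{i,p}(u)\, w_i\, P_i}{\sum_{i=0}^{n} N_{i,p}(u)\, w_i}, \qquad a \le u \le b,

so that taking all weights w_i = 1 recovers the nonrational B-spline case, a connection students are expected to recognize.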
3.4 Teaching Methods
The course syllabus depends strongly on the number of teaching hours and their structure. Our course is designed for 75 lectures (of one hour each), of which 30 are given in the classroom and 45 in the computer labs. Each chapter (typically about 5 lectures, depending on the complexity of the topics involved) consists of a visual introduction to the subject, a theoretical explanation of the concepts and techniques, some computer work at the computer labs and homework. Since our objectives are much easier to achieve if the evaluation is based on practical assignments (one for each chapter of the syllabus) and the number of students is relatively low (no more than 30 per course), we have chosen this kind of evaluation. Assignments consist of a brief exposition on some topic or some implementation. Our students are required to perform these assignments by themselves, either as homework or as computer training at our labs. Of course, this structure demands significant time and effort from both students and teachers, making this approach feasible only for small groups of highly motivated students. At the end of the course, a final project on some of the topics of the course is also required. To this end, essential ingredients are a good bibliography and additional documentation and material, analyzed in the next section.
3.5 Bibliography and Complementary Material
Regarding the bibliography used in this course, students are advised to use the books in [9,25,27] for a gentle and general introduction to CG. These books also cover most of the course topics at a general (but still sufficient) level. For the computer training counterpart, the books in [2,16] are also recommended. Some topics, however, require additional bibliography. For geometric design we recommend the excellent books in [1,23], while illumination models are also analyzed in [15], and ray-tracing techniques in [10]. Additional bibliography for specific topics, such as fractals [3], could also be used. Additional material comprises ACM Siggraph videotapes, freeware/shareware programs and libraries available from the Web (see above), journals and additional documentation (manuals, presentations, slides, etc.). Finally, the implementations performed by former students are used to show new students possible course projects.
4 Conclusions
In this paper an introductory CG course is presented. The paper discusses the main issues involved in its design according to the ESHE regulations, such as the resources, goals, students' profile, course contents, teaching methods, and bibliography and complementary material. We also take very seriously the suggestions in [24]. Roughly, they establish that creating a single curriculum to meet all needs could be counterproductive, taking into account the different institutional resources, students' backgrounds, skills, goals and other factors. Instead, we encourage instructors to carefully analyze the real constraints and goals, and then to select accordingly an appropriate small set of concepts and skills we expect our students to acquire.

Acknowledgments. The authors would like to thank the University of Cantabria and the Spanish Ministry of Education and Science, National Program of Computer Science, Project Ref. #TIN2006-13615, for their financial support.
References

1. Anand, V.B.: Computer Graphics and Geometric Modeling for Engineers. John Wiley and Sons, New York (1993)
2. Angel, E.: Interactive Computer Graphics, a Top-Down Approach With OpenGL. Addison Wesley, Reading MA (1997)
3. Barnsley, M.F.: Fractals Everywhere, 2nd edn. Academic Press, Boston (1993)
4. The Bologna Declaration on the European space for higher education: an explanation. Association of European Universities & EU Rectors' Conference, p. 4 (1999), available at: http://ec.europa.eu/education/policies/educ/bologna/bologna.pdf
5. Bouvier, D.J.: From pixels to scene graphs in introductory computer graphics courses. Computers and Graphics 26, 603–608 (2002)
6. Cunningham, S.: Re-inventing the introductory computer graphics course: providing tools for a wider audience. In: Joint Eurographics/ACM Siggraph Proceedings of the Graphics and Visualization Education, GVE 1999 (1999)
7. Cunningham, S.: Powers of 10: the case for changing the first course in computer graphics. In: Proceedings of the 31st SIGCSE Technical Symposium on Computer Science Education, Austin, TX, pp. 46–49 (2000)
8. Figueiredo, F.C., Eber, D.E., Jorge, J.A.: A refereed server for educational CG content. In: Proceedings of EUROGRAPHICS 2003 (2003)
9. Foley, J.D., van Dam, A., Feiner, S.K., Hughes, J.F.: Computer Graphics. Principles and Practice, 2nd edn. Addison-Wesley, Reading (1990)
10. Glassner, A.: An Introduction to Ray Tracing. Academic Press, San Diego (1989)
11. Joint Eurographics/ACM Siggraph Symposium on Computer Graphics and Visualization Education, GVE 1999 (Coimbra, Portugal), http://education.siggraph.org/conferences/eurographics/gve99/reports/papers/gve-fullreport.pdf
12. Joint Eurographics/ACM Siggraph Symposium on Computer Graphics and Visualization Education, GVE 2002 (Bristol, UK), http://education.siggraph.org/conferences/eurographics/cge-02/report
13. Joint Eurographics/ACM Siggraph Symposium on Computer Graphics and Visualization Education, GVE 2004 (Hangzhou, China), http://education.siggraph.org/conferences/eurographics/cge-04Rep2004CGEworkshop.pdf
14. 2006 SIGGRAPH knowledge base: http://education.siggraph.org/curriculum/knowledge-base/report
15. Hall, R.: Illumination and Color in Computer Generated Imagery. Springer, New York (1989)
16. Hill: Computer Graphics Using OpenGL. Prentice-Hall, Englewood Cliffs, NJ (2000)
17. Hitchner, L., Cunningham, S., Grissom, S., Wolfe, R.: Computer graphics: The introductory course grows up. Panel session. In: Proceedings of the 30th SIGCSE Technical Symposium on Computer Science Education (SIGCSE 1999), New Orleans, LA, USA (1999)
18. Hitchner, L., Sowizral, H.: Adapting computer graphics curricula to changes in graphics. Computers and Graphics 24(2), 283–288 (2000)
19. Hunkins, D., Levine, D.B.: Additional rich resources for computer graphics educators. Computers and Graphics 26, 609–614 (2002)
20. Java3D web site: http://www.j3d.org
21. Luengo, F., Contreras, M., Leal, A., Iglesias, A.: Interactive 3D graphics applications embedded in web pages. In: Proceedings of Computer Graphics, Imaging and Visualization-CGIV 2007, pp. 434–440. IEEE Computer Society Press, Los Alamitos, California (2007)
22. OpenGL web site: http://www.opengl.org
23. Piegl, L., Tiller, W.: The NURBS Book, 2nd edn. Springer, Berlin Heidelberg (1997)
24. Roberts, E., LeBlanc, R., Shackelford, R., Denning, P.J.: Curriculum 2001: Interim Report from the ACM/IEEE-CS Task Force. In: Proceedings of the Thirtieth SIGCSE Technical Symposium on Computer Science Education, pp. 343–344 (1999)
25. Rogers, D.F.: Procedural Elements for Computer Graphics, 2nd edn. McGraw-Hill, New York, Boston (1998)
26. Taxen, G.: Teaching computer graphics constructively. Computers and Graphics 28, 393–399 (2004)
27. Watt, A.: 3D Computer Graphics. Addison Wesley, Reading (2000)
28. Wolfe, R.J.: OpenGL: Agent of change or sign of the times? In: Computer Graphics November 1998, pp. 29–31 (1998)
29. Wolfe, R.J.: 3D Freebies: a guide to high quality 3D software available via the Internet. In: Computer Graphics May 1998, pp. 30–33 (1998)
30. Wolfe, R., Bailey, M., Cunningham, S., Hitchner, L.: Going farther in less time: responding to change in the introductory graphics courses. In: Computer Graphics Annual Conference Series, ACM SIGGRAPH (SIGGRAPH 1999), Los Angeles, CA (1999)
31. Wolfe, R.: Bringing the introductory computer graphics course into the 21st Century. Computers and Graphics 24(1), 151–155 (2000)
32. Wolfe, R.J.: 3D Graphics. A Visual Approach. Oxford University Press, New York (2000)
Collaborative Environments through Dialogues and PBL to Encourage the Self-directed Learning in Computational Sciences

Fernando Ramos-Quintana1, Josefina Sámano-Galindo2, and Víctor H. Zárate-Silva1

1 Tecnológico de Monterrey, Campus Cuernavaca, Autopista del Sol, KM 104, Xochitepec, Morelos, CP 62790, México
2 PhD student of Tecnológico de Monterrey and teacher at Instituto Tecnológico de Zacatepec, Calzada Tecnológico No. 27, Zacatepec, Morelos, CP 62780, México
{fernando.ramos, A00334774, vzarate}@itesm.mx, [email protected]

Abstract. Self-directed learning has been one of the main objectives in the education domain. A learning model can drive self-directed learning if an adequate educational environment is built. We propose an educational environment to encourage self-directed learning, composed of a collaborative computer tool that uses dialogues and the concept of ill-structured problems. The knowledge being learned is represented by a network of concepts built by the students through the exchanged messages. The network of concepts expresses the relations between the main concepts of the topic being learned. A coherent network is tangible proof that the process of self-directed learning has been correctly achieved. Two topics of computer science are reported: Object Oriented Programming and Case Based Reasoning. The results show that, along with the knowledge acquired, self-directed learning contributes directly to the development of problem-solving skills and attitudes of collaborative work.

Keywords: self-directed learning, collaborative learning, dialogues, ill-structured problems.
1 Introduction

A concrete definition of self-directed learning (SDL) was given by Gibbons: "In self-directed learning (SDL), the individual takes the initiative and the responsibility for what occurs" [1]. Gibbons proposed some important elements to be taken into account in SDL, described as follows: students should control as much of the learning experience as possible; the development of skills; self-challenge after having been challenged by teachers; self-management of their time, effort and the resources they need to conduct their work; self-motivation and assessment of their own efforts. Hiemstra [2] pointed out the following relevant aspects defining self-directed learning: more responsibility of individual learners; self-direction is not necessarily carried out in isolation from others; self-directed learners develop skills to transfer learning, in terms of both knowledge and study skills, from one situation to another; teachers in self-directed learning can dialogue with learners, evaluate outcomes and promote critical thinking.
From the side of learners, Garrison affirmed that more responsibility and control are fundamental in self-directed learning [3]. Meanwhile, Ryan [4] pointed out that teachers in higher education should assume more responsibility for helping students to develop as self-directed learners in their courses, and proposed problem-based learning as a potential educational framework within which this assistance can be provided. In the pediatric domain, Ozuah et al. [5] were concerned with the effect of PBL on self-directed learning, concluding that pediatric residents exposed to PBL engage in significantly higher levels of self-directed learning than their counterparts. The use of problem-based learning (PBL) as an instructional methodology in undergraduate nursing curricula has been identified as one way to facilitate the development of nursing students' abilities to become self-directed in learning [6]. Ill-structured problems (ISPs) are the backbone of problem-based learning. ISPs are the key concept used in the work presented in this paper to start the self-directed learning process. ISPs are similar to the problems found in real scenarios [7]. These types of problems represent important challenges to the students because their problem space is weakly defined. On the contrary, well-structured problems (WSPs) are those whose problem space has been well defined, in such a way that the bridge that joins an ISP with a WSP could be represented by the knowledge to be acquired [8]. An ill-structured problem can be a real problem able to incite the discovery of concepts and become the root for the construction of a network of concepts representing the knowledge necessary to support a good understanding of a problem [9]. ISPs are commonly faced in everyday human experience; they have various solutions and multiple solving processes depending on the solver's perception [10]. ISPs can be defined as problems that do not have known solutions. Experts in the domain do not agree on whether a particular solution is appropriate because there are various solutions and solution paths [11], [12], [13], [14]. Many of the problems we face in real life are ill-structured, including important social, political, economic, and scientific problems in the world [15], [14], [16]. Some authors cited above expressed the need for additional elements to facilitate self-directed learning. For instance, it has been shown that PBL facilitates self-directed learning, and other additional elements can further support it. Hammond and Collins [17], for instance, recommended the construction of a cooperative learning climate for self-directed learning environments. Collaborative learning has been defined by several authors as: a situation within which two or more people learn or attempt to learn something together [18] [19] [20]; in collaborative learning, students work together to maximize their own learning [21]; collaborative group learning occurs when individuals interact through shared inquiry to construct their understanding of each other and their social worlds [22]. Several tools for working collaboratively have been proposed. The C-CHENE CSCL system was designed to facilitate computer-mediated interaction between human learners.
The specific task studied requires students to construct qualitative models for energy storage, transfer and transformation using a specially designed graphical interface [23], [24]. C-CHENE uses two different communication interfaces to examine reflective interactions. A dialogue box allows the exchange of typewritten
texts between learners in the first interface, while the second one promotes interaction using interface buttons for specific speech acts. OSCAR is a framework based on speech-act theory that attempts to support students involved in collective activities, and tutors in their perception of these activities, by structuring textual communication through chats and forums. Woojin Paik et al. [25] aim to develop intervention techniques to identify and remove obstacles of online learning groups. A secondary goal is to analyze the conversations of virtual group members using computational linguistics techniques. The final goal is to build an automatic system able to monitor the activities of the online learning group members and alert the instructors when the members encounter barriers. Other authors are concerned with the study of dialogue patterns, which can help to build systems to assist student-student or student-teacher interactions in collaboration tasks. Ravi and Kim [26] revealed the need to determine patterns of student interactions in online discussions for better information management and assistance. Pilkington [27] pointed out that formal and computational models of dialogue patterns extracted from student-tutor and student-student interaction could help to bridge the gap between the empirical investigation of interaction and the design of Intelligent Educational Systems (IESs) that interact with students. In this work, the model of self-directed learning is supported by a collaborative computer tool called FreeStyler, developed by the COLLIDE group [28] [29]; ill-structured problems; and dialogues and drawing actions whose purpose is to build a network of concepts that validates the knowledge acquired. The construction of a coherent network of concepts is carried out in several sessions. The network of concepts starts to be built as follows: a concept, represented by a circle node, is transformed through a rectangle node (transition) to obtain a new concept. The rest of the network is built following the same procedure. The exchanged messages of the dialogues contain collaborative actions to build the network. The network of concepts built in this work has a certain similarity with the network of domain concepts built in Adaptive Hypermedia (AH) systems, where such a network represents the user's knowledge of the subject [30] [31]. In AH systems, the concepts are related with each other, thus forming a kind of semantic network which represents the structure of the subject domain. However, in the present work the links between concepts are given not only by static relations but also by dynamic relations. For instance, a given concept is transformed into another one by applying actions, operations, mathematical methods, etc. As in Petri nets, the transformations take place through special nodes representing transitions, as illustrated in Fig. 2. The self-directed learning process proposed in this work can be carried out at two different levels of difficulty. The two levels are differentiated from each other by the degree of participation of the instructor. At the first level, the instructor has provided the students with a set of basic concepts of a topic to be learned during a short presentation, without treating such concepts in detail. Then, the students aim to reinforce or reaffirm by themselves the knowledge weakly acquired previously. The reinforcement or reaffirmation of knowledge is dynamically carried out as they build a coherent network of concepts.
The instructor validates the coherence of such a network. As a result of this level, the students have learned how to structure the concepts of a topic and have developed skills of analysis, synthesis and abstraction. At the second level, the students are provided with an ill-structured problem that contains constructors which
guide the start of the self-learning process, aiming to acquire new knowledge. As in the first level, the construction of a coherent network of concepts validates that knowledge has been correctly acquired. The instructor in this case assumes the role of advisor, giving tips and suggestions to build the network of concepts. He also validates the coherence of the network of concepts. Both levels use the computer tool to exchange messages, thereby building dialogues, and to draw geometric figures to build the network of concepts. In this work, messages are classified as speech acts which change the state of the universe through the perlocutionary effects due to the implicit actions involved in them, as in the work of Allen and Perrault [32]. The topics to be learned are the following: Case Based Reasoning (CBR) and Object Oriented Programming (OOP). The topic of OOP is described in this work. CBR was at the level of reinforcement or reaffirmation of knowledge and OOP at the level of acquiring new knowledge. This paper is organized as follows: section 1 introduced the context of the work, the relevant related works and a general description of the model proposed in this paper; in section 2 the process of self-directed learning proposed in this work is shown, and it is illustrated how the ISP is provided to start the construction of the network of concepts, together with some details about the construction of the network; the analysis of results is treated in section 3, where, although the main focus is on the performance of the self-directed process, short comments are also given about the analysis of dialogues and its importance in this research work; conclusions and future work are presented in section 4; finally, important references related to this work are provided in the last section.
2 The Process of Self-directed Learning

Fig. 1 illustrates the process of self-directed learning proposed in this work. This scheme applies to the second level. As we can see, at the beginning of the process the teacher provides the students with an ill-structured problem (ISP). After carrying out an analysis (block I), the students themselves identify the constructors and start the construction of the network.

Fig. 1. The process of self-directed learning (ISP provided by the teacher; (I) analysis of the ISP by the students; (II) construction of the network of concepts by the students; (III) review of the network of concepts by the teacher, who returns tips or suggestions; (IV) analysis, synthesis, abstraction and restructuring by the students; (V) final network of concepts at the end of the sessions)
The output of block II is a network to be reviewed by the instructor in block III, who proposes tips or suggestions to update the network. The students analyze, synthesize, make abstractions and reorganize the structure to update the network in block IV. The process is repeated until the end of the sessions or until the students obtain a coherent network of concepts.

2.1 Building the Network of Concepts

2.1.1 Statement of the ISP
For the learning of the OOP topic, an ISP is exposed to the students. For instance, a statement of a real example could be the following: "the distribution of products of an industry can be modeled using the paradigm of Object Oriented Programming (OOP)". The constructors or openers extracted from the statement defined above are "distribution", "model" and "OOP". Most of the students choose to investigate the meaning of OOP. The teacher suggests investigating relevant concepts of OOP. Then, students build basic relations between two concepts, such as relating the concept of "object" to the concept of "class". The students found several important concepts involved: object, classes, subclasses, simple and complex inheritance, methods, polymorphism and generalization, among the most important. During the collaborative sessions, the students built a network by linking concepts until a complete network was achieved (see Fig. 1 for the process of self-directed learning). Links represent desirable correct relations between concepts. The network is built dynamically as the messages are being exchanged, because messages contain actions. A guide or mentor assists the students in order to review the coherence of the network. The students could consult the mentor one time per session. Each session took 90 min. The network was built during four sessions, which took one month.

2.1.2 Construction of the Network of Concepts
Fig. 2 shows the concept of "Objects", to which the action of "Grouping" has been applied to obtain the concept of "Class". The corresponding dialogue to obtain the nodes related to Fig. 2 is shown below; a sketch of one possible representation of such a network in code is given after the figure. The dialogues have been translated from Spanish to English in order to facilitate their reading in this paper. The predicates marked in italics represent drawing actions to build the network.

Express(A,J, "Hello") < Express(J,A, "What's up?") < Question(J,A, "are you ready?") < Answer(A,J, "yes") < Proposition(A,J, Now, "we have to begin with class and object, ok?") < Accept(J,A, "Yes") < DrawNode(A, Objects) < DrawNode(A, Class) < DrawTransition(A, Grouping)

Fig. 2. The construction of the Class concept from the concept of Objects
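A minimal sketch of how such a network and its drawing actions could be represented in code is the following (the type and function names are illustrative only and do not correspond to the internal representation of FreeStyler):

// A concept network as a Petri-net-like bipartite structure: concept nodes
// (circles) are linked through transition nodes (rectangles) labeled with actions.
#include <iostream>
#include <string>
#include <vector>

struct Transition {
    std::string action;   // e.g. "Grouping", "Specification"
    std::string from;     // source concept
    std::string to;       // target concept
};

struct ConceptNetwork {
    std::vector<std::string> concepts;     // circle nodes
    std::vector<Transition> transitions;   // rectangle nodes

    void drawNode(const std::string& concept) {            // DrawNode(A, Objects)
        concepts.push_back(concept);
    }
    void drawTransition(const std::string& action,
                        const std::string& from, const std::string& to) {
        transitions.push_back({action, from, to});         // DrawTransition(A, Grouping)
    }
};

int main() {
    ConceptNetwork net;
    // Actions extracted from the exchanged messages of Fig. 2:
    net.drawNode("Objects");
    net.drawNode("Class");
    net.drawTransition("Grouping", "Objects", "Class");

    for (const Transition& t : net.transitions)
        std::cout << t.from << " --" << t.action << "--> " << t.to << "\n";
    return 0;
}

Such a structure also makes it straightforward to list the relations obtained so far (e.g. Objects --Grouping--> Class), which are the relations later checked for coherence.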
The process to build the network is based on the one described in Fig. 1. The rest of the dialogue and the process of construction of the whole network are not shown because of lack of space. Fig. 3 shows an example of the whole network of concepts obtained by one of the pairs of students. The environment to build the network
includes a window to exchange messages and a menu of geometric figures to draw the network.

Fig. 3. An example of a whole network of concepts
3 Analysis of Results

24 pairs of students participated in "Case Based Reasoning" and 20 pairs in OOP. The task of reinforcing knowledge previously learned was focused on CBR, while the task of acquiring new knowledge was focused on OOP, which took four sessions during one month. Although the main interest was to build a coherent network of concepts, we were also interested in measuring the development of skills to analyze, synthesize and build structures. These variables have been measured through the process of building the network of concepts based on a bottom-up approach. That is, at the bottom level the students should list the main concepts of the topic under study; then, at the intermediate level, they should establish the main relationships between pairs of concepts (for instance, objects are grouped to form classes of objects); and finally, at the top level, they should assemble the whole network of concepts. The progress of the students in correctly building the network of concepts was then measured. For the analysis aspects, we were concerned about the capacity of students to establish the single and relational concepts to be used in the construction of the network. For the synthesis aspects, the students should build a synthesized but expressive network. For the construction of a structured network, the result should reveal a coherent network, where the whole network has a correct semantic meaning. We will show only the analysis of results related to OOP technology. In a good analysis, students should be able to extract the following single concepts and establish the correct relations between them. At the bottom level of abstraction, the main concepts of OOP: Object, Attributes, Class, Subclass, Subsubclass, Instance, Instantiation, Grouping, Generalization,
Over-generalization, Specification, Over-specification, Inheritance, Characterization, Metaclass, Superclass. At the intermediate level of abstraction, the main relationships between concepts: Object-Grouping-Class; Class-Specification-Subclass; Subclass-Generalization-Class; Class-Overgeneralization-Superclass; Attributes-Characterization-Object; Subclass-Overspecification-Subsubclass; Subclass-Instantiation-Instance; Class-Inheritance-Subclass. At the top level, the network of concepts should be like the one shown in Fig. 3. In spite of the fact that the experience of reinforcing or reaffirming the knowledge of CBR was important to develop skills of analysis, synthesis and structure building, an average of 70% of the whole network of concepts was reached by the couples in OOP after the first two sessions. This is due to several reasons: the couples had different background knowledge in computer subjects; the autonomy attitudes were not developed to the same degree; the students did not have the same level of capacity for analysis, synthesis and construction of structures; finally, the mentor-guide assisted the students only two times as a strategy to encourage the autonomous work of the students. The guide gave tips or suggestions but neither corrections nor solutions. At the end of the four sessions we obtained the following results: 11 couples got a network 100% correct; 4 couples made one or two non-relevant errors, which can be judged as 90% correct; 3 couples made three non-relevant errors; and 2 couples made four errors, two of them very important from the point of view of the coherence of the network. As an example, an important error is when subclasses were not obtained from classes. The same exercise of building a network of concepts was applied to 24 pairs of students who had previously learned OOP in a traditional fashion. The results showed that only 10 pairs of the students built a network 100% correct, requiring important assistance from the guide. The rest of the students made an average of more than four mistakes. For instance, they were not able to extract the most important concepts. Consequently, the construction of relations was incomplete. This kind of error resulted in an incoherent network of concepts. The skills of analysis, synthesis and construction of structures were weak in this group. Based on the results obtained, we consider that the environment composed of the collaborative computer tool, through dialogues and PBL, in particular ISPs, helped to encourage self-directed learning. This kind of environment induces the development of skills such as analysis, synthesis and construction of structures. An advantage of the FreeStyler tool is that it allows the students to communicate in a natural way. However, it is advisable to determine a set of rules to carry out the communication; otherwise, because there are no restrictions on the way of exchanging messages, students can lose control and neglect the main purpose of the conversation. The computer tool could have openers to build the network in order to help students start its construction. These openers could be a list of single and relational concepts relevant to the topic under study. This work is focused on the self-directed learning process and the environment adequate for this purpose. However, we consider it important to point out the work done to study the behavior of students through the dialogues, or sequences of exchanged messages between students.
The goal in this case was to discover patterns of dialogues capable of revealing certain behaviors of students as messages were being
exchanged. Thus, if dialogue patterns exist, they could serve as a guide to monitor and advise the students in case of deviations or obstacles as the network of concepts is being built. We have discovered patterns of dialogues related to phases of salutation satisfying the protocol of saying hello, at the beginning of the dialogue, and goodbye, at the end of the dialogue. In an intermediate phase the students practically start and finish the construction of the network. An overlap between neighboring phases was characterized by a drawing action indicating the start and the end of the network construction, respectively. Some of the most important factors of distraction can occur in the initial and final phases, just where the students establish dialogues about the protocols of saying hello or goodbye. This kind of distraction could be related to cultural factors. Other patterns of sub-dialogues emphasized the illocutionary force of a speech act within the sequence of the dialogue. The specialized sub-dialogues can be: ESD (Expressive Sub-Dialogue); QSD (Question Sub-Dialogue); PSD (Proposition Sub-Dialogue); RSD (Request Sub-Dialogue).
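As a rough illustration of how exchanged messages might be tagged with these categories, a keyword-based classifier could look as follows (an illustrative sketch only, not the procedure used in this work):

// Illustrative keyword-based tagging of messages into sub-dialogue categories.
#include <iostream>
#include <string>

enum class SubDialogue { ESD, QSD, PSD, RSD };  // Expressive, Question, Proposition, Request

SubDialogue classify(const std::string& msg) {
    if (msg.find("we have to") != std::string::npos ||
        msg.find("let's") != std::string::npos)       return SubDialogue::PSD;
    if (msg.find('?') != std::string::npos)           return SubDialogue::QSD;
    if (msg.find("please") != std::string::npos)      return SubDialogue::RSD;
    return SubDialogue::ESD;                          // greetings and other expressive acts
}

int main() {
    // Messages taken from the dialogue of Fig. 2:
    std::cout << static_cast<int>(classify("are you ready?")) << "\n";                        // 1 = QSD
    std::cout << static_cast<int>(classify("we have to begin with class and object, ok?"))
              << "\n";                                                                        // 2 = PSD
    return 0;
}

An assistance system of the kind envisaged above would, of course, require a far more robust classification of speech acts than this simple keyword matching.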
4 Conclusions

Encouraging self-directed learning requires a propitious environment. The environment proposed in this work was composed basically of a collaborative computer tool through which students exchanged messages to build a network of concepts. In addition, PBL helps, through ill-structured problems (ISPs), to challenge students and spark their interest in investigating concepts around the topic being studied. The idea of increasing the degree of difficulty in becoming a self-learner, starting with the reaffirmation or reinforcement of knowledge and moving to the acquisition of new knowledge, helps to develop and/or improve skills such as analysis, synthesis and the construction of structures. These skills are necessary for improving the results of self-directed learning. We could verify that students can improve their skills to analyze, synthesize and build structures through the construction of a network of concepts. In addition, the proposed environment encourages students to improve attitudes of autonomy and team work. We have confirmed that the efficiency in building the network of concepts of students who learned by the self-directed method was better than that of students who learned under traditional methods. A tool such as the one used in this work is adequate for work in pairs of students; with more than two students the use of the tool could become very complicated. The students used their own way of speaking. Culture and language are factors that certainly influence the number of messages exchanged to achieve the task of building the network of concepts. The risk of distraction can occur if students are not concentrated on the objectives, which can increase the number of useless messages exchanged. Thus, one of the purposes of discovering dialogue patterns is to build a computer assistance system that can aid the students when they find obstacles or help them end deadlocks. We consider that the patterns are in a certain way generalized because they were independent of the pairs of students. However, we need more than two or three domains of study, other than CBR or OOP, and studies in other cultural environments and languages, to say that the patterns are generalized.
References

1. Gibbons, M.: The Self-Directed Learning Handbook. Wiley, Chichester (2002)
2. Hiemstra, R.: Self-directed learning. In: T.H.T.N.P. (ed.) The International Encyclopedia of Education, 2nd edn., Pergamon Press, Oxford (1994)
3. Garrison, D.R.: Critical Thinking and Self-Directed Learning in Adult Education: An Analysis of Responsibility and Control Issues. Adult Education Quarterly 42(3), 136–148 (1992)
4. Ryan, G.: Student perceptions about self-directed learning in a professional course implementing problem-based learning. Studies in Higher Education 18(1), 53–63 (1993)
5. Ozuah, P.O., Curtis, J., Stein, R.E.K.: Impact of Problem-Based Learning on Residents' Self-directed Learning. Arch. Pediatr. Adolesc. Med. 155, 669–672 (2001)
6. Williams, B.: The Theoretical Links Between Problem-based Learning and Self-directed Learning for Continuing Professional Nursing Education. In: Teaching in Higher Education, pp. 85–98. Routledge (2001)
7. Jones, D.: What is PBL: Going Beyond Content, San Diego State University (1996)
8. Ramos, F., Espinosa, E.: A model based on Petri Nets for guiding the process of learning concepts needed to transform ill-structured problems into well-structured ones. In: PBL 2004 International Conference, Cancun (2004)
9. Ramos, F., Espinosa, E.: A self-learning environment based on the PBL approach: an application to the learning process in the field of robotics and manufacturing systems. The International Journal of Engineering Education 19(5) (2003)
10. Hong, N.S.: The relationship between well-structured and ill-structured problem solving in multimedia simulation, Pennsylvania State University (1998)
11. Jonassen, D.H.: Instructional Design Models for Well-Structured and Ill-Structured Problem-Solving Learning Outcomes. ETR&D 45(1), 65–94 (1997)
12. Reitman, W.R.: Cognition and thought, p. 312. Wiley, New York (1965)
13. Voss, J.F.: Problem solving and reasoning in ill-structured domains. In: C.A. (ed.) Analyzing everyday explanation: A casebook of methods, pp. 74–93. SAGE Publications, London (1988)
14. Voss, J.F., Means, M.L.: Toward a model of creativity based upon problem solving in the social sciences. In: Glover, R.R.R.J.A., Reynolds, C.R. (eds.) Handbook of creativity, pp. 399–410. Plenum, New York (1989)
15. Sinnott, J.D.: A model for solution of ill-structured problems: Implications for everyday and abstract problem solving. In: J.D.S. (ed.) Everyday problem solving: Theory and applications, pp. 72–99. Praeger, New York (1989)
16. Voss, J.F., Lawrence, J.A., Engle, R.A.: From representation to decision: An analysis of problem solving in international relations. In: Complex problem solving, Lawrence Erlbaum, Hilldale (1991)
17. Hammond, M., Collins, R.: Self-Directed Learning: Critical Practice. Nichols/GP Publishing, 11 Harts Lane, East Brunswick, NJ 08816. 250 (1991)
18. Dillenburg, P., et al. (eds.): The evolution of Research on Collaborative Learning. Learning in Humans and Machines. Spada & Reimann (1996)
19. Dillenburg, P., Neville Bennet, E.D., Vosniaou, S., Mandl, H.: Collaborative Learning, Cognitive and Computational Approaches. Advances in Learning and Instruction Series. Van Someren, Reimann, Boshuizen & De Jong, Elsevier Science Ltd, Oxford (1999)
20. Collazos, C.A., et al.: Evaluating Collaborative Learning Process, in Department of Computer Science, Universidad de Chile, Santiago, Chile (2001)
21. Johnson, D.W., Johnson, F.P.: Joining Together, Group Theory And Group Skills. Allyn And Bacon (2000)
22. Ziegler, M.F., Paulus, T.M., Woodside, M.: Making Meaning Through Dialogue in a Blended Environment. Journal of Transformative Education (4), 302 (2006)
23. Lund, K., Baker, M.J., Baron, M.: Modelling Dialogue and Beliefs as a Basis for Generating Guidance in a CSCL Environment. In: Lesgold, A., Frasson, C., Gauthier, G. (eds.) ITS 1996. LNCS, vol. 1086, pp. 206–214. Springer, Heidelberg (1996)
24. Baker, M.J., Lund, K.: Promoting reflective interactions in a computer supported collaborative learning environment. Journal of Computer Assisted Learning (13), 175–193 (1997)
25. Paik, W., Lee, J.Y., McMahan, E.: Facilitating collaborative learning in virtual (and sometimes mobile) environment. In: Bussler, C.J., Hong, S.-k., Jun, W., Kaschek, R., Kinshuk, Krishnaswamy, S., Loke, S.W., Oberle, D., Richards, D., Sharma, A., Sure, Y., Thalheim, B. (eds.) WISE 2004 Workshops. LNCS, vol. 3307, pp. 161–166. Springer, Heidelberg (2004)
26. Ravi, S., Kim, J.: Profiling Student Interactions in Threaded Discussions with Speech Act Classifiers. In: Proceedings of AI in Education Conference (AIED 2007) (2007)
27. Pilkington, R.: Analyzing Educational Dialogue Interaction: Towards Models that Support Learning. International Journal of Artificial Intelligence in Education (12), 1–7 (2001)
28. Hoppe, U.: COLLIDE. Faculty of Engineering, University of Duisburg-Essen (2006), http://www.collide.info/background
29. Muehlenbrock, M., Tewissen, F., Hoppe, U.: A Framework System for Intelligent Support in Open Distributed Learning Environments. In: Proceedings of 8th World Conference on Artificial Intelligence in Education (AIED 1997), Kobe, Japan, August 20-22 (1997)
30. Brusilovsky, P.: Methods and techniques of adaptive hypermedia. User Modeling and User-Adapted Interaction Journal 6, 87–129 (1996)
31. Brusilovsky, P.: Adaptive Hypermedia. In: UMUAI: User Modeling and User-Adapted Interaction, Netherlands, pp. 87–110 (2004)
32. Allen, J.F., Perrault, C.R.: Plans, inference, and indirect speech acts. In: Annual Meeting of the ACL, Proceedings of the 17th annual meeting on Association for Computational Linguistics, Morristown, NJ, Association for Computational Linguistics, La Jolla, California (1979)
The Simulation Course: An Innovative Way of Teaching Computational Science in Aeronautics

Ricard González-Cinca1, Eduard Santamaria2, and J. Luis A. Yebra3

1 Department of Applied Physics
2 Department of Computer Architecture
3 Department of Applied Mathematics IV
Castelldefels School of Technology (EPSC), Technical University of Catalonia (UPC)
Av. del Canal Olímpic, 15 - 08860 Castelldefels, Spain
[email protected]
Abstract. This article describes an innovative methodology for teaching an undergraduate course on Computational Science, with a particular emphasis in Computational Fluid Dynamics (CFD), and the experiences derived from its implementation. The main activities taking place during this course are: development by students of a training project on a topic in materials science, development of a larger CFD project, and an introduction to a CFD commercial package. Projects are carried out by groups of students and are assigned from a set of different available possibilities. Project development consists in the implementation in code of the corresponding mathematical models and a graphical interface which permits the visualization of the results derived from the numerical resolution of the models. The main innovative aspects of the methodology are the use of Project Based Learning combined with the participation of lecturers from different areas of expertise. Other innovative issues include the opportunity for students to practice skills such as report writing, doing oral presentations, the use of English (a foreign language for them) and the use of Linux as the development environment.
1 Introduction
The Simulation course belongs to the bachelor's degree in Aeronautical Engineering (Air Navigation specialization) offered at the Technical School of Castelldefels (EPSC) of the Technical University of Catalonia (UPC). The bachelor's degree in Aeronautical Engineering started in 2002 and Simulation has been run since 2004. It is an elective course in which students can enrol during the third year of their studies. The course complements previous training, which mainly focuses on Air Navigation topics. Simulation has always been developed with the idea of applying an innovative methodology for teaching Computational Science, with a particular emphasis on CFD. The broad context in which Simulation is being developed requires an introduction in order to better understand its innovative aspects. Spanish higher
education system is currently in a process of adaptation to the new European framework. Technical universities are composed of schools or faculties and departments. In the still-active system, schools have the role of offering degrees which last either three years ('technical engineering' or bachelor degree) or five years ('higher engineering' degree or bachelor + master degree). The unifying characteristic of the members of a department is their area of expertise, that is, their scientific background and research interests (e.g. physics or mathematics). Courses in the degree curricula are assigned to the department whose area of expertise is closest to the course. Although several courses have characteristics shared by different areas, in Spanish universities they are usually assigned to just one department; in other words, it is very unlikely that lecturers from different departments participate in the same course. The two main features of the innovative methodology presented in this paper are the use of Project Based Learning [1,2,3] and the participation of lecturers from three main areas: physics, mathematics and computation. In Section 2, the objectives of the course in the three areas as well as the skills expected to be acquired by students are presented. Section 3 is devoted to the description of the methodology used, including an overview of the projects and information about the course organisation. The implemented system for the evaluation of students' progress is presented in Section 4. Finally, remarks on the experiences obtained during the last three years are presented.
2 Objectives
The main scientific objective of the Simulation course is an introduction to the application of numerical analysis to basic materials science and fluid dynamics problems in aerospace engineering. In particular, it is expected that after completing the course students have:
- A theoretical knowledge of the models describing some basic aerospace problems.
- A conceptual understanding of numerical methods commonly used in the analysis of aerospace systems.
- A working knowledge of these numerical methods and experience in implementing them.
- A physical interpretation of the results derived from numerical simulations.
- A practical knowledge of visualization techniques.
- An introductory knowledge of CFD commercial packages.
In order to accomplish these objectives in a comprehensive way, the fundamentals of the theoretical models and several numerical and programming techniques for computation and visualization are presented by specialists in each area. The course benefits from the participation of lecturers from the departments of Applied Physics, Applied Mathematics IV and Computer Architecture of the Technical University of Catalonia (UPC), who are specialized in the physical description, the numerical analysis, and the programming and visualization aspects, respectively.
Besides the objectives related to scientific and technical knowledge, the course also aims at developing other skills such as group work, project development, writing scientific documents, the use of English and giving oral presentations. The specific objectives of each area, as well as an explanation of these other skills, are presented in the following subsections.
2.1 Physical Description
When students enrol in Simulation, they have already followed courses on Fundamental Physics, Thermodynamics, Materials Science and Aerodynamics. However, they have not previously used any computational tools to solve problems related to these topics. The main goal from the physical point of view is to show how some physical phenomena related to aerospace applications in fluid dynamics and materials science can be addressed in an accessible way by means of simplified models and adequate numerical techniques. Besides this objective, it is also important to promote in students a critical attitude towards any numerical results they obtain. The projects developed throughout the course share some common objectives, such as understanding the physics involved and the models used to simulate it, as well as the physical interpretation of the obtained results.
2.2 Numerical Methods
An introduction to the basics of numerical analysis and partial differential equations is the main objective in the area of numerical methods. The topics considered are:
- Numerical integration (the trapezoidal rule and Simpson's method, including error estimation in both cases).
- Discrete approximation of derivatives.
- Numerical methods for ordinary differential equations (from Euler's method to the Runge-Kutta method, introducing predictor-corrector methods).
- Partial differential equations (only second order, and mainly parabolic and hyperbolic equations).
This brief introduction, aimed at acquiring a working knowledge of numerical methods, strongly facilitates the study of the discretization procedures and the numerical treatment of the partial differential equations that govern the main fluid dynamics problems which the students will face in the course.
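To make the first item concrete, a minimal C++ sketch is given below (it is illustrative only and not part of the course material; the integrand exp(x) on [0,1] and the grid sizes are arbitrary choices). It compares the composite trapezoidal and Simpson rules and estimates their errors by comparing the results obtained with n and 2n subintervals:

// Illustrative sketch: composite trapezoidal and Simpson rules for
// I = integral of exp(x) over [0,1], with Richardson-type error estimates.
#include <cmath>
#include <cstdio>

double f(double x) { return std::exp(x); }

double trapezoid(int n) {
    double h = 1.0 / n, s = 0.5 * (f(0.0) + f(1.0));
    for (int i = 1; i < n; ++i) s += f(i * h);
    return h * s;
}

double simpson(int n) {               // n must be even
    double h = 1.0 / n, s = f(0.0) + f(1.0);
    for (int i = 1; i < n; ++i) s += (i % 2 ? 4.0 : 2.0) * f(i * h);
    return h * s / 3.0;
}

int main() {
    const double exact = std::exp(1.0) - 1.0;
    for (int n = 4; n <= 64; n *= 2) {
        double t1 = trapezoid(n), t2 = trapezoid(2 * n);
        double s1 = simpson(n),   s2 = simpson(2 * n);
        // Error of t2 is roughly (t2 - t1) / 3; error of s2 roughly (s2 - s1) / 15.
        std::printf("n=%3d  trap err=%9.2e (est %9.2e)  simp err=%10.3e (est %10.3e)\n",
                    n, exact - t2, (t2 - t1) / 3.0, exact - s2, (s2 - s1) / 15.0);
    }
    return 0;
}

Printing the true and estimated errors side by side lets students check the expected second- and fourth-order behaviour, which is the spirit of the error-estimation items listed above.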
2.3 Programming and Visualization
From the programming perspective, the main goal of the course is for students to acquire a basic knowledge and understanding of Object-Oriented Programming (OOP) in C++ and the skills necessary to develop applications with a graphical user interface (GUI). It is noteworthy that students' previous programming
experience is limited to an introductory course on C programming followed by another course in which OOP is introduced but only encapsulation is addressed. After these courses a full year passes before they can enrol in Simulation. The following list summarizes the main objectives of the course in this area:
- Introduction to GNU/Linux: during the course students become familiar with this system at a user level.
- Object-Oriented Programming in C++: students have to brush up on their C++ and learn new concepts as they work on their projects.
- Real numbers in computing: due to the numerical nature of the algorithms to be implemented, notions on this topic help in understanding some issues that may arise.
- Application programming and debugging with KDevelop: KDevelop [4] is an Integrated Development Environment that provides an editor, compilers, a debugger, etc., thus sharing features with most IDEs.
- Application programming using the Qt GUI toolkit: the Qt [5] libraries provide a fairly comprehensive set of classes for GUI programming. Besides, Qt is multi-platform, well documented, integrates well with KDevelop and provides an easy-to-use interface designer. A minimal example of the kind of application skeleton students can start from is sketched below.
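As an illustration of the last item, here is a minimal Qt application skeleton (our own assumption, using the Qt 4-era QApplication/QLabel API, not actual course code); the project applications extend this pattern with custom widgets that display the simulation results:

// Minimal Qt application skeleton (Qt 4-era API assumed).
#include <QApplication>
#include <QLabel>

int main(int argc, char *argv[]) {
    QApplication app(argc, argv);                  // manages the GUI event loop
    QLabel label("Simulation course: hello, Qt");  // any QWidget can act as a window
    label.resize(320, 80);
    label.show();
    return app.exec();                             // enter the event loop
}

A skeleton like this can be built and debugged inside KDevelop, exercising the whole toolchain described above.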
2.4 Skills
There are a number of important skills whose development also forms part of the course objectives:
- Group work: most of the course work is done in groups.
- Self-learning: there are no in-depth lectures; instead, students are given introductory notions and pointers to the different topics that they will need to work on.
- Report writing: one of the items that must be delivered after project completion is a report, as described in Section 3.
- Giving oral presentations: each group must give a presentation to the class explaining the main aspects of their project and demonstrating the developed application.
- Use of English: unfortunately, the use of English is not integrated into the studies to the point where students are able to use it fluently. We try to alleviate this situation by providing all course materials in English. At a minimum, students should become comfortable reading technical documentation in English.
3 Methodology
In this section the learning methodology used in the Simulation course is presented. It starts with a very brief introduction to Project Based Learning, followed by an overview of the materials science (training) project and the CFD projects that students have to develop. Finally, we describe how the course is organized around the project development process.
3.1 Project Based Learning
The basic idea of Project Based Learning (PBL) [1,2,3] is to organize the course (or most of it) around a project on which the students must work in groups. During the development process they need to learn the topics necessary to carry it out. PBL provides a model based on the student learning process rather than on the lecturer's teaching activities. It is a fairly widespread methodology, especially in engineering studies, where it fits very well. Nevertheless, it is not easy to implement, because it confronts both lecturers and students with a number of challenges.
3.2 Projects Overview
Three consecutive projects are developed by the students throughout the course: a first, short training project on crystal growth, a larger CFD project, and a final project using a commercial tool. The first two projects, in which students implement their own codes, are explained in this section.
The first project developed during the course is a training project carried out by all the groups. It consists of the numerical simulation of a crystal growing from an undercooled melt. A simple version of the phase-field model introduced in [6] is used to reproduce the physical phenomena. This project allows students to become familiar with finite-difference techniques, implicit schemes and mesh-related problems, and to start implementing visualisation techniques. Different initial and boundary conditions and system sizes are considered. From the physical point of view, students acquire an idea of the role played by undercooling and anisotropies in the time evolution of the system.
The projects in the fluid dynamics field are inspired by [7] and are the following: 1) subsonic-supersonic isentropic nozzle flow; 2) incompressible Couette flow; 3) Prandtl-Meyer expansion wave. Each group carries out only one of the CFD projects. Students directly acquire the knowledge associated with the project that they develop, while knowledge of the other projects is acquired through the lecturers' presentations and, at the end of the course, through the oral presentations of the other groups.
In project 1), the flow of a gas through a convergent-divergent nozzle is simulated. In order to simplify the model, the study is carried out for quasi-one-dimensional flows. A set of dimensionless finite-difference equations (continuity, momentum and energy) is solved by means of MacCormack's technique for a specific nozzle shape and initial conditions. The goal of this project is to obtain the steady-state solution. Figure 1 (left) shows an example of the graphical interface developed for this project.
In project 2), incompressible Couette flow is studied as an example of a simple viscous flow that retains many of the physical characteristics of a more complicated boundary-layer flow. The governing equations for this problem are parabolic partial differential equations, while those in projects 1) and 3) are hyperbolic. The numerical technique employed for the solution of the Couette flow is the Crank-Nicolson implicit method. In this case, the steady-state solution can easily be validated against the corresponding analytical solution of the problem. The observation of the evolution of the velocity profiles in the unsteady flow is an example of the added value provided by the numerical simulation.
A two-dimensional, inviscid, supersonic flow moving over a surface is studied in project 3). This problem is of special interest when the surface contains a sharp corner. In this situation, the supersonic flow expands around the sharp corner in such a way that an expansion wave (the Prandtl-Meyer expansion wave), made up of an infinite number of infinitely weak Mach waves, fans out from the corner. The corresponding Euler equations for the considered flow are solved by means of MacCormack's predictor-corrector explicit finite-difference method applied to space marching in the downstream direction. An additional difficulty of this project lies in the necessity of carrying out a grid transformation from the physical plane to a rectangular computational plane. An example of the graphical interface developed for this project is shown in Figure 1 (right).
Fig. 1. Project examples. Left: subsonic-supersonic nozzle; right: Prandtl-Meyer expansion wave.
In general, the projects are not especially complex from a class-design point of view. Most groups end up with just two to three classes involved in the simulation computation. It is in the development of the graphical user interface where students have to deal with a hierarchically structured collection of C++ classes that provide the different graphical components. The interdisciplinary nature of the projects makes their development a richer experience, as well as more realistic and encouraging. Besides, their open-endedness allows the more motivated and capable groups to add features beyond the required minimum. An illustrative sketch of the numerical scheme used in project 2) is given below.
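The following C++ sketch applies the Crank-Nicolson scheme to the non-dimensional Couette flow equation u_t = (1/Re) u_yy, with a fixed lower wall (u = 0) and an impulsively started upper plate (u = 1), solving the resulting tridiagonal system with the Thomas algorithm. It is a sketch under our own assumptions (grid size, Reynolds number, time step and number of steps are arbitrary illustrative values), not the students' code:

// Illustrative Crank-Nicolson solver for non-dimensional Couette flow.
#include <cstdio>
#include <vector>

int main() {
    const int    N     = 21;             // grid points across the channel (illustrative)
    const double Re    = 5000.0;         // Reynolds number (illustrative)
    const double dy    = 1.0 / (N - 1);
    const double dt    = Re * dy * dy;   // Crank-Nicolson is unconditionally stable
    const double r     = 0.5 * dt / (Re * dy * dy);
    const int    steps = 500;

    std::vector<double> u(N, 0.0), rhs(N), cp(N), dp(N);
    u[N - 1] = 1.0;                      // impulsively started upper plate (u = 1)

    for (int n = 0; n < steps; ++n) {
        // Explicit half of Crank-Nicolson: right-hand side from the known time level.
        for (int j = 1; j < N - 1; ++j)
            rhs[j] = r * u[j - 1] + (1.0 - 2.0 * r) * u[j] + r * u[j + 1];
        rhs[1]     += r * 0.0;           // implicit contribution of the fixed lower wall
        rhs[N - 2] += r * 1.0;           // implicit contribution of the moving upper plate

        // Thomas algorithm for -r*u[j-1] + (1+2r)*u[j] - r*u[j+1] = rhs[j], j = 1..N-2.
        const double a = -r, b = 1.0 + 2.0 * r, c = -r;
        cp[1] = c / b;
        dp[1] = rhs[1] / b;
        for (int j = 2; j < N - 1; ++j) {
            const double m = b - a * cp[j - 1];
            cp[j] = c / m;
            dp[j] = (rhs[j] - a * dp[j - 1]) / m;
        }
        u[N - 2] = dp[N - 2];
        for (int j = N - 3; j >= 1; --j)
            u[j] = dp[j] - cp[j] * u[j + 1];
    }

    // The long-time solution should approach the analytical linear profile u(y) = y.
    for (int j = 0; j < N; ++j)
        std::printf("y = %5.3f   u = %6.4f\n", j * dy, u[j]);
    return 0;
}

Comparing the computed profile with the linear steady-state solution u(y) = y provides exactly the kind of validation against an analytical result mentioned above.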
3.3 Course Organisation
Simulation requires about 250 hours of workload for the students, including classroom sessions. It has a duration of 15 weeks with 7 hours of classroom
sessions per week. There are 25 available places which, since the course began, have always been filled by demand. As seen in Table 1, the Simulation course is organized in three blocks. The first block corresponds to the training project, which is delivered during the fourth week. Once the training project has been completed, a more challenging and larger fluid dynamics project is presented. Students work on this project for approximately seven weeks. Finally, the remaining four weeks of the course are devoted to the introduction and use of a widely used commercial flow modelling package. All lecturers of the course participate in both the training and the main project. The remainder of the course is run by only one of the lecturers.

Table 1. Course organisation

Block             Duration   Deliverables                                         Lecturers
Training Project  4 weeks    Working application                                  Physics, Math, Comp
Main Project      7 weeks    Application, written report and oral presentation    Physics, Math, Comp
CFD Package       4 weeks    Tutorials, written report and working application    Physics
At the beginning of the course students are organized in groups of 3 to 4 members. During the first weeks the main topics that will be covered during the course are introduced, and the training project is proposed at the same time. This project aims at making students familiar with the basic concepts and tools that will be used during the course. It also forces students to get started quickly and helps in establishing group work dynamics.
During the second block of the course students carry out the development of the main project. First all projects are presented, and then the course coordinator negotiates the distribution of the projects among the different groups. Since the difficulty level may differ from project to project, the results obtained in the training project are helpful in deciding on this distribution. From that point on, students work on their own. During the development process students face different physics, mathematics and programming problems that they need to work on in order to make progress. Thus the study of the different topics of the course is prompted by the learning needs that stem from the project development. Two books by J.D. Anderson [7,8] are the basic reference materials of the course, providing guidance and sample results that must be matched by the output of the simulation programs. This phase ends with the delivery of three different items:
1. The application that simulates the fluid dynamics problem.
2. A written document containing: (a) an abstract both in English and in Spanish or Catalan, (b) a description of the physical problem, (c) its mathematical resolution, and (d) the main aspects of the program implementation together with a minimal user manual.
3. A 20-minute oral presentation summarizing the information contained in the document, with a demo of the application.
During the final weeks of the course, students are introduced to Gambit, the geometry and mesh generation tool of the Fluent CFD commercial package [9]. This block of the course is run by the lecturer from the physics department. At this point students already have a basic understanding of numerical simulation principles, which facilitates the use of commercial tools. After some introductory lectures, students carry out six tutorials provided by Fluent in order to practise the basics of geometry creation and of mesh generation and refinement. The final part of this block is devoted to the development of a project using Gambit. Under the lecturer's advice, students choose an aerospace system (e.g. wings, balloons), generate its geometry, and create and study different meshes. Students have the possibility to complete this work in the Computational Fluid Dynamics course offered in the following semester. In that course, the Fluent solver is introduced and students can use it to study the behavior of fluids flowing through the geometries built in the Simulation project.
An important point of the methodology is the role of the lecturers, which gradually changes as the course advances. During the first weeks the lecturing activity is mainly based on talks about the different topics that must be introduced. Once students start working on their projects, the lecturers' activity consists of answering questions, supervising the students' progress and, from time to time, giving short clarifying lectures if deemed necessary. During scheduled classroom hours students can hold their group meetings, work on their projects and ask whatever questions they have.
Finally, in order to facilitate the students' work, the Castelldefels School of Technology provides every group of the Simulation course with a laptop for their use during the whole semester. The laptop comes with all the necessary software pre-installed. Students can also take advantage of the Wi-Fi network available in most areas of the campus. Additional computers are available in the room where sessions take place. This is a key point which facilitates meetings and group work.
4 Assessment
The activities carried out by students throughout the different blocks of the course are evaluated separately. The entire evaluation process is carried out by an evaluation commission composed of the three lecturers in charge of the course. The first part of this evaluation comes from the training project document delivered by the students. Grades are determined by the quality of the project and represent 10% of the final grade.
In the seventh week of the course, students take a 90-minute mid-term exam. General or specific questions about the concepts introduced in the physics, numerical or visualisation parts are posed. Grades obtained in this exam represent 20% of the total course grade. The CFD project is evaluated on the basis of the document presented as well as the public presentation given by each group. In order to pass this evaluation, students have to present some predetermined results which basically serve to validate their simulations. Students are encouraged to perform additional studies in their project in order to obtain higher marks. This part represents 50% of the final grade. The work performed in the last block of the course is evaluated through the delivery of a short document on the geometry construction and meshing project with Gambit, and represents 10% of the total course grade. Finally, 10% of the grade comes from the commission's subjective evaluation of the students' participation in each activity.
5 Concluding Remarks
In this article the implementation of the Simulation course offered at the Technical School of Castelldefels (EPSC) has been discussed. The two main innovative aspects of the course are the use of PBL as the learning methodology and the participation of lecturers from different areas of expertise. The course has been running since 2004, and the experience gathered during its past three editions lets us draw some conclusions.
Due to the interdisciplinary nature of the course, the participation of lecturers from different areas of expertise fits quite naturally. This is uncommon in the Spanish university context, where each course is usually assigned to a single department. We believe that this is good for both lecturers and students: lecturers become more familiar with the activities carried out by their colleagues, and students are provided with a broader view of the different course topics. This aspect is also valued positively by students. The main challenge that must be addressed is an adequate coordination of the lecturers, so that students face a steady and manageable workload. Care also needs to be taken to provide students with a common view of all aspects of the course.
The PBL approach clearly makes students more engaged with the course. Being confronted with a challenging and realistic project is encouraging: students perceive their learning as a direct consequence of their personal effort and, at the end of the course, they feel more satisfied. As a consequence of working on a "real" project, students also think that the acquired knowledge may be useful in their academic or professional future.
With respect to the use of Linux and other free and open-source software, there are some complaints, mostly regarding the availability of Linux at home (not all students are willing to install a different OS on their personal computers). Besides the laptop being available at all times, this issue has been greatly alleviated by virtualization tools such as VirtualBox [10] and the possibility of dual booting.
The availability of finalized projects from previous editions, and students doing little work within their group, could be matters of concern. The continuous supervision of group work in the classroom makes both issues unlikely to go unnoticed, and such situations have not been detected so far. A concern that we are studying how to address is an excessive degree of specialization within the groups. For example, programming is not shared equally among the group members; this can be seen during the classroom sessions and is confirmed by the exam results. Also, we sometimes feel that differences among the students of a given group may not be properly reflected in the final marks.
To conclude, we are satisfied with the quality of most of the delivered projects and with the course results in general. It is also worth noting that the student-centered active learning approach applied in the course fits perfectly with the methodological shift that should accompany the adaptation process to the new European framework.
References
1. Markham, T.: Project Based Learning, a Guide to Standards-Focused Project Based Learning for Middle and High School Teachers. Buck Institute for Education (2003)
2. Crawford, A., Tennant, J.: A Guide to Learning Engineering Through Projects. University of Nottingham (2003)
3. Del Canto, P., Gallego, I., Hidalgo, R., López, J., López, J.M., Mora, J., Rodríguez, E., Santamaria, E., Valero, M.: Aprender a Programar Ordenadores mediante una Metodología Basada en Proyectos. 18º Congreso Universitario de Innovación Educativa en las Enseñanzas Técnicas (2007)
4. KDevelop Integrated Development Environment, http://www.kdevelop.org/
5. Qt GUI Toolkit, http://trolltech.com/products/qt
6. Wang, S.-L., Sekerka, R.F., Wheeler, A.A., Murray, B.T., Coriell, S.R., Braun, R.J., McFadden, G.B.: Physica D 69 (1993)
7. Anderson Jr., J.D.: Computational Fluid Dynamics. The Basics with Applications. McGraw-Hill, New York (1995)
8. Anderson Jr., J.D.: Fundamentals of Aerodynamics. McGraw-Hill, New York (2001)
9. Fluent, http://www.fluent.com/
10. VirtualBox, http://www.virtualbox.org/