Lecture Notes in Computer Science
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, University of Dortmund, Germany
Madhu Sudan, Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max-Planck Institute of Computer Science, Saarbruecken, Germany
5593
Osvaldo Gervasi David Taniar Beniamino Murgante Antonio Laganà Youngsong Mun Marina L. Gavrilova (Eds.)
Computational Science and Its Applications – ICCSA 2009 International Conference Seoul, Korea, June 29-July 2, 2009 Proceedings, Part II
Volume Editors

Osvaldo Gervasi
University of Perugia, 06123 Perugia, Italy
E-mail: [email protected]

David Taniar
Monash University, Clayton, Victoria 3800, Australia
E-mail: [email protected]

Beniamino Murgante
L.I.S.U.T. - D.A.P.I.T., University of Basilicata, 85100 Potenza, Italy
E-mail: [email protected]

Antonio Laganà
University of Perugia, 06123 Perugia, Italy
E-mail: [email protected]

Youngsong Mun
SoongSil University, Seoul, 156-743, Korea
E-mail: [email protected]

Marina L. Gavrilova
University of Calgary, Calgary, AB, T2N 1N4, Canada
E-mail: [email protected]
Library of Congress Control Number: Applied for
CR Subject Classification (1998): J.2, I.3.5, K.6, G.2.3, I.1.4, H.5, I.3
LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues
ISSN 0302-9743
ISBN-10 3-642-02456-4 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-02456-6 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2009 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 12704828 06/3180 543210
Preface
These multiple volumes (LNCS volumes 5592 and 5593) consist of the peer-reviewed papers from the 2009 International Conference on Computational Science and Its Applications (ICCSA 2009), held in Seoul (South Korea) from June 29 to July 2, 2009. ICCSA 2009 was a successful event in the International Conferences on Computational Science and Its Applications (ICCSA) series, previously held in Perugia, Italy (2008), Kuala Lumpur, Malaysia (2007), Glasgow, UK (2006), Singapore (2005), Assisi, Italy (2004), Montreal, Canada (2003), and (as ICCS) Amsterdam, The Netherlands (2002) and San Francisco, USA (2001).

Computational science is a main pillar of most present research and industrial and commercial activities, and it plays a unique role in exploiting innovative ICT technologies. The ICCSA conference series has provided a venue for researchers and industry practitioners to discuss new ideas, to share complex problems and their solutions, and to shape new trends in computational science.

For the first time this year, ICCSA organized a number of special issues in international journals to publish selected extended best papers from ICCSA 2009, including the Journal of Supercomputing (Springer); Concurrency and Computation: Practice and Experience (Wiley); Transactions on Computational Science (Springer); Mobile Information Systems (IOS Press); Journal of Algorithms in Cognition, Informatics and Logic (Elsevier); International Journal of Virtual Reality; and Journal of Universal Computer Science (JUCS).

Apart from the general track, ICCSA 2009 also included 20 special sessions and workshops in various areas of computational science, ranging from computational science technologies to specific areas such as computer graphics and virtual reality. We would like to express our appreciation to the Workshop and Special Session Chairs and Co-chairs.

The success of the ICCSA conference series in general, and of ICCSA 2009 in particular, is due to the support of many people: authors, presenters, participants, keynote speakers, Session Chairs, Organizing Committee members, student volunteers, Program Committee members, Steering Committee members, and many people in various other roles. We would like to thank them all. We would also like to thank Springer for their continuous support in publishing the ICCSA conference proceedings.

June 2009
Osvaldo Gervasi
David Taniar

On behalf of all editors: Osvaldo Gervasi, David Taniar, Beniamino Murgante, Youngsong Mun, Antonio Laganà, Marina Gavrilova
Organization
ICCSA 2009 was organized by the University of Perugia (Italy), Monash University (Australia), the University of Calgary (Canada), La Trobe University (Australia), Soongsil University (Korea), and Kyung Hee University (Korea).
Conference Chairs

Osvaldo Gervasi, University of Perugia, Perugia, Italy (Program Chair)
David Taniar, Monash University, Australia (Scientific Chair)

Steering Committee

Marina L. Gavrilova, University of Calgary, Canada (Steering Committee Chair)
Osvaldo Gervasi, University of Perugia, Perugia, Italy
Eui-Nam Huh, Kyung Hee University, Korea
Andres Iglesias, University of Cantabria, Spain
Vipin Kumar, Army High Performance Computing Center and University of Minnesota, USA
Antonio Laganà, University of Perugia, Italy
Youngsong Mun, Soongsil University, Korea
C.J. Kenneth Tan, OptimaNumerics, UK
David Taniar, Monash University, Australia
Workshop Organizers Advances in Web-Based Learning (AWBL 2009) Mustafa Murat Inceoglu, Ege University (Turkey)
Business Intelligence (BUSIN 2009) David Taniar, Monash University (Australia) Eric Pardede, La Trobe University (Australia) Wenny Rahayu, La Trobe University (Australia)
Computer Algebra Systems and Applications (CASA 2009) Andrés Iglesias, University of Cantabria (Spain) Akemi Galvez, University of Cantabria (Spain)
Computational Geometry and Applications (CGA 2009) Marina L. Gavrilova, University of Calgary (Canada)
Computer Graphics and Virtual Reality (CGVR 2009) Osvaldo Gervasi, University of Perugia (Italy) Andrés Iglesias, University of Cantabria (Spain)
Computational GeoInformatics (COMPGEO 2009) Hugo Ledoux, Delft University of Technology (The Netherlands) Nirvana Meratnia, University of Twente (The Netherlands) Arta Dilo, Delft University of Technology (The Netherlands)
Data Storage Devices and Systems (DSDS 2009) Yeonseung Ryu, Myongji University (Korea)
Geographical Analysis, Urban Modeling, Spatial Statistics (GEOG-AN-MOD 2009) Stefania Bertazzon, University of Calgary (Canada) Giuseppe Borruso, University of Trieste (Italy) Beniamino Murgante, University of Basilicata (Italy)
High-Performance Computing and Information Visualization (HPCIV 2009) Frank Devai, London South Bank University (UK) David Protheroe, London South Bank University (UK)
Information Systems and Information Technologies (ISIT 2009) Youngsong Mun, Soongsil University (Korea)
International Workshop on Collective Evolutionary Systems (IWCES 2009) Alfredo Milani, University of Perugia (Italy) Clement Leung, Hong Kong Baptist University (Hong Kong)
Mobile Communications (MC 2009) Hyunseung Choo, Sungkyunkwan University (Korea)
Molecular Simulations Structures and Processes (MOSSAP 2009) Antonio Laganà, University of Perugia (Italy) Filippo De Angelis, ISTM, National Research Council, Perugia (Italy)
PULSES V - Logical, Scientific and Computational Aspects of Pulse Phenomena in Transitions (PULSES 2009) Carlo Cattani, University of Salerno (Italy) Cristian Toma, University of Bucharest (Romania)
Luis Manuel Sanchez Ruiz, Universidad Politecnica de Valencia (Spain) Myo-Taeg Lim, Korea University (Korea)
Software Engineering Processes and Applications (SEPA 2009) Sanjay Misra, Atilim University (Turkey)
Sensor Network and Its Applications (SNA 2009) Eui-Nam Huh, Kyung Hee University (Korea)
Security and Privacy in Pervasive Computing Environments (SPPC 2009) Byoung-Soo Koh, DigiCAPS Co. Ltd (Korea)
Wireless and Ad-hoc Networking (WAD 2009) Jongchan Lee, Kunsan National University (Korea), Sangjoon Park, Kunsan National University (Korea)
Workshop on Internet Communication Security (WICS 2009) José Maria Sierra Camara, University of Madrid (Spain)
Workshop on Internet Computing in Science and Engineering (WICSE 2009) Qi Luo, Wuhan Institute of Technology (China) Qihai Zhou, Southwestern University of Finance and Economics (China) Chin-Chen Chang, National Chung Hsing University (Taiwan)
Program Committee

Jemal Abawajy, Deakin University, Australia
Kenny Adamson, EZ-DSP, UK
J. A. Rod Blais, University of Calgary, Canada
Sunil Bhaskaran, City University of New York, USA
John Brooke, University of Manchester, UK
Martin Buecker, Aachen University, Germany
Yves Caniou, INRIA, France
Young Sik Choi, University of Missouri, USA
Hyunseung Choo, Sungkyunkwan University, Korea
Min Young Chung, Sungkyunkwan University, Korea
Jose C. Cunha, New University of Lisbon, Portugal
Alfredo Cuzzocrea, University of Calabria, Italy
Tom Dhaene, University of Antwerp, Belgium
Beniamino Di Martino, Second University of Naples, Italy
Marina L. Gavrilova, University of Calgary, Canada
Osvaldo Gervasi, University of Perugia, Italy
James Glimm, SUNY Stony Brook, USA
Andrzej Goscinski, Deakin University, Australia
Jin Hai, Huazhong University of Science and Technology, China
Shen Hong, Japan Advanced Institute of Science and Technology, Japan
Eui-Nam John Huh, Seoul Woman's University, Korea
Terence Hung Gih Guang, Institute of High Performance Computing, Singapore
Andres Iglesias, University of Cantabria, Spain
Peter K. Jimack, University of Leeds, UK
Yoon Hee Kim, Syracuse University, USA
Dieter Kranzlmueller, Ludwig-Maximilians-Universität München, Germany
Antonio Laganà, University of Perugia, Italy
Sang Yoon Lee, Georgia Institute of Technology, USA
Tae-Jin Lee, Sungkyunkwan University, Korea
Bogdan Lesyng, ICM Warszawa, Poland
Laurence Liew, Scalable Systems Pte, Singapore
Michael Mascagni, Florida State University, USA
Edward Moreno, Euripides Foundation of Marilia, Brazil
Youngsong Mun, Soongsil University, Korea
Beniamino Murgante, Università di Perugia, Italy
Jiri Nedoma, Academy of Sciences of the Czech Republic, Czech Republic
Marcin Paprzycki, Oklahoma State University, USA
Eric Pardede, La Trobe University, Australia
Ron Perrott, The Queen's University of Belfast, UK
Alias Abdul Rahman, Universiti Teknologi Malaysia, Malaysia
Richard Ramaroson, ONERA, France
Jerzy Respondek, Silesian University of Technology, Gliwice, Poland
Alexey S. Rodionov, Russian Academy of Sciences, Russia
Paul Roe, Queensland University of Technology, Australia
Dale Shires, US Army Research Laboratory, USA
Jose Sierra-Camara, University Carlos III of Madrid, Spain
Siti Mariyam Shamsuddin, Universiti Teknologi Malaysia, Malaysia
Alexei Sourin, Nanyang Technological University, Singapore
Olga Sourina, Nanyang Technological University, Singapore
Kokichi Sugihara, University of Tokyo, Japan
C. J. Kenneth Tan, OptimaNumerics, UK and The Queen's University of Belfast, UK
David Taniar, Monash University, Australia
Ruppa K. Thulasiram, University of Manitoba, Canada
Putchong Uthayopas, Kasetsart University, Thailand
Mario Valle, Swiss National Supercomputing Centre, Switzerland
Marco Vanneschi, University of Pisa, Italy
Piero Giorgio Verdini, University of Pisa and Istituto Nazionale di Fisica Nucleare, Italy
Adriana Vlad, "Politehnica" University of Bucharest, Romania
Koichi Wada, University of Tsukuba, Japan
Krzysztof Walkowiak, Wroclaw University of Technology, Poland
Jerzy Wasniewski, Technical University of Denmark, Denmark
Ping Wu, Institute of High Performance Computing, Singapore
Mudasser F. Wyne, National University, San Diego, USA
Roman Wyrzykowski, Technical University of Czestochowa, Poland
George Yee, National Research Council and Carleton University, Canada
Myung Sik Yoo, SUNY, USA
Seung Moo Yoo, Soongsil University, Korea
Albert Zomaya, University of Sydney, Australia
Sponsoring Organizations

ICCSA 2009 would not have been possible without the tremendous support of many organizations and institutions, for which all organizers and participants of ICCSA 2009 express their sincere gratitude:

University of Perugia, Italy
Monash University, Australia
University of Calgary, Canada
La Trobe University, Australia
Soongsil University, Korea
Kyung Hee University, Korea
Innovative Computational Science Applications (ICSA)
MASTER-UP, Italy
SPARCS Laboratory, University of Calgary, Canada
OptimaNumerics, UK
Table of Contents – Part II
Workshop on Software Engineering Processes and Applications (SEPA 2009)

Ontology-Based Requirements Conflicts Analysis in Activity Diagrams ........ 1
Chi-Lun Liu

Resource Allocation Optimization for GSD Projects ........ 13
Supraja Doma, Larry Gottschalk, Tetsutaro Uehara, and Jigang Liu

Verification of Use Case with Petri Nets in Requirement Analysis ........ 29
Jinqiang Zhao and Zhenhua Duan

Towards Guidelines for a Development Process for Component-Based Embedded Systems ........ 43
Rikard Land, Jan Carlson, Stig Larsson, and Ivica Crnković

Effective Project Leadership in Computer Science and Engineering ........ 59
Ferid Cafer and Sanjay Misra

Weyuker's Properties, Language Independency and Object Oriented Metrics ........ 70
Sanjay Misra
Workshop on Molecular Simulations Structures and Processes (MOSSAP 2009)

Lattice Constant Prediction of A2BB'O6 Type Double Perovskites ........ 82
Abdul Majid, Muhammad Farooq Ahmad, and Tae-Sun Choi

A Grid Implementation of Direct Semiclassical Calculations of Rate Coefficients ........ 93
Alessandro Costantini, Noelia Faginas Lago, Antonio Laganà, and Fermín Huarte-Larrañaga

A Grid Implementation of Direct Quantum Calculations of Rate Coefficients ........ 104
Alessandro Costantini, Noelia Faginas Lago, Antonio Laganà, and Fermín Huarte-Larrañaga

A Grid Implementation of Chimere: Ozone Production in Central Italy ........ 115
Antonio Laganà, Stefano Crocchianti, Alessandro Costantini, Monica Angelucci, and Marco Vecchiocattivi
Workshop on Internet Communication Security (WICS 2009)

MDA-Based Framework for Automatic Generation of Consistent Firewall ACLs with NAT ........ 130
Sergio Pozo, A.J. Varela-Vaca, and Rafael M. Gasca

Testing Topologies for the Evaluation of IPSEC Implementations ........ 145
Fernando Sánchez-Chaparro, José M. Sierra, Oscar Delgado-Mohatar, and Amparo Fúster-Sabater

Evaluation of a Client Centric Payment Protocol Using Digital Signature Scheme with Message Recovery Using Self-Certified Public Key ........ 155
Miguel Viedma Astudillo, Jesús Téllez Isaac, Diego Suarez Touceda, and Héctor Plaza López
Workshop on Security and Privacy in Pervasive Computing Environments (SPPC 2009)

Security Vulnerabilities of a Remote User Authentication Scheme Using Smart Cards Suited for a Multi-Server Environment ........ 164
Youngsook Lee and Dongho Won

Enhancing Security of a Group Key Exchange Protocol for Users with Individual Passwords ........ 173
Junghyun Nam, Sangchul Han, Minkyu Park, Juryon Paik, and Ung Mo Kim

Smart Card Based AKE Protocol Using Biometric Information in Pervasive Computing Environments ........ 182
Wansuck Yi, Seungjoo Kim, and Dongho Won

A Practical Approach to a Reliable Electronic Election ........ 191
Kwangwoo Lee, Yunho Lee, Seungjoo Kim, and Dongho Won

Security Weakness in a Provable Secure Authentication Protocol Given Forward Secure Session Key ........ 204
Mijin Kim, Heasuk Jo, Seungjoo Kim, and Dongho Won
Workshop on Mobile Communications (MC 2009)

Performance of STBC PPM-TH UWB Systems with Double Binary Turbo Code in Multi-user Environments ........ 212
Eun Cheol Kim and Jin Young Kim

Performance Evaluation of PN Code Acquisition with Delay Diversity Receiver for TH-UWB System ........ 226
Eun Cheol Kim and Jin Young Kim
Performance Analysis of Binary Negative-Exponential Backoff Algorithm in IEEE 802.11a WLAN under Erroneous Channel Condition ........ 237
Bum-Gon Choi, Sueng Jae Bae, Tae-Jin Lee, and Min Young Chung

A Resource-Estimated Call Admission Control Algorithm in 3GPP LTE System ........ 250
Sueng Jae Bae, Jin Ju Lee, Bum-Gon Choi, Sungoh Kwon, and Min Young Chung

Problems with Correct Traffic Differentiation in Line Topology IEEE 802.11 EDCA Networks in the Presence of Hidden and Exposed Nodes ........ 261
Katarzyna Kosek, Marek Natkaniec, and Luca Vollero
Adaptive and Iterative GSC/MRC Switching Techniques Based on CRC Error Detection for AF Relaying System ........ 276
Jong Sung Lee and Dong In Kim

WiBro Net.-Based Five Senses Multimedia Technology Using Mobile Mash-Up ........ 286
Jung-Hyun Kim, Hyeong-Joon Kwon, and Kwang-Seok Hong

Overlay Ring Based Secure Group Communication Scheme for Mobile Agents ........ 302
Hyunsu Jang, Kwang Sun Ko, Young-woo Jung, and Young Ik Eom

Enhanced Multiple-Shift Scheme for Rapid Code Acquisition in Optical CDMA Systems ........ 314
Dahae Chong, Taeung Yoon, Youngyoon Lee, Chonghan Song, Myungsoo Lee, and Seokho Yoon

AltBOC and CBOC Correlation Functions for GNSS Signal Synchronization ........ 325
Youngpo Lee, Youngyoon Lee, Taeung Yoon, Chonghan Song, Sanghun Kim, and Seokho Yoon

Performance Enhancement of IEEE 802.11b WLANs Using Cooperative MAC Protocol ........ 335
Jin-Seong Kim and Tae-Jin Lee

Authentication Scheme Based on Trust and Clustering Using Fuzzy Control in Wireless Ad-Hoc Networks ........ 345
Seong-Soo Park, Jong-Hyouk Lee, and Tai-Myoung Chung

On Relocation of Hopping Sensors for Balanced Migration Distribution of Sensors ........ 361
Moonseong Kim and Matt W. Mutka
Hybrid Hard/Soft Decode-and-Forward Relaying Protocol with Distributed Turbo Code ........ 372
Taekhoon Kim and Dong In Kim

Design of an Efficient Multicast Scheme for Vehicular Telematics Networks ........ 383
Junghoon Lee, In-Hye Shin, Hye-Jin Kim, Min-Jae Kang, and Sang Joon Kim

ODDUGI: Ubiquitous Mobile Agent System ........ 393
SungJin Choi, Hyunseung Choo, MaengSoon Baik, HongSoo Kim, and EunJoung Byun

Determination of the Optimal Hop Number for Wireless Sensor Networks ........ 408
Jin Wang and Young-Koo Lee

Localization in Sensor Networks with Fading Channels Based on Nonmetric Distance Models ........ 419
Viet-Duc Le, Young-Koo Lee, and Sungyoung Lee

A Performance Comparison of Swarm Intelligence Inspired Routing Algorithms for MANETs ........ 432
Jin Wang and Sungyoung Lee

Analysis of Moving Patterns of Moving Objects with the Proposed Framework ........ 443
In-Hye Shin, Gyung-Leen Park, Abhijit Saha, Ho-young Kwak, and Hanil Kim

A User-Defined Index for Containment Queries in XML ........ 453
Gap-Joo Na and Sang-Won Lee

On Optimal Placement of the Monitoring Devices on Channels of Communication Network ........ 465
Alexey Rodionov, Olga Sokolova, Anastasia Yurgenson, and Hyunseung Choo

Low Latency Handover Scheme Based on Optical Buffering at LMA in Proxy MIPv6 Networks ........ 479
Seungtak Oh and Hyunseung Choo

A Capacity Aware Data Transport Protocol for Wireless Sensor Network ........ 491
Md. Obaidur Rahman, Muhammad Mostafa Monowar, and Choong Seon Hong
Data Distribution of Road-Side Information Station in Vehicular Ad Hoc Networks (VANETs) ........ 503
Abhijit Saha, Gyung-Leen Park, Khi-Jung Ahn, Chul Soo Kim, Bongkyu Lee, and Yoon-Jung Rhee

VC-GTS: Virtual Cut-Through GTS Allocation Scheme for Voice Traffic in Multihop IEEE 802.15.4 Systems ........ 513
Junwoo Jung, Hoki Baek, and Jaesung Lim

Towards Location-Based Real-Time Monitoring Systems in u-LBS ........ 525
MoonBae Song and Hyunseung Choo
General Track on Computational Methods, Algorithms and Applications

A Fast Approximation Algorithm for the k Partition-Distance Problem ........ 537
Yen Hung Chen

A PSO – Line Search Hybrid Algorithm ........ 547
Ximing Liang, Xiang Li, and M. Fikret Ercan

Using Meaning of Coefficients of the Reliability Polynomial for Their Faster Calculation ........ 557
Alexey Rodionov, Olga Rodionova, and Hyunseung Choo

A Novel Tree Graph Data Structure for Point Datasets ........ 572
Saeed Behzadi, Ali A. Alesheikh, and Mohammad R. Malek

Software Dependability Analysis Methodology ........ 580
Beoungil Cho, Hyunsang Youn, and Eunseok Lee

New Approach for the Pricing of Bond Option Using the Relation between the HJM Model and the BGM Model ........ 594
Kisoeb Park, Seki Kim, and William T. Shaw

Measuring Anonymous Systems with the Probabilistic Applied Pi Calculus ........ 605
Xiaojuan Cai and Yonggen Gu

YAO: A Software for Variational Data Assimilation Using Numerical Models ........ 621
Luigi Nardi, Charles Sorror, Fouad Badran, and Sylvie Thiria
General Track on High Performance Technical Computing and Networks

CNP: A Protocol for Reducing Maintenance Cost of Structured P2P ........ 637
Yu Zhang, Yuanda Cao, and Baodong Cheng

A Research on Instability of Small Flow in SCADA and an Optimizing Design for Control ........ 653
Youqiang Guo, Zijun Zhang, and Xuezhu Pei

On-Demand Chaotic Neural Network for Broadcast Scheduling Problem ........ 664
Kushan Ahmadian and Marina Gavrilova
General Track on Advanced and Emerging Applications

Mining Spread Patterns of Spatio-temporal Co-occurrences over Zones ........ 677
Feng Qian, Qinming He, and Jiangfeng He

Computerized Detection of Pulmonary Nodule Based on Two-Dimensional PCA ........ 693
Wook-Jin Choi, Abdul Majid, and Tae-Sun Choi

Computational Measurements of the Transient Time and of the Sampling Distance That Enables Statistical Independence in the Logistic Map ........ 703
Adriana Vlad, Adrian Luca, and Madalin Frunzete

Future Personal Health Records as a Foundation for Computational Health ........ 719
Robert Steele and Amanda Lo

SVM Based Decision Analysis and Its Granular-Based Solving ........ 734
Tian Yang, Xinjie Lu, Zaifei Liao, Wei Liu, and Hongan Wang

Fast Object Tracking in Intelligent Surveillance System ........ 749
Ki-Yeol Eom, Tae-Ki Ahn, Gyu-Jin Kim, Gyu-Jin Jang, and Moon-hyun Kim

A Reliable Skin Detection Using Dempster-Shafer Theory of Evidence ........ 764
Mohammad Shoyaib, Mohammad Abdullah-Al-Wadud, and Oksam Chae

Video Shot Boundary Detection Using Generalized Eigenvalue Decomposition ........ 780
Ali Amiri and Mahmood Fathy

Content Quality Assessment Related Frameworks for Social Media ........ 791
Kevin Chai, Vidyasagar Potdar, and Tharam Dillon

Weighted Aspect Moment Invariant in Pattern Recognition ........ 806
Rela Puteri Pamungkas and Siti Mariyam Shamsuddin
Multiple Object Types KNN Search Using Network Voronoi Diagram ........ 819
Geng Zhao, Kefeng Xuan, David Taniar, Maytham Safar, Marina Gavrilova, and Bala Srinivasan
General Track on Information Systems and Information Technologies

RRPS: A Ranked Real-Time Publish/Subscribe Using Adaptive QoS ........ 835
Xinjie Lu, Xin Li, Tian Yang, Zaifei Liao, Wei Liu, and Hongan Wang

A Dynamic Packet Management in a Protocol Processor ........ 851
Yul Chu, Amit Uppal, and Jae Sok Son

On a Construction of Short Digests for Authenticating Ad Hoc Networks ........ 863
Khoongming Khoo, Ford Long Wong, and Chu-Wee Lim

Learning and Predicting Key Web Navigation Patterns Using Bayesian Models ........ 877
Malik Tahir Hassan, Khurum Nazir Junejo, and Asim Karim

A Hybrid IP Forwarding Engine with High Performance and Low Power ........ 888
Junghwan Kim, Myeong-Cheol Ko, Hyun-Kyu Kang, and Jinsoo Kim

Learning Styles Diagnosis Based on Learner Behaviors in Web Based Learning ........ 900
Nilüfer Atman, Mustafa Murat Inceoğlu, and Burak Galip Aslan

State of the Art in Semantic Focused Crawlers ........ 910
Hai Dong, Farookh Khadeer Hussain, and Elizabeth Chang

Towards a Framework for Workflow Composition in Ontology Tailoring in Semantic Grid ........ 925
Toshihiro Uchibayashi, Bernady O. Apduhan, Wenny J. Rahayu, David Taniar, and Norio Shiratori

Fusion Segmentation Algorithm for SAR Images Based on HMT in Contourlet Domain and D-S Theory of Evidence ........ 937
Yan Wu, Ming Li, Haitao Zong, and Xin Wang
Author Index ........ 953
Table of Contents – Part I
Workshop on Geographical Analysis, Urban Modeling, Spatial Statistics (GEO-AN-MOD 2009)

Using Causality Relationships for a Progressive Management of Hazardous Phenomena with Sensor Networks ........ 1
Nafaa Jabeur and Hedi Haddad

Design and Development of an Intelligent Extension for Mapping Landslide Susceptibility Using Artificial Neural Network ........ 17
Mohammad H. Vahidnia, Ali A. Alesheikh, Abbas Alimohammadi, and Farhad Hosseinali

Integrating Fuzzy Logic and GIS Analysis to Assess Sediment Characterization within a Confined Harbour ........ 33
Nicoletta Gazzea, Andrea Taramelli, Emiliana Valentini, and Maria Elena Piccione

Integrated Geological, Geomorphological and Geostatistical Analysis to Study Macroseismic Effects of 1980 Irpinian Earthquake in Urban Areas (Southern Italy) ........ 50
Maria Danese, Maurizio Lazzari, and Beniamino Murgante

Using GIS to Develop an Efficient Spatio-temporal Task Allocation Algorithm to Human Groups in an Entirely Dynamic Environment Case Study: Earthquake Rescue Teams ........ 66
Ali Reza Vafaeinezhad, Ali Asghar Alesheikh, Majid Hamrah, Reza Nourjou, and Rouzbeh Shad

A GIS-Based SW Prototype for Emergency Preparedness Management Applied to a Real Case Study ........ 79
Marco Scaioni, Mario Alba, Renato Rota, and Simona Caragliano

Spatial Analysis of Humidity and Temperature of Iran ........ 94
Amir Kavousi and Mohammad Reza Meshkani

An Evaluation of the Performance of the CHIMERE Model over Spain Using Meteorology from MM5 and WRF Models ........ 107
Marta G. Vivanco, Inmaculada Palomino, Fernando Martín, Magdalena Palacios, Oriol Jorba, Pedro Jiménez, José María Baldasano, and Oier Azula

Parameter-Less GA Based Crop Parameter Assimilation with Satellite Image ........ 118
Shamim Akhter, Keigo Sakamoto, Yann Chemin, and Kento Aida
Compactness and Flow Minimization Requirements in Reforestation Initiatives: An Integer Programming (IP) Formulation ........ 132
Pablo Vanegas, Dirk Cattrysse, and Jos Van Orshoven

Application of the Urban Green Assessment Model for the Korean Newtowns ........ 148
Sangkeun Eom and Seungil Lee

Modeling Un-authorized Land Use Sprawl with Integrated Remote Sensing-GIS Technique and Cellular Automata ........ 163
Norzailawati Mohd. Noor and Mazlan Hashim

A Study on the Driving Forces of Urban Expansion Using Rough Sets ........ 176
Yong Ge and Feng Cao

Assessing the Impact of Individual Attitude towards Otherness on the Structure of Urban Residential Space: A Multi-actor Model ........ 189
André Ourednik

Incorporation of Morphological Properties of Buildings' Descriptors Computed from GIS and LIDAR Data on an Urban Multi-agent Vector Based Geo-simulator ........ 205
Cláudio Carneiro, Vitor Silva, and François Golay

A Spatial Structural and Statistical Approach to Building Classification of Residential Function for City-Scale Impact Assessment Studies ........ 221
Dimitrios P. Triantakonstantis and Stuart L. Barr

Spatial Distribution of Social Benefit Given by Urban Attractions: A Test of UrAD Model ........ 237
Luca D'Acci

Economic Evaluation and Statistical Methods for Detecting Hot Spots of Social and Housing Difficulties in Urban Policies ........ 253
Silvestro Montrone, Massimo Bilancia, Paola Perchinunno, and Carmelo Maria Torre

An Urban Study of Crime and Health Using an Exploratory Spatial Data Analysis Approach ........ 269
Su-Yin Tan and Robert Haining

Geospatial Crime Scene Investigation – From Hotspot Analysis to Interactive 3D Visualization ........ 285
Markus Wolff and Hartmut Asche

Agent-Based Distributed Component Services in Spatial Modeling ........ 300
Allan J. Brimicombe, Yang Li, Abdullah Al-Zakwani, and Chao Li
A Cellular Automata-Ready GIS Infrastructure for Geosimulation and Territorial Analysis ........ 313
Ivan Blecic, Andrea Borruso, Arnaldo Cecchini, Antonio D'Argenio, Fabio Montagnino, and Giuseppe A. Trunfio

An Integrated Methodology for Medieval Landscape Reconstruction: The Case Study of Monte Serico ........ 328
Maria Danese, Marilisa Biscione, Rossella Coluzzi, Rosa Lasaponara, Beniamino Murgante, and Nicola Masini

Neural Network Based Cellular Automata Model for Dynamic Spatial Modeling in GIS ........ 341
Yogesh Mahajan and Parvatham Venkatachalam

A Model-Based Scan Statistics for Detecting Geographical Clustering of Disease ........ 353
Massimo Bilancia, Silvestro Montrone, and Paola Perchinunno

New Prospects in Territorial Resource Management: The Semantic Web GIS ........ 369
Ernesto Marcheggiani, Michele Nucci, and Andrea Galli

Analysing the Role of Accessibility in Contemporary Urban Development ........ 385
Henning Sten Hansen

An Indoor Crowd Simulation Using a 2D-3D Hybrid Data Model ........ 397
Chulmin Jun and Hyeyoung Kim

Measuring Effectiveness of Pedestrian Facilities Using a Pedestrian Simulation Model ........ 413
Seunjae Lee, Seunjun Lee, and Shinhae Lee

Impact of the Norm on Optimal Locations ........ 426
Marc Ciligot-Travain and Didier Josselin

Designing Road Maintenance Data Model Using Dynamic Segmentation Technique ........ 442
Mohammad Reza Jelokhani-Niaraki, Ali Asghar Alesheikh, Abbas Alimohammadi, and Abolghasem Sadeghi-Niaraki

GeoSOM Suite: A Tool for Spatial Clustering ........ 453
Roberto Henriques, Fernando Bação, and Victor Lobo

Handling Spatial-Correlated Attribute Values in a Rough Set ........ 467
Hexiang Bai and Yong Ge

Rough Qualitative Spatial Reasoning Based on Rough Topology ........ 479
Anahid Bassiri, Mohammad R. Malek, and Ali A. Alesheikh
An Integrative Approach to Geospatial Data Fusion ........ 490
Silvija Stankutė and Hartmut Asche

Designing Data Warehouses for Geographic OLAP Querying by Using MDA ........ 505
Octavio Glorio and Juan Trujillo

Turning Point Clouds into 3d Models: The Aqueduct of Segovia ........ 520
Juan Mancera-Taboada, Pablo Rodríguez-Gonzálvez, and Diego González-Aguilera

Design and Implementation of a GSM Based Automatic Vehicle Location System ........ 533
Ali Mousavi, Mohammad A. Rajabi, and Mohammad Akbari

Temporal Relationships between Rough Time Intervals ........ 543
Anahid Bassiri, Mohammad R. Malek, Ali A. Alesheikh, and Pouria Amirian

Inverse Transformation for Several Pseudo-cylindrical Map Projections Using Jacobian Matrix ........ 553
Cengizhan Ipbuker
Workshop on Wireless and Ad Hoc Networking (WAD 2009)

A Cluster-Based Mobility Model for Intelligent Nodes ........ 565
Morteza Romoozi, Hamideh Babaei, Mahmood Fathy, and Mojtaba Romoozi

A Scalability Analysis of TDMA-Based Ad Hoc MAC Protocols ........ 580
Sang-Chul Kim and Chong-Woo Woo

Noun and Keyword Detection of Korean in Ubiquitous Environment ........ 593
Seong-Yoon Shin, Oh-Hyung Kang, Sang-Joon Park, Jong-Chan Lee, Seong-Bae Pyo, and Yang-Won Rhee

Phased Scene Change Detection in Ubiquitous Environments ........ 604
Seong-Yoon Shin, Ji-Hyun Lee, Sang-Joon Park, Jong-Chan Lee, Seong-Bae Pyo, and Yang-Won Rhee

A File Carving Algorithm for Digital Forensics ........ 615
Deok-Gyu Park, Sang-Joon Park, Jong-Chan Lee, Si-Young No, and Seong-Yoon Shin

Data Discovery and Related Factors of Documents on the Web and the Network ........ 627
Hyun-Joo Moon, Sae-Hun Yeom, Jongmyung Choi, and Chae-Woo Yoo
Thorough Analysis of IEEE 802.11 EDCA in Ring Topology Scenarios with Hidden and Exposed Nodes ........ 636
Katarzyna Kosek, Marek Natkaniec, and Luca Vollero

Issues for Applying Instant Messaging to Smart Home Systems ........ 649
Jongmyung Choi, Sangjoon Park, Hoon Ko, Hyun-Joo Moon, and Jongchan Lee

The Scheme of Handover Performance Improvement in WiBro ........ 662
Wongil Park
Workshop on PULSES V - Logical, Scientific and Computational Aspects of Pulse Phenomena in Transitions (PULSES 2009)

Study of the Correlation between the Concrete Wall Thickness Measurement Results by Ultrasonic Pulse Echo and the PTF Model for Assymetrical Functions ........ 671
Lucian Pusca

Detection Weak Period Signal Using Chaotic Oscillator ........ 685
Qizhong Liu and Wanqing Song

Energy Metrics and Sustainability ........ 693
Mory Ghomshei and Francesco Villecco

The Ultrasonic Measurement for Object Vibration ........ 699
Li-Ping Zhang and Qiu Yu

The Generalized Wavelets Based on Meyer Wavelet ........ 708
Xudong Teng and Xiao Yuan

Reconstructing Microwave Near-Field Image Based on the Discrepancy of Radial Distribution of Dielectric Constant ........ 717
Zhifu Tao, Qifeng Pan, Meng Yao, and Ming Li

Fractals Based on Harmonic Wavelets ........ 729
Carlo Cattani

Modelling Hodgkin-Huxley Neurons Interaction ........ 745
Gianni Mattioli and Massimo Scalia
Workshop on High-Performance Computing and Information Visualization (HPCIV 2009)

Fourth Order Approximation with Complexity Reduction Approach for the Solution of Time Domain Maxwell Equations in Free Space ........ 752
Mohammad Khatim Hasan, Mohamed Othman, Zulkifly Abbas, Jumat Sulaiman, and Fatimah Ahmad

Nine Point-EDGSOR Iterative Method for the Finite Element Solution of 2D Poisson Equations ........ 764
Jumat Sulaiman, Mohamed Othman, and Mohammad Khatim Hasan

An Online and Predictive Method for Grid Scheduling Based on Data Mining and Rough Set ........ 775
Asgarali Bouyer, Mohammadbagher Karimi, and Mansour Jalali

A Survey of Cloud Platforms and Their Future ........ 788
Milad Pastaki Rad, Ali Sajedi Badashian, Gelare Meydanipour, Morteza Ashurzad Delcheh, Mahdi Alipour, and Hamidreza Afzali
Workshop on Sensor Network and its Applications (SNA 2009)

An Efficient Clustering Protocol with Reduced Energy and Latency for Wireless Sensor Networks ........ 797
A. Allirani and M. Suganthi

Dynamic Threshold Control-Based Adaptive Message Filtering in Mobile Sensor Networks ........ 810
Sung Ho Jang, Yong Beom Ma, and Jong Sik Lee

Design of a Reliable Traffic Control System on City Area Based on a Wireless Network ........ 821
Junghoon Lee, In-Hye Shin, and Cheol Min Kim

Design of a Cache Management Scheme for Gateways on the Vehicular Telematics Network ........ 831
Junghoon Lee, In-Hye Shin, Gyung-Leen Park, Ik-Chan Kim, and Yoon-Jung Rhee

An Efficient Topology Control and Dynamic Interval Scheduling Scheme for 6LoWPAN ........ 841
Sun-Min Hwang and Eui-Nam Huh

Relationship Based Privacy Management for Ubiquitous Society ........ 853
Yuan Tian, Biao Song, and Eui-Nam Huh

Real-Time Monitoring of Ubiquitous Wearable ECG Sensor Node for Healthcare Application ........ 868
Do-Un Jeong and Hsein-Ping Kew

Authentication Analysis Based on Certificate for Proxy Mobile IPv6 Environment ........ 885
Seong-Soo Park, Jong-Hyouk Lee, and Tai-Myoung Chung
Workshop on Collective Evolutionary Systems (IWCES 2009)

A Triangular Formation Strategy for Collective Behaviors of Robot Swarm ........ 897
Xiang Li, M. Fikret Ercan, and Yu Fai Fung

P2P Incentive Mechanism for File Sharing and Cooperation ........ 912
Junghwa Shin, Taehoon Kim, and Sungwoo Tak

Modeling Agents' Knowledge in Collective Evolutionary Systems ........ 924
Rajdeep Niyogi and Alfredo Milani

Collective Evolutionary Indexing of Multimedia Objects ........ 937
C.H.C. Leung, W.S. Chan, and J. Liu

A Hybrid Approach for Selecting Optimal COTS Products ........ 949
Pankaj Gupta, Mukesh Kumar Mehlawat, Garima Mittal, and Shilpi Verma
Author Index ........ 963
Ontology-Based Requirements Conflicts Analysis in Activity Diagrams

Chi-Lun Liu

Department of Information and Electronic Commerce, Kainan University, Taoyuan County, Taiwan
[email protected]
Abstract. Requirements conflict analysis is one of the most crucial activities in successful software engineering projects. Activity diagrams are a useful standard for modeling the process behaviors of systems. This paper applies an ontological approach to analyzing conflicts in requirement specifications expressed as activity diagrams. The proposed conflict analysis method includes a modeling process and a set of conflict detection rules. Several electronic commerce scenarios are also provided to demonstrate the validity of the proposed rules. The benefits of the proposed method are threefold: (1) the method provides systematic steps for modeling requirements and ontologies; (2) it offers a set of questions for facilitating requirements elicitation for activity diagrams; and (3) the proposed rules can systematically assist in requirements conflict detection.
1 Introduction

Listening to user requirements is one of the most important tasks in successful software systems development [9]. Empirical data reveal that investing effort in detailed requirements analysis considerably reduces drawbacks in software development [17]. The Unified Modeling Language (UML) is a common standard for requirements modeling, and activity diagrams in the UML are used for modeling the dynamic aspects of information systems. Hence activity diagram analysis is a key requirements engineering step in successful software development projects.

Inconsistencies and conflicts in information are ubiquitous in the real world [19], and conflict analysis is one of the fundamental activities in requirements analysis [16]. Making contradictory design decisions is common in industry, and it is hard to recognize these conflicts [5]. In the past decade, several studies have been undertaken to analyze requirement conflicts; these studies provide systematic approaches for modeling requirements and analyzing conflicts among requirement models. At the same time, utilizing ontologies for knowledge management has emerged as a topic in recent years [2] [13]. However, only a few of the existing studies use ontologies as a semantic approach to requirements analysis.

This paper proposes a method which aims to detect conflicts in order to mitigate inconsistency problems in requirements. The proposed method includes two main parts. The first part is a conflict analysis process which prepares the information necessary for conflict detection. The second part of this
method is a set of conflict detection rules which are applied at the end of the proposed process. The advantages of the proposed method are threefold. First, it offers a systematic, step-by-step process for modeling ontologies and requirements in activity diagrams and for analyzing conflicts. Second, it offers a set of explicit questions for facilitating requirements elicitation. Finally, requirement conflicts can be systematically detected by the proposed rules based on ontologies and requirement models.

The remainder of this paper is structured as follows: Section 2 gives an overview of the state of the art in requirements conflict analysis. Section 3 presents the proposed process of ontology-based requirement conflict analysis. Section 4 illustrates several rules for conflict detection and provides scenarios demonstrating how these rules work.
2 Current Requirements Conflicts Analysis Studies

Several studies have been undertaken on requirements conflict analysis. Hausmann, Heckle, and Taentzer [8] observe that a conflict occurs if an attribute or a relation differs between two valid versions of a UML object diagram; two different versions of a software specification should be merged into a consistent one. The study of Kim, Park, Sugumaran, and Yang [11] partitions a requirement into four elements: subject, verb, object, and resource. Their study proposes a set of questions for checking conflicts in these partitioned requirements, such as checking resource limitation conflicts, and the detected conflicts are resolved according to priorities.

Gervasi and Zowghi [6] reason about conflicts in natural language requirements. Their study focuses on logical contradiction, which is a specific kind of conflict. For example, "car drives on bus lane" conflicts with "car cannot drive on bus lane"; this is a logical contradiction because there is an antonym relationship between drive on and cannot drive on.

Using ontologies for detecting requirements conflicts is an emerging topic. Ontologies are a shared conceptualization [7] which includes concepts and relationships, such as kind relationships and composition relationships [4]. Applying ontologies to requirements engineering, Kaiya and Saeki [10] store all relevant terms in ontologies and use these terms to constitute requirements; all words in the requirements should be established in the ontologies. According to the conflict relations between concepts in the ontologies, conflicts in requirements can be detected. For example, consider two concepts in a music player case: time efficiency and a dynamic loading function. There is a conflict relation between the two concepts in the ontologies: if a requirement adds the dynamic loading function, time efficiency will decrease unwillingly. The conflict is therefore detected according to the ontologies. Besides, Robinson and Volkov [14] use ontologies to facilitate conflict resolution in meeting scheduling requirements. Their study proposes two strategies to resolve conflicts: object restructuring and condition restructuring. Object restructuring defines one or more sub-concepts to avoid a conflict, and condition restructuring means that conflicting requirements should be restricted to different situations.
Sapna and Mohanty [15] focus on inconsistency checking among UML models. Twelve rules are provided for detecting inter-model inconsistencies, such as inconsistencies between sequence and class diagrams. Their study does not emphasize inconsistency detection within a specific UML diagram. The above related works are summarized in Table 1.

Table 1. The summary of conflict analysis works
Work | Metamodel | Conflict Analysis Approach
Hausmann, Heckle, and Taentzer [8] | UML Object Diagram | Find differences in different versions.
Kim, Park, Sugumaran, and Yang [11] | Subject-Verb-Object-Resource | Partition requirements and check resource and activity conflicts.
Gervasi and Zowghi [6] | Subject-Verb-Object | Detect logical contradiction in natural language requirements.
Kaiya and Saeki [10] | X is an instance of Y, Subject-Verb-Object | Detect requirements conflicts according to predefined conflict relations in ontologies.
Robinson and Volkov [14] | Meeting Scheduler | Use ontologies to facilitate conflict resolution.
Sapna and Mohanty [15] | UML | Use a set of rules to detect inconsistencies among different kinds of UML diagrams.
Based on the above existing studies, this paper focuses on analyzing conflicts in UML activity diagrams. Activity diagram conflict analysis is important because the activity diagram language is an important metamodel for modeling prevailing web systems [12]. To systematically detect requirements conflicts, ontologies are important because they can make the meaning of requirement information explicit. In addition, the above related works show that there is a gap between ontology analysis and activity diagram analysis: the semantics in activity diagrams (e.g., kind and composition relationships between concepts) are still not appropriately considered in current studies. To bridge this gap, this paper adopts an ontology-based approach for conflict detection in activity diagrams.
3 Proposed Ontology-Based Requirements Conflicts Analysis Process

This study proposes the ontology-based requirements conflicts analysis (ORCA) process. In the beginning of the ORCA process, software engineers organize important concepts and relations of domain knowledge into ontologies; for example, registered member is a kind of website visitor. Software engineers can then use the concepts in the ontologies to model known requirements. New requirements are modeled in the next step. Based on the ontologies and requirement models, it is possible to detect inconsistencies in a semantic way.
Fig. 1. Requirements conflicts analysis process (three sequential steps: modeling prior knowledge → modeling new requirements → detecting conflicts)
The ORCA process includes three steps: modeling prior knowledge, modeling new requirements, and detecting conflicts. The three steps are depicted in Fig. 1 and described as follows.

(1) Modeling prior knowledge: Analysis and interpretation work has to be based on prior knowledge [3]. In requirements analysis, prior knowledge includes domain knowledge and adopted existing requirements. The analysis work in this step is to model domain knowledge and adopted requirements according to existing situations, business needs, related regulations, and agreed assumptions in organizations. All important terms related to the relevant domains are stored in ontologies, and these terms can be used to represent known descriptive or normative requirements. For example, store credit card number is a kind of handle credit card information in the action state ontology, and checkout process is a part of shopping website in the use case ontology. A security requirement using terms in these ontologies is therefore: "Checkout process should not include store credit card number".

Based on the flowchart structure of the UML activity diagram (Booch et al., 1998), the following six questions are proposed for requirements elicitation:

a. Which action state is the starting point in a use case?
b. Which action state is the ending point of the condition in a use case?
c. Which action state should be executed before another action state in a use case?
d. Which kind of action state should not be executed in a use case?
e. The process length of which process segment cannot be increased by inserting a new action state in a use case?
f. Which action state must be included in a use case?
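To make the step-1 artifacts concrete, the following is a minimal Python sketch of how the ontologies and a requirement built from their terms might be represented. The triple-based layout, the dictionary shape of the requirement, and all identifiers are illustrative assumptions; the paper does not prescribe a concrete data structure.

```python
# Assumed representation of the step-1 ontologies: each ontology is a set of
# (subject, relation, object) triples with typed relations such as "kind"
# (generalization) and "composition" (part-of).

ACTION_STATE_ONTOLOGY = {
    # "store credit card number" is a kind of "handle credit card information"
    ("store credit card number", "kind", "handle credit card information"),
}

USE_CASE_ONTOLOGY = {
    # "checkout process" is a part of "shopping website"
    ("checkout process", "composition", "shopping website"),
}

def has_relation(ontology, subject, relation, obj):
    """Check whether a typed relation between two terms is asserted."""
    return (subject, relation, obj) in ontology

# A known security requirement expressed only with terms from the ontologies.
security_requirement = {
    "use_case": "checkout process",
    "verb": "should not include",
    "action_state": "store credit card number",
}

print(has_relation(ACTION_STATE_ONTOLOGY, "store credit card number",
                   "kind", "handle credit card information"))  # True
```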
In sum, this step produces the ontologies and the existing requirement models. The ontologies are the basis for modeling new requirements, and the existing requirement models can be used to determine whether new requirements are consistent with known user demands in the follow-up step.

(2) Modeling new requirements: Analyzing new requirements based on prior knowledge is the major task in this step. For improving the functionality of software systems, new user requirements can be elicited with the six questions proposed in step 1. Terms in the ontologies established in step 1 can be used to articulate models of new requirements. If a new requirement includes a new term which goes beyond the scope of the ontologies, the process should return to step 1 and the unexpected term should be added to the ontologies; all terms used to describe requirements should be established and controlled in the ontologies. After modeling new requirements and
related domain knowledge in steps 1 and 2, there are sufficient data for detecting inconsistencies in the requirements.

(3) Detecting conflicts: The goal of step 3 is to detect requirements conflicts based on the data produced in steps 1 and 2. This paper proposes a set of rules for detecting conflicts. For example, requirement X is "Checkout process should not include store credit card number", and requirement Y is "Payment phase includes store credit card number". Requirement X conflicts with requirement Y because there is an antonym relationship between should not include and includes, and there is a composition relationship between payment phase and checkout process. The proposed rules are discussed in the next section.
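As a rough illustration of how such a semantic check could be mechanized, here is a small Python sketch of the example above; the tuple encoding (scope, verb, object), the relation tables, and the function names are assumptions for illustration, not the paper's notation.

```python
# Sketch of the conflict check for the example above; encoding is illustrative.
ANTONYM = {("should not include", "includes")}
COMPOSITION = {("payment phase", "checkout process")}  # (part, whole)

def related(a, b, rel):
    return (a, b) in rel or (b, a) in rel

def conflicts(req_x, req_y):
    """Each requirement is a triple (scope, verb, object)."""
    (sx, vx, ox), (sy, vy, oy) = req_x, req_y
    return (ox == oy                                         # same object
            and related(vx, vy, ANTONYM)                     # opposing verbs
            and (sx == sy or related(sx, sy, COMPOSITION)))  # related scopes

x = ("checkout process", "should not include", "store credit card number")
y = ("payment phase", "includes", "store credit card number")
print(conflicts(x, y))  # True -> requirement X conflicts with requirement Y
```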
4 Proposed Conflict Detection Rules

Seven rules are proposed for detecting conflicts in activity diagrams. These rules are workable if the necessary information, including ontologies and requirement models in activity diagrams, is provided. Several scenarios of an online retailing website are also presented to explain how these rules work effectively.

Rule 1: There is a shortcut conflict if action states I, K, and M are in use case R; action states N and P are in use case T; action state K comes after I and before M in requirement X; there is an arrow O between N and P in requirement Y; there is an equality relationship between action states I and N and between M and P; and there is an equality, kind, or composition relationship between use cases R and T.

A shortcut conflict means a requisite action state is omitted in an activity diagram. Rule 1 can be used to detect the shortcut conflict, as illustrated in fig. 2. For example, consider a scenario that includes two requirements. Requirement X is "Update personal information (action state K) comes before fill out shipping information (action state M) and comes after use credit card for payment (action state I)". Action states K, M, and I are all in checkout process (use case R). Requirement Y is "The next step of use credit card for payment (action state N) is fill out shipping information (action state P)". Both action states N and P are in checkout process (use case T). Action state I is equal to action state N, action state M is equal to action state P, and use case R is equal to use case T. According to rule 1, update personal information (action state K) is omitted and a shortcut conflict occurs between requirements X and Y.

Rule 2: There is an initial states conflict if initial state I and action state M are in use case E; initial state J and action state N are in use case F; and action state M, which connects with initial state I through arrow K, is unequal to action state N, which connects with initial state J through arrow L.

An initial states conflict means two different action states can alternatively be the starting point of a use case. Rule 2 detects conflicts about initial states in an activity diagram. This rule considers initial states, arrows, and action states. Rule 2 is depicted in fig. 3.
Fig. 2. Shortcut conflict
Fig. 3. Initial states conflict
For example, consider two requirement models related to rule 2. Requirement X indicates that displaying privacy protection policy (action state M) is the beginning of the registration process (use case E). Requirement Y indicates that filling out member information (action state N) is the beginning of the registration process (use case F). According to rule 2, requirements X and Y conflict because use case E is equal to
use case F, arrow K connects initial state I and action state M (displaying privacy protection policy), arrow L connects initial state J and action state N (filling out member information), and action state M is unequal to action state N.

Rule 3: There is a final states conflict if action state I, which connects with final state M through arrow K, is unequal to action state J, which connects with final state N through arrow L, and action states I and J are in the same condition and the same use case.

A final states conflict means that two action states can alternatively be the end of a condition of a use case. Rule 3 is used to detect conflicts related to final states. This rule considers conditions, use cases, final states, arrows, and action states. Rule 3 is depicted in fig. 4. Here is a scenario demonstrating rule 3. Requirement X expresses that displaying congratulation information should be done at the end of the successful registration condition of member registration. Requirement Y indicates that displaying registration benefits should be done at the end of the successful registration condition of member registration. According to rule 3, a conflict occurs because displaying registration benefits (action state J) is unequal to displaying congratulation information (action state I), and action states I and J are alternative endings in the successful registration condition (conditions Q and R) of member registration (use cases U and V).

Rule 4: There is a sequence conflict if action state I comes before action state M, action state J comes before action state N, there is an equality, kind, or composition relationship between action states I and N and between action states M and J, action states I and M are in use case U, action states J and N are in use case V, and there is an equality, kind, or composition relationship between use cases U and V.

A sequence conflict means that two action states are arranged in different orders. Fig. 5 depicts why a sequence conflict occurs in rule 4; in fig. 5, there are two sequential action states in each of requirements X and Y. Here is an example of how rule 4 works. In this example, requirement X is "displaying privacy protection policy comes before filling out form in the use case of shopping website". Requirement Y is "filling out customer satisfaction questionnaire comes before displaying privacy protection policy in the use case of customer service". The action state ontology stores the following information: "filling out customer questionnaire is a kind of filling out form". The use case ontology indicates that "customer service is a kind of shopping website". According to rule 4, a conflict occurs because both action states I and N are equal to displaying privacy protection policy, action state J (filling out customer questionnaire) is a kind of action state M (filling out form), and use case V (customer service) is a kind of use case U (shopping website).

Rule 5: There is an action state addition conflict if K and M of action state O are in requirement X, L and N of action state P are in requirement Y, there is an antonym relationship between K and L, and there is an equality, kind, or composition relationship between objects M and N and between use cases I and J.
Fig. 4. Final states conflict
Fig. 5. Sequence conflict
An action state addition conflict means that an unnecessary action state is added in an activity diagram. Rule 5 detects this conflict. Fig. 6 shows how rule 5 works.
Fig. 6. Action state addition conflict
Here is a small scenario explaining this rule. Company α has a privacy protection policy as requirement X: "Our retailing website should not store customers' credit card numbers to prevent credit card number stealing". However, a marketing employee β does not know this privacy protection policy and proposes requirement Y: "There should be an action state of store credit card number in checkout process". The ontologies reveal that checkout process is a part of retailing website. According to rule 5, requirements X and Y conflict because there is a composition relationship between use case I (retailing website) and use case J (checkout process), there is an antonym relationship between K (should not store) and L (store), and both M and N are credit card number.

Rule 6: An action state deletion conflict occurs if action state J is deleted from use case L in requirement X, use case M must have action state O, and there is an equality, kind, or composition relationship between use cases L and M and between action states J and O.

An action state deletion conflict means that a requisite action state is deleted. Rule 6 is illustrated in fig. 7. For example, requirement Y is a company policy: "checkout process (use case M) must have displaying return policy (action state O)". However, requirement X wants to delete displaying return policy (action state J) from checkout process (use case L). Therefore, requirements X and Y conflict according to rule 6.

Rule 7: There is a process length conflict if requirement X is to insert action state J between action states L and N, and requirement Y is to inhibit the process length growth of use case Q, which includes action states L and N.
Fig. 7. Action state deletion conflict
A process length conflict means that the length of the process is increased unwillingly. Fig. 8 depicts the conflict situation in rule 7. For example, consider a shopping cart discard problem [18] in a checkout process. An online survey shows that many online customers discard their shopping carts because of the long and boring checkout process. In order to decrease the shopping cart discard rate, requirement Y indicates that the process length of the checkout process (use case Q) should not be increased. However, a sales representative expresses requirement X, which indicates that he wants to insert cross sell (action state J) between update shopping cart list (action state L) and fill out shipping information (action state N) in order to sell additional products. In the UML activity diagrams, the action states of cross sell and fill out shipping information are included in the use case of checkout process. According to rule 7, a process length conflict occurs in this scenario.
Fig. 8. Process length conflict
5 Conclusion and Future Work

In order to reinforce requirements analysis, this paper proposes a requirements conflicts analysis method. The proposed method provides the key concepts of conflict analysis, including a process, a set of elicitation questions, and rules. Several scenarios are provided to validate the proposed conflict detection rules. The method has two advantages and one main limitation. The first advantage is a feasible analysis process: the method offers a clear and systematic road map for the necessary data collection in activity diagram analysis. The second advantage is effectiveness: the method attempts to facilitate conflict identification through effective rules, which are validated by the electronic commerce scenarios. The main limitation is that the method cannot be applied to developing completely innovative new systems in their initial stages, because eliciting complete and correct prior knowledge is difficult when the requirements of new systems are still vague.

For automatic support, the development of a requirements management tool based on the proposed method is under way. The tool will offer automatic conflict detection, and its effectiveness can be evaluated empirically in future experiments. In addition, proposing new conflict detection rules for other UML diagrams, such as use case and sequence diagrams, is a valuable direction for further study.
References

1. Booch, G., Rumbaugh, J., Jacobson, I.: The Unified Modeling Language User Guide. Addison-Wesley, Boston (1998)
2. Cayzer, S.: Semantic Blogging and Decentralized Knowledge Management. Communications of the ACM 47(12), 47–52 (2004)
3. Chalmers, M.: Hermeneutics, Information and Representation. European Journal of Information Systems 13(3), 210–220 (2004)
4. Chandrasekaran, B., Josephson, J.R., Benjamins, V.R.: What are Ontologies, and Why do We Need Them? IEEE Intelligent Systems 14(1), 20–26 (1999)
5. Egyed, A.: Instant Consistency Checking for the UML. In: 28th International Conference on Software Engineering, pp. 381–390. IEEE Press, New York (2006)
6. Gervasi, V., Zowghi, D.: Reasoning about Inconsistencies in Natural Language Requirements. ACM Transactions on Software Engineering and Methodology 14(3), 277–330 (2005)
7. Gruninger, M., Lee, J.: Ontology: Applications and Design. Communications of the ACM 45(2), 39–65 (2002)
8. Hausmann, J.H., Heckel, R., Taentzer, G.: Detection of Conflicting Functional Requirements in a Use Case-Driven Approach. In: 24th International Conference on Software Engineering, pp. 105–115. IEEE Press, New York (2002)
9. He, J., King, W.R.: The Role of User Participation in Information Systems Development: Implications from a Meta-analysis. Journal of Management Information Systems 25(1), 301–331 (2008)
10. Kaiya, H., Saeki, M.: Ontology based Requirements Analysis: Lightweight Semantic Processing Approach. In: Fifth International Conference on Quality Software, pp. 223–230. IEEE Press, New York (2005)
11. Kim, M., Park, S., Sugumaran, V., Yang, H.: Managing Requirements Conflicts in Software Product Lines: A Goal and Scenario based Approach. Data & Knowledge Engineering 61(3), 417–432 (2007)
12. Koch, N., Escalona, M.J.: Metamodeling the Requirements of Web Systems. Lecture Notes in Business Information Processing 1(Part II), pp. 267–280 (2007)
13. Nomaguchi, Y., Fujita, K.: DRIFT: A Framework for Ontology-based Design Support Systems. In: Semantic Web and Web 2.0 in Architectural, Product, Engineering Design Workshop, Manfred Jeusfeld, Tilburg (2007)
14. Robinson, W.N., Volkov, S.: Requirement Conflict Restructuring. GSU CIS Working Paper 99-5, Georgia State University, Atlanta, GA (1999)
15. Sapna, P.G., Mohanty, H.: Ensuring Consistency in Relational Repository of UML Models. In: 10th International Conference on Information Technology, pp. 217–222. IEEE Press, New York (2007)
16. Sommerville, I.: Integrated Requirements Engineering: A Tutorial. IEEE Software 22(1), 16–23 (2005)
17. Sommerville, I., Ransom, J.: An Empirical Study of Industrial Requirements Engineering Process Assessment and Improvement. ACM Transactions on Software Engineering and Methodology 14(1), 85–117 (2005)
18. Strauss, J., El-Ansary, A., Frost, R.: E-Marketing, 4th edn. Prentice Hall, Upper Saddle River (2006)
19. Zhang, D.: Quantifying Knowledge Base Inconsistency Via Fixpoint Semantics. In: Gavrilova, M.L., Tan, C.J.K., Wang, Y., Yao, Y., Wang, G. (eds.) Transactions on Computational Science II. LNCS, vol. 5150, pp. 145–160. Springer, Heidelberg (2008)
Resource Allocation Optimization for GSD Projects

Supraja Doma 1, Larry Gottschalk 1, Tetsutaro Uehara 2, and Jigang Liu 2

1 Metropolitan State University, 700 East Seventh Street, St. Paul, MN 55106, U.S.A.
{Domasu01@go, Larry.Gottschalk@}metrostate.edu
2 Kyoto University, Kyoto, Japan
{Uehara@media, [email protected]}kyoto-u.ac.jp
Abstract. As globalization has become a main phenomenon in software development in the US since the year 2000, many software projects have been shipped out to other countries. Although outsourcing saves companies a significant amount of cost, Global Software Development (GSD) projects have created a significant challenge to companies in terms of differences in geographical locations, time zones, and cultures. While the 24-hour development model promises to reduce the time, and thus the cost, of software development, the expected outcome cannot be ensured if tasks are not allocated to the proper resources in remote teams with consideration of the dependencies and constraints. In this paper, we propose an approach that can be used to reduce the overall time of GSD project development by allocating tasks to the best possible resources, based on an integrated analysis of the constraints and their impact on the overall product development.

Keywords: Global Software Development, 24-hour software development, project scheduling, task allocation, resource optimization, uncertainties.
1 Introduction

With the rapid development of global outsourcing, businesses are now able to access efficient labor across national boundaries at comparatively low cost. For instance, the number of employees at IBM India has grown from 1,000 in 1992 to more than 50,000 in 2007 [13], and the saving on wages is extremely significant to the company. But what is meant by outsourcing? As defined in [14], outsourcing "is the management and day to day execution of business functions by a third party service provider i.e. the transfer of components or large segments of an organization's internal IT infrastructure, staff, processes or applications to an external resource such as an application service provider located anywhere in the world." Though outsourcing refers to many business processes, the scope of this paper is restricted to the software industry, where a project is split across various remotely located sites so as to gain access to skilled labor, to reduce the cost and time of production, and/or to improve the efficiency of the developed product. This process is called Global Software Development (GSD). The difference between time zones among the virtual teams reduces the total time of development as
the work is distributed in such a way that all 24 hours of the day are effectively utilized, which is often called "follow the sun development" [4]. Even though the phenomenon of job distribution among remote teams seems very efficient, in practice there exist many gaps that need to be filled in order to attain a high-quality software product in less time and at decreased cost. Lack of communication and coordination, and differences in processes, time zones, and geography among remote teams that work on a single project contribute to excessive rework and delays, which in turn increase the total budget of development in terms of time, cost, and efficiency. This undermines the basic concept of globalization. Many challenges related to time management, resource optimization, and communication and coordination between team members have to be carefully handled when software development is temporally and geographically distributed, and much research on these topics is currently in progress. Though many researchers have been working on these challenges of Global Software Development, very few among them have focused on time and resource optimization. To stay at the top of the current market, businesses need to release high-quality products within short time intervals. Efficient time and resource utilization helps release a well-performing product within a short span, provided there is good interaction among the teams. Most software companies are now distributing their projects globally, either to reduce the time and cost of development by working with cheap labor on the 24-hour model, or to increase the performance of projects by having access to highly skilled and efficient labor across the world.

The aim of this work is to develop an optimization algorithm to allocate tasks to the distributed sites using resources which are available at low cost in terms of wages, development efficiency, and time taken to complete the task. This is an extension of the work done by Gourav Jain [10], who, in his thesis, provided a technical solution to reduce the software project duration in a distributed environment. Although developing an optimal method for allocating resources is critical, clear task descriptions, well-defined deliveries, detailed planning, weekly checkpoints, team meetings, regular reviews of issues, and status reports are also necessary for completing a project successfully [2]. Apart from the proposed model, this paper also presents a brief analysis of various challenges faced by Global Software Development and a few methods suggested by previous researchers to overcome these issues. The result will be helpful to software practitioners in minimizing the total development time using the best among the available resources. The model developed is more realistic, as it considers dependencies, constraints, and also the unexpected delays which may occur in the process of development. Apart from minimizing time, the model improves the efficiency of project development across remote areas.
2 Advantages and Risks in GSD

In traditional software development, all the developers, architects, team leads, managers, etc., who are involved in project development reside at a single geographical location. In modern, global software development the team is scattered across
different time zones and continents, which is very rewarding in terms of cost, efficiency, access to technical skill sets, and time [13]. Apart from time and skill sets, Ita Richardson, Valentine Casey, Dolores Zage, and Wayne Zage [15] have formulated a few advantages of cross-client development over conventional approaches. They also point out that distributed software development helps to develop the product close to or in target markets, which reduces overall time to market and results in cost savings to a project. For example, if a web portal is to be developed by an American company to meet the expectations of Chinese customers, it is a better idea to develop the portal in Chinese rather than English, which helps the product attract many customers; to meet such requirements, a local maintenance and development site helps to build and maintain an efficient portal, which is more realistic than building it somewhere else based on assumptions. Another advantage is that if the sites are distributed in different time zones, then development can proceed continuously around the clock using the 24-hour model. This reduces the development time and increases the rate of releasing products, which gives an edge over competitors. Also, in GSD, efficient labor can be available at a cheaper rate. For example, a developer with a skill set and experience similar to one available in the United States may be available at a much cheaper rate in Asia. This decreases the project development cost. Moreover, when people from different backgrounds or cultures come together to work toward a common goal, the different levels of experience, technical knowledge, and elementary understanding of a problem can lean in favor of innovation [15]. Ebert and De Neve [3] declare that a mix of skill sets and life experiences within a team can result in improved coordination among GSD team members [15]. Since sites are distributed across the world, businesses can have access to the best people who are efficient and meet the skill requirements. Apart from quality and labor, some companies opt for global development to save on taxes [13].

Although there exist many advantages, to realize the full idea behind global software development, software industries moving toward globalization have to meet certain challenges: differences between time zones and cultures; understanding the requirements or specifications so that they are not influenced by local culture and remain as per the client requirement; coordination among the teams; organization of tasks; planning the estimates; etc. Details regarding these risks and challenges are given in [6], [7], [8], [9], [12], [16], [17], [18], [19], and [20]. It is found that in a distributed environment, more time and resources are required to complete a task than the time and resources utilized to complete a similar task by a team present at a single location. This is due to lack of communication: in a distributed team the developers may not be able to find the appropriate person to get the requirements clarified, as there is no informal communication, whereas it is much easier to find the expert in a co-located team, as all the members are at the same location [18]. Apart from general cultural differences like language, it is found that disrespect for holidays and work culture spoils offsite and onsite relationships.
For example, a local team extending their working hours to meet a deadline may expect the
other team to extend their working hours as well. This may not be possible for the other team because of a local holiday due to some culture-related event; as mentioned in [18], this further decreases the trust between the two teams. There are many reasons why distant teams cannot coordinate well. Differences in processes or working styles create confusion, so one team does not have information on the other team's activities; hence, neither team can initiate a conversation, which may later lead to either rework or performance failure [8]. Teams cannot communicate with team members at the other site frequently, or as and when needed, as is done with local team members, due to differences in time zone, language, and culture, and also the geographical distance, as discussed in [8]. Moreover, it is found that sometimes a business is unable to take necessary action and recover damages caused by the carelessness of the outsourced company, because the country's law may not permit this [13]. There is always a threat to the intellectual property of the business, as it is shared with a third party; this threat increases if the third party outsources its own tasks in turn. Apart from these, one major threat is to data privacy. If data which is to be used only by a particular company is used by or leaked to another company, huge losses are incurred by the actual business. This is possible since the outsourced company may deal with many clients and may willingly or unknowingly transfer data to its competitors.

To reduce the risks discussed above, a few strategies to overcome the issues that arise in the process of project development have been proposed in [6], [7], [8], [9], [12], [16], [17], [18], [19], and [20]. Based on the study of these papers, we discuss the methods proposed by previous researchers for the above risks. Certain strategies, such as reducing the dependencies among tasks, maintaining the change history, using standard tools for version control and change management, and introducing work-exclusive chat capabilities in the system, can increase communication or decrease the need for communication; they also keep track of the activities of the other team [6]. The models explained in [13] help to solve some of the issues mentioned above. For example, the onsite/offsite model helps to minimize the communication gap; the offsite model helps to understand the requirements better; etc. The business has to decide which model is appropriate to its transactions.
3 Analysis of the Related Work

Scheduling tasks to the best possible resources is basically an "assignment problem," one of the fundamental combinatorial optimization problems [1]. Gomes and Hsu proposed an assignment-based algorithm for resource allocation in [5]. As they describe, the novelty of the algorithm is that it considers multiple resources and multiple tasks at a time. Since the algorithm assumes identical resources for the various tasks, it does not fit the problem we try to tackle. A tailored COCOMO II model is proposed in [11] to account for the additional cost drivers of global development. The proposed model "examines the significance of each of these factors as a contributor to the overall cost of a software development project" [11]. Although the time zone is considered in the proposed model, the main
focus of that research is on calculating the cost rather than optimizing the allocation of resources. In [10], Gourav Jain provided an algorithm to reduce the time duration of product development based on a task scheduling algorithm, the critical path algorithm. In his approach, a graph is generated considering all operational, skill, and resource constraints, and then the critical path for this task graph is calculated. Also, a ready queue containing all the tasks that are ready to be executed and a resource set containing all available resources are maintained. The task graph, ready queue, and resource set are the inputs for his algorithm. In his thesis, Gourav Jain focused on reducing the total time duration of project development, but uncertainties like a resource moving out of the project during the project run, delays in meeting intermediate deadlines, etc., were not handled. Hence, to ensure a comparatively more practical solution, we introduce the concept of floats and the PERT method. The introduction of PERT analysis makes the task tougher as the number of tasks increases. In the real world there exists more than one graph in product development, and the optimistic and pessimistic durations should be estimated very carefully. Here, a graph is a group of tasks that are bound by precedence rules, and the final product development is again a group of such graphs, i.e., there exist multiple critical paths prior to the final critical path. So if one group of tasks is completed according to the optimistic time estimation and another according to the pessimistic time estimation, this affects the final product release date, as these two paths are part of the final group of tasks that lead to the product release. Hence, time estimations should be neither too optimistic nor too pessimistic. Also, resource optimization was not within the scope of his paper. In this paper, apart from handling the uncertainties, we try to use the best resources available without affecting the estimated time or the product releases. Sometimes the minimum-cost resource is available in the next time zone; in such cases we can assign that resource to the current task if the task has a slack period which can be used to extend its start time without affecting the final deadline. Although Jain's method is efficient at minimizing time using the 24-hour development model in a distributed environment, it was rather theoretical, as it was constructed assuming (1) no delays in the project; also, (2) while allocating tasks, the resources were randomly picked, which may lead to a high cost of development. Both these problems are addressed in this paper so that the algorithm is enhanced.
4 Preparation for an Improved Algorithm

In the software industry, it is very important to develop the product quickly and efficiently in order to stay competitive, since these two factors play a critical role in reducing the cost and improving the performance of the product. When a brief analysis is made of network scheduling techniques (like PERT, the Program Evaluation and Review Technique; CPM, the Critical Path Method; and GERT, the Graphical Evaluation and Review Technique) and resource optimization algorithms, and these are compared with distributed software development, we observe a lot
of similarities. According to Hendrickson [7], "the network analysis can be used to decide the man, material and capital requirement and also aid in setting up the risk analysis of the project". PERT was found to be a better network scheduling technique compared to Gantt charts and CPM: interdependencies between activities are hard to show using Gantt charts, and CPM is a deterministic method which does not consider discrepancies during the implementation of the plan. PERT considers both interdependencies and discrepancies and hence proves to be the better option for software development. In this paper we focus on providing a solution for assigning the best available resource to the development of a task. Though time and resources are equally important, we have to give priority to one over the other based on the product and its deadline. If we give priority to the resource, then it may take a huge amount of time to develop a product, as we may need to wait for the efficient resource until he/she is available. Hence, we give priority to time over resources, because if we wait for the best resource to become available, the idea behind GSD (i.e., to ensure that work continues round the clock) may fail.

The research problem is to allocate the tasks to the best resource available at the time of scheduling so as to minimize the total time taken for project completion, considering the uncertainties that may arise during project development. The project is divided into various tasks based on internal and external dependencies. An internal dependency may be due to a technical dependency. For example, while developing screens for filing an online application, it is necessary that the screens be developed in the order of execution, as the output of one screen is the input to the next, i.e., the task of developing screen 2 should be preceded by the task of developing screen 1. Sometimes execution of screen 1 leads in one way to screen 2 and in another way to screen 3, and execution of screen 2 or 3 leads to screen 4. In such an application, screens 2 and 3 can be developed simultaneously at two different sites working in the same time zone, but prior to screen 4. Testing or evaluation of a task, which can be performed only after development, comes under external dependency. Hence, prior to scheduling, all these precedence relationships should be carefully analyzed. Similarly, when assigning a task to a resource/site, a brief analysis has to be done on the resources to estimate their cost in terms of availability, time taken to complete the task, and efficiency, so that the task can be allocated to the optimal (lowest-cost) resource. At times, even a perfect time estimate may go wrong due to uncertainties: absences of resources, a resource suddenly moving out of the team during development, or an intermediate task taking longer than expected during the project run. These uncertainties should be handled during the time estimation so that they do not obstruct the release of the product. Applying PERT analysis to the tasks, or adjusting the start times of tasks based on floats so that the project duration does not get extended, may be the optimal way to handle these uncertainties while scheduling. The following activities are to be performed prior to using the algorithm:

• Identify the requirements and divide them into various tasks.
• Identify the precedence relationships and/or dependency constraints among the tasks.
• Identify the skill set required and the associated resources and/or sites.
• Estimate the cost of each resource in terms of availability, efficiency, and task completion time, based on previous experiences with the team.

The time taken by a team involving multiple resources from different cultures or regions will be far more than that of a team with the same culture and region. So, if two dependent tasks are allocated to two team members belonging to the same culture or region, the time required to complete the tasks will be less than otherwise. The availability cost may be low or high based on whether a resource is actually available: the status of a resource may be "available", but in reality the resource might be on leave or might suddenly resign from the company, and the onsite team allocating the tasks might not be informed. In such a case, if a task is allocated to this resource, the cost of allocation will definitely be higher. The history of how many times a task was delayed, the cost of telecommunications, the average number of issues raised per task, the average number of days taken to close an issue, and how easily conflicts can be resolved with a specific site/resource need to be considered while determining the efficiency cost of the site or resource before assigning new tasks. Many more factors like these play a role in determining the cost of allocating a resource to a particular task.

The graphical representation is defined as follows:

Vertex or node → start or end time
Edge or connecting line → activity/task
ES → Earliest Start Time
LS → Latest Start Time
EF → Earliest Finish Time
LF → Latest Finish Time

For example, Table 1 describes an example with the estimated durations of a group of tasks, and its corresponding graphical representation is shown in Figure 1. The critical path for the graph is indicated by a bold line.
Fig. 1. Task graph with normal durations for Example 1
Table 1. Example 1 – Estimating Duration and Slack period Calculation
The main idea behind the development of the algorithm is to minimize the total time taken for product development. This is achieved by using a CPU-style scheduling algorithm; the analysis of current CPU scheduling algorithms suggests that the Critical Path Method is the optimal one. The scheduling procedure can be explained using graphs. Consider 19 tasks {A, B, …, S} to be performed. The development sites {S1, S2, …, Sn}, where n is the number of sites, are distributed over different time zones. S1 is the development site in time zone T1 and has a resource set R1 (R1 is the set of resources {r1, r2, ..., rx}, where x is the number of resources). The duration to perform task m is, say, d days, and this is represented on the connecting edge as m, d. The precedence relationships, the activities on the critical path, and the float for each activity are calculated for the above graph and are presented below.

Float, or slack, is the amount of delay that can be allowed while performing a task without having to reschedule or change the succeeding tasks. The critical path is the longest-duration path, and the activities that lie on it cannot be delayed without delaying the project. Hence the tasks that lie on the critical path have zero float because of their impact on the entire project
(i.e., ES = LS and EF = LF for all activities on the critical path). As given in [7], there exist three types of float (free float, independent float, and total float), defined as follows [7]:

(i) Free float (FF) is the amount of delay allowed for a task such that the succeeding task does not get delayed. The free float FF(i, j) associated with task (i, j) is:

FF(i, j) = E(j) – E(i) – D(i, j)

where E(j) is the earliest start time of node j and D(i, j) is the duration of the task on edge (i, j).
Example: free float of task A = E(1) – E(0) – 5 = 5 – 0 – 5 = 0.

(ii) Independent float (IF) is the delay allowed for a task such that neither does the succeeding task get delayed nor is the preceding task forced to complete earlier than the stipulated period. The independent float IF(i, j) for activity (i, j) is calculated as:

IF(i, j) = E(j) – L(i) – D(i, j)

where E(j) is the earliest start time of node j, L(i) is the latest start time of node i, and D(i, j) is the duration of the task on edge (i, j).
Example: independent float of task B = E(2) – L(1) – 2 = 7 – 5 – 2 = 0.

(iii) Total float (TF) is the maximum delay allowed for an activity such that the total project does not get delayed. The total float TF(i, j) for any activity (i, j) is calculated as:

TF(i, j) = L(j) – E(i) – D(i, j)

where E(i) is the earliest start time of node i, L(j) is the latest start time of node j, and D(i, j) is the duration of the task on edge (i, j).
Example: total float of task B = L(2) – E(1) – 2 = 7 – 5 – 2 = 0.

Though there are three types of float, it depends on the business which type of slack period is to be considered while scheduling the tasks. In this paper we consider the total float to overcome the uncertainties and allocate the tasks to the optimal resources without delaying the estimated deadline.

• If slack is positive, it is the maximum time the activities on the path can be delayed.
• If slack is negative, it is the amount of time by which the tasks on the path must be sped up.
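The forward and backward passes behind these definitions can be sketched in a few lines of Python; the small task graph below is illustrative only (it is not the paper's Example 1), and the node numbering is assumed to be topological.

```python
# Minimal CPM sketch for the float definitions above. The task graph is
# illustrative (not the paper's Example 1); node numbers are assumed to be
# in topological order, as in activity-on-edge diagrams.
TASKS = {          # (start node, end node): duration in days
    (0, 1): 5,     # task A
    (1, 2): 2,     # task B
    (1, 3): 3,     # task C (parallel branch, will have positive float)
    (2, 3): 3,     # task D
}
NODES = sorted({n for edge in TASKS for n in edge})

# Forward pass: earliest event time E(j) of each node.
E = {n: 0 for n in NODES}
for (i, j), d in sorted(TASKS.items()):
    E[j] = max(E[j], E[i] + d)

# Backward pass: latest event time L(i) of each node.
L = {n: E[NODES[-1]] for n in NODES}
for (i, j), d in sorted(TASKS.items(), reverse=True):
    L[i] = min(L[i], L[j] - d)

# Total float TF(i, j) = L(j) - E(i) - D(i, j); zero float => critical task.
for (i, j), d in sorted(TASKS.items()):
    tf = L[j] - E[i] - d
    print(f"task ({i},{j}): TF = {tf}" + ("  <- critical" if tf == 0 else ""))
```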
5 An Improved Algorithm

Prior to implementing the task allocation algorithm (Table 2), the latest finish time, earliest start time, and critical path of the tasks are calculated. The tasks that are ready to be executed at a time T are queued and allocated to the first site in the loop in the next time zone, provided each task meets the precedence requirements and the skill set required by the task matches that of the resource. This can be explained by the following example. Say there are many sites available for a company, distributed over different time zones.
Assume the sites are scattered over the t – 1, t, and t + 1 time zones. Initially, all available tasks are stacked up and scheduled to the site in time zone t – 1. Say a few tasks are completed by the sites in time zone t – 1; then other tasks whose precedents have been completed by the site in time zone t – 1 are added to the availability queue. At the end of the day, the tasks in this queue are assigned to the site in time zone t if the existing resources meet the required skill sets. This allocation may be perfect theoretically, but practically there exist a few uncertainties, viz., a site may have available status but may be enjoying a local holiday; in such a situation we cannot allocate a task to it. Also, if a precedent task which is on the critical path is not completed as per the schedule, we may have to add additional time before the task can start. This gives rise to schedule slippage, so we would need to recalculate the critical path and estimate a new deadline, which is not feasible in practice. Hence, we need to consider the uncertainties that occur during the project run, and resource optimization when allocating the tasks, prior to the minimum time estimation. There are two methods to handle these uncertainties:

1. Calculate the floats, and then adjust start times in such a way that the project does not get delayed. This is not the best method, because if a task preceding a task on the critical path gets delayed, there is no option but to delay the critical task; since tasks on the critical path have zero float, the schedule obviously gets delayed.

2. Implement PERT analysis. This is an extension of the critical path method. In this method, apart from the normal duration, a more optimistic and a more pessimistic duration are also estimated prior to the minimum time calculation using CPM. The PERT method is applied to each and every task in the project so that the estimations do not go wrong. This gives rise to three minimum time estimations for the project, i.e., an optimistic deadline, a pessimistic deadline, and a more likely deadline.
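As a rough sketch of the second method, the project duration can be recomputed under each of the three per-task estimates; the task graph, durations, and function below are illustrative assumptions, not taken from the paper's examples.

```python
# Illustrative three-estimate pass: compute the project length under the
# optimistic, most likely, and pessimistic durations of every task.
SUCC = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}  # task graph
DUR = {  # task: (optimistic, most likely, pessimistic) duration in days
    "A": (3, 5, 8), "B": (1, 2, 4), "C": (2, 3, 5), "D": (2, 4, 7),
}

def project_length(which):
    """Longest (critical) path length from start task 'A', using duration
    estimate `which` (0 = optimistic, 1 = most likely, 2 = pessimistic)."""
    memo = {}
    def longest(task):
        if task not in memo:
            tail = max((longest(s) for s in SUCC[task]), default=0)
            memo[task] = DUR[task][which] + tail
        return memo[task]
    return longest("A")

for idx, name in enumerate(("optimistic", "most likely", "pessimistic")):
    print(f"{name} deadline: {project_length(idx)} days")
```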
The task allocation algorithm is an extension of the algorithm mentioned in [10]; it includes a queue that holds all tasks that are ready to be executed, along with their start times, finish times, and slack. The new task allocation algorithm and resource optimization algorithm are presented in Tables 2 and 3 below.

Table 2. Task Allocation Algorithm

For each site in the current time zone, perform the following steps:
Step 1: Allocate the tasks in the ready queue to resources using the resource optimization algorithm proposed below.
Step 2: If a task is completed on time or prior to its estimated duration,
            add all its successors to the task queue without any change in their respective earliest start times;
        else, if it is delayed,
            adjust the start times of the succeeding tasks with respect to their corresponding slack times and then add them to the queue.
Repeat steps 1 and 2 at the start of each time zone.
Table 3. Resource Optimization Algorithm

Step 1: Set the scheduled start time for each activity to the earliest start time. For each task i = 1, 2, …, m available in the queue, initialize Allocate(i) = 0.
Step 2: While i ≤ m:
    If Allocate(i) = 1: increment i; clear the resource stack; continue. Endif.
    Determine the cost of all resources at all remote sites who possess the skill sets required to perform task i.
    Rank all resources from the most important (lowest cost) to the least important (highest cost), and number the resources j = 1, 2, 3, …, n.
    While j ≤ n:
        Step 2b: If resource j is in the current time zone, go to method Allocation(i, j).
        Step 2c: If resource j is in another time zone and the waiting period ≤ slack(i), go to method Allocation(i, j).
        Step 2d: If Allocate(i) = 0, increment j; else exit. Endif.
    Endwhile.
Endwhile.

Allocation(i, j):
    If resource j is available:
        Allocate task i to resource j; Allocate(i) = 1.
    Else if resource j is busy:
        Get the slack and time duration of the task k currently assigned to resource j, and compare them with those of task i.
        If the slack period of task i ≥ the time duration of task k:
            Allocate resource j to task i, so that it completes task k without interruption and then starts task i; Allocate(i) = 1.
        Else if the slack period of task k ≥ the time duration of task i:
            Allocate resource j to task i, so that it completes task i, interrupting task k, and then returns to task k; Allocate(i) = 1.
        Endif.
    Else:  // resource is unavailable; check the next best resource
        Do nothing.
    Endif.
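The core decision in Allocation(i, j), combined with the cost ranking of Step 2, can be sketched as follows; the data layout and names are illustrative assumptions, and the Step 2c time-zone waiting condition is omitted for brevity.

```python
# Sketch of the allocation decision in Table 3: try resources in cost order;
# a busy resource is still usable when slack covers the competing duration.
def pick_resource(task, ranked_resources):
    """task: {'dur': days, 'slack': days}; resources are cheapest-first,
    each {'name': str, 'busy_task': None or {'dur': days, 'slack': days}}."""
    for r in ranked_resources:
        cur = r["busy_task"]
        if cur is None:
            return r["name"]              # free resource: allocate directly
        if task["slack"] >= cur["dur"]:
            return r["name"]              # new task can wait for current one
        if cur["slack"] >= task["dur"]:
            return r["name"]              # current task can be interrupted
    return None                           # no feasible resource found

ranked = [
    {"name": "R5", "busy_task": {"dur": 2, "slack": 4}},  # cheapest, but busy
    {"name": "R6", "busy_task": None},
]
print(pick_resource({"dur": 3, "slack": 1}, ranked))  # -> R5 (interrupts its task)
```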
6 Examples

To demonstrate the correctness of the proposed new algorithms, a new example is given in Table 4; its corresponding graphical representation and computations are shown in Figure 2.
Table 4. Example 2 – Estimating pessimistic and optimistic time durations
To further examine the algorithms, Example 1, introduced previously, is calculated accordingly, and the corresponding results are given in Table 5 and Figure 3. Example 2 shows that even if there is deadline slippage of a task (whether on the critical path or not) at any point of time during the project run, there is no major difference from the estimated actual release; in the example, the difference was shown to be just a few hours. If PERT analysis were not applied and there were slippage for task 7 in Example 1, the whole estimation would have to be redone with new times, which would take more time than estimated. But since we already considered slippage and estimated pessimistic time durations, a new critical path is determined and the job is completed accordingly.
Fig. 2. Example 2 - Task graphs with different durations: a) normal, b) optimistic, c) pessimistic

Table 5. Example 1 – Estimating pessimistic and optimistic time durations
Fig. 3. Example 1 - Task graphs with different durations: a) normal, b) optimistic, c) pessimistic

Table 6. Resource Set

Time zone | Resources  | Time period
T1        | R1, R2, R3 | 0.00 – 08.00
T2        | R4, R5     | 9.00 – 16.00
T3        | R6, R7     | 17.00 – 24.00
For evaluating resource assignment, we considered the resource set shown in Table 6 and assumed a cost for each resource, which in practice is estimated based on efficiency, time, and the cost of working on the particular task. For this example, it is assumed that the skill set required for all tasks is the same, to simplify calculation; the algorithm can also be tested with different skill set requirements.

Rank of resources = {R5, R6, R3, R4, R1, R2, R7}

The cost is assumed to be 1, 2, 3, 4, 5, 6, and 7, respectively, for the above order of the resources shown in Table 6. If the resulting performance of the above resource optimization algorithm is compared with random allocation, there is approximately a 15% improvement for Example 2 and a 40% improvement for Example 1.
7 Conclusion and Future Work

As the global distribution of software development increases day by day, there is a need to allocate each individual task to the best available resource, both to increase the efficiency of the product and to decrease the total development time, so that the company stays competitive. Sites or resources cannot be picked randomly for a job, because various dependencies and constraints exist on the task to be developed. The model we propose helps to allocate a job to the site where efficient resources are available, without compromising on time, and thereby improves performance by allocating the tasks in a project to the appropriate optimal resources. The methods proposed to handle the uncertainties help the business meet project deadlines without slippage, because the estimated deadline now takes care of most of the problems that may cause slippage.

A simulation experiment needs to be constructed to assess the efficiency of the proposed new approach. A tool that estimates the minimum time duration for a project with the best possible resources is to be designed in the future. This tool can be used by software practitioners to ease task allocation in a distributed environment, as it handles the real-world issues in determining the estimation. Also, the work can be further expanded to handle some unforeseeable situations based on the findings. For example, if a task is already allocated to site B while various precedent dependent tasks were either completed by or allocated to site A, a model can be developed to call back the assigned job from site B and allocate it to site A so that the efficiency of the product is improved. This was beyond the scope of this paper, but we think it can be achieved by using preemption techniques while scheduling. Moreover, a method to calculate the cost and then assign a rank to the resources can be part of the future work.
References

1. Atallah, M.J.: Algorithms and Theory of Computation Handbook, 1st edn. CRC Press LLC, Boca Raton (1998)
2. Cusick, J., Prasad, A.: A Practical Management and Engineering Approach to Offshore Collaboration. IEEE Software, 20–29 (September/October 2006)
3. Ebert, C., De Neve, P.: Surviving Global Software Development. IEEE Software 18(2), 62–69 (2001)
4. Fryer, K., Gothe, M.: Global Software Development and Delivery: Trends and Challenges. IBM Research, January 15 (2008), http://www.ibm.com/developerworks/rational/library/edge/08/jan08/fryer_gothe/index.html
5. Gomes, C.P., Hsu, J.: ABA: an Assignment Based Algorithm for Resource Allocation. ACM SIGART Bulletin 7(1), 2–8 (1996)
6. Gregor, E.M., Hsieh, Y., Kruchten, P.: Cultural Patterns in Software Process Mishaps: Incidents in Global Projects. In: Proceedings of the 2005 Workshop on Human and Social Factors of Software Engineering, St. Louis, MO, USA, May 16 (2005)
7. Hendrickson, C.: Project Management for Construction: Fundamental Concepts for Owners, Engineers, Architects and Builders. World Wide Web Publication, Version 2.1, prepared summer (2003)
8. Herbsleb, J.D.: The Future of Socio-technical Coordination. In: Proceedings of the 29th International Conference on Software Engineering, Minneapolis, MN, USA, May 20-26 (2007)
9. Herbsleb, J.D., et al.: An Empirical Study of Global Software Development: Distance and Speed. In: Proceedings of the 23rd International Conference on Software Engineering, Toronto, Canada, May 12-19 (2001)
10. Jain, G.: Reducing the Software Project Duration Using Global Software Development. Master Thesis, Indian Institute of Technology, Kanpur, India (April 2002)
11. Keil, P., Paulish, D.J., Sangwan, R.S.: Cost Estimation for Global Software Development. In: Proceedings of the 2006 International Workshop on Economics Driven Software Engineering Research, Shanghai, China, May 27 (2006)
12. Lanubile, F., Damian, D., Oppenheimer, H.L.: Global Software Development: Technical, Organizational, and Social Challenges. ACM SIGSOFT Software Engineering Notes 28(6) (November 2003)
13. Nalli, P.K., Atluri, S.: Software Development in an Outsourcing Environment. Master Thesis, Umea University, Sweden, June 11 (2006)
14. Parvathanathan, K., et al.: Global Development and Delivery in Practice: Experiences of the IBM Rational India Lab. IBM International Technical Support Organization (May 2007), http://www.redbooks.ibm.com/redbooks/pdfs/sg247424.pdf
15. Richardson, I., et al.: Global Software Development – the Challenges, http://www.serc.net/report/tr278.pdf
16. Sengupta, B., Chandra, S., Sinha, V.: A Research Agenda for Distributed Software Development. In: Proceedings of the 28th International Conference on Software Engineering, Shanghai, China, May 20-28 (2006)
17. Setamanit, S., Wakeland, W., Raffo, D.: Planning and Improving Global Software Development Process Using Simulation. In: Proceedings of the First International Workshop on Global Software Development for the Practitioner, Shanghai, China, May 23 (2006)
18. Treinen, J.J., Miller-Frost, S.L.: Following the Sun: Case Studies in Global Software Development. IBM Systems Journal 45(4) (2006)
19. Wiredu, G.O.: A Framework for the Analysis of Coordination in Global Software Development. In: Proceedings of the First International Workshop on Global Software Development for the Practitioner, Shanghai, China, May 23 (2006)
20. Zuluaga, A., Sefair, J.A., Medaglia, A.L.: Model for the Selection and Scheduling of Interdependent Projects. In: Proceedings of the 2007 IEEE Systems and Information Engineering Design Symposium, Charlottesville, VA, USA, April 27 (2007)
Verification of Use Case with Petri Nets in Requirement Analysis

Jinqiang Zhao 1,2 and Zhenhua Duan 1

1 Institute of Computing Theory & Technology, Xidian University, Xi'an, 710071, P.R. China
2 State Key Laboratory of Software Engineering, Wuhan University, 430072, P.R. China

[email protected], zhenhua [email protected]
Abstract. Requirement analysis plays a very important role in the reliability, cost, and safety of a software system. The use case approach remains the dominant approach during requirement elicitation in industry. Unfortunately, the use case approach suffers from several shortcomings, such as a lack of precision and the difficulty of analyzing and validating the dynamic behavior of use cases for concurrency, consistency, etc. This paper proposes an approach for overcoming the limitations of the use case approach and applies the approach in Model Driven Development (MDD). Timed and Controlled Petri Nets are used as the formal description and verification mechanism for the acquired requirements. Use cases are used to elicit the requirements and to construct scenarios. After specifying the scenarios, each of them can be transformed into its corresponding Petri-net model. Through analyzing these Petri-net models, some flaws or errors in the requirements can be detected. The proposed approach is demonstrated by an E-mail client system. Keywords: use case; Model Driven Development; Petri net; requirement analysis.
1 Introduction and Related Works
Software development usually consists of the following stages: requirement analysis, design, coding, and testing. Many research studies have shown the considerable influence of early requirement analysis on reducing unnecessary costs, confusion, and complexity in the later phases of software development [1]. Therefore, high-quality requirement analysis can most likely reduce potential risks that occur in later phases of software development.
This research is supported by NSFC Grant No. 60433010 and NSFC Grant No. 60873018 jointly sponsored by Microsoft Asia Research Academy, Defence Pre-Research Project of China No. 51315050105, SRFDP Grant 200807010012, and SKLSE20080713. Corresponding author.
With its adoption into the UML, the use case diagram soon became widely used in industry, and it remains the dominant approach to requirements description and specification within the UML. The use case approach, based on the concept of scenarios, is commonly used as a requirements elicitation technique. Software requirements are stated as a collection of use cases, each written from the user's perspective and described by a specific flow of events in the system [2].

The use case approach offers several practical advantages during software development. On one hand, software users can understand whether the software system satisfies their needs at the very beginning of development, when they see the use case diagrams. Their complaints about a system can be reported directly to the requirements developers, and the necessary changes can be made in the requirements model accordingly. Use case diagrams thus make it possible for users to evaluate the system behavior before code is written. On the other hand, use case diagrams can be used as blueprints throughout the whole software development [3].

However, the use case approach also has defects. First, use cases are often described in informal language. Although the readability of informal language is valuable, one still needs precision in specification, as natural language carries a risk of ambiguity. Second, it is difficult to analyze and validate the dynamic behavior of use cases with respect to concurrency, consistency, etc.

To overcome the limitations of the use case approach, several researchers have tried to formalize the informal aspects of use cases. P. Hsia et al. [4] used a BNF-like grammar to formally describe use cases and applied the method to a simple PBX (private branch exchange) system. Their method has a formal mathematical base, generates precise scenarios, accommodates change, and keeps users involved in the process; it is supported by a fairly complex grammar. Andersson and Bergstrand [5] used Message Sequence Charts (MSCs). MSC provides several features aimed at enhancing the expressiveness of individual MSCs, for example constructs to specify the conditional, iterative, or concurrent execution of MSC sections. An MSC-based approach has advantages over the grammar-based approach in terms of scalability and understandability. Wuwei Shen et al. [6] presented HCL (High-Level Constraint Language) to represent a requirement model formally and textually, with errors or undesired results observed through the execution of the HCL specification. Xiaoshan Li et al. [7] demonstrated a use case-driven, incremental, and iterative requirement analysis supported by a simple formal semantic model, in which a use case is defined in terms of its pre- and post-conditions, written together with state constraints in relational algebra.

In this paper, we propose an approach based on Petri nets and apply it in MDD (Model Driven Development). In MDA (Model Driven Architecture), there are four types of models [8]: CIM (Computation Independent Model), PIM (Platform Independent Model), PSM (Platform Specific Model), and ISM (Implementation Specific Model). In our approach, the use case diagram, at the CIM level, is used to elicit the user requirements.
Fig. 1. An overview of the approach
For each use case, a use case card is used to describe its properties (use case name, primary actor, pre-condition, post-condition, etc.) and the event flow of the use case. By analyzing the use case card, scenarios are constructed. Each scenario can be transformed into its corresponding Timed and Controlled Petri Net (TCPN). After the analysis and verification of these Petri net models, if flaws or errors exist, the process returns to reconstruct the use case description and scenarios; otherwise, the models at the PIM level are built from the models produced in requirement analysis. An overview of the proposed approach is shown in Fig. 1.

This Petri net-based approach is well suited to overcoming the limitations caused by the informal description of use cases. Among many techniques, we choose Petri nets due to their mathematical simplicity, modeling generality, locality in both states and actions, and graphical representation, as well as their well-developed qualitative and quantitative analysis techniques and the existence of analysis tools [18]. Other researchers have tried to describe requirements using Petri nets, for example with Modular Petri nets and Time Petri nets [2] [20]. However, their work differs from the approach in this paper:

– The research in this paper takes the MDD perspective, and the approach is applied within MDA.
– The requirement artifact we verify is the use case card, whose event flow is written using the improved scenario language.
– Timed and Controlled Petri Nets are used so that the Petri net model is controlled enough to describe selective behavior and the message interaction between objects.
– Criteria for requirement quality assessment are defined, and the requirement model is verified by analyzing the Petri net models.
However, no technique is completely perfect. The Timed and Controlled Petri Nets proposed in this paper face challenges when the event flow of a use case in an industrial system is quite complex. In that case, a use case described using TCPN will be not only difficult to understand but also sensitive to changes. Fortunately, the 'include' relationship gives a good indication that some parts of the event flow in a use case can be extracted to compose a new use case that has an 'include' relationship with the initial one.

The rest of this paper is organized as follows. Section 2 describes software requirements and the use case diagram and, for each use case, the method of constructing the use case card and scenarios. Section 3 focuses on the definition of the Timed and Controlled Petri Net. Section 4 introduces the transformation from use case descriptions to Timed and Controlled Petri Net models. The verification of TCPN models is discussed in Section 5. Finally, Section 6 presents the conclusion and future work.
2 Requirement Elicitation
During software development, requirement analysis and design is the first phase, in which software developers build a requirement model after talking to the users of a software system. The requirement quality has an influential impact on the whole system [6]. In this section, an E-mail client system is used to illustrate how to elicit requirements and build a requirement model using the method proposed in this paper. The e-mail client is software that provides the user interface connecting a user to the e-mail server; it allows individuals to send mail using the SMTP protocol [9] and to check mail from the server using the POP3 protocol. The use case diagram of the E-mail client system is shown in Fig. 2.

Fig. 2. Use case diagram of mail client system

In the use case diagram, each use case is described by a use case card, and scenarios are constructed. Here, we present a scenario language for describing scenarios [10]. The general advantages of using a scenario language include:

– It is easy for software users to understand and convenient for them to employ;
– The syntactic rules and the event frame of each event make the transformation from use case descriptions to TCPN models feasible;
– The use case card is improved using this scenario language, and the syntactic rules can be used to detect errors (missing events, extra events, and wrong ordering among events); the vagueness of scenarios written in this language is thereby reduced.

Similar scenario languages have already been introduced in several papers, such as [10] [11] [12], but some differences exist, so a brief introduction of the language is given here. A scenario can be regarded as a sequence of events. Events are behaviors employed by users or systems to accomplish their goals. A set of syntactic rules is followed while constructing scenarios.

Rule 1: Each event is written as a simple sentence of the form Subject + Predicate + Object, with exactly one subject and one predicate.
Example: (1) Client send 'DATA' and 'QUIT' to SMTP server; (false)
Sentence (1) is replaced by simple sentences:
(1) Client send 'DATA' to SMTP server; (true)
(2) Client send 'QUIT' to SMTP server; (true)

Rule 2: Use the active rather than the passive voice.
Example: (1) 'DATA' is sent to SMTP server; (false)
(1) Client send 'DATA' to SMTP server; (true)

Rule 3: Use the same word for the same description in different sentences.
Example: (1) Client send 'DATA' to SMTP server; (true)
(2) E-Mail Client send 'QUIT' to SMTP server; (false)
(2) Client send 'QUIT' to SMTP server; (true)
'Client' and 'E-Mail Client' are descriptions of the same thing.

Rule 4: Use the name of an object instead of a pronoun for the object.
Example: (1) Client send 'DATA' to it; (false)
(1) Client send 'DATA' to SMTP server; (true)

Rule 5: Each event has its own event frame. An example is shown in Table 2.

Rule 6: Each event ends with a semicolon.

Rule 7: The relations among single events are: sequence, selection, iteration, and concurrency. Most events occur sequentially, so ordered events can be regarded as sequential events. For selective events, we use "IF-THEN-ELSE {Condition, F1, F2};": if 'Condition' returns true, go to F1; otherwise, go to F2. For iterative events, we use "DO-WHILE {Condition, F1, F2, ...};": do F1, F2, ... iteratively until 'Condition' returns false. For concurrent events, we use "AND {F1, F2, F3, ...};".

We now consider the use case card for sending mail using the SMTP protocol in the E-mail client system. Table 1 shows the use case card for a client sending a mail via the SMTP protocol.
Table 1. Use case card of 'Send Mail'

Use case name: Send Mail
Primary actors: Client
Base use cases: null
Include use cases: null
Extend use cases: null
Pre-condition: Client click on 'send mail' button;
Post-condition: Client finished 'send mail';
Event flow:
(1) Client establish TCP connection with the SMTP server via port 25;
(2) Client send 'HELO' to SMTP server;
(3) SMTP server receive request of connection;
(4) DO-WHILE{ Condition: username is ok,
    (4.1) SMTP server send request of username,
    (4.2) Client send username to SMTP server };
(5) SMTP server send request of password;
(6) Client send password to SMTP server;
(7) IF-THEN-ELSE{ Condition: password is ok,
    (7.1) SMTP server send 'OK' to Client,
    (7.2) SMTP server send message of error };
(8) Client send 'MAIL' and sender's address to SMTP server;
(9) SMTP server send 'OK' to Client;
(10) Client send 'RCPT' and receiver address to SMTP server;
(11) IF-THEN-ELSE{ Condition: SMTP server identify receiver address,
    (11.1) SMTP server send 'OK' to Client,
    (11.2) SMTP server refuse the request };
(12) Client send 'DATA' and mail content to SMTP server;
(13) Client send 'QUIT' to close connection;
In the use case card shown in Table 1, 'Use case name' is the name of the use case in the use case diagram. 'Primary actors' denotes the active objects, such as humans or systems, appearing in the scenario. 'Base use cases', 'Include use cases', and 'Extend use cases' denote use cases that have the 'generalization', 'include', and 'extend' relationships, respectively, with the use case described in the card. 'Pre-condition' specifies a condition that holds at the start of the scenario; 'Post-condition' specifies a condition that holds at its end. 'Pre-condition', 'Post-condition', and 'Event flow' are written using the scenario language presented above. Furthermore, the 'Event flow' in the use case card is analyzed to extract the objects and the messages between them; if needed, the time delay of each event is also added. An example of an event frame is shown in Table 2. 'Object1' and 'Object2' are the objects extracted from the event sentence: they are elicited from the subject and object of the event flow, respectively. 'Message' denotes the message exchanged between 'Object1' and 'Object2', and 'Time Constraint' is the time delay.

Table 2. Example of event frame

Event: Client send 'HELO' to SMTP server
Object1: Client
Action: send 'HELO'
Object2: SMTP server
Message: 'HELO'
Time Constraint: 0
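To illustrate how an event frame could be derived mechanically from a scenario-language sentence, the sketch below (ours, not part of the paper) parses events of the common "Object1 send 'Message' to Object2;" shape used in Table 1 and Table 2. The regular expression and the EventFrame container are illustrative assumptions and cover only this sentence pattern.

```python
import re
from dataclasses import dataclass

@dataclass
class EventFrame:
    event: str
    object1: str            # subject of the event sentence
    action: str             # predicate plus quoted message, e.g. "send 'HELO'"
    object2: str            # object of the event sentence
    message: str            # message exchanged between Object1 and Object2
    time_constraint: float = 0.0

# Hypothetical pattern: "<Object1> <verb> '<Message>' to <Object2>;"
EVENT_RE = re.compile(r"^(?P<obj1>[\w ]+?) (?P<verb>\w+) '(?P<msg>[^']+)' to (?P<obj2>[\w ]+);$")

def parse_event(sentence: str) -> EventFrame:
    m = EVENT_RE.match(sentence.strip())
    if m is None:
        raise ValueError(f"sentence does not follow the scenario-language rules: {sentence}")
    return EventFrame(
        event=sentence.strip().rstrip(';'),
        object1=m.group('obj1'),
        action=f"{m.group('verb')} '{m.group('msg')}'",
        object2=m.group('obj2'),
        message=f"'{m.group('msg')}'",
    )

frame = parse_event("Client send 'HELO' to SMTP server;")
print(frame.object1, frame.object2)  # Client SMTP server
```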
3 Timed and Controlled Petri Net (TCPN)
Petri nets have been used extensively and successfully in various applications such as protocol or performance analysis. The general benefits of using Petri nets include:

– Petri nets allow the modeling of concurrency, synchronization, and resource-sharing behavior of a system.
– Petri nets have a visual and easily understandable representation.
– Petri nets have well-defined semantics and a solid mathematical foundation.
– Petri nets offer a variety of mature analysis techniques, such as reachability, deadlock, boundedness, safety, and invariant analysis.
– Software tools are available to assist Petri net modeling and analysis.
– The integration of Petri nets with UML could provide a means for automating behavioral analysis and building platform independent models efficiently in Model Driven Development.

In order to describe time constraints and to force the firing of one or more transitions according to a given condition or system state, we present the Timed and Controlled Petri Net (TCPN), which adds time constraints to the Controlled Petri Net. A Controlled Petri net is a class of Petri nets with external enabling conditions, called control places, which allow an external controller to influence the progression of tokens in the net. Controlled Petri nets were first introduced by Krogh [14] and Ichikawa and Hiraishi [15]. Following the formalism adopted in defining ordinary Petri nets, we present the definition of the TCPN as follows:

Definition 1 (TCPN). A Timed and Controlled Petri Net (TCPN) is an eight-tuple $TCPN = \{P, C, T, F, M_0, \lambda, \mu, \gamma\}$ where

– $P = \{p_1, p_2, \dots, p_n\}$ is a finite set of state places, or places;
– $C = \{c_1, c_2, \dots, c_n\}$ is a finite set of control places;
– $T = \{t_1, t_2, \dots, t_k\}$ is a finite set of transitions;
– $F \subseteq (P \times T) \cup (C \times T) \cup (T \times P)$ is the flow relation, i.e., the set of directed arcs;
– $M_0 : P \to \{0, 1, 2, 3, \dots\}$ is the initial marking;
– $\lambda : P \to \mathbb{N}$ is the token function of state places;
– $\mu : C \to \{0, 1\}$ is the token function of control places;
– $\gamma : T \to \mathbb{R}^{+}$ is the time delay function of transitions;

and $P \cap C = P \cap T = C \cap T = \emptyset$ with $P \cup C \cup T \neq \emptyset$, i.e., state places, control places, and transitions are disjoint sets.
Definition 2. For a transition $t \in T$, the set of input control places is $^{(c)}t = \{c \mid (c,t) \in F\}$; for a control place $c \in C$, the set of output transitions is $c^{(t)} = \{t \mid (c,t) \in F\}$; similarly, for a transition $t \in T$, the sets of input and output state places are $^{(p)}t = \{p \mid (p,t) \in F\}$ and $t^{(p)} = \{p \mid (t,p) \in F\}$; for a state place $p \in P$, the sets of input and output transitions are $^{(t)}p = \{t \mid (t,p) \in F\}$ and $p^{(t)} = \{t \mid (p,t) \in F\}$.

Definition 3. A transition $t$ is state enabled if $\lambda(p) \geq 1$ for all $p \in {}^{(p)}t$; a transition $t$ is control enabled if $\mu(c) = 1$ for all $c \in {}^{(c)}t$; a transition $t$ is enabled if and only if $t$ is both state enabled and control enabled.
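Definitions 1-3 map directly onto a small data structure. The following is a minimal sketch in Python, under the assumption that arcs are stored as (source, target) pairs; the class and method names are ours, not the paper's.

```python
from dataclasses import dataclass, field

@dataclass
class TCPN:
    places: set               # P: state places
    controls: set             # C: control places
    transitions: set          # T
    arcs: set                 # F: directed arcs as (source, target) pairs
    marking: dict             # lambda: tokens on state places (M0 at start)
    control_marking: dict     # mu: 0/1 tokens on control places
    delay: dict = field(default_factory=dict)  # gamma: transition time delays

    def pre_places(self, t):
        # (p)t: input state places of transition t
        return {p for (p, t2) in self.arcs if t2 == t and p in self.places}

    def pre_controls(self, t):
        # (c)t: input control places of transition t
        return {c for (c, t2) in self.arcs if t2 == t and c in self.controls}

    def enabled(self, t):
        # Definition 3: t must be both state enabled and control enabled
        return (all(self.marking.get(p, 0) >= 1 for p in self.pre_places(t))
                and all(self.control_marking.get(c, 0) == 1 for c in self.pre_controls(t)))
```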
4 Transforming Use Case Descriptions into Timed and Controlled Petri Nets (TCPN)
The use case description, including the use case card and the scenarios written in the scenario language, was introduced in Section 2. In this section, we discuss the transformation of a use case description into a Timed and Controlled Petri Net. A Timed and Controlled Petri subnet is defined for each part of the use case card (single event, IF-THEN-ELSE, etc.), and through the interconnection of these subnets, the net modeling the whole use case is obtained. The detailed transformation rules are described below.

4.1 Expressing a Single Event
By analyzing the event frame (Table 2), and leaving message interaction between objects aside, a single event is transformed into the subnet shown in Fig. 3(a) and (b). If the event has a time constraint, a timed transition is used (Fig. 3(b)). The presence of a token inside P0 means that transition T0 is enabled and the action of the single event is being executed. The presence of a token inside place P1 means that the action of this single event has ended.

4.2 Expressing Messages between Objects
For some events in the event flow of a use case card, a message between objects exists, i.e., the 'Message' field in the event frame of the single event is not null. The TCPN subnet used to describe a message between objects is shown in Fig. 3(c).
Fig. 3. Petri subnet with single event and message: (a) single event with immediate transition; (b) single event with timed transition; (c) subnet with message
Fig. 4. Petri subnets for 'IF-THEN-ELSE', 'DO-WHILE', and 'AND'
A token inside object1 means that transition action1 is enabled. After firing transition action1, a token is inserted into the place named message. Transition action2 is enabled only if places object2 and message each hold a token.

4.3 Expressing 'IF-THEN-ELSE'
The Timed and Controlled Petri subnet modeling the 'IF-THEN-ELSE' construct of the scenario language is shown in Fig. 4(a). The firing of transition T0 models the command to start checking the condition: it inserts a token into place P1 and allows the check to begin. When the check ends, a token is automatically inserted into control place C1 or C2 (μ(C1) = 1 or μ(C2) = 1), depending on whether the condition has returned false or true, respectively. According to the result of the check, one of transitions T1 and T2 is enabled, and the token game of the subnet then goes on with a token inside place P2 (the check has ended and returned true) or inside place P3 (the check has ended and returned false).

4.4 Expressing 'DO-WHILE'
The subnet modeling 'DO-WHILE {Condition, F1}' in the event flow is shown in Fig. 4(b). As soon as T0 is fired, execution of the event flow F1 starts and a token is inserted into place P1. The firing of transition T1 models the command to start checking the condition: it inserts a token into place P2. When the check ends, a token is automatically inserted into control place C1 or C2, i.e., μ(C1) = 1 or μ(C2) = 1. The evolution of the subnet is similar to that of Fig. 4(a); the difference is the directed arc between transition T2 and place P0, followed when the check ends and returns true.

4.5 Expressing 'AND'
Fig. 4(c) shows the subnet modeling 'AND {F1, F2}'. The firing of transition T0 puts a token into both place P1 and place P2. In this way, events F1 and F2 have to be executed concurrently. Transition T3 can only be fired if both places P3 and P4 hold tokens. The presence of a token inside place P5 means the concurrent event has ended.
Fig. 5. TCPN model of ‘Send Mail’
4.6 TCPN Model of 'Send Mail'
According to the rules given above, the TCPN model of 'Send Mail' (whose use case card is described in Section 2) is obtained through the interconnection of subnets. The TCPN is constructed in the following order: first, the 'Pre-condition' of the use case description is transformed into a subnet; second, the 'Event flow' is transformed into subnets sequentially; third, the 'Post-condition' is transformed into a TCPN subnet. The TCPN model of 'Send Mail' is shown in Fig. 5.
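As a concrete illustration of these rules, the sketch below assembles the places, control places, transitions, and arcs of the 'IF-THEN-ELSE' subnet of Fig. 4(a). The labels follow the figure; we assume, consistently with the token game described in Section 4.3, that C2 guards the true branch T1 and C1 the false branch T2.

```python
def if_then_else_subnet():
    """Sketch of the Fig. 4(a) subnet: T0 starts the condition check
    (token from P0 to P1); the external check puts a token into C1
    (false) or C2 (true); T1 ends with a token in P2, T2 in P3."""
    P = {"P0", "P1", "P2", "P3"}                   # state places
    C = {"C1", "C2"}                               # control places
    T = {"T0", "T1", "T2"}                         # transitions
    F = {
        ("P0", "T0"), ("T0", "P1"),                # start the check
        ("P1", "T1"), ("C2", "T1"), ("T1", "P2"),  # check returned true
        ("P1", "T2"), ("C1", "T2"), ("T2", "P3"),  # check returned false
    }
    return P, C, T, F
```

The 'DO-WHILE' subnet of Fig. 4(b) would differ only in the extra arc leading back to P0 when the check returns true, and a full use case model such as Fig. 5 is obtained by chaining such fragments.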
5 Verification of Use Cases by Analyzing Petri Net Models
A major strength of Petri nets is their support for the analysis of properties and problems associated with the modeled systems. In order to find faults in use cases, we therefore analyze behavioral properties such as reachability, boundedness, liveness, and reversibility. First of all, it is necessary to understand what a high-quality requirement is. The following criteria for requirement quality assessment are used when verifying requirements by analyzing the TCPN model.
1. Completeness: The acquired requirements are complete if the established requirement model includes all the functions of the system, and all significant requirements and software responses are included or defined.
2. Consistency: Two or more requirements are not in conflict with one another.
3. Correctness: The elicited requirements should satisfy the users' demands, and the requirements themselves should be correct.
4. Unambiguity: It must be ensured that a requirement has only one meaning in a particular context. Because our use case event flows are written following the rules described in Section 2, the ambiguity of the requirement descriptions is already reduced; we therefore do not verify ambiguity by analyzing Petri nets.

While analyzing the TCPN model of a use case, the Petri net tool PIPE2 (Platform Independent Petri Net Editor) is used for Petri net comparison and for generating reachability graphs [16]. The verification of use cases is described in detail below.

5.1 Verification of 'Completeness'
If the requirements are complete, the Timed and Controlled Petri Net models generated from the requirement descriptions should be complete as well.

Definition 4 (Completeness). A Timed and Controlled Petri Net is complete if

– all places and transitions are specified by particular names;
– no isolated subnet exists in the TCPN model of each use case.

The approach for verifying 'Completeness' is as follows. For a TCPN Q = {P, C, T, F, M0, λ, μ, γ}, traverse all places and transitions from the initial place. If there exists a p ∈ P, c ∈ C, or t ∈ T that cannot be traversed, i.e., an isolated subnet exists, display ("some message associated with the object is not identified"). If ∃p ∈ P such that p does not have a particular name, display ("objects are not identified in the event frame"). If ∃t ∈ T such that t does not have a particular name, display ("'Action' in the event frame is not identified as a name").
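A possible realization of this traversal, treating the arcs as an undirected graph for the purpose of detecting isolated subnets; the function name and error strings echo the displays above but are otherwise our own sketch.

```python
from collections import deque

def completeness_errors(nodes, arcs, names, initial):
    """Report Definition 4 violations: unnamed nodes and isolated subnets.

    nodes: all places, control places, and transitions; arcs: (src, dst)
    pairs; names: dict mapping node -> name; initial: the initial place.
    """
    # Breadth-first traversal from the initial place, ignoring arc direction
    neighbours = {n: set() for n in nodes}
    for src, dst in arcs:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, queue = {initial}, deque([initial])
    while queue:
        for m in neighbours[queue.popleft()]:
            if m not in seen:
                seen.add(m)
                queue.append(m)
    errors = []
    if nodes - seen:
        errors.append("some message associated with the object is not identified")
    for n in nodes:
        if not names.get(n):
            errors.append(f"node {n} has no particular name")
    return errors
```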
Verification of ‘Consistency’
The consistency of requirements is verified by analyzing some properties of the TCPN.

Definition 5 (Consistency). The TCPN models are consistent if

– the TCPN model itself is consistent, i.e., the TCPN is live;
– the TCPN models of related use cases are consistent, i.e., the TCPN model of a use case U is consistent with those of U's include use cases, and with those of U's base use cases.
The approach for verifying ‘Consistency’ is to simulate a Petri net model using PIPE tool [16]: For a TCPN Q={P, C, T, F, M0 ,λ,μ,γ}: If Q is dead (L0-live), display (‘some events never happen, deadlock exists’). For use case and ‘Include use cases’ in use case card: If name of ’Include use cases’ does not exist in transition names of TCPN model for a use case, display (“inconsistent between use case and ‘Include use cases’”). For use case and ‘Base use cases’ in use case card: By comparing between TCPN model of use case and that of its ‘Base use cases’, If number of transitions in TCPN model of ‘Base use cases’ is more than that of the use case, or no identical transitions exist, display (“inconsistent between use case and ‘Base use cases’”). 5.3
Verification of ‘Correctness’
The correctness of TCPN models describing use cases is defined below:

Definition 6 (Correctness). The TCPN models are correct if

– the reachability graph of the TCPN model is correct;
– the TCPN model is bounded;
– the time delays of the transitions of the TCPN model are valid.

The method for verifying 'Correctness' is as follows. For a TCPN Q = {P, C, T, F, M0, λ, μ, γ}: if the correct trace is a→b→c→d, errors of omission result in a→b→d or b→c→d; errors of extra events result in a→b→c→c→d or a→b→c→x→d; errors of event reversal result in a→c→b→d. These errors are detected by analyzing the reachability graph of the TCPN after setting the initial marking of the state places and the token numbers of the control places. Fig. 6(a) shows the reachability graph of the TCPN model of Fig. 5 when λ(Client) = 1, λ(SMTP server) = 1, μ(C2) = 1, and Fig. 6(b) shows the reachability graph when λ(Client) = 2, λ(SMTP server) = 2, μ(C2) = 1. If Q is not bounded, display ("maybe overflow exists"). If ∃t ∈ T such that γ(t) is not valid, the time-delay error is reported.

In the tolerance factor expression, ⟨rB⟩ denotes the average ionic radius of the B and B′ cations. Using this equation, the crystal structure of the different compounds can be estimated. For the Sr2CrWO6 compound, the numerical value of t is calculated to be 0.999, indicating that it is cubic in structure. However, the value t = 1.059 for Ba2CrWO6, due to the large ionic radius of Ba2+, causes a structural phase transition to hexagonal. For the Ca2CrWO6 compound, t = 0.945 indicates a heavy
distortion in the structure of Ca2CrWO6, and thus a transformation into the monoclinic P21/n space group takes place [5].

2.3 Model Construction

To correlate the lattice constants of the perovskites with six independent parameters, the prediction models can be expressed in functional form as:
$$\mathrm{Model} = f(r_A,\, r_B,\, r_{B'},\, x_B,\, x_{B'},\, z_B) \qquad (3)$$

The lattice constant predictions of the MLR model [9] are obtained using the following relationship:

$$a_{\mathrm{MLR}} = 4.3966 + 1.1659\, r_A + 1.0637\, r_B + 1.7085\, r_{B'} + 0.0747\, x_B + 0.0435\, x_{B'} + 0.0499\, z_B \qquad (4)$$
2.4 SVR Model Development

Descriptions of SVR model development are available in the chemometrics literature [13], [14], [15] and are well documented in statistical learning theory [16], [17]. Here, we explain it largely with reference to implementation. SVR models map the nonlinear data samples into a higher-dimensional feature space, so that the data samples may become linearly separable. For $N$ samples, we have a dataset of pairs $S = (x_i, y_i),\ \forall i = 1, \dots, N$. The input data are mapped by a nonlinear mapping $\Phi$ such that they may be linearly separable. Here, $\Phi_i(x),\ \forall i = 1, \dots, l$, where $l \le N$ represents the number of support vectors. The SVR predicted model is estimated as:

$$y = \sum_{i=1}^{l} w_i \Phi_i(x) + b \qquad (5)$$
The above equation can be solved by using a kernel function as follows:

$$f(x, \alpha, \alpha^*) = \sum_{i=1}^{l} (\alpha_i^* - \alpha_i)\, K(x, x_i) + b \qquad (6)$$
where $K(x, x_i)$ is the kernel function, selected according to the complexity of the input data. The numerical value of the kernel is computed by taking the dot product of the nonlinear mapping from the input space, $x \in \mathbb{R}^n$, to the output space, $y \in \mathbb{R}$. The coefficients $\alpha_i^*$ and $\alpha_i$ are the Lagrange multipliers. In the training phase, the SVR model uses the training data to minimize the following:

$$R(w) = \frac{1}{l} \sum_{i=1}^{l} \left| y_i - f(x_i, w) \right|_{\varepsilon} + \gamma \langle w, w \rangle \qquad (7)$$

where $\langle w, w \rangle$ represents the dot product of two weight vectors, $\gamma$ is a constant, and $\varepsilon$ is the insensitivity parameter.
$$\left| y_i - f(x_i, w) \right|_{\varepsilon} = \begin{cases} 0, & \text{if } \left| y_i - f(x_i, w) \right| < \varepsilon \\ \left| y_i - f(x_i, w) \right| - \varepsilon, & \text{otherwise} \end{cases} \qquad (8)$$
The Lagrange multipliers $\alpha_i^*$ and $\alpha_i$ are determined by maximizing the following functional, subject to the constraints below:

$$w(\alpha^*, \alpha) = -\varepsilon \sum_{i=1}^{l} (\alpha_i^* + \alpha_i) + \sum_{i=1}^{l} y_i (\alpha_i^* - \alpha_i) - \frac{1}{2} \sum_{i,j=1}^{l} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)\, K(x_i, x_j) \qquad (9)$$

$$\sum_{i=1}^{l} (\alpha_i^* - \alpha_i) = 0, \qquad 0 \le \alpha_i^*, \alpha_i \le C, \qquad i = 1, \dots, l \qquad (10)$$
where $C$ is a trade-off parameter that determines the cost of constraint violation; its value is determined empirically during training. At the end of training, the support vectors corresponding to non-zero values of the coefficients $(\alpha_i^* - \alpha_i)$ are selected for SVR model construction. We have chosen the most commonly used Gaussian kernel function. The values of the cost parameter $C$, the error parameter $\varepsilon$, and the kernel width $\sigma$ are optimized by grid search [18]. To find the optimal values of these parameters ($C = 100$, $\sigma = 1.7255$, and $\varepsilon = 1.0 \times 10^{-6}$), the relative percent error between the experimentally reported LC values and the values predicted by the SVR model is minimized during parameter optimization.

2.5 ANN Model Development

The ANN model is provided with a set of training samples and desired output values to organize its connection weights. To compare and analyze the results with the SVR model, the back-propagation algorithm in Matlab [19] is employed for training. The weights of the network are adjusted using the Levenberg–Marquardt algorithm. The activation functions used in the hidden and output layers are tansig and pure linear, respectively. For optimal network performance, four neurons are selected for the hidden layer of the ANN architecture. Initial weights and bias values are adjusted randomly. The values of the six atomic parameters and the desired LC value of each sample data point are fed as the network input. Out of 147 compounds, 98 randomly selected compounds are used for training and the remaining 49 compounds are used for ANN model validation.
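For readers who wish to reproduce this setup, the sketch below trains an ε-SVR with a Gaussian (RBF) kernel using scikit-learn as a stand-in for the authors' implementation. The mapping of the reported kernel width σ to scikit-learn's gamma = 1/(2σ²) is our assumption, the data arrays are placeholders for the actual 98 training compounds, and the PAD expression is the percent absolute difference consistent with the tabulated values.

```python
import numpy as np
from sklearn.svm import SVR

# Placeholder data: rows hold (r_A, r_B, r_B', x_B, x_B', z_B); targets are LCs.
X_train = np.random.rand(98, 6)    # stand-in for the 98 training compounds
y_train = 7.5 + np.random.rand(98)

sigma = 1.7255                     # kernel width reported in the paper
model = SVR(kernel="rbf",
            C=100.0,               # trade-off parameter C of Eq. (10)
            epsilon=1.0e-6,        # insensitivity parameter of Eq. (8)
            gamma=1.0 / (2.0 * sigma ** 2))
model.fit(X_train, y_train)
pred = model.predict(X_train)

# Percent absolute difference (PAD) between predicted and reported LCs
pad = 100.0 * np.abs(pred - y_train) / y_train
print(pad.mean())
```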
3 Results and Discussions

The performance of the prediction models is estimated in terms of PAD for training, validation, and novel compounds. It is observed from Tables 1-2 and supplementary Tables 2-3 (appendix) that the PAD values of the SVR and ANN prediction models are appreciably lower than those of the MLR and SPuDS prediction models. The SVR and ANN models retain higher performance not only on the training and validation data but on the novel data as well.
Table 1 highlights the PAD values of the SVR, ANN, and MLR prediction models for the 48 novel compounds. It indicates a significant difference between the average PAD values of 0.2733 and 0.4777 for the SVR and ANN models, respectively: the SVR model provides 20 percent more accurate predictions for novel compounds than the ANN model. This is because ANN models usually generalize less well; they are prone to becoming trapped in local minima during training. Another possible disadvantage of ANN models is their relatively higher training time [20]. Therefore, the SVR model generalizes better and is more efficient than the ANN model.

To explore the validity of the prediction models, some other types of double perovskites are also included. In Table 1, for the tetragonal structure of the Sr2MnRuO6 compound (row 48), reasonably low PAD values of 0.8004, 0.6933, and 0.9681 are predicted by the SVR, ANN, and MLR models, respectively. The Sr2MnRuO6 compound has I4/mcm symmetry with lattice parameters a = 5.45 Å, b = 5.45 Å, c = 7.933 Å, t = 0.987, and a_pseudocubic = 7.933 Å [21]. The Sr2MnReO6 compound (row 47) is reported to belong to the monoclinic space group P21/n in [22], with lattice parameters a = 5.668 Å, b = 5.645 Å, c = 7.9904 Å, and a_pseudocubic = 7.9904 Å; it is reported to be cubic with LC = 8.0012 Å and t = 0.9609 in [23]. The SVR-based model nevertheless provides an excellent prediction, with a PAD value of 0.1128, for this compound, while reasonably good PAD values of 0.3453 and 1.5876 are found for the same compound with the ANN- and MLR-based prediction models, respectively. The t value of 0.945 for the Ca2CrWO6 compound (row 43) indicates that it is highly distorted from the ideal cubic symmetry. For this heavily distorted compound, the SVR-, ANN-, and MLR-based prediction models predict appreciably low PAD values of 1.3437, 1.3868, and 1.8369, respectively. These results show not only the validity of the prediction models but also confirm the higher robustness of the SVR model for other categories of perovskites.

Table 1. Performance comparison of prediction models for the 48 novel compounds
No.  Compound      Expt. LC   SVR Pred. LC  SVR PAD   ANN Pred. LC  ANN PAD   MLR Pred. LC  MLR PAD
1    Sr2AlSbO6     7.7662     7.7668        0.0077    7.7591        0.0917    7.7883        0.2844
2    Ba2GdTaO6     8.47       8.4879        0.2112    8.4937        0.28      8.4902        0.2385
3    Sr2FeMoO6     7.9072     7.9011        0.0767    7.8931        0.1782    7.9107        0.0446
4    Sr2FeReO6     7.887      7.8707        0.2067    7.88          0.0892    7.8482        0.4924
5    Sr2CrReO6     7.8152     7.8403        0.3216    7.8404        0.322     7.8329        0.226
6    Ba2YRuO6      8.339      8.3508        0.142     8.3466        0.0911    8.3506        0.1391
7    Ba2LuRuO6     8.272      8.3015        0.357     8.2894        0.2101    8.3054        0.4035
8    Sr2InSbO6     8.094      8.0971        0.038     8.0965        0.0311    8.0575        0.4514
9    Sr2GaSbO6     7.88       7.8668        0.1671    7.853         0.3423    7.8638        0.2061
10   Ba2ErMoO6     8.4368     8.383         0.6382    8.3429        1.1126    8.4136        0.2749
11   Sr2MnReO6     8.0012     8.0110        0.1228    7.9736        0.3453    7.873         1.5876
12   Sr2MnMoO6     8.01       8.0258        0.1974    7.9946        0.1919    8.0444        0.4289
13   Ba2TbIrO6     8.3848     8.4302        0.5415    8.5084        1.4741    8.512         1.5167
14   Ba2HoTaO6     8.40748    8.4463        0.4615    8.4311        0.2809    8.4538        0.5512
15   Ba2YMoO6      8.39173    8.3951        0.0398    8.3578        0.4042    8.4257        0.4053
16   Ba2YbMoO6     8.3378     8.3511        0.16      8.3097        0.3364    8.3924        0.6554
17   Ba2PrIrO6     8.4013     8.3694        0.3798    8.4512        0.5942    8.4565        0.6576
18   Ba2HoRuO6     8.3419     8.353         0.1334    8.3482        0.0761    8.3509        0.1081
19   Ba2SmSbO6     8.50908    8.4696        0.4635    8.452         0.6708    8.4693        0.4675
20   Ba2HoSbO6     8.4119     8.3964        0.1845    8.367         0.534     8.4042        0.0917
21   Ba2YSbO6      8.402      8.3943        0.092     8.3653        0.4365    8.4039        0.0223
22   Ba2SmMoO6     8.4762     8.4693        0.0813    8.4449        0.3694    8.4912        0.1766
23   Ba2EuMoO6     8.459      8.449         0.1179    8.4283        0.3634    8.4832        0.2861
24   Ba2GdMoO6     8.4481     8.445         0.0365    8.4154        0.3872    8.4677        0.2315
25   Ba2DyMoO6     8.4062     8.4114        0.0613    8.3761        0.3578    8.4385        0.3843
26   Ba2DySbO6     8.4247     8.4108        0.1654    8.3836        0.4879    8.4166        0.0957
27   Ba2PrRuO6     8.48416    8.4637        0.2408    8.4862        0.0236    8.4531        0.3666
28   Ba2NdRuO6     8.4706     8.4555        0.1784    8.4756        0.0596    8.4449        0.3039
29   Ba2MnWO6      8.1985     8.2048        0.0764    8.183         0.1885    8.2683        0.8519
30   Sr2FeReO6     7.90089    7.9655        0.8173    7.8884        0.1575    7.8985        0.0304
31   Ba2FeNbO6     8.118      8.0967        0.2623    8.0606        0.7067    8.1358        0.2196
32   Ba2FeWO6      8.135      8.1439        0.1089    8.1292        0.071     8.1942        0.7283
33   Ba2CePtO6     8.4088     8.3861        0.2703    8.4699        0.7269    8.482         0.8711
34   Ba2PrPtO6     8.3892     8.3616        0.3292    8.4445        0.6587    8.460         0.8442
35   Ba2YbRuO6     8.2753     8.2966        0.2571    8.298         0.2741    8.3255        0.6069
36   Sr2MgIrO6     7.8914     7.9094        0.2277    7.8465        0.5687    7.9128        0.2718
37   Ba2CaIrO6     8.3567     8.3686        0.143     8.3783        0.2586    8.432         0.9016
38   Sr2NiMoO6     7.9019     7.8844        0.2217    7.8216        1.0163    7.8685        0.4221
39   Ba2ScBiO6     8.366      8.3859        0.2376    8.3644        0.0197    8.5006        1.6088
40   Ba2YSnO5.5    8.521      8.511         0.1172    8.4367        0.9899    8.5537        0.384
41   Ba2DySnO5.5   8.513      8.5273        0.1679    8.4536        0.6981    8.5665        0.6283
42   Ba2HoSnO5.5   8.498      8.5129        0.1756    8.4379        0.7071    8.554         0.6594
43   Ca2CrWO6      7.66       7.7629        1.3437    7.7662        1.3868    7.8007        1.8369
44   Ba2CrWO6      8.06       7.9971        0.78      7.9854        0.9254    8.1155        0.6886
45   Sr2ScReO6     7.98686    7.997         0.1274    8.017         0.3775    7.9896        0.0348
46   Sr2CoReO6     7.95083    7.9146        0.4557    7.8596        1.1469    7.8496        1.2727
47   Sr2MgReO6     7.93283    7.9026        0.3813    7.8364        1.2155    7.8656        0.8471
48   Sr2MnRuO6     7.9333     7.8698        0.8004    7.878         0.6933    7.8565        0.9681
     Mean PAD                               0.2733                  0.4777                  0.5370
It is observed from Table 2 that the average PAD value of the ANN model is 0.4218 for the validation data. The table shows the marginal improvement of the SVR model over the ANN model: the average PAD values of the SVR and ANN models are 0.1768 and 0.1779 for the training data, respectively, and 0.4257 and 0.4218 for the validation data. In brief, the overall PAD statistics of the prediction models can be summarized as: avgPAD_SVR < avgPAD_ANN < avgPAD_MLR < avgPAD_SPuDS.

The graphs in Fig. 2 to Fig. 5 show the correlation between the experimental and predicted LC values of the four prediction models for the novel data. In Fig. 2, the equation of the linear fit and the R value of the SVR model are computed to be Y = (0.99)T + (0.083) and 0.9904, respectively. Similarly, the equations of the linear fits and the R values of the ANN and MLR models are shown in Fig. 3 and Fig. 4 as Y_ANN = (0.96)T + (0.38), R_ANN = 0.9808, and Y_MLR = (0.92)T + (0.67), R_MLR = 0.9804, respectively. This shows that the linear fit and R value of the SVR-based prediction model are closer to the ideal linear fit than those of the ANN/MLR models. Therefore, the SVR model can be used more effectively for the accurate prediction of LCs. It is concluded that all prediction models except the SPuDS program can be adopted for the LC prediction of double cubic perovskites. However, if accurate prediction of LC values is of interest, the SPuDS program may not be very effective, although it is a general-purpose perovskite structure prediction program [10].
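The linear fits and R values quoted above can be recomputed from any set of (experimental, predicted) pairs; a short sketch:

```python
import numpy as np

def linear_fit_and_r(expt, pred):
    """Return slope, intercept of pred ~ slope*expt + intercept, and Pearson R."""
    slope, intercept = np.polyfit(expt, pred, 1)
    r = np.corrcoef(expt, pred)[0, 1]
    return slope, intercept, r
```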
Table 2. Average PAD statistics of prediction models

Input data          SVR model   ANN model   MLR model   SPuDS [10]
Training data       0.1768      0.1779      0.4664      1.6281
Validation data     0.4257      0.4218      0.5928      1.6923
Novel data          0.2733      0.4777      0.5370      1.4115
Overall Mean PAD    0.2919      0.3591      0.5321      1.5773
Fig. 2. Correlation between experimental and predicted LC values by SVR model for validation data
Fig. 3. Correlation between experimental and predicted LC values by ANN model for validation data
Fig. 4. Correlation between experimental and predicted LC values by MLR model for validation data
Fig. 5. Correlation between experimental and predicted LC values by SPuDS model for validation data
4 Conclusion

In this work, four different types of prediction models are employed to correlate the lattice constants with six atomic parameters of the constituent ions of double perovskites. Their prediction performance is analyzed for 48 novel perovskites obtained from the recent materials science literature. It is concluded that the SVR- and ANN-based nonlinear prediction models are better than the MLR and SPuDS models. The SVR model exhibits better generalization than the ANN model, especially for novel perovskites. Therefore, the SVR-based learning approach can be employed as an alternative tool in computational materials science.
References

1. Baran, E.J.: Structural chemistry and physicochemical properties of perovskite-like materials. Catalysis Today 8, 133–151 (1990)
2. Wolfram, T., Ellialtioglu, S.: Electronic and Optical Properties of d-Band Perovskites. Cambridge University Press, Cambridge (2006)
3. Azad, A.K.: Synthesis, Structure, and Magnetic Properties of Double Perovskites of the type A2MnBO6 and A2FeBO6 (A=Ca, Sr, Ba; B=W, Mo, Cr). Ph.D. Thesis, Göteborg University, Sweden (2004)
4. Bouville, M., Ahluwalia, R.: Effect of lattice-mismatch-induced strains on coupled diffusive and displacive phase transformations. Physical Review B - Condensed Matter and Materials Physics 75, 054110–054118 (2007)
5. Philipp, J.B., Majewski, P., Alff, L., Erb, A., Gross, R., Graf, T., Brandt, M.S., Simon, J., Walther, T., Mader, W., Topwal, D., Sarma, D.D.: Structural and doping effects in the half-metallic double perovskite A2CrWO6 (A = Sr, Ba, and Ca). Physical Review B - Condensed Matter and Materials Physics 68, 1444311–14443113 (2003)
6. Serrate, D., De Teresa, J.M., Ibarra, M.R.: Double perovskites with ferromagnetism above room temperature. Journal of Physics Condensed Matter 19, 023201–023287 (2007)
7. Faik, A., Gateshki, M., Igartua, J.M., Pizarro, J.L., Insausti, M., Kaindl, R., Grzechnik, A.: Crystal structures and cation ordering of Sr2AlSbO6 and Sr2CoSbO6. Journal of Solid State Chemistry 181, 1759–1766 (2008)
8. Bokov, A.A., Protsenko, N.P., Ye, Z.G.: Relationship between ionicity, ionic radii and order/disorder in complex perovskites. Journal of Physics and Chemistry of Solids 61, 1519–1527 (2000)
9. Dimitrovska, S., Aleksovska, S., Kuzmanovski, I.: Prediction of the unit cell edge length of cubic A2BB'O6 perovskites by multiple linear regression and artificial neural networks. Central European Journal of Chemistry 3, 198–215 (2005)
10. Lufaso, M.W., Woodward, P.M.: Prediction of the crystal structures of perovskites using the software program SPuDS. Acta Crystallographica Section B: Structural Science 57, 725–738 (2001)
11. Shannon, R.D.: Revised effective ionic radii and systematic studies of interatomic distances in halides and chalcogenides. Acta Cryst. A 32, 751–767 (1976)
12. Environmental, Chemistry & Hazardous Materials Resources, http://environmentalchemistry.com/yogi/periodic/
13. Xu, L., Wencong, L., Shengli, J., Yawei, L., Nianyi, C.: Support vector regression applied to materials optimization of sialon ceramics. Chemometrics and Intelligent Laboratory Systems 82, 8–14 (2006)
14. Li, J., Liu, H., Yao, X., Liu, M., Hu, Z., Fan, B.: Quantitative structure-activity relationship study of acyl ureas as inhibitors of human liver glycogen phosphorylase using least squares support vector machines. Chemometrics and Intelligent Laboratory Systems 87, 139–146 (2007)
15. Pan, Y., Jiang, J., Wang, R., Cao, H., Cui, Y.: Predicting the auto-ignition temperatures of organic compounds from molecular structure using support vector machine. Chemometr. Intell. Lab. Syst. 92, 169–178 (2008)
16. Smola, A., Schoelkopf, B.: A Tutorial on Support Vector Regression, vol. 14, pp. 199–222. Springer, Netherlands (2004)
17. Gunn, S.: Support vector machines for classification and regression. ISIS Technical Report (1999)
18. Majid, A., Khan, A., Mirza, A.M.: Combination of Support Vector Machines Using Genetic Programming. International Journal of Hybrid Intelligent Systems 3, 1–17 (2006)
19. MATLAB 7.0: Mathworks, http://www.mathworks.com
20. Javed, S.G., Khan, A., Majid, A., Mirza, A.M., Bashir, J.: Lattice constant prediction of orthorhombic ABO3 perovskites using support vector machines. Computational Materials Science 39, 627–634 (2007)
21. Woodward, P.M., Goldberger, J., Stoltzfus, M.W., Eng, H.W., Ricciardo, R.A., Santhosh, P.N., Karen, P., Moodenbaugh, A.R.: Electronic, magnetic, and structural properties of Sr2MnRuO6 and LaSrMnRuO6 double perovskites. Journal of the American Ceramic Society 91, 1796–1806 (2008)
22. Kato, H., Okuda, T., Okimoto, Y., Tomioka, Y., Oikawa, K., Kamiyama, T., Tokura, Y.: Structural and electronic properties of the ordered double perovskites A2MReO6 (A = Sr, Ca; M = Mg, Sc, Cr, Mn, Fe, Co, Ni, Zn). Physical Review B - Condensed Matter and Materials Physics 69, 184412–184420 (2004)
23. Popov, G., Greenblatt, M., Croft, M.: Large effects of A-site average cation size on the properties of the double perovskites Ba2-xSrxMnReO6: A d5-d1 system. Physical Review B - Condensed Matter and Materials Physics 67, 244061–244069 (2003)
A Grid Implementation of Direct Semiclassical Calculations of Rate Coefficients

Alessandro Costantini 1, Noelia Faginas Lago 1, Antonio Laganà 1, and Fermín Huarte-Larrañaga 2

1 Department of Chemistry, University of Perugia, Perugia, Italy
2 Computer Simulation and Modeling (CoSMo) Lab, Parc Científic de Barcelona and Institut de Química Teórica de la Universidad de Barcelona (IQTCUB), Barcelona, Spain
{alex,noelia}@impact.dyn.unipg.it, [email protected], [email protected]
Abstract. A detailed description is given of the implementation, on the EGEE production computing grid, of a semiclassical code performing the calculation of atom-diatom reaction rate coefficients. An application to the N + N2 reaction, for which a massive computational campaign has been performed, is reported.
1 Introduction
Computational modeling of spacecraft reentry [1], secondary pollutant production in the atmosphere [2,3], combustion of fuels and biofuels [4], semi-permeable membranes of fuel cells [5], and the permeability of micropores to ions [6] is often based on multiscale simulations [7]. The crucial step at the microscopic scale of these codes is the evaluation of the thermal rate coefficients k(T) of the elementary chemical reactions taking place in the corresponding complex processes. By definition [8], the thermal rate coefficient k(T) links the macroscopic domain of concentrations to the microscopic domain of molecular processes and can be formulated as the thermal average of the cumulative reaction probability, N(E):

$$k(T) = \frac{1}{h Q_r(T)} \int_{-\infty}^{\infty} dE\, e^{-E/k_B T}\, N(E) \qquad (1)$$

where h is the Planck constant, k_B the Boltzmann constant, E the total energy, and Q_r(T) the reactant thermal partition function per unit volume at the temperature T. The cumulative reaction probability N(E), sometimes also called the CRP, can be derived from the scattering matrix elements $S_{i \to i'}(E, J)$. Usually, S matrix elements are calculated by integrating the corresponding Schrödinger equation in either its time-dependent [9] or time-independent [10] formulation. However, there is no reason to undertake such a heavy numerical effort when the ultimate goal of the calculation is to average them to evaluate
a rate coefficient. In this case, in fact, especially when the reaction occurs by overcoming a barrier and no long-lived complex is formed, it is possible (and clearly more convenient) to calculate k(T) in a direct way. Direct methods are based on a dynamics calculation confined to the region surrounding the saddle point. They have the advantage not only of significantly reducing the numerical effort with respect to a state-to-state treatment but also of providing an intuitive interpretation of reactivity in the language of the popular transition state theory [11]. Moreover, they allow the study of reactions for systems of arbitrary complexity. In fact, although direct approaches have also been developed using purely quantum schemes [12,13,14], the fact that they can be applied to trajectory-based approaches, as is indeed the case for the semiclassical approaches discussed in refs. [15,16], makes them applicable to heavier and larger chemical systems.

In this paper, we discuss in Section 2 the semiclassical formalism and the articulation of the computational procedures developed to calculate rate coefficients for the generic A + BC reaction. In Section 3 we discuss the computational features best suited to a distributed implementation of the code, and the results of its actual implementation, applied to the N + N2 reaction, on the segment of the production EGEE grid [17] available to the COMPCHEM Virtual Organization (VO) [18]. Conclusions are drawn in Section 4.
2 The Semiclassical Approach
As is well known, the direct evaluation of the thermal rate coefficient k(T) is based on the following equation:

$$k(T) = \frac{1}{h Q_r(T)} \int_0^{\infty} dt\, C_f(t) \qquad (2)$$

in which C_f(t) is the flux correlation function, which allows a direct calculation of the cumulative reaction probability [19,20]. An exact formulation of C_f(t) in terms of the flux operator $\hat{F}$ and of the Hamiltonian operator $\hat{H}$ is given as

$$C_f(t) = \mathrm{tr}\left[\hat{F}\, e^{i(\hat{H}t + i\beta \hat{H}/2)}\, \hat{F}\, e^{-i(\hat{H}t + i\beta \hat{H}/2)}\right] \qquad (3)$$

where β = 1/(k_B T) (in atomic units). In the semiclassical (SC) version of Eq. 2, the flux correlation function [16,21,22] is factorized in terms of a "static" factor C_f(0) and a "dynamical" factor R_f(t) as follows:

$$k(T) = \frac{1}{h Q_r(T)}\, C_f(0) \int_0^{\infty} dt\, R_f(t) \qquad (4)$$

with R_f(t) = C_f(t)/C_f(0). To calculate the "static" factor C_f(0) exactly, use is made of imaginary-time path integral techniques, whereas to calculate the "dynamical" factor R_f(t), an approximation based on the Semiclassical IVR (Initial Value Representation) and path integral techniques is adopted. The calculation of the static factor C_f(0) is carried out using the formula

$$C_f(0) = \frac{32}{(\beta \hbar)^2}\left(Q_{rrpp} - Q_{rprp}\right) \qquad (5)$$
which has the advantage of using the converging difference between two traces (rather than calculating their divergent individual values). In Eq. 5, Q_rrpp and Q_rprp (where r stands for reactants and p for products) are the constrained partition functions defined as

$$Q_{rrpp} = \mathrm{tr}\left[e^{-\beta \hat{H}/4}\, h_r(\hat{s})\, e^{-\beta \hat{H}/4}\, h_r(\hat{s})\, e^{-\beta \hat{H}/4}\, h_p(\hat{s})\, e^{-\beta \hat{H}/4}\, h_p(\hat{s})\right] \qquad (6)$$

$$Q_{rprp} = \mathrm{tr}\left[e^{-\beta \hat{H}/4}\, h_r(\hat{s})\, e^{-\beta \hat{H}/4}\, h_p(\hat{s})\, e^{-\beta \hat{H}/4}\, h_r(\hat{s})\, e^{-\beta \hat{H}/4}\, h_p(\hat{s})\right] \qquad (7)$$
with $h_r(\hat{s}) = 1 - h(\hat{s})$ and $h_p(\hat{s}) = h(\hat{s})$ being the projection operators onto the reactant and product sides of a surface separating the reactant and product sections of the reaction channel. The above partition functions can be evaluated using their discretized path integral expressions [23]. The calculation of the dynamical factor R_f(t) is carried out using the following approximate formulation:

$$R_f(t) \sim \left\langle \frac{\langle q_t p_t |\, \hat{F}(\beta/2)\, | q'_t p'_t \rangle\; C_t(q_0 p_0)\, C_t^*(q'_0 p'_0)\; e^{i[S_t(q_0 p_0) - S_t(q'_0 p'_0)]/\hbar}}{\langle q_0 p_0 |\, \hat{F}(\beta/2)\, | q'_0 p'_0 \rangle} \right\rangle_W \qquad (8)$$

where $\langle \cdots \rangle_W$ is a Monte Carlo average over the weight function W defined as

$$W(q_0 p_0; q'_0 p'_0) \sim \left| \langle q_0 p_0 |\, \hat{F}(\beta/2)\, | q'_0 p'_0 \rangle \right| \qquad (9)$$
In Eq. 8, the $\langle q_0 p_0 |\, \hat{F}(\beta/2)\, | q'_0 p'_0 \rangle$ terms are the elements of the coherent-state matrix (CSM) of the Boltzmannized (thermal) flux operator, $S_t$ is the classical action, $C_t$ is the square root of a determinant involving the various monodromy matrices, and $(q_0 p_0)$ and $(q'_0 p'_0)$ are the initial conditions of the real-time trajectories, sampled using the standard Metropolis method [24]. The CSM elements of the Boltzmannized flux operator are evaluated using a discretized path integral and expressing the Boltzmannized flux operator as follows:

$$\hat{F}(\beta) = \frac{i}{\hbar \beta/2}\left[e^{-\frac{\beta \hat{H}}{4}}\, h(\hat{s})\, e^{-\frac{3\beta \hat{H}}{4}} - e^{-\frac{3\beta \hat{H}}{4}}\, h(\hat{s})\, e^{-\frac{\beta \hat{H}}{4}}\right] \qquad (10)$$
The surface dividing reactants from products is located along the reaction coordinate s(x) defined as

$$s(x) = \max\{s_a(x), s_b(x)\} \qquad (11)$$

where $s_a(x)$ and $s_b(x)$ are the reaction coordinates describing the individual rearrangement processes A + BC → AB + C and A + CB → AC + B:

$$s_a(x) = r_{BC} - r_{CA}, \qquad s_b(x) = r_{BC} - r_{BA} \qquad (12)$$

with $r_{XY}$ being the interatomic distance between atoms X and Y. The location s(x; λ) of the dividing surface is made to depend on the parameter λ as follows:

$$s(x; \lambda) = \lambda s_1(x) + (1 - \lambda)\, s_0(x) \qquad (0 \le \lambda \le 1) \qquad (13)$$

with $s_1(x) = s(x)$, $s_0(x) = R_\infty - |R|$, and $R_\infty$ being a large constant. This means that s(x; λ) interpolates in a continuous manner between the original reaction coordinate $s_1(x)$ and the asymptotic reactant reaction coordinate $s_0(x)$.

A key feature of the IVR treatment is the fact that the nonlinear two-boundary-value problem associated with the search for the root trajectories of the traditional SC method is replaced by a Monte Carlo average over the initial coordinates and momenta. Accordingly, the time evolution operator is formulated as a phase-space average over the initial conditions of the classical trajectories as follows:

$$e^{-i\hat{H}t/\hbar} = (2\pi\hbar)^{-F} \int dp_0\, dq_0\, C_t(p_0 q_0)\, e^{iS_t(p_0 q_0)/\hbar}\, |p_t q_t\rangle \langle p_0 q_0| \qquad (14)$$

where $(p_t q_t)$ are the momenta and coordinates at time t obtained from the given initial conditions. The states $|p_t q_t\rangle$ and $\langle p_0 q_0|$ are defined as position eigenstates ($|q_0\rangle$ and $|q_t\rangle$) in the Van Vleck version of the IVR method, or as coherent states in the Herman-Kluk version [25,26,27].

The SC-IVR code [21,22] implementing the semiclassical scheme discussed above is articulated as follows:

- Step 1: calculation of the CSM elements of the Boltzmannized flux operator. The CSM elements are evaluated using the normal mode coordinates at the geometry of the saddle to reaction. The key parameters of the calculation (and the convergence-determining factors) are the number of time slices P and the number of sampled paths.
- Step 2: generation of the initial states $(q_0 p_0; q'_0 p'_0)$ of the pairs of considered trajectories, obtained by sampling the weight function. The result is written to an external file for later use in the calculation of R_f using Eq. 8.
- Step 3: evaluation of the static factor C_f(0) of Eq. 5 in terms of the partition functions calculated using the DVR method [28] at the reference temperature T_ref, on both the mobile dividing surface set at s(x; λ) and the reactant asymptotic surface at s_0(x). The ratio between the partition function computed in the interaction region at s_1(x) (corresponding to s(λ = 1; β = β_ref)) and in the asymptotic reactant region at s_0(x) (corresponding to s(λ = 0; β = β_ref)) is evaluated using the extended ensemble method [29]. Then the dependence of the partition function on the location of the dividing surface and on temperature is evaluated to calculate the static factor.
- Step 4: calculation of the normalized flux-flux correlation function R_f of Eq. 8 by integrating a suitable set of trajectories. The initial states of the Monte Carlo sampling are taken from the output generated in Step 2. In this step, the actual evaluation of the dynamical factor is obtained by sampling the path variables $(x_0, \dots, x_P)$ of Eq. 8 using a normal-mode technique [23,30].

Fig. 1. The distribution scheme of the evaluation of the initial states

By inserting the values of C_f(0) and R_f(t) in Eq. 4, the value of the rate coefficient k(T) is determined (see Fig. 2).
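Numerically, once C_f(0), Q_r, and a table of R_f(t) values are available, Eq. 4 reduces to a quadrature. The following is a minimal sketch in atomic units (where hbar = 1 and hence h = 2π), with entirely fictitious input values:

```python
import numpy as np

def rate_coefficient(cf0, qr, t_grid, rf_values, h=2.0 * np.pi):
    """k(T) from Eq. (4): Cf(0)/(h*Qr) times the integral of Rf(t).

    The time integral is approximated with the trapezoid rule; in atomic
    units hbar = 1, so Planck's constant h = 2*pi.
    """
    return cf0 / (h * qr) * np.trapz(rf_values, t_grid)

# Example with a fictitious, exponentially decaying Rf(t)
t = np.linspace(0.0, 50.0, 501)
rf = np.exp(-t / 10.0)
print(rate_coefficient(cf0=1.0e-12, qr=1.0e3, t_grid=t, rf_values=rf))
```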
3 Porting the Code onto the Grid and Results
According to the computational scheme illustrated above, the distributable steps of the SC-IVR calculation are Step 2 (the random generation of the initial states) and Step 4 (the integration of the batches of trajectories needed to calculate R_f). The schemes of the distributed flow of the program in Step 2 and Step 4 are given in Fig. 1 and Fig. 2, respectively, where the independent calculations are highlighted using the terminology of Section 2. The porting of the SC-IVR code onto the Grid environment was performed by making use of the User Interface (UI) machine available in COMPCHEM.
Fig. 2. The distribution scheme of the evaluation of R_f(t)
From the UI, the user is able to compile and test the code, submit it to the grid environment for execution, monitor the status of the submitted work and, finally, retrieve the results of the performed calculations. The porting procedure was articulated in several steps. In the first step, the code was compiled using the Intel Fortran Compiler with the support of the Message Passing Interface (MPI) libraries, in particular the shared-memory version of the libraries, in order to maximize the performance of the multiprocessor Working Nodes (WNs) available on the segment of the production EGEE grid available to the COMPCHEM VO. In order to reduce the dependence of the calculation on dynamic libraries, the code was compiled in a fully static fashion; as a result, its compiled version is strictly architecture dependent. This was made workable by the homogeneity of the software and the adoption of the gLite middleware [31] based on the Globus Toolkit [32]. It is important to emphasize that use has been made of reserve smp nodes, a set of scripts developed at Democritos by R. Di Meo [33]. These scripts enable the use of MPI shared-memory intranode devices, managing the submission of the concurrent jobs to the grid environment, the selection of the Computing Elements on which the jobs will be executed, and the choice of the number of processors engaged in the concurrent run. They also manage the resubmission of a job after a given waiting time, which is assumed to correspond to a job execution failure. In the second step, the files necessary for the execution are uploaded to the Grid environment on one of the Storage Elements supporting the VO (in our case, use was made of the se.grid.unipg.it SE) before the job is submitted. The relevant files are input.tar.gz, a compressed archive containing the input files, and mpi-exec, the MPI executable of the SC-IVR program statically compiled on the UI machine. Use is also made of relocatable mpich shmem, a script to exploit the MPI shared-memory intranode device.
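To make the submission step concrete, the sketch below writes a minimal gLite JDL description for the run and hands it to the command-line submitter available on a UI machine. The JDL attribute names are standard gLite ones, but the file names and the exact submission invocation reflect our assumptions rather than the reserve smp nodes scripts actually used by the authors.

```python
import subprocess
import textwrap

# Minimal JDL sketch for the SC-IVR run; file names are illustrative.
jdl = textwrap.dedent("""\
    Executable    = "mpi-exec";
    StdOutput     = "std.out";
    StdError      = "std.err";
    InputSandbox  = {"mpi-exec", "input.tar.gz"};
    OutputSandbox = {"std.out", "std.err"};
""")

with open("scivr.jdl", "w") as f:
    f.write(jdl)

# Submission through the gLite WMS ("-a" requests automatic proxy delegation).
subprocess.run(["glite-wms-job-submit", "-a", "scivr.jdl"], check=True)
```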
As already mentioned, the distributed grid runs of the SC-IVR code were performed for the N + N2 elementary reaction using the L3 potential energy surface of ref. [34]. Results of preliminary calculations are shown in Fig. 3, where for comparison the experimental data and related error bars are also given.

As to the computational implications of performing the calculations on the Grid, one has to bear in mind that trajectory calculations and the related generation of initial states are traditionally considered intrinsically suited to an efficient distributed execution. This proved true also in our case. The elapsed time of a run performed to obtain the SC-IVR rate coefficient values at T = 1000 K on a Compaq AlphaServer ES40 cluster of the Barcelona CESCA supercomputing facility is 70 hours, while that measured on the Intel Server Platform SR1435 VP2 of cex.grid.unipg.it (belonging to the already mentioned segment of the EGEE grid available to COMPCHEM) is slightly larger than 72 hours. These times were obtained by summing the fraction of the elapsed time associated with the generation of the initial states (Step 2 of Section 3) and the fraction associated with the trajectory integration (Step 4 of the same section) for one CPU, both in the case of the AlphaServer and in the case of the cex.grid.unipg.it platform.

A first important indication provided by these results is that the calculation of the initial states carried out in Step 2 is definitely more time consuming (by about one order of magnitude) than the integration of the set of trajectories performed in Step 4 (the truly dynamical one) when running the code on a single processor. The elapsed times measured on the two processors turn out to be basically equivalent, despite their different hardware characteristics (EV68, 0.8 GHz, 4 GB RAM/node, cache L1 64 KB / L2 8 MB for the former; Pentium Xeon 3.06 GHz, 2 GB RAM/node, cache L1 8 KB / L2 512 KB for the latter). The equivalence, however, is more apparent than real, since no queueing or brokerage times are included in the comparison. For the sake of completeness, the queueing time on the AlphaServer is in general of the order of one or two days, while the brokerage time on the Grid is usually only of the order of one or two hours. This is due to the different weights given to the parameters adopted for resource allocation by the two platforms (especially the estimated length of the run, the number of nodes engaged, and the size of memory occupied).

Even more instructive is the fact that on the Grid one can fully exploit the large variety of CPUs offered by the extremely heterogeneous environment. In our tests, the grid node was made of two single-core nodes and led to a measured elapsed time about 20% smaller per processor than that of the single processor. This evidences a case of superscalarity due to the fact that the CPUs belong to the same Working Node and share the same memory. In order to check this, other runs in which the CPUs belong to different Working Nodes were performed, and no superscalar effect was detected. These two pieces of evidence tell us clearly that the rough "first come, first served" policy of the grid frees the users from the constraints that several computer centers set for their own purposes when "optimizing" the management of their resources. On the contrary, no artificial obstacle race is imposed on the Grid to get a program to run.
Fig. 3. SC-IVR estimates of log k(T ) plotted as a function of 1/T . For comparison experimental data and related error bars are also shown.
The only price to pay for getting a program to run whenever and wherever possible is a severe loss of hardware homogeneity as the number of available computing elements increases. However, this is largely compensated by the homogenization at the software level provided by the well-known gLite middleware [31] based on the Globus Toolkit [32]. Moreover, the fact that the Grid is inhomogeneous does not play entirely on the negative side: the heterogeneity of the grid platform may offer the opportunity of using machines which (sometimes surprisingly) perform much better than expected, even if this occasionally requires a little extra work. Grid users, moreover, do not run the risk of getting a "your allocated time has expired" message after having spent the whole grant (often earned with a lot of bureaucratic work) on adapting the code to a specific machine or, in any case, before the planned investigation has been completed.
4 Conclusions
The paper reports the work performed to implement on the grid a code based on the semiclassical SC-IVR approach. The computational scheme of the SC-IVR code has been found to be suited for distribution in two important parts of the calculation, namely the generation of the initial conditions and the integration of the trajectories. These two parts of the code are naturally concurrent and were found to be efficiently distributable. Particular value is added to this result by the possibility of exploiting superscalar effects on multiprocessor architectures.
The results obtained for step 2 confirm, indeed, that the more matrix-operation based the calculation is, the higher is the chance of efficiently exploiting the shared memory features of the multiprocessor WNs available on the Grid platform. This also indicates that Grid platforms offer advantages (sometimes even larger than expected) when performing heavy computational campaigns, and that a proper use of the Grid could be beneficial for quantum calculations which, as is well known, are heavily matrix-operation based. The other important conclusion of our study is that the impact of the innovative features of the presently available production computing grids prompts the search for new theoretical and computational methodologies (or the revitalization of existing ones). In the specific case of the work reported here, the implementation of the semiclassical direct calculation of thermal rate coefficients on the Grid opens the perspective of producing on the fly the kinetics information needed in complex realistic simulations.
Acknowledgements. The authors acknowledge financial support from the European project EGEE III, the Spanish MEC (Project CTQ2005-03721) and DURSI (Project 2005 PEIR 0051/69). CESGA and the COMPCHEM VO are thanked for the allocation of computing time. This work has been carried out as part of the activities of the working group QDYN of the COST CMST European Cooperative Project CHEMGRID (Action D37). F.H-L. thanks the Spanish Ministerio de Educación y Ciencia for a "Ramón y Cajal" fellowship. Thanks are also due to the European Space Agency project "Fundamental Issues on the Aerothermodynamics of Planetary Atmosphere Re-entry" (AO/1-5593/08/NL/HE).
References
1. Armenise, I., Capitelli, M., Celiberto, R., Colonna, G., Gorse, C., Laganà, A.: The effect of N+N2 Collisions on the Non-Equilibrium Vibrational Distributions of Nitrogen under Reentry Conditions. Chem. Phys. Letters 227, 157–163 (1994)
2. Angelucci, M., Costantini, A., Crocchianti, S., Laganà, A., Vecchiocattivi, M.: Uno studio sull'Ozono. Micron, rivista di informazione ARPA Umbria 9, 34–39 (2008)
3. Costantini, A.: Grid Enabled Distributed Computing: from Molecular Dynamics to Multiscale Simulations. PhD Thesis, University of Perugia, Perugia (I) (2009)
4. Carvalho, M.: Clean Combustion Technologies. CRC Press, Boca Raton (1999)
5. Porrini, M.: A Molecular Dynamics Study of Lamellar Membranes Microsolvated Benzene for a Grid Approach. PhD Thesis, University of Perugia, Perugia (I) (2006)
6. Arteconi, L.: Molecular Dynamics Modeling of Micropores of cellular membranes. PhD Thesis, University of Perugia, Perugia (I) (2008)
7. Bruno, D., Capitelli, M., Longo, S., Minelli, P.: Direct Simulation Monte Carlo Modeling of Non Equilibrium Reacting Flows. Issues for the Inclusion into a ab initio Molecular Processes Simulator. In: Laganà, A., Gavrilova, M.L., Kumar, V., Mun, Y., Tan, C.J.K., Gervasi, O. (eds.) ICCSA 2004. LNCS, vol. 3044, pp. 383–391. Springer, Heidelberg (2004)
8. Bowman, J.M.: Approximate Time Independent Methods for Polyatomic Reactions. Lecture Notes in Chemistry 75, 101–114 (2000)
9. Skouteris, D., Pacifici, L., Laganà, A.: A Time Dependent Study of the Nitrogen Atom Nitrogen Molecule Reaction. Mol. Phys. 102, 2237–2248 (2004)
10. Skouteris, D., Castillo, J.F., Manolopoulos, D.E.: ABC: a quantum reactive scattering program. Comp. Phys. Comm. 133, 128–135 (2000)
11. Johnston, H.S.: Gas phase reaction rate theory. The Ronald Press Company, New York (1966)
12. Manthe, U., Seideman, T., Miller, W.H.: Full Dimensional Quantum Mechanical Calculation of the Rate Constant for the H2 + OH → H2O + H Reaction. J. Chem. Phys. 99, 10078–10081 (1993)
13. Wang, H., Thompson, W.T., Miller, W.H.: "Direct" Calculation of Thermal Rate Constants for the F + H2 → HF + H Reaction. J. Phys. Chem. A 102, 9372–9379 (1998)
14. Viel, A., Leforestier, C., Miller, W.H.: Quantum Mechanical Calculation of the Rate Constant for the Reaction H + O2 → OH + O. J. Chem. Phys. 108, 3489–3497 (1998)
15. Miller, W.H.: "Direct" and "Correct" Calculation of Microcanonical and Canonical Rate Constants for Chemical Reactions. J. Phys. Chem. A 102, 793–806 (1998)
16. Yamamoto, T., Miller, W.H.: Semiclassical Calculation of Thermal Rate Constants in Full Cartesian Space: The Benchmark Reaction D + H2 → DH + H. J. Chem. Phys. 118, 2135–2152 (2003)
17. Enabling Grids for E-Science in Europe (EGEE), project funded by the European Union, http://compchem.unipg.it/
18. Laganà, A., Riganelli, A., Gervasi, O.: On the Structuring of the Computational Chemistry Virtual Organization COMPCHEM. In: Gavrilova, M.L., Gervasi, O., Kumar, V., Tan, C.J.K., Taniar, D., Laganà, A., Mun, Y., Choo, H. (eds.) ICCSA 2006. LNCS, vol. 3980, pp. 665–674. Springer, Heidelberg (2006), http://www.eu-egee.org/compchem
19. Miller, W.H.: Quantum Mechanical Transition State Theory and a New Semiclassical Model for Reaction Rate Constants. J. Chem. Phys. 61, 1823–1834 (1974)
20. Yamamoto, T.: Quantum statistical mechanical theory of the rate of exchange chemical reactions in the gas phase. J. Chem. Phys. 33, 281–289 (1960)
21. Miller, W.H.: The Classical S-Matrix: Numerical Application to Inelastic Collisions. J. Chem. Phys. 53, 3578–3587 (1970)
22. Heller, E.: J. Chem. Phys. 94, 2723–2729 (1991)
23. Ceperley, D.M.: Path integrals in the theory of condensed helium. Rev. Mod. Phys. 67, 279–355 (1995)
24. Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E.: Equations of State calculations by fast computing machines. J. Chem. Phys. 21, 1087–1092 (1953)
25. Herman, M.F., Kluk, E.: A semiclassical justification for the use of non-spreading wavepackets in dynamic calculations. Chem. Phys. 91, 27–34 (1984)
26. Kay, K.G.: Integral Expressions for the Semiclassical Time-Dependent Propagator. J. Chem. Phys. 100, 4377–4392 (1994)
27. Kay, K.G.: Numerical Study of Semiclassical Initial-Value Methods for Dynamics. J. Chem. Phys. 100, 4432–4445 (1994)
28. Harris, D., Engerholm, G., Gwinn, W.: J. Chem. Phys. 43, 1515–1517 (1965)
29. Lyubartsev, A.P., Martsinovski, A.A., Shevkunov, S.V., Vorontsov-Velyaminov, P.N.: Method of Expanded Ensembles. J. Chem. Phys. 96, 1776–1783 (1992)
30. Yamamoto, T., Wang, H., Miller, W.H.: Combining Semiclassical Time Evolution and Quantum Boltzmann Operator to Evaluate Reactive Flux Correlation Function for Thermal Rate Constants of Complex Systems. J. Chem. Phys. 116, 7335–7349 (2002)
31. gLite website, http://glite.web.cern.ch/glite
32. The Globus Project, http://www.globus.org
33. DEMOCRITOS is the National Simulation Center of the Italian Istituto Nazionale per la Fisica della Materia (INFM), hosted by the Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, http://www.democritos.it/
34. Garcia, E., Laganà, A.: Effect of Varying the Transition State Geometry on N + N2 Vibrational Deexcitation Rate Coefficients. J. Phys. Chem. A 101, 4734–4740 (1997)
A Grid Implementation of Direct Quantum Calculations of Rate Coefficients

Alessandro Costantini¹, Noelia Faginas Lago¹, Antonio Laganà¹, and Fermín Huarte-Larrañaga²

¹ Department of Chemistry, University of Perugia, Perugia, Italy
² Computer Simulation and Modeling (CoSMo) Lab, Parc Científic de Barcelona and Institut de Química Teórica de la Universidad de Barcelona (IQTCUB), Barcelona, Spain
{alex,noelia}@impact.dyn.unipg.it, [email protected], [email protected]
Abstract. A detailed description is given of the implementation on the EGEE production computing grid of the FLUSS and MCTDH quantum codes performing the calculation of atom-diatom reaction rate coefficients. An application to the N + N2 reaction, for which a massive computational campaign has been performed, is reported.
1 Introduction
Quantum calculations of nuclear dynamics are difficult to implement on distributed computer platforms even for simple systems. This is due to the fact that the related computational procedures usually involve tightly coupled operations on large matrices, because the traditional way of tackling the problem consists in integrating the stationary (time independent) Schrödinger equation for the nuclei at a fixed value of the total energy. This implies the need for integrating the equation

\hat{H}\,\Phi(\{X\}) = E\,\Phi(\{X\})     (1)

where Φ is the time-independent three-atom wavefunction (depending on the nuclear coordinates {X} only) and \hat{H} is the related (usually electronically adiabatic) Hamiltonian of the nuclei [1]. The most popular way of performing the integration of the stationary Schrödinger equation is to define a proper coordinate smoothly connecting reactants to products (called the reaction coordinate), calculate eigenvalues and eigenfunctions of the Hamiltonian of the bound state problem in the remaining coordinates on a fairly dense grid of points along the reaction coordinate, expand Φ at each point of the grid on the related fixed reaction coordinate eigenfunctions, average over the related coordinates and propagate the solution from reactants to products by integrating the resulting coupled differential scattering equations in the reaction coordinate. From the value of Φ at the asymptotes the detailed information on the efficiency of the reactive process is recovered in the form of the scattering S matrix. A first step towards a
more distributable computational organization has been the formulation of the problem using time dependent approaches. These approaches integrate in time the time dependent Schrödinger equation

i\,\frac{\partial}{\partial t}\,\Phi(\{X\},t) = \hat{H}\,\Phi(\{X\},t)     (2)
A popular way of integrating Eq. 2 is to build a wavepacket configured in a given initial state of the reactants and then propagate it in time until its shape is no longer perturbed by continuing the time propagation [2]. Then, by analyzing the wavepacket in the product region, detailed information on the efficiency of the reactive process can again be recovered in the form of the S matrix. A further step towards more efficient and distributable computational machinery can be made when the quantities to be calculated are not as detailed as the S matrix and some kind of averaging can be built into the treatment from the very beginning. This is the case for the evaluation of the thermal rate coefficients k(T) needed when realistically modeling complex applications like the atmospheric reentry of a spacecraft [3], the production of secondary pollutants in the atmosphere [4,5], the combustion of fuels and biofuels [6], the permeability of membranes to ions in fuel cells [7] and ion permeability through micropores [8]. By definition [9], the thermal rate coefficient k(T) can be formulated as the thermal average of the cumulative reaction probability N(E):

k(T) = \frac{1}{h\,Q_r(T)} \int_{-\infty}^{\infty} dE\; e^{-E/k_B T}\, N(E)     (3)
where h is the Planck constant, k_B the Boltzmann constant, E the total energy, T the temperature and Q_r(T) the reactant partition function per unit volume. The cumulative reaction probability N(E) of Eq. 3, sometimes also indicated as CRP, is linked to the scattering matrix elements S_{i\to i'}(E,J) by the relationship

N(E) = \sum_{J} (2J+1) \sum_{i,i'} P_{i\to i'}(E,J)     (4)
in which the individual reaction probabilities P_{i\to i'}(E,J) are defined as P_{i\to i'}(E,J) = |S_{i\to i'}(E,J)|^2, with i and i' labeling reactant and product states respectively and J being the total angular momentum quantum number. However, when the reaction occurs by overtaking a barrier and no long-lived complex is formed, it is more convenient to calculate k(T) in a direct way. In this paper, in Section 2, we illustrate the formalism of the direct quantum calculation of k(T) using the Multi Configurational Time Dependent Hartree (MCTDH) approach [10,11,12] and the related computational procedures. In Section 3 we discuss the steps undertaken to implement the code in a distributed way on the segment of the production EGEE grid [13] available to the COMPCHEM Virtual Organization (VO) [14]. Results and some conclusions are drawn in Section 4.

2 The Direct Quantum Calculation of k(T)

The direct evaluation of k(T) is based on the following formulation of the thermal rate coefficient:

k(T) = \frac{1}{h\,Q_r(T)} \int_{0}^{\infty} dt\; C_f(t).     (5)

In Eq. 5, C_f(t) is the flux correlation function, which can be calculated in terms of the flux operator \hat{F} and of the Hamiltonian operator \hat{H} as follows:

C_f(t) = \mathrm{tr}\left[\hat{F}\, e^{i(\hat{H}t + i\beta\hat{H}/2)}\, \hat{F}\, e^{-i(\hat{H}t + i\beta\hat{H}/2)}\right]     (6)

where β = 1/(k_B T) (in atomic units), thereby avoiding the evaluation of the detailed reactive properties of the asymptotic fragments [15,16]. Similarly, the cumulative reaction probability N(E) can be formulated as:

N(E) = 2\pi^2\, \mathrm{tr}\left(\hat{F}\,\delta(\hat{H}-E)\,\hat{F}\,\delta(\hat{H}-E)\right).     (7)

The exact quantum calculation of C_f(t) relies on the generalized formulation of the flux autocorrelation function [17,18] that reads, in atomic units:

C_f(t,t';T_0) = \mathrm{tr}\left(\hat{F}_{T_0}\, e^{i\hat{H}t'}\, \hat{F}\, e^{-i\hat{H}t}\right)     (8)

with \hat{F}_{T_0} = e^{-\hat{H}\beta_0/2}\,\hat{F}\,e^{-\hat{H}\beta_0/2} being the thermal flux operator [19] at the reference temperature T_0 and β_0 = 1/(k_B T_0). The calculation should include the terms relative to all the contributing J values. To this end the thermal flux operator is projected on a random combination of Wigner functions [20]:

\hat{F}_{T_0}^{(j)} = e^{-\hat{H}\beta_0/2}\,\hat{F}\,|\psi_{\mathrm{rot}}^{(j)}\rangle\langle\psi_{\mathrm{rot}}^{(j)}|\,e^{-\hat{H}\beta_0/2}.     (9)

A random sample of the generalized flux correlation function at the reference temperature T_0 is generated using the expression:

C_f^{(j)}(t,t';T_0) = \mathrm{tr}\left(\hat{F}_{T_0}^{(j)}\, e^{i\hat{H}t'}\, \hat{F}\, e^{-i\hat{H}t}\right)     (10)

and performing a spectral decomposition of \hat{F}_{T_0}^{(j)} to obtain Eq. 11:

C_f^{(j)}(t,t';T_0) = \sum_{k} f_{k,T_0}^{(j)}\, \langle f_{k,T_0}^{(j)} |\, e^{i\hat{H}t'}\, \hat{F}\, e^{-i\hat{H}t}\, | f_{k,T_0}^{(j)} \rangle     (11)

where f_{k,T_0}^{(j)} and |f_{k,T_0}^{(j)}\rangle are, respectively, the eigenvalues and the eigenvectors of \hat{F}_{T_0}^{(j)} and can be viewed as eigenstates of the reaction intermediate. Accordingly, the CRP corresponding to a given j rotational random sample, N^{(j)}(E), can be formulated as:

N^{(j)}(E) = \frac{e^{E\beta_0}}{2} \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dt'\; e^{-iE(t-t')}\, C_f^{(j)}(t,t';T_0)     (12)

since the set of rotational functions |\psi_{\mathrm{rot}}^{(j)}\rangle fulfills a completeness relationship [20]. The exact cumulative reaction probability can be recovered from a statistical average of the N^{(j)}(E) values:

N(E) = \lim_{M\to\infty} \frac{1}{M} \sum_{j=1}^{M} N^{(j)}(E).     (13)

By combining the iterative diagonalization of ref. [21] and the statistical sampling scheme of ref. [20], efficient, accurate all-J thermal rate coefficient calculations [22,23,24], even for polyatomic reactions with floppy transition states [25], can be performed. A key element of the successful application of this methodology is the fact that, when a Multi-Configurational Time-Dependent Hartree (MCTDH) scheme is adopted, the wavefunctions are expanded into a direct product of time-dependent single particle functions (SPFs) \phi_{l_i}^{(i)}(x_i,t) as follows:

\psi(x_1,\ldots,x_f,t) = \sum_{l_1=1}^{n_1} \cdots \sum_{l_f=1}^{n_f} A_{l_1\ldots l_f}(t)\, \phi_{l_1}^{(1)}(x_1,t) \ldots \phi_{l_f}^{(f)}(x_f,t)     (14)

In Eq. 14 the summations \sum_{l_i=1}^{n_i} run over all single particle functions of the i-th degree of freedom. The quantum mechanical code is articulated in a sequence of consecutive steps as follows:

- Step 1: The thermal flux operator is diagonalized at the reference temperature T_0 to generate a Krylov basis for which flux eigenstates and the corresponding eigenvalues are calculated. In this step the code, called Grid-FLUSS and consisting of a single sequential task, performs a modified Lanczos iterative diagonalization starting from a trial initial wavefunction. As detailed before, the calculation of \hat{F}_T = e^{-\beta\hat{H}/2}\,\hat{F}\,e^{-\beta\hat{H}/2} and two imaginary time propagations (since -\beta\hat{H} = i(i\beta)\hat{H}) are performed at each iteration step. Moreover, the wavefunction is expressed as an SPF product following the MCTDH approach. This method is particularly convenient for approaches applying exponential operators in multidimensional systems.
- Step 2: The Krylov subspace basis functions generated in Grid-FLUSS are propagated in time. This is an important feature for grid distribution, since the propagation of each basis function is an independent event and the related results can be unambiguously labelled after their index and the seed of the random number generator.
- Step 3: Once the time propagation of all Krylov basis functions has been completed successfully, a unitary transformation is performed to diagonalize the thermal flux matrix representation and calculate the cumulative reaction probability.

In the case of a single J (usually J = 0) run, this scheme is executed only once. However, since the calculation of the rate coefficient requires the inclusion of the fixed-J contributions for a large number of total angular momentum values, J is statistically sampled and the calculations are repeated for randomly generated J values until convergence is reached.
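The random sampling of J described above maps naturally onto independent grid jobs. The following sketch illustrates the idea under the assumption of a hypothetical per-J submission wrapper (submit_j.sh is not part of the actual package, and the bounds are arbitrary):

#!/bin/bash
# Sketch: repeat fixed-J runs for randomly drawn J values (illustrative only).
JMAX=60        # hypothetical upper bound for the sampled angular momentum
NSAMPLES=30    # hypothetical number of random samples

for n in $(seq 1 $NSAMPLES); do
  J=$(( RANDOM % (JMAX + 1) ))   # draw a random J in [0, JMAX]
  SEED=$RANDOM                   # seed for the random number generator
  # each (J, seed) pair becomes an independent grid job
  ./submit_j.sh "$J" "$SEED"     # hypothetical wrapper around glite-wms-job-submit
done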
3 Porting the Quantum Codes onto the Grid
The porting of the quantum codes onto the Grid environment was performed by making use of the User Interface (UI) machine available in COMPCHEM. From the UI the user is able to compile and test the code, submit it to the grid for execution, monitor the status of the submitted work and, finally, retrieve the results of the performed calculations. To this end a standalone, library-independent F77 version of the MCTDH code (called Grid-MCTDH) for multi-dimensional quantum dynamics simulations was deployed and implemented for sequential validation on a CPU of the Grid. After testing the reliability of the code, two possible strategies were considered and tested in order to implement the MCTDH code onto the Grid environment:

1. Submit an executable MCTDH file to the Computing Elements (CEs) of the grid, along with the input files and a script file imparting the necessary instructions. This option has the advantage of requiring only simple scripts for the code execution. On the other hand, the executable which needs to be sent is a binary file, and this has two main disadvantages: it is larger in size than a plain ASCII file and it is also architecture dependent.
2. Submit the source code to the CEs. This implies that the files transferred during job submission are much smaller and architecture independent. However, the success and the outcome of the job strictly depend on the machine environment (e.g., a proper FORTRAN compiler needs to be available on the CE).

An illustration of the general scheduling strategy is sketched in Figure 1.
Fig. 1. Illustrative scheme of a script for the compilation of the Grid-MCTDH source code on a CE
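A wrapper of the kind evoked by Figure 1 might look as follows. This is a minimal sketch only: the source archive name, the compiler probing order and the input file name are illustrative assumptions, not the actual Grid-MCTDH choices.

#!/bin/bash
# Sketch: compile the shipped source on the CE, then run it (strategy 2).
tar -xzf grid-mctdh-src.tar.gz              # hypothetical source archive name

# probe for an available Fortran compiler on the worker node
for FC in ifort g77 gfortran; do
  command -v "$FC" > /dev/null && break
done

"$FC" -O2 -o grid-mctdh *.f                 # build the sequential executable
./grid-mctdh < input.dat > mctdh.log 2>&1   # hypothetical input file name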
Fig. 2. Typical distributed execution flow of the independent propagation of the eigenstates (fi → fi′ boxes) after the Grid-FLUSS flux diagonalization (fi boxes)
As already mentioned, both directions were followed and the work progressed by developing the necessary scripts and JDL files for job submission and execution. Both strategies were tested satisfactorily on the segment of the EGEE production grid available to COMPCHEM. The tests showed that the second strategy is more flexible and performs better in task submission. In step 1 Grid-FLUSS, the code carrying out the diagonalization of the thermal flux operator, has been implemented in a sequential way. A problem which had to be solved for the distributed implementation of the FLUSS program was the management of the file containing the thermal flux eigenfunctions. This file can be large when dealing with polyatomic systems and, due to its binary format, it cannot be compressed. To bypass the present 80 MB size limit for the output retrieval, we have devised a strategy in which, once FLUSS has finished successfully, the output ASCII files are compressed and retrieved by the user to the UI, while the large binary file is transferred to a Storage Element (SE) of the Grid. This transfer is presently performed by means of an lcg-cp command, which does not impose limitations on the size of the file. In step 2 the distributed time propagation (see the parallel fi → fi′ components of Fig. 2) of the eigenstates (eigenfunctions) resulting from the flux diagonalization of Grid-FLUSS (see the fi boxes of Fig. 2) is performed by the Grid-MCTDH code. For this purpose a specific code, called SPLIT, has been developed in order to generate a separate file for each flux eigenstate out
Fig. 3. The distribution scheme of multiple (M ) J runs of the code
of the common one. These files are stored in a SE and the eigenfunctions are efficiently propagated in independent calculations exploiting different CEs of the Grid. This scheme, associated with a single value of the total angular momentum quantum number J (usually J = 0), can be replicated (see Fig. 3) when calculations are performed for other J values in order to converge with respect to the total angular momentum. In step 3 the gathering of the outcomes of step 2 is performed after monitoring the successful execution of all the Krylov basis function propagations and collecting the related results. A key element of the distributed procedure, as already pointed out, are the scripts written to manage the database files of the Grid-FLUSS and Grid-MCTDH programs, which contain the attributes for each of the following entries:

JOBNAME: name of the directory containing the input files.
JOBTYPE: "1" for Grid-FLUSS and "2" for Grid-MCTDH.
LAN: eigenfunction being propagated (only applies if JOBTYPE=2).
ISTAT: seed of the random number generator.
JOBID: JobID provided by the Resource Broker (RB).
JOBSTATUS: integer number defining the job status.
CE: Computing Element on which the job is executed.

The PHP scripts developed for the various goals of the procedure are:

create append.php: This PHP script creates a job database table or, if the table is already present, appends new jobs to it. The JOBNAME and
JOBTYPE are read from the command line, while the values of ISTAT and LAN are generated automatically. JOBID, JOBSTATUS and CE are initially set to 0. The script is also designed for the automatic generation of the set of Grid-MCTDH jobs once Grid-FLUSS has finished and the N eigenstates are ready to be propagated (see Figure 2).

send.php: This PHP script reads all the entries in the job database table and submits to the WMS all jobs in the database for which JOBSTATUS=0 (pending to be sent). According to JOBTYPE (1 or 2), a Grid-FLUSS or Grid-MCTDH job with the correct parameters (LAN and ISTAT) is launched.

status.php: This PHP script reads the job database table and queries the WMS about the status of all jobs in the database. The JOBSTATUS, JOBID and CE tags of the database are updated according to the answer of the Workload Management System (WMS). If a job has finished successfully, the script retrieves its output to the JOBNAME directory using the glite-wms middleware commands.
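The overall management cycle can then be driven from the UI with a few shell commands. The sketch below strings the three scripts together; the underscored script name, its command-line arguments, the status keyword grepped for and the one-hour polling interval are all illustrative assumptions rather than documented options:

#!/bin/bash
# Sketch: drive the job database cycle from the UI (names and args hypothetical).
php create_append.php myrun 1      # register a Grid-FLUSS job (JOBTYPE=1)
php send.php                       # submit every job with JOBSTATUS=0 to the WMS

# poll until no job is still running; outputs then sit in the JOBNAME directories
while true; do
  php status.php | tee status.log
  grep -q RUNNING status.log || break
  sleep 3600                       # check once per hour (arbitrary choice)
done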
4 Results and Conclusions
As a case study for testing concurrent calculations, the N + N2 elementary reaction was considered. The theoretical study of this reaction is of extreme interest for plasma technologies and related aerothermodynamics applications [3]. The potential energy surface employed was the L3 PES [26]. The Arrhenius plot corresponding to the present calculations was obtained using 30 flux eigenstates computed at 2000 K. An important indication obtained from our work is, indeed, that when performed on the grid the flux correlation method can play, in the quantum realm, a role equivalent to that of trajectory calculations. In fact, it is made essentially of two blocks similar to those of the trajectory codes: the generation of the initial conditions and the propagation in time of independent events. However, as indicated by the preliminary calculations performed for the N + N2 reaction, while the ratio of the times employed by the two blocks on a typical rackable cluster CPU available in a computational chemistry laboratory (Pentium Xeon 3.06 GHz, 2 GB RAM/node, L1/L2 cache 8 KB/512 KB), though not negligible (as it is in trajectory calculations), is still as small as 1%, the memory requests are completely different. This may become a problem as the system considered gets large. In that case, however, the adoption of the iterative Lanczos method might offer further options for distributed calculations of the flux eigenstates. Moreover, though for the case considered here the memory size was still within the limits of the most popular clusters, moving to larger systems it may become a problem. In that case, in fact, the large size of the matrices may require their distribution over more CPUs, which is not adequately supported on Grid platforms. Better support is obtained for intracluster distribution, with the optimum solution being the use of multicore processors. Indeed, we did exploit this in our calculations and we found that superscalar effects in a bi-processor node lead to a speedup of 2.4.
Fig. 4. Arrhenius plot for the N + N2 → N2 + N reaction rate coefficient calculated using 30 states
This means that quantum studies of chemical reactivity, and even the calculation of rate coefficients, can become part of routine calculations rather than being considered a one-off effort. This, however, is largely due to the exploitation of the innovative features of the presently available production computing grids, which make extended computational campaigns feasible provided that, as in our case, a well suited theoretical approach is chosen, including in this respect also direct quantum calculations of thermal rate coefficients.
Acknowledgements. The authors acknowledge financial support from the European project EGEE III, the Spanish MEC (Project CTQ2005-03721) and DURSI (Project 2005 PEIR 0051/69). The VO is thanked for the allocation of computing time. This work has been carried out as part of the activities of the working group QDYN of the COST CMST European Cooperative Project CHEMGRID (Action D37). F.H-L. thanks the Spanish Ministerio de Educación y Ciencia for a "Ramón y Cajal" fellowship. Thanks are also due to the European Space Agency EMITS project "Fundamental Issues on the Aerothermodynamics of Planetary Atmosphere Re-entry" (AO/1-5593/08/NL/HE).
References
1. Skouteris, D., Pacifici, L., Laganà, A.: A Time Dependent Study of the Nitrogen Atom Nitrogen Molecule Reaction. Mol. Phys. 102, 2237–2248 (2004)
2. Skouteris, D., Castillo, J.F., Manolopoulos, D.E.: ABC: a quantum reactive scattering program. Comp. Phys. Comm. 133, 128–135 (2000)
3. Armenise, I., Capitelli, M., Celiberto, R., Colonna, G., Gorse, C., Laganà, A.: The effect of N+N2 Collisions on the Non-Equilibrium Vibrational Distributions of Nitrogen under Reentry Conditions. Chem. Phys. Letters 227, 157–163 (1994)
4. Angelucci, M., Costantini, A., Crocchianti, S., Laganà, A., Vecchiocattivi, M.: Uno studio sull'Ozono. Micron, rivista di informazione ARPA Umbria 9, 34–39 (2008)
5. Costantini, A.: Grid Enabled Distributed Computing: from Molecular Dynamics to Multiscale Simulations. PhD Thesis, University of Perugia, Perugia (I) (2009)
6. Carvalho, M.: Clean Combustion Technologies. CRC Press, Boca Raton (1999)
7. Porrini, M.: A Molecular Dynamics Study of Lamellar Membranes Microsolvated Benzene for a Grid Approach. PhD Thesis, University of Perugia, Perugia (I) (2006)
8. Arteconi, L.: Molecular Dynamics Modeling of Micropores of cellular membranes. PhD Thesis, University of Perugia, Perugia (I) (2008)
9. Bowman, J.M.: Lecture Notes in Chemistry 75, 101–114 (2000)
10. Beck, M., Jäkle, A., Worth, G., Meyer, H.D.: The multiconfiguration time-dependent Hartree (MCTDH) method: a highly efficient algorithm for propagating wavepackets. Phys. Rep. 324, 1–105 (2000)
11. Meyer, H.D., Manthe, U., Cederbaum, L.S.: The multi-configurational time-dependent Hartree approach. Chem. Phys. Lett. 165, 73–78 (1990)
12. Manthe, U., Meyer, H.D., Cederbaum, L.S.: Wave-packet dynamics within the multiconfiguration Hartree framework: General aspects and application to NOCl. J. Chem. Phys. 97, 3199–3213 (1992)
13. Enabling Grids for E-Science in Europe (EGEE), project funded by the European Union, http://compchem.unipg.it/
14. Laganà, A., Riganelli, A., Gervasi, O.: On the Structuring of the Computational Chemistry Virtual Organization COMPCHEM. In: Gavrilova, M.L., Gervasi, O., Kumar, V., Tan, C.J.K., Taniar, D., Laganà, A., Mun, Y., Choo, H. (eds.) ICCSA 2006. LNCS, vol. 3980, pp. 665–674. Springer, Heidelberg (2006)
15. Miller, W.H.: Quantum Mechanical Transition State Theory and a New Semiclassical Model for Reaction Rate Constants. J. Chem. Phys. 61, 1823–1834 (1974)
16. Yamamoto, T.: Quantum statistical mechanical theory of the rate of exchange chemical reactions in the gas phase. J. Chem. Phys. 33, 281–289 (1960)
17. Miller, W.H.: "Direct" and "Correct" Calculation of Microcanonical and Canonical Rate Constants for Chemical Reactions. J. Phys. Chem. A 102, 793–806 (1998)
18. Huarte-Larrañaga, F., Manthe, U.: Thermal Rate Constants for Polyatomic Reactions: First Principles Quantum Theory. Z. Phys. Chem. 221, 171–213 (2007)
19. Park, T.J., Light, J.C.: Quantum flux operators and thermal rate constant: Collinear H+H2. J. Chem. Phys. 88, 4897–4912 (1988)
20. Matzkies, F., Manthe, U.: Accurate reaction rate calculations including internal and rotational motion: A statistical multi-configurational time-dependent Hartree approach. J. Chem. Phys. 110, 88–96 (1999)
21. Manthe, U., Matzkies, F.: Iterative diagonalization within the multi-configurational time-dependent Hartree approach: calculation of vibrationally excited states and reaction rates. Chem. Phys. Letters 252, 71–76 (1996)
22. Matzkies, F., Manthe, U.: Combined iterative diagonalization and statistical sampling in accurate reaction rate calculations: Rotational effects in O + HCl → OH + Cl. J. Chem. Phys. 112, 130–136 (2000)
23. Manthe, U., Matzkies, F.: Rotational effects in the H2 + OH → H + H2O reaction rate: Full-dimensional close-coupling results. J. Chem. Phys. 113, 5725–5731 (2000)
24. Huarte-Larrañaga, F., Manthe, U.: Quantum mechanical calculation of the OH + HCl → H2O + Cl reaction rate: Full-dimensional accurate, centrifugal sudden, and J-shifting results. J. Chem. Phys. 118, 8261–8267 (2003)
25. Huarte-Larrañaga, F., Manthe, U.: Vibrational excitation in the transition state: The CH4 + H → CH3 + H2 reaction rate constant in an extended temperature interval. J. Chem. Phys. 116, 2863–2869 (2002)
26. Garcia, E., Laganà, A.: Effect of Varying the Transition State Geometry on N + N2 Vibrational Deexcitation Rate Coefficients. J. Phys. Chem. A 101, 4734–4740 (1997)
A Grid Implementation of Chimere: Ozone Production in Central Italy

Antonio Laganà¹, Stefano Crocchianti¹, Alessandro Costantini¹, Monica Angelucci², and Marco Vecchiocattivi²

¹ Dipartimento di Chimica, Università di Perugia, 06123 Perugia, Italy
² ARPA Umbria, Via Pievaiola - San Sisto - 06132 Perugia, Italy
Abstract. A multiscale three-dimensional Chemistry and Transport Model (Chimere) has been implemented on two very different scalable clusters. Its input was generated by specific interfaces built by the authors, adapting the meteorological, emission and boundary condition data provided by different agencies to the needs of the model. A prototype grid implementation on a segment of the EGEE Grid has been performed. Preliminary results of an application to ozone production in the Umbria Region are also presented. Keywords: Umbria, ozone, Chimere, Chemistry and Transport Model, EGEE.
1 Introduction
The Department of Chemistry of the University of Perugia, in collaboration with the Department of Mathematics and Informatics, has devoted significant effort to building a virtual organization of EGEE[1] called COMPCHEM[2]. The main aim of COMPCHEM is the assemblage of a computing grid[3] community working on molecular and material science and technology. As a result of these efforts several quantum and classical molecular dynamics programs have been developed and/or ported onto the production computing Grid segment of EGEE available to COMPCHEM. On top of this effort a common workflow has been designed in order to assemble a Grid empowered molecular simulator (GEMS)[4]. GEMS is designed to act as the computational molecular engine of complex simulations like those related to spacecraft reentry[5] and clean combustion design[6]. Another important application is the one considered in this paper for the production of secondary pollutants in the atmosphere[7]. This application deals with the modelling of atmospheric processes affecting the quality of air and is being carried out in collaboration with ARPA Umbria (the Agency for the Environment of the Umbria region of Italy). In the Umbria region air quality control is regulated by the Legislative Decree No. 351 published on 04/Aug/1999, which derives from the European Commission Council Directive 96/62/EC on ambient air quality assessment and management. These directives assign to computational simulations the key role of determining the scenario in which environmental policies are carried out. As a matter of
fact, the Umbria region has approved a plan for "Air Quality Preservation" (Regional Council decision 466 of 09/Feb/2005) in which the strategies and tools to be activated for ensuring good air quality are regulated. Among them are the regional inventory of emissions, the local network of monitoring stations and the use of predictive models to evaluate the impact of the adopted measures on air quality. The main pollutants considered for monitoring in the Umbria Region are: sulfur (SOx) and nitrogen (NOx) oxides, all volatile organic compounds except methane (COVNM), carbon monoxide (CO), and particulate matter with diameter smaller than 10 µm (PM10). To this end the Umbria territory has been partitioned into 5 zones in which NOx, SOx, CO, COVNM and PM10 emissions are monitored for control. To act in the same way on ozone (O3) and satisfy the prescriptions of the current Italian legislation (law No. 183 of 21/May/2004 as prescribed by the Legislative Decree No. 351/99), one needs to implement a simulation chain based on the diffusion and physico-chemical transformation of the pollutants intervening in the O3 production/consumption cycles, since this species is mainly produced in the atmosphere by photochemical reactions. To establish a modeling chain able to provide suitable answers to these needs the Department of Chemistry has initiated a collaboration with the Regional Agency for the Environment ARPA Umbria. To this end the modeling software Chimere[8] has been chosen and implemented on the portion of the EGEE computing Grid devoted to the COMPCHEM Virtual Organization. Accordingly, the present paper is structured as follows: in section 2 the implementation of Chimere is described; in section 3 the distributed nature of the modelling chain is singled out; in section 4 the prototype grid implementation is discussed; in section 5 the Umbria case study is illustrated. Some conclusions are drawn in section 6.
2 The Implementation of Chimere
As already mentioned, the software package adopted as the computational engine of the modeling chain is Chimere, believed to be one of the modeling packages best suited to carrying out research and algorithm development for the atmosphere. It is based on the Eulerian chemistry and transport model for air quality developed by the French Pierre Simon Laplace and LISA Institutes of the CNRS and by INERIS (the French National Institute for Environmental Technology and Hazards). Chimere is designed to provide daily predictions of O3, PM (PM10 and PM2.5) and the other main atmospheric pollutants. It can also provide medium range predictions on a local scale with a 1-2 km resolution. Chimere models a large set of physicochemical phenomena concerning atmospheric pollutants, including diffusion, transport and deposition as well as chemical and photochemical reactions. Chimere is also able to deal with aerosols and with reactions occurring in heterogeneous phases. The original version of Chimere was written in FORTRAN 77 and was later partly converted into FORTRAN 90. The version adopted by us is the 200606A
one, which was also the most recent stable version distributed by the developers at the time we started its implementation. The 200606A version has a parallel structure based on an MPMD (Multiple Program Multiple Data) model that makes use of the LAM/MPI implementation of the MPI (Message Passing Interface) parallel library, and is developed and used as a production system by PREV'AIR (the French national air quality forecasting and monitoring system) on x86 bi-processor personal computers running the Linux operating system[9]. In our implementation use has been made of a better performing and more scalable cluster of Intel processors (namely: Pentium Xeon, 3.06 GHz, 2 GB RAM/node, cache L1 8 KB, L2 512 KB), the Linux RedHat 8.0 operating system, the Intel Ifort 3.1.036 compiler and a Gbit network. The mentioned cluster (named GRID) is inserted in the EGEE European production grid[1]. For cross comparison the code has also been implemented in a SUN HPC5.0 parallel environment installed on a SUN platform (RISC UltraSPARC-IIIi processors, SUN SOLARIS 9 operating system, FORTRAN SUN STUDIO 11 compiler). This has allowed the utilization of an advanced high level development environment (compared with that of the grid cluster, which is largely experimental) to spot programming errors that normally pass undetected with the Intel compiler (like attempts to change constants passed as arguments to functions) but still seriously affect the validity of the results. Several customized improvements have been introduced in the execution setup programs (scripts), which are also tailored to avoid the recompilation of the whole code (more than 62000 lines) when only a single routine (or a few) has been modified with respect to the previous compilation. The check of the program implemented by us has been carried out step by step on all the significant elements of the code. The check was repeated when implementing the MPICH1 library.
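A selective rebuild of this kind can be sketched in a few lines of shell. The directory layout, compiler flags and timestamp file below are illustrative assumptions, not the actual Chimere setup scripts:

#!/bin/bash
# Sketch: recompile only the sources modified since the last build.
STAMP=.last_build                           # hypothetical timestamp marker
[ -f "$STAMP" ] || touch -t 197001010000 "$STAMP"   # first run: rebuild everything

for src in $(find src -name '*.f90' -newer "$STAMP"); do
  echo "recompiling $src"
  ifort -c -O2 "$src" -o "${src%.f90}.o"    # flags are illustrative
done

ifort -o chimere src/*.o                    # relink the executable
touch "$STAMP"                              # mark this build as the reference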
3 The Distributed Articulation of Chimere
The input section of Chimere is articulated in separate subsections and makes use of files generated by specific preprocessors. We built several interfaces to assemble sets of data of different types, relevant to the period of interest, in a format suited to those preprocessors. The interfaces are:

1. meteo to netcdf (meteorological data)
2. bemis to netcdf (biogenic emissions)
3. aemission.pl (anthropogenic emissions)

Additional data necessary for the input section are those expressing the boundary conditions of the geographic area of interest. To handle these data a utility (called chmbin2cdf and available from the toolbox directory of Chimere) was also adapted and implemented by us.

meteo to netcdf: this interface provides the input data for meteorology taken from external sources, like air temperatures, wind speeds, etc., contained in the file METEO. In our case the meteo data of the year considered is the one
Table 1. Meteorological variables input to Chimere

parameter                                      meteo variable
temperature at 2 m above ground                tem2
light attenuation due to the clouds            atte
height of the Planetary Boundary Layer         hght
friction speed                                 usta
aerodynamic resistance                         aerr
Monin-Obukhov scaling length                   obuk
convective speed                               wsta
surface relative humidity                      sreh
overall rainfall                               topc
wind speed at 10 m above ground                w10m
saltation speed                                w10s
humidity at ground                             soim
width of each layer                            alti
west-east wind component (zonal)               winz
north-south wind component (meridional)        winm
temperature                                    temp
specific humidity                              sphu
air density                                    airm
vertical dispersion                            kzzz
water content of the clouds                    clwc
generated by LAMI¹. The data were pre-processed and provided by the ARPA of the Emilia-Romagna Region and had to be converted into the netCDF[10] format. The file contains the meteorological conditions estimated by the Lokal Modell[11] developed by the COSMO consortium[12]. The model, which is not hydrostatic, allows the forecast of meteorological conditions for a limited area by solving the complete fluid dynamics equations. Each record of the file contains the date and the values of several three-dimensional quantities (see Table 1) for each of the 8 layers in which the atmosphere is partitioned. It also contains the values of several two-dimensional quantities (for example the aerodynamic resistance) at every gridpoint of the surface of all layers. Most of the difficulties we met arose from the fact that the data were produced using an older version (V200501H) of the Chimere preprocessor, whose output has a structure differing from that of the version we were using. Moreover, some of the variables used in later versions to deal with biogenic emissions were not included. The final size of the METEO file produced by the interface is about 7.2 GB.

bemis to netcdf: this interface provides the input data of hourly biogenic emissions for the period to be covered by the simulations. These emissions
¹ A consortium established by the Department of Civil Protection of Italy, the Italian Air Force, Arpa Piedmont and Emilia-Romagna.
Table 2. Biogenic species input to Chimere

component                                                   biogenic variable
Isoprene                                                    C5H8
Terpene                                                     TERPENE
Nitrogen monoxide                                           NO
Mineral powders of diameter larger than 10 µm               DUST big
Mineral powders of diameter ranging from 10 µm to 2.5 µm    DUST coa
Mineral powders of diameter smaller than 2.5 µm             DUST fin
Total powder of anhydrous sea salt                          SALT coa
Sodium fraction of sea salt                                 NA coa
Chlorine fraction of sea salt                               HCL coa
Sulphur fraction of sea salt                                H2SO4 coa
Sea aqueous fraction of aerosol                             WATER coa
depend on the type of land use (in particular on the vegetation) and on the meteorological data needed to simulate the diffusion process. The related emission files were also produced by ARPA Emilia-Romagna, using an in-house version of the preprocessor derived from the sequential V200501H version of the program. The data were stored in a FORTRAN unformatted fashion rather than in the netCDF format. Because of this it was necessary to write an interface for the conversion between the two formats, in analogy with what was done for the meteorological data. The emission data obtained from ARPA Emilia-Romagna contain the date and the concentrations of the 11 biogenic species (see Table 2) at all points of the grid in which the Central Italy domain has been partitioned. To validate our interface, the emissions produced by the V200501H version when using benchmark data were converted into the BEMISSIONS.nc file of netCDF format and compared with the analogous file produced by version V200606A. The comparison was found to be satisfactory, in spite of the fact that the results of the two runs were not directly comparable because in some cases they treat the various species differently². The size of the BEMISSIONS.nc file is about 1 GB.

aemission.pl: this interface provides the input data of anthropogenic emissions in the netCDF format. The generation of the data is performed using the interface prepemis std, which needs 12 ASCII files containing the hourly emissions of three typical days of the week (working day, holiday and pre-holiday) for each month. The goal of the aemission.pl interface (written by us) is that of creating the 12 hourly files by selecting, for the Central Italy domain, the annual anthropogenic emissions out of the database containing the spatial disaggregation of the National Inventory of the year 2003 for each of the 11 macrosectors. Hourly emissions are recovered by making use of the monthly, daily and hourly disaggregation profiles. Subsequently emissions
² In particular, some algorithms which generate the mentioned emissions differ in the two versions.
Table 3. Anthropogenic species input to Chimere

component                                      anthropogenic variable
Primary powders (diameter > 10 µm)             PPM big
Primary powders (2.5 µm < diameter < 10 µm)    PPM coa
Fine primary powders (diameter < 2.5 µm)       PPM fin
Nitrogen monoxide                              NO
Nitrogen dioxide                               NO2
Nitrous acid                                   HONO
Carbon monoxide                                CO
Sulfur dioxide                                 SO2
Ammonia                                        NH3
Methane                                        CH4
Ethane                                         C2H6
n-Butane                                       NC4H10
Ethylene                                       C2H4
Propene                                        C3H6
α-pinene                                       APINEN
Isoprene                                       C5H8
o-Xylene                                       OXYL
Formaldehyde                                   HCHO
Acetaldehyde                                   CH3CHO
Methyl ethyl ketone                            CH3COE
are partitioned into those of the various compounds (PM, NOx, VOC) using the speciation profiles provided by ISPRA (the National Institute for Protection and Environmental Research, formerly APAT). The chemical species whose emissions are given in the 12 files are listed in Table 3. It is worth pointing out here that the missing profiles of methane (CH4) were handled by setting the related anthropogenic emission equal to zero over the whole domain. The size of the 12 files, produced by an 8 hour run on our machines, amounts to about 100 MB each.

chmbin2cdf: an additional set of data to be provided as input to Chimere is the one containing the initial and boundary conditions of the domain. This can be created either by carrying out a simulation on the global scale or by running Chimere for the same period on a larger domain (though with a coarser grid). It was easier for us to follow the second approach, since the output of a run of Chimere on a larger domain using grids of width ∼ 50x50 km² and the MELCHIOR2 chemical mechanism (the same one we were intending to use) was provided by the PREV'AIR institute. In their run[13]:

– anthropogenic emissions were taken from the EMEP[14] inventory;
– biogenic emissions were already available because computed on the basis of local vegetation and soil usage;
– boundary conditions of gases and aerosols were calculated using the MOZART2[15] and GOCART[16] global models respectively.
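All of the netCDF files assembled by the interfaces described above can be sanity-checked from the command line with the standard ncdump utility shipped with the netCDF library[10]. This is just an illustrative routine: the .nc file names and the variable picked below are assumptions, not the actual production file names:

#!/bin/bash
# Sketch: inspect the converted input files before feeding them to Chimere.
for f in METEO.nc BEMISSIONS.nc BOUND_CONC.nc; do   # names are illustrative
  echo "== $f =="
  ncdump -h "$f"               # header only: dimensions, variables, attributes
done
ncdump -v tem2 METEO.nc | tail -n 20   # spot-check the 2 m temperature values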
Additional effort had to be spent to adapt the chmbin2cdf interfaces, part of the Chimere package, to the PREV'AIR data, since they were produced using 3 different, older versions of the code (V200310F, V200402D, V200410A). During this work it was noticed that a variable related to the air density used by the program was not included in the available data. After a consultation with the experts of ARPA Emilia-Romagna we set it equal to a standard value. The size of the global boundary conditions file in the netCDF format is about 1.8 GB.
4 The Prototype Grid Implementation
As already mentioned, Chimere was implemented by us on two different concurrent platforms. The easiest way of exploiting concurrency in Chimere is to let the program run in parallel on several clusters. The V200606A version of the code is in fact parallel, and parallel runs on an increasing number of processors have been performed. A typical run on our 8 processor Intel cluster needs 124 hours to handle a simulation covering 120 days and collecting 30 GB of result data. A distribution of Chimere over the grid is the most effective way of exploiting concurrency and gaining in computing throughput. As a matter of fact, the input section of the program is already designed in a way that collects data of distributed provenance. However, in order to effectively exploit concurrency and achieve significant computing throughput, a proper distribution model needs to be adopted. More in detail, an iterative structure of independent cycles to be executed a large number of times has to be singled out. For this purpose we have exploited the fact that a given day's simulation depends only partially (for a few hours) on the initial concentrations (the values that the variables take in the previous day), since the concentrations rapidly converge to the true solution irrespective of the starting values. Accordingly, we have restructured Chimere so as to run concurrently the simulations for subsets of days, the first day of each subset replicating the last day of another subset. The results produced by the various subsets are then glued together by discarding the first day of each subset. Such a restructuring has allowed us to implement a prototype version of Chimere on the segment of the EGEE production Grid available to COMPCHEM. The porting was performed by making use of the User Interface (UI) machine available in COMPCHEM. From the UI the user is able to compile and test the code, submit it to the grid environment for execution, monitor the status of the submitted work and, finally, retrieve the results of the performed calculations. The procedure is articulated as follows: first, the code is compiled using the Intel Fortran Compiler, with the support of the Message Passing Interface (MPI) libraries in order to maximize the performance of the multiprocessor Working Nodes (WNs) present on the segment of the production EGEE grid available to the COMPCHEM VO, and using the netCDF libraries. Second, the files necessary
for the execution are uploaded to the Grid environment on one of the Storage Elements (a remote machine for data storage that supports, via the gridftp protocol, the data transfer between the machines interconnected in the Grid) which support the VO (in particular the se.grid.unipg.it SE) before submitting the job. Third, the script is launched for execution. To this end the following bash script is used:
lcg-cp --vo compchem lfn:/grid/compchem/meteo \\ file:meteo lcg-cp --vo compchem lfn:/grid/compchem/emi \\ file:emi lcg-cp --vo compchem lfn:/grid/compchem/bound \\ file:bound ./chimere > CHIMERE-prod.log 2>&1 tar -cvzf data.tar.gz *.nc *.log lcg-cr -d se.grid.unipg.it -l lfn:/grid/compchem/data.tar.gz \\ file:data.tar.gz exit
In lines 01, 03 and 05 of the script the input files needed for the execution are retrieved from the SE and downloaded to the Computing Element (CE) where the calculation is performed; in line 07 the parallel version of Chimere is executed; in lines 08 and 09 the files produced by the execution are collected in a single tar file and uploaded to the SE. Fourth, at the end of the simulation, the tar files created by the calculation is directly retrieved from the SE using the lcg-cp command and transferred into the UI machine.
5
The Umbria Case Study and Results
Data used for biogenic and anthropogenic emissions are illustrated in figures 1, 2 e 3 where regional boundaries are also drawn and the Umbria region is placed at the center of each panel. The figures show respectively the temperature at 2 m above ground, biogenic and anthropogenic emission of NO for some hours of a day of July 2004 (please notice that the maximum is different for the last two graphs). As apparent from the figures the temporal and spatial trends are in line with what is expected from the orography, the kind of urbanization and the traffic routes of the place examined. The computing domain made of 8000 near-square cells of 5 Km side having a total extension of 500x400 Km2 is sketched in fig. 4. The actual run has been carried out to simulate the situation of the gas and aerosol atmospheric phases of the summer of the year 2004 (from May 1 to August 31), since in that period the ozone concentration reached a peak. To this end the chemistry mechanism MELCHIOR2, involving 44 species and 220 reactive processes, was adopted together with an aerosol sectional simulation model taking into account 7 species (primariy particle materials, nitrate, sulfate, ammonium, biogenic and anthropogenic secondary organic aerosol, water).
A Grid Implementation of Chimere: Ozone Production in Central Italy
Thu Jul 1 2004 05:00
Thu Jul 1 2004 08:00
Thu Jul 1 2004 11:00
Thu Jul 1 2004 14:00
Thu Jul 1 2004 17:00
Thu Jul 1 2004 20:00
123
Fig. 1. Air temperature at two meters from ground for various hours of July 1, 2004. Black stands for the lowest temperature (6◦ C) and white for the highest temperature shown (42◦ C). Regional boundaries are drawn in white and the Umbria region is placed at the center of each panel.
Thu Jul 1 2004 05:00
124
A. Lagan` a et al.
Thu Jul 1 2004 05:00
Thu Jul 1 2004 14:00
Thu Jul 1 2004 08:00
Thu Jul 1 2004 17:00
Thu Jul 1 2004 11:00
Thu Jul 1 2004 20:00
Fig. 3. Anthropogenic emissions a of NO at various hours of July 1, 2004. Black stands for zero emission and white for the highest emission shown (3×1012 molecule/(cm2 ×s)).
Fig. 4. Computing domain of the simulation (finer central grid) and boundary conditions grid domain (coarser grid domain), both centered on the Umbria region. The extension of the computing domain (given in coordinates of latitude and longitude of the reference system WGS84) ranges from 35,10.5 degrees (SW corner) to 57.5,22.5 degrees (NE corner).
A Grid Implementation of Chimere: Ozone Production in Central Italy
Thu Jul 1 2004 05:00
Thu Jul 1 2004 08:00
Thu Jul 1 2004 11:00
Thu Jul 1 2004 14:00
Thu Jul 1 2004 17:00
Thu Jul 1 2004 20:00
125
Fig. 5. O3 concentrations determined at various hours of July 1, 2004. Black stands for the minimum concentration (0.01 ppb) and white for the highest concentration (108 ppb).
O3 (ppb)
80 60 40 20 0
0
5
10
15
20
0
24
48
72
96
120
144
90 15:00 of every day
O3 (ppb)
80 70 60 50 40 30
0
168
336 504 Time (hours since Thu Jul 1 2004 00:00)
672
Fig. 6. O3 concentration (in ppb) in one of the cells of the calculation (lat=43.104 degrees, long=12.35 degrees) for the Perugia town. In the upper left panel the daily behaviour of July 1, 2004; in the upper right panel the weekly behaviour in the week July from 1 to 7, 2004; in the lower panel the monthly (at 15:00 of every day) behaviour for July 2004.
126
A. Lagan` a et al.
160
model station (Cortonese)
3
O3 (μg/m )
120
80
40
0
0
24
48
72
96
120
144
168
Time (hours since Thu Jul 1 2004 00:00)
Fig. 7. Computed (solid line with circles) and measured (dashed line with triangles) O3 concentrations (in µg/m³) at an urban-type monitoring station (Perugia, Via Cortonese) during the week from 1 to 7 July 2004.
The calculated concentrations of O3 for some daily hours are plotted in Fig. 5. They show that the concentrations oscillate during the day and during the week. This is illustrated in detail in Fig. 6, in which the daily, weekly and monthly variations of the concentrations of interest are plotted as a function of the elapsed hours (the monthly data refer to 3 pm of each day) and show a daily maximum, the weekly periodicity of O3, and a behaviour probably to be associated with the meteorological conditions. In Fig. 7 the O3 concentrations measured at the monitoring station of type URB³, located in a central place characterized by a low density of traffic (Perugia, Via Cortonese), and those estimated by the model are plotted. As can easily be seen from the figure, the measured and estimated data are in good agreement in both the structures and the magnitudes of the oscillations. It is however important at this point to emphasize that the measured concentrations depend on the particular position (latitude, longitude and height from the ground) of the instrumentation used. In the case illustrated here the monitoring station is placed inside a park and well represents the urban background, since ozone is not produced in its immediate vicinity to an appreciable extent. It is, therefore, reasonable to expect these data to be in better agreement with the estimated values than data measured at monitoring stations placed near heavily trafficked crossroads.
³ Urban/Suburban Background Station [17]: a station measuring the average concentration of pollutants in urban areas due to urban emissions and migration from the periphery.
It has also to be added that the cell considered has an area of 25 km² and is 43 m high. This means that inside that cell (which covers not only half the urban centre but also a wide portion of periphery with low population density) any value has to be considered as an average.
6 Conclusions
In this paper we discuss the problem of implementing a chemical and transport model to describe the complex mechanisms of formation, transport and elimination of atmospheric pollutants. The purpose of our work has been twofold. The first goal has been that of carrying out a detailed analysis of the chosen model (in our case the Chimere code was chosen because of its reputation for being particularly biased towards the physico-chemical aspects of the problem) in order to understand in depth its mechanisms and articulation. This goal has been successfully accomplished and has led to the singling out of the key components of the computational procedure, allowing the fixing of some drawbacks of the version of the code used for our implementation. The second goal of our work has been that of carrying out a critical implementation of Chimere targeted at the development of loosely coupled distributed structures suited to run on the computing grid. This goal has also been accomplished successfully. Our efforts have, in fact, led to the assemblage of a prototype distributed version of the code that has been effectively implemented on the EGEE production computing grid and has been used for modeling the production of secondary pollutants in the Umbria region during the summer (which means particular attention to the ozone concentration) by exploiting the information obtainable from the appropriate distributed sources. This implementation paves the way for further advances:
– optimization of the distributed version of Chimere for high-throughput utilization in routine usage;
– improvement of the chemical mechanisms by linking Chimere with GEMS, the grid-empowered simulator of molecular processes, allowing the use of ab initio estimates of the efficiency of the intervening chemical processes;
– integration of Chimere with output statistical analysis and further graphical tools supporting a better understanding and a multimedia dissemination of the results;
– comparison with the measurements of the monitoring stations and with the performance of other models, in order to evaluate the accuracy of the model and figure out possible improvements.
Acknowledgments
Thanks are due to the French institutes Pierre-Simon Laplace (C.N.R.S.), INERIS and LISA (C.N.R.S.) for distributing the version of Chimere used for our
investigation; and to the Italian ARPA Emilia Romagna and Lombardia for sharing meteorological and biogenic emission data. Thanks are also due to COST CMST (Action D37), EGEE III, the Fondazione Cassa di Risparmio of Perugia and ARPA Umbria for financial support.
References
1. EGEE: Enabling Grids for E-Science in Europe, http://www.eu-egee.org
2. Laganà, A., Riganelli, A., Gervasi, O.: On the structuring of the computational chemistry virtual organization COMPCHEM. In: Gavrilova, M.L., Gervasi, O., Kumar, V., Tan, C.J.K., Taniar, D., Laganà, A., Mun, Y., Choo, H. (eds.) ICCSA 2006. LNCS, vol. 3980, pp. 665–674. Springer, Heidelberg (2006)
3. Kesselman, C., Foster, I.: The Grid: Blueprint for a Future Computing Infrastructure. Morgan Kaufmann, USA (1999)
4. Gervasi, O., Laganà, A., Lobbiani, M.: Towards a grid based portal for an a priori molecular simulator of chemical reactivity. In: Sloot, P.M.A., Tan, C.J.K., Dongarra, J., Hoekstra, A.G. (eds.) ICCS-ComputSci 2002. LNCS, vol. 2331, pp. 956–965. Springer, Heidelberg (2002)
5. ESA project: Fundamental Issues on the Aerothermodynamics of Planetary Atmospheric Reentry, AO/1-5593/08/NL/HE
6. Carvalho, M.G., Nogueira, M.: Model-based study for Oxy-fuel furnaces for low-NOx melting process. In: Carvalho, M.G., Lockwood, F.C., Fiveland, W.A., Papadopoulos, C. (eds.) Clean Combustion Technologies Pt. B, pp. 941–960. CRC Press, Boca Raton (1999)
7. ARPA Umbria and the Department of Chemistry of the University of Perugia: Memorandum of Understanding n. 494 (2008)
8. The Chimere chemistry-transport model: a multi-scale model for air quality forecasting and simulation. Institut Pierre-Simon Laplace, INERIS, LISA, C.N.R.S. (2004), http://www.lmd.polytechnique.fr/chimere
9. Honoré, C., Rouïl, L., Vautard, R., Beekmann, M., Bessagnet, B., Dufour, A., Elichegaray, C., Flaud, J.M., Malherbe, L., Meleux, F., Menut, L., Martin, D., Peuch, A., Peuch, V.H., Poisson, N.: Predictability of European air quality: Assessment of 3 years of operational forecasts and analyses by the PREV'AIR system. J. Geophys. Res. 113, D04301 (2008)
10. Network Common Data Form (netCDF): a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data, http://www.unidata.ucar.edu/packages/netcdf
11. Doms, G., Schättler, U.: A description of the nonhydrostatic regional model LM. Part I: Dynamics and Numerics (2002), http://www.cosmo-model.org/content/model/documentation/core/cosmoDyncsNumcs.pdf
12. COSMO: Consortium for Small-scale Modelling, http://www.cosmo-model.org
13. Input Data to the PREV'AIR System, http://www.prevair.org/en/donneesentrees.php
14. EMEP: a scientifically based and policy driven programme under the Convention on Long-range Transboundary Air Pollution for international co-operation to solve transboundary air pollution problems, http://www.emep.int
15. Horowitz, L.W., Walters, S., Mauzerall, D.L., Emmons, L.K., Rasch, P.J., Granier, C., Tie, X., Lamarque, J.F., Schultz, M.G., Tyndall, G.S., Orlando, J.J., Brasseur, G.P.: A global simulation of tropospheric ozone and related tracers: Description and evaluation of MOZART, version 2. J. Geophys. Res. 108(D24), 4784 (2003)
16. Ginoux, P., Chin, M., Tegen, I., Prospero, J.M., Holben, B., Dubovik, O., Lin, S.J.: Sources and distributions of dust aerosols simulated with the GOCART model. J. Geophys. Res. 106, 20255–20273 (2001)
17. Larssen, S., Sluyter, R., Helmis, C.: Criteria for EUROAIRNET, the EEA Air Quality Monitoring and Information Network. European Environment Agency, Copenhagen (1999), http://reports.eea.europa.eu/TEC12/en/tech12.pdf
MDA-Based Framework for Automatic Generation of Consistent Firewall ACLs with NAT Sergio Pozo, A.J. Varela-Vaca, and Rafael M. Gasca QUIVIR Research Group, Department of Computer Languages and Systems Computer Engineering College, University of Seville Avda. Reina Mercedes S/N, 41012 Sevilla, Spain {sergiopozo, ajvarela, gasca}@us.es http://www.lsi.us.es/~quivir
Abstract. The design and management of firewall ACLs is a very hard and error-prone task. Part of this complexity comes from the fact that each firewall platform has its own low-level language with a different functionality, syntax, and development environment. Although several high-level languages have been proposed to model firewall access control policies, none of them has been widely adopted by the industry due to a combination of factors: high complexity, no support for important features of firewalls, no common development process, etc. In this paper, a development process for firewall ACLs based on the Model Driven Architecture (MDA) framework is proposed. The framework supports the market-leader firewall platforms and is user-extensible. The most important access control policy languages are reviewed, with special focus on the development of firewall ACLs. Based on this analysis, a new DSL for firewall ACLs, AFPL2, covering most features that other languages do not cover, is proposed. The language is then used as the platform-independent meta-model, the first part of the MDA-based framework. Keywords: firewall, acl, ruleset, framework, language.
1 Introduction
A firewall is a network element that controls the traversal of packets across different network segments. It is a mechanism to enforce an Access Control Policy, represented as an Access Control List (ACL), or rule set. Firewalls use obligation policies (also known as Event-Condition-Action (ECA) rules), which must perform certain actions when certain events occur. By contrast, authorisation policies permit or deny actions based upon the action, the source of the action and the target of the action. Thus, a layer-3 firewall ACL is in general a list of linearly ordered (totally ordered) condition/action rules. Let $ACL_f$ be a firewall ACL consisting of $f+1$ rules,
$ACL_f = \{R_0, \dots, R_f\}$. Consider a rule $R_j \in ACL_f$, $R_j = \langle H, Action \rangle$, $H \subseteq Z$, $0 \le j \le f$, where $Z = protocol \times srcIP \times srcPrt \times dstIP \times dstPrt$ and $Action = \{allow, deny\}$ is its action. A selector of a firewall rule $R_j$ is defined as $R_j[k]$, $k \in H$, $0 \le j \le f$. A rule $R_j$ matches a packet $p$ when the values of each field of the header of the packet, $p[k]$, $k \in H$, are subsets of or equal to the values of the rule selector $R_j[k]$ (i.e. $p[k] \subseteq R_j[k]$, $\forall k \in H$).
Firewalls have to face many problems in modern networks. Two of the most important ones are the high complexity of ACL design [1] and ACL consistency diagnosis [2, 3]. Networks have different access control requirements, which must be translated by a network administrator into firewall ACLs. Writing and managing ACLs are tedious, time-consuming and error-prone tasks for a wide range of reasons [4]. Low-level firewall languages are, in general, hard to learn, use and understand. In addition, each firewall platform has its own low-level language, which is usually very different from those of other vendors. Changing from one firewall platform to another often means a complete rewrite of the ACL. In this translation process, inconsistencies and redundancies can be introduced [2, 3]. Figs. 1 and 2 present two fragments of ACLs written in IPTables and Cisco PIX respectively to give an idea of the complexity and differences of these languages. Note that the number of rules of a firewall ACL may range between a few and 5000 [5]. Many third-party domain specific languages (DSLs) have been proposed to abstract the network administrator from the underlying firewall platform details and language syntax [6, 7, 8, 9, 10, 11]. A domain specific language provides more possibilities to network administrators, since it can raise the abstraction level of the problem domain using its own concepts. However, as we are going to show in the next section, the proposed DSLs have different problems regarding different aspects of the design and deployment of ACLs. The result is that these languages are not widely adopted by industry. This is possibly not only due to their problems, but also to the need of companies to maintain a large user base tied to a particular low-level language and firewall platform. We think that there is a clear need for a DSL for firewalls with the expressive power of existing low-level firewall-specific languages, but with significantly less
-A FORWARD -i -s 192.168.1.0 -d 170.0.1.10 -j DROP
-A FORWARD -i -p tcp -m tcp --sport any --dport 21 -j ACCEPT
-A FORWARD -i -s 192.168.1.0 -d 170.0.1.10 -p udp -m udp --dport 53 -j ACCEPT
-A FORWARD -i -d 170.0.1.10 -p udp -m udp --dport 53 -j ACCEPT
-A FORWARD -i -s 192.168.2.0 -d 170.0.2.0 -j DROP
-A FORWARD -i -p udp -m udp -j ACCEPT
Fig. 1. IPTables ACL fragment
access-list acl-out permit gre host 192.168.201.25 host 192.168.201.5
access-list acl-out permit tcp host 192.168.201.25 host 192.168.201.5 eq 1723
static (inside,outside) 192.168.201.5 10.48.66.106 netmask 255.255.255.255 0 0
access-group acl-out in interface outside
access-list acl-out permit udp host 192.168.201.25 host 192.168.201.5 eq 1701
static (inside,outside) 192.168.201.5 10.48.66.106 netmask 255.255.255.255 0 0
access-group acl-out in interface outside
Fig. 2. Cisco PIX ACL fragment
complexity than currently proposed high-level policy languages. In addition, this language must offer the possibility of automatic compilation to the market-leader low-level firewall languages, and be easily user-extensible in order to support new features and firewall platforms. Recently, we have proposed a new high-level firewall DSL with all these capabilities, called the Abstract Firewall Policy Language (AFPL) [12]. The DSL was defined in the XML technological space. In this paper we propose an extension of AFPL with Network Address Translation (NAT) [16], a must-have feature of firewall languages. To the best of our knowledge, AFPL2 is the first firewall DSL to support NAT. However, with AFPL some advanced features of firewall platforms cannot be modeled, since the language is an abstraction based on the common features of the market leaders. For this reason, we propose a framework where AFPL2 can be used along with the integration of other lower-level concepts related to particular platforms. This framework is heavily based on the concepts of MDA, extended with a model consistency stage to guarantee the quality of the resulting compiled ACL. The framework is extensible by end-users, in the sense that more concepts can be added to the meta-models, as well as modifications to the transformations between them, in order to represent more features and/or low-level firewall platforms and languages. The structure of this paper is as follows: in section 2 related works are described. In section 3 the MDA-based framework for firewalls is described. In section 4 a DSL with Network Address Translation support is proposed as the PIM of the MDA-based framework. We conclude in section 5 and propose a research direction for our future work. Finally, Appendix I presents the analysis of firewall platforms (NAT features).
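To make the matching semantics formalized in this introduction concrete, the short Python sketch below models each rule selector as a plain set and applies the usual first-match semantics; the concrete rule, packet values and default-deny policy are illustrative assumptions, not taken from the paper.

H = ("protocol", "srcIP", "srcPrt", "dstIP", "dstPrt")

def matches(selectors, packet):
    # p[k] must fall within the rule selector R_j[k] for every field k in H
    return all(packet[k] in selectors[k] for k in H)

def first_match(acl, packet):
    # Total order: the first matching rule decides the action
    for selectors, action in acl:
        if matches(selectors, packet):
            return action
    return "deny"  # assumed default policy

rule = ({"protocol": {"tcp"}, "srcIP": {"192.168.1.7"},
         "srcPrt": set(range(1024, 65536)),
         "dstIP": {"170.0.1.10"}, "dstPrt": {80}}, "allow")
packet = {"protocol": "tcp", "srcIP": "192.168.1.7", "srcPrt": 40000,
          "dstIP": "170.0.1.10", "dstPrt": 80}
print(first_match([rule], packet))  # -> allow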
2 Related Works
In [14] the authors propose a high-level language, Firmato, which models ACLs as ERDs in order to automatically generate low-level firewall ACLs. However, the complexity of Firmato is similar to that of many low-level languages. Two major limitations of Firmato are that (1) it does not support NAT and (2) it can only represent knowledge in positive logic (allow rules), which complicates the specification of exceptions (a rule with a general allow action, immediately preceded by a more restrictive rule with a deny action). This can result in the need to write a lot of rules to express them. However, as a side effect, rules are always consistent and order-independent. FLIP [15] is a recently proposed firewall language which can also be compiled into several low-level ones, although the paper provides no further information about this feature. Its authors claim that ACLs expressed in FLIP are always consistent. In fact they are, because of one of its limitations: it does not support overlapping between rule selectors. Prohibiting the use of overlaps is a major limitation, since it makes it impossible to express exceptions. In addition, its syntax is even more complex than Firmato's. However, due to this lack of expressiveness, FLIP ACLs are order-independent. Finally, NAT is not supported in FLIP. In [7] the authors provide a general language, Ponder, to represent network policies (in general), which cannot be compiled to any low-level language. A re-engineered version, Ponder2, is also available. However, the complexity of Ponder surpasses the needs of firewall ACLs. In theory, a language that can express any network policy could express a firewall
Table 1. Survey of features of access control languages. Languages compared: Firmato, Ponder2, FLIP, SRML, Rule-ML, PCIM, XACML and AFPL. Features compared: FW-specific; user-extensible; consistency diagnosis; redundancy diagnosis; support of stateful rules; support of stateless rules; support of NAT; positive logic; negative logic; selector-values overlap; user-controlled rule order; topology/logic separation; relative complexity; compilation to low-level languages; low-level language import. [Per-language cell values (√/x/Partial/N/A, plus complexity ratings from Low to High) are garbled in the extracted layout.]
access control policy. However, concepts such as NAT cannot be expressed with Ponder. AFPL [12] is a language developed after an analysis of the features of the major firewall languages, supporting most of their functionality at a fraction of their complexity. It can express stateful and stateless rules (although an administrator does not need to know this kind of detail, since the complexity is hidden in the language), positive and negative rules, overlaps and exceptions, and can be compiled to six market-leader firewall languages. Some organizations have even proposed languages to represent access control policies as XML documents, such as XACML [8], PCIM [9], Rule-ML [10], and SRML [11]. However, none of these languages is specific enough for firewall access control policies, resulting in high complexity when expressing firewall concepts, or in the impossibility of expressing them at all (this is the case of NAT for all these languages). Even UML has been proposed to model access control policies [17]. However, in our problem domain UML could be an aid in the requirements definition stage, but these models would then need to be translated into a DSL. These models and languages are very generic and are not intended for any particular access control problem area. Table 1 presents a survey of the most important features of the reviewed languages (related to expressing firewall ACL knowledge). With respect to commercial or Open Source applications, the two most important ones are Firewall Builder and Solsoft ChangeManager. However, these kinds of applications have been left out of the comparison, because they are not based on a model of a firewall, but rather on an abstraction of its command-line syntax. For example, in these two solutions the firewall platform must be specified upon firewall instantiation, since each firewall platform supports a different set of features. The reviewed proposals and the one proposed in this paper are focused on a unique model for all firewalls.
3 MDA for Firewall ACL Design
In the last few years Model-Driven Development (MDD) has promoted the use of models and transformations in software development processes [19]. The Model-Driven
Fig. 3. MDA Framework
Architecture (MDA) [18] is the approach to MDD promoted by the Object Management Group (OMG) (Fig. 3). However, the use of the MDA framework does not guarantee that the model is free of inconsistencies and redundancies. Moving the verification stage to earlier stages in the process, prior to code generation, can reduce the budget dedicated to consistency diagnosis and correction of the final ACL. In an MDD approach this is even more important, since models are the core of the methodology and executable code is automatically generated from them. It is important to note that, since the focus of the MDD paradigm is on the creation of static models, there are no execution details; thus, the proposed diagnosis stage is in reality a (static) verification stage. This diagnosis stage has been proposed in earlier works [1] and is not the focus of this paper. Fig. 4 shows the proposed framework, which follows MDA, but with the inclusion of two diagnosis stages, one for the PIM and the other for the PSM. Note that the PSM diagnosis stage is only necessary (1) if the information regarding the PIM is modified, or (2) when information is modified or added by an end-user to the PSM. If an inconsistency is found, it must be corrected before applying the next model transformation.
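As a rough illustration of the process of Fig. 4, the following Python sketch chains the two diagnosis stages with the M2M and M2T transformations; every name here is a hypothetical placeholder for exposition, not an API of the actual framework.

def build_acl(pim, platform, m2m, m2t, diagnose):
    # PIM diagnosis stage: inconsistencies must be corrected before transforming
    faults = diagnose(pim)
    if faults:
        raise ValueError(f"fix PIM inconsistencies first: {faults}")
    psm = m2m[platform](pim)      # model-to-model transformation (PIM -> PSM)
    # PSM diagnosis stage: needed if PIM information changed or the user edited the PSM
    faults = diagnose(psm)
    if faults:
        raise ValueError(f"fix PSM inconsistencies first: {faults}")
    return m2t[platform](psm)     # model-to-text transformation: low-level ACL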
3.1 Modeling Considerations
Firewall platforms are very different from one vendor to another, and even among the available Open Source platforms. These differences range from differences in the number, type, and syntax of the selectors that each platform's filtering algorithm can handle, to huge differences in rule-processing algorithms that can affect the design of the ACL. Fortunately, the vast majority of filtering actions can be expressed with any of the filtering languages and platforms, the only differences being the number of rules needed and/or their syntax. For example, an IP address range can be translated into several blocks of single IP addresses, so both syntaxes are equivalent.
Fig. 4. MDA Framework for Firewalls (with Model Verification)
[Figure content: PIM (filtering selectors, syntax for each selector, actions, NAT); IPTables PSM (PIM feature extensions: packet mangling, rule hit frequency, malformed packets, connection tracking, rule processing, logging); Cisco PIX PSM (PIM feature extensions: content inspection, interface configuration, VPN, rule processing, logging); others; code-generation level (connection tracking, state inspection, rule order processing, ...).]
Fig. 5. PIM and PSM meta-models proposal
With the focus on modelling firewall languages and platform functionality, some questions may arise. The first is whether all the firewall platforms analyzed share a common set of filtering selectors (or a common set of functionality). Another is, for the common set of selectors, whether there is at least one common syntax among all firewall platforms (in order to be able to use the functionality) or, if not, whether the available syntaxes for each platform have equivalences in the other ones (i.e. are emulable). This part is, in fact, the configuration of the ACL an administrator can design. For the design of AFPL2 we use the same methodology as for AFPL [12]: first a DSL with the set of selectors (or features) and syntaxes supported by all the analyzed firewall platforms and languages is created. Then, the non-common selectors and syntaxes are analyzed and only added to AFPL2 if they comply with a criterion that is defined in the next section. In the case of AFPL, this methodology yielded a lightweight language with a very simple syntax that satisfies the vast majority of administrators. We expect the same for AFPL2. The methodology is outlined in the next sections, and has been described in detail in [12]. We propose to consider the AFPL2 part as the PIM in our framework. However, for each analyzed firewall language and platform there is a set of functionality that has not been modelled in the PIM. There are basically two options for it: to model it in another (lower) level of the framework, or to completely remove it. Removal would force an advanced administrator needing it to modify the generated ACL by hand. However, if this part is introduced in a lower-level meta-model, then the administrator can model this advanced behaviour without modifying the ACL. For this reason, we propose to consider this platform-specific functionality as the PSM for each platform. Note that a PSM could modify the PIM in some way (during the automatic M2M transformation, or through a direct modification by the administrator over the PSM), and thus new inconsistencies could be introduced (this is the reason why in
Fig. 4 there is a diagnosis stage before the M2T transformations). With this approach the PIM is as simple as possible, serving the vast majority of administrators, while the PSM facilitates the use of platform-specific features (all of them if needed). Besides filtering, firewall platforms have other specific features: for example, how each platform treats connection tracking (that is, stateful or stateless connections), how rule processing is performed (forward, backward, with jumps), etc. However, this part is related to how each platform executes the ACL, and thus an administrator cannot modify this behaviour. For these reasons, this behaviour is considered only in the M2T transformation from PSM to low-level ACL. We think that the proposed concept separation fits well with the MDA approach. Fig. 5 shows the proposed feature division for firewalls.
4 AFPL with NAT. PIM Meta-model
In this work we depart from previous results for the DSL design, where various alternative models for a firewall DSL were created in a bottom-up process and discussed. The result was the Abstract Firewall Policy Language (AFPL) [12]. The main goals of AFPL were simplicity, ease of use, and support of a common set of functionality valid for most administrators. In this section, AFPL is extended to support NAT. This new DSL, AFPL2, serves as the PIM meta-model in the proposed MDA framework.
4.1 AFPL Extended with NAT: AFPL2
NAT is a must-have feature of modern low-level firewall languages (it was defined in 1999), and nowadays all firewall platforms support it. The main idea behind NAT is to change (translate) the values of some headers of TCP/IP packets in different situations. These changes are specified using rules (translation rules) in a similar way as filtering rules are specified. There are mainly two modes of NAT (as defined in RFC 2663 [16]); an illustrative low-level example follows the list.
• Source NAT (SNAT). Also known as Outbound NAT, Network Address Port Translation (NAPT), or Masquerading: the source of a packet is translated when it traverses an outbound interface of a firewall. Response packets are translated back to their real address.
• Destination NAT (DNAT). Also known as Packet Forwarding: the destination of a packet is translated when it traverses an inbound interface of a firewall. Response packets are translated back to their real address.
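As a purely illustrative low-level counterpart of these two modes (the addresses, interface name and ports below are invented, not taken from the analysis), Netfilter IPTables expresses SNAT on the outbound path and DNAT on the inbound path roughly as follows:

iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth0 -j SNAT --to-source 203.0.113.1
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 192.168.1.10:8080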
However, although NAT has been defined in an RFC, it has not been standardized. For this reason, an analysis of the NAT features supported by the market-leader firewall platforms is needed for the design of AFPL2, in order to satisfy the vast majority of administrators. The firewall platforms considered in the NAT analysis for AFPL2 are the same as for AFPL: IPTables 1.4.2, Cisco PIX 8, FreeBSD 8 IPFilter, FreeBSD 8 IPFirewall, OpenBSD 4.1 Packet Filter, and Checkpoint Firewall-1 4.1. The analysis is presented in Appendix I and shows the supported NAT modes, their filtering selectors, and the available syntaxes.
Table 2(a). AFPL2 Source NAT
Translated Selector | Obligation | Syntax | Comments
Source IP Address | Mandatory | Host IP; Interface name | If the interface name is given, the interface IP is used (it could be a dynamic link)

Table 2(b). AFPL2 Destination NAT
Translated Selector | Obligation | Dependencies | Syntax
Destination IP Address | Mandatory | (none) | Host IP
Destination Port | Optional | Destination port must be specified in the original packet | Number; Range [p1,p2]
Our starting point is the factorized AFPL2 model presented in Table 2. A factorized model is a model where only the common NAT modes, selectors and syntaxes are defined. We take it as a basic NAT model. Note that NAT uses both filtering rules, for matching packets, and translation rules, for defining which selectors of the matched packets will be translated and how. In Table 2 only the part related to translation rules is presented, since filtering was covered in AFPL [12]. In the next sections, the non-common functionality is analyzed for its inclusion in AFPL2.
4.1.1 Addition of Uncommon NAT Modes
Although there are many ways of expressing translations, only two kinds of translation rules are supported in AFPL2 (Table 2). In fact, the analyzed firewall languages support more NAT modes (analyzed in Appendix I). There, we show that all these modes are in reality variations of the two basic NAT modes defined in RFC 2663 (Source and Destination NAT) and can be reproduced in one way or another using these two basic types. Although RFC 2663 defines more NAT modes (like Twice NAT or Multi-homed NAT), they are not supported in any of the analyzed firewall platforms. For these reasons, no more NAT modes are necessary in AFPL2.
4.1.2 Addition of Uncommon Selectors
An uncommon selector can be added if its functionality can be reproduced (emulated) with the selectors of the factorized model and it also adds new functionality to the model. A selector adds new functionality to the model if it cannot be emulated with translation rules which do not contain it. The conclusion is that, using this criterion, no more selectors can be added to AFPL2. An exhaustive list of selectors per firewall is presented in Appendix I.
4.1.3 Addition of Uncommon Syntaxes
In this section, we analyze the possibility of supporting uncommon syntaxes in the considered selectors. In general, we consider an uncommon syntax as a candidate for
its addition to the collection of supported syntaxes for that selector in AFPL2 if it can be emulated with the common syntaxes of the same selector and it provides clear usability improvements for human users. A syntax provides clear usability improvements if its use by a human cannot introduce inconsistencies [3] in an ACL created with AFPL2 and it provides compactness. Again, we base this analysis on the results presented in Appendix I.
• SNAT source IP address. The uncommon syntaxes of this selector are identifiers, block IPs, IP ranges and collections of IPs. Note that the use of identifiers provides a clear usability improvement and does not introduce inconsistencies in the ACL, and thus will be considered for AFPL2. All the other syntaxes provide ACL compactness and also represent usability improvements. As they cannot cause ACL inconsistencies, they will also be considered (except IP ranges and IP collections, which are redundant). Block IPs, IP ranges, and in general collections of IPs can be emulated in low-level languages that do not support them by decomposing these collections of IPs into several unique IPs and defining one NAT rule for each.
• DNAT destination IP address. The uncommon syntaxes of this selector are identifiers and IP ranges. Identifiers are included for the same reasons stated for SNAT source IPs. However, IP ranges cannot be included, because they are only used for load balancing, a non-emulable feature not supported by all the analyzed platforms.
• DNAT destination port. Many range syntaxes are possible in many firewall platforms, as is the case of the ranges '=p', '(p1, p2)' and ')p1, p2('. These syntaxes provide no new functionality or clear usability improvement, and can be easily emulated with the common '[p1, p2]' syntax without loss of functionality. For this reason they will not be included in AFPL2.
In order to match the original packet, the same selectors and rule format used for filtering in AFPL can be used without restrictions, since all firewall languages support them in at least one of their NAT modes. For translation selectors, the selectors presented in Table 3 must be used, with the presented constraints on their syntax. This table represents the final AFPL2 language (NAT part).
4.2 PIM Meta-model
It is necessary to clarify that MDA does not require the use of UML to specify PIMs or PSMs; it is just a recommendation. When developers have to define a meta-model, they have to choose the meta-modelling technique: a UML-based profile (also called a lightweight extension) or a MOF-based meta-model (a heavyweight extension). There are different reasons for selecting one of them [13]. The AFPL2 PIM meta-model (from now on, the PIM) is composed of a hierarchical structure of meta-classes (Fig. 6). The PIM root element is the Policy, which represents the ACL concept. An instance of the Policy meta-class can have one or more Rule child meta-classes, and zero or more SNAT and/or DNAT rules. Thus AFPL2 supports three kinds of rules: filtering (also present in AFPL), SNAT, and DNAT. SNAT and DNAT rules are not mandatory, but at least one filtering rule must be specified (for the default policy). Note that the left part of the figure (grey boxes) represents the AFPL meta-model (without NAT), and the right part (white boxes) the NAT extension.
Table 3. Final AFPL2 model (only NAT part)
Source NAT:
Translated Selector | Obligation | Common Syntax (can be optimized) | Uncommon Syntax (must be emulated) | Comments
Source IP Address | Mandatory | Host IP; Interface name | Identifier; Block | If the interface name is given, the interface IP is used (it could be a dynamic link)
Destination NAT:
Translated Selector | Obligation | Dependencies | Common Syntax (can be optimized) | Uncommon Syntax (must be emulated)
Destination IP Address | Mandatory | (none) | Host IP | Identifier
Destination Port | Optional | Destination port must be specified in the original packet | Number; Range [p1,p2] | Identifier
Each instance of the Rule meta-class represents a condition/action rule of the ACL (this part is related to AFPL [12]). A rule can be applied to a particular interface of the underlying firewall platform (interface attribute), and with a particular direction of the flow of packets (direction attribute). These two attributes of the Rule meta-class are optional, since if no interfaces are defined the rule is applied to all interfaces in all directions (in and out). The comment attribute is also optional and represents the documentation of a rule. In addition, the Rule meta-class has an action attribute representing the action that the firewall should take if a packet matches its condition part. Note that there must be at least one rule in a policy, stating that at least the default policy (allow all or deny all) is present. The information regarding the condition part is represented in the Matches meta-class, which is a child of the Rule meta-class and the last meta-class of the hierarchy. Each Rule can have only one condition part and, for that reason, the cardinality is one. The Matches meta-class has a set of attributes representing the filtering selectors of AFPL [12]. These selectors are the fields which are considered during the filtering process: source and destination IP addresses, source and destination ports, protocol, and ICMP type (only if the protocol is ICMP). All these attributes have their own data types, which represent the syntaxes allowed for a user. These data types are presented in the right part of Fig. 6. Some of them are enumerations and others are regular expressions. At the same level as the Rule meta-class there are the DstNATrule and SrcNATrule meta-classes (this is the part related to AFPL2). These meta-classes represent DNAT and SNAT rules (the extended part of AFPL2). Note that NAT rules are optional. Following the NAT analysis of the previous section, a DNAT rule can be applied to an interface. Note that no information regarding direction can be modelled, since DNAT rules are always applied with incoming direction. In the same way, an SNAT rule can only be applied with outgoing direction. Again, for both kinds of rules, it is
Fig. 6. AFPL2 PIM meta-model (in EMF)
possible to specify the characteristics of the packet being translated using the NatOrigPacket meta-class, which has the same attributes as the Matches meta-class, but with different cardinalities. However, the translation information differs between SNAT and DNAT: their attributes represent the translation selectors explained in the previous subsection (Table 2). Note that there is no way to represent rule priorities in the model. The reason is that rule priority is represented by the rule specification order in the PIM (i.e. rule priority is implicit in the model). As we have noted before, for each analyzed firewall language and platform there is a set of advanced features that has not been included in the PIM meta-model. However, if this part is introduced in the PSM, then administrators can use the advanced features without modifying the ACL and without sacrificing PIM simplicity. Note that AFPL2 can be compiled to any of the analyzed low-level firewall languages without any loss of information. In addition, if all non-common features of each analyzed firewall platform and language are supported in its PSM (as proposed), a lossless transformation from the low-level languages is also possible. Furthermore, there are also differences regarding how each firewall platform executes NAT (mainly before or after executing the filtering rules). However, these differences must be taken into account only in AFPL2→ACL compilers, and are not considered in this paper.
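To give an idea of the shape of an AFPL2 policy instance, the fragment below sketches one filtering rule and one DNAT rule. The element and attribute names mirror the meta-classes described above (Policy, Rule, Matches, DstNATrule, NatOrigPacket) but are hypothetical, since the concrete XML syntax of AFPL2 is not given in this paper.

<policy>
  <rule action="deny" comment="default policy">
    <matches protocol="any" srcIP="any" dstIP="any"/>
  </rule>
  <dstNATrule interface="outside">
    <natOrigPacket protocol="tcp" dstIP="203.0.113.5" dstPort="80"/>
    <translation dstIP="192.168.1.10" dstPort="8080"/>
  </dstNATrule>
</policy>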
5 Conclusions
The contribution of this paper is twofold. First, we have proposed a new abstract language to represent firewall ACLs with NAT, AFPL2. To the best of our knowledge, AFPL2 is the first firewall DSL to support NAT. With AFPL2 an administrator is able to model the vast majority of features present in any of the analyzed firewall
platforms and languages (which are market leaders). However, for an advanced administrator there could be features that cannot be modelled with AFPL2 alone. Second, AFPL2 is used as the PIM of a framework heavily based on the concepts of MDA, extended with model consistency diagnosis stages. The framework can incorporate the advanced features of all the analyzed firewall languages in another (lower) modelling level, allowing administrators to use the features of a particular low-level firewall language not present in AFPL2. This MDA-based framework is extensible by end-users, in the sense that more concepts can be added to the meta-models, as well as modifications to the transformations between them, in order to represent more features and/or low-level firewall platforms and languages. In fact, we have identified the features not modelled in the PIM and propose to model them in the PSM of each firewall platform as a topic for future research. In future work, we intend to develop the full framework with automatic transformations and ACL generation.
Acknowledgements This work has been partially funded by Spanish Ministry of Science and Education project under grant DPI2006-15476-C02-01, and by FEDER (under ERDF Program). Many thanks to T. Reina and J. Peña for their useful comments on early versions of the paper.
References
1. Pozo, S., Ceballos, R., Gasca, R.M.: Model Based Development of Firewall Rule Sets: Diagnosing Model Faults. Information and Software Technology Journal 51(5), 894–915 (2009)
2. Al-Shaer, E., Hamed, H.H.: Modeling and Management of Firewall Policies. IEEE eTransactions on Network and Service Management 1(1) (2004)
3. Pozo, S., Ceballos, R., Gasca, R.M.: A Heuristic Polynomial Algorithm for Local Inconsistency Diagnosis in Firewall Rule Sets. In: International Conference on Security and Cryptography (SECRYPT), Porto, Portugal (2008)
4. Wool, A.: A quantitative study of firewall configuration errors. IEEE Computer 37(6), 62–67 (2004)
5. Taylor, D.E.: Survey and taxonomy of packet classification techniques. ACM Computing Surveys 37(3), 238–275 (2005)
6. Bartal, Y., Mayer, A., Nissim, K., Wool, A.: Firmato: A Novel Firewall Management Toolkit. ACM Transactions on Computer Systems 22(4), 381–420 (2004)
7. Damianou, N., Dulay, N., Lupu, E., Sloman, M.: The Ponder Policy Specification Language. In: Workshop on Policies for Distributed Systems and Networks (POLICY), HP Labs Bristol, UK, pp. 29–31 (2001)
8. OASIS eXtensible Access Control Markup Language (XACML), http://www.oasis-open.org/committees/xacml/
9. Moore, B., Ellesson, E., Strassner, J., Westerinen, A.: Policy Core Information Model (PCIM), IETF RFC 3060 (2001)
10. Rule Markup Language (RuleML), http://www.ruleml.org/
11. Simple Rule Markup Language (SRML): A General XML Rule Representation for Forward-chaining Rules, ILOG S.A. (2001)
12. Pozo, S., Ceballos, R., Gasca, R.M.: AFPL, An Abstract Language Model for Firewall ACLs. In: Gervasi, O., Murgante, B., Laganà, A., Taniar, D., Mun, Y., Gavrilova, M.L. (eds.) ICCSA 2008, Part II. LNCS, vol. 5073, pp. 468–483. Springer, Heidelberg (2008)
13. Desfray, P.: UML Profiles versus Metamodeling Extensions: an Ongoing Debate. In: COM 2000, Proceedings of the First Workshop on UML in the .COM Enterprise: Modeling CORBA, Components, XML/XMI and Metadata (2000)
14. Bartal, Y., Mayer, A., Nissim, K., Wool, A.: Firmato: A Novel Firewall Management Toolkit. ACM Transactions on Computer Systems 22(4), 381–420 (2004)
15. Zhang, B., Al-Shaer, E., Jagadeesan, R., Riely, J., Pitcher, C.: Specifications of a High-level Conflict-free Firewall Policy Language for Multi-domain Networks. In: ACM Symposium on Access Control Models and Technologies (SACMAT), Sophia Antipolis, France, pp. 185–194 (2007)
16. Srisuresh, P., Holdrege, M.: RFC 2663: IP Network Address Translator (NAT) Terminology and Considerations. IETF (August 1999)
17. Basin, D., Doser, J., Lodderstedt, T.: Model Driven Security: from UML Models to Access Control Infrastructures. ACM Transactions on Software Engineering and Methodology 15(1), 39–91 (2006)
18. OMG: MDA guide version 1.0. Technical Report omg/2003-05-01, OMG (May 2003)
19. Hailpern, B., Tarr, P.: Model-driven development: The good, the bad, and the ugly. IBM Systems Journal 45(3), 451–462 (2006)
Appendix I
This analysis is related to the NAT rules of the market-leader firewall languages, and refers to the conditions that may be matched against a packet that arrives at the firewall, and how it must be translated. √ indicates that a selector is supported, and x that it is not supported.
Table 4. Netfilter IPTables analysis
Selector | SNAT/Masquerading | DNAT/Port Forwarding
Src IP Address | Opt | Opt
Translated Src IP Address | √* (IP) | x
Src Port | Opt | Opt
Translated Src Port | Opt** (number, range) | x
Dst IP Address | Opt | Opt
Translated Dst IP Address | x | √* (IP)
Dst Port | Opt | Opt
Translated Dst Port | x | Opt (number, range)
Protocol | Opt | Opt
Interface | Opt (outgoing) | Opt (incoming)
Comments: * it is not possible to do load balancing in kernels >= 2.6.11; ** if a port is given, NAPT is done instead of SNAT.
Table 5. Cisco PIX analysis. NAT modes analyzed: Dynamic NAT and PAT, Static NAT and PAT, and their Policy variants. Selectors analyzed: Src/Dst IP address and port, their translations, interface (inbound/outbound), and connection settings. Recoverable notes: translated source addresses may be drawn from an address pool, a single IP or an interface IP; translated source ports are handled automatically or optionally for PAT only; some modes apply only to connections from a less secure to a more secure interface; static modes can be bidirectional; miscellaneous connection options exist for TCP and UDP. [Per-mode cell values are garbled in the extracted layout.]
Table 6. OpenBSD Packet Filter analysis. NAT modes analyzed: NAT and Port Forwarding. Selectors analyzed: Src/Dst IP address and port, their translations, Src/Dst protocol, interface. Recoverable notes: translated source syntaxes include a single IP, a block, a range, a collection, and a collection using a pool; translated source ports are handled automatically for PAT only. [Per-mode cell values are garbled in the extracted layout.]
Table 7. FreeBSD IPFirewall analysis. NAT modes analyzed: outgoing and incoming connections. Selectors analyzed: as in the previous tables. Recoverable notes: bidirectional connections are possible; some translated selectors are mandatory only if the corresponding destination port or ports are specified; if multiple translated destination IPs are specified, load balancing is accomplished. [Per-mode cell values are garbled in the extracted layout.]
Table 8. FreeBSD IPFilter analysis. NAT modes analyzed: Basic NAT and Port Redirection. Selectors analyzed: Src/Dst IP address and port, their translations, Src/Dst protocol, interface. Recoverable notes: the translated source address may only be an IP or a block; if a port is given, NAPT is done instead of SNAT; in Port Redirection the translated destination may only be an IP, and if multiple destination IPs are specified, round-robin load balancing is accomplished. [Per-selector cell values are garbled in the extracted layout.]
Table 9. Checkpoint FW-1 analysis
Selector | All NAT types
Src IP Address | Opt
Translated Src IP Address | Opt
Src Port | Opt
Translated Src Port | Opt
Dst IP Address | Opt
Translated Dst IP Address | Opt
Dst Port | Opt
Translated Dst Port | Opt
Src/Dst Protocol | Opt
Interface | x
Testing Topologies for the Evaluation of IPSEC Implementations
Fernando Sánchez-Chaparro¹, José M. Sierra¹, Oscar Delgado-Mohatar², and Amparo Fúster-Sabater²
¹ Computer Science Department, Universidad Carlos III de Madrid, Avd. de la Universidad 30, Leganés, Madrid {josemaria.sierra, fchaparr}@uc3m.es
² Instituto de Física Aplicada, CSIC, C/ Serrano 144, Madrid {oscar.delgado, amparo}@iec.csic.es
Abstract. The use of virtual private networks (VPNs) for the protection of communications is becoming increasingly common. Likewise, the IPSEC architecture has been gaining ground and is at present the most widely used solution for this purpose. That is the reason why a large number of IPSEC implementations have been created and put into operation. This work proposes three testing topologies for carrying out the assessment of IPSEC implementations; each of these scenarios supplies an important guide for determining the objectives, the digital evidence to collect and the test batteries to develop in any evaluation of an IPSEC device. Keywords: IPSEC evaluation, Virtual Private Networks, IPSEC testing.
1 Introduction
The establishment of Virtual Private Networks has become a need for companies and organizations which, in their desire to bring their information systems closer to their users, want to enable connections through networks outside the corporation that, therefore, do not enjoy the required level of security. The use of IPSEC VPNs allows these users to create a cryptographic tunnel between the various communications devices and the corporate network of the organization, so as to guarantee the security services required by the corporation; at the same time, it gives total flexibility to users, who may access the system from any compatible network as if they were within the organization's own network. Another of the main applications of virtual private networks is associated with the increasingly frequent need to interconnect several corporate networks through public networks. In this case the problem is similar: the public network must be used in a manner transparent to the systems that are within the corporate networks so that, regardless of the need to cross the public network, the communication between any two machines is carried out with the same level of security as if the machines were in the same corporate network. To that end, the machines that connect the corporate networks to the public network establish communications
based on Virtual Private Networks that protect the information exchanged through the public network. With regard to the protocols used to implement Virtual Private Networks, and although there are many with similar objectives at different layers of the network stack, the de facto standard for the establishment of virtual private networks at the network level of the TCP/IP architecture is the IPSEC protocol. This protocol defines two mechanisms, AH and ESP, which provide integrity, confidentiality and authentication of the information exchanged. Also, and no less important, this protocol provides a management system for the negotiation of the security parameters. In this regard, the IKE protocol (Internet Key Exchange) allows the creation and management of the security associations (SAs) that define the functioning of the security mechanisms included in IPSEC. This paper is organized as follows: section 2 presents the background of the IPSEC architecture, briefly describing the AH, ESP and IKE protocols. The next section presents well-known attacks on IPSEC implementations, and section 4 describes the proposed topologies, their aims and the evidence that must be collected in each scenario. Finally, section 5 presents the conclusions of our work.
2 Background
The foundations of the Internet date from 1969, and its original objectives are now not enough to face society's needs. One of the keys to this situation is the TCP/IP protocol: it establishes the Internet communication framework, and so its qualities and defects are inherited by the information systems that use it. The IPSEC specifications provide authentication, integrity and confidentiality to network-level packets (IP packets). For achieving these features, this architecture includes the description of two new security payloads: the Authentication Header (AH) and the Encapsulated Security Payload (ESP). The first one, AH, supplies mechanisms for authentication and integrity. ESP guarantees, besides authentication and integrity checking, confidentiality. In this way, only legitimate receivers will be able to access the information.
2.1 Authentication Header
As we said above, AH provides authentication and integrity to IP datagrams. The process consists of adding a new header to the datagram that needs authentication. Only the nodes involved in the communication must pay attention to that header (AH), whereas the other ones only have to pass this information on. This implies that IPv6 packets can travel along the nodes even if these are not using this new version. The presence of the AH does not change the behaviour of TCP, nor in fact of any other end-to-end protocol such as UDP or ICMP. The AH is designed to preserve the integrity of the whole datagram, checking that its content has not been modified in transit. For this, the Authentication Data is calculated (Figure 1) using the datagram data that do not change from sending to arrival (fields that change are set to zero). So, before computing the Authentication Data, the sender must prepare a special version of the packet
Authentication Data = hash(k & IP datagram & k), carried in the Authentication Header
Fig. 1. Authentication Data calculation
independent of the transformations in transit. After this, a key is concatenated at the beginning and end of the datagram, and subsequently a hash function is computed (MD5 has been suggested as the default). The result is called the Authentication Data and is placed into the AH. When the destination node receives the AH, it verifies the validity of the datagram, which is accepted only if the authentication data is correct.
2.2 Encapsulating Security Payload
When confidentiality is needed, the Encapsulated Security Payload (ESP) should be used. Its aim is to provide confidentiality, integrity and authentication at the IP layer. In this case the computational cost of ciphering is more significant than for AH, and this performance reduction could bring problems in certain systems. The Encapsulated Security Payload is structured into three parts: one is plain text (header and extra information for the decryption process), the second part is ciphered text (the encryption of the IP datagram), and the last part is authentication data (a keyed hash of the IP datagram). There are two modes in ESP: Tunnel Mode and Transport Mode. These refer to the packet layer that is ciphered into the ESP. In Tunnel Mode, the sender takes the whole IP datagram and, once the key is determined, the cipher process is accomplished. The ciphered data is encapsulated into the payload of a new IP datagram, which is sent to the destination with the appropriate destination fields set.
2.3 The IKE Protocol
In both mechanisms described above it is necessary to agree on a key and some other information between the parties (initialization vector, cipher mode, hash function to use, …). For that reason the negotiation of those security parameters is needed before these security mechanisms can be deployed.
[Figure content: the IP datagram header is copied, the datagram is ciphered with key K, and a keyed hash is computed.]
Fig. 2. ESP calculation
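The keyed-hash computation depicted in Figs. 1 and 2 can be sketched in a few lines of Python. This is a toy version assuming MD5 (the suggested default) and a simple key || datagram || key concatenation, with mutable fields zeroed beforehand; it is not the real AH wire format.

import hashlib

def authentication_data(key, datagram, mutable_offsets=()):
    # Zero the fields that change in transit (e.g. TTL, header checksum)
    buf = bytearray(datagram)
    for off in mutable_offsets:
        buf[off] = 0
    # Keyed hash: key concatenated at the beginning and the end of the datagram
    return hashlib.md5(key + bytes(buf) + key).digest()

def verify(key, datagram, auth_data, mutable_offsets=()):
    # Receiver side: recompute and compare with the value carried in the AH
    return authentication_data(key, datagram, mutable_offsets) == auth_data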
The IPSEC framework also defines how those agreements must be accomplished. The negotiation revolves around the construction of a Security Association (from now on, SA). An SA is created on behalf of a security service and describes how the entities will implement that service. In this way, before using the Authentication Header an SA must exist that determines how the entities will implement AH. There is a standard framework for the negotiation of SAs: the IKE protocol (Internet Key Exchange). According to this protocol, two parties can securely negotiate what security parameters they will use for implementing the security services (AH and ESP). Likewise, it is also possible to establish the SAs manually, simply by configuring the same SA database (SADB) in the entities that are going to deploy the security services. The IKE protocol is based on the ISAKMP framework (Internet Security Association and Key Management Protocol) and, in the same way, it is divided into two phases (see Figure 3).
─ In the first phase the parties establish a session key by a key establishment protocol. Once the key is established, they authenticate each other and negotiate how to protect the second phase.
─ In the second phase the security parameters for a given security service are negotiated. This second phase is protected using the parameters negotiated in the first phase.
3 Well-Known IPSec Attacks
This section reflects some of the better-known attacks against IPSEC. Knowledge of this type of attack is useful for a proper definition of the evaluation topologies. Following the current standard [1], authentication using preshared keys must be implemented by default in any implementation. However, due to the flexibility of IPSEC (it is an open framework), it is possible to use configurations which are vulnerable to brute-force attacks that achieve successful impersonation.
[Figure content: entities A and B. Phase 1 (Key Management Protocol): anti-clogging tokens; second-phase security parameter negotiation; key establishment material; node authentication. Phase 2 (SA Negotiation Protocol): anti-clogging tokens; secure parameter negotiation on behalf of security services.]
Fig. 3. Phases of the SA negotiation by IKE
On the other hand, the “encryption-only” mode of ESP, together with the structure of IP, TCP (or IP, UDP, RTP), is not enough for detecting integrity changes in transmitted communications.
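The following toy Python sketch shows why encryption alone cannot detect tampering: in CBC mode, flipping one bit of ciphertext block C[i-1] flips exactly the same bit of the recovered plaintext block P[i], and the receiver has no way to notice. A trivial XOR "block cipher" stands in for a real one here; the malleability property is identical for any cipher in CBC mode (with a real cipher the block containing the flip would additionally decrypt to garbage).

BLOCK = 8
KEY = bytes(range(BLOCK))  # toy key; E(x) = x XOR KEY stands in for a block cipher

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(iv, blocks):
    out, prev = [], iv
    for p in blocks:
        c = xor(xor(p, prev), KEY)          # C[i] = E(P[i] XOR C[i-1])
        out.append(c)
        prev = c
    return out

def cbc_decrypt(iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(xor(xor(c, KEY), prev))  # P[i] = D(C[i]) XOR C[i-1]
        prev = c
    return out

iv = bytes(BLOCK)
ct = cbc_encrypt(iv, [b"PAY 0100", b"to ALICE"])
ct[0] = xor(ct[0], b"\x00\x00\x00\x00\x00\x00\x00\x01")  # flip one bit of C[0]
print(cbc_decrypt(iv, ct))  # second block becomes b'to ALICD': undetected change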
4 IPSEC Implementation Testing
As we have described in the sections above, IPSEC is an architecture which supplies important security services. However, as with any other software implementation, it needs a rigorous testing phase in order to achieve confidence in its effectiveness and correctness. Nevertheless, due to the high complexity of the IPSEC architecture, any evaluation will involve the hard work needed to test this kind of implementation, because every functionality of every protocol of the IPSEC architecture must be rigorously evaluated. For this reason, this section presents testing topologies which allow a complete evaluation of the different functionalities and modes included in IPSEC. The key point of each of these methodologies is to define clearly its aim and the evidence gathering needed. In this way, according to the objective, that is, according to the type of tests which will be developed, we can divide the tests into:
─ Conformance tests: the objective of these tests is to evaluate whether or not the implementation fulfills the IPSEC standards. This includes the types of messages, the payloads to exchange, the expected behavior, and any other specification included in them [7], [8], [9], [10].
─ Functional tests: these tests aim to evaluate the behavior of the implementation in the establishment of IPSEC tunnels.
150
F. Sánchez-Chaparro et al.
Fig. 4. ISAKMP Test Topology
─ Penetration test: Strength of the implementation is evaluated with these tests. Well-known penetration attacks, like those identified in previous sections, are developed against the implementations and the results obtained and analyzed. It is important to remark that this work does not consider performance test over this type of implementations. 4.1 Topology 1: ISAKMP Testing This first topology is the simplest, because only considers two elements, the implementation under evaluation (usually call Target of Evaluation, TOE) and a trusted implementation of IPSEC (reference implementation). This topology permits to develop conformance test, specifically those which establishes a ISAKMP tunnel (IKE phase I) and then the second IKE phase (tunnel IPSEC) is transmitted through it. These conformance tests must check all aspects related with payload transmitted, but also verify the proper behavior specified in IPSEC standards. In order to verify and document the commented targets it is necessary to collect evidences such as: ─ State of the TOE before each test ─ State of the TOE after each test ─ Complete information transmitted between the two entities. This is possible because reference implementation, Host1, is one end of the communication and hence can obtain transmitted information in clear. This conformance test, which evaluates the ISAKMP tunnel, only involves two entities. Due to IPSEC complexity, tests ought to be done incrementally, firstly protocols which involves less entities and step by step, introducing more entities into the testing topology. 4.2 Topology 2: IPSec and Functional Testing Topology 2, will serve us to continue with test about the verification of the proper working of the implementation. In this scenario must be collected the evidences commented for the previous topology, but in this case for IPSEC tunnels (tunnels negotiated in second phase of ISAKMP).
Testing Topologies for the Evaluation of IPSEC Implementations
151
Fig. 5. IPSec Test Topology
On the other hand, this topology is the most reduced for doing functional testing. For this reason, it will be analyzed the information transmitted from Host1 to Host2 for all possible combinations of the TOE cryptographic suite [11], [12], in both initiator and responder roles, and checking the following aspects: ─ ─ ─ ─ ─ ─
Main/Aggresive mode Encryption Algorithms Integrity Algorithms Diffie-Hellman modules Perfect Forward Secrecy Different authentication methods.
The combination of these aspects must include phase I and phase II, even although some of this test combination had been done in topology 1: ISAKMP testing. The reason of this consideration is that although ISAKMP and IPSEC tunnels are independent, it must be checked that any IPSEC tunnel can be established from any ISAKMP tunnel. On the other hand, it is important to remark that reference implementation must use cryptographic algorithms certified by a valid accreditation laboratory (such is NIST 140-2). 4.3 Topology 3: Penetration Test If the TOE passes the test of first and second topology, it can be assured that fulfils standards specifications and that is functionally operable. With this topology 3, is presented an scenario where could be developed attacks against IPSEC protocol, as described in section 3, in order to check strength of the TOE against direct attacks. It can be developed attacks such as: ─ Man-in-the-middle: Host4 could try to impersonate host1, using attacks such those described in Table 1. PSK IPSEC Attakcs. ─ Integrity mechanism testing: In this case, Host4 can modify transmitted packets heads or its content information as described in Table 2. Forging the Checksum. With this type of attack it could be deflected packets sent from Host1 to Host2. ─ IV Attacks: This scenario is similar to the previous one, its aim is to deflect to Host3, certain packets sent from Host1 to Host2, being Host1 in charge of forging the IV, as we described in Table 2 Initizialization Vector forging. Some attacks proposed to be developed over this topology, such as the Initizialization Vector forging using CBC ciphering mode, cannot been avoided using the mechanism
152
F. Sánchez-Chaparro et al.
Fig. 6. Testing Penetration Topology Table 1. PSK IPSEC Attakcs
Exploit of Default user credentials
PSK Shared with remote user
PSK When Aggressive Mode is used to accomplish a phase II of IKE [2], and obtain the HMAC information used for the authentication. This HMAC is created based on the pre-shared key, and once the code is obtained can be accomplish a dictionary attack in order to obtain the preshared key. Under certain IPSEC configurations: ─ VPN gateways could accept traffic from any address ─ All VPN clients use the same preshared key to establish a secure channel In this case attackers, who know the PSK key, can impersonate to any other user of the VPN gateway.
supplied by IPSEC, but this type of test show us the strength of certain TOE facing these attacks. Digital evidences that must be collected are equal to the previous topologies: ─ Datagrams captured by Host4 of communications established from Host1 to the router, and changes made in forged datagrams. ─ Datagrams received by Host2 and Host3, which shows if Host4 attacks have been successful.
Testing Topologies for the Evaluation of IPSEC Implementations
153
Table 2. ESP encryption-only IPSEC Attacks
ESP It is possible to calculate new “ex profeso” checksums without knowing the packets content [3], [4]. In this way, it is possible to attack the protocol in the following ways: Forging the Checksum
─ Denial of service attacks: Doing IP-spoofing over the source and destination addresses routed by the Gateway. ─ Brute force attack: When an attacker posses an internal address of a trusted network, it can be done a brute force attack over the destination address until a cleartext content is received. It have been detected the possibility of forging the initialization vector (IV) using the CBC ciphering mode [5],[6]. In this way, when sent information is decrypted, first cipher block will reveal a modified information. Hence, It is possible to attack the protocol in the following ways:
Initizialization Vector forging
─ Obtaining Access to non- authorized information: Forging the specific bits of the IV and modifying the checksum, an attacker could get plain information about the other recipients of the communications. ─ Changing transmitted information: On the other hand, it is also possible to modify the content of communications between two parties, because it is possible to change the information of the header or the payload length.
5 Conclusions IPSEC architecture supplies a superior standard of protection when it is used with a proper configuration and over a correct implementation. For these reasons, a rigorous evaluation of its functional operation and conformance with the standard is needed to assure that the security level supplied corresponds with the level desired by the user. This evaluation is not a simple task and it is needed to plan properly the target of evaluation the test battery to develop and the digital evidences that must be collected to achieve reliable and provable results. In this sense, in the course of this article, it has been proposed three topologies to conduct an assessment, from the functional point of view, in accordance with the standards and to develop penetration test in order to fully assess a IPSEC implementation (not taking into account performance considerations). For each of these topologies it has defined the evidence that should be collected as well as which may be the goals of each one of the tests. As in any other software development, these tests should be carried out by impartial third parties, in such way, that the companies implementing this type of security solutions could demonstrate that its devices actually offered the level of security expected.
154
F. Sánchez-Chaparro et al.
References 1. Carrel, D., Harkins, D.: The Internet Key Exchange (IKE). IETF RFC 2409 (November 1998) 2. Thumann, M.: PSK Cracking using IKE Aggressive Mode (April 2004), http://www.ernw.de/download/pskattack.pdf 3. Ventzislav, N.: A DoS Attack Against the Integrity-Less ESP (IPSec). In: Malek, M., Fernández-Medina, E., Hernando, J. (eds.) Proceedings of the International Conference on Security and Cryptography, SECRYPT 2006, pp. 19–199. INSTICC Press (2006) 4. Braden, R., Borman, D., Partridge, C.: Computing the Internet Checksum. IETF RFC 1071 (September 1988) 5. Vaarala, S., Nuopponen, A., Virtanen, T.: Attacking Predictable IPsec ESP Initialization Vectors. In: Deng, R.H., Qing, S., Bao, F., Zhou, J. (eds.) ICICS 2002. LNCS, vol. 2513, pp. 160–172. Springer, Heidelberg (2002) 6. McCubbin, C.B., Selçuk, A.A., Sidhu, D.P.: Initialization Vector Attacks on the IPsec Protocol Suite. In: Proceedings of the 9th IEEE international Workshops on Enabling Technologies: infrastructure For Collaborative Enterprises. WETICE, June 4-16, pp. 171–175. IEEE Computer Society, Washington (2000) 7. Kent, S.: IP Authentication Header. IETF RFC 4302 (December 2005) 8. Kent, S.: IP Encapsulating Security Payload (ESP). IETF RFC 4303 (December 2005) 9. Eastlake 3th, D.: Cryptographic Algorithm Implementation Requirements for Encapsulating Security Payload (ESP) and Authentication Header (AH). IETF RFC 4305 (December 2005) 10. Maughan, D., Schertler, M., Shneider, M., Turner, J.: Internet Security Association and Key Management Protocol (ISAKMP). IETF RFC 2408 (November 1998) 11. Hoffman, P.: Algorithms for Internet Key Exchange version 1 (IKEv1). IETF RFC 4109 (May 2005) 12. Hoffman, P.: Cryptographic Suites for IPsec. IETF RFC 4308 (December 2005)
Evaluation of a Client Centric Payment Protocol Using Digital Signature Scheme with Message Recovery Using Self-Certified Public Key Miguel Viedma Astudillo1, Jesús Téllez Isaac2, Diego Suarez Touceda1, and Héctor Plaza López1 1
Computer Science Department, Universidad Carlos III de Madrid Avda de la Universidad, 30, Leganés, Madrid 2 Computer Science Department (FACYT), Universidad de Carabobo, Avda. Universidad. Sector Bárbula, Valencia, Venezuela
Abstract. In this work, we present a mobile e-payment scenario where the merchant entity has important connectivity restrictions, which is only able of communicating through short distance connections (Wi-Fi or Bluetooth). To protect the exchange of messages, our work describes a payment protocol centered in the client which applies digital signature with recovery using self-certified public keys. Additionally, security aspects have been analyzed and protocol implementation specifications have been presented. Keywords: E-payment, mobile devices, m-commerce security.
1 Introduction Most of the mobile payments systems proposed until now are based in a scenario where all the entities are directly connected to one another (formally called "Full Connectivity scenario") [8] because it allows protocol's designers to simplify the protocols and obtain stronger security guarantees than similar applications in the others models. However, we believe that there are some situations where the merchant is not able of connecting to the Internet and it is necessary to develop mobile payment systems where the seller could sell goods even thought she may not have Internet access (Parking Meter, sell machines …). There are different architectures which allow to develop electronic payment transactions through mobile devices. 3D-Secure is one of the most extended, mainly due to its authentication methods flexibility and the security measures supplied. However, the message flow proposed by 3D-Secure is a little bit constrained, because does not accept variations in the number or type of entities involved into the neither payment process, nor connectivity limitations among them. In order to solve the problem of buy and payment of goods in our operational scenario, in section 3, we describe a protocol that permits to a merchant to send a message to acquirer through the client connection (who will not be able to decrypt this O. Gervasi et al. (Eds.): ICCSA 2009, Part II, LNCS 5593, pp. 155–163, 2009. © Springer-Verlag Berlin Heidelberg 2009
156
M. Viedma Astudillo et al.
message). The proposed protocol employs the authentication encryption scheme proposed by [2] that allows only specified receivers to verify and recover the message, so any other receiver will not be able to access the information. The paper is organized as follows. Section 2 presents the needed background on self certified digital signature with message recovery and section 3 describes the proposed protocol for Client Centric Payment Protocol. Consecutively, the following section gives an analysis and evaluation of the implementation of the protocol, and in final section, conclusions of our work are presented.
2 Related Work In our scenario, where merchant cannot with the sell in a direct way (formally called Client Centric Model), it is necessary to find functions which supply the security services associated with the electronic payment but, at the same time, functions which could provide these protection over the defined restricted connectivity scenario. We propose the usage of a nontraditional digital signature system in order to satisfy requirements commented above. The use of a digital signature with message recovery using self-certified public keys provide us an authenticated encryption scheme that brings together the mechanisms of signature and encryption, which enable only the receiver to verify and recover the original message. The authentication of the public key can implicitly be accomplished with the signature verification. Many signature schemes with message recovery have been proposed in recent years [2], [4], [5], [6]. These schemes allow a signer's public key to be simultaneously authenticated in verifying the signature. As the public keys does not need to be included in a certificate to be authenticated by verifiers (as happens in protocols based on public key infrastructure), communication with a Certificate Authority during a transaction to verify the validity of a certificate is not necessary. Therefore, and as shown in [1], digital signature schemes with message recovery are suitable for mobile payment systems based on a kiosk centric model like the one being suggested in this work.
3 Proposed Client Centric Payment Protocol 3.1 Notation During the description of the signing and encryption mechanism and the protocol notation, the following symbols have been used: • • • • • •
: Entity involved in the communication. : Unique identifier of entity . : Public key of entity . : Private key of entity . : Message signed by entity . : Message signed and encrypted by entity
to entity
.
Evaluation of a Client Centric Payment Protocol
157
• • • • •
: Message sent by entity to entity . : Result of applying the System Authority hash function over the message M. : Identifier of the purchase transaction, including the date and start time. : Description of the requested resource in the transaction. : Price of the resource of the Merchant Access Point requested by the client. • : Total price of the resources requested by the client. • : Status of the transaction. , 3.2 Operational Model The operational model used is composed of five entities: 1. Client (C): User, represented by a mobile device (telephone or Personal Digital Assistant PDA), requesting a product or a service offered by the merchant through the Access Point. 2. Merchant Access Point (AP): Access Point with short-range connectivity (Wi-Fi or Bluetooth) offering Merchant’s products or services. 3. Issuer (I): Client’s Bank. Validates the client’s payment messages and transfers the funds to the Acquirer. 4. Acquirer (A): Merchant’s bank. Checks the validity of the transfers performed by the Issuer and reports the status to the Merchant Access Point. 5. Payment Gateway (PG): Intermediate entity between the banking private network and Internet. Additionally, a sixth entity known as System Authority is considered, trusted by all the involved entities, and responsible for the generation of the system’s parameters in the System Initialization Phase. During this phase, all the involved entities register themselves with the System Authority in order to generate their public key. Figure 1 shows the communications between the five entities (except the interaction between the Merchant Access Point and its bank, the Acquirer). The communication between the Client and the Merchant Access Point, represented by the dotted line, is performed using a short-range wireless link (Bluetooth or IEEE 802.11a/b/g). The study of the security measures used to protect the communications within the banking private network is out of the scope of this paper. 3.3 Digital Signature with Recovery Using Self-Certified Public Keys Message exchange between entities is secured using the algorithm originally described in [2]. 3.3.1 System Initialization Phase During this phase, the trusted entity System Authority generates the required system parameters ( , ’, , ’, , , , . ) following the steps below: 1. Chooses two distinct Sophie-Germain prime numbers (a prime number p is Sophie-Germain prime if 2 ’ 1 is also prime) of similar bit-length ’ y ’.
158
M. Viedma Astudillo et al.
Fig. 1. Operational Model
·
2. Computes the modulus RSA 3. Chooses a generator ·
· 2 · 1 · is a ′ ′ , where · min , collision-resistant hash function that accepts variable-length strings as input and generates a fixed-length string as output (i.e. MD5, SHA-1, SHA-256, SHA-512, etc.). · such that:
4. Chooses a hash function hash
System Authority keeps , , ’, ’ private, and releases , , . as public. Once the initialization parameters have been generated, each involved entity generates its key-pairs. 3.3.2 Private and Public Keys Generation Entity follows the steps below in order to generate its key-pair: as private key. 1. Chooses a number 2. Computes 3. Sends along with its identifier to the System Authority. System Authority receives
,
and computes the public key of the entity . ,
The key-pair used by
,
to secure it communications will be
3.3.3 Signature Mechanism In order to sign a message, an entity 1. Choose a random number .
should follow the steps below: · ·
2. Compute the signed triplet ·
·
·
.
:
Evaluation of a Client Centric Payment Protocol
159
Fig. 2. System Authority Initialization and Generation Public Keys Phases
The receiving entity of the signed triplet can verify the sender’s signature and recover the message as follows: 1. Recovery of the signed message using the public data of the entity issuing the message , : · · 2. Validation of the sender of the message: ·
·
3.3.4 Signing and Encryption Mechanism An entity in order to sign and encrypt a message intended to an entity , of which knows its public data , , should follow the steps described below: 1. Choose a random number 2. Compute a signed and encrypted triplet: · · ·
·
The receiving entity of the signed and encrypted triplet recovers the message and verifies its validity following the equations below: 1. Recovery of the signed message using the public data , of the issuing entity: · · 2. Checking the validity of the recovered message: ·
·
160
M. Viedma Astudillo et al.
3.4 The Proposed Client Centric Protocol The principal functionality of the proposed protocol: 1. 2. 3.
: : :
4.
:
,
,
,
,
,
,
,
,
,
, ,
, ,
, ,
,
, ,
,
,
,
5. :
,
,
,
,
,
,
,
, ,
6.
,
,
,
,
Under banking private and secure network 6.1. 6.2. 6.3.
: : :
6.4.
:
, , ,
,
, ,
,
,
, ,
7.
:
8.
:
,
, , ,
,
The names of the messages exchanged among the entities involved in the described protocol have been assigned using the description published in the General Payment Model [7] y [8].
Fig. 3. Client Centric Payment Protocol Messages Flow
Evaluation of a Client Centric Payment Protocol
161
Step 1 y 2: Client C and Merchant Access Point AP establish a short-range communication in order to identify one to each other and exchange its public keys. Step 3: Client C creates a Payment Request message and sends it to the Merchant Access Point AP. Step 4: Merchant Access Point AP decrypts the Payment Request message received from Client C and generates a Value-Claim Request, message, addressed to the Payment Gateway PG that will be sent through Client C. Step 5: Client C prepares a secured message addressed to the Payment Gateway PG including his request Value-Substraction Request, , along with the message that should be forwarded received from Merchant Access Point AP, Value-Claim . Request, Step 6: The Payment Gateway PG performs and verifies the payment within the banking private network. Step 7: Once received the responses of payment from the two banks, the Payment message, in order to send it Gateway PG generates a Value-Claim Response, to the Merchant Access Point AP through client C. to the Merchant Step 8: Client C forwards the Value-Claim Response, Access Point AP.
4 Proposed Protocol Evaluation Figure 4 presents the technology architecture designed to implement the testing scenario. Our proposed protocol will be evaluated into this testing architecture.
Fig. 4. Client Centric Payment Protocol Evaluation Scenario
162
M. Viedma Astudillo et al.
The protocol had been implemented using Java SE 6 for server entities (Merchant Access Point, Payment Gateway, Issuer and Acquirer), but also the additional entity System Authority. Client implementation over mobile devices had been developed in Java 2 ME platform. All programs use the lightweight cryptography API available inn [10], for the execution of the needed cryptographic functions (hash functions). System Authority inizialization parameters length (in bytes) is shown in table 1. SA parameters
p’ (bytes) 64
q' (bytes) 64
n (bytes) 128
KS (bytes) 128
KP (bytes) 128
5 Conclusions We have presented a new protocol implementation which solves the problem of Client Centric Payment Protocol using Digital Signature scheme with message recovery using Self-certified public key. Our proposal assures the achievement of the security services needed for electronic payment. Security functions implemented are described in section, where it can be found a complete description of the protocol, but also a guide for the implementation of this protocol or any other based on message recovery using self-certified public key. This work also supplies a testing framework for the evaluation of the protocol, which can be used for the performance and security evaluation of the protocol. This architecture is especially interesting for mobile devices because its computational capacity is limited. One future expansion of this work will be the definition of a complete framework for the evaluation of electronic payment mobile protocols. The framework will pay a main attention of the implementation of security mechanism and its performance in mobile devices.
Acknowledgements This work was supported in part by the I-ASPECTS- Project (TIN2007-66107), but all ideas exposed represent the view of the authors.
References 1. Isaac, J.T., Camara, J.S., Manzanares, A.I., Márquez, J.T.: Anonymous payment in a Kiosk centric model using digital signature scheme with message recovery and low computational power devices. J. Theor. Appl. Electron. Commer. Res. 1(2), 1–11 (2006) 2. Zhang, J., Li, H.: On the security of a digital signature with message recovery using selfcertified public key. In: Proceedings: 2005 International Conference on Wireless Communications, Networking and Mobile Computing, vol. 2(23-26), pp. 1171–1174 (2005) 3. Isaac, J.T., Camara, J.S.: An Anonymous Account-Based Mobile Payment Protocol for a Restricted Connectivity Scenario. In: Proceedings of the 18th international Conference on Database and Expert Systems Applications. DEXA, September 3-7, pp. 688–692. IEEE Computer Society, Washington (2007)
Evaluation of a Client Centric Payment Protocol
163
4. Chang, Y., Chang, C., Huang, H.: Digital signature with recovery using slef-certified public keys without trustworthy system authority. Journal of Applied Mathematics and Computation 161(1), 211–227 (2005) 5. Tseng, Y., Jan, J., Chien, H.: Digital signature with message recovery using self-certified public keys and its variants. Journal of Applied Mathematics and Computation 136(2-3) (2003) 6. Zhang, J., Zou, W., Chen, D., Wang, Y.: On the Security of a Digital Signature with Message Recovery using self-certified public key. Soft Computing in Multimedia Processing Special Issue of the Informatica Journal 29(3), 343–346 (2005) 7. Kungpisdan, S., Srinivasan, B., Dung Le, P.: Information Technology: Coding and Computing, 2004. In: International Conference on Proceedings ITCC 2004, vol. 1(5-7), pp. 35– 39 (2004) 8. Abad Peiro, J.L., Asokan, N., Steiner, M., Waidner, M.: Designing a generic payment service. IBM Syst. J. 37(1998), 72–88 (1998) 9. Girault, M.: Self-Certified Public Keys. In: Davies, D.W. (ed.) EUROCRYPT 1991. LNCS, vol. 547, pp. 490–497. Springer, Heidelberg (1991) 10. The Legion of the Bouncy Castle, The Legion of the Bouncy Castle Java cryptography API version 1.42 (2008), http://www.bouncycastle.org/
Security Vulnerabilities of a Remote User Authentication Scheme Using Smart Cards Suited for a Multi-server Environment Youngsook Lee1 and Dongho Won2, 1
2
Department of Cyber Investigation Police, Howon University, Korea
[email protected] Department of Computer Engineering, Sungkyunkwan University, Korea
[email protected]
Abstract. Recently, Liu et al. have proposed an efficient scheme for a remote user authentication using smart cards suited for a multi-server environment. This work reviews Liu et al,’s scheme and provides a security analysis on the scheme. Our analysis shows that Liu et al.’s scheme does not achieve its fundamental goal not only of mutual authentication bur also of password security. We demonstrate these by mounting a user impersonation attack and an off-line password guessing attack, respectively, on Liu et al.’s scheme. Keywords: multi-server environments, remote user authentication scheme, smart card, impersonation attack, password guessing attack, password security.
1
Introduction
The feasibility of password-based user authentication in remotely accessed computer systems was explored as early as the work of Lamport [16]. Due in large part to the practical significance of password-based authentication, this initial work has been followed by a great deal of studies and proposals, including solutions using multi-application smart cards [5,29,11,23,6,12,30]. In a typical password-based authentication scheme using smart cards, remote users are authenticated using their smart card as an identification token; the smart card takes as input a password from a user, recovers a unique identifier from the user-given password, creates a login message using the identifier, and then sends the login message to the server who then checks the validity of the login request before allowing access to any services or resources. This way, the administrative overhead of the server is greatly reduced and the remote user is allowed to remember only his password to log on. Besides just creating and sending login messages, smart cards support mutual authentication where a challenge-response interaction between the card and the server takes place to verify each other’s identity.
This work was supported by Howon University in 2009. Corresponding author.
O. Gervasi et al. (Eds.): ICCSA 2009, Part II, LNCS 5593, pp. 164–172, 2009. c Springer-Verlag Berlin Heidelberg 2009
Security Vulnerabilities of a Remote User Authentication Scheme
165
Mutual authentication is a critical requirement in most real-world applications where one’s private information should not be released to anyone until mutual confidence is established [1]. The experience has shown that the design of secure authentication schemes is not an easy task to do, especially in the presence of an active adversary; there is a long history of schemes for this domain being proposed and subsequently broken by some attacks (e.g., [8,2,3,20,10,30,28,15,7,14]). Therefore, authentication schemes must be subjected to the strictest scrutiny possible before they can be deployed into an untrusted, open network, which might be controlled by an adversary. In 2007, Liu et al. [19] proposed a remote user authentication scheme suited for a multi-server environment [17,18,26,25,24,4,21] in which the user can be authenticated by all servers included in it using a single password shared with the central manager and its smart cards. In addition to reducing the communication cost to a minimum, this scheme exhibits various other merits: (1) it allows the user to register with the central manager only once; (2) it allows the user to gain access to all servers included in a multi-server environment without repeating registration to every single server; (3) it does not require any server to maintain a password table for verifying the legitimacy of login users; (4) it allows users to choose and change their passwords according to their liking and hence gives more user convenience. But unfortunately, despite all the merits above, Liu et al.’s scheme does not achieve its fundamental goal not only of mutual authentication but also of password security. We demonstrate these by mounting a user impersonation attack and an off-line password guessing attack. The remainder of this paper is organized as follows. We begin by reviewing Liu et al.’s remote user authentication scheme in Section 2. Then in Section 3, we present security weaknesses in Liu et al.’s authentication scheme. Finally, we conclude this work in Section 4.
2
Review of Liu et al.’s Authentication Scheme
This section reviews a remote user authentication scheme proposed by Liu et al. [19]. The scheme consists of three phases: initialization phase, registration phase, and authentication phase. The initialization phase is processed when the central manager obtains a RSA certificate from a central authority and generates secret keys for each server included in a multi-server environment. The registration phase is performed only once per user when a new user registers itself with the central manager. The authentication phase is carried out whenever a user wants to gain access to each server included in a multi-server environment. Liu et al.’s scheme participants include a remote user Ui , multiple servers S1 , S2 , ... , Sn , and a central manager CM , where CM is assumed to be trusted. For simplicity, we denote by IDi and IDSj , the identities of Ui and Sj , respectively. The public system parameters used for the scheme are as follows: – A one-way hash function h – A large number N = pq, where p and q are two distinct large primes
166
Y. Lee and D. Won
"
!
#
$
$
!
%
Fig. 1. The initialization phase of Liu et al.’s remote user authentication scheme
– A Euler totient function φ – A generator g of ZN – A pair of symmetric encryption/decryption algorithms (E, D) In describing the scheme, we will omit ‘mod N ’ from expressions for notational simplicity. A high level depiction of the scheme is given in Fig. 1 and in Fig. 2, where dashed lines indicate a secure channel, and a more detailed description follows: Initialization Phase. During this phase, for each server, the central manager CM performs the following running: 1. As in the RSA cryptosystem, the central CM selects a public key e and a secret key d, where ed = 1 mod φ(N ). 2. CM obtains its RSA certificate CertCM (N, e) from a certificate authority. 3. CM computes the server Sj ’s secret key skSj = h(IDSj ||σ) for all j, where σ = gd. 4. Then, CM delivers CertCM (N, e), skSj , h(·) to Sj for j = 1, ..., n, through a secure channel, respectively. Registration Phase. This is the phase where a new registration of a user takes place. The registration proceeds as follows: Step 1. A user Ui , who wants to register with the central manager CM , chooses its password P Wi at will and submits a registration request IDi , P Wi to CM via a secure channel. Step 2. When the central manager CM receives the request, it first chooses a random number r larger than 160 bits by according to the recommendation proposed by DSS [9], computes ski = g rd
and pki = g rP Wi ,
and then issues a smart card containing IDi , h(·), N, g, σ, ski , pki to the user Ui .
Security Vulnerabilities of a Remote User Authentication Scheme
Fig. 2. Liu et al.’s remote user authentication scheme
167
168
Y. Lee and D. Won
Authentication Phase. This phase constitutes the core of the scheme, and is performed whenever some user Ui wants to log on to the server Sj . The user Ui initiates this phase by inserting its smart card into a card reader and then entering its identity IDi and password P Wi and the server Sj ’s identity IDSj . Given the user input, the smart card and the sever Sj perform the following steps: Step 1. Using the user-given P Wi and IDSj , the smart card chooses a random number b larger than 160 bits by according to the recommendation proposed by DSS, generates the current timestamp t1 and a random nonce m, and computes v1 = g be , v2 = skiP Wi g bt1 , skSj = h(IDSj ||σ), Ai = EskSj (m), Bi = h(IDi ||t1 ||pki ||v1 ||v2 ||m). Then the smart card sends the login request message IDi , t1 , pki , v1 , v2 , Ai , Bi to the server Sj . Step 2. When the login request arrives, Sj first acquires the current timestamp t2 and recovers m as m = DskSj (Ai ). Then Sj verifies that: (1) IDi is valid, (2) t2 − t1 ≤ Δt, where Δt is the maximum allowed time interval for transmission delay, and (3) v2e /(v1t1 pki ) equals 1. If one of these conditions is untrue, Sj rejects the login request. Otherwise, Sj generates a secret key K, computes Ci = EskSj (m + 1, K), and sends the response message Ci to Ui . Step 3. Upon receipt of the response Ci , user Ui computes m + 1 as (m + 1, K) = DskSj (Ci ) and checks that m is equal to m . If correct, Ui believes the responding party as the authentic server. Otherwise, Ui aborts its login attempt. Password Change Procedure. One of the general guidelines to get better password security is to ensure that passwords are changed at regular intervals. As already mentioned, this scheme allows users to freely choose and change their passwords. When user Ui wishes to change their password from P Wi to P Wnewi , the following steps are performed: 1. Ui inserts its smart card into a card reader and enters the new password P Wnewi . 2. The smart card computes pki = g rP Wnewi and replaces pki with pki
3
Weaknesses in Liu et al.’s Authentication Scheme
To analyze the security of remote user authentication schemes using smart cards, we need to consider the capabilities of the attacker. First, we assume that the
Security Vulnerabilities of a Remote User Authentication Scheme
169
attacker has complete control of every aspect of all communications between the server and the remote user. That is, he/she may read, modify, insert, delete, replay and delay any messages in the communication channel. Second, he/she may try to steal a user’s smart card and extract the information in the smart card by monitoring the power consumption of the smart card [13,22]. Third, he/she may try to find out a user’s password. Clearly, if both (1) the user’s smart card was stolen and (2) the user’s password was exposed, then there is no way to prevent the attacker from impersonating the user. However, a remote user authentication scheme should be secure if only one of (1) and (2) is the case. So the best we can do is to guarantee the security of the scheme when either the user’s smart card or its password is stolen, but not both. This security property is called two-factor security [27]. In this section we point out that Liu et al.’s scheme not only does not achieve its main security goal of authenticating between a remote individual and the server but also suffers from an off-line password guessing attack. 3.1
Impersonation Attack on Liu et al.’s Scheme
Impersonating Ui to Sj . First, we present a user impersonation attack where an attacker Ua , who is a legitimate user registered with the registration center, can easily impersonate the remote user Ui to any server Sj . Once an attacker obtains the values in the smart card, he/she is able to forge any valid login request without knowing the users password. Before describing the attack, we note that the secret values stored in a smart card could be extracted by monitoring its power consumption [13,22]. We now proceed to describe the server impersonation attack. 1. As a preliminary step, the attacker Ua extracts the server’s secret key σ stored in its smart card. 2. Now when Ui initiates the authentication phase with the login request message IDi , t1 , pki , v1 , v2 , Ai , Bi , the attacker Ua posing as Ui intercepts this login request and sends immediately back to Sj a forged response message as follows: Ua who has obtained the login message, IDi , t1 , pki , v1 , v2 , Ai , Bi , and the secret value σ, first (a) generates the random nonce m , (b) chooses the arbitrary password P Wa , (c) computes pka = pkiP Wa , v1 = v1P Wa , v2 = v2P Wa , skSj = h(IDSj ||σ), Aa = EskSj (m ), Ba = h(IDi ||t1 ||pka ||v1 ||v2 ||m ).
170
Y. Lee and D. Won
(d) and then sends IDi , t1 , pka , v1 , v2 , Aa , Ba in response to Ui ’s login request. 3. After receiving IDi , t1 , pka , v1 , v2 , Aa , Ba , the server Sj proceeds to verify the authenticity of the login request. First, Sj (a) recovers m by decrypting Aa , i.e., m = DskSj (Aa ), (b) checks that IDi is valid?, t2 − t1 ≤ Δt, Ba = h(IDi ||t1 ||pka ||v1 ||v2 ||m ), ?
v2 /(v1 1 pka ) = 1. e
t
?
(c) Since all of the four conditions hold, Sj will welcome Ua ’s visit to the system and sends the response message Ci to Ua . 3.2
Off-Line Password Guessing Attack on Liu et al.’s Scheme
Liu et al. [19] claim that their authentication scheme prevents an attacker from learning some registered user’s password via an off-line password guessing attack. But, unlike the claim, Liu et al.’s protocol is vulnerable to an off-line password guessing attack mounted by extracting the secret information from a smart card [27]. Assume that the attacker, who wants to find out the password of the user Ui , has stolen the Ui ’s smart card or gained access to it and extracted the secret values stored in it by monitoring its power consumption. Now the attacker Ua has obtained the values ski , pki , and e stored in the Ui ’s smart card. Then the following description represents our off-line password guessing attack mounted by Ua against Ui ’s password. 1. As preliminary step, the attacker Ua obtains the values ski , pki , and e stored in Ui ’s smart card and computes the value of μ by recovering μ = skie with the obtained value e. 2. First, Ua makes a guess P Wi for P Wi and computes pki = μP Wi by using μ. 3. Ua then verifies the correctness of P Wi by checking the equality pki = pki . Notice that if P Wi and P Wi are equal, then pki = pki ought to be satisfied. 4. Ua repeats steps (2) and (3) until a correct password is found.
4
Conclusion
We have analyzed the security of the smart card based user authentication scheme proposed by Liu et al. [19]. Our security analysis uncovered that Liu et al.’s scheme does not achieve its main security goal of mutual authentication and the password security. The failure of Liu et al.’s scheme to achieve
Security Vulnerabilities of a Remote User Authentication Scheme
171
authentication has been made clear through a user impersonation attack. The user impersonation attack has been considered to infringe user-to-server authentications of the scheme. Finally, the flaw of password security has been shown via the off-line password guessing attack in which an attacker is easily able to find out some user’s password.
References 1. Anti-Phishing Working Group, http://www.antiphishing.org 2. Bird, R., Gopal, I., Herzberg, A., Janson, P.A., Kutten, S., Molva, R., Yung, M.: Systematic design of a family of attack-resistant authentication protocols. IEEE Journal on Selected Areas in Communications 11(5), 679–693 (1993) 3. Carlsen, U.: Cryptographic protocol flaws: know your enemy. In: Proceedings of the 7th IEEE Computer Security Foundations Workshop, pp. 192–200 (1994) 4. Chang, C., Lee, J.S.: An efficient and secure multi-server password authentication scheme using smart cards. In: IEEE Proceeding of the International Conference on Cyberworlds (2004) 5. Chang, C.-C., Wu, T.-C.: Remote password authentication with smart cards. IEE Proceedings E - Computers and Digital Techniques 138(3), 165–168 (1991) 6. Chien, H.-Y., Jan, J.-K., Tseng, Y.-M.: An efficient and practical solution to remote authentication: smart card. Computers & Security 21(4), 372–375 (2002) 7. Chang, C., Kuo, J.Y.: An efficient multi-server password authenticated keys agreement scheme using smart cards with access control. In: IEEE Proceeding of the 19th International Conference on Advanced Information Networking and Applications, vol. 2, pp. 257–260 (2005) 8. Diffie, W., van Oorschot, P.C., Wiener, M.J.: Authentication and authenticated key exchange. Designs, Codes and Cryptography 2(2), 107–125 (1992) 9. National Institute of Standards and Technology (NIST), Digital signatur standard, FIPS PUB 186, p. 20 (1994) 10. Hsu, C.-L.: Security of Chien et al.’s remote user authentication scheme using smart cards. Computer Standards & Interfaces 26(3), 167–169 (2004) 11. Hwang, M.-S., Li, L.-H.: A new remote user authentication scheme using smart cards. IEEE Transaction on Consumer Electronics 46(1), 28–30 (2000) 12. Hwang, M.-S., Li, L.-H., Tang, Y.-L.: A simple remote user authentication. Mathematical and Computer Modelling 36, 103–107 (2002) 13. Kocher, P.C., Jaffe, J., Jun, B.: Differential power analysis. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999) 14. Juang, W.S.: Efficient multi-server password authenticated key agreement using smart cards. IEEE Transaction on Consumer Electronics 50(1), 251–255 (2004) 15. Ku, W.-C., Chang, S.-T., Chiang, M.-H.: Weaknesses of a remote user authentication scheme using smart cards for multi-server architecture. IEICE Transaction on Commmunications E88-B(8), 3451–3454 (2005) 16. Lamport, L.: Password authentication with insecure communication. Communications of the ACM 24(11), 770–772 (1981) 17. Li, L.-H., Lin, I.-C., Hwang, M.-S.: A remote password authentication scheme for multi-server architecture using neural networks. IEEE Transaction on Neural Networks 12(6) (2001)
172
Y. Lee and D. Won
18. Lin, I.-C., Hwang, M.-S., Li, L.-H.: A new remote user authentication scheme for multi-server internet environments. Future Generation Computer System 19, 13–22 (2003) 19. Liu, J., Liao, J., Zhu, X.: A password-based authentication and key establishment scheme for mobile environment. In: 21st International Conference on Advanced Information Networking and Applications Workshops (AINAW 2007), vol. 2, pp. 99–104 (2007) 20. Lowe, G.: An attack on the Needham-Schroeder public-key authentication protocol. Information Processing Letters 56(3), 131–133 (1995) 21. Lee, Y., Won, D.: Security Weaknesses in Chang and Wu’s Key Agreement Protocol for a Multi-Server Environment. In: IEEE Proceeding of 2008 International Conference on e-Business Engineering, pp. 308–314. IEEE Computer Society, Los Alamitos (2008) 22. Messerges, T.-S., Dabbish, E.-A., Sloan, R.-H.: Examining smart card security under the threat of power analysis attacks. IEEE Transactions on Computers 51(5), 541–552 (2002) 23. Sun, H.-M.: An efficient remote user authentication scheme using smart cards. IEEE Transaction on Consumer Electronics 46(4), 958–961 (2000) 24. Tsuar, W.-J.: An enhanced user authentication scheme for multi-server internet services. Applied Mathematics and Computation 170, 258–266 (2005) 25. Tsuar, W.-J., Wu, C.-C., Lee, W.-B.: A flexible user authentication for multiserver internet services. In: Lorenz, P. (ed.) ICN 2001. LNCS, vol. 2093, pp. 174–183. Springer, Heidelberg (2001) 26. Tsaur, W.-J., Wu, C.-C., Lee, W.-B.: A smart card-based remote scheme for password authentication in multi-server Internet services. Computer Standards & Interfaces 27, 39–51 (2004) 27. Tian, X., Zhu, R.W., Wong, D.S.: Improved efficient remote user authentication schemes. International Jounal of Network Security 4(2), 149–154 (2007) 28. Yoon, E.-J., Kim, W.-H., Yoo, K.-Y.: Security enhancement for password authentication schemes with smart cards. In: Katsikas, S.K., L´ opez, J., Pernul, G. (eds.) TrustBus 2005. LNCS, vol. 3592, pp. 311–320. Springer, Heidelberg (2005) 29. Yang, W.-H., Shieh, S.-P.: Password authentication schemes with smart card. Computers & Security 18(8), 727–733 (1999) 30. Yoon, E.-J., Ryu, E.-K., Yoo, K.-Y.: An improvement of Hwang-Lee-Tang’s simple remote user authentication scheme. Computers & Security 24(1), 50–56 (2005)
Enhancing Security of a Group Key Exchange Protocol for Users with Individual Passwords Junghyun Nam1 , Sangchul Han1, , Minkyu Park1, Juryon Paik2 , and Ung Mo Kim2 1
Department of Computer Science, Konkuk University, 322 Danwol-dong, Chungju-si, Chungcheongbuk-do 380-701, Korea {jhnam,schan,minkyup}@kku.ac.kr 2 Department of Computer Engineering, Sungkyunkwan University, 300 Cheoncheon-dong, Jangan-gu, Suwon-si, Gyeonggi-do 440-746, Korea {wise96,umkim}@ece.skku.ac.kr
Abstract. Group key exchange protocols allow a group of parties communicating over a public network to come up with a common secret key called a session key. Due to their critical role in building secure multicast channels, a number of group key exchange protocols have been suggested over the years for a variety of settings. Among these is the socalled EKE-M protocol proposed by Byun and Lee for password-based group key exchange in the different password authentication model, where group members are assumed to hold an individual password rather than a common password. While the announcement of the EKE-M protocol was essential in the light of the practical significance of the different password authentication model, Tang and Chen showed that the EKE-M protocol itself suffers from an undetectable on-line dictionary attack. Given Tang and Chen’s attack, Byun et al. have recently suggested a modification to the EKE-M protocol and claimed that their modification makes EKE-M resistant to the attack. However, the claim turned out to be untrue. In the current paper, we demonstrate this by showing that Byun et al.’s modified EKE-M is still vulnerable to an undetectable on-line dictionary attack. Besides reporting our attack, we also figure out what has gone wrong with Byun et al.’s modification and how to fix it. Keywords: Group key exchange, password-based authentication, undetectable on-line dictionary attack.
1
Introduction
The highest priority in designing a key exchange protocol is placed on ensuring the security of session keys to be established by the protocol. Roughly speaking, establishing a session key securely means that the key is being known only to the intended parties at the end of the protocol run. Even if it is computationally infeasible to break the cryptographic algorithms used, the whole system becomes
The corresponding author.
O. Gervasi et al. (Eds.): ICCSA 2009, Part II, LNCS 5593, pp. 173–181, 2009. c Springer-Verlag Berlin Heidelberg 2009
174
J. Nam et al.
vulnerable to all manner of attacks if the keys are not securely established. But unfortunately, the experience has shown that the design of secure key exchange protocols is notoriously difficult. In particular, the difficulty is greatly increased in the group setting where a session key is to be established among an arbitrary number of parties. Indeed, there is a long history of protocols for this domain being proposed and years later found to be flawed (a very partial list of examples includes [14,18,7,21,17,12,1,16]). Thus, group key exchange protocols must be subjected to a thorough and systematic scrutiny before they are deployed into a public network, which might be controlled by an adversary. Secure session-key generation requires an authentication mechanism to be integrated into key exchange protocols. In turn, achieving any form of authentication inevitably requires some secret information to be established between users in advance of the authentication stage. Cryptographic keys, either secret keys for symmetric cryptography or private/public keys for asymmetric cryptography, may be one form of the underlying secret information pre-established between users. However, these high-entropy cryptographic keys are random in appearance and thus are difficult for humans to remember, entailing a significant amount of administrative work and costs. Eventually, it is this drawback that password-based authentication has come to be widely used in reality. Passwords are drawn from a relatively small space like a dictionary, and are easier for humans to remember than cryptographic keys with high entropy. In the past few years there have been several protocols proposed for passwordauthenticated group key exchange. However, most of the protocols have been built in the so-called same password authentication model which assumes a common password pre-established among all users participating in the protocol (e.g., [8,15,1,20,13,6,2]). Hence, these protocols may be inadequate for many clientserver applications in which each user (called client) shares its password only with the server, but not with other users. Given the situation above, Byun and Lee [10] have recently proposed two new protocols, called EKE-U and EKE-M, for password-authenticated group key exchange in the different password authentication model where each user is assumed to hold an individual password rather than a common password. But later, Tang and Chen [19] showed that both EKE-U and EKE-M are not as much secure as originally claimed, suffering from an off-line dictionary attack and an undetectable on-line dictionary attack, respectively. Generating a sequence of attack and defence moves, Byun et al. [11] suggested countermeasures against Tang and Chen’s attacks on EKE-U and EKE-M. However, we found that Byun et al.’s countermeasure for protecting EKE-M against the undetectable on-line dictionary attack is not satisfactory enough. Extending the attack-defence sequence, this paper reports a security defect of the countermeasure and presents how to remedy the security defect.
2
The EKE-M Protocol and Its Weakness
This section reviews the EKE-M protocol presented by Byun and Lee [10] and its weakness pointed out by Tang and Chen [19].
Enhancing Security of a Group Key Exchange Protocol
2.1
175
Description of EKE-M
The EKE-M protocol is designed for use in a multicast network. The protocol participants consist of a single server S and multiple clients C1 , . . . , Cn . The protocol assumes that each client Ci has shared a password pwi with the server S via a secure channel. The followings are the public system parameters used in the protocol. 1. A cyclic group G of prime order q and a generator g of G. 2. A pair of symmetric encryption/decryption algorithms (E, D) modeled as an ideal cipher [3]. 3. Two one-way hash functions H1 and H2 modeled as random oracles [5]. With the passwords and the parameters established, the EKE-M protocol runs in two communication rounds as follows: Round 1: The server S chooses n random numbers s1 , . . . , sn ∈ Z∗q , computes t1 = Epw1 (g s1 ), . . . , tn = Epwn (g sn ), and sends ti to the client Ci for i = 1, . . . , n. Concurrently, each Ci selects a random xi ∈ Z∗q , computes yi = Epwi (g xi ), and broadcasts yi to the rest of the group. S and Ci decrypt respectively yi and ti using pwi , and share the pairwise key ski = H1 (sidg si xi ) where sid = y1 y2 · · · yn . Round 2: S selects a random group secret K ∈ Z∗q and computes k1 = K ⊕ sk1 , . . . , kn = K ⊕ skn . Then S broadcasts all the ki ’s to the clients. After receiving the broadcast message, each Ci computes the group secret K = ki ⊕ ski and the session key sk = H2 (SIDK) where SID = sidk1 k2 · · · kn . If key confirmation is required, the EKE-M protocol can be extended to incorporate the well-known technique [9] which in turn is based on earlier work of [4,3]. 2.2
Attack on EKE-M
As mentioned in the Introduction, Tang and Chen [19] presented an undetectable on-line dictionary attack on the EKE-M protocol. The attack can be mounted by any registered client Cj against any other registered client Ci (1 ≤ i ≤ n, i = j). Through the attack, the adversary Cj can undetectably verify the correctness of its guess for the password of each victim Ci . Seriously, the repeated mounting of the attack could lead to the exposure of the real passwords of the victims. Seen from a high level, the attack scenario is quite clear: the adversary Cj simply runs the protocol with the server S while playing multiple roles of the clients C1 , . . . , Cn . A detailed description of the attack follows.
176
J. Nam et al.
!
'
$
%
)
%
#
"
%
"
"
)
*
(
#
#
&
"
&
+
,
,
0
0
0
,
'
-
.
/
1
2
"
3
3
5
4
,
-
6
7
7 0
0
0
'
8
9
8
-
9
8
9
:
:
1
;
>
<
?
=
0
0
0
'
@
A
-
.
B
C
A
B
.
C
1
D
E
E
F
-
D
J
M
' N
O
L
G
M
P
G
H
I
J
K
H
I
O
K
Q
Q
Q
N
R
M
P
G
1
H
I
R
K
F
-
0
0
0
' N
N
1
S
' T
U
(
F
U
V
V
V
U S
F
' T
F
U
(
F
G
U
V
V
V
U
F
F
G
1
1
E
E
&
' S
U T
J
-
J
P
' M
&
'
T
O
G
O
P
M
L
G
K
E
U
S
L
K
W
E
E
W
E
Q
Q
Q
&
' S
U T
R
R
P
M
L
G
1
E
;
>
<
K
W
E
X
=
Y
A .
B C
D
'
Y
Z
'
G
W
E
W
Q
Q
Q
Y
Z
G
1
W
0
0
0
'
1
W
Y
W
Z
G
-
-
W
E
S
6
[
\
W
U T
'
U
V
V
V
U
G
1
E
&
W
(
W
G
U
6
[
Y M
\
K
E
W
Fig. 1. The description of the EKE-M protocol
1
E
W
Enhancing Security of a Group Key Exchange Protocol
177
1. For client Ci (1 ≤ i ≤ n, i = j), the adversary Cj makes a guess pwi for the password pwi , selects a random xi ∈ Z∗q , and computes yi = Epwi (g xi ). In addition, Cj also computes yj as specified in the protocol. 2. When the server S sends t1 , . . . , tn to the clients in the first round, Cj intercepts these ti ’s and sends yj (with its true identity) and each yi (posing as Ci ) to S. Then, Cj computes the pairwise key skj = H1 (sidg sj xj ). 3. Now, when S broadcasts k1 , . . . , kn in the second round, Cj intercepts these ki ’s and recovers K by computing K = kj ⊕ skj . Finally, Cj verifies the correctness of each guess pwi by computing ski = H1 (sid(Dpwi (ti ))xi ) and by checking the equality K = ki ⊕ ski . ?
It is clear that the vulnerability of the EKE-M protocol to the attack above is mainly because the server does not require the clients to authenticate themselves in the protocol execution.
3
Byun et al.’s Modification to EKE-M
We here describe the modified EKE-M due to Byun et al. [11], which we call EKE-M+ . The EKE-M+ protocol aims to be secure against undetectable on-line password guessing attacks. The first round of EKE-M+ proceeds exactly like that of EKE-M while the second round is extended to let S and Ci exchange the authenticators H2 (ski S) and H2 (ski Ci ). In more detail the EKE-M+ protocol works as follows: Round 1: S selects random numbers s1 , . . . , sn ∈ Z∗q , computes t1 = Epw1 (g s1 ), . . ., tn = Epwn (g sn ), and sends ti to Ci for i = 1, . . . , n. Concurrently, each Ci selects a random xi ∈ Z∗q , computes yi = Epwi (g xi ), and broadcasts yi to the rest of the group. S and Ci decrypt respectively yi and ti using pwi , and share the pairwise key ski = H1 (sidg si xi ) where sid = y1 y2 · · · yn . Round 2: S selects a random group secret K ∈ Z∗q and computes ki = K ⊕ ski and αi = H2 (ski S) for i = 1, . . . , n. Then S broadcasts all the ki ’s and αi ’s to the clients. Concurrently, each Ci computes and broadcasts βi = H2 (ski Ci ). S verifies the correctness of each βi to detect any potential online dictionary attack. Meanwhile, Ci verifies the correctness of αi and only if the verification succeeds, proceeds to compute the group secret K = ki ⊕ ski and the session key sk = H3 (SIDK) where SID = sidk1 k2 · · · kn and H3 is a hash function (other than H1 and H2 ).
178
J. Nam et al.
"
Fig. 2. The description of the EKE-M+ protocol
Like EKE-M, the EKE-M+ protocol can be extended to include the well-known technique for key confirmation [9].
4 Attack on EKE-M+
Byun et al. [11] claim that their EKE-M+ protocol is secure against undetectable on-line dictionary attacks. In support of this claim, they argue that the malicious client Cj cannot generate the authenticator βi = H2(ski ‖ Ci) because it does not know the pairwise key ski. However, this claim is flawed: Cj can easily compute ski and is therefore able to generate βi. A direct consequence is that, contrary to the claim, the EKE-M+ protocol is still vulnerable to an undetectable on-line dictionary attack, as shown below. The observation leading to the attack is that computing ski in EKE-M+ does not require knowledge of the correct password pwi. The attack proceeds as follows:

1. The adversary Cj begins by preparing the messages to be sent to S in the first round. For each y′i (1 ≤ i ≤ n, i ≠ j), Cj computes y′i = Epw′i(g^x′i), where pw′i is a guess for the password pwi and x′i is a random number from Z∗q. For its own yj, Cj computes it exactly as specified in the protocol.
2. When the server S sends t1, . . . , tn to the clients in the first round, Cj intercepts these ti's and sends yj (with its true identity) and each y′i (posing as Ci) to S. Then, Cj computes the pairwise key skj and the authenticator βj as per the protocol specification.
3. Now, when S broadcasts the ki's and αi's in the second round, Cj intercepts the broadcast message and recovers K by computing K = kj ⊕ skj. Then Cj computes ski = ki ⊕ K and βi = H2(ski ‖ Ci) for i ∈ {1, . . . , n} \ {j}, and immediately sends βj and each βi to S. Finally, Cj verifies the correctness of each guess pw′i by computing sk′i = H1(sid ‖ (Dpw′i(ti))^x′i) and by checking whether the equality K = ki ⊕ sk′i holds.

A sketch of the key derivation in step 3 is given below.
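The pivotal point is the round-2 masking: since ki = K ⊕ ski, anyone who learns K recovers every ski and can forge every βi. A minimal sketch of this derivation, assuming 32-byte keys and SHA-256 in place of H2 (toy choices, not taken from the paper):

```python
import hashlib
import secrets

def H2(key: bytes, ident: bytes) -> bytes:
    # SHA-256 stand-in for H2(sk_i || C_i)
    return hashlib.sha256(b"H2|" + key + ident).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Server-side values for some honest client C_i.
sk_i = secrets.token_bytes(32)            # pairwise key of C_i
K = secrets.token_bytes(32)               # group secret
k_i = xor(K, sk_i)                        # broadcast in round 2

# Adversary C_j already knows K (via K = k_j xor sk_j from its own
# legitimate pairwise key) and sees k_i on the broadcast channel.
sk_i_derived = xor(k_i, K)
beta_i_forged = H2(sk_i_derived, b"C_i")

assert beta_i_forged == H2(sk_i, b"C_i")  # passes S's verification
```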
5 Enhancing Security of EKE-M+
One intuitive way of preventing the attack above is to modify the EKE-M+ protocol so that the server S broadcasts the ki's and αi's only after it receives and verifies the authenticators βi's. With this modification, the attack would no longer be possible because the adversary could not compute βi without having
received ki from S. However, this solution might not be so elegant in that it comes at the cost of an additional communication round. A better way to fix the EKE-M+ protocol can be found by identifying the fundamental cause of the security failure. A little thought makes it clear that the main design flaw in EKE-M+ is the use of the same ski in computing both ki = K ⊕ ski and βi = H2(ski ‖ Ci). This oversight allows the adversary to derive βi easily from K and ki, and thus creates the vulnerability to the undetectable on-line dictionary attack. Having identified the source of the problem, it is now apparent how to repair the protocol: the computations of ki and βi should be modified so that neither can be derived from the other. To this end, it suffices to change the computation of βi to

βi = H2(sk′i ‖ Ci), where sk′i := H1(g^(si·xi) ‖ C1 ‖ · · · ‖ Cn).

The verification of βi changes correspondingly, but all other computations remain unchanged. This modification effectively prevents the undetectable on-line dictionary attack because the adversary Cj can no longer generate βi even with ski at hand. As for efficiency, the modification does not increase the number of communication rounds and requires only one additional evaluation of a hash function.
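A small sketch of the repaired authenticator under the same toy hash conventions as the earlier sketches (our assumptions, not the paper's concrete instantiation): sk′i binds βi to the Diffie-Hellman value g^(si·xi) and the client identities, neither of which can be recovered from K and ki alone.

```python
import hashlib

def H1(*parts: bytes) -> bytes:
    return hashlib.sha256(b"H1|" + b"|".join(parts)).digest()

def H2(*parts: bytes) -> bytes:
    return hashlib.sha256(b"H2|" + b"|".join(parts)).digest()

def beta_repaired(dh_value: bytes, client_ids: list[bytes], i: int) -> bytes:
    """beta_i = H2(sk'_i || C_i), sk'_i = H1(g^{s_i x_i} || C_1 || ... || C_n).

    dh_value encodes g^{s_i x_i}, known only to S and C_i; it cannot be
    derived from the round-2 broadcast (K and k_i), so the adversary can
    no longer forge beta_i.
    """
    sk_prime = H1(dh_value, *client_ids)
    return H2(sk_prime, client_ids[i])
```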
6 Conclusion
This work has considered the security of Byun et al.'s password-authenticated group key exchange protocol [11], in which group members are assumed to hold individual passwords rather than a common password. Byun et al.'s protocol, which we named EKE-M+, specifically aims to be secure against undetectable on-line password guessing attacks. We, however, presented an undetectable on-line dictionary attack on the EKE-M+ protocol, which means that the protocol fails to achieve one of its main security goals. Besides reporting our attack, we also identified the design flaw responsible for it and showed how to repair the protocol.
References

1. Abdalla, M., Bresson, E., Chevassut, O., Pointcheval, D.: Password-based group key exchange in a constant number of rounds. In: Yung, M., Dodis, Y., Kiayias, A., Malkin, T.G. (eds.) PKC 2006. LNCS, vol. 3958, pp. 427–442. Springer, Heidelberg (2006)
2. Abdalla, M., Pointcheval, D.: A scalable password-based group key exchange protocol in the standard model. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS, vol. 4284, pp. 332–347. Springer, Heidelberg (2006)
3. Bellare, M., Pointcheval, D., Rogaway, P.: Authenticated key exchange secure against dictionary attacks. In: Preneel, B. (ed.) EUROCRYPT 2000. LNCS, vol. 1807, pp. 139–155. Springer, Heidelberg (2000)
4. Bellare, M., Rogaway, P.: Entity authentication and key distribution. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 232–249. Springer, Heidelberg (1994)
5. Bellare, M., Rogaway, P.: Random oracles are practical: A paradigm for designing efficient protocols. In: Proc. 1st ACM Conference on Computer and Communications Security, pp. 62–73 (1993)
6. Bohli, J.-M., Vasco, M.I.G., Steinwandt, R.: Password-authenticated constant-round group key establishment with a common reference string. Cryptology ePrint Archive, Report 2006/214 (2006)
7. Boyd, C., Nieto, J.M.G.: Round-optimal contributory conference key agreement. In: Desmedt, Y.G. (ed.) PKC 2003. LNCS, vol. 2567, pp. 161–174. Springer, Heidelberg (2002)
8. Bresson, E., Chevassut, O., Pointcheval, D.: Group Diffie-Hellman key exchange secure against dictionary attacks. In: Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501, pp. 497–514. Springer, Heidelberg (2002)
9. Bresson, E., Chevassut, O., Pointcheval, D., Quisquater, J.-J.: Provably authenticated group Diffie-Hellman key exchange. In: Proc. 8th ACM Conference on Computer and Communications Security, pp. 255–264 (2001)
10. Byun, J., Lee, D.: N-Party encrypted Diffie-Hellman key exchange using different passwords. In: Ioannidis, J., Keromytis, A.D., Yung, M. (eds.) ACNS 2005. LNCS, vol. 3531, pp. 75–90. Springer, Heidelberg (2005)
11. Byun, J., Lee, D., Lim, J.: Password-based group key exchange secure against insider guessing attacks. In: Hao, Y., Liu, J., Wang, Y.-P., Cheung, Y.-m., Yin, H., Jiao, L., Ma, J., Jiao, Y.-C. (eds.) CIS 2005. LNCS (LNAI), vol. 3802, pp. 143–148. Springer, Heidelberg (2005)
12. Choo, K.-K., Boyd, C., Hitchcock, Y.: Errors in computational complexity proofs for protocols. In: Roy, B. (ed.) ASIACRYPT 2005. LNCS, vol. 3788, pp. 624–643. Springer, Heidelberg (2005)
13. Dutta, R., Barua, R.: Password-based encrypted group key agreement. International Journal of Network Security 3(1), 23–34 (2006)
14. Just, M., Vaudenay, S.: Authenticated multi-party key agreement. In: Kim, K.-c., Matsumoto, T. (eds.) ASIACRYPT 1996. LNCS, vol. 1163, pp. 36–49. Springer, Heidelberg (1996)
15. Lee, S.-M., Hwang, J.Y., Lee, D.H.: Efficient password-based group key exchange. In: Katsikas, S.K., López, J., Pernul, G. (eds.) TrustBus 2004. LNCS, vol. 3184, pp. 191–199. Springer, Heidelberg (2004)
16. Nam, J., Paik, J., Kim, U., Won, D.: Security enhancement to a password-authenticated group key exchange protocol for mobile ad-hoc networks. IEEE Communications Letters 12(2), 127–129 (2008)
17. Nam, J., Kim, S., Won, D.: A weakness in the Bresson-Chevassut-Essiari-Pointcheval's group key agreement scheme for low-power mobile devices. IEEE Communications Letters 9(5), 429–431 (2005)
18. Pereira, O., Quisquater, J.-J.: A security analysis of the Cliques protocols suites. In: Proc. 14th IEEE Computer Security Foundations Workshop, pp. 73–81 (2001)
19. Tang, Q., Chen, L.: Weaknesses in two group Diffie-Hellman key exchange protocols. Cryptology ePrint Archive, Report 2005/197 (2005), http://eprint.iacr.org/
20. Tang, Q., Choo, K.-K.: Secure password-based authenticated group key agreement for data-sharing peer-to-peer networks. In: Zhou, J., Yung, M., Bao, F. (eds.) ACNS 2006. LNCS, vol. 3989, pp. 162–177. Springer, Heidelberg (2006)
21. Zhang, F., Chen, X.: Attack on an ID-based authenticated group key agreement scheme from PKC 2004. Information Processing Letters 91(4), 191–193 (2004)
Smart Card Based AKE Protocol Using Biometric Information in Pervasive Computing Environments Wansuck Yi, Seungjoo Kim, and Dongho Won Information Security Group, School of Information and Communication Engineering, Sungkyunkwan University, Suwon-si, Gyeonggi-do, 440-746, Korea {wsyi,skim,dhwon}@security.re.kr
Abstract. Smart card based authenticated key exchange allows a user holding a smart card and a server to authenticate each other and to agree on a session key which can be used for confidentiality or data integrity. In this paper, we propose a two-round smart card based AKE (Authenticated Key Exchange) protocol using biometric information which provides mutual authentication but requires only symmetric cryptographic operations. The proposed protocol is well suited to pervasive computing environments, providing efficiency in the number of rounds, forward secrecy, and security against known-key attacks. Keywords: smart card, authentication, key exchange, biometrics, forward secrecy, known-key secrecy, pervasive computing.
1 Introduction
Due to recent developments in pervasive computing, many different forms of authentication methodologies have been proposed. However, the proposed methodologies suffer from problems in security and efficiency, both of which have become very important factors in pervasive computing environments. To address these problems, various forms of lightweight authentication methodologies using biometric information have been proposed. Biometric information is usually stored in a smart card to be used for remote user authentication. Remote user authentication is used in an insecure environment, such as the Internet, by a server to decide whether a remote user is authorized. The first method proposed for authenticating a remote user was password based identification and authentication. A password based remote user authentication protocol was first proposed in 1981 by Lamport [13]. Lamport's protocol required the server to maintain a password table for user authentication.
The security problem in Lamport's protocol using a password table is that if the table is exposed or manipulated by an attacker, the whole system becomes vulnerable. To overcome the security problems of Lamport's authentication protocol, smartcard based authentication protocols were proposed. One advantage of smartcard based authentication is that a server does not have to maintain a password table. Moreover, because of the smartcard's tamper-resistant physical characteristics, smartcard based authentication methods are actively researched. Hwang [8] was the first to propose a smartcard based authentication protocol, in the year 2000. Hwang [8] demonstrated the vulnerability of Lamport's password-table-based protocol by showing password table exposure and modification attacks, and proposed a smartcard based authentication protocol that does not use a password table but instead uses ElGamal public key encryption. Later, Sun [18] proposed an improved scheme reducing the amount of transmitted data and the computational complexity. Since then, many protocols have been proposed and studied to improve the previously proposed protocols [1-5, 7, 9-12, 14-17, 19, 20-25].

In a smartcard based user authentication scheme, a user must prove that he is the owner of the smartcard he is holding before initiating the authentication protocol. Password authentication is most frequently used for smartcard access control; in other words, it must be confirmed that the user is the one who knows the smartcard password. Using biometric information for access control, in addition to password authentication, has become a new subject of study in recent years. Biometric information is unique to each individual, and using it enhances the correctness of smartcard owner authentication beyond what a password alone provides. Many smartcard based authentication protocols using biometric information have been proposed [11, 12, 16, 17]. Among them, Lee et al. [17] proposed an ElGamal public key based user authentication protocol using fingerprint information. After that, Lin et al. [16] proposed a new scheme, pointing out that the protocol of Lee et al. [17] is vulnerable to an impersonation attack and is unable to use passwords that are easily memorized by users. Ku et al. [11] proved that Lee et al.'s scheme is vulnerable to a forgery attack in which an attacker can impersonate a user by forging the user's messages. Recently, Khan et al. [12] showed that Lin et al.'s [16] user authentication scheme is vulnerable to a server spoofing attack, and they also proposed a mutual authentication protocol that authenticates not only the user but also the server.

This paper proposes a smartcard based user authentication and key exchange protocol using biometric information. It differs from the other user authentication protocols in that, after it authenticates the user and the server, it establishes a session key to be used for the encryption and integrity of transmitted data. The proposed smartcard based authentication and key exchange protocol is a secure and efficient key exchange methodology for pervasive environments.
1.1 Securities in Key Exchange Protocols
The most basic security requirement in a key exchange protocol is key privacy. An attacker can eavesdrop on the communication line between two entities, forge messages, or modify messages to gain information about the key. Key privacy guarantees that an attacker is unable to gain any information on the key from passive or active attacks. Other common security requirements are forward secrecy and known-key secrecy.

– Forward secrecy: if a long-term secret key is exposed, an attacker who obtains this information is still unable to recover any information on session keys which were used in the past to communicate between a server and a user.
– Known-key secrecy: the exposure of some session keys should have no effect on the secrecy of the unexposed session keys.

Depending on the authentication type, key exchange protocols can be categorized into those providing implicit authentication and those providing explicit authentication. If the two communicating entities, a server and a user, but no other entities, can obtain the information on the session key, the protocol is said to provide implicit authentication. However, implicit authentication does not provide assurance that the two key exchanging entities actually hold the shared session key. If each key exchanging entity has confidence that the other has generated, or can generate, the session key, the protocol is said to provide explicit authentication. This paper focuses on implicit authentication for the user. A key exchange protocol providing implicit authentication can easily be converted to one providing explicit authentication by using a MAC (Message Authentication Code) or other mechanisms.
1.2 Previous Research and Analysis
In 2006, Yoon et al. [24] proposed a smartcard based authentication and key exchange protocol using biometric information. The authors proposed both a forward-secure scheme and a non-forward-secure scheme. Like Yoon et al.'s scheme, the scheme proposed in this paper is an implicit-authentication key exchange protocol providing mutual authentication between a server and a user; it also provides forward secrecy and known-key secrecy. Compared with Yoon et al.'s scheme, the proposed scheme has about the same level of computational complexity but requires one round fewer. Reducing the number of rounds is very important for efficiency in key exchange protocols. Table 1 compares Yoon et al.'s scheme and the proposed scheme in terms of efficiency and security. In the efficiency comparison, the registration phase has been omitted; for the computational complexity comparison, the login and key exchange phases of Yoon et al.'s scheme are compared with the access authorization and key exchange phases of the proposed scheme.
Table 1. Comparison of smartcard based authentication using biometric information and key exchange protocol
2 Proposed Protocol
In this section, we introduce the proposed smartcard based authentication and key exchange protocol using biometric information.
2.1 Notations and Parameters
In this subsection, we introduce the notations and parameters used in this paper.

– G: cyclic group of prime order q
– g: a generator of G
– l: security parameter
– M = (Key.G, Mac.G, Mac.V): a MAC algorithm with strong unforgeability. Key.G generates kmac. The Mac.G algorithm uses kmac to compute a tag τ = Mac_kmac(m) on a message m. Mac.V uses kmac to verify a tag against a message; if the tag is valid the output is 1, and otherwise the output is 0.
– H: {0, 1}∗ → {0, 1}^l, a collision-resistant hash function
– Ui, S: user Ui and server S
– IDi, IDS: Ui's identity and S's identity
– pwi: user Ui's password
– xS: the server's master secret key
– Fi: user Ui's fingerprint data

2.2 Protocol
The proposed protocol is composed of three phases: the registration phase, in which a user is registered as a user of the server and a smartcard is issued; the access authorization phase,
in which the user is verified to be the actual owner of the smartcard; and the authentication and key exchange phase, in which the user and the server are mutually authenticated and a session key is established.

[Registration Phase]
(1) After user Ui chooses his IDi and password pwi, he computes si = H(IDi ‖ Fi ‖ pwi) using Fi.
(2) The user registers si with the server S by secure means.
(3) Server S calculates vi = H(IDi ‖ xS) and wi = vi ⊕ si.
(4) Server S stores (vi, wi, H(·), M, G, g, p, q) onto the smartcard and issues the smartcard to user Ui in a secure manner.

[Smart Card Access Authorization Phase]
(1) If a user Ui wants to use the smartcard, Ui puts the smartcard into the reader, and the reader extracts the fingerprint information Fi.
(2) The user then inputs pwi.
(3) The smartcard calculates s′i = H(IDi ‖ Fi ‖ pwi) and v′i = wi ⊕ s′i, and checks whether v′i matches the stored vi.
(4) If they do not match, the smartcard checks whether the number of failed password inputs and failed fingerprint authentications exceeds the predetermined number of allowed failures.
(5) If it does, the smartcard can no longer be used. If v′i and vi match, user Ui's smartcard performs the next step of the protocol.

[Authentication and Key Exchange Phase]
(Round 1)
– User Ui's smartcard generates a random number α ∈ [1, q] and calculates g^α mod p. (From now on, mod p is omitted.)
– Ui sends (IDi, g^α, Ti = Mac.G_vi(IDi ‖ IDS ‖ g^α)) to the server S.
(Round 2)
– After receiving (IDi, g^α, Ti), server S verifies whether the output of Mac.V_vi(IDi ‖ IDS ‖ g^α) using vi is 1.
– If the MAC does not verify, the server terminates the protocol.
– If it verifies, the server S can confirm that Ui's smartcard contains the secret key vi, and therefore authenticates the user Ui.
– Server S generates a random number β ∈ [1, q] and then sends (IDS, g^β, Ts = Mac.G_vi(IDS ‖ IDi ‖ g^β)) to user Ui.
(Computing a session key)
– After receiving (IDS, g^β, Ts), user Ui verifies whether the output of Mac.V_vi(IDS ‖ IDi ‖ g^β) using vi is 1.
– If the MAC does not verify, the user terminates the protocol.
– If it verifies, the user Ui can confirm that the server S holds the master key xS, and therefore authenticates the server S.
– User Ui and server S then generate the session key sk = H(IDi ‖ IDS ‖ g^α ‖ g^β ‖ g^αβ).

A sketch of these computations is given below.
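The sketch assumes SHA-256 for H, HMAC-SHA-256 as the strongly unforgeable MAC M, and a toy Diffie-Hellman setting; all concrete primitives and byte encodings are illustrative assumptions rather than the paper's specification.

```python
import hashlib
import hmac
import secrets

p = 2**127 - 1    # toy modulus (illustration only)
q = p - 1         # toy exponent range
g = 3

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"|".join(parts)).digest()

def i2b(x: int) -> bytes:
    return x.to_bytes(16, "big")

def mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

# Registration: the card stores (v_i, w_i); the server keeps only x_S.
ID_i, ID_S = b"user-i", b"server"
pw_i, F_i = b"password", b"fingerprint-template"
x_S = secrets.token_bytes(32)
s_i = H(ID_i, F_i, pw_i)                 # s_i = H(ID_i || F_i || pw_i)
v_i = H(ID_i, x_S)                       # v_i = H(ID_i || x_S)
w_i = bytes(a ^ b for a, b in zip(v_i, s_i))

# Access authorization: unmask v_i from w_i and compare with the stored copy.
assert bytes(a ^ b for a, b in zip(w_i, H(ID_i, F_i, pw_i))) == v_i

# Rounds 1 and 2: MAC-authenticated Diffie-Hellman shares.
alpha = secrets.randbelow(q - 1) + 1
beta = secrets.randbelow(q - 1) + 1
ga, gb = pow(g, alpha, p), pow(g, beta, p)
T_i = mac(v_i, ID_i + ID_S + i2b(ga))    # card -> server
T_s = mac(v_i, ID_S + ID_i + i2b(gb))    # server -> card
# The server recomputes v_i = H(ID_i || x_S) to check T_i; the card checks T_s.
assert hmac.compare_digest(T_i, mac(H(ID_i, x_S), ID_i + ID_S + i2b(ga)))

# Session key sk = H(ID_i || ID_S || g^a || g^b || g^{ab}).
sk_card = H(ID_i, ID_S, i2b(ga), i2b(gb), i2b(pow(gb, alpha, p)))
sk_server = H(ID_i, ID_S, i2b(ga), i2b(gb), i2b(pow(ga, beta, p)))
assert sk_card == sk_server
```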
3 Security Analysis
In this section, we examine the security of the proposed protocol against active attacks.
3.1 Security against Illegal Use of a Smart Card
The proposed scheme confirms that the user knows the actual password and verifies the fingerprint data during the access authorization phase, to protect against illegal use of the smartcard. In other words, the smartcard takes the password pwi and the fingerprint data Fi from user Ui, generates s′i = H(IDi ‖ Fi ‖ pwi) and v′i = wi ⊕ s′i, and checks whether v′i is equal to vi; here vi = H(IDi ‖ xS) and wi = vi ⊕ si are the values stored on the smartcard. If the number of failed password inputs and failed fingerprint authentications exceeds the specified number of allowed failures, the smartcard is no longer usable. Therefore, only a person holding both the password and the fingerprint data is able to pass the verification stage.
3.2 Mutual Authentication
If M is a MAC algorithm with strong unforgeability, the proposed protocol provides mutual authentication between the user and the server. If an attacker could forge MAC values, the attacker could always pretend to be an authorized user or server. Conversely, if an attacker can, with non-negligible probability, produce a MAC value that was never generated by the authorized server or user, so as to impersonate them, then the attacker can forge a MAC tag on a message of its choice. This means the attacker has broken the MAC algorithm M, which contradicts the assumption that M is strongly unforgeable. Hence no attacker can break the MAC algorithm with non-negligible probability, and the proposed protocol therefore provides mutual authentication.
3.3 Key Confidentiality
If M is a MAC algorithm with strong unforgeability and the CDH (Computational Diffie-Hellman) problem is hard to solve in the group G, then the proposed protocol provides key confidentiality. If an attacker could forge a MAC value, the attacker could pretend to be a legitimate server or user and share a session key. Furthermore, if an attacker could solve the CDH problem, the attacker could compute g^αβ from the exchanged values g^α and g^β, and hence generate sk = H(IDi ‖ IDS ‖ g^α ‖ g^β ‖ g^αβ). Thus, if there existed an attacker able to break the confidentiality of the session key with non-negligible probability, that attacker could either break M's unforgeability or
develop an algorithm that solves the CDH problem. However, this contradicts the facts that M is a strongly unforgeable MAC algorithm and that CDH is a hard problem. Therefore, no attacker can break the confidentiality of a session key with non-negligible probability, and the proposed protocol provides key confidentiality.
3.4 Forward Secrecy
If the CDH problem in group G is hard to solve, the proposed protocol provides forward secrecy. An attacker who obtains the user's long-term secret values (vi, wi) and the server's master key xS is still unable to obtain information on session keys exchanged before the secrets were exposed. The reason is that vi = H(IDi ‖ xS) is used only as the key for the MAC algorithm, so (vi, wi) and xS play no role in generating a session key. If an attacker could solve the CDH problem, the attacker could compute g^αβ from g^α and g^β, which were exchanged between the server and the user, and thereby generate the session key sk = H(IDi ‖ IDS ‖ g^α ‖ g^β ‖ g^αβ). However, no known algorithm solves the CDH problem in polynomial time.
3.5 Known Key Secrecy
In the proposed scheme, the session key is generated as sk = H(IDi ‖ IDS ‖ g^α ‖ g^β ‖ g^αβ), where α and β are fresh random numbers generated for each session. Therefore, an exposed session key does not give any information on other session keys. In conclusion, the proposed scheme provides security against known key attacks.
4 Conclusions
This paper has introduced a smartcard based authentication and key exchange protocol using biometric information. Previous such protocols require three rounds, which leaves room for improvement; the proposed protocol uses only symmetric key operations and requires only two rounds. It is designed to provide mutual authentication, forward secrecy, and known-key secrecy. The key exchange protocol in this paper is expected to provide a secure and efficient authentication and key exchange methodology for pervasive computing environments.
References

1. Awashti, A.K.: Comment on a dynamic ID-based remote user authentication scheme. Transactions on Cryptology 1(2), 15–16 (2004)
2. Chien, H.Y., Chen, C.H.: A remote authentication scheme preserving user anonymity. In: Intl. Conf. on AINA 2005, vol. 2, pp. 245–248 (March 2005)
3. Chang, C.C., Hwang, K.F.: Some forgery attacks on a remote user authentication scheme using smart cards. Informatics 14(3), 289–294 (2003)
4. Das, M.L., Saxena, A., Gulati, V.P.: A dynamic ID-based remote user authentication scheme. IEEE Transactions on Consumer Electronics 50(2), 629–631 (2004)
5. Fan, C.I., Chan, Y.C., Zhang, Z.K.: Robust remote authentication scheme with smart cards. Computers & Security 24(8), 619–628 (2005)
6. El Gamal, T.: A public-key cryptosystem and a signature scheme based on discrete logarithms. IEEE Transactions on Information Theory 31(4), 469–472 (1985)
7. Hsu, C.L.: Security of Chien et al.'s remote user authentication scheme using smart cards. Computer Standards and Interfaces 26(3), 167–169 (2004)
8. Hwang, M.S., Li, L.H.: A new remote user authentication scheme using smart cards. IEEE Transactions on Consumer Electronics 46(1), 28–30 (2000)
9. Kumar, M.: New remote user authentication scheme using smart cards. IEEE Transactions on Consumer Electronics 50(2), 597–600 (2004)
10. Ku, W.C., Chang, S.T.: Impersonation attack on a dynamic ID-based remote user authentication scheme using smart cards. IEICE Transactions on Communication E88-B(5), 2165–2167 (2005)
11. Ku, W.C., Chang, S.T., Chiang, M.H.: Further cryptanalysis of fingerprint based remote user authentication scheme using smart cards. IEE Electronics Letters 41(5) (2005)
12. Khan, M.K., Zhang, J.: Improving the security of a flexible biometrics remote user authentication scheme. Computer Standards & Interfaces 29(1), 82–85 (2007)
13. Lamport, L.: Password authentication with insecure communication. Communications of the ACM 24(11), 770–772 (1981)
14. Leung, K.C., Cheng, L.M., Fong, A.S., Chan, C.K.: Cryptanalysis of a modified remote user authentication scheme using smart cards. IEEE Transactions on Consumer Electronics 49(4), 1243–1245 (2003)
15. Lee, C.C., Hwang, M.S., Yang, W.P.: A flexible remote user authentication scheme using smart cards. ACM Operating Systems Review 36(3), 46–52 (2002)
16. Lin, C.H., Lai, Y.Y.: A flexible biometrics remote user authentication scheme. Computer Standard and Interfaces 27(1), 19–23 (2004)
17. Lee, J.K., Ryu, S.R., Yoo, K.Y.: Fingerprint-based remote user authentication scheme using smart cards. IEE Electronics Letters 38(12), 554–555 (2002)
18. Sun, H.M.: An efficient remote user authentication scheme using smart cards. IEEE Transactions on Consumer Electronics 46(4), 958–961 (2000)
19. Shen, J.J., Lin, C.W., Hwang, M.S.: A modified remote user authentication scheme using smart cards. IEEE Transactions on Consumer Electronics 49(2), 414–416 (2003)
20. Sun, H.M., Yeh, H.T.: Further cryptanalysis of a password authentication scheme with smart cards. IEICE Transactions on Communications E86-B(4), 1412–1415 (2003)
21. Wang, S.J., Chang, J.F.: Smart card based secure password authentication scheme. Computers & Security 15(3), 231–237 (1996)
22. Yoon, E.J., Ryu, E.K., Yoo, K.Y.: An improvement of Hwang-Lee-Tang's simple remote user authentication scheme. Computers & Security 24, 50–56 (2005)
23. Yang, W.H., Shieh, S.P.: Password authentication schemes with smart cards. Computers & Security 18(8), 727–733 (1999)
24. Yoon, E.J., Yoo, K.Y.: Biometrics Authenticated Key Agreement Scheme. In: Etzion, O., Kuflik, T., Motro, A. (eds.) NGITS 2006. LNCS, vol. 4032, pp. 345–349. Springer, Heidelberg (2006)
25. Yang, C.C., Yang, H.W., Wang, R.C.: Cryptanalysis of security enhancement for the timestamp-based password authentication scheme using smart cards. IEEE Transactions on Consumer Electronics 50(2), 578–579 (2004)
26. ElGamal, T.: A Public-Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms. IEEE Transactions on Information Theory IT-31(4), 469–472 (1985)
27. Guthery, S.B., Jurgensen, T.M.: SmartCard Developer's Kit. Macmillan Technical Publishing (1998) ISBN 1-57870-027-2
28. Rankl, W., Effing, W.: Smart Card Handbook. John Wiley & Sons, Chichester (1997)
A Practical Approach to a Reliable Electronic Election Kwangwoo Lee, Yunho Lee, Seungjoo Kim , and Dongho Won Information Security Group, School of Information and Communication Engineering, Sungkyunkwan University, Suwon-si, Gyeonggi-do, 440-746, Korea {kwlee,leeyh,skim,dhwon}@security.re.kr
Abstract. A receipt used for electronic voting is a key component of voting techniques designed to provide voter verifiability. In this paper, we propose an electronic voting scheme that issues a receipt based on the cut-and-choose mechanism. Compared with Chaum et al.'s scheme [3], this scheme does not require large amounts of physical ballots or trust assumptions on tellers. Furthermore, it is as efficient and user-friendly as any of the existing receipt-issuing electronic voting schemes described in the relevant literature: voters only need to perform simple comparisons between the screen and the receipt during the voting process. The scheme is provably secure under the fact that the ElGamal encryption cryptosystem is semantically secure. We implemented the proposed scheme according to the defined implementation requirements and provide a security analysis of the scheme. Keywords: electronic voting, paper receipt, cut-and-choose.
1 Introduction
Electronic voting is a voting method utilizing at least one electronic device, such as a touch screen, scanner, printer or barcode reader. Over the last two decades, a considerable number of studies have been conducted on the various aspects of electronic voting. Several countries have adopted electronic voting with the expectation that this type of voting increases voter turnout and eliminates invalid votes resulting from human error. However, many experts express concerns about current electronic voting machines because these machines do not provide any proof of their honesty. For this reason, R. Mercuri stated that voting machines should produce paper audit trails for later recounts as necessary [6,7]. In [7], after a voter has finished making selections using a voting machine, the machine prints out a paper ballot that contains the voter's selections for each possible choice to be voted on. The printed ballot is kept behind a window to prevent voters from having an opportunity to tamper with it. If the voter examines and approves the ballot, the voting machine drops the printed ballot into an opaque ballot box. While voter verifiable paper ballots eliminate the need to trust the voting machine, the need to support printing and collecting paper ballots increases maintenance costs and election complexity for the poll workers [1]. Moreover, this voter verifiable paper audit trail is not used only for recounts: it allows voters to verify that their votes are cast as intended, and it serves as evidence of possible voting machine fraud.

In 2002, D. Chaum et al. [2] proposed a voter verifiable e-voting scheme using visual cryptography [4]. When a voter selects a choice, the voting machine provides receipts consisting of two separable layers. When the two layers are laminated together, they reveal the voter's choice; when separated, each layer contains meaningless dots. After voting, the voter selects one layer, retains it, and submits the unselected layer to a poll worker, who should shred it. Voters should check that their receipts have been correctly posted to the public web bulletin board. The major drawback of this scheme is that it requires special printers and papers.

In 2005, D. Chaum et al. also proposed a method to provide voter reliability [3,9]. This scheme provides ballot forms that are printed on normal paper prior to the start of an election. The ordering of the candidate list is randomized uniquely for each ballot form. A voter chooses a ballot form, marks the choice, and casts the vote using a scanner-type voting machine. The benefit of this scheme is that the voting machine never knows the selection of the voter, and such a machine is unable to change the vote based on its contents [4]. However, this scheme relies on the trustworthiness of the tellers, and it is potentially vulnerable to chain voting attacks [13,5].

* This research was supported by the Ministry of Knowledge Economy, Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Advancement) (IITA-2009-(C10900902-0016)). Corresponding author.
1.1 Our Contributions
In this paper, we propose an efficient and user-friendly electronic voting scheme that issues receipts. Although many experts have studied schemes that provide reliability using cryptographic methods [2,3,8], some of these schemes are difficult to use in a real election and have management restrictions. To solve these problems, we first review Chaum et al.'s scheme [3], a practical solution that provides assurance of secrecy and accuracy without any reliance on the underlying voting system. We then propose a basic cut-and-choose based electronic voting scheme and implement it in a form that is easy for voters to use. The scheme does not require large amounts of physical ballots or trust assumptions on the tellers, and it remains usable even when many candidates participate in the election. Therefore, it is more practical and efficient for a real election than the previous voting schemes.
1.2 Organization
The rest of the paper is organized as follows. In Section 2, we briefly describe related work, namely Chaum et al.'s scheme [3]. Our basic cut-and-choose based electronic voting scheme for issuing receipts is presented in Section 3. Following this, in Section 4, we discuss the implementation requirements and implement the electronic voting scheme using the basic cut-and-choose method, adapted to real-world elections. In Section 5, we present the security analysis of the scheme. Finally, we summarize and conclude in Section 6.
2 Related Work

2.1 Prêt à Voter
D. Chaum et al. proposed a practical electronic voting scheme in 2005, known as Prêt à Voter [3]. Their scheme uses a ballot form similar to the current paper ballot. A ballot form has a left column listing candidates or options and a right column into which the voter can insert her selection. In this section, we briefly explain the scheme; for full details, please see [3].

The Election Setup. An authority prepares a large number of ballots, significantly more than required for the electorate. The order in which the candidates are listed is randomized for each ballot, and the information from which the candidate ordering can be reconstructed is encrypted and printed on the ballot (see Fig. 1).

Casting a Vote. A voter selects one ballot form among the k (2 ≤ k) ballot forms given to her and audits the (k − 1) unselected ballot forms using the tellers as an oracle in the voting booth. Supposing that the tellers are honest and the (k − 1) ballot forms are valid, she can trust that the selected ballot form is also valid; a cheating authority escapes detection with probability 1/k. Next, she removes the left-hand (LH) strip and feeds the right-hand (RH) strip into the voting machine equipped with an optical reader
Fig. 1. An example of a ballot form (rotation = 2; the printed value 7rJ94K = E(2))
Fig. 2. Ballot form when a voter selects the third candidate
or similar device to record the encrypted information at the bottom of the RH strip. The LH strip should be shredded. She then retains the hard copy of the RH strip as a receipt. The recorded vote is posted to the public web bulletin board (WBB), and she can check whether the vote has been recorded correctly. For instance, if a voter selects the third listed candidate "Asterix", the ballot form will look like Fig. 2.

Tallying. After the election has closed, the tellers reconstruct the candidate ordering for each vote cast. The reconstruction can be accomplished by decrypting the printed code on the RH strip. For example, suppose that the vote cast is (3, E(2)); then the voter's selection v can be calculated by the following equation:

v = 3 − D(E(2)) (mod 4) = 3 − 2 (mod 4) = 1 ("Asterix")    (1)
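A toy rendering of this tally step, assuming the buried value is simply the rotation offset and positions are 1-based as in the paper's example (the helper name and candidate list are ours):

```python
# Candidate list in canonical order; each ballot rotates it by a secret offset.
CANDIDATES = ["Asterix", "Obelix", "Idefix", "Panoramix"]

def tally_one(marked_position: int, rotation: int, n: int = 4) -> str:
    """Recover the voter's choice from (marked position, decrypted rotation)."""
    v = (marked_position - rotation) % n
    return CANDIDATES[v - 1]  # v = 1 means the first canonical candidate

print(tally_one(3, 2))  # -> "Asterix", matching equation (1)
```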
2.2 Disadvantages of Chaum et al.'s Scheme
The main concern about this scheme is a possible discrepancy between the printed candidate order and the buried information. The authors stated that one satisfactory method of checking is to use the tellers as an oracle [3]: a voter gives the information to the oracle and the oracle returns the candidate ordering. However, this auditing method has two drawbacks:

– voters have to trust the oracle, and
– the oracle can be abused for vote buying or vote selling.

Though the authors also suggested dummy vote(s) [3], the second problem still exists. Besides these drawbacks, there are further problems: for example, the scheme is vulnerable to the chain voting attack, and it is not easy to destroy the LH strip securely [5].
3 Basic Cut-and-Choose Based Electronic Voting Scheme
We propose an electronic voting scheme that issues receipts using a cut-and-choose method. This scheme provides high reliability for election results.

3.1 Notation

– n: the number of candidates
– E(·, ·), D(·, ·): ElGamal encryption and decryption
– v: a voter's selection (1 ≤ v ≤ n)
– w: a random number for ElGamal encryption
– e: the encrypted choice (e = E(v, w))

3.2 The Voting Procedure
The detailed procedure can be described as follows.
1. The voting machine calculates n encrypted pairs (e′j, e″j) = (E(j, w′j), E(j, w″j)), where w′j and w″j are random numbers for ElGamal encryption.
2. The voting machine prints all 2n ciphertexts in random order on the receipt. (One simple way to achieve this is to sort the ciphertexts.) Then the voting machine displays e′j and e″j (j = 1, ..., n).
3. For each j = 1, ..., n, the voter randomly selects one ciphertext, e′j or e″j. The voting machine then prints the n unselected values and their corresponding random numbers on the paper receipt. The voter should check that the n printed values are the same as those on the screen.
4. The voter casts a ballot by selecting v (v ∈ {1, ..., n}).
5. The voting machine prints the v-th encrypted value among the selected values, e′v or e″v. The voter should check that the printed value is the same as that on the screen. Fig. 3 depicts an example of the voter's random selections, and a code sketch of these steps follows.
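The sketch assumes textbook ElGamal over a toy prime field with explicit randomness; the group parameters and helper names are illustrative assumptions, not the paper's choices.

```python
import secrets

p = 2**127 - 1                      # toy prime; not for real elections
g = 3
x = secrets.randbelow(p - 2) + 1    # election secret key
y = pow(g, x, p)                    # election public key

def enc(m, w):
    """Textbook ElGamal with explicit randomness: E(m, w) = (g^w, m * y^w)."""
    return (pow(g, w, p), (m * pow(y, w, p)) % p)

n = 4
rows = []                           # (e'_j, w'_j, e''_j, w''_j) for each j
for j in range(1, n + 1):
    w1 = secrets.randbelow(p - 2) + 1
    w2 = secrets.randbelow(p - 2) + 1
    rows.append((enc(j, w1), w1, enc(j, w2), w2))

# Step 3: for each row the voter keeps one ciphertext; the other is
# opened (printed together with its randomness) on the receipt.
selected, opened = [], []
for (eL, wL, eR, wR), c in zip(rows, [secrets.randbits(1) for _ in range(n)]):
    if c == 0:
        selected.append(eL); opened.append((eR, wR))
    else:
        selected.append(eR); opened.append((eL, wL))

# Steps 4-5: the vote cast is the selected (unopened) ciphertext of row v.
v = 2
vote_cast = selected[v - 1]
```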
Name | Column 0 | Column 1
Cand. A | E(1, w′1) | E(1, w″1)
Cand. B | E(2, w′2) | E(2, w″2)
Cand. C | E(3, w′3) | E(3, w″3)
Cand. D | E(4, w′4) | E(4, w″4)

Fig. 3. A voter selects 4 ciphertexts randomly (right, left, right, and right)
Receipt

Candidate | Verification code | Random number
1 | E(1, w′1) (Col. 0) | w′1
2 | E(2, w″2) (Col. 1) | w″2
3 | E(3, w′3) (Col. 0) | w′3
4 | E(4, w′4) (Col. 0) | w′4
Vote Cast: E(2, w′2)

Fig. 4. Sample receipt with 4 candidates (voter's selection is candidate B)
3.3 Receipt Verification
After voting, a voter retains her receipt, which looks like Fig. 4. The voter encrypts each candidate code j with the corresponding random number printed on the receipt using ElGamal encryption, and verifies that the encrypted results are equal to the corresponding verification codes. This verification can be accomplished easily by any voter because it requires only n simple encryptions; thus voters do not have to trust any specific election device or election worker in order to verify their receipts. For j = 1, ..., n, if all the encrypted results are equal to the corresponding verification codes, the voter can be assured that her vote was cast as intended with probability 1 − 1/2^(n−1).
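Receipt checking is deterministic re-encryption: given an opened pair (j, w) and the printed ciphertext, anyone can recompute E(j, w) and compare. A self-contained sketch under the same toy ElGamal assumptions as above:

```python
import secrets

p = 2**127 - 1
g = 3
x = secrets.randbelow(p - 2) + 1
y = pow(g, x, p)                     # election public key, publicly known

def enc(m, w):
    return (pow(g, w, p), (m * pow(y, w, p)) % p)

def verify_row(j, w, printed_ciphertext):
    """Re-encrypt the candidate code with the published randomness."""
    return enc(j, w) == printed_ciphertext

# The voting machine opened (j = 3, w) on the receipt:
w = secrets.randbelow(p - 2) + 1
receipt_entry = enc(3, w)
assert verify_row(3, w, receipt_entry)          # honest machine passes
assert not verify_row(1, w, receipt_entry)      # a modified code fails
```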
4 Making the Scheme More Practical for the Real World
In the basic cut-and-choose based electronic voting scheme, a voter must check that the printed values are the same as those on the screen. Thus, as the number of candidates increases, the number of comparisons the voter must perform also increases, which is not suitable for a public election. Therefore, we implement an enhanced cut-and-choose based electronic voting scheme that is easier for voters to use. It is not only suitable for real elections with many candidates, but also reduces the number of comparisons performed by the voter.
4.1 Implementation Requirements
The following implementation requirements correspond to various features of real-world elections.
– A voter should be able to compare the verification codes on the receipt with those on the screen in an easy way, without sacrificing security.
– The number of comparisons should be small enough regardless of the number of candidates.
– No one should be able to obtain or construct a receipt proving the content of a ballot.
4.2 Notation
We introduce some notations.

– n: the number of candidates
– V: the original candidate order
– VS: the set of all possible randomly permuted orders of V
– Vj: a randomly permuted order of V (Vj ∈ VS)
– E(Vj, w): the ElGamal encryption of a randomly ordered list Vj with a random number w
– vi: voter i's selection (1 ≤ vi ≤ n)
– Position[Vj, vi]: element vi's location in the list Vj
– H(·): the reduction function
4.3 Making the Receipt Easy to Compare
At the beginning, we present the reduction function. In the basic cut-and-choose method, the voter must check that the printed verification codes are the same as those on the screen. The length of a verification code is usually 2,048 bits, and strings of this length are difficult to compare. Therefore, we reduce the length of the verification codes using SHA-1, a byte-wise EX-OR, and a matching table. The security of this function rests on the difficulty of finding collisions in the underlying hash function. We use this reduction function to satisfy the implementation requirements.

Fig. 5. The reduction function (2,048-bit input → SHA-1 → 160 bits → byte EX-OR → 10 bits)
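The paper does not spell out how the 160-bit digest is folded down to 10 bits; the sketch below makes one concrete assumption (EX-OR the even-indexed digest bytes into one byte and the odd-indexed bytes into another, then keep the low 10 bits) and maps the two resulting 5-bit groups through the Table 1 alphabet. Only the SHA-1 / byte EX-OR / 10-bit / matching-table pipeline comes from the paper.

```python
import hashlib

# 32-character alphabet of Table 1: 5-bit value -> printable character.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ123456"

def reduce_code(verification_code: bytes) -> str:
    """Reduction function H: long ciphertext -> two receipt characters.

    The exact 160-to-10-bit folding is an assumption of this sketch.
    """
    digest = hashlib.sha1(verification_code).digest()   # 20 bytes
    a = b = 0
    for i, byte in enumerate(digest):
        if i % 2 == 0:
            a ^= byte
        else:
            b ^= byte
    ten_bits = ((a << 8) | b) & 0x3FF                   # keep 10 bits
    return ALPHABET[ten_bits >> 5] + ALPHABET[ten_bits & 0x1F]

print(reduce_code(b"...2048-bit ElGamal ciphertext bytes..."))  # e.g. "QR"
```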
Table 1. 5-bit matching table

bits  ch. | bits  ch. | bits  ch. | bits  ch.
00000 A | 01000 I | 10000 Q | 11000 Y
00001 B | 01001 J | 10001 R | 11001 Z
00010 C | 01010 K | 10010 S | 11010 1
00011 D | 01011 L | 10011 T | 11011 2
00100 E | 01100 M | 10100 U | 11100 3
00101 F | 01101 N | 10101 V | 11101 4
00110 G | 01110 O | 10110 W | 11110 5
00111 H | 01111 P | 10111 X | 11111 6

4.4 The Voting Procedure
The more practical voting scheme can be described as follows.

1. The voting machine calculates two encrypted values (e1, e2) = (E(V1, w1), E(V2, w2)), where w1 and w2 are random numbers and V1, V2 ∈R VS.
Fig. 6. A sample voting machine’s screen
Fig. 7. A sample voting machine’s receipt
2. The voting machine prints e1 and e2 on the receipt in random order. Then the voting machine displays the election information (candidates' numbers, candidates' names, and parties' names) and two code pairs (V1, H(e1)) and (V2, H(e2)). The first code pair is called the top row, and the other the bottom row. Fig. 6 shows an example in the case of V1 = [1, 3, 2, 5, 4] and V2 = [5, 1, 4, 3, 2]; in this example, QR and XO are H(e1) and H(e2), respectively.
3. The voter selects the candidate number vi and confirms the selection. Next, the voter selects one of the two code pairs (top or bottom). Fig. 6 depicts the voting machine's screen when a voter selects the third candidate (vi = 3) and the top row.
4. The voting machine prints the selected row Vs (∈ {V1, V2}) on the receipt without the candidates' numbers and indicates the selected candidate's position (Position[Vs, vi]) with a "v" mark. At the same time, the voting machine prints the unselected row information (Vu, eu, wu) (s ≠ u) on the receipt to prove that the entire encryption process is correct. Finally, the voting machine prints a signature on the receipt to preserve integrity.
5. The voter should check that the printed information is the same as that on the screen (see Fig. 6 and Fig. 7).
6. After leaving the polling place, the voter checks the validity of the receipt using the voting machine's signature. Then the voter should check whether the following equation holds, using Vu and wu:

H(eu) = H(E(Vu, wu))    (2)
The voter also checks the WBB to verify that (Position[Vs, vi], es) is posted correctly.

4.5 Tallying
The election authority begins the tallying process after the election has closed. Voter i's selection vi can be computed using es and pi = Position[Vs, vi]: the authority decrypts es, yielding Vs, and finds the pi-th number in Vs, which is voter i's selection.
5 Security and Efficiency Analysis
Voter verifiability (or individual verifiability) means that a voter should be able to satisfy him/herself that the vote is cast as intended. An e-voting scheme providing voter verifiability should satisfy the following two conditions, fraud detection and receipt-freeness.

Definition 1 (Fraud Detection). The voting machine's fraud, e.g., modifying the vote value against the voter's intention, should be detected with probability at least 1/2.

Definition 2 (Receipt-Freeness). No one can obtain or construct evidence of how a voter voted, even using the receipt.

We analyze the security of the proposed scheme based on these two conditions.

5.1 Fraud Detection
We denote the set of all ciphertexts by S:

S = {e1, . . . , et}  (|S| = t)    (3)

Obviously, the chance that the voting machine's fraud goes undetected is 1/t, because the voter selects one code from S and the voting machine must reveal all the encryption parameters (i.e., the random numbers) except the one for the selected code. In other words, the voting machine must predict the voter's choice exactly in order to modify the vote cast without being detected. This remains true when we incorporate the reduction function H(·). Let S′ be the set of all reduced codes from S, and assume ∀i, j (i ≠ j, 1 ≤ i, j ≤ t), H(ei) ≠ H(ej):

S′ = {H(e1), . . . , H(et)}  (|S′| = |S| = t)    (4)

Then S and S′ are in bijection. Suppose |hj| and |ej| are the bit lengths of hj and ej, respectively, where hj = H(ej). Since |hj| is much shorter than |ej|, it is easy to find an e′j satisfying the following equation:

hj = h′j  (where h′j = H(e′j), e′j ≠ ej ∧ 1 ≤ j ≤ t)    (5)
Fig. 8. The WBB of practical electronic voting system
This means that the voting machine could easily modify the vote cast after the voter's selection. However, if we make the voting machine commit to all ciphertexts in random order prior to the voter's selection, then the voting machine must predict the voter's selection exactly in order to modify the vote cast without being detected, and the chance of this is again 1/t. Thus the proposed scheme's minimal security is 1/2, because t is greater than or equal to 2.
5.2 Receipt-Freeness
Theorem 1. If the ElGamal encryption scheme is not secure in the sense of indistinguishability, then there exists a probabilistic polynomial-time Turing machine that solves the DDH (Decisional Diffie-Hellman) problem with overwhelming probability.
Theorem 1 implies that the ElGamal encryption scheme is secure because it is highly believed that the DDH problem is intractable. The Lemma 1 follows immediately from the Theorem 1. Lemma 1. If there exists a secure CSPRNG (Cryptographically Secure Pseudo Random Number Generator) and a voting machine generates random numbers using CSPRNG, then no one can get any partial information about the voter’s choice from a receipt.
202
K. Lee et al.
Proof: An encryption scheme secure in the sense of indistinguishability is semantically secure [10]. Thus, the ElGamal encryption scheme is semantically secure and is secure under a chosen plaintext attack. Let A be a set of encrypted values printed on a receipt and B be a set of encrypted values computed by the voter herself. Obviously no one can distinguish A from B if the voting machine generates random numbers using CSPRNG. Moreover, several secure CSPRNGs exist, for e.g., [12]. Thus an attack trying to obtain information from a receipt can be thought as a kind of chosen plaintext attack. However, an attacker can not learn any partial information about the voter’s choice because the ElGamal encryption scheme is secure under a chosen plaintext attack. 5.3
Efficiency
In an electronic voting scheme issuing a voter verifiable receipt, a voter has to compare the printed codes on the receipt with screen in the voting booth. Thus, the length of each code and the number of codes should be small enough without sacrificing its security. We make the codes much shorter than the ciphertexts using reduction function and prove that using reduction function does not weaken security. Compared with Chaum et al.’s scheme [3], the proposed scheme does not require preparation of large amounts of physical ballots. Moreover, the proposed scheme does not require any teller (verifier) which is a key component of Chaum et al.’s scheme and all voters have to trust it in the voting booth. We argue that a voter should never trust any devices or workers in the voting booth during voting period. Furthermore, the tellers can be abused to prove voters’ selections.
6
Conclusion
Many researchers and cryptographers agreed that an electronic voting system using cryptography will be used in the future - at least in next 20 years. While Chaum et al.’s scheme [3] is practical ever proposed, it has some drawbacks for e.g., a large number of physical ballots significantly more than required for the electorate are needed and auditing ballot form using tellers as an oracle is required. Auditing ballot form is potentially rather delicate and fragile because it can be abused for vote buying and selling. Our scheme does not need the physical ballots and requires simple comparisons of short strings consisting of two characters. Moreover, a voter can audit the receipt by herself outside the voting place, thus she does not have to trust any devices or workers in the voting place. However, she can not prove her selection using the receipt.
References 1. Evans, D., Paul, N.: Election Security: Perception and Reality. IEEE Security and Privacy Magazine 2(1), 24–31 (2004) 2. Chaum, D.: Secret-Ballot Receipts: True Voter-Verifiable Elections. IEEE Security and Privacy Magazine 2(1), 38–47 (2004)
A Practical Approach to a Reliable Electronic Election
203
3. Chaum, D., Ryan, P.Y.A., Schneider, S.: A Practical Voter-Verifiable Election Scheme. In: di Vimercati, S.d.C., Syverson, P.F., Gollmann, D. (eds.) ESORICS 2005. LNCS, vol. 3679, pp. 118–139. Springer, Heidelberg (2005) 4. Naor, M., Shamir, A.: Visual Cryptography. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 1–12. Springer, Heidelberg (1995) 5. Ryan, P.Y.A., Peacock, T.: A Threat Analysis of Pret a Voter. In: Proc. of IAVoSS Workshop On Trustworthy Elections (WOTE 2006), pp. 101–117 (2006) 6. Mercuri, R.: Rebecca Mercuri’s Statement on Electronic Voting (2001), http://www.notablesoftware.com/RMstatement.html 7. Mercuri, R.: A Better Ballot Box. IEEE Spectrum Online, 46–50 (October 2002) 8. Neff, C.A., Adler, J.: Verifiable e-Voting: Indisputable Electronic Elections at Polling Places, VoteHere Inc (2003), http://www.votehere.net/vhti/documentation/ VHVHTiWhitePaper.pdf 9. Chaum, D., Ryan, P.Y.A., Schneider, S.: A Practical, Voter-Verifiable Election Scheme, Technical Report CS-TR-880, University of Newcastle upon Tyne (2004) 10. Goldwasser, S., Micali, S.: Probabilistic Encryption. Journal of Computer System Sciences (JCSS) 28(2), 270–299 (1984) 11. Tsiounis, Y., Yung, M.: On the Security of ElGamal Based Encryption. In: Imai, H., Zheng, Y. (eds.) PKC 1998. LNCS, vol. 1431, pp. 117–134. Springer, Heidelberg (1998) 12. Blum, L., Blum, M., Shub, M.: A Simple Secure Unpredictable Pseudo-random Number Generator. SIAM Journal on Computing 15, 364–383 (1986) 13. Ryan, P.Y.A., Peacock, T.: Pret a Voter: A Systems Perspective, NCL CS Tech Report 929 (September 2005)
Security Weakness in a Provable Secure Authentication Protocol Given Forward Secure Session Key Mijin Kim, Heasuk Jo, Seungjoo Kim, and Dongho Won Department of Electrical and Computer Engineering, Sungkyunkwan University, 300 Cheoncheon-dong, Jangan-gu, Suwon-si, Gyeonggi-do 440-746, Korea {mjkim,hsjo,skim,dhwon}@security.re.kr
Abstract. Shi, Jang and Yoo recently proposed a provable secure key distribution and authentication protocol between user, service provider and key distribution center(KDC). The protocol was based on symmetric cryptosystem, challenge-response, Diffie-Hellman component and hash function. Despite the claim of provable security, the protocol is in fact insecure in the presence of an active adversary. In this paper, we present the imperfection of Shi et al.’s protocol and suggest modifications to the protocol which would resolve the problem. Keywords: Cryptography, Key distribution, Authentication, Known key attack, Provable security.
1
Introduction
In many areas of modern computing, the solution to their security needs, in particular key management, is still open research challenges [1]. Researchers proposed a variety of authentication protocols which enables the users to be authenticated in order to get service from the service provider [2]. Kerberos [3] which is based on the technology of timestemp and symmetric secret key is one of the most widely used authentication protocols, but it has limitations and drawbacks such as vulnerabilities of password guessing attacks, replay attacks and exposure of session keys [4,5]. Improved authentication protocols have been proposed to enhance the security, the scalability, and the efficiency of Kerberos [6,7,8,9,10,11,12]. The robustness of authentication protocol against the loss of a session key has been the subject of many investigations [13,14,15]. Bellare and Rogaway observed that it is necessary for secure authenticated key exchange [14]. They pointed out that even if an attacker gets hold of a session key, this should effect only the session which that key protects. In particular, it should not be any easier for the attacker to compute another session key. Chien and Jan [12] pointed out
Corresponding author.
O. Gervasi et al. (Eds.): ICCSA 2009, Part II, LNCS 5593, pp. 204–211, 2009. c Springer-Verlag Berlin Heidelberg 2009
Security Weakness in a Provable Secure Authentication Protocol
205
the security weaknesses in certain session key certificate based protocols [6,10], and then proposed a hybrid authentication for large mobile networks based on public key cryptography, challenge-response and hash chaining. But, Tang and Mitchell showed the protocol suffered from security vulnerabilities [16]. Recently Shi et al. proposed a key distribution and authentication protocol between user, service provider and key distribution center(KDC) [17]. The protocol was based on symmetric cryptosystem, challenge-response, Diffie-Hellman component and hash function. Shi et al. claimed that the proposed protocol is a provably secure authentication protocol. Unfortunately, if an attacker gets hold of a session key in Shi et al.’s protocol, it would be easy for the attacker to compute another session key. Well designed protocols prevent this. We suggest the ways to prevent the known key attack in Shi et al.’s protocol. This paper is organized as follows: In Section 2 we reviews Shi et al.’s authentication protocol. In Section 3 we present security weakness of Shi et al.’s protocol. In Section 4 we propose modifications of the protocol and analyze their security. Finally, we conclude this work in Section 5.
2 Review of Shi et al.'s Authentication Protocol
The protocol consists of two phases: the initial phase and the subsequent phase. The KDC maintains a secret key $K_c$ to calculate the symmetric keys for its users. $K_{uc}$ is the long-term key shared between the user and the KDC, while $K_{sc}$ is the long-term key shared between the server and the KDC, where $K_{uc} = f(K_c, U)$, $K_{sc} = f(K_c, S)$, $U$ and $S$ denote the identities of the user and the server, and $f$ is a hash function. The detailed description is given below.

2.1 Initial Phase
In order to obtain service from a server $S$, a user $U$ performs the initial phase to establish a session key with the server through the following steps.

I1. $U \to S$: $U, a^x \bmod p, h(a^x \bmod p, K_{uc})$, where $p$ is a large prime, $a$ is a generator of order $p-1$ in $GF(p)$, and $h$ denotes a one-way hash function.
1. $U$ randomly selects a secret number $x$ and computes $a^x \bmod p$.
2. $U$ calculates $h(a^x \bmod p, K_{uc})$.
3. $U$ sends his identity $U$, $a^x \bmod p$, $h(a^x \bmod p, K_{uc})$ to the server.

I2. $S \to$ KDC: $U, a^x \bmod p, h(a^x \bmod p, K_{uc}), S, a^y \bmod p, h(a^y \bmod p, K_{sc})$
1. $S$ stores $a^x \bmod p$.
2. $S$ randomly selects a secret number $y$ and computes $a^y \bmod p$.
3. $S$ calculates $h(a^y \bmod p, K_{sc})$.
4. $S$ sends $U, a^x \bmod p, h(a^x \bmod p, K_{uc})$ together with his identity $S, a^y \bmod p, h(a^y \bmod p, K_{sc})$ to the KDC.
I3. KDC $\to S$: $E_{K_{sc}}(a^y \bmod p, n, r_n, E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, r_n))$
1. The KDC authenticates the user and the server by checking the received hash values $h(a^x \bmod p, K_{uc})$ and $h(a^y \bmod p, K_{sc})$ against the values it computes itself.
2. The KDC chooses $n$ and generates a random number $r_n$, where $n$ is the number of times that $U$ may communicate with $S$.
3. The KDC sends $E_{K_{sc}}(a^y \bmod p, n, r_n, E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, r_n))$ to $S$.

I4. $S \to U$: $E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, r_n), E_{K_n}(a^x \bmod p \,\|\, a^y \bmod p)$
1. $S$ decrypts the message $E_{K_{sc}}(a^y \bmod p, n, r_n, E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, r_n))$ using $K_{sc}$.
2. $S$ authenticates the KDC by checking the value of $a^y \bmod p$.
3. $S$ computes $a^{xy} \bmod p$ to obtain $K_n$ and encrypts $a^x \bmod p \,\|\, a^y \bmod p$ using $K_n$.
4. $S$ sends $E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, r_n), E_{K_n}(a^x \bmod p \,\|\, a^y \bmod p)$ to $U$.
5. $S$ keeps $U, K_n, n, r_n$.

I5. $U$ receives $E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, r_n), E_{K_n}(a^x \bmod p \,\|\, a^y \bmod p)$
1. $U$ decrypts the message $E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, r_n)$ using $K_{uc}$.
2. $U$ authenticates the KDC by checking the value of $a^x \bmod p$.
3. $U$ computes $a^{xy} \bmod p$ to obtain $K_n$.
4. $U$ decrypts $E_{K_n}(a^x \bmod p \,\|\, a^y \bmod p)$ using $K_n$.
5. $U$ authenticates the server by checking the values of $a^x \bmod p$ and $a^y \bmod p$.
6. $U$ keeps $K_n, n, r_n$.

2.2 Subsequent Phase
In the subsequent phase, assume the user is requesting the $i$-th service.

S1. $U \to S$: $U, E_{K_{n-i+1}}(r_{n-i+1})$
1. $U$ encrypts $r_{n-i+1}$ using $K_{n-i+1}$.
2. $U$ sends $U, E_{K_{n-i+1}}(r_{n-i+1})$ to $S$.

S2. $S \to U$: $E_{K_{n-i}}(r_{n-i+1} \,\|\, r_{n-i})$
1. $S$ decrypts the message $E_{K_{n-i+1}}(r_{n-i+1})$ using $K_{n-i+1}$.
2. $S$ authenticates $U$ by checking whether the recovered $r_{n-i+1}$ equals the stored $r_{n-i+1}$.
3. If they are equal, $S$ computes $K_{n-i} = h(K_{n-i+1} \,\|\, r_{n-i+1})$.
4. $S$ generates a new random number $r_{n-i}$.
5. $S$ encrypts $r_{n-i+1} \,\|\, r_{n-i}$ using $K_{n-i}$.
6. $S$ stores $K_{n-i}$ and $r_{n-i}$, and updates $i$.

S3. $U$ receives $E_{K_{n-i}}(r_{n-i+1} \,\|\, r_{n-i})$
1. $U$ computes $K_{n-i} = h(K_{n-i+1} \,\|\, r_{n-i+1})$.
2. $U$ decrypts $E_{K_{n-i}}(r_{n-i+1} \,\|\, r_{n-i})$ and authenticates $S$ by checking whether the recovered $r_{n-i+1}$ equals the stored $r_{n-i+1}$.
3. If they are equal, $U$ stores $K_{n-i}$ and $r_{n-i}$, and updates $i$.
3 Weaknesses in Shi et al.'s Authentication Protocol
A protocol is said to provide known key security if the compromise of session keys allows neither a passive adversary to compromise keys of other sessions nor an active adversary to impersonate one of the protocol parties. In this section, we show that Shi et al.'s protocol suffers from a known key attack. Since we consider a known key attack, we assume that the session key $K_{n-i+1}$ has been revealed to an attacker $A$. The attack scenario is as follows:
1. When a user $U$ is requesting the $i$-th service in the subsequent phase, the attacker $A$ eavesdrops on the transmitted messages $U, E_{K_{n-i+1}}(r_{n-i+1})$ and $E_{K_{n-i}}(r_{n-i+1} \,\|\, r_{n-i})$. From the message $U, E_{K_{n-i+1}}(r_{n-i+1})$, $A$ obtains $r_{n-i+1}$ by decrypting $E_{K_{n-i+1}}(r_{n-i+1})$.
2. Having $K_{n-i+1}$ and $r_{n-i+1}$, $A$ is able to compute the next session key $K_{n-i} = h(K_{n-i+1} \,\|\, r_{n-i+1})$. By decrypting $E_{K_{n-i}}(r_{n-i+1} \,\|\, r_{n-i})$, $A$ obtains $r_{n-i}$. $A$ then records $K_{n-i}$ and $r_{n-i}$ for the new session.
Consequently, from that point on the server shares every session key with the attacker rather than only with the user. This implies that the protocol cannot be proven secure in a well-known security model for key establishment protocols [14].
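To make the chain of compromise concrete, the following sketch walks through the attacker's computation; the SHA-256 choice, the helper names and all example values are illustrative stand-ins, not part of Shi et al.'s specification.

```python
# A minimal sketch of the known key attack, assuming a generic one-way hash;
# SHA-256 and all variable values are illustrative, not from the paper.
import hashlib

def h(*parts):
    """One-way hash over the concatenation of its arguments."""
    return hashlib.sha256(b"".join(parts)).digest()

def next_session_key(current_key, revealed_nonce):
    """Key update of the subsequent phase: K_{n-i} = h(K_{n-i+1} || r_{n-i+1})."""
    return h(current_key, revealed_nonce)

k_leaked = hashlib.sha256(b"example key material").digest()  # compromised K_{n-i+1}
r_recovered = b"r_{n-i+1}"   # decrypted from E_{K_{n-i+1}}(r_{n-i+1}) in step S1

k_next = next_session_key(k_leaked, r_recovered)  # the attacker's copy of K_{n-i}
```

Decrypting the S2 message with this key then yields $r_{n-i}$, and the recursion continues for every remaining session.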
4 Security Enhancement

4.1 Preventing the Known Key Attack
To achieve security against the presented attack, we propose two methods that allow the parties to establish a session key which can be used to protect their subsequent communications. The modified protocols are given below.

Improvement 1. In the attack demonstrated above, the attacker obtains a session key and then uses this information to determine new session keys. If both $U$ and $S$ use an additional shared secret value in step S1 of the subsequent phase, the attacker $A$ is no longer able to compute $r_{n-i+1}$, and hence new session keys, from a compromised session key and the eavesdropped messages. Therefore, both $U$ and $S$ should hold another secret value to prevent the known key attack. We describe the protocol as follows. The protocol consists of two phases: the initial phase and the subsequent phase. The KDC maintains a secret key $K_c$ to calculate the symmetric keys for its users. $K_{uc}$ and $K_{sc}$ are the long-term keys shared with the user and the server, respectively, where $K_{uc} = f(K_c, U)$, $K_{sc} = f(K_c, S)$, $U$ and $S$ denote the identities, and $f$ is a hash function. To secure the protocol, the KDC additionally calculates the long-term key $K_{us} = f(K_c, U, S)$ shared between the user and the server. After performing the initial phase of Shi et al.'s protocol, $U$ and $S$ perform the following steps to request the $i$-th service in the subsequent phase.
S1. $U \to S$: $U, E_{K_{n-i+1}}(E_{K_{us}}(r_{n-i+1}))$
1. $U$ encrypts $r_{n-i+1}$ using $K_{us}$ and then $K_{n-i+1}$.
2. $U$ sends $U, E_{K_{n-i+1}}(E_{K_{us}}(r_{n-i+1}))$ to $S$.

S2. $S \to U$: $E_{K_{n-i}}(r_{n-i+1} \,\|\, r_{n-i})$
1. $S$ decrypts the message $E_{K_{n-i+1}}(E_{K_{us}}(r_{n-i+1}))$ using $K_{n-i+1}$ and $K_{us}$.
2. $S$ authenticates $U$ by checking whether the recovered $r_{n-i+1}$ equals the stored $r_{n-i+1}$.
3. If they are equal, $S$ computes $K_{n-i} = h(K_{n-i+1} \,\|\, r_{n-i+1})$.
4. $S$ generates a new random number $r_{n-i}$.
5. $S$ encrypts $r_{n-i+1} \,\|\, r_{n-i}$ using $K_{n-i}$.
6. $S$ stores $K_{n-i}$ and $r_{n-i}$, and updates $i$.

S3. $U$ receives $E_{K_{n-i}}(r_{n-i+1} \,\|\, r_{n-i})$
1. $U$ computes $K_{n-i} = h(K_{n-i+1} \,\|\, r_{n-i+1})$.
2. $U$ decrypts $E_{K_{n-i}}(r_{n-i+1} \,\|\, r_{n-i})$ and authenticates $S$ by checking whether the recovered $r_{n-i+1}$ equals the stored $r_{n-i+1}$.
3. If they are equal, $U$ stores $K_{n-i}$ and $r_{n-i}$, and updates $i$.

Improvement 2. We modify Shi et al.'s protocol based on the advantages of hash chains and a symmetric key cryptosystem [18]. In the modified protocol, the long-term shared keys $K_{uc}$ and $K_{sc}$ are the same as those of Shi et al.'s protocol.

• Initial Phase. In order to obtain service from the server $S$, the user $U$ performs the initial phase to establish a session key with $S$ through the following steps.

I1. $U \to S$: $U, a^x \bmod p, h(a^x \bmod p, K_{uc})$, where $p$ is a large prime, $a$ is a generator of order $p-1$ in $GF(p)$, and $h$ denotes a one-way hash function.
1. $U$ randomly selects a secret number $x$ and computes $a^x \bmod p$.
2. $U$ calculates $h(a^x \bmod p, K_{uc})$.
3. $U$ sends $U, a^x \bmod p, h(a^x \bmod p, K_{uc})$ to $S$.

I2. $S \to$ KDC: $U, a^x \bmod p, h(a^x \bmod p, K_{uc}), S, a^y \bmod p, h(a^y \bmod p, K_{sc})$
1. $S$ stores $a^x \bmod p$.
2. $S$ randomly selects a secret number $y$ and computes $a^y \bmod p$.
3. $S$ calculates $h(a^y \bmod p, K_{sc})$.
4. $S$ sends $U, a^x \bmod p, h(a^x \bmod p, K_{uc})$ together with $S, a^y \bmod p, h(a^y \bmod p, K_{sc})$ to the KDC.

I3. KDC $\to S$: $E_{K_{sc}}(a^y \bmod p, n, h^n(r_n), E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, h^n(r_n)))$
1. The KDC authenticates the user and the server by checking the received hash values $h(a^x \bmod p, K_{uc})$ and $h(a^y \bmod p, K_{sc})$ against the values it computes itself.
2. The KDC chooses $n$ and generates the authentication token $h^n(r_n)$, where $n$ is the number of times that $U$ is authorized to access $S$ and $h^n(r_n) = h(h(\cdots h(r_n) \cdots))$ is the $n$-fold hash of the random number $r_n$.
3. The KDC sends $E_{K_{sc}}(a^y \bmod p, n, h^n(r_n), E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, h^n(r_n)))$ to $S$.
I4. $S \to U$: $E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, h^n(r_n)), E_{K_n}(a^x \bmod p \,\|\, a^y \bmod p)$
1. $S$ decrypts $E_{K_{sc}}(a^y \bmod p, n, h^n(r_n), E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, h^n(r_n)))$ using $K_{sc}$.
2. $S$ authenticates the KDC by checking the value of $a^y \bmod p$.
3. $S$ computes $a^{xy} \bmod p$ to obtain $K_n$ and encrypts $a^x \bmod p \,\|\, a^y \bmod p$ using $K_n$.
4. $S$ sends $E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, h^n(r_n)), E_{K_n}(a^x \bmod p \,\|\, a^y \bmod p)$ to $U$.
5. $S$ keeps $U, K_n, n, h^n(r_n)$.

I5. $U$ receives $E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, h^n(r_n)), E_{K_n}(a^x \bmod p \,\|\, a^y \bmod p)$
1. $U$ decrypts the message $E_{K_{uc}}(a^x \bmod p, a^y \bmod p, n, h^n(r_n))$ using $K_{uc}$.
2. $U$ authenticates the KDC by checking the value of $a^x \bmod p$.
3. $U$ computes $a^{xy} \bmod p$ to obtain $K_n$.
4. $U$ decrypts $E_{K_n}(a^x \bmod p \,\|\, a^y \bmod p)$ using $K_n$.
5. $U$ authenticates $S$ by checking the values of $a^x \bmod p$ and $a^y \bmod p$.
6. $U$ keeps $K_n, n, h^n(r_n)$.

• Subsequent Phase. In the subsequent phase, assume the user $U$ is requesting the $i$-th service.

S1. $U \to S$: $U, E_{K_{n-i+1}}(h^{n-i}(r_{n-i+1}))$
1. $U$ computes $h^{n-i}(r_{n-i+1})$.
2. $U$ sends $U, E_{K_{n-i+1}}(h^{n-i}(r_{n-i+1}))$ to $S$.

S2. $S \to U$: $E_{K_{n-i}}(h^{n-i}(r_{n-i+1}) \,\|\, h^{n-i}(r_{n-i}))$
1. $S$ decrypts the message $E_{K_{n-i+1}}(h^{n-i}(r_{n-i+1}))$ using $K_{n-i+1}$.
2. $S$ checks whether $h(h^{n-i}(r_{n-i+1}))$ equals the stored hash value $h^{n-i+1}(r_{n-i+1})$.
3. If the check succeeds, $S$ generates a new random number $r_{n-i}$ and computes $K_{n-i} = h(K_{n-i+1} \,\|\, h^{n-i+1}(r_{n-i+1}))$.
4. $S$ encrypts $h^{n-i}(r_{n-i+1}) \,\|\, h^{n-i}(r_{n-i})$ using $K_{n-i}$.
5. $S$ stores $K_{n-i}$ and $h^{n-i}(r_{n-i})$, and updates $i$.

S3. $U$ receives $E_{K_{n-i}}(h^{n-i}(r_{n-i+1}) \,\|\, h^{n-i}(r_{n-i}))$
1. $U$ computes $K_{n-i} = h(K_{n-i+1} \,\|\, h^{n-i+1}(r_{n-i+1}))$.
2. $U$ decrypts $E_{K_{n-i}}(h^{n-i}(r_{n-i+1}) \,\|\, h^{n-i}(r_{n-i}))$ and checks whether $h(h^{n-i}(r_{n-i+1}))$ equals the stored hash value $h^{n-i+1}(r_{n-i+1})$.
3. If the check succeeds, $U$ stores $K_{n-i}$ and $h^{n-i}(r_{n-i})$, and updates $i$.

4.2 Security Analysis of Our Enhanced Protocol
In Improvement 1, both $U$ and $S$ use a long-term shared key $K_{us}$ to protect communications in the subsequent phase. Assume the attacker $A$ knows a session key $K_{n-i+1}$ and has eavesdropped on the messages of S1 and S2. Without knowing the long-term shared key $K_{us}$, $A$ can neither extract $r_{n-i+1}$ nor calculate $K_{n-i}$ in the modified protocol; only the legitimate user can compute the session key.
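A minimal sketch of the nested encryption in S1 of Improvement 1 follows; Fernet from the `cryptography` package serves only as a stand-in symmetric cipher, since the paper does not prescribe a concrete algorithm.

```python
# A sketch of the nested S1 message E_{K_{n-i+1}}(E_{K_us}(r_{n-i+1})); Fernet
# is only a stand-in symmetric cipher -- the paper does not prescribe one.
from cryptography.fernet import Fernet

k_us = Fernet.generate_key()        # long-term user-server key K_us = f(K_c, U, S)
k_session = Fernet.generate_key()   # current session key K_{n-i+1}

r = b"r_{n-i+1}"                    # challenge of the current round
inner = Fernet(k_us).encrypt(r)     # inner layer: requires K_us
outer = Fernet(k_session).encrypt(inner)

# An attacker holding only k_session strips just the outer layer:
assert Fernet(k_session).decrypt(outer) == inner  # r stays hidden without K_us
```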
In Improvement 2, even if a session key $K_{n-i+1}$ is compromised, the attacker additionally has to know the token $h^{n-i+1}(r_{n-i+1})$ to compute $K_{n-i}$. Therefore, our attack is no longer valid against the modified protocol.
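The token check of Improvement 2 can be sketched as one-step pre-image revelation down a hash chain; SHA-256 and the index choice (the first subsequent-phase run, $i = 1$) are illustrative assumptions.

```python
# A sketch of the Improvement 2 token check as one-step pre-image revelation
# down a hash chain; the hash choice and indices are illustrative assumptions.
import hashlib

def h(x):
    return hashlib.sha256(x).digest()

def h_chain(x, n):
    """n-fold hash h^n(x)."""
    for _ in range(n):
        x = h(x)
    return x

r, n = b"kdc random value", 5
stored = h_chain(r, n)        # S keeps h^n(r) from the initial phase

token = h_chain(r, n - 1)     # U reveals h^{n-1}(r) in S1 of the first run
assert h(token) == stored     # S accepts, then stores the revealed token
```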
5 Conclusion
We have presented a security weakness of Shi et al.'s protocol [17]. They claimed that even if an attacker compromises an old session key $K_{n-i+1}$ shared by the user and the server, a known key attack still fails. Our attack demonstrates that the claim of provable security for Shi et al.'s protocol is incorrect. To remedy this problem, we proposed modifications to the original protocol. Our improved protocols preserve all the desirable security features that the original protocol possesses. Although the modified protocols defeat our attack, provable security must hold against all attacks, not just known ones; work remains to be done to formalize these protocols.
References
1. Smith, J., Weingarten, F.: Report from the Workshop on Research Directions for NGI. Research challenges for the next generation internet (2007)
2. Mitchell, C.: Security for Mobility. IEE Press (2004)
3. Kohl, J., Neuman, C.: The Kerberos network authentication service (v5). Internet Request for Comments 1510 (1993)
4. Bellovin, S., Merritt, M.: Limitations of the Kerberos authentication system. ACM Communications Review 20, 119–132 (1990)
5. Neuman, B.C., Ts'o, T.: An authentication service for computer networks. IEEE Communications 32, 33–38 (1994)
6. Kao, I., Chow, R.: An efficient and secure authentication protocol using uncertified keys. ACM Operating Systems Review 29, 14–21 (1995)
7. Ganesan, R.: Yaksha: augmenting Kerberos with public key cryptography. In: Proceedings of the Symposium on Network and Distributed System Security (SNDSS 1995), pp. 132–143. IEEE Computer Society, Los Alamitos (1995)
8. Fox, A., Gribble, S.: Security on the move: indirect authentication using Kerberos. In: Proceedings of the Second Annual International Conference on Mobile Computing and Networking, pp. 154–164. ACM Press, New York (1996)
9. Sirbu, M., Chuang, J.: Distributed authentication in Kerberos using public key cryptography. In: Proceedings of the Symposium on Network and Distributed System Security, pp. 134–141. IEEE Computer Society, Los Alamitos (1997)
10. Shieh, S., Ho, F., Huang, Y.: An efficient authentication protocol for mobile networks. Journal of Information Science and Engineering 15, 505–520 (1999)
11. Samarakoon, M., Honary, B.: Novel authentication and key agreement protocol for low processing power and systems resource requirements in portable communications systems. IEE Colloquium on Novel DSP Algorithms and Architectures for Radio Systems, pp. 9/1–9/5 (1999)
12. Chien, H., Jan, J.: A hybrid authentication protocol for large mobile networks. Journal of Systems and Software 67, 123–137 (2003)
13. Yacobi, Y.: A key distribution paradox. In: Menezes, A., Vanstone, S.A. (eds.) CRYPTO 1990. LNCS, vol. 537, pp. 268–273. Springer, Heidelberg (1991)
14. Bellare, M., Rogaway, P.: Entity authentication and key distribution. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 232–249. Springer, Heidelberg (1994)
15. Nyberg, K., Rueppel, R.: Weaknesses in some recent key agreement protocols. Electronics Letters 30, 26–27 (1994)
16. Tang, Q., Mitchell, C.: Cryptanalysis of a hybrid authentication protocol for large mobile networks. Journal of Systems and Software 79, 496–501 (2006)
17. Shi, W., Jang, I., Yoo, H.: A provable secure authentication protocol given forward secure session key. In: Zhang, Y., Yu, G., Bertino, E., Xu, G. (eds.) APWeb 2008. LNCS, vol. 4976, pp. 309–318. Springer, Heidelberg (2008)
18. Hwang, R., Su, F.: A new efficient authentication protocol for mobile networks. Computer Standards & Interfaces 28, 241–252 (2005)
Performance of STBC PPM-TH UWB Systems with Double Binary Turbo Code in Multi-user Environments Eun Cheol Kim and Jin Young Kim 447-1, Kwangwoon University, Nowon-Gu, Wolgye-Dong, Seoul, Korea
[email protected],
[email protected]
Abstract. The performance of space-time block code (STBC) pulse position modulation-time hopping (PPM-TH) ultra-wide band (UWB) systems with double binary turbo coding is analyzed and simulated in multi-user environments. The channel is a modified Saleh and Valenzuela (SV) model, which was suggested as a UWB indoor channel model by IEEE 802.15.SG3a in July 2003. In order to apply the STBC scheme to the UWB system considered in this paper, the Alamouti algorithm for real-valued signals is employed, because UWB signals with pulse modulation have a real signal constellation. The system performance is evaluated in terms of bit error probability. From the simulation results, it is demonstrated that the double binary turbo coding technique offers considerable coding gain with reasonable encoding and decoding complexity, and that the performance of the STBC-UWB system can be substantially improved by increasing the number of iterations in the decoding process for a fixed code rate. It is also confirmed that the double binary turbo coding and STBC schemes are very effective in increasing the number of simultaneous users for a given bit error probability requirement. The results of this paper can be applied to implement the down-link of the PPM-TH UWB system. Keywords: Double binary turbo code, modified Saleh and Valenzuela (SV) channel model, multi-user interference, pulse position modulation-time hopping (PPM-TH), space time block code (STBC), ultra-wide band (UWB).
1 Introduction
In recent wireless communications, next generation radio transmission technologies are required to transmit data rapidly with low error probability. In mobile communication environments, the quality of signals is seriously degraded by multi-path fading, propagation loss, noise, interference, and so on. In particular, the multi-path fading effect is one of the important problems to be resolved for high speed data transmission. A multiple-input multiple-output (MIMO) [1,2] technique is one method for overcoming the multi-path fading problem. The main idea is to employ multiple antennas at both the transmitter and the receiver. The capacity of the radio channel can be increased by adopting the MIMO scheme. This scheme is subdivided into three
technologies: space-time trellis coding (STTC) [3], space-time block coding (STBC) [4], and Bell Laboratories layered space-time (BLAST) coding [5]. In addition, loss of information can occur due to noise and interfering signals during transmission between a transmitter and a receiver. Therefore, methods for detecting and correcting the errors that occur in wireless channels are necessary in order to enhance system reliability. One such method is channel coding. There are several channel codes, such as the Reed-Solomon (RS) code, the convolutional code, and their concatenated codes [6-7]. In particular, the turbo code, proposed in 1993 by Berrou et al., has excellent error correction capability, close to the Shannon limit [8] in the additive white Gaussian noise (AWGN) channel. Hence the turbo code has been selected as the standard error correction code in the third generation mobile communication systems WCDMA and CDMA 2000 [9,10]. Furthermore, the double binary turbo code, which enhances error correction performance by changing the structure of the turbo code, has been suggested [11,12]. In conventional turbo coding, only one bit per clock is input to the encoder; in double binary turbo coding, two bits per clock are input. As a result, the minimum distance between code words of the double binary turbo code increases, decoding performance is improved, and processing time decreases because of the higher throughput. In [12], the double binary turbo code shows better error correction performance than the single binary turbo code. The double binary turbo code has been selected as an optional standard of the IEEE 802.16 and digital video broadcasting-return channel satellite (DVB-RCS) systems. In this paper, the performance of STBC ultra-wide band (UWB) systems with double binary turbo coding is analyzed and simulated in multi-user environments. We consider the pulse position modulation-time hopping (PPM-TH) UWB system. The system is simulated in indoor wireless channels, modeled as a modified Saleh and Valenzuela (SV) model proposed as a UWB indoor channel model by IEEE 802.15.SG3a in July 2003 [13]. In the STBC encoding process, the Alamouti scheme for real-valued signals is employed because UWB signals have a real signal constellation. It is assumed that the channel coefficients are constant across two consecutive symbol transmission periods and can be recovered perfectly at the receiver. For the double binary turbo decoding process, the Max-Log-MAP algorithm is employed. The performance is evaluated in terms of bit error probability. The remainder of this paper is organized as follows. In Section 2, the STBC PPM-TH UWB system models, including the transmitter, receiver, and indoor wireless channel models, are described. The double binary turbo code encoding and decoding schemes are introduced in Section 3. In Section 4, the simulation results for the proposed system are presented. Finally, concluding remarks are drawn in Section 5.
2 System Model
The block diagram of the STBC PPM-TH UWB system with the double binary turbo coding scheme considered in this paper is shown in Fig. 1.
Fig. 1. Block diagram of STBC PPM-TH UWB system with double binary turbo code
There are $U$ users. In order to guarantee that there is no correlation effect between each pair of transmit antennas, each transmit antenna is sufficiently separated in space. The digitized information bits of the $u$-th user are encoded by a double binary turbo encoder. Then, the encoder outputs modulate a train of very short pulses, and the results are spread by the TH code in the PPM-TH modulator block. The outputs of the PPM-TH modulator are encoded by the STBC encoder, which utilizes the Alamouti algorithm for real-valued signals. The outputs are transmitted through the indoor wireless channel. The received signals feed into the STBC decoder, and the decoder outputs are demodulated in the PPM-TH demodulator block. After the double binary turbo decoding procedure, the original information is estimated.

2.1 Transmitter Model
The STBC PPM-TH UWB transmitter model of the $u$-th user with double binary turbo code is depicted in Fig. 2.
Fig. 2. Block diagram of the STBC PPM-TH UWB transmitter with double binary turbo code of the $u$-th user

Given the binary sequence $B^u = (\ldots, b_0^u, b_1^u, \ldots, b_j^u, \ldots)$ of the $u$-th user to be transmitted, the double binary turbo encoder encodes the bits with code rate $1/3$ and generates the coded sequence $L^u = (\ldots, l_0^u, l_1^u, \ldots, l_j^u, \ldots) = (\ldots, b_0^u, z_0^u, \bar{z}_0^u, b_1^u, z_1^u, \bar{z}_1^u, \ldots, b_j^u, z_j^u, \bar{z}_j^u, \ldots)$. Then, a code repetition encoder repeats each bit $N_s$ times and generates the binary sequence $A^u = (\ldots, a_0^u, a_1^u, \ldots, a_j^u, \ldots) = (\ldots, l_0^u, l_0^u, \ldots, l_0^u, l_1^u, l_1^u, \ldots, l_1^u, \ldots, l_j^u, l_j^u, \ldots, l_j^u, \ldots)$. This system is an $(N_s, 1)$ block coder and introduces redundancy. A transmission coder then applies an integer-valued code $C^u = (\ldots, c_0^u, c_1^u, \ldots, c_j^u, \ldots)$ to the binary sequence $A^u$ and generates a new sequence $D^u = (\ldots, d_0^u, d_1^u, \ldots, d_j^u, \ldots)$. The code $C^u$ introduces the TH shift on the generated signal. The generic element of the sequence $D^u$ is expressed as

$$d_j^u = c_j^u T_c + a_j^u \varepsilon, \qquad (1)$$

where $T_c$ and $\varepsilon$ are constant terms that satisfy the condition $c_j^u T_c + \varepsilon < T_s$, and $T_s$ is the pulse repetition time, which may be a hundred times the pulse width $T_c$. The coded real-valued sequence $D^u$ enters the PPM modulator, which generates a sequence of unit pulses. The modulator outputs are encoded by the STBC encoder. With the two transmit antennas shown in Fig. 2, the transmission matrix [14] of the $u$-th user for real-valued signals is

$$S^u = \begin{pmatrix} s_0^u & -s_1^u \\ s_1^u & s_0^u \end{pmatrix}, \qquad (2)$$

where $s_0^u$ is the sequence transmitted from antenna 0 and $s_1^u$ is the sequence transmitted from antenna 1. The STBC PPM-TH UWB signal of the $u$-th user with double binary turbo code at the output of the transmitter can be expressed as

$$s_i^u(t) = \sqrt{\frac{E_{TX}^u}{2}} \sum_{j=-\infty}^{\infty} \mu_{i,j}^u\, p(t - jT_s - c_j^u T_c - a_j^u \varepsilon), \qquad (3)$$

where $E_{TX}^u$ is the transmitted energy per pulse of the $u$-th user and $p(t)$ is the energy-normalized impulse response of the pulse shaping filter; $t$ is the clock time of the transmitter and $\mu_{i,j}^u \in \{\pm 1\}$ is the STBC encoding factor of the $j$-th pulse at antenna $i$. The term $a_j^u \varepsilon$ represents the time shift introduced by the modulation, and $a_j^u$ represents the magnitude of the $j$-th transmitted pulse; in the PPM-TH UWB signal, $a_j^u = 1$.
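As a small numerical illustration of (1), the sketch below generates the nominal pulse positions of a PPM-TH frame sequence; the values of $T_s$, $T_c$, $\varepsilon$, $N_h$ and the random TH code are assumptions for the example.

```python
# A small numerical illustration of (1): nominal pulse positions of a PPM-TH
# sequence. Ts, Tc, eps, Nh and the TH code values are assumptions for the
# example, chosen so that c_j*Tc + eps < Ts holds.
import random

Ts, Tc, eps = 100e-9, 1e-9, 0.5e-9   # pulse repetition time, chip time, PPM shift
Nh = 50                              # TH code cardinality

bits = [1, 0, 1, 1]                              # repeated coded symbols a_j
th_code = [random.randrange(Nh) for _ in bits]   # integer-valued TH code c_j

# Pulse position in frame j: j*Ts + d_j, with d_j = c_j*Tc + a_j*eps as in (1).
positions = [j * Ts + c * Tc + a * eps
             for j, (c, a) in enumerate(zip(th_code, bits))]
```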
2.2 Channel Model
In this paper, the modified SV model, which was selected as the UWB indoor multipath channel model by IEEE 802.15.SG3a, is adopted as the indoor wireless channel model. In order to employ the STBC scheme, we define the channel impulse responses from antennas 0 and 1 of the $u$-th user at time $t$ by $h_0^u(t)$ and $h_1^u(t)$, respectively. These can be represented as

$$h_0^u(t) = X_0^u \sum_{n=0}^{N-1} \sum_{k=0}^{K(n)} \alpha_{0nk}^u\, \delta(t - T_{0n}^u - \tau_{0nk}^u), \qquad (4)$$

$$h_1^u(t) = X_1^u \sum_{n=0}^{N-1} \sum_{k=0}^{K(n)} \alpha_{1nk}^u\, \delta(t - T_{1n}^u - \tau_{1nk}^u), \qquad (5)$$

where $X_m^u$ $(m = 0, 1)$ is a log-normal random variable representing the amplitude gain of the channel from antenna $m$ of the $u$-th user, $N$ is the number of observed clusters, $K(n)$ is the number of multipath contributions received within the $n$-th cluster, $\alpha_{mnk}^u$ $(m = 0, 1)$ is the coefficient of the $k$-th multipath contribution of the $n$-th cluster, $T_{mn}^u$ $(m = 0, 1)$ is the time of arrival of the $n$-th cluster, and $\tau_{mnk}^u$ $(m = 0, 1)$ is the delay of the $k$-th multipath contribution within the $n$-th cluster. The channel coefficient $\alpha_{mnk}^u$ can be defined as

$$\alpha_{mnk}^u = p_{mnk}^u \beta_{mnk}^u, \qquad (6)$$

where $p_{mnk}^u$ $(m = 0, 1)$ is a discrete random variable assuming values $\pm 1$ with equal probability and $\beta_{mnk}^u$ $(m = 0, 1)$ is the log-normal distributed channel coefficient of multipath contribution $k$ belonging to cluster $n$. The $\beta_{mnk}^u$ term can be expressed as

$$\beta_{mnk}^u = 10^{x_{mnk}^u / 20}, \qquad (7)$$

where $x_{mnk}^u$ $(m = 0, 1)$ is assumed to be a Gaussian random variable with mean $\mu_{mnk}^u$ and variance $(\sigma_{mnk}^u)^2$. The variable $x_{mnk}^u$ can be further decomposed as

$$x_{mnk}^u = \mu_{mnk}^u + \xi_{mn}^u + \zeta_{mnk}^u, \qquad (8)$$

where $\xi_{mn}^u$ $(m = 0, 1)$ and $\zeta_{mnk}^u$ $(m = 0, 1)$ are two Gaussian random variables that represent the fluctuations of the channel coefficient on each cluster and on each contribution, respectively. In this paper, it is assumed that the impulse responses in (4) and (5) are constant across two consecutive symbol transmission periods. Then, they can be rewritten as

$$h_0^u(t) = h_0^u(t + T) = h_0^u, \qquad (9)$$

$$h_1^u(t) = h_1^u(t + T) = h_1^u, \qquad (10)$$

where $h_m^u$ $(m = 0, 1)$ is the overall channel gain for the path from transmit antenna $m$ to the receive antenna, and $T$ is the symbol duration.
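A compact sketch of drawing one realization of the cluster/ray structure behind (4)-(8) follows; the Poisson arrivals, lognormal gains and random polarity mirror the model, while the exponential mean-power decay is an illustrative simplification of the full IEEE 802.15.SG3a profile (rates and deviations are taken from Table 1 in Section 4).

```python
# One illustrative realization of the cluster/ray structure behind (4)-(8):
# Poisson cluster and ray arrivals, lognormal amplitude, equiprobable polarity.
# The mean-power decay below is a simplification of the full 802.15.SG3a model.
import random

Lam, lam = 0.0233, 2.5        # cluster / ray arrival rates (1/ns), Table 1
Gamma, gamma = 7.1, 4.3       # cluster / ray decay constants (ns), Table 1
sigma_x = 3.3941              # std. dev. (dB) of the lognormal term x_mnk

def sv_taps(n_clusters=4, rays_per_cluster=6):
    taps, T = [], 0.0
    for _ in range(n_clusters):
        T += random.expovariate(Lam)            # cluster arrival T_n
        tau = 0.0
        for _ in range(rays_per_cluster):
            tau += random.expovariate(lam)      # ray delay tau_nk in the cluster
            mean_db = -4.34 * (T / Gamma + tau / gamma)  # 10*log10(e) * decay
            x = random.gauss(mean_db, sigma_x)           # x_mnk, eq. (8)
            beta = 10.0 ** (x / 20.0)                    # beta_mnk, eq. (7)
            p = random.choice((-1.0, 1.0))               # p_mnk, eq. (6)
            taps.append((T + tau, p * beta))             # (delay, alpha_mnk)
    return taps
```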
2.3 Receiver Model
The STBC PPM-TH UWB receiver with double binary turbo code is presented in Fig. 3. The received signal, after passing through the indoor wireless channel, is decoded by the STBC decoder and demodulated by the PPM demodulator. Then, after a despreading process with the TH code identical to that at the transmitter, the output is decoded by the repetition decoder. The original transmitted bits are finally estimated using the double binary turbo decoding process.
Fig. 3. Block diagram of STBC PPM-TH UWB receiver

Fig. 4. Block diagram of the detailed STBC and PPM-TH decoders
Fig. 4 shows the block diagram of the detailed STBC and PPM-TH decoders. Given the channel models of (4) and (5), the received signal at the receiver is the sum of the signals originating from the $U$ transmitters, and can be expressed as

$$r(t) = \sum_{m=0}^{1} \sum_{u=0}^{U-1} \sum_{j=-\infty}^{\infty} \sum_{n=0}^{N-1} \sum_{k=0}^{K(n)} X_m^u \alpha_{mnk}^u \sqrt{\frac{E_{TX}^u}{2}}\, p(t - jT_s - c_j^u T_c - a_j^u \varepsilon - T_{mn}^u - \tau_{mnk}^u - \tau_m^u) + n(t), \qquad (11)$$

where $\tau_m^u$ is the time delay of the $u$-th user and $n(t)$ is additive white Gaussian noise (AWGN) with zero mean and variance $\sigma^2$. If we assume that the receiver is listening to the first transmitter and that the clock time between the first transmitter and the receiver is perfectly synchronized, the time delay $\tau_m^0$ is known by the receiver, and we can set $\tau_m^0 = 0$ given that only relative delays and phases are relevant. Then, the received signal can be rewritten as

$$r(t) = r_w(t) + r_{mui}(t) + n(t), \qquad (12)$$

where $r_w(t)$ and $r_{mui}(t)$ are the wanted signal and the multi-user interference (MUI) at the receiver input. Focusing on a bit time interval $T_b$, $r_w(t)$ and $r_{mui}(t)$ can be expressed as

$$r_w(t) = \sum_{m=0}^{1} \sum_{j=0}^{N_s-1} \sum_{n=0}^{N-1} \sum_{k=0}^{K(n)} X_m^0 \alpha_{mnk}^0 \sqrt{\frac{E_{TX}^0}{2}}\, p(t - jT_s - c_j^0 T_c - a_j^0 \varepsilon - T_{mn}^0 - \tau_{mnk}^0), \qquad (13)$$

$$r_{mui}(t) = \sum_{m=0}^{1} \sum_{u=1}^{U-1} \sum_{j=-\infty}^{\infty} \sum_{n=0}^{N-1} \sum_{k=0}^{K(n)} X_m^u \alpha_{mnk}^u \sqrt{\frac{E_{TX}^u}{2}}\, p(t - jT_s - c_j^u T_c - a_j^u \varepsilon - T_{mn}^u - \tau_{mnk}^u - \tau_m^u), \qquad (14)$$

for $t \in [0, T_b]$. Representing the received signals for the first and second symbol durations as $r_0$ and $r_1$, respectively, we have

$$r_0 = r(t) = \sum_{u=0}^{U-1} \left( h_0^u s_0^u + h_1^u s_1^u \right) + n_0, \qquad (15)$$

$$r_1 = r(t + T) = \sum_{u=0}^{U-1} \left( -h_0^u s_1^u + h_1^u s_0^u \right) + n_1, \qquad (16)$$

where $n_0$ and $n_1$ are independent variables with zero mean and unit variance, representing AWGN samples at times $t$ and $t + T$, respectively. In this paper, it is assumed that the channel coefficients $h_0^u$ and $h_1^u$ can be recovered perfectly at the receiver, and that the signal transmitted from the first transmitter is the wanted signal. Therefore, the combiner combines the received signals as

$$\tilde{s}_0^0 = h_0^0 r_0 + h_1^0 r_1 = \left[ (h_0^0)^2 + (h_1^0)^2 \right] s_0^0 + \sum_{u=1}^{U-1} \left\{ (h_0^0 h_0^u + h_1^0 h_1^u) s_0^u + (h_0^0 h_1^u - h_1^0 h_0^u) s_1^u \right\} + h_0^0 n_0 + h_1^0 n_1, \qquad (17)$$

$$\tilde{s}_1^0 = h_1^0 r_0 - h_0^0 r_1 = \left[ (h_0^0)^2 + (h_1^0)^2 \right] s_1^0 + \sum_{u=1}^{U-1} \left\{ (h_1^0 h_0^u - h_0^0 h_1^u) s_0^u + (h_1^0 h_1^u + h_0^0 h_0^u) s_1^u \right\} + h_1^0 n_0 - h_0^0 n_1. \qquad (18)$$

The combiner outputs $\tilde{s}_0^0$ and $\tilde{s}_1^0$ are then correlated with a correlation mask $m(t)$, which is defined as

$$m(t) = \sum_{j=0}^{N_s-1} v(t - jT_s - c_j^0 T_c), \qquad (19)$$

$$v(t) = p(t) - p(t - \varepsilon). \qquad (20)$$
The correlation mask is unique for each user; therefore, the interfering signals can be eliminated from the received signals. The outputs of the correlator, $z_0$ and $z_1$, are compared with a threshold value, producing $\hat{s}_0^0$ and $\hat{s}_1^0$, which are then double binary turbo decoded. Finally, the original sequences are restored using the Max-Log-MAP algorithm in the turbo decoding procedure.
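For a single user with perfect channel knowledge, the combining rule of (17) and (18) reduces to the scalar computation sketched below, which recovers each symbol scaled by $(h_0^0)^2 + (h_1^0)^2$; the numeric values are arbitrary.

```python
# Scalar sanity check of the combining rule (17)-(18) for a single user with
# perfect channel knowledge; values are arbitrary.
def alamouti_combine(r0, r1, h0, h1):
    s0_tilde = h0 * r0 + h1 * r1    # -> (h0**2 + h1**2) * s0  (plus noise terms)
    s1_tilde = h1 * r0 - h0 * r1    # -> (h0**2 + h1**2) * s1  (plus noise terms)
    return s0_tilde, s1_tilde

h0, h1, s0, s1 = 0.9, -0.4, 1.0, -1.0
r0 = h0 * s0 + h1 * s1              # first symbol period, as in (15) with U = 1
r1 = -h0 * s1 + h1 * s0             # second symbol period, as in (16) with U = 1
print(alamouti_combine(r0, r1, h0, h1))   # both equal (h0**2 + h1**2)*(s0, s1)
```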
3 Double Binary Turbo Code

3.1 Encoder
The double binary turbo encoder [11,12] is shown in Fig. 5, where S1 , S 2 , and S3 indicate the shift registers. The data sequence to be encoded, made up of W information bits, feeds the circular recursive systematic convolutional (CRSC) encoder twice.
Fig. 5. Encoder structure of double binary turbo code
The data sequence is first encoded in the natural order, with the switch in position 1, and then in an interleaved order given by the time permutation block, the convolutional turbo code (CTC) interleaver, with the switch in position 2. The encoder is fed by blocks of $W$ bits, i.e., $F$ couples, with $W = 2 \times F$ bits; $F$ is a multiple of 4, so $W$ is a multiple of 8. The most significant bit (MSB) of the first byte after the burst preamble is assigned to $A$, the next bit to $B$, and so on for the remainder of the burst content. In order to perform a complete encoding operation of the data sequence, two circulation states have to be determined, one for each component encoder, so the sequence has to be encoded four times instead of twice. For each data couple, the encoded code word comprises two systematic bits that are a copy of the input pair (X1 and X2) and four parity bits (Y1, W1, Y2, and W2) for the natural and the interleaved order, respectively. A sketch of one encoder step is given below.
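The sketch shows one clock of a duo-binary recursive systematic convolutional step with three registers, as in Fig. 5; the feedback and parity taps here are placeholders chosen for the illustration and are not the exact polynomials of the IEEE 802.16 / DVB-RCS code.

```python
# One clock of an illustrative duo-binary recursive systematic convolutional
# step: two input bits (a, b), three registers (S1, S2, S3) as in Fig. 5, two
# parity bits out. NOTE: the feedback/parity taps are placeholders for the
# sketch, not the exact polynomials of the IEEE 802.16 / DVB-RCS code.
def crsc_step(a, b, state):
    s1, s2, s3 = state
    fb = a ^ b ^ s2 ^ s3          # recursive feedback (illustrative taps)
    y = fb ^ s1 ^ s3              # parity Y (illustrative taps)
    w = fb ^ s1                   # parity W (illustrative taps)
    return (y, w), (fb, s1, s2)   # systematic couple (a, b) is sent unchanged

state = (0, 0, 0)
for couple in [(1, 0), (0, 1), (1, 1)]:
    (y, w), state = crsc_step(couple[0], couple[1], state)
```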
3.2 Decoder
Fig. 6 shows the structure of the double binary turbo decoder [15]. The systematic information is the channel value of the information symbols $d_w \in \{00, 01, 10, 11\}$. Parity 1 and parity 2 are the channel values of the encoder parity outputs. $L_i(\hat{d}_w)$ is the log-likelihood ratio (LLR) of the a posteriori probability (APP) for $i = 1, 2, 3$, and $L_{ei}(\hat{d}_w)$ is the extrinsic information. In the double binary turbo decoder, the sequential input bits are divided into the information and the parity bits through the trellis MUX. The information and parity bits, together with the a priori information produced by the soft-input soft-output (SISO) decoder, are used in the decoding procedure. The results of the decoding process are compared with the previously decoded results, and the decoding procedure is repeated to increase the reliability of the decoding outputs. After some iterations, the final values are determined by soft decision.

Fig. 6. Decoder structure of double binary turbo code
In this paper, the sub-optimal Max-Log-MAP algorithm is considered for the double binary turbo decoding because of its low computational complexity, high throughput, and low power consumption [16]. Extrinsic information coupling for the feedback is performed according to [17]. $L_i(\hat{d}_w)$ is the LLR of the APP for $i = 1, 2, 3$ and $L_{ei}(\hat{d}_w)$ is the extrinsic information. First, according to the decoding rule, the logarithm of the branch transition probability is found as

$$\bar{\gamma}_w^i(S_{w-1}, S_w) = \ln \gamma_w^i(S_{w-1}, S_w) = \ln p(y_w \mid d_w) P(d_w), \qquad (21)$$

where $S_w$ is the encoder state at time $w$, $y_w$ is the received symbol, and $d_w$ is the information symbol. As shown in [9], the result of (21) is

$$\bar{\gamma}_w(S_{w-1}, S_w) = \frac{1}{2} L_C \left[ y_w^{s,I} x_w^{s,I}(i) + y_w^{s,Q} x_w^{s,Q}(i) \right] + \ln P(d_w) + \frac{1}{2} L_C \left[ y_w^{p,I} x_w^{p,I}(i, S_{w-1}, S_w) + y_w^{p,Q} x_w^{p,Q}(i, S_{w-1}, S_w) \right] + W, \qquad (22)$$

where $y_w^{s,I}$, $y_w^{s,Q}$, $y_w^{p,I}$, and $y_w^{p,Q}$ represent the received systematic and parity values transmitted through the I and Q channels, respectively, $x_w^{s,I}(i)$, $x_w^{s,Q}(i)$, $x_w^{p,I}(i, S_{w-1}, S_w)$, and $x_w^{p,Q}(i, S_{w-1}, S_w)$ represent the bits of the codeword mapped to the quadrature phase shift keying (QPSK) constellation, and $W$ is a constant. In this paper, PPM is used as the modulation scheme instead of QPSK, so there is no data transmitted through I and Q channels, and double binary turbo coding cannot be applied to UWB systems directly. In order to employ the double binary turbo code, it is assumed that the coded sequence of the odd positions is transmitted through the I channel and that of the even positions through the Q channel. Next, the values of $\bar{\alpha}_w(S_w)$ and $\bar{\beta}_w(S_w)$, which are yielded by the forward and backward recursions, respectively, are computed with the max-function:

$$\bar{\alpha}_w(S_w) \approx \max_{S_{w-1}} \left[ \bar{\gamma}_w(S_{w-1}, S_w) + \bar{\alpha}_{w-1}(S_{w-1}) \right], \qquad (23)$$

$$\bar{\beta}_{w-1}(S_{w-1}) \approx \max_{S_w} \left[ \bar{\gamma}_w(S_{w-1}, S_w) + \bar{\beta}_w(S_w) \right]. \qquad (24)$$

The LLR value is given by

$$L_i(d_w) = \ln \frac{\displaystyle\sum_{(S_{w-1}, S_w):\, d_w = i} \gamma_w^i(S_{w-1}, S_w)\, \alpha_{w-1}(S_{w-1})\, \beta_w(S_w)}{\displaystyle\sum_{(S_{w-1}, S_w):\, d_w = 0} \gamma_w^0(S_{w-1}, S_w)\, \alpha_{w-1}(S_{w-1})\, \beta_w(S_w)}, \qquad (25)$$

where $i = 1, 2, 3$. The extrinsic information can be calculated as

$$L_{ei}(\hat{d}_w) = \ln \frac{\displaystyle\sum_{(S_{w-1}, S_w):\, d_w = i} \gamma_w^{i(e)}(S_{w-1}, S_w)\, \alpha_{w-1}(S_{w-1})\, \beta_w(S_w)}{\displaystyle\sum_{(S_{w-1}, S_w):\, d_w = 0} \gamma_w^{0(e)}(S_{w-1}, S_w)\, \alpha_{w-1}(S_{w-1})\, \beta_w(S_w)}. \qquad (26)$$

$L_{ei}(\hat{d}_w)$ of SISO decoder 1 is the a priori information of SISO decoder 2, and the decoding in SISO decoder 2 is performed by the same method as in SISO decoder 1; $L_{ei}(\hat{d}_w)$ of SISO decoder 2 is in turn the a priori information of SISO decoder 1. After several decoding iterations, the soft decisions are made according to

$$\hat{d}_w = \begin{cases} 01 & \text{if } L(\hat{d}_w) = L_1(\hat{d}_w) \text{ and } L_1(\hat{d}_w) > 0 \\ 10 & \text{if } L(\hat{d}_w) = L_2(\hat{d}_w) \text{ and } L_2(\hat{d}_w) > 0 \\ 11 & \text{if } L(\hat{d}_w) = L_3(\hat{d}_w) \text{ and } L_3(\hat{d}_w) > 0 \\ 00 & \text{otherwise,} \end{cases} \qquad (27)$$

where $L(\hat{d}_w) = \max(L_1(\hat{d}_w), L_2(\hat{d}_w), L_3(\hat{d}_w))$.
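The decision rule (27) transcribes directly into code; the example values are arbitrary.

```python
# A direct transcription of the soft decision rule (27); inputs are the three
# APP log-likelihood ratios.
def soft_decision(L1, L2, L3):
    L = max(L1, L2, L3)
    if L == L1 and L1 > 0:
        return "01"
    if L == L2 and L2 > 0:
        return "10"
    if L == L3 and L3 > 0:
        return "11"
    return "00"

print(soft_decision(-0.2, 1.7, 0.4))   # -> "10"
```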
4 Simulation Results
In this section, the performance of the STBC PPM-TH UWB system employing the double binary turbo code is simulated in multi-user environments. To verify the performance of the system considered in this paper, its bit error probability is tested. In the PPM modulation, the number of pulses per bit is set to one, and the average transmit power is set at -30 dBm. We consider an STBC system equipped with two transmit antennas and one receive antenna. In the double binary turbo coding, the code rate is $1/3$. It is assumed that the transmitter and receiver are in a line of sight (LOS) environment, with a distance of two meters between them, and that there is no inter-symbol interference (ISI). The parameters of the modified SV channel model for Case A are listed in Table 1.
Table 1. Parameter settings for Case A of the modified SV channel model

Scenario               Λ (1/ns)   λ (1/ns)   Γ     γ     σξ (dB)   σζ (dB)   σg (dB)
Case A, LOS (0-4 m)    0.0233     2.5        7.1   4.3   3.3941    3.3941    3
In Fig. 7, the bit error probability versus $E_X/N_0$ performance of the proposed system, where $E_X$ is the energy received within a single pulse, is presented for the double binary turbo code and STBC schemes in multi-user environments. The number of users is set at 10 and the number of iterations for the double binary turbo decoding process is set to 1. The single-input single-output (SISO) and uncoded cases are also shown for performance comparison. As expected, these results show that the double binary turbo code offers considerable coding gain as $E_X/N_0$ increases, compared with the uncoded cases. It is also confirmed that the STBC scheme yields a greater performance enhancement than the SISO scheme. Therefore, the proposed system employing the double binary turbo code and STBC outperforms the other systems in multi-user environments. Fig. 8 shows the bit error probability versus $E_X/N_0$ performance of the proposed system for several different numbers of iterations of the Max-Log-MAP decoding process in multi-user environments. The number of users is set at 10.

Fig. 7. Bit error probability versus $E_X/N_0$ performance of the proposed system for the double binary turbo code and STBC schemes (curves: SISO/MIMO, uncoded/coded; $U$ = 10 users and 1 decoding iteration)
Fig. 8. Bit error probability versus $E_X/N_0$ performance of the proposed system for different numbers of decoding iterations (1 to 3; $U$ = 10 users)
Fig. 9. Bit error probability versus number of users of the proposed system for several values of $E_X/N_0$ (SISO and MIMO at $E_X/N_0$ = 3, 6, 9 dB; 1 decoding iteration)
It is seen that the system performance is significantly improved by increasing the number of iterations. However, once the number of iterations exceeds a certain value, further iterations offer only marginal additional coding gain, as one would expect, since the useful information is progressively gleaned from the observations. In Fig. 9, the bit error probability versus the number of users of the proposed system is shown for several values of $E_X/N_0$, with the number of iterations set to 1. These results show that increasing multi-user interference has a significant impact on the system performance, as one would expect.
5 Conclusions
In this paper, we have analyzed and simulated the performance of STBC PPM-TH UWB systems employing double binary turbo coding in multi-user environments. From the simulation results, we have seen that double binary turbo coding offers considerable coding gain over the uncoded system with reasonable encoding/decoding complexity, and that this performance gradually improves as the number of iterations increases. The STBC scheme also enhances the system performance, and the proposed system performs well even in multi-user environments. It has been shown that the double binary turbo coding and STBC schemes can be very effective in increasing the number of supportable users for a given bit error probability requirement. The results of this paper can be applied to implement the down-link of PPM-TH UWB systems.
Acknowledgement
This research was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement) (IITA-2009-C1090-0902-0005).
References
1. Telatar, E.: Capacity of Multi-antenna Gaussian Channels. European Transactions on Telecommunications 10(6), 585–595 (1999)
2. Foschini, G.J., Gans, M.J.: On Limits of Wireless Communications in a Fading Environment When Using Multiple Antennas. Wireless Personal Communications 6, 311–335 (1998)
3. Tarokh, V., Seshadri, N., Calderbank, A.R.: Space-Time Codes for High Data Rate Wireless Communication: Performance Criterion and Code Construction. IEEE Transactions on Information Theory 2, 744–765 (1998)
4. Alamouti, S.M.: A Simple Transmit Diversity Technique for Wireless Communications. IEEE Journal on Selected Areas in Communications 16(8), 1451–1458 (1998)
5. Foschini, G.J.: Layered Space-Time Architecture for Wireless Communications in a Fading Environment When Using Multiple Antennas. Bell Laboratories Technical Journal 6(2), 41–59 (1996)
6. Wicker, S.B.: Error Control Systems for Digital Communication and Storage. Prentice-Hall, Englewood Cliffs (1995)
7. Lin, S., Costello, D.J.: Error Control Coding. Prentice-Hall (2004)
8. Berrou, C., Glavieux, A., Thitimajshima, P.: Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes 1. In: IEEE ICC 1993, Geneva, Switzerland (1993)
9. 3GPP2, C.S0002-D, v.2.0 (2005), http://www.3gpp2.org
10. 3GPP, TS36.212, v.1.3.0 (2007), http://www.3gpp.org
11. Douillard, C., Berrou, C.: Turbo Codes with Rate-m/(m+1) Constituent Convolutional Codes. IEEE Transactions on Communications 53(10), 1630–1638 (2005)
12. Berrou, C., Douillard, C., Jezequel, M.: Multiple Parallel Concatenation of Circular Recursive Systematic Convolutional (CRSC) Codes. Annals of Telecommunications 54(3-4) (1999)
13. IEEE 802.15.SG3a: Channel Modeling Sub-Committee Report Final. IEEE P802.15-02/490r1-SG3a (2003)
14. Tarokh, V., Jafarkhani, H., Calderbank, A.R.: Space-Time Block Coding for Wireless Communications: Performance Results. IEEE Journal on Selected Areas in Communications 17(3), 451–460 (1999)
15. Soleymani, M.R., Gao, Y., Vilaipornsawai, U.: Turbo Coding for Satellite and Wireless Communications. Kluwer Academic Publishers, Dordrecht (2002)
16. Robertson, P., Hoeher, P., Villebrun, E.: Optimal and Sub-Optimal Maximum a Posteriori Algorithms Suitable for Turbo Decoding. European Transactions on Telecommunications 8(2), 119–125 (1997)
17. Hagenauer, J., Offer, E., Papke, L.: Iterative Decoding of Binary Block and Convolutional Codes. IEEE Transactions on Information Theory 42(2), 429–445 (1996)
Performance Evaluation of PN Code Acquisition with Delay Diversity Receiver for TH-UWB System Eun Cheol Kim and Jin Young Kim 447-1, Kwangwoon University, Nowon-Gu, Wolgye-Dong, Seoul, Korea
[email protected],
[email protected]
Abstract. This paper evaluates pseudo-noise (PN) code acquisition with a delay diversity receiver for a time hopping-ultra wideband (TH-UWB) system. The detection, overall miss detection, and false alarm probabilities, and the mean acquisition time, are evaluated under the hypothesis of multiple synchronous states ($H_1$ cells) in the uncertainty region of the PN code. The code acquisition performance is evaluated when the correlator outputs are non-coherently combined using an equal gain combining (EGC) scheme. A constant false alarm rate (CFAR) criterion is applied to the threshold setting rule. From the simulation results, it is demonstrated that the proposed acquisition system with delay diversity receiver achieves a remarkable diversity gain with reasonable complexity. The proposed acquisition scheme can be applied to UWB base stations in order to enhance the acquisition performance with low complexity. Keywords: Delay diversity receiver, equal gain combining (EGC), mean acquisition time, pseudo-noise (PN) code acquisition, time hopping-ultra wideband (TH-UWB) system.
1 Introduction
It is well known that diversity techniques use additional independent signal paths to increase the signal-to-noise ratio (SNR) of the received signal [1]. Therefore, system performance can be improved with several diversity schemes, such as time, frequency, transmit, and receive diversity, in fading environments [2]. Receive diversity is usually considered in conventional wireless communication systems; however, the size and complexity of hand-held devices pose problems. As mobile communication becomes widespread, transmit diversity has attracted much attention, and the delay diversity transmit (DDT) technique has been suggested for its merit of low complexity [3]. If complexity is a critical issue, it is a practical solution to apply DDT in designing a transmitter. Furthermore, the delay diversity receiver (DDR) scheme has been previously studied with the motivation that it can increase the diversity order with low complexity [4,5]. In particular, the authors of [4] investigated the benefit of a DDR with spatially separated polarized antennas in CDMA cellular systems. In this paper, the PN code acquisition performance with a delay diversity receiver, which can increase the diversity order with low complexity, is evaluated for a time
hopping-ultra wideband (TH-UWB) system. The channel is modeled as a frequency selective lognormal fading channel [6]. Before code acquisition is achieved, the receiver has no timing phase information on the received signals, and non-coherent combining schemes can be applied without such information. Therefore, in our analysis, a non-coherent equal gain combining (EGC) scheme is applied to collect the energies available in the multipath components. In almost all practical PN code acquisition systems, more than two synchronous cells may exist in the uncertainty region of the search process due to multipath effects [7,8]. It is therefore assumed that there are multiple synchronous cells in the uncertainty region of the PN code. The closed-form formula for the conditional probability density function (PDF) of the decision variable is derived when a signal with Gaussian distribution passes through the lognormal fading channel. The performance is analyzed by deriving formulas for the detection, false alarm, and miss detection probabilities, and the mean acquisition time of the proposed system. The remainder of this paper is organized as follows. In Section 2, the proposed system is described. The performance of the proposed system is analyzed in Section 3, along with comparisons to the system using a conventional diversity receiver. The simulation results for the proposed system are presented in Section 4, and concluding remarks are given in Section 5.
2 System Description
The block diagram of the conventional $L$-branch diversity receiver is shown in Fig. 1. In order to guarantee that the signals between each pair of receiver antennas fade independently, each receiver antenna is sufficiently separated in space. It is assumed that the search step size is $T_c/2$. All correlators are associated with the same
phase of the local despreading code, and the outputs $X_l$, $l = 0, 1, \ldots, L-1$, of the $L$ correlators are combined into one decision variable $V$; this combining scheme is EGC. The decision variable $V$ is then compared to a threshold $T$ in order to decide whether the codes are aligned. The threshold value $T$ is determined using the cell averaging-constant false alarm rate (CA-CFAR) algorithm. The search mode employs a serial search strategy. Whenever the decision variable $V$ exceeds the threshold $T$, the system decides that the corresponding delay of the locally generated PN sequence is the correct one and enters the verification mode. If $V$ does not exceed $T$, the phase of the locally generated PN sequence is rejected; another phase is then chosen, the decision variable is updated, and the above operation is repeated. This conventional structure requires additional hardware units as the number of diversity branches increases, so the complexity grows. The proposed receiver employs a delay diversity scheme, which can increase the diversity order with low complexity. The proposed $L$-branch DDR is shown in Fig. 2. The received signal from one of the two antenna elements is intentionally delayed by a predetermined value $\Delta$. In order to avoid overlap with the intentionally delayed signals from the delay branches, $\Delta$ should be larger than the maximum excess path delay of the meaningful multipath signals from the normal branches. In this receiver, half of the $L$ branches are the normal branches and the
Fig. 1. Conventional $L$-branch diversity receiver ($L$ = 4)
Fig. 2. Proposed $L$-branch DDR ($L$ = 4)
remaining $L/2$ branches are the delay branches, so the proposed DDR has $L$ diversity branches. The signals of the normal and delay branches are input to the correlators in sequence. The correlator outputs are combined into one decision variable $V$ using the EGC scheme, and $V$ is compared with the threshold $T$ in order to determine whether the codes are aligned. The proposed receiver does not require additional hardware units or space compared with the conventional diversity receiver shown in Fig. 1; therefore, the proposed receiver reduces complexity.
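The per-cell test and serial search described above can be sketched as follows; the container shapes and names are illustrative, not from the paper.

```python
# A sketch of the EGC decision per search cell and the serial search loop;
# names and data shapes are illustrative.
def test_cell(correlator_outputs, threshold):
    """EGC: V = sum of the branch correlator outputs, compared with T."""
    return sum(correlator_outputs) >= threshold

def serial_search(cells, threshold):
    """Step the local PN phase (Tc/2 per step) until a cell passes the test."""
    for phase, outputs in enumerate(cells):
        if test_cell(outputs, threshold):
            return phase          # candidate H1 cell -> verification mode
    return None                   # in practice, wrap around and continue
```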
3 Performance Analysis

3.1 Conventional $L$-Branch Diversity Receiver
It’s assumed that the combining scheme is non-coherent equal gain combining. And we consider an array of L identical receiving antennas, sufficiently separated in space to eliminate correlation between antenna elements. The input signal to each receiver antenna is corrupted by additive white Gaussian noise (AWGN) with twosided power spectral density of N 0 / 2 . The received signal in the multiuser multipath environment at the l th , l = 0, 1, ..., L − 1 , receiver antenna, rl (t ) , is given by rl (t ) =
U −1 G −1
∑∑α
u u l , g S tr (t
− gTs ) + nl (t ) ,
(1)
u =0 g =0
where α l,u g represents the attenuation over the propagation path of the signal, and
Stru (t − gTs ) is the transmitted signal from the u th user. nl (t ) represents AWGN with zero mean and variance σ nl2 . Since the distances between transmitter antenna and each pair of receiver antenna are approximately same all over l , it can be assumed that all the signals arrive at L receiver antennas simultaneously. Therefore, the phase offset of the PN code is common for all over the L antennas. Since the fading characteristics of different receiver antennas are assumed to be mutually independent, α l,u g is an independently and identically distributed (i.i.d.) lognormal random variable with a PDF [6], when α l,u g is greater than or equal to zero,
( )
f α lu, g =
( )
1
( )
α lu, g 2π σ lu, g
( ) , and μ
2 where E ⎡ α lu, g ⎤ = σ lu, g ⎢⎣ ⎥⎦
−
2
u l, g
2
(ln α
e
u l ,g
− μ lu, g
( )
2 σ lu, g
)
2
2
,
(2)
and σ l,u g are mean and standard deviation of
ln α l,u g , respectively. The correlator output X l of Fig. 1 can be expressed as Xl =
∫
( j +1)T f jT f
u rl (t ) S rec (t ) dt
,
(3)
= R Xl + N Xl where
u Srec (t )
u Srec (t ) = P
∞
is
the
receiver
template
signal
expressed
as
N f −1
∑ ∑W
rec [t
j = −∞ k = 0
− ( jN f + k )T f − Ω(u, k )Tc − δD j ]
,
and
230
RXl =
E.C. Kim and J.Y. Kim U −1 G −1
∑∑α ∫ u l, g
u =0 g =0
( j +1)T f jT f
u Stru , m (t − gTs ) Srec (t )dt . N Xl =
∫
( j +1)T f
jT f
u nl (t ) S rec (t ) dt denotes
2 . an AWGN with zero mean and variance σ Xl After EGC, the decision variable V can be expressed as
L −1
∑X
V =
.
l
(4)
l =0
Then, the conditional PDF of V associated with a H1 cell can be expressed as
fV (v H1, MV ) =
where MV =
L−1
∑
μ Xl , σ V2 =
l =0
L−1
∑σ
2 Xl
−
1
(v−MV )2
e
2πσV2
2σV2
,
(5)
, and μ Xl is mean of R Xl .
l =0
As fading coefficients for different paths are independent from one another, M V is a sum of independent lognormal random variables (RVs). In this paper, the Wilkinson’s method [9] is applied to derive the PDF for a sum of independent lognormal RVs. Then, when M V is greater than or equal to zero, the approximated PDF of M V is obtained as f (MV ) =
−
1 MV 2πσY2
(ln MV −μY )2 2σ Y2
e
.
(6)
The conditioning in (5) may be removed by using fV ( v H 1 ) =
= where ln MV = Ω and
Ω − μY 2σ Y
∫
∞
−∞
fV (v H1 , M V ) f ( M V ) dM V
1
∫
2πσ V =Φ
∞
⎛⎜ v − e ( −⎝
2σY Φ+ μY
) ⎞⎟ 2
e
,
⎠
2σ V2
(7)
e − Φ dΦ 2
−∞
.
The integral expression in (7) is efficiently and accurately evaluated using GaussHermite quadrature formula [10]. Then, fV (v H1) =
∫
∞
−∞ I
≈
∑ i =1
f (Φ)e−Φ dΦ 2
hΦi 2πσV
⎛⎜ v −e( −⎝
e
2σY Φi +μY
2σ V2
) ⎞⎟2 , ⎠
(8)
Performance Evaluation of PN Code Acquisition with Delay Diversity Receiver
where f (Φ ) =
1 2πσ V
⎛⎜ v − e ( −⎝
e
2σ Y Φ + μY
231
) ⎞⎟ 2 ⎠
2σ V2
, I is the quadrature order, which determines
approximation accuracy, Φ i , i = 1, 2, ..., I , are the zeros of the I th order quadrature polynomial, and hΦ i are the Gauss-Hermite quadrature weight factors tabulated in e.g. [10]. The PDF of a sample
H 0 is fV (v H0 ) =
−
1 2πσV2
e
v2 2σV2
.
(9)
The detection probability for a given value of the decision threshold is defined as the probability of the event that the output decision variable corresponding to an H1 cell exceeds the decision threshold T , which can be obtained by
∫
PD =
∞
T
fV (v H1 ) dv ,
(10)
where PD represents the detection probability of an H1 cell. Upon substituting (8) into the above equation, it can be derived after some algebra that
PD = =
I
1 2πσ V
∑h ∫ i =1
I
∑h π
1
i =1
Φi
∞
⎛⎜ v − e ( −⎝
e
T
2σ Y Φi + μY
2σ V2
) ⎞⎟ 2 ⎠
dv .
(11)
⎛ T − A(Φ i ) ⎞ ⎟ ⎟ ⎝ σV ⎠
Φ i Q⎜ ⎜
where e( 2σY Φi +μY ) = A(Φi ) and Q(⋅) is the standard normal complementary cumulative distribution function (CDF) [11]. The threshold value is determined from the false alarm probability, PFA , associated with an H 0 cell. The false alarm probability is defined as the probability of the event that the output decision variable corresponding to an H 0 cell exceeds the decision threshold, which can be expressed as
PFA =
∫
∞
T
fV (v H 0 ) dv
⎛ T = Q⎜⎜ ⎝ σV
⎞ ⎟ ⎟ ⎠
.
(12)
If the number of H1 cells is λ , the overall miss detection probability, PM (λ ) , of a search over the full uncertainty region can be expressed as [7]
232
E.C. Kim and J.Y. Kim
PM (λ ) =
λ
∏ (1 − P ) .
(13)
D
n =1
A generalized expression of the asymptotic equation for the mean acquisition time under the multiple H1 -cell hypothesis is given by [7] T acq =
[1 + PM (λ )] (1 + JPFA ) (qτ ) , D 2[1 − PM (λ )]
(14)
where q is the total number of states in the uncertainty region of the PN sequence. In the approximation, it is assumed that q >> 1 . And Jτ D represents the ‘penalty time’ associated with determining that there is a false alarm and with re-entering the search mode. 3.2 Proposed
L -Branch Delay Diversity Receiver
Each correlator processes two kinds of signals. One is from the normal branch and the other is from the delay branch. Therefore, the l th correlator output is given by Xl =
∫
( j + 1) T f jT f
⎡ U − 1 G −1 =⎢ ⎢⎣ u = 0 g = 0
∑∑∫
+
[
∫
( j +1 ) T f jT f
[r
Normal 2l
( j + 1) T f jT f
(α
(α
u , Delay 2 l +1, g
]
u ( t ) + r2Delay l + 1 ( t + Δ ) S rec ( t ) dt
u , Normal 2l , g
)
u S tru , Normal ( t − gT s ) + n 2Normal (t ) S rec ( t ) dt l
, (15)
)
⎤ u S tru , Delay ( t − gT s − Δ ) + n 2Delay l + 1 ( t − Δ ) S rec ( t ) dt ⎥ ⎦
] [
Delay = R XNormal + N XNormal + R XDelay ,2l ,2l , 2 l +1 + N X , 2 l +1
]
where r2Normal (t ) and r2Delay l l +1 (t + Δ ) are the received signals at the normal and delay and α 2ul,+Delay branches, respectively. Each component of α 2ul,,Normal g 1, g is i.i.d. lognormal random variables of the normal and delay branches, respectively. n2Normal (t ) , l Normal n2Delay , and N XDelay l +1 (t − Δ ) , N X , 2l , 2l +1 are AWGN with zero mean and variance 2 2 2 2 σ Normal , σ Delay , σ Normal , X , 2l , and σ Delay , X , 2 l +1 , respectively.
When there are H1 and H 0 cells the PDF of X l can be expressed as follows. f X l ( x H1, μ Xl ) =
f X l (x H 0 ) =
1 2 2πσ Xl
−
( x−μ Xl )2 2 2σ Xl
e
1 2 2πσ Xl
−
e
,
(16)
x2 2 2σ Xl
,
(17)
Performance Evaluation of PN Code Acquisition with Delay Diversity Receiver
233
Normal where μ XNormal and μ XDelay and R XDelay , 2l , 2l +1 are mean values of R X , 2l , 2l +1 , respec2 2 2 Normal = σ Normal + μ XDelay tively. σ Xl , X , 2l + σ Delay , X , 2l +1 and μ xl = μ X , 2l , 2l +1 .
After EGC, the decision variable V is expressed as

$$V = \sum_{l=0}^{L/2-1} X_l . \qquad (18)$$
Therefore, the conditional PDFs of V when H_1 and H_0 cells are tested are given by

$$f_V(v \mid H_1, M_V) = \frac{1}{\sqrt{2\pi\sigma_V^2}}\, e^{-\frac{(v-M_V)^2}{2\sigma_V^2}} , \qquad (19)$$

$$f_V(v \mid H_0) = \frac{1}{\sqrt{2\pi\sigma_V^2}}\, e^{-\frac{v^2}{2\sigma_V^2}} , \qquad (20)$$

where $M_V = \sum_{l=0}^{L/2-1} \mu_{X_l}$ and $\sigma_V^2 = \sum_{l=0}^{L/2-1} \sigma_{X_l}^2$.
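The Gaussian form of (19) can be sanity-checked with a short Monte Carlo sketch of the EGC combiner (18); the per-branch means and variances below are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def egc_decision_samples(mu_x, sigma_x, trials=100_000):
    """Monte Carlo sketch of (18): draw the branch outputs X_l from the
    Gaussians of (16) and sum them into V."""
    x = rng.normal(mu_x, sigma_x, size=(trials, len(mu_x)))
    return x.sum(axis=1)

v = egc_decision_samples(mu_x=np.array([1.0, 1.0]), sigma_x=np.array([0.5, 0.5]))
T = 1.5
print("empirical P_D:", np.mean(v > T))  # compare with Q((T - M_V)/sigma_V) from (19)
```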
The PDFs of the decision variable V in (19) and (20) are similar to those of the conventional diversity receiver except for the forms of the mean and variance, so the system performance measures can be expressed as in (10), (12), (13), and (14).
4 Simulation Results

In this section, the code acquisition performance of the proposed system using the delay diversity receiver is evaluated and compared with that of the conventional system. To verify the performance of the proposed system, its detection and overall miss detection probabilities and mean acquisition time are tested using various system parameters in a frequency-selective lognormal fading channel. Since a serial search application is considered, we use a PN sequence of length 1023. Therefore, q, which determines the length of the uncertainty region, is 2046, since the search step size is assumed to be half of the chip duration. For convenience, the normalized mean acquisition time, which is derived from (14) divided by τ_D, is considered. For the analysis, the false alarm rate is set at 0.001 and the penalty time constant, J, is set at 1000. In Fig. 3, the detection probability versus SNR per chip performance is shown in accordance with the number of receive antennas. The proposed DDR achieves remarkable gains over the conventional diversity receiver for comparable complexity. For a detection probability of 0.8, the gain of the 4-branch DDR over the conventional 2-branch diversity receiver is about 2 dB with almost the same complexity. Meanwhile, the gain of the conventional 4-branch diversity receiver over the conventional 2-branch diversity receiver is about 2.5 dB. The performance difference between the conventional and proposed 4-branch receivers is about 0.5 dB. Although the conventional
Fig. 3. Detection probability versus SNR per chip performance
4-branch diversity receiver has better detection performance than any other receiver, it requires considerable complexity. However, the proposed 4-branch DDR shows performance similar to the conventional 4-branch diversity receiver with about half the complexity. The saved resources can be used to enhance the capacity or coverage of the system.
Fig. 4. Mean acquisition time versus SNR per chip performance
Fig. 5. Overall miss detection probability versus SNR per chip performance
Fig. 4 shows the normalized mean acquisition time versus SNR per chip performance with various numbers of receive antennas. For a normalized mean acquisition time of 0.5 × 10^4, the gap of the required SNR between the proposed 4-branch DDR and the conventional 2-branch diversity receiver is about 2 dB. That means the received power level of the conventional 2-branch diversity receiver should be 2 dB larger than that of the proposed 4-branch DDR at the same level of AWGN to achieve the same mean acquisition time. Thus, the transmit power of the mobile station in the proposed 4-branch DDR can be 2 dB less than that of the conventional 2-branch diversity receiver. This saved power can be applied to extend the battery lifetime of the mobile station. Besides, we can reduce the interference to other users by diminishing the transmit power. When the SNR/chip is over −6 dB, the normalized mean acquisition time performance is almost identical regardless of the receiver. The reason is that the miss detection probability is under 10^{−1} when the SNR/chip is over −6 dB in Fig. 5; therefore, the miss detection probability term in (14) hardly affects the mean acquisition time. Again it can be seen that, for receivers of similar complexity, the delay diversity technique provides an increase in the acquisition capacity.
5 Conclusions

In this paper, we have evaluated the delay diversity receiver for the PN code acquisition of TH-UWB signals in the frequency-selective lognormal fading channel, and we have compared its acquisition performance with conventional diversity receivers. From the simulation results, we have shown that the proposed DDR achieves a
remarkable diversity gain with reasonable complexity for the PN code acquisition of a TH-UWB system. Therefore, the proposed DDR is expected to provide a practical solution for enhancing reverse link capacity and improving the system performance.
Acknowledgement. This research was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement) (IITA-2009-C1090-0902-0005).
References 1. Ryu, W.H., Park, M.K., Oh, S.K.: Code Acquisition Schemes using Antenna Arrays for DS-SS Systems and Their Performance in Spatially Correlated Fading Channels. IEEE Transactions on Communications 50(8), 1337–1347 (2002) 2. Dietrich, C.B., Dietze, K., Nealy, J.R., Stutzman, W.L.: Spatial, Polarization, and Pattern Diversity for Wireless Handheld Terminals. IEEE Transactions on Communications 49(9), 1271–1281 (2001) 3. Wittneben, A.: A New Bandwidth Efficient Transmit Antenna Modulation Diversity Scheme for Linear Digital Modulation. In: IEEE International Conference on Communications, Geneva, Switzerland, vol. 3, pp. 1630–1634 (1993) 4. Choi, W., Andrews, J.G.: Generalized Performance Analysis of a Delay Diversity Receiver in Asynchronous CDMA Channel. IEEE Transactions on Communications 4(5), 2057– 2063 (2005) 5. Choi, W., Yi, C., Kim, J.Y., Kim, D.I.: A Delay Diversity Receiver for CDMA Cellular Systems. In: International Symposium on Intelligent Signal Processing and Communication Systems, pp. 468–471 (2004) 6. Proakis, J.G.: Digital Communications, 4th edn. McGraw-Hill, New York (2001) 7. Polydoros, A., Weber, C.L.: A Unified Approach to Serial Search Spread-Spectrum Code Acquisition-Part I: General Theory. IEEE Transactions on Communications 32(5), 542– 549 (1984) 8. Yang, L.L., Hanzo, L.: Serial Acquisition of DS-CDMA Signals in Multipath Fading Mobile Channels. IEEE Transactions on Vehicular Technology 50(2), 617–628 (2001) 9. Beaulieu, N.C., Abu-Dayya, A.A., Mclane, P.J.: Estimating the Distribution of a Sum of Independent Lognormal Random Variables. IEEE Transactions on Communications 43(12), 2869–2873 (1995) 10. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. 9th printing. Dover, New York (1972) 11. Alberto, L.G.: Probability and Random Processes for Electrical Engineering, 2nd edn. Addison-Wesley, Reading (1994)
Performance Analysis of Binary Negative-Exponential Backoff Algorithm in IEEE 802.11a WLAN under Erroneous Channel Condition Bum-Gon Choi, Sueng Jae Bae, Tae-Jin Lee, and Min Young Chung School of Information and Communication Engineering Sungkyunkwan University 300, Chunchun-dong, Jangan-gu, Suwon, Kyunggi-do, 440-746, Korea
Abstract. IEEE 802.11 is the most famous protocol for the implementation of wireless local area networks (WLANs). To access the medium, IEEE 802.11 uses the carrier sense multiple access with collision avoidance (CSMA/CA) mechanism, called the distributed coordination function (DCF). Although DCF uses the binary exponential backoff (BEB) algorithm to avoid frame collisions, wireless resources are wasted by frequent collisions when there are many contending stations. To solve this problem, a binary negative-exponential backoff (BNEB) algorithm has been proposed and its saturation throughput was evaluated under erroneous channel conditions. As an extension of our previous work, this paper evaluates the performance of the BNEB algorithm via mathematical analysis and simulations under saturation and erroneous channel conditions in terms of MAC delay and throughput. Also, we compare the performance of DCF with BEB to that with BNEB under normal traffic and erroneous channel conditions by intensive simulations. The results show that BNEB yields better performance than BEB in general. Keywords: IEEE 802.11, WLAN, MAC, CSMA/CA, DCF, BNEB.
1 Introduction
The IEEE 802.11 WLAN employs DCF as a mandatory contention-based channel access method [1]. DCF is a random access function based on the CSMA/CA protocol adopting the BEB algorithm. Although the IEEE 802.11 standard also introduces the point coordination function (PCF) as a contention-free channel access method, PCF is barely implemented in current products because it wastes some bandwidth due to polling overheads and null packets [2][3].
This research was supported by the MKE(Ministry of Knowledge Economy), Korea, under the ITRC(Information Technology Research Center) support program supervised by the IITA(Institute for Information Technology Advancement) (IITA-2009-C1090-0902-0005). Corresponding author.
DCF operates with three parameters: backoff stage, backoff counter, and contention window. In DCF, when a station has frame(s) to transmit, it sets the backoff stage and randomly chooses a backoff counter in the range [0, CW − 1], where CW is the current contention window size. At the first transmission of a frame, a station sets the backoff stage to 0 and the contention window size to the minimum contention window size (CWmin). After the channel is sensed idle during a DCF interframe space (DIFS), the station decreases its backoff counter by one in every idle slot. When the backoff counter of a station reaches 0, the station transmits a frame at the beginning of the slot time. If the station does not receive an acknowledgement (ACK), it increases its backoff stage by one and doubles its contention window size. If the station successfully transmits its frame, it resets the backoff stage to 0 and the contention window size to CWmin. However, in DCF, the more stations share the wireless resources, the more frames collide. In order to solve this problem, much research on the IEEE 802.11 DCF has been conducted. Bianchi presented an analytical model using a bi-dimensional Markov chain and showed that the proposed model is very accurate [4][5]. With a simple modification of Bianchi's model, Xiao showed the limits of throughput and delay of IEEE 802.11 DCF [6]. The performance of the DCF in the presence of transmission errors was evaluated in [7][8]; in addition, these works presented an analytical model to verify the performance of the DCF in the presence of a channel bit error rate. The gentle distributed coordination function (GDCF) was proposed by Wang et al. [9]. In GDCF, stations decrease their backoff stages by one whenever the number of consecutive successful frame transmissions reaches the maximum number of permitted consecutive successes. Since GDCF uses a fixed number of permitted consecutive successful transmissions for decreasing the backoff stage, its performance depends on the number of contending stations. The enhanced GDCF (EGDCF) uses a consecutive success counter to represent the number of consecutive successful transmissions at the same backoff stage [10][11]. If the number of consecutive successful transmissions reaches the maximum permitted value, stations decrease their backoff stages by one. Since the maximum permitted value of consecutive successful transmissions is assigned differently according to the backoff stage of stations, the performance of EGDCF scarcely depends on the number of contending stations. A binary negative-exponential backoff (BNEB) algorithm has been proposed by Ki et al. [14]. The BNEB algorithm keeps the contention window at the maximum window size when stations experience a collision and reduces the contention window size by half when a frame is transmitted successfully without retransmission. BNEB introduces negative backoff stages to simply represent consecutive transmission successes. In [14] and [15], the results showed that the BNEB algorithm had better performance than conventional DCF in ideal and erroneous channel conditions. However, the performance was evaluated only in saturation conditions and the performance in terms of MAC delay was not evaluated. Based on our previous works [14][15], in this paper we intensively evaluate the performance of DCF with BNEB in terms of MAC delay and throughput under saturation and normal traffic conditions in an erroneous channel. The rest of the paper
Performance Analysis of BNEB Algorithm
239
is organized as follows. Section 2 illustrates BNEB algorithm briefly. Section 3 derives an analytical model to evaluate the normalized throughput of BNEB under saturation and erroneous channel conditions. In Section 4, we verify our analytical model by simulations. Also, we compare the throughput and MAC delay of DCF with BNEB to that with BEB under saturation and normal traffic condition. Finally, we conclude in Section 5.
2 Binary Negative-Exponential Backoff Algorithm
The BNEB algorithm uses three parameters: backoff stage, backoff counter, and contention window. The roles of these parameters are similar to those in DCF with BEB. In DCF with BEB, the contention window size is doubled whenever a station experiences a collision, until it reaches the maximum contention window size (CWmax). However, in DCF with BNEB, the contention window size is initially set to CWmax to reduce the probability that two or more stations select the same backoff counter value. When a frame is successfully transmitted without retransmission, BNEB decreases the contention window size by half to reduce the delay related to the backoff time. Since BNEB introduces negative backoff stages to simply represent consecutive transmission successes, it uses two counters, backoff stage and backoff counter. The contention window size W_i (CWmin ≤ W_i ≤ CWmax) at backoff stage i is decided as follows:

$$W_i = \begin{cases} CW_{max}, & 0 < i \leq m, \\ \max\!\left(2^{i}\, CW_{max},\ CW_{min}\right), & -L \leq i \leq 0, \end{cases} \qquad (1)$$

where m is the maximum retry limit and L is a natural number that bounds the lowest backoff stage. CWmin and CWmax represent the minimum and maximum contention windows, respectively. Stations having frame(s) randomly select their backoff counter values from [0, W_i − 1] and decrease their backoff counter values by one whenever a slot is idle. A station starts to transmit its frame when its backoff counter value reaches zero. For the frame to be transmitted, the backoff stage is decided by both the backoff stage used for the previous frame and the result of its transmission, success or collision. If the previous frame was successfully transmitted at backoff stage i, the station sets its backoff stage to 0 for 0 < i ≤ m, to i − 1 for −L < i ≤ 0, and keeps it at −L if i = −L. If the transmission of the previous frame failed at backoff stage i, the station sets its backoff stage to i + 1 for 0 ≤ i < m and to 1 for −L ≤ i < 0. If the backoff stage is equal to m, the station drops its frame and then initializes its backoff stage to 0. Therefore, if a collision occurs, a station using the BNEB algorithm can effectively resolve it by using the maximum contention window size.
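The stage and window update rules just described can be summarized in a short sketch (a best-effort reading of the text above, with the IEEE 802.11a window sizes of Table 1 as defaults):

```python
def bneb_next_stage(i, success, m=7, L=5):
    """Backoff-stage update of BNEB as described above: successes walk
    down toward stage -L, any failure jumps back to stage 1, and a
    failure at stage m drops the frame and resets the stage to 0."""
    if success:
        return max(i - 1, -L) if i <= 0 else 0
    return 0 if i == m else (i + 1 if i >= 0 else 1)

def bneb_window(i, cw_min=16, cw_max=1024):
    """Contention window of (1): CWmax for positive stages, halved
    windows (floored at CWmin) for stages -L..0."""
    return cw_max if i > 0 else max((2 ** i) * cw_max, cw_min)
```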
3 Analytical Model for BNEB
For the conventional DCF, Bianchi presented an analytical model using a bi-dimensional Markov chain and showed that the proposed model is very
accurate [4][5]. Therefore, we extend Bianchi's Markov chain model to evaluate the saturation throughput of BNEB in the presence of transmission errors. In the analytical model, we assume that there are n stations having frame(s) to transmit and that each station has frame(s) after a successful transmission. In addition, it is possible that a station fails to transmit a frame because of channel noise rather than a collision. For a station, s(t) is defined as the random process representing the backoff stage and b(t) as the random process representing the value of the backoff counter at time t. Then, the BNEB algorithm can be modeled as a bi-dimensional discrete-time Markov chain (s(t), b(t)). Fig. 1 illustrates the state transition diagram of the Markov chain of the BNEB. Let $\lim_{t \to \infty} P\{s(t)=i,\, b(t)=j\} = b_{i,j}$, and let p be the probability that a station experiences a transmission failure due to a collision or a transmission error in a slot time. Then, we can make the following relations through chain regularities:

$$b_{i,0} = (1-p)^{-i}\, b_{0,0}, \qquad i \in [-L+1, 0],$$
$$b_{i,j} = \frac{W_i - j}{W_i}\, b_{i,0}, \qquad i \in [-L+1, 0],\ j \in [0, W_i-1],$$
$$b_{-L,0} = \frac{(1-p)^{L}}{p}\, b_{0,0}, \qquad i = -L,\ j = 0, \qquad (2)$$
$$b_{i,0} = p^{\,i-1}\, b_{0,0}, \qquad i \in [1, m],$$
$$b_{i,j} = \frac{W_i - j}{W_i}\, b_{i,0}, \qquad i \in [1, m],\ j \in [0, W_i-1].$$

From Equation (2) and $\sum_{i=-L}^{m} \sum_{j=0}^{W_i-1} b_{i,j} = 1$, we can derive $b_{0,0}$:
$$b_{0,0} = \frac{1}{\dfrac{Wp+1}{2} + \dfrac{(1-p)\left[W\left(\frac{1-p}{2}\right)^{m}-1\right]}{p(1+p)} + \dfrac{W+1}{2}\left(\dfrac{1-p^{L}}{1-p}\right)} . \qquad (3)$$
Let τ be the probability that a station attempts to transmit a frame. Then we have

$$\tau = \sum_{i=-L}^{m} b_{i,0} = \left( \frac{1}{p} + \frac{1-p^{m}}{1-p} \right) b_{0,0} \qquad (4)$$
and

$$p = 1 - (1-\tau)^{n-1}\, (1 - BER)^{l+H} , \qquad (5)$$
where BER and l respectively represent the channel bit error rate and the packet payload size, and H (= PHY_hdr + MAC_hdr) represents the sum of the packet header sizes, the PHY header (PHY_hdr) and the MAC header (MAC_hdr). The transmission success probability P_s, the probability P_c that a transmitted packet collides, and the probability P_er that a packet is received in error are calculated as

$$P_s = \frac{n\tau(1-\tau)^{n-1}}{P_{tr}}\,(1 - PER) , \qquad (6)$$
[Figure: bi-dimensional state transition diagram of the BNEB Markov chain over the backoff states (stage, counter), with transition probabilities p/W_max on collisions and (1−p)/W_i on successes. Legend: transmission success; collision occurrence.]

Fig. 1. Markov chain of BNEB
$$P_c = 1 - \frac{n\tau(1-\tau)^{n-1}}{P_{tr}} , \qquad (7)$$
$$P_{er} = \frac{n\tau(1-\tau)^{n-1}}{P_{tr}}\, PER , \qquad (8)$$
where P_tr is the probability that there is at least one transmission and PER is the packet error rate:

$$P_{tr} = 1 - (1-\tau)^{n} , \qquad (9)$$
$$PER = 1 - (1 - BER)^{l+H} . \qquad (10)$$
Let T_s be the mean time required for the successful transmission of a frame, and T_c and T_er the mean wasted times due to the collision and transmission error of a transmitted frame, respectively. Then T_s, T_c, and T_er are obtained as follows:

$$T_s = DIFS + H + E[P] + 2\delta + SIFS + ACK , \qquad (11)$$
$$T_c = DIFS + H + E[P^*] + \delta , \qquad (12)$$
$$T_{er} = DIFS + H + E[P^*] + \delta , \qquad (13)$$
where E[P] and E[P*] are the mean transmission times of a successfully transmitted packet and a collided packet, respectively. SIFS represents a short interframe space and ACK denotes the transmission time of an acknowledgement. The parameter δ denotes the propagation delay. Finally, we can obtain the normalized saturation throughput of BNEB in the presence of a channel bit error rate as follows:

$$S = \frac{P_{tr}\, P_s\, E[P]}{(1-P_{tr})\sigma + P_{tr} P_s T_s + P_{tr} P_c T_c + P_{tr} P_{er} T_{er}} , \qquad (14)$$

where σ is the duration of a backoff slot.
To evaluate the saturation MAC delay of BNEB in the presence of transmission errors, we define the mean sojourn time d_i at backoff stage i. Let T_b and P_b denote the mean freezing time due to the busy channel and the probability that the channel is busy, respectively. Then d_i is given by

$$d_i = (1-p)T_s + pT_c + \left[(1-P_b)\sigma + P_b T_b\right] \frac{W_i}{2} , \qquad i \in [-L, m] . \qquad (15)$$
Also, P_b and T_b are calculated as

$$P_b = 1 - (1-\tau)^{n-1} , \qquad (16)$$
$$T_b = \frac{(n-1)\tau(1-\tau)^{n-2}}{P_b}\, T_s + \left(1 - \frac{(n-1)\tau(1-\tau)^{n-2}}{P_b}\right) T_c . \qquad (17)$$
Therefore, when a previous frame was transmitted in stage i, D_i is given by

$$D_i = \begin{cases} d_i + pD_1, & -L \leq i \leq 0, \\ d_i + pD_{i+1}, & 1 \leq i < m, \\ d_i, & i = m. \end{cases} \qquad (18)$$
The mean MAC delay D can be calculated as

$$D = \frac{\sum_{i=-L}^{0} b_{i,0}\, D_i}{\sum_{i=-L}^{0} b_{i,0}} . \qquad (19)$$
4 Performance Evaluation
To evaluate the performance of DCF with BEB and BNEB, we consider L = 5 and m = 7, and the MAC parameters given in Table 1. In IEEE 802.11a, control packets such as ACK are transmitted at the control rate (C_con) [12][13].
4.1 Performance Evaluation under Saturation Condition
In order to evaluate the normalized throughput and MAC delay of BNEB under the saturation condition, we assume that there are n stations having frame(s) to transmit and each station has frame(s) after a successful transmission. When there are frame transmission errors in the wireless channel, the normalized saturation throughputs of DCF with BEB and BNEB are as shown in Fig. 2. For n = 5, the throughput of BNEB is sensitive to the BER; however, for n = 25 and 50, it is not. From the results, for n = 5, the saturation throughput of BNEB is less than that of DCF with BEB; however, the saturation throughput of BNEB is greater than that of DCF with BEB for n = 25 and
Fig. 2. Saturation throughput of DCF with BEB and BNEB in IEEE 802.11a under varying BER (L=5, m=7)
Fig. 3. Saturation MAC delay of DCF with BEB and BNEB in IEEE 802.11a under varying BER (L=5, m=7)
50. When BER ≥ 10^{−4}, the possibility of transmission failure by a transmission error is larger than that by a collision. Therefore, the throughputs of DCF with BEB and BNEB approach 0 as the BER increases. Fig. 3 shows the saturation MAC delay of DCF with BEB and BNEB under varying BER. If a frame transmission fails, BNEB stations increase the contention window to CWmax, whereas BEB stations double their contention window. When the number of contending stations is small, transmission failure caused by transmission errors is dominant. Therefore, the saturation MAC delay of BNEB is higher than that of DCF with BEB for n = 5 and high BER, because CWmax causes a long backoff time.

Table 1. IEEE 802.11a MAC parameters

PARAMETER              VALUE
Packet payload         8184 bits
MAC header             272 bits
PHY header             128 bits
ACK length             240 bits
Control rate (C_con)   24 Mbps
Data rate (C)          54 Mbps
Propagation delay      1 µs
SIFS                   16 µs
DIFS                   34 µs
Slot time              9 µs
CWmin                  16
CWmax                  1024
Fig. 4. Saturation throughput of DCF with BEB and BNEB in IEEE 802.11a for BER and the number of stations (L=5, m=7)
For n = 25 and 50, the saturation MAC delay of BNEB is less than that of DCF with BEB. As more stations use the wireless resources, collisions become more likely than frame transmission errors. Because BNEB can resolve collisions more effectively when there are many contending stations, the saturation MAC delay of BNEB is less than that of DCF with BEB for n = 25 and 50. When the BER is higher than 10^{−4}, the difference in MAC delay between DCF with BEB and BNEB decreases. Also, the saturation MAC delay sharply increases for BER ≥ 10^{−4}. Fig. 4 represents the normalized saturation throughput of DCF with BEB and BNEB as the number of contending stations increases when BER = 10^{−6}, 10^{−5}, and 10^{−4}. When there are many contending stations, transmission failures related to collisions are more frequent than those related to transmission errors. Therefore, BNEB, which initially sets the contention window to CWmax, shows better performance than DCF with BEB. The waiting time due to the backoff slot time strongly affects the saturation throughput of DCF with BEB and BNEB when there are fewer contending stations. BNEB sets the contention window to CWmax whenever a station experiences a transmission failure. For this reason, as transmission failures increase with the channel BER, the waiting time of BNEB is larger than that of DCF with BEB. Therefore, the performance of BNEB is worse than that of DCF with BEB for a small number of contending stations and high BER. However, because wireless devices operate at BER ≤ 10^{−5} in most cases, the performance of BNEB is better than that of DCF with BEB. The saturation MAC delay of DCF with BEB and BNEB is shown in Fig. 5. For BER = 10^{−6} and 10^{−5}, the saturation MAC delay of BNEB is less than that of DCF with BEB. However, for BER = 10^{−4} and n ≤ 35, the saturation MAC delay of BNEB is higher than that
Fig. 5. Saturation MAC delay of DCF with BEB and BNEB in IEEE 802.11a for BER and the number of stations (L=5, m=7)
of DCF with BEB, because transmission failures related to transmission errors are more frequent than those related to collisions.
4.2 Performance Evaluation under Normal Traffic Condition
To evaluate the performance of DCF with BEB and BNEB under the normal traffic condition, we assume that packets arrive at devices as a Poisson process with mean arrival rate λ. For BER = 10^{−6}, 10^{−5}, and 10^{−4}, the normalized throughput of DCF with BEB and BNEB for varying packet arrival rates (λ) is shown in Fig. 6. The possibility that stations have frame(s) for transmission increases as λ increases until λ = λ_sat. If the packet arrival rate is larger than this specific value (λ_sat), the probability that a station has frame(s) for transmission is close to 1. From the results, the throughput of DCF with BEB and BNEB linearly increases as λ increases until λ = λ_sat and remains constant for λ ≥ λ_sat. The normalized throughput of BNEB is higher than that of DCF with BEB for BER = 10^{−6} and 10^{−5}. When BER > 10^{−5}, the possibility of transmission failure caused by a transmission error is larger than that caused by a collision. Therefore, the normalized throughput of BNEB is less than that of DCF with BEB for BER = 10^{−4}. Since the λ_sat of BNEB is greater than that of DCF with BEB, BNEB can serve more traffic than BEB. Fig. 7 shows the MAC delay of DCF with BEB and BNEB for varying packet arrival rates (λ). As the BER increases, transmission failures due to transmission errors occur more frequently than those due to collisions. DCF stations set the contention window size to CWmin when transmitting a new frame and double their contention window size when they participate in a col-
Fig. 6. Normalized throughput of the DCF and the BNEB varying packet arrival rate for BER in IEEE 802.11a (L=5, m=7)
lision. However, to effectively resolve collisions, BNEB stations set the contention window size to CWmax when transmitting a frame. The possibility that a station retransmits a frame is very small until λ = λ_sat. Also, the waiting time
Fig. 7. MAC delay of the DCF and the BNEB varying packet arrival rate for BER in IEEE802.11a (L=5, m=7)
caused by the backoff strongly affects the MAC delay of DCF with BEB and BNEB when λ ≤ λ_sat. Therefore, the MAC delay of BNEB is higher than that of DCF with BEB for λ ≤ λ_sat, because BNEB stations use a greater contention window size than DCF stations. However, when λ ≥ λ_sat, the MAC delay of BNEB is less than that of DCF with BEB for BER = 10^{−6} and 10^{−5}. For BER = 10^{−4}, the MAC delay of BNEB is higher than that of DCF with BEB. However, because wireless devices operate at BER ≤ 10^{−5} in most cases, the performance of BNEB is better than that of DCF with BEB.
5 Conclusion
In this paper, we briefly explained the binary negative-exponential backoff (BNEB) algorithm, which enhances the performance of DCF with BEB, and we proposed a mathematical analysis model to evaluate the performance of BNEB in the presence of a channel bit error rate. We verified our proposed algorithm via the analytical model and simulations under saturation and erroneous channel conditions. From the results, BNEB can resolve collisions more effectively than DCF with BEB. We also evaluated the performance of the BNEB algorithm in the presence of transmission errors by simulations under normal traffic conditions. At low BER, the performance of BNEB is better than that of DCF with BEB because of effective collision resolution. At high BER, the throughput of BNEB is smaller than that of DCF due to ineffective management of the backoff time. However, we expect that BNEB improves the performance of DCF compared with BEB under erroneous channel conditions.
References 1. IEEE standard for Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. ISO/IEC 8802-11 (1999(E)) (1999) 2. Kanjanavapasti, A., Landfeldt, B.: An analysis of a modified point coordination function in IEEE 802.11. In: Proc. of PIMRC 2003, vol. 2, pp. 1732–1736 (2003) 3. Xiao, Y.: Performance analysis of priority schemes for IEEE 802.11 and IEEE 802.11e wireless LANs. IEEE Trans. Wireless Commun. 4(4), 1506–1515 (2005) 4. Bianchi, G.: IEEE 802.11 saturation throughput analysis. IEEE Commun. Lett. 2(12), 318–320 (1998) 5. Bianchi, G.: Performance analysis of the IEEE 802.11 distributed coordination function. IEEE J. Sel. Areas Commun. 18(3), 535–547 (2000) 6. Xiao, Y., Rosdahl, J.: Throughput and delay limits of IEEE 802.11. IEEE Commun. Lett. 6(8), 355–357 (2002) 7. Chatzimisios, P., Boucouvalas, A.C., Vitsas, V.: Performance analysis of IEEE 802.11 DCF in presence of transmission errors. In: Proc. of IEEE International Conf. on Commun., vol. 7, pp. 3854–3858 (2004) 8. Chatzimisios, P., Boucouvalas, A.C., Vitsas, V.: Influence of channel BER on IEEE 802.11 DCF. Electronics Lett. 39(23), 1687–1689 (2003) 9. Wang, C., Li, B., Li, L.: A new collision resolution mechanism to enhance the performance of IEEE 802.11 DCF. IEEE Trans. Veh. Techno. 53(4), 1235–1243 (2004)
10. Chung, M.Y., Kim, M.-S., Lee, T.-J., Lee, Y.: Performance evaluation of an enhanced GDCF for IEEE 802.11. IEICE Trans. Commun. E88-B(10), 4125–4128 (2005) 11. Kim, D.H., Choi, S.-H., Jung, M.-H., Chung, M.Y., Lee, T.-J., Lee, Y.: Performance evaluation of an enhanced GDCF under normal traffic condition. In: Proc. of IEEE TENCON, pp. 1560–1566 (2005) 12. Chatzimisios, P., Boucouvalas, A.C., Vitsas, V.: Effectiveness of RTS/CTS handshake in IEEE 802.11a wireless LANs. IEE Electronics Lett. 40(14), 915–916 (2004) 13. Raffaele, B., Marco, C.: IEEE 802.11 Optimal performances: RTS/CTS mechanism vs. basic access. In: Proc. of PIMRC, vol. 4, pp. 1747–1751 (2002) 14. Ki, H.J., Choi, S.-H., Chung, M.Y., Lee, T.-J.: Performance evaluation of Binary Negative-Exponential Backoff Algorithm in IEEE 802.11 WLAN. In: Cao, J., Stojmenovic, I., Jia, X., Das, S.K. (eds.) MSN 2006. LNCS, vol. 4325, pp. 294–303. Springer, Heidelberg (2006) 15. Choi, B.-G., Ki, H.J., Chung, M.Y., Lee, T.-J.: Performance evaluation of Binary Negative-Exponential Backoff Algorithm in presence of a channel bit error rate. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4490, pp. 554–557. Springer, Heidelberg (2007)
A Resource-Estimated Call Admission Control Algorithm in 3GPP LTE System Sueng Jae Bae1 , Jin Ju Lee1 , Bum-Gon Choi1 , Sungoh Kwon2 , and Min Young Chung1, 1
School of Information and Communication Engineering Sungkyunkwan University 300, Chunchun-dong, Jangan-gu, Suwon, Kyunggi-do, 440-746, Korea 2 Telecommunication R&D Center Samsung Electronics 416, Maetan-dong, Youngtong-gu, Suwon, Kyunggi-do, 443-742, Korea
Abstract. As the evolution of high-speed downlink packet access (HSDPA), long-term evolution (LTE) is being standardized by the 3rd Generation Partnership Project (3GPP). In the existing mobile communication networks, voice traffic is delivered through circuit-switched networks; in LTE, on the contrary, all kinds of traffic are transferred through IP-based packet-switched networks. In order to provide quality of service (QoS) in wireless networks, radio resource management (RRM) is very important. To reduce network congestion and guarantee a certain level of QoS for on-going calls, call admission control (CAC), as part of RRM, accepts or rejects service requests. In this paper, we propose a resource-estimated CAC algorithm and evaluate its performance. The results show that the proposed algorithm can maximize PRB utilization and guarantee a certain level of QoS. Keywords: LTE System, CAC, QoS, RRM.
1 Introduction
LTE is being standardized by 3GPP as part of 3GPP Release 8 [1]. By adopting orthogonal frequency division multiple access (OFDMA) and multiple-input multiple-output (MIMO) technologies, LTE increases the data rate and improves spectral efficiency [2]. In addition, since LTE evolves from HSDPA, it is easily compatible with the current mobile communication networks [3]. In the existing mobile communication networks, voice traffic is delivered through circuit-switched networks, but in LTE all kinds of traffic, such as voice, streaming, data, etc., are transferred through IP-based packet-switched networks.
This work was partially supported by Samsung Electronics and the MKE(The Ministry of Knowledge Economy), Korea, under the ITRC(Information Technology Research Center) support program supervised by the IITA(Institute for Information Technology Advancement) (IITA-2009-C1090-0902-0005). Corresponding author.
In order to provide QoS for various kinds of services in wireless environments, RRM is very important [4]. To reduce network congestion and guarantee a certain level of QoS for on-going calls, CAC, as part of RRM, decides on the acceptance or rejection of service requests. The evolved universal terrestrial radio access network node B (eNB), the base station in the LTE system, may perform CAC based on several conditions, such as channel status, QoS requirements of the requested services, buffer state in the eNB, and so on [1][5]. The existing CAC algorithms can be classified into two categories, static and dynamic. Static CAC algorithms reserve resources for handoff calls [6][7][8][9]. However, channel reservation methods may cause lower spectral efficiency [10]. Dynamic CAC algorithms perform admission control through estimation of the radio channel state and available resources [11][12][13]. Since dynamic CAC algorithms assume that all requested calls have the same QoS requirement, they cannot be directly adapted to the LTE system, which provides various kinds of services. In this paper, we propose a resource-estimated CAC algorithm. Whenever a service request occurs, the resource-estimated CAC algorithm estimates the number of physical resource blocks (PRBs) required for the request. The number of required PRBs is determined based on the service type and the modulation and coding scheme (MCS) level of the user. Since the resource-estimated CAC algorithm considers the minimum data rate required for the requested service, it can maximize the utilization of physical resources. We conduct intensive simulations in order to evaluate the performance of the proposed CAC algorithm. The rest of this paper is organized as follows. Section 2 describes existing CAC algorithms. The proposed CAC algorithm is discussed in Section 3. In Section 4, we analyze the performance of the proposed algorithm. Finally, conclusions are presented in Section 5.
2 Pre-studied CAC Algorithms
The existing CAC algorithms can be divided into static and dynamic. Static CAC algorithms reserve resources for handoff calls [6][7][8][9]. Dynamic CAC algorithms perform admission control through estimation of the radio channel status and available resources [11][12][13]. In this section, we explain three existing static CAC algorithms: guard channel, fractional guard channel, and queueing principle. In addition, we illustrate three existing dynamic CAC algorithms: local predictive, distributed, and shadow cluster.
2.1 Static CAC Algorithms
The guard channel algorithm reserves some of the total number of channels for handoff calls [6]. The admission control procedure of the guard channel algorithm is simple and easy to implement. However, in the guard channel algorithm it is very difficult to determine the number of channels reserved for handoff calls, because the arrival patterns of handoff calls change with the movement of users. Moreover, the guard channel algorithm may decrease the utilization of physical resources as the number of reserved channels increases.
To improve the resource utilization of the guard channel algorithm, Ramjee et al. proposed the fractional guard channel algorithm [7]. The fractional guard channel algorithm determines the acceptance or rejection of new calls with a decision probability that varies with the number of busy channels. In case channels are sufficiently available, handoff calls are accepted, but new calls can be rejected. Thus, in the fractional guard channel algorithm, the dropping probability of handoff calls may be smaller than the blocking probability of new calls. In addition, since the decision probability varies with the number of available channels, the fractional guard channel algorithm alleviates congestion in the network. However, channel reservation schemes, such as guard channel and fractional guard channel, may use wireless resources inefficiently [10]. Moreover, channel reservation schemes may excessively block new calls compared with handoff calls, because they always reserve some channels for handoff calls [14]. To overcome the disadvantages of channel reservation schemes, queueing principle algorithms were proposed [8][9]. In these algorithms, call requests are accepted when there are available channels. However, if all the channels are unavailable, new and handoff calls are registered in a waiting list according to their queueing discipline. When channels become idle due to call release or handoff, they are allocated to the call with the highest priority in the waiting list.
2.2 Dynamic CAC Algorithms
The local predictive CAC algorithm predicts resources in the local base station [11]. It estimates the amount of resources required for a serving call based on Wiener processes. In addition, it predicts the arrival times of handoff calls and then reserves resources for them. The local predictive CAC algorithm has a lower dropping probability for handoff calls than blocking probability for new calls, because it adaptively reserves wireless resources for handoff calls. However, to correctly predict the arrival times and required bandwidth of handoff calls, information on handoff calls should be shared among the base stations involved. Naghshineh et al. proposed a distributed CAC algorithm which predicts local resources as well as the resources of adjacent cells [12]. The distributed CAC algorithm considers the number of handoff calls moving from adjacent cells and their QoS requirements. It guarantees the QoS of handoff calls more effectively than the local predictive algorithm, because neighboring base stations exchange information on handoff calls, such as arrival rate, required bandwidth, etc. However, the distributed CAC algorithm assumes that all calls have the same service type and QoS requirement. In multimedia wireless networks, there are various kinds of service calls and their QoS requirements may differ. Thus, the service types of calls should be reflected in the CAC algorithm. The shadow cluster CAC algorithm, one of the distributed CAC algorithms, performs admission control considering the movement of UEs, i.e., their velocity, movement direction, and position in the cell [13]. According to the movement of a user equipment
(UE), the base station to which the UE belongs selects the adjacent cells, called the shadow cluster, that the UE may move to. The base stations of the adjacent cells selected as the shadow cluster reserve an amount of resources for handoff calls. However, the shadow cluster CAC algorithm may incur overhead because all base stations should have information on the movement of UEs.
3 Proposed CAC Algorithm
The resource-estimated CAC algorithm estimates the number of PRBs which should be allocated to the requested call. The number of required PRBs is decided by reflecting the type of the requested service and the current MCS level of the UE. In addition, the resource-estimated CAC algorithm calculates the available PRBs based on the PRB usage of on-going calls measured by the eNB. Fig. 1 illustrates the flow chart of the call admission control procedure in the resource-estimated CAC algorithm. In Fig. 1, N_req^PRB, B_req, and B_MCS^PRB denote the number of required PRBs per second, the required data rate, and the number of bits carried in a PRB under the current MCS level of the UE requesting a service, respectively. Since the transmission data rate in a wireless environment varies with the channel condition, N_req^PRB is calculated as B_req over B_MCS^PRB. B_req is determined by the service type of the requested call. B_MCS^PRB is calculated from the channel quality indicator (CQI) information reported by the corresponding UE through the physical uplink control channel (PUCCH) or physical uplink shared channel (PUSCH). In general, handoff call requests occur when the corresponding UEs cross the cell boundary. Thus, for handoff calls, we use the smallest B_MCS^PRB among the possible B_MCS^PRB values under the given cell environment.
ͶΤΥΚΞΒΥΖ͑ΣΖΤΠΦΣΔΖΤ͑ΗΠΣ͑ΣΖΦΖΤΥΖΕ͑͑ ΔΒΝΝ͑ΒΟΕ͑ΒΧΒΚΝΒΓΝΖ͑ΣΖΤΠΦΣΔΖΤ͑ PRB N req =
Breq PRB BMCS
,
PRB PRB PRB N PRB free = N total − N RT − N NRT
PRB N PRB ? free > N req
ͿΠ͑
ΖΛΖΔΥ͑ΔΒΝΝ͑
ΊΖΤ͑
ͲΔΔΖΡΥ͑ΔΒΝΝ͑
Fig. 1. Flow chart of call admission control procedure in the resource-estimated CAC algorithm
N_free^PRB denotes the total number of available PRBs during the past one second. To find N_free^PRB, the eNB calculates the PRB usage of on-going calls at the arrival time of a service request. N_RT^PRB and N_NRT^PRB denote the numbers of PRBs during the past one second that the eNB actually allocated to real-time and non-real-time services, respectively. The total number of PRBs per second, N_total^PRB, is decided by the channel bandwidth of the LTE system. Thus, N_free^PRB is easily obtained by subtracting the sum of N_RT^PRB and N_NRT^PRB from N_total^PRB. If N_free^PRB is bigger than N_req^PRB, the requested call is accepted; otherwise, it is rejected. Since the resource-estimated CAC algorithm only estimates the minimum data rate of the requested service, it can be easily implemented.
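The admission test itself reduces to a few lines. The following sketch mirrors Fig. 1; all numeric values in the example are illustrative assumptions, not taken from the paper:

```python
def admit(b_req, b_mcs_prb, n_total, n_rt, n_nrt):
    """Admission test of Fig. 1: PRBs/s needed by the request versus
    PRBs/s left free over the past second."""
    n_req = b_req / b_mcs_prb          # N_req^PRB = B_req / B_MCS^PRB
    n_free = n_total - n_rt - n_nrt    # N_free^PRB
    return n_free > n_req

# Example: a VoIP request with B_req = 8 kbps (Section 4)
print(admit(b_req=8_000, b_mcs_prb=1_000, n_total=50_000, n_rt=30_000, n_nrt=15_000))
```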
4 Performance Evaluation
We developed an event-driven simulator for the 3GPP LTE downlink system using C++. To evaluate the performance of the proposed CAC algorithm, we consider a radio access network consisting of seven hexagonal cells. The radius of each cell is 250 m and the identification numbers of the cells are 0 to 6, as shown in Fig. 2. In addition, we assume the proportional fair (PF) scheduling scheme as the MAC scheduling algorithm [17]. We consider an OFDMA system with 5 MHz of downlink channel bandwidth, which is one of the channel bandwidths specified in the LTE system. For the wireless channel conditions, path loss and multi-path fading are considered, but inter-cell interference is not reflected in our simulation. To determine the MCS level of a UE, we use the modified COST 231 Hata model, which reflects 10 dB log-normal shadow fading [15]. The number of UEs is 1750 and their positions are uniformly distributed over the seven cells at the starting time of the simulation. The mobility model is the random-walk model. The velocities of all UEs are assumed to be 4 km/h, and the flight time of all UEs is uniformly distributed between 10 sec and 20 sec. Service requests arrive at the eNB as Poisson
Fig. 2. Cell structure in simulation
Table 1. The simulation parameters

Parameter                    Value          Note
Number of cells              7              hexagonal shape
Radius of cell               250 m
Number of UEs                1750
Velocity of UEs              4 km/h
Flight time of UEs           [10, 20] sec   uniform distribution
Downlink channel bandwidth   5 MHz          50 PRBs per TTI
Traffic mixture ratio        25:25:25:25    FTP:web:video:VoIP
Simulation time              10,000 sec     0–2,000 sec is ignored
Service duration             180 sec        exponential distribution
Queue length                 10 MB
TTI                          1 ms
processes with parameter λ, and the service time is determined by an exponential distribution with mean 1/μ. The simulation time is 10,000 sec, and statistical information from 0 sec to 2,000 sec is ignored. The eNB has a logical queue of 10 MBytes per service of a UE. The simulation parameters are described in Table 1. For the simulations, we consider four service types: FTP, web, video, and VoIP. When a user requests a service, the service type is uniformly selected among the four service types. The traffic mixture ratio of FTP, web, video, and VoIP is 25:25:25:25, and their characteristics are given in Table 2 [15][16]. As performance measures, the average data rate, the average packet delay, the PRB utilization, the blocking probability of new calls, and the dropping probability of handoff calls are considered. The average data rate is defined as the total number of bits sent by the eNB to all UEs divided by the simulation time. The average packet delay is the sum of the mean packet transmission time and the mean queue-waiting time. The PRB utilization is defined as the ratio of the number of PRBs allocated to UEs to the total number of PRBs during the whole simulation time. The blocking and dropping probabilities are defined as the ratios of the numbers of rejected new and handoff calls to the total numbers of arrived new and handoff calls, respectively. In the simulations, CAC tightly coupled with the scheduling algorithm is performed only at the centered cell, Cell 0. CAC in the cells other than Cell 0 decides the acceptance or rejection of call requests based on the new call blocking probability and handoff call dropping probability measured in Cell 0. While a UE moves from an adjacent cell to Cell 0 during its service time, the generation of the corresponding packets is started at the handoff-in time and ended at the handoff-out time. If packets remain in the eNB of Cell 0 at the handoff-out time, they are discarded. Since existing CAC algorithms have been developed for channel-based cellular systems, it is difficult to directly compare their performance with that of the proposed CAC algorithm. Here, we compare the performance of the proposed CAC algorithm with that of the case without CAC.
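For completeness, these measures reduce to simple ratios over simulation counters; the field names in the sketch below are assumptions, not from the paper:

```python
def summarize(c):
    """Performance measures of Section 4 computed from run counters."""
    return {
        "avg_data_rate": c["bits_sent"] / c["sim_time"],
        "prb_utilization": c["prbs_allocated"] / c["prbs_total"],
        "new_call_blocking": c["new_rejected"] / c["new_arrived"],
        "handoff_dropping": c["ho_rejected"] / c["ho_arrived"],
    }
```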
Fig. 3. Average data rate when velocities of all UEs are 4km/h and traffic mixture ratio is 25:25:25:25
In the resource-estimated CAC algorithm, B_req is decided according to the service type of the requested call. Based on the LTE specification, B_req is set to 8 kbps and 20 kbps for VoIP and streaming, respectively [18][19]. For the web and FTP services, since it is difficult to determine their data rates, the measured data rate of the same traffic class is used as B_req. Figs. 3 and 4 show the average data rate and PRB utilization, respectively. The maximum average data rates are nearly 10 Mbps and 7.7 Mbps in
Fig. 4. PRB utilization when velocities of all UEs are 4km/h and traffic mixture ratio is 25:25:25:25
Table 2. Characteristics of traffic considered for simulation

QoS class    Service              Component                                           Statistical Characteristics         Parameters
Best effort  FTP                  File size                                           Truncated log-normal distribution   Mean: 2 MB, Std.dev.: 0.722 MB, Max: 5 MB
Interactive  Web browsing (HTTP)  Number of pages per session                         Log-normal distribution             Mean: 17, Std.dev.: 22
                                  Main object size                                    Truncated log-normal distribution   Mean: 10710 Bytes, Std.dev.: 25032 Bytes, Max: 2 MB, Min: 100 Bytes
                                  Embedded object size                                Truncated log-normal distribution   Mean: 7758 Bytes, Std.dev.: 126168 Bytes, Max: 2 MB, Min: 50 Bytes
                                  Number of embedded objects per page                 Truncated Pareto distribution       Mean: 5.64, Max: 53
                                  Reading time                                        Exponential distribution            Mean: 30 sec
                                  Parsing time                                        Exponential distribution            Mean: 0.13 sec
Streaming    Video (64 kbps)      Session duration (movie)                            Deterministic                       3600 sec
                                  Inter-arrival time between the beginning of frames  Deterministic                       100 ms (based on 10 frames per second)
                                  Number of packets (slices) in a frame               Deterministic                       8 packets per frame
                                  Packet size                                         Truncated Pareto distribution       Mean: 50 Bytes, Max: 250 Bytes
                                  Inter-arrival time between the packets in a frame   Truncated Pareto distribution       Mean: 50 Bytes, Max: 12.5 ms
Voice        VoIP                 Average call holding time                           Exponential distribution            Mean: 210 sec
                                  Voice CODEC                                         AMR                                 12.2 kbps
                                  Frame length                                        Deterministic                       20 ms
                                  Talk spurt length                                   Exponential distribution            Mean: 1026 ms
                                  Silence length                                      Exponential distribution            Mean: 1171 ms
Fig. 5. Average packet delay when velocities of all UEs are 4km/h and traffic mixture ratio is 25:25:25:25
non-CAC and the resource-estimated CAC algorithm, respectively. In addition, the maximum PRB utilizations are 1 and 0.89 for non-CAC and the resource-estimated CAC algorithm, respectively. Since the proposed CAC algorithm must reject some of the requested calls to prevent network congestion, its total average data rate and total PRB utilization are less than those of non-CAC. The average packet delay is shown in Fig. 5. As ρ, the arrival rate times the service duration per UE, increases, the average packet delay with non-CAC increases.
Fig. 6. Call rejection ratio of real-time services when velocities of all UEs are 4km/h and traffic mixture ratio is 25:25:25:25
Fig. 7. Call rejection ratio of non real-time services when velocities of all UEs are 4km/h and traffic mixture ratio is 25:25:25:25
Since the packets of non-real-time services, i.e., FTP and web, are larger than those of real-time services, such as streaming and VoIP, the average packet delays of non-real-time services increase more sharply than those of real-time services. The average packet delay of the resource-estimated CAC algorithm is lower than that of non-CAC. Figs. 6 and 7 illustrate the call rejection ratios for real-time and non-real-time services, respectively. Since the packet size of the FTP service is larger than those of the other services, more calls are rejected for the FTP service than for the other services. For the resource-estimated CAC algorithm, the handoff call dropping probability is higher than the new call blocking probability, because the MCS level and code rate for handoff calls at their request time are worse than those for new calls.
5 Conclusion
In this paper, to guarantee the QoS requirements for packet delay in the LTE system, we proposed the resource-estimated CAC algorithm. The resource-estimated CAC algorithm predicts the amount of PRBs required for service requests and has low complexity. In order to evaluate the performance of the proposed CAC algorithm, we performed simulations under various environments. From the simulation results, even though the average data rate and PRB utilization of the proposed CAC algorithm are lower than those of non-CAC, the delay performance of the proposed CAC algorithm is better than that of non-CAC. For further studies, research on enhancing the proposed algorithm is required to reduce the handoff call dropping probability.
References 1. 3GPP TS 36.300 v.8.3.0: Evolved UTRA and Evolved UTRAN (E-UTRAN); Overall description (2007) 2. Ekstrom, H., Furuskar, A., Karlsson, J., Meyer, M., Parkvall, S., Torsner, J., Wahlqvist, M.: Technical Solutions for the 3G Long-Term Evolution. IEEE Commun. Magazine 44(3), 38–45 (2006) 3. 3G Americas, white paper: UMTS Evolution from 3GPP Release 7 to Release 8 HSPA and SAE/LTE (2007) 4. Fang, Y., Zhang, Y.: Call Admission Control Schemes and Performance Analysis in Wireless Mobile Networks. Proc. of IEEE VTC 51, 371–382 (2002) 5. 3GPP TR 25.912 v.7.2.0: Feasibility study for evolved Universal Terrestrial Radio Access (UTRA) and Universal Terrestrial Radio Access Network (UTRAN) (2007) 6. Hong, D., Rappaport, S.S.: Traffic Model and Performance Analysis for Cellular Mobile Radio Telephone Systems with Prioritized and Nonprioritized Handoff Procedures. IEEE Trans. Vehi. Technol. 35(3), 77–92 (1986) 7. Ramjee, R., Nagarajan, R., Towsley, D.: On Optimal Call Admission Control in Cellular Networks. Wireless Networks 3, 29–41 (1997) 8. Guerin, R.: Queueing-Block System with Two Arrivals Streams and Guard Channels. IEEE Trans. Commun. 36(2), 153–163 (1988) 9. McMillan, D.: Delay Analysis of a Cellular Mobile Priority Queueing System. IEEE/ACM Trans. Networking 3(3), 310–319 (1995) 10. Qin, C., Yu, G., Zhang, Z., Jia, H., Huang, A.: Power Reservation-based Admission Control Scheme for IEEE 802.16e OFDMA Systems. In: Proc. of IEEE WCNC, pp. 1831–1835 (2007) 11. Zhang, T., Berg, E., Chennikara, J., Agrawal, P., Chen, J.-C., Kodama, T.: Local Predictive Resource Reservation for Handoff in Multimedia Wireless IP Networks. IEEE J. Select. Areas Commun. 19(10), 1931–1941 (2001) 12. Naghshineh, M., Schwartz, M.: Distributed Call Admission Control in Mobile/Wireless Networks. IEEE J. Select. Areas Commun. 14(4), 711–717 (1996) 13. Levine, D.A., Akyildiz, I.F., Naghshineh, M.: A Resource Estimation and Call Admission Algorithm for Wireless Multimedia Networks Using the Shadow Cluster Concept. IEEE/ACM Trans. Networking 5(1), 1–12 (1997) 14. Ryu, S., Ryu, B.-H., Seo, H., Shin, M., Park, S.: Wireless Packet Scheduling Algorithm for OFDMA System Based on Time-Utility and Channel State. ETRI Journal 27(6), 777–787 (2005) 15. WiMAX Forum: WiMAX System Evaluation Methodology. v.2.01 (2007) 16. Next Generation Mobile Networks (NGMN), white paper: Next Generation Mobile Networks Radio Access Performance Evaluation Methodology. v.1.2 (2007) 17. Jalali, A., Padovani, R., Pankaj, R.: Data Throughput of CDMA-HDR a High Efficiency-High Data Rate Persona Communication Wireless System. In: Proc. of VTC 2000-spring, pp. 1854–1858 (2000) 18. 3GPP TS 22.105 v.8.4.0: Services and service capabilities (2007) 19. ITU–T G.1010 Recommendation: End–user multimedia QoS categories (2001)
Problems with Correct Traffic Differentiation in Line Topology IEEE 802.11 EDCA Networks in the Presence of Hidden and Exposed Nodes∗ Katarzyna Kosek1, Marek Natkaniec1, and Luca Vollero2 1
AGH University of Science and Technology, Department of Telecommunications, al. Mickiewicza 30, 30-056 Krakow, Poland 2 Consorzio Interuniversitario Nazionale per l'Informatica, University of Naples Naples, Italy {kkosek, natkaniec, vollero}@ieee.org
Abstract. The problem of content delivery with a required QoS is currently one of the most important issues. In ad-hoc networks it is IEEE 802.11 EDCA which tries to face this problem. This paper describes several EDCA line topology configurations with mixed priorities of nodes. Detailed conclusions about the innovative results help to understand the behavior of EDCA in the presence of hidden and exposed nodes. They reveal a strong unfairness in medium access between certain nodes, depending on their placement. They prove that for short lines a frequent inversion in the throughput levels of high and low priority traffic occurs, which makes reliable content exchange impossible. The importance of the strength of the exposedness and hiddenness of nodes is also discussed. Furthermore, the usefulness of the four-way handshake mechanism is discussed and descriptions of the known solutions to the hidden and exposed node problems are given. Finally, novel conclusions about EDCA are provided. Keywords: EDCA, hidden and exposed nodes, QoS, simulations.
1 Introduction

License-free wireless LAN (WLAN) technologies allow wireless community networks to be created effortlessly. People can easily deploy WLANs to provide new services and exchange multimedia content from digital cameras, media centers, laptops, palmtops, mp3 recorders, mobile phones, camcorders, iPhones, etc. These new services include: streaming audio information like community radio, voice over IP (VoIP), video on demand (VoD), IP television (IPTv), online gaming using community game servers, neighborhood watch (providing surveillance, crime prevention and safety), p2p connections, shared Internet gateways and many others. In this article the authors focus on wireless ad-hoc networks, which seem one of the most promising access technologies and will surely play an important role in the nearest future.
This work has been realized under NoE CONTENT project no. 038423 within TA1 (Community Networks and Service Guarantees).
These infrastructure-less networks will allow users fast and easy configuration of wireless networks anytime and anywhere. They will play the role of community networks which greatly facilitate the network forming process and provide Internet access for neighborhood groups, small businesses, towns, schools, organizations, companies, and many others. They will become irreplaceable during conferences, project meetings, gatherings, and situations in which fast network deployment is a crucial factor. Currently existing wireless networks have demonstrated that it is possible to efficiently deal with data services (e.g., Internet connectivity). Therefore, there is a growing expectation that these networks will efficiently deal with multimedia services as well. As an answer to the variety of QoS (Quality of Service) requirements of different traffic types, the EDCA (Enhanced Distributed Channel Access) function was proposed [1]. However, the nature of ad-hoc networks makes the task of serving delay-sensitive or bandwidth-consuming traffic with a proper QoS very complicated, and it has been proven that EDCA tends to cease to function in imperfect conditions [10]. The two most troublesome characteristics of wireless networks are the following. First, network users share a common radio channel, usually with limited access control, making traffic delivery fluctuating and unpredictable. Second, network capacity is threatened by the problem of hidden and exposed nodes. The authors focus on the second issue, which they find more interesting. The authors have found it crucial to check whether current ad-hoc networks are able to provide QoS in the most typical topologies. This paper focuses on line topology scenarios. The purpose of analyzing line topology networks is very simple. A good example of such a topology is an ad-hoc network in which nodes communicate with a gateway every time they access Internet services. At the same time, most of these nodes are out of range of the gateway and need to send their data through their neighboring nodes. Other examples are long-distance multihop links using the same radio channel, which could be used in rural areas where access to the infrastructure part of a network is highly limited. The analysis of several basic line topology networks was described in [10]. The unsatisfying results encouraged the authors to analyze more complicated configurations, i.e., configurations with mixed priorities of nodes. The analysis provided in this article helps to draw several novel conclusions about EDCA-based ad-hoc networks. Among the many consequences arising from the presence of hidden and exposed nodes within an ad-hoc network, the following seem the most important: (a) unfairness in granting medium access between different nodes, strongly dependent on their placement, (b) a severely distorted order of the throughput levels of the access categories and frequent prioritizing of low priority traffic over high priority traffic, and (c) the inability of the four-way handshake mechanism to meaningfully improve the measured network performance. The remainder of this paper is organized as follows. Section 2 contains the state of the art, in which the most important solutions to the hidden and exposed node problems and the EDCA function are described. Section 3 and Section 4 contain the simulation scenarios and simulation results, respectively. The concluding remarks are given in Section 5.
2 State-of-the-Art

In this section the following issues are briefly described: the EDCA function and the most important solutions of the hidden and exposed node problems. The description is aimed at organizing the current knowledge about ad-hoc networks and their ability to satisfy the QoS requirements of different traffic classes. Additionally, the known flaws of the chief solutions of the hidden and exposed node problems are stressed in this section.

2.1 EDCA Function of the IEEE 802.11 Standard

The IEEE 802.11 standard defines two medium access functions with QoS support – EDCA (Enhanced Distributed Channel Access) and HCCA (Hybrid Coordination function Channel Access). EDCA is described in more detail next because it was designed for the purpose of ad-hoc networks. For more details on HCCA, designed for infrastructure networks, see [1]. The EDCA mechanism defines several extensions to the traditional medium access procedure (Carrier Sense Multiple Access with Collision Avoidance) in order to assure the transportation of different traffic types with a proper QoS. It introduces four Access Categories (AC) differentiated by their access parameters: Voice (VO), Video (VI), Best Effort (BE) and Background (BK). Since VO and VI are more sensitive to jitter, delay and packet loss, they have a higher priority than BE and BK. Inside a QoS node each frame of a particular traffic stream is mapped into an appropriate AC and then buffered into an appropriate hardware queue (Figure 1). For each frame, the probability of being granted channel access depends on the access parameters of the AC it belongs to. These parameters are the Arbitration Inter-Frame Space Number ― AIFSN[AC], the Contention Window minimum and maximum boundary limits ― CWmin[AC] and CWmax[AC], and the Transmission Opportunity Limit ― TXOPlimit[AC]. The impact of these parameters on channel access prioritization is shown in Figure 2.
Fig. 1. Mapping into ACs [1]
264
K. Kosek, M. Natkaniec, and L. Vollero
Fig. 2. Channel access prioritization [1]
The Backoff value is set to a random number from the interval [0, CW]. Initially CW is set to CWmin[AC], but it increases (up to CWmax[AC]) whenever this AC is involved in a collision. AIFS[AC] is given by the following equation:
AIFS[AC] = SIFS + AIFSN[AC] × SlotTime    (1)
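To make Eq. (1) concrete, the following minimal Python sketch (an editorial illustration, not part of the paper's ns-2 setup) evaluates the AIFS duration and a single backoff draw per AC, using the AIFSN and CWmin values of Table 1 and the DSSS timing of Table 2:

```python
# Minimal sketch of Eq. (1): AIFS[AC] = SIFS + AIFSN[AC] * SlotTime.
# Parameter values follow Tables 1 and 2 of this paper (DSSS: SIFS = 10 us,
# SlotTime = 20 us); the code itself is an editorial illustration only.
import random

SIFS_US = 10
SLOT_TIME_US = 20
AIFSN = {"VO": 2, "VI": 2, "BE": 3, "BK": 7}      # Table 1
CW_MIN = {"VO": 7, "VI": 15, "BE": 31, "BK": 31}  # Table 1

def aifs_us(ac):
    """AIFS[AC] in microseconds, Eq. (1)."""
    return SIFS_US + AIFSN[ac] * SLOT_TIME_US

def backoff_us(ac, cw=None):
    """One backoff draw from the interval [0, CW], in microseconds."""
    cw = CW_MIN[ac] if cw is None else cw
    return random.randint(0, cw) * SLOT_TIME_US

for ac in ("VO", "VI", "BE", "BK"):
    print(ac, aifs_us(ac), "us")  # VO/VI: 50 us, BE: 70 us, BK: 150 us
```

The shorter AIFS and CW of VO and VI are precisely what gives them statistical priority in contention.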
Every QoS node is granted the right to transmit after the medium has been sensed idle for AIFS[AC] and the Backoff time has elapsed. Therefore, the smaller the AIFSN[AC] and the CW sizes, the higher the probability of being granted access to the wireless medium before other ACs. Two types of collisions may occur during the EDCA channel access procedure ― virtual and physical. A virtual collision happens when more than one AC within a node is granted the right to transmit at the same time. In such a case, a QoS node is obliged to send the higher priority frame and delay the lower priority ones. A physical collision occurs when two or more QoS nodes start their transmissions over the wireless medium simultaneously. The second type of collision is common for hidden and exposed nodes.

2.2 Examples of Solutions of the Hidden and Exposed Node Problems

Solutions to the hidden and exposed node problems can be divided into three groups: (a) sender-initiated, (b) receiver-initiated and (c) hybrid solutions. The best-known sender-initiated mechanism which minimizes the destructive effects of hidden nodes is the four-way handshake [1]. It uses four different types of frames: Request to Send (RTS), Clear to Send (CTS), Data (DATA) and Acknowledgement (ACK). Unfortunately, the four-way handshake mechanism has several disadvantages. Firstly, it is unable to eliminate the problem of hidden nodes when the network is multihop. Secondly, it cannot solve the problem of exposed nodes at all. Additionally, the four-way handshake consumes bandwidth even if no hidden nodes appear within the network. Furthermore, due to the exchange of additional signaling frames (i.e., RTS and CTS), the mechanism is unsuitable for delay-sensitive traffic. An improvement to the four-way handshake is Multiple Access with Collision Avoidance for Wireless (MACAW, [4]), where five different types of frames are exchanged: RTS, CTS, Data Sending (DS), DATA and ACK. In order to increase per-node fairness, MACAW involves an additional Request to RTS (RRTS) control frame. The biggest weaknesses of MACAW are the unsolved exposed node problem and, furthermore, the increased signaling overhead. Another solution is the RTSS/CTSS mechanism [5]. This solution involves new types of RTS and CTS frames, namely RTS-Simultaneously and CTS-Simultaneously, in order to coordinate concurrent transmissions over exposed links.
The main drawback of the RTSS/CTSS method is the required modification of the PHY layer, which prevents its implementation in currently available hardware. The best-known receiver-initiated protocol is Multiple Access Collision Avoidance By Invitation (MACA-BI, [6]), where a three-way handshake mechanism (CTS/DATA/ACK) is invoked for every frame transmission. However, the mechanism is suitable only for infrastructure networks; in ad-hoc networks polling a node without packets to send is a waste of time. Hybrid solutions, e.g., [7], are built on the basis of both the sender- and receiver-initiated mechanisms. Their main aim is to combine the advantages and eliminate the main weaknesses of the previous solutions. These mechanisms assure better fairness and decrease end-to-end delay. However, they cannot guarantee QoS for delay-sensitive traffic and were tested only in pure DCF environments. Apart from the mentioned protocols, there exists a family of mechanisms which involve busy tone signals in order to combat the hidden and/or exposed node problems. The best-known solution is Dual Busy Tone Multiple Access (DBTMA, [8]). DBTMA uses two busy tone signals and two sub-channels to avoid hidden and exposed nodes. However, it does not take into account possible interference on the control channel and does not involve ACKs; omitting ACKs seems illogical in the case of the unreliable wireless channel. Another solution is Floor Acquisition Multiple Access with Non-persistent Carrier Sensing (FAMA-NCS, [9]). It takes advantage of long CTS frames, which aim to prevent any contending transmissions within the receiver range. Unfortunately, this scheme requires all nodes to hear the interference, which makes the mechanism inefficient in the case of short DATA frames. To the authors' best knowledge, a good solution of the hidden and exposed node problems for EDCA-based networks does not exist. Most of the current solutions are based mainly on the four-way handshake mechanism or mechanisms similar to it (e.g., [11]-[14]). To summarize, even though several alternatives to the four-way handshake mechanism exist in the literature, none of them has become popular enough to be broadly used. Additionally, in order to deal with the hidden node problem, the IEEE 802.11 standard suggests the use of the four-way handshake method and does not recommend any protocol to deal with the exposed node problem. Therefore, as the best candidate, only this protocol is analyzed in the conducted tests.
3 Simulation Scenarios

The simulation analysis was performed with an improved version of the TKN EDCA implementation [2] for the ns-2 simulator. The adjustments made mostly affect, but are not limited to, the four-way handshake mechanism, which was not supported by the original version of the TKN EDCA patch, and the process of handling duplicate frames. All important simulation parameters are given in Table 1 and Table 2. The authors assumed that all nodes send CBR traffic with a varying sending rate. DSSS is used at the PHY layer and the EDCA function is set as the MAC layer type. In all configurations, nodes form line topology networks in which each node can only detect the transmissions of its nearest neighbors. The number of nodes changes from 3 to 5 depending on the configuration (see Figure 3).
Table 1. EDCA parameter set [1]

Access Category   CWmin[AC]   CWmax[AC]   AIFSN[AC]   TXOP
VO                7           15          2           0
VI                15          31          2           0
BE                31          1023        3           0
BK                31          1023        7           0
Table 2. General simulation parameters

SIFS          10 µs          DIFS            50 µs
IFQ length    5000 frames    Slot Time       20 µs
Tx Range      250 m          Tx Power        0.282 W
Frame Size    1000 B         Traffic Type    CBR/UDP
CS Range      263 m          Node Distance   200 m
The analysis is performed on a single-hop basis because the authors focus only on the MAC layer; IP layer connections are out of the scope of this paper. The main aim of the performed tests lies in showing how serious the impact of hidden and exposed nodes on EDCA performance is and, furthermore, how much it depends on the configured priorities. In all simulations the packet generation times of different nodes are not synchronized, and a frame size of 1000 B is assumed for all traffic priorities. This assumption is made, primarily, in order to compare the four EDCA queues under similar conditions and, secondly, to avoid ineffective transmissions of small DATA frames. For all configurations two cases are analyzed: basic channel access (DATA/ACK) and the four-way handshake mechanism (RTS/CTS/DATA/ACK). Furthermore, only configurations with VO and BK priorities of nodes are tested because their performance is very similar to that of VI and BE, respectively [10]. In all figures the error of each simulation point for a 95% confidence interval does not exceed ±2%.
Fig. 3. Simulated networks
4 Simulation Results

Several different configurations of line topology scenarios with mixed priorities were analyzed; the most interesting ones are given in Table 3. In all configurations the carrier sensing range of all nodes was set to 263 m in order to obtain hidden and exposed nodes within a network (cf. Figure 3 and Table 2, in which the node distance equals 200 m).
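This geometry can be checked directly; the short sketch below (an editorial illustration using the Table 2 distances) lists which node pairs of a line can sense each other:

```python
# Sketch of the hidden-node geometry implied by Table 2: with 200 m node
# spacing and a 263 m carrier-sensing range, only immediate neighbors can
# sense each other, so every non-adjacent pair in a line is mutually hidden.
NODE_DISTANCE_M = 200
CS_RANGE_M = 263

def sensing_pairs(n_nodes):
    """Pairs (i, j) of line-topology nodes within carrier-sensing range."""
    return [(i, j)
            for i in range(n_nodes)
            for j in range(i + 1, n_nodes)
            if (j - i) * NODE_DISTANCE_M <= CS_RANGE_M]

print(sensing_pairs(3))  # [(0, 1), (1, 2)] -> N0 and N2 are hidden from each other
print(sensing_pairs(5))  # only adjacent pairs, as in the five-node scenarios
```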
Table 3. Configurations of line topology networks with mixed priorities

Network            Three-node line   Four-node line   Five-node line
Configuration No.  1    2    3       1    2            1    2    3
N0                 VO   VO   BK      VO   BK           BK   BK   VO
N1                 BK   VO   BK      BK   VO           BK   VO   BK
N2                 VO   BK   VO      BK   VO           VO   BK   BK
N3                 -    -    -       VO   BK           BK   VO   BK
N4                 -    -    -       -    -            BK   BK   VO
To simplify understanding of the behavior of the nodes, the authors present the throughput values obtained by the different nodes (under every network load) and the frame losses (under maximum network load) for all configurations. The following types of frame losses are taken into account:
Fig. 4. 3-node line: throughput (a) and frame losses (b) in Configuration 1
• Duplicate (DUP) drops – the result of collisions of either DATA and ACK frames or RTS and ACK frames, caused mainly by the exposedness of nodes. A collision on an ACK frame causes a retransmission of the corresponding DATA frame; as a result, the node which previously sent the ACK receives the same DATA frame again and drops it.
• Collisions (COL) – occur when DATA frames are lost due to a collision.
• Retransmission (RET) drops – occur when frames are dropped after exceeding the short or long retransmission limits.
• IFQ drops – frames dropped in the MAC queues.
• ARP drops – the result of not receiving ARP replies.
The authors stress the most important losses in each analyzed configuration in order to clearly explain the figures representing throughput. All presented values are normalized per node and per collision domain in order to simplify the comparison of nodes from different collision domains.

4.1 Three-Node Network

According to the IEEE 802.11 standard, the higher the priority, the more often nodes may compete for medium access. As a result, in the case of a three-node line network (Figures 4-6), the hidden nodes (N0 and N2) experience a higher number of collisions when their priority is higher.
Fig. 5. 3-node line: throughput (a) and frame losses (b) in Configuration 2
Fig. 6. 3-node line: throughput (a) and frame losses (b) in Configuration 3
In the case of this network, the RTS/CTS mechanism eliminates the problem of DATA collisions. Therefore, if it is enabled, the number of COLs counted for the hidden nodes decreases practically to zero. As a consequence, N1 does not have to wait long for its data transmission and has more chances to send its traffic (Figure 4). At the same time, the throughput achieved by the hidden nodes is unsatisfactorily low because of permanent collisions of the RTS frames sent by these nodes. Additionally, comparing Configurations 1-3, it can be seen that the synchronization of hidden nodes sending high priority traffic is the most severe problem for Configuration 1. This synchronization is a result of the fact that 1000 B frames need more than 36 time slots (for DSSS) for an uninterrupted transmission. Unfortunately, for VO the value of CWmax is equal to 15 and, therefore, an unacceptable number of collisions, causing a severe reduction of throughput, is unavoidable (Figure 4). Furthermore, the priority of the unhidden node N1 influences the throughput values of the hidden nodes. However, this is not the main cause of the problems with serving high priority streams by EDCA. Figure 5 shows that the high priority of N1 does not completely degrade the performance of the hidden node with high priority traffic. On the other hand, Figure 6 shows that the unhidden N1 may be favored over the hidden N0 when they transmit traffic of the same priority. Furthermore, it appears that enabling RTS/CTS is not always reasonable because it may decrease the throughput values (Figure 5 and Figure 6).
Fig. 7. 4-node line: throughput (a) and frame losses (b) in Configuration 1
4.2 Four-Node Network

In the four-node line scenario, enabling RTS/CTS degrades the performance (Figures 7-8). This happens especially when the middle nodes transmit high priority traffic (Figure 8). This behavior can be explained by the high number of RET drops and, furthermore, by the increased signaling overhead. It is interesting that in Configuration 1 (Figure 7) the exposed nature of the middle nodes plays the most important role (i.e., they experience a lot of DUP drops), whereas in Configuration 2 (Figure 8) their hidden nature dominates (i.e., they do not experience many DUP drops). The main reason is that in Configuration 1 the side nodes send VO (causing multiple collisions on ACK frames from either N1 or N2 as well as ARP drops), while in Configuration 2 the middle nodes send VO (causing mostly collisions on frames from N0, N3 or from themselves). In both cases nodes sending VO traffic suffer a significant number of IFQ drops because after each COL they have to resend the collided DATA frame, which causes meaningful delays in sending new DATA frames. Therefore, even with RTS/CTS disabled, nodes with low priority traffic win medium access most often.

4.3 Five-Node Network

The performance of the five-node line appeared similar to the performance of the six- and the seven-node lines (cf. [10]); therefore, the authors decided to test only this network with mixed traffic priorities.
Fig. 8. 4-node line: throughput (a) and frame losses (b) in Configuration 2
The most interesting conclusion from the analysis of this network is that, in general, high priority traffic is favored over low priority traffic regardless of the position of the transmitting nodes (Figures 9-11). Obviously, a similar behavior is expected for the six- and the seven-node networks. In Configuration 1 (Figure 9) the network performance is self-explanatory. The middle node is the only one with high priority. It collides mostly with the side nodes sending low priority traffic, which do not win medium access very often. In Configuration 3 (Figure 11) the situation is again very simple. This time only the side nodes have high priority and, as a result, may send their traffic most often. They collide mostly with the middle node, which sends low priority traffic. The most interesting case is Configuration 2 (Figure 10), in which N1 and N3 (sending VO) can collide with any node. Additionally, due to their placement, they experience some DUP drops. Under a small network load, the numbers of DUP and RET drops are the highest for N1 and N3; however, their number of IFQ drops is the smallest. As a result, their throughput is visibly limited. As the network load grows, the throughput of N1 and N3 also increases. This performance can be explained by the strong hiddenness of the middle node N2. This node experiences the highest number of IFQ drops under practically every network load, and with the increase of the offered load its inferiority becomes even more evident. Obviously, nodes sending the same priority traffic do not achieve the same throughput.
Fig. 9. 5-node line: throughput (a) and frame losses (b) in Configuration 1
Fig. 10. 5-node line: throughput (a) and frame losses (b) in Configuration 2
Fig. 11. 5-node line: throughput (a) and frame losses (b) in Configuration 3
This observation leads to the conclusion that also in this configuration the problem of unfairness between certain nodes appears, and that it depends on the nodes' positions. Additionally, similarly to the four-node line, in all analyzed configurations enabling RTS/CTS decreased the obtained throughput values.
5 General Conclusions

This paper presents a novel simulation study of line topology ad-hoc networks based on EDCA in which nodes are assigned different priorities. The problems caused by the hiddenness and exposedness of nodes are discussed in detail. Most of all, it is noticed that the analyzed networks are unable to transport high priority traffic with the desired QoS. Moreover, the paper questions the usefulness of the RTS/CTS mechanism, which, in most cases, does not improve the performance of the simulated networks but rather causes a decrease of the obtained throughput values. The most important conclusions are the following. In the three-node line network, the problem of synchronization of hidden nodes is the most severe one. Especially with RTS/CTS disabled, it causes a strong reduction of the throughput values of the high priority streams. The priority of the middle node is also crucial because it influences the throughput levels of the neighboring nodes; however, it is not the main reason for the inability to provide their streams with the desired QoS.
The performance of the four-node line topology is the most unpredictable because nodes which were previously prioritized are no longer superior. This can be explained by either the strong hiddenness or the exposedness of the middle nodes, depending on the actual network configuration. The performance of longer line topologies is much better. In all analyzed cases, nodes sending high priority traffic were favored over nodes sending low priority traffic; therefore, their performance was close to that required by the IEEE 802.11 standard. Unfortunately, unfairness in granting medium access between nodes sending traffic of the same priority was also revealed. On the basis of all observations, the authors find line topology networks with mixed priority traffic most troublesome when the lines are short. However, from a wider perspective, the performance of all measured networks is completely unacceptable: every simulation scenario disclosed a severe unfairness in granting medium access between certain nodes. Therefore, the authors find it crucial to design a novel mechanism which will improve the fairness between nodes and make traffic delivery reliable even if hidden or exposed nodes are present within a network. In particular, they think that the awareness of nodes should increase. For this reason, their future research will focus on defining new metrics which should be taken into account during the design of a new MAC protocol.
References

1. IEEE 802.11 Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. IEEE Inc. (2007)
2. TKN EDCA 802.11e extension (2006), http://www.tkn.tu-berlin.de/research/802.11e_ns2
3. Bai, X., Mao, Y.M.: The Impact of Hidden Nodes on MAC Layer Performance of Multihop Wireless Networks Using IEEE 802.11e Protocol. In: International Conference on Wireless Communications, Networking and Mobile Computing, WiCom 2007 (2007)
4. Bharghavan, V., Demers, A., Shenker, S., Zhang, L.: MACAW: A Media Access Protocol for Wireless LAN's. In: ACM SIGCOMM 1994 (1994)
5. Mittal, K., Belding, E.M.: RTSS/CTSS: Mitigation of Exposed Terminals in Static 802.11-based Mesh Networks. In: The 2nd IEEE Workshop on Wireless Mesh Networks, WiMesh 2006 (2006)
6. Talucci, F., Gerla, M., Fratta, L.: MACA-BI (MACA by invitation) – A Receiver Oriented Access Protocol for Wireless Multihop Networks. In: The 8th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, IEEE PIMRC 1997 (1997)
7. Wang, Y., Garcia-Luna-Aceves, J.J.: A New Hybrid Channel Access Scheme for Ad Hoc Networks. In: The 1st IFIP Annual Mediterranean Ad Hoc Networking Workshop, MedHocNet 2002 (2002)
8. Haas, Z.J., Deng, J.: Dual Busy Tone Multiple Access (DBTMA) – A Multiple Access Control for Ad Hoc Networks. IEEE Transactions on Communications (2002)
9. Fullmer, C.L., Garcia-Luna-Aceves, J.J.: Solutions to Hidden Terminal Problems in Wireless Networks. In: ACM SIGCOMM 1997 (1997)
10. Kosek, K., Natkaniec, M., Pach, A.R.: Analysis of IEEE 802.11e Line Topology Scenarios in the Presence of Hidden Nodes. In: Coudert, D., Simplot-Ryl, D., Stojmenovic, I. (eds.) ADHOC-NOW 2008. LNCS, vol. 5198, pp. 380–390. Springer, Heidelberg (2008)
11. Hamidian, A., Körner, U.: Providing QoS in Ad Hoc Networks with Distributed Resource Reservation. In: Mason, L.G., Drwiega, T., Yan, J. (eds.) ITC 2007. LNCS, vol. 4516, pp. 309–320. Springer, Heidelberg (2007)
12. Ying, Z., Ananda, A.L., Jacob, L.: A QoS Enabled MAC Protocol for Multi-hop Ad Hoc Wireless Networks. In: The 22nd IEEE International Performance, Computing, and Communications Conference, IPCCC 2003 (2003)
13. Benveniste, M., Tao, Z.: Performance Evaluation of a Medium Access Control Protocol for IEEE 802.11s Mesh Networks. In: IEEE Sarnoff Symposium (2006)
14. Choi, S., Kim, S., Lee, S.: The Impact of IEEE 802.11 MAC Strategies on Multi-hop Wireless Mesh Networks. In: The 2nd IEEE Workshop on Wireless Mesh Networks, WiMesh 2006 (2006)
Adaptive and Iterative GSC/MRC Switching Techniques Based on CRC Error Detection for AF Relaying System

Jong Sung Lee and Dong In Kim

School of Information and Communication Engineering, Sungkyunkwan University, Suwon 440-746, Korea
{wwwljs,dikim}@ece.skku.ac.kr
Abstract. In a cooperative amplify-and-forward (AF) system using spatially distributed relays, some relay paths degrade performance because of noise enhancement. In order to alleviate the deteriorating effect of such paths, we propose adaptive generalized selection combining (GSC)/MRC and iterative GSC/MRC switching techniques based on cyclic-redundancy-check (CRC) error detection. The adaptive GSC (with order N)/MRC switching with K relays operates as follows: the destination performs initial data detection by combining the strongest N (< K + 1) paths, and the CRC is checked for error detection. If an error is detected, MRC with all K + 1 paths is then performed. As a further generalization, the iterative GSC/MRC is employed in an iterative manner conditioned on successive error detection. Our simulation results show that the proposed schemes achieve better diversity performance than the conventional one in terms of the bit-error rate (BER) when an appropriate CRC generator polynomial is used for the given frame size. The diversity gain of the proposed schemes becomes far more prominent when imperfect channel estimation is assumed at the destination.

Keywords: Cooperative AF system, cooperative diversity, CRC, adaptive GSC/MRC, iterative GSC/MRC.
1 Introduction
Diversity techniques are widely used to mitigate the effects of fading in wireless communications and come in several forms, such as time, frequency, and spatial diversity. Among them, spatial diversity is very attractive because it can be achieved without any delay or rate loss. Spatial diversity is a common method of generating multiple communication paths by using more than one antenna at the transmitter and/or the receiver; however, the terminals cannot practically employ multiple antennas due to their space limitations.
This research was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement) (IITA-2009-C1090-0902-0005).
In this case, many works have found that distributed relays, each equipped with a single antenna, can help the source transmit signals to the destination, namely the cooperative system [1], [2]. Since the cooperation among relays can provide spatial diversity to the system, it extends cell coverage and improves capacity. Recently, various cooperative protocols involving relays have been proposed in [3]: representatively, the amplify-and-forward (AF) protocol and the decode-and-forward (DF) protocol. In a cooperative AF system, relays amplify the received signal from the source and retransmit it to the destination. In a cooperative DF system, on the other hand, relays decode the received signal from the source and transmit the regenerated signal to the destination, where more complex processing is required at the relays than in the AF system [3]. Moreover, while full diversity can be achieved with AF relays using maximal-ratio combining (MRC) [4], it cannot be achieved with DF relays using MRC [5]. Since a cooperative AF system can achieve a diversity gain of order up to the number of diversity paths, conventional MRC combines all the paths from the source and relays. However, it might not be optimal because of the spectrum efficiency loss due to increased orthogonal channel allocation. Moreover, when the channel condition is bad or channel estimation is imperfect at the destination, there could be some paths degrading the performance due to noise enhancement [6]. A more flexible scheme was recently proposed in [7], where the destination decides to combine the signals of the earliest N paths or of all K + 1 paths according to error detection, in order to reduce the burden of MRC with all the diversity paths and improve the outage probability. There, the error detection is assumed perfect and depends on the signal-to-noise ratio (SNR) at the destination, where an error is declared if the received SNR is lower than a certain threshold value. In this paper, we propose adaptive generalized selection combining (GSC)/MRC and iterative GSC/MRC switching techniques, based on cyclic-redundancy-check (CRC) error detection, for the cooperative AF system. The source transmits data which consist of information bits and parity bits calculated by a CRC generator polynomial. The destination first performs maximal-ratio combining of the relatively strongest (highest SNR) N paths among the direct path (the S-D link) and the relay paths (S-R-D links), and performs the CRC check for error detection. If any error is detected, the destination then performs MRC with all K + 1 paths. This idea is similar to that given in [7]; however, we consider a more practical error detection approach (i.e., CRC) and GSC with the strongest paths, in both perfect and imperfect channel estimation environments. Our simulation results demonstrate that the proposed combining schemes can improve the bit-error rate (BER) performance of the cooperative AF system. In particular, when channel estimation error exists at the destination, the diversity gain of the proposed schemes increases considerably. In the following, Section 2 describes the cooperative multiple AF system considered herein. The proposed schemes are presented in Section 3. Simulation results are presented in Section 4, and the conclusion is given in Section 5.
2 System Model
We consider a cooperative multiple AF system in which K relay nodes help a source node communicate with a destination node, as shown in Fig. 1 [8]. All the nodes are half-duplex and thus cannot receive and transmit simultaneously. The transmission is performed in time-slotted orthogonal channels, because the number of available frequency bands is limited. We assume that the fading coefficients of the channels remain constant within a frame, and $h_{s,d}$, $h_{s,r_i}$ and $h_{r_i,d}$ are independent and identically distributed (i.i.d.) slow-fading channel gains from source to destination, source to $i$th relay and $i$th relay to destination, respectively. Here, we assume $h_{s,d} \sim \mathcal{CN}(0, \sigma_{s,d}^2)$, $h_{s,r_i} \sim \mathcal{CN}(0, \sigma_{s,r_i}^2)$ and $h_{r_i,d} \sim \mathcal{CN}(0, \sigma_{r_i,d}^2)$ with $\sigma_{s,d}^2 = E[|h_{s,d}|^2]$, $\sigma_{s,r_i}^2 = E[|h_{s,r_i}|^2]$ and $\sigma_{r_i,d}^2 = E[|h_{r_i,d}|^2]$.

2.1 Cooperative System
We consider a cooperative strategy in a two-phase manner. In the first phase, the source broadcasts its data, which consist of information bits and parity bits calculated by a CRC generator polynomial, to the destination and the K relay nodes. The received signals $y_{s,d}$ and $y_{s,r_i}$ at the destination and the $i$th relay are given, respectively, by

$$y_{s,d} = \sqrt{p_s}\, h_{s,d}\, x + \eta_{s,d}, \qquad (1)$$

$$y_{s,r_i} = \sqrt{p_s}\, h_{s,r_i}\, x + \eta_{s,r_i}, \quad i \in [1, K], \qquad (2)$$
where $p_s$ is the transmitted source power, and $\eta_{s,d}$ and $\eta_{s,r_i}$ are additive white Gaussian noise (AWGN) terms, $\sim \mathcal{CN}(0, N_0)$, at the destination and at the $i$th relay, respectively. In the second phase, the received signals $y_{r_i,d}$ at the destination are given by

$$y_{r_i,d} = \alpha_i h_{r_i,d} h_{s,r_i} x + \eta_{r_i,d}, \quad i \in [1, K], \qquad (3)$$

Fig. 1. Cooperative multiple AF system model
where $\eta_{r_i,d}$ is AWGN $\sim \mathcal{CN}(0, N_0)$ and $\alpha_i$ is the amplifier gain at the $i$th relay. Since the available energy at the $i$th relay is given by $p_i$, the amplifier gain $\alpha_i$ should satisfy [1]

$$\alpha_i \le \frac{p_i}{p_s |h_{s,r_i}|^2 + N_0}, \quad i \in [1, K]. \qquad (4)$$

We assume that equality holds in (4) because the SNR is maximized in that case.
2.2 Sufficient Statistic
The output of the conventional MRC with all K + 1 paths is given by [8]

$$y = \frac{\sqrt{p_s}\, h_{s,d}^{*}}{N_0}\, y_{s,d} + \sum_{i=1}^{K} \frac{\sqrt{p_s}\, \alpha_i h_{r_i,d}^{*} h_{s,r_i}^{*}}{(\alpha_i^2 |h_{r_i,d}|^2 + 1) N_0}\, y_{r_i,d} = \left( \frac{p_s |h_{s,d}|^2}{N_0} + \sum_{i=1}^{K} \frac{p_s \alpha_i^2 |h_{r_i,d}|^2 |h_{s,r_i}|^2}{(\alpha_i^2 |h_{r_i,d}|^2 + 1) N_0} \right) x + \frac{\sqrt{p_s}\, h_{s,d}^{*}}{N_0}\, \eta_{s,d} + \sum_{i=1}^{K} \frac{\sqrt{p_s}\, \alpha_i h_{r_i,d}^{*} h_{s,r_i}^{*}}{(\alpha_i^2 |h_{r_i,d}|^2 + 1) N_0} \left( \eta_{r_i,d} + \alpha_i h_{r_i,d}\, \eta_{s,r_i} \right). \qquad (5)$$

The instantaneous SNRs of the direct path and the relay paths at the destination are given, respectively, by [9]

$$\gamma_{\mathrm{direct}} = \frac{p_s |h_{s,d}|^2}{N_0}, \qquad (6)$$

$$\gamma_{\mathrm{relay},i} = \frac{p_s \alpha_i^2 |h_{r_i,d}|^2 |h_{s,r_i}|^2}{(\alpha_i^2 |h_{r_i,d}|^2 + 1) N_0}, \quad i \in [1, K]. \qquad (7)$$
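As a small numerical illustration (an editorial sketch; the function names and test values are ours, and Eq. (4) is used in the reconstructed form shown above), the amplifier gain and the per-path SNRs can be computed as follows:

```python
# Sketch of the relay gain (Eq. (4), taken with equality) and of the per-path
# instantaneous SNRs (Eqs. (6)-(7)). The numeric values are arbitrary test
# inputs, not the paper's simulation settings.

def relay_gain(p_i, p_s, h_sr, n0):
    """alpha_i from Eq. (4) with equality (the SNR-maximizing choice)."""
    return p_i / (p_s * abs(h_sr) ** 2 + n0)

def snr_direct(p_s, h_sd, n0):
    """Eq. (6): instantaneous SNR of the direct S-D path."""
    return p_s * abs(h_sd) ** 2 / n0

def snr_relay(p_s, alpha, h_rd, h_sr, n0):
    """Eq. (7): instantaneous SNR of the i-th S-R-D path."""
    a2g = alpha ** 2 * abs(h_rd) ** 2
    return p_s * a2g * abs(h_sr) ** 2 / ((a2g + 1) * n0)

p_s = p_i = 1.0
n0 = 0.1
h_sd, h_sr, h_rd = 0.8 + 0.1j, 0.5 - 0.4j, 0.9 + 0.2j
alpha = relay_gain(p_i, p_s, h_sr, n0)
gammas = [snr_direct(p_s, h_sd, n0), snr_relay(p_s, alpha, h_rd, h_sr, n0)]
order = sorted(range(len(gammas)), key=gammas.__getitem__, reverse=True)
print(order)  # descending-SNR path order used by the GSC stage described next
```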
In our proposed schemes, to obtain the strongest paths up to N among the K + 1, we sort $\{\gamma_{\mathrm{direct}}, \gamma_{\mathrm{relay},i}\}_{i \in [1,K]}$ in descending order, resulting in $\{\gamma_{(j)}\}_{j \in [1,K+1]}$, where $\gamma_{(1)} \ge \cdots \ge \gamma_{(K+1)}$ holds. Next, the N highest-SNR paths among the K + 1 are combined.
2.3 Channel Estimation Error Model
Most works on cooperative AF systems have assumed perfect channel estimation at the destination. Channel estimation can be aided by transmitting pilot symbols that are known at the destination. Therefore, system performance depends on the accuracy of the channel estimate and the number of pilot symbols being sent. In practice, it is desirable to limit the number of transmitted pilot symbols, because pilot symbols not only reduce the spectrum efficiency but also consume additional transmit power. In order to examine a practical system performance, the assumption that perfect channel estimation is available at the destination must be removed.
Therefore, we define the imperfect channel estimate at the destination as

$$\hat{h}_{x,y} = h_{x,y} + e_{x,y}, \qquad (8)$$

where $e_{x,y}$ is the channel estimation error of each path at the destination [10]. If we assume that the channel is estimated by a minimum mean square error (MMSE) estimator, the orthogonality principle implies $E[|\hat{h}_{x,y}|^2] = E[|h_{x,y}|^2] + E[|e_{x,y}|^2]$ [11]. We further assume that the variance of the channel estimation error is equal to $\beta \sigma_{x,y}^2$, i.e., $E[|e_{x,y}|^2] = \beta \sigma_{x,y}^2$. Therefore, we define the channel estimation error model as

$$\hat{h}_{x,y} \sim \left[ \mathcal{CN}(0, \sigma_{x,y}^2) + \mathcal{CN}(0, \beta \sigma_{x,y}^2) \right], \qquad (9)$$
where β is the channel estimation error rate of each path at the destination.
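As an illustration of this model (an editorial sketch; the variable names are ours), imperfect estimates can be generated by adding an independent zero-mean complex Gaussian error of variance $\beta \sigma_{x,y}^2$:

```python
# Sketch of the imperfect-channel-estimation model of Eqs. (8)-(9):
# h_hat = h + e with h ~ CN(0, sigma^2) and e ~ CN(0, beta * sigma^2).
import numpy as np

def cn(n, var, rng):
    """n i.i.d. CN(0, var) samples (variance split over real/imag parts)."""
    s = np.sqrt(var / 2.0)
    return rng.normal(0.0, s, n) + 1j * rng.normal(0.0, s, n)

rng = np.random.default_rng(0)
sigma2, beta = 1.0, 0.05                     # beta = 5%, the value used in Sec. 4.2
h = cn(100_000, sigma2, rng)                 # true channel gains
h_hat = h + cn(100_000, beta * sigma2, rng)  # Eq. (8)
print(np.mean(np.abs(h_hat) ** 2))  # ~ (1 + beta) * sigma2, per the MMSE orthogonality relation
```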
3 Proposed Algorithms
In this section, we present the algorithms of the proposed schemes. Since a cooperative AF system with K relays can achieve a diversity gain of order up to the number of diversity paths, conventional MRC combines all K + 1 paths. However, conventional MRC might not be optimal, because it is more sensitive to channel estimation errors and there could be some paths degrading the performance due to noise enhancement. In contrast, our proposed schemes combine the relatively strongest paths, excluding the relatively weakest paths, while the CRC is checked for error detection.

3.1 Adaptive GSC/MRC Scheme
Fig. 2 shows the algorithm of the adaptive GSC/MRC switching scheme based on CRC error detection. The destination performs an initial detection by selecting the highest-SNR paths up to N among the K + 1 available ones and maximal-ratio combining them. Then, the CRC is checked to find any possible errors in the initial data detection. With no error detected, the initial data detection is declared correct; otherwise, i.e., if there are errors, MRC with all K + 1 paths is performed.

3.2 Iterative GSC/MRC Scheme
The adaptive GSC/MRC can be generalized further into an iterative GSC/MRC that operates in an iterative manner conditioned on successive error detection. Even though the complexity of the iterative scheme at the destination is proportional to the number of iterations, the iterative scheme improves the BER performance compared to the adaptive GSC/MRC. Therefore, it is essential to select a proper N corresponding to the number of relays. A detailed algorithm of the iterative GSC/MRC scheme based on CRC error detection is given in Fig. 3.
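As a compact illustration of both schemes (an editorial sketch, not the authors' code; `detect` and `crc_ok` stand for the MRC detector and the CRC check and are placeholders), the common switching loop can be written as:

```python
# Sketch of the adaptive (Sec. 3.1) and iterative (Sec. 3.2) GSC/MRC switching
# logic. detect() and crc_ok() are placeholders for the MRC detector and the
# CRC check; they are not defined in the paper and are assumptions here.

def gsc_mrc(paths, snrs, n, iterative=False, detect=None, crc_ok=None):
    """paths: the K+1 received branches; snrs: their instantaneous SNRs
    (gamma_direct and gamma_relay_i from Eqs. (6)-(7)); n: initial GSC order N."""
    order = sorted(range(len(paths)), key=lambda i: snrs[i], reverse=True)
    k = n
    while True:
        bits = detect([paths[i] for i in order[:k]])  # MRC over the k strongest branches
        if crc_ok(bits) or k >= len(paths):
            return bits  # accepted, or full MRC already performed
        # adaptive: jump straight to full MRC; iterative: add one branch per failure
        k = k + 1 if iterative else len(paths)
```

The only difference between the two schemes is the update of `k` after a failed CRC check, which is why the iterative variant trades extra detection passes for a finer-grained fallback.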
Fig. 2. Algorithm of adaptive GSC/MRC switching scheme based on CRC error detection
3.3 CRC Polynomial Selection
A CRC is popular because it is simple to implement in binary hardware and particularly good at detecting common errors caused by noise in transmission channels [12]. In the proposed schemes, an overhead occurs because the transmitted data consist of information bits and parity bits calculated by a CRC generator polynomial; we define Γ as the ratio of this overhead to the frame size. Although the reliability of error detection is generally proportional to the order of the CRC generator polynomial, it saturates at a certain point.
Fig. 3. Algorithm of iterative GSC/MRC switching scheme based on CRC error detection
Therefore, to reduce the overhead, it is important to select a proper CRC generator polynomial corresponding to the frame size.
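For illustration, a bitwise software model of the CRC8 generator polynomial used in Section 4, $X^8 + X^5 + X^4 + 1$, might look as follows (a sketch with assumed register conventions, as noted in the comments):

```python
# Bitwise sketch of a CRC-8 with generator x^8 + x^5 + x^4 + 1 (0x31), the
# CRC8 polynomial of Sec. 4. An MSB-first, zero-initialized register is
# assumed; common CRC-8/Maxim implementations are bit-reflected instead, and
# the paper does not state which convention its simulator uses.
CRC8_POLY = 0x31  # x^8 + x^5 + x^4 + 1 with the implicit x^8 term dropped

def crc8(data, poly=CRC8_POLY, init=0x00):
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

frame = b"example payload"
parity = crc8(frame)                       # parity byte appended by the source
assert crc8(frame + bytes([parity])) == 0  # receiver: zero remainder => no error detected
```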
4 Simulation Results
In this section, simulation results are presented to investigate the BER performance of the proposed schemes, assuming both perfect and imperfect channel estimation at the destination. We assume that the number of relays is K = 3, the adopted modulation is BPSK, and the available power of each node is the same, $p_s = p_i = p$, yielding the average input SNR $\gamma = p/N_0$. In particular, we consider a symmetric relay channel condition; thus the average channel gains are assumed to satisfy $\sigma_{s,d}^2 = \sigma_{s,r_i}^2 = \sigma_{r_i,d}^2$. To look into the effect of different sizes of generator polynomials relative to the frame size, we employ both CRC-4-ITU (CRC4, $X^4 + X^3 + 1$) and CRC-8-Dallas/Maxim (CRC8, $X^8 + X^5 + X^4 + 1$) when the frame size is 128 and 256.

4.1 Perfect Channel Estimation
First, we compare conventional MRC with the proposed schemes in terms of the BER performance when channel estimation is perfect at the destination. In Fig. 4, we inspect the optimal order of GSC/MRC using CRC8. As seen in this figure, the GSC/MRC slightly outperforms the conventional MRC in the high SNR region, and the performance of the GSC (with order N = 1)/MRC is almost the same as that of the GSC (with order N = 2)/MRC. Since complexity at the destination is generally proportional to the order of the GSC, we set N = 1 in the following simulations.
Fig. 4. Comparison of conventional MRC and adaptive GSC (N = 1, 2)/MRC switching scheme using 8-bit CRC when K = 3 and the frame size is 128
Fig. 5. Comparison of conventional MRC and proposed schemes using 4- and 8-bit CRC when K = 3 and channel estimation is perfect at the destination
Next, we compare conventional MRC with our proposed schemes using two different CRC generator polynomials, CRC4 and CRC8. When the frame size is 128, the overhead portion of the two CRC codes is Γ4 = 3% (CRC4) and Γ8 = 6% (CRC8); when the frame size is 256, it is Γ4 = 1.5% (CRC4) and Γ8 = 3% (CRC8), respectively. Fig. 5(A) shows that both the adaptive and the iterative GSC/MRC outperform conventional MRC, except for the adaptive GSC/MRC using CRC4 in the low SNR region, when the frame size is 128. Since the overhead portion of CRC4 is lower than that of CRC8, the use of CRC4 is attractive in this environment. On the other hand, when the frame size is 256, the proposed schemes using CRC8 outperform conventional MRC, while the proposed schemes using CRC4 rather degrade the BER performance compared to conventional MRC because of unreliable error detection, as shown in Fig. 5(B). These simulation results confirm that our proposed schemes must employ an appropriate CRC generator polynomial for reliable error detection.

4.2 Imperfect Channel Estimation
Here, we compare conventional MRC with the proposed schemes when channel estimation is imperfect, and assume β = 5%, where β is the channel estimation error rate of each path at the destination. In contrast to the previous results, Figs. 6(A) and 6(B) show that the proposed schemes using an appropriate CRC generator polynomial significantly outperform conventional MRC in terms of the BER performance as the SNR increases. Moreover, we can see that the performance gap between the adaptive GSC (with order N = 1)/MRC and the iterative GSC/MRC becomes larger owing to successive error detection. However, since excessive iteration increases the complexity, it is essential to select a proper N corresponding to the number of relays K.
Fig. 6. Comparison of conventional MRC and proposed schemes using 4- and 8-bit CRC when K = 3 and channel estimation is imperfect at the destination (β = 5%)
5 Conclusion
We have proposed adaptive and iterative GSC/MRC switching techniques to overcome both the noise enhancement and the inaccurate channel estimation encountered in real cooperative multiple AF systems. The proposed schemes first combine a subset of the relatively strongest paths and then switch to MRC combining all paths, depending on the outcome of reliable CRC error detection. Our simulation results showed that the proposed schemes outperform conventional MRC in terms of the BER performance when an appropriate CRC polynomial is selected for a given frame size. The diversity gain of the proposed schemes is well preserved even in a practical environment where channel estimation error exists at the destination.
References

1. Laneman, J.N., Wornell, G.W.: Energy-efficient antenna sharing and relaying for wireless networks. In: IEEE WCNC, pp. 7–12. IEEE Press, Chicago (2000)
2. Laneman, J.N.: Cooperative diversity in wireless networks: Algorithms and architectures. Ph.D. dissertation, Mass. Inst. Technol., Cambridge (2002)
3. Laneman, J.N., Tse, D.N.C.: Cooperative diversity in wireless networks: efficient protocols and outage behavior. IEEE Trans. Inform. Theory 50, 3062–3080 (2004)
4. Mo, W., Wang, Z.: Average symbol error probability and outage probability analysis for general cooperative diversity system at high signal to noise ratio. In: Conf. Inf. Sci. Syst., Princeton, pp. 1443–1448 (2004)
5. Adeane, J., Rodrigues, M.R.D., Wassell, I.J.: Characterisation of the performance of cooperative networks in Ricean fading channels. In: 12th Int. Conf. Telecommun., Cape Town (2005)
6. Kim, D.I.: Adaptive selection/maximal ratio combining based on error detection of multidimensional multicode DS-CDMA. IEEE Trans. Commun. 52, 446–456 (2004)
7. Choi, W., Hong, J.P., Kim, D.I., Kim, B.-H.: An error detection aided GSC/MRC switching scheme in AF based cooperative communications. In: IEEE VTC 2009 Spring. IEEE Press, Barcelona (2009)
8. Brennan, D.G.: Linear diversity combining techniques. Proceedings of the IEEE 91, 331–356 (2003)
9. Seddik, K.G., Sadek, A.K., Su, W., Ray Liu, K.J.: Outage analysis of multi-node amplify-and-forward relay networks. In: IEEE WCNC, pp. 1184–1188. IEEE Computer Society Press, Los Alamitos (2006)
10. Vakili, A., Sharif, M., Hassibi, B.: The Effect of Channel Estimation Error on the Throughput of Broadcast Channels. In: IEEE ICASSP, pp. 29–32. IEEE Computer Society Press, Los Alamitos (2006)
11. Kailath, T., Sayed, A.H., Hassibi, B.: Linear Estimation. Prentice-Hall, Englewood Cliffs (2000)
12. Castagnoli, G., Ganz, J., Graber, P.: Optimum cyclic redundancy check codes with 16-bit redundancy. IEEE Trans. Commun. 38, 111–114 (1990)
WiBro Net.-Based Five Senses Multimedia Technology Using Mobile Mash-Up

Jung-Hyun Kim, Hyeong-Joon Kwon, and Kwang-Seok Hong

School of Information and Communication Engineering, Sungkyunkwan University, 300, Chunchun-dong, Jangan-gu, Suwon, KyungKi-do, 440-746, Korea
{kjh0328, katsyuki}@skku.edu, [email protected]
http://hci.skku.ac.kr
Abstract. In this paper, we suggest and implement an enhanced Wireless Broadband (WiBro) Net.-based five senses multimedia technology using Web-map-oriented mobile Mash-up. The WiBro Net. in this applicative technology supports and allows various Web 2.0-oriented capabilities such as user-centric multimedia, individual multimedia message exchange between multiple users, and new-media-based information and knowledge sharing/participation without spatiotemporal dependency. To inspect the applicability and usability of the technology, we accomplish various experiments, which include 1) WiBro Net.-based real-time field tests and 2) ISO 9241/11- and /10-based surveys on user satisfaction by relative experience in comparison with an AP-based commercialized mobile service. As a result, this application provides a higher data transmission rate on the UP-DOWN air-link and a wider coverage region. Also, its average System Usability Scale (SUS) score was estimated at 83.58%, and it held a relative competitive advantage in specific item scales such as system integration, individualization, and conformity with user expectations.
1 Introduction

Since the late 1990s, mobile multimedia applications and services that are currently provided and supported by multiple specific network-centric architectures have been migrating towards a single converged user-centric communications network, and have become an increasingly important market worldwide. Furthermore, mobile Web 2.0-oriented next generation mobile multimedia technology must sufficiently reflect individual sensory information with the progress of next generation mobile stations and wireless Internet/communication technologies, which include various network standards such as WiBro (Mobile WiMAX), GPRS, 3G, LTE UMTS/HSPA, EV-DO and some portable satellite-based systems. Namely, mobile Web 2.0-based user-centric multimedia technology focuses on the next generation wireless mobile applications and services, which will integrate collective intelligence-oriented participational five senses contents and entertaining media capabilities to better suit users' needs in a converged, wireless Internet world and a sensor network-based ubiquitous computing environment. However, the related conventional studies [1~3] on mobile station-based multimedia convergence and services have focused generally on
Mobile MCP (Multimedia Content Provider)-centered monolithic mobile multimedia services such as the DMB (Digital Multimedia Broadcasting) service, location-sensitive pedestrian navigation, generic tourist guides and multimedia cartography. As a result, they lack acceptance of collective intelligence-oriented Web 2.0 core concepts, such as participation and sharing of arbitrary multimedia or knowledge. In addition, because conventional mobile multimedia technology based on a Wireless Access Point (WAP/AP) using Wi-Fi, Bluetooth or related standards implies typical and intrinsic problems of limited uniform spatial coverage and bandwidth, mobile users must consider these problems from various aspects in their service access strategy, such as proximity to an access point, the amount of interference, the quality of the mobile multimedia service or, more abstractly, a utility function capturing a user's valuation of available multimedia services and their current costs [4~5]. Consequently, in this paper we suggest and implement an enhanced WiBro Net.-based five senses multimedia technology that realizes collective intelligence and mobile social networking between multiple mobile users, toward mobile Web 2.0 and next generation realistic mobile multimedia technology, using Web-map-oriented mobile Mash-up. The WiBro Net. technology in South Korea is compatible with IEEE 802.16e (mobile WiMAX), which extends the 802.16d/a physical and MAC layers for mobile access. Our application includes 1) a mobile station with a GPS (Global Positioning System) unit-based mixed-Web map module via mobile Mash-up, 2) an authoring module for location-based five senses multimedia contents using a ubiquitous-oriented sensor network and the WiBro Net., and 3) an LBS-oriented intelligent agent module that includes an ontology-based five senses multimedia retrieval module, a social network-based user detection interface and a user-centric automatic five senses multimedia recommender interface.
2 Related Work

2.1 Web 2.0 and Collective Intelligence

Web 2.0 encapsulates the concept of the proliferation of interconnectivity and interactivity of Web-delivered content, and it describes trends in the use of World Wide Web (WWW) technology and Web design that aim to enhance creativity, communications, secure information sharing, collaboration and functionality of the Web. Its concepts have led to the development and evolution of Web culture communities and hosted services. One example is social-networking sites that focus on building online communities of people who share interests and activities, or who are interested in exploring the interests and activities of others. Other examples include blogs, video sharing sites that allow individuals to upload video clips to an Internet website, and wikis, which are a page or collection of Web pages designed to enable anyone who accesses them to contribute or modify content [6~8]. Collective intelligence is a shared or group intelligence that emerges from the collaboration and competition of many individuals. It can also be defined as a form of networking enabled by the rise of communications technology, namely the Internet. Web 2.0 has enabled interactivity, and thus users are able to generate their own content. Collective intelligence draws on this to enhance the social pool of existing knowledge.
Henry Jenkins, a key theorist of new media and media convergence, draws on the theory that collective intelligence can be attributed to media convergence and participatory culture [9]. New media are often associated with the promotion and enhancement of collective intelligence. The ability of new media to easily store and retrieve information, predominantly through databases and the Internet, allows information to be shared without difficulty. Thus, through interaction with new media, knowledge passes easily between sources [8], resulting in a form of collective intelligence. The use of interactive new media, particularly the Internet, promotes online interaction and this distribution of knowledge between users. In this context, collective intelligence is often confused with shared knowledge. The former is knowledge that is generally available to all members of a community, whilst the latter is information known by all members of a community.

2.2 Mobile Multimedia Application Technology

Convergence of media occurs when multiple products come together to form one product with the advantages of all of them. Multimedia Messaging Services (MMS) are a good example. They are a standard for telephone messaging systems that allows sending messages that include multimedia objects (images, audio, video, rich text) and not just text as in the Short Message Service (SMS). Mobile phones with built-in or attached cameras or with built-in MP3 players are very likely to have an MMS messaging client – software that interacts with the mobile subscriber to compose, address, send, receive, and view MMS messages. MMS technology is tapped by various companies to suit different solutions. CNN-IBN, India's biggest English news channel, has Mobile Citizen Journalism where citizens can send MMS photos directly to the studio. It is mainly deployed in cellular networks along with other messaging systems such as SMS, Mobile Instant Messaging and Mobile E-mail. We find interesting research approaches in the Cuypers system [10] and the OPERA project [11]. Mobile devices are not in their research focus, even though they deal with personalization. Also, mobile multimedia content can be created dynamically. For example, the research approaches in [12-13] use constraints and transformation rules to generate personalized multimedia content, amongst other things. However, our observation is that these approaches depend on limited media elements (sensory modalities) and restricted wireless network regions, such as AP (Access Point) wireless LAN-based delivery of only audio, video, text, and images selected according to the user profile information and composed in time and space using an internal multimedia model. They do not satisfy Web 2.0's participation- and sharing-based core concepts [14].

2.3 LBS (Location-Based Service) Application and Mobile Mash-Up

LBS are information and entertainment services, accessible with mobile stations through the mobile network, which utilize the ability to make use of the geographical position of the mobile station. In the UK, networks do not use triangulation; LBS services use a single base station, with a 'radius' of inaccuracy, to determine a phone's location. The Singaporean mobile operator, MobileOne, carried out such an initiative in 2007. It involved many local marketers and was reported to be a huge success in terms of subscriber acceptance.
WiBro Net.-Based Five Senses Multimedia Technology Using Mobile Mash-Up
289
The mobile generation does not restrict users to a fixed desktop location but allows mobile computing, anywhere, anytime. Tourist guides are the most common application scenario for location-based services. The mobile city guide GUIDE for the city of Lancaster was one of the first systems to integrate "personalized" information for the user. More recent projects, such as LoL@, presenting a city guide for the city of Vienna, integrated multimedia in mobile city guides. However, these systems do not address the dynamic creation of personalized multimedia, which is the aim of the present approach [15].

In Web development, a Mash-up is a Web application that combines data from more than one source into a single integrated tool. The term Mash-up implies easy, fast integration, frequently done by access to open APIs and data sources to produce results that were not the original goal of the data owners. Floyd et al. [16] show how Mash-up techniques can be used for rapid prototyping in user-centered software development processes. A study at the Human-Computer Interaction Institute of Carnegie Mellon University showed that Mash-ups can even be used for user programming [17]. IBM emphasizes the great benefits of so-called Enterprise Mash-ups, information-heavy applications that integrate distributed business information within an enterprise in a quick and dynamic way [18]. Erik Wilde applied the Mash-up idea to the management of large knowledge bases [19]. Most of the existing Mash-ups are programmed manually. However, there exist a number of Mash-up platforms that facilitate the development: Mash-o-matic [20] can be used to generate Geo-Mash-ups. The focus of the SPARCE project is so-called superimposed information. With online tools like Yahoo! Pipes or Microsoft's Popfly, Mash-ups can be built out of predefined components and combined using interactive drag-and-drop interfaces. IBM's QEDWiki is an AJAX interface to combine user interface components that are connected to external data providers.
3 WiBro Net.-Based Five Senses Multimedia Technology Using Mobile Mash-Up

3.1 Overview

The system architecture of the WiBro Net.-based five senses multimedia technology using mobile Mash-up is given in Fig. 1. It consists of two major modules: 1) a distributed processing-based convergence and representation module for multimedia contents, fusing ubiquitous-oriented five senses information with GPS-based location information; and 2) an interactive LBS-based intelligent agent that includes a mobile social network-based user detection interface, a user-centric automatic five senses multimedia recommender interface, and a semantic- and keyword-based five senses multimedia retrieval module, realizing mobile Web 2.0-oriented collective intelligence.

In Fig. 1, the individual five senses information acquired via embedded camera, microphone, and mobile-sensor network, together with GPS unit-based location information, is transmitted into the 'five senses multimedia database' and 'location database' on the convergence server over the WiBro network. The convergence module then creates and stores new multimedia content, fused with user-selected media effects, together with the contents in the 'multimedia database'.
Fig. 1. System architecture for WiBro Net.-based five senses multimedia technology using mobile Mash-up
The LBS-based intelligent agent then automatically displays and recommends the newly created five senses multimedia contents - those corresponding to the present user location, to a pre-registered location of user interest, or to semantic- and keyword-based retrieval results - on the mobile station-based Web map. The application is designed so that anyone who accesses it can contribute, modify, or rebuild five senses multimedia contents. The ability of new media to easily store and retrieve information, predominantly through databases and the Internet, allows this information to be shared without difficulty, so that knowledge passes easily between sources, resulting in a form of collective intelligence. In live situations, the inclusion of the user together with sound, images, and motion video can arguably be distinguished from conventional motion pictures both by the scale of the production and by the possibility of audience interactivity or involvement. Interactive elements can include speech commands, mouse manipulation, text entry, touch screen, video capture of the user, or live participation in presentations.

3.2 Mobile Station-Based Mixed-Web Map Interface

When the maps composing a mixed-Web map - two or more geographic maps produced with the same geographic parameters and output size - are available, the results can be accurately overlaid to produce a composite map. This study designed and implemented a mobile station with a GPS unit-based mixed-Web map interface using a location-based mobile Mash-up with Google and Yahoo maps. This interface is an essential component technology for designing and realizing the LBS-based intelligent agent, including the user-centric automatic five senses multimedia recommender interface, the mobile social networking-based user detection interface, and the LBS-based five senses multimedia contents retrieval and representation module described in Section 3.4. The proposed mobile station-based mixed-map interface is designed using the 2- and 3-dimensional geographical position information provided by Google and Yahoo maps, and then controls the map data independently on the location server side.
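A pixel-accurate overlay of two map layers requires that both be rendered in the same projection, zoom, and output size. The sketch below shows the standard WGS-84 to Web Mercator pixel conversion commonly used by commercial Web maps for this purpose; it is a generic illustration, not code from the implemented system, and the sample coordinates are arbitrary.

// Project WGS-84 coordinates onto the Web Mercator pixel grid used
// by commercial Web maps. Two layers rendered with the same zoom and
// output size can then be overlaid pixel-for-pixel.
public class WebMercator {

    // Returns {x, y} pixel coordinates at the given zoom level.
    static double[] toPixels(double latDeg, double lonDeg, int zoom) {
        double mapSize = 256.0 * (1 << zoom);          // world width in pixels
        double x = (lonDeg + 180.0) / 360.0 * mapSize; // linear in longitude
        double sinLat = Math.sin(Math.toRadians(latDeg));
        double y = (0.5 - Math.log((1.0 + sinLat) / (1.0 - sinLat))
                / (4.0 * Math.PI)) * mapSize;          // Mercator in latitude
        return new double[] {x, y};
    }

    public static void main(String[] args) {
        double[] p = toPixels(37.2636, 127.0286, 12); // illustrative fix, zoom 12
        System.out.printf("pixel = (%.1f, %.1f)%n", p[0], p[1]);
    }
}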
Fig. 2. Block diagram of the location server side of the mixed-Web map module
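The following sketch condenses the server-side flow of Fig. 2: terminals report GPS fixes, the server keeps the latest fix per user as a stand-in for the 'location database', and the map module reads the fixes back when building the mixed-Web map. All class and method names are illustrative assumptions, not taken from the actual system.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Location-server sketch for Fig. 2: store incoming GPS fixes and
// hand the latest fix per user to the mixed-Web map / agent layer.
public class LocationServer {

    public static final class Fix {
        final double lat, lon;
        final long timestamp;
        Fix(double lat, double lon, long timestamp) {
            this.lat = lat; this.lon = lon; this.timestamp = timestamp;
        }
    }

    // Stand-in for the 'location database'.
    private final Map<String, Fix> locationDb = new ConcurrentHashMap<>();

    // Step 2 of the text: store the transmitted user-centric fix.
    public void report(String userId, double lat, double lon) {
        locationDb.put(userId, new Fix(lat, lon, System.currentTimeMillis()));
    }

    // Steps 3-4: supply the latest fix for Mash-up mapping and the
    // LBS-based services.
    public Fix latestFix(String userId) {
        return locationDb.get(userId);
    }
}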
The entire system architecture and the integrated block diagram of the location server side for the proposed mobile station with GPS unit-based mixed-Web map interface are depicted in Fig. 2. This module operates in four major steps: 1) it obtains user-centric location information - the current user location and locations of user interest - via the GPS-equipped mobile stations of the multiple users who access the LBS-based services; 2) the obtained large-scale, continuous user-centric location information is transmitted to and stored in the location database on the server side via the WiBro wireless Internet; 3) the mixed-Web map interface performs the mobile Mash-up and mapping process with commercial Web map services such as the Google and Yahoo maps APIs, using the transmitted geographic parameters transformed into longitude and latitude coordinates; and 4) the LBS-based intelligent agent provides the multi-mobile station users with various mixed-Web map-based application technologies, such as real-time location tracking and the user-centric automatic five senses multimedia recommender (cf. the sketch following Fig. 2 above).

3.3 Authoring and Representation of LBS-Based Five Senses Multimedia Contents

In this paper we designed and implemented five senses information processing-based cross multimedia convergence/creation technology, a more advanced core multimedia-conversion technology that works over multiple mobile platforms. It includes XML-based encoding, GPS unit-based location information, and ubiquitous-oriented recognition technology for the five senses: image, speech, haptic, smell, and taste. We created WiBro mobile-server network-based five senses multimedia content by fusing and synthesizing the five natural senses with GPS-based location information.

Acquisition and Processing of Five Senses Multimedia Contents: We have currently constructed 512 'five senses multimedia database' entries and 1,382 'location database' entries, covering numerous users and various locations. The multi-mobile stations (PDAs or mobile phones) acquire the image, movie-clip, speech, touch, smell, and taste information.
Fig. 3. WiBro station screen dumps of the sequential acquisition steps for five senses information; (A) initial menu state; (B) ready state for five senses acquisition; (C) capture of images and movie clips; (D) and (E) selection of the recorded speech message and background music; (F) automatic acquisition of haptic, smell, and taste information from the sensor network-oriented recognition interfaces; (G) entry of the user's recommendation, title, and category information for XML-based metadata encapsulation, including the five senses and location information; (H) and (I) user-centric selection of special-effect options and background image for multimedia convergence; (J) transmission of the encapsulated XML-based metadata
Image and speech information is captured with the mobile stations' built-in cameras and microphones, while the haptic, smell, and taste information is obtained automatically by ubiquitous sensor network-oriented haptic devices and smell and taste recognition interfaces, already realized in our preliminary pattern recognition studies [21-23]. The acquired data are then transmitted to the 'five senses multimedia database' over the WiBro network. The sensitivity or insensitivity of a human, often considered with regard to a particular kind of stimulus, is the strength of the resulting feeling in comparison with the strength of the stimulus. Simple examples - the WiBro station's screen dumps of the sequential acquisition steps (from (A) the initial menu state to (J) the transmission step) of five senses information via embedded camera, microphone, keypad on the touch screen, and the individual recognition interfaces - are captured in Fig. 3.

Transmission of Five Senses Multimedia Contents: We designed and implemented an XML metadata- and WiBro-based content encoding and transmission interface, including XML Encoding Rules (XER) and a transmission protocol. The acquired five senses and location information is encapsulated as XML-based metadata together with the user's recommendation, title, category information, and so on. The WiBro-based wireless transfer protocol carries the encapsulated XML-based metadata to the distributed computing-based multimedia convergence server, which converges it into new five senses and location-based multimedia content.
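The paper does not publish the metadata schema, so the fragment below is only a guess at its shape: the element names are hypothetical, but the fields follow the description above (title, recommendation, category, the five senses media, and the GPS-based location).

<!-- Hypothetical shape of the encapsulated XML metadata; the actual
     element names and schema are not given in the paper. -->
<fiveSensesContent>
  <title>Cherry blossoms at Suwon Hwaseong</title>
  <category>travel</category>
  <recommendation>Visit at dusk for the best view</recommendation>
  <location lat="37.2871" lon="127.0167" source="GPS"/>
  <media>
    <image href="img001.jpg"/>
    <speech href="memo01.amr"/>
    <haptic pattern="soft-pulse"/>
    <smell code="floral-03"/>
    <taste code="sweet-01"/>
  </media>
</fiveSensesContent>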
Convergence and Representation of Five Senses Multimedia Contents: The convergence and synthesis methods adopted for the five senses multimedia convergence interface include six essential functions: 1) a photo slide-show function using time intervals between multiple images, with accompanying audio; 2) a chroma-key and edge detection-based image convergence function between background image and object image; 3) a speech synthesis function combining background music and the original speech message; 4) an image-based short message function with image and text message; 5) a movie clip including speech and image information; and 6) a sound-image synthesis function, in which a still image is synchronized with sound, or sound is technologically coupled to a captured image, as opposed to recorded speech. The multimedia representation module on the client side then displays the newly created location-based five senses movie clips and still images using an MPEG-4 and JPEG-based cross display module that can represent five senses multimedia across platforms.

3.4 LBS-Oriented Intelligent Agent

3.4.1 User-Centric Automatic Five Senses Multimedia Recommender Interface
This agent provides two functions. 1) The automatic five senses multimedia indication module flags five senses multimedia whose registered position lies within a user-configured distance, estimated as the radial distance between the user's current location and the location of user interest using coordinate transformation and trigonometric functions (a minimal sketch of this distance test is given after Fig. 4). 2) The RSS (Really Simple Syndication)-based automatic five senses multimedia recommender module automatically notifies the user of, and displays, the updated or best-ranked five senses multimedia content - ranked by the users' annotation and evaluation scores - for the location of user interest, on the mobile station-based mixed-Web map. RSS is a family of Web feed formats used to publish frequently updated work; it benefits publishers by letting them syndicate content automatically, and readers who wish to subscribe to timely updates from favored websites. In summary, as the recommended five senses multimedia content is updated or edited by multiple users, who can also create new five senses multimedia through the LBS-based intelligent agent, a collective intelligent five senses multimedia technology for enhanced creativity, information sharing, and collaborative work is completed. Simple examples of the user-centric automatic five senses multimedia recommender interface are depicted in Fig. 4.

3.4.2 Social Network-Based User Detection Interface
Mobile social networking is social networking in which individuals with similar interests or commonalities converse and connect with one another using mobile stations. Most such services are extensions of PC-based services, whilst others are pure mobile-focused offerings. We designed and implemented a social network-based user detection interface that lets multiple users communicate and share LBS-based five senses multimedia contents within the user-interest groups or user-interest locations to which they belong, and register as friends a group of members selected from their work group. This function reduces the cumbersome chore of adding new individuals one friend at a time. In our mobile communities, mobile phone users can create their own profiles, make friends, hold private conversations, and share photos and videos. The interface thus provides features that extend the social networking experience into the real world. Fig. 5 shows the WiBro mobile station-based user interface for social network-based user detection.
Fig. 4. Simple examples for user-centric automatic five senses multimedia recommender interfaces; (A) RSS (Really Simple Syndication)-based automatic five senses multimedia recommender module, (B) Automatic five senses multimedia indication module, (C) Mixed mobile map-based display step on WiBro mobile station, (D) Representation step of the selected LBS-based five senses multimedia contents
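As referenced in Sect. 3.4.1, the indication module reduces to a radial-distance test between the user's position and each registered content location. A minimal sketch under the usual spherical-Earth (haversine) approximation follows; the paper does not specify the exact formula used, and the method names are illustrative.

// Radial-distance test for the indication module of Sect. 3.4.1:
// content registered within the user-set radius of the current
// position is flagged for display on the mixed-Web map.
public class RadialDistance {

    private static final double EARTH_RADIUS_KM = 6371.0;

    // Haversine great-circle distance between two WGS-84 points.
    static double distanceKm(double lat1, double lon1,
                             double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1))
                 * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return EARTH_RADIUS_KM * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    static boolean withinUserRadius(double userLat, double userLon,
                                    double poiLat, double poiLon,
                                    double radiusKm) {
        return distanceKm(userLat, userLon, poiLat, poiLon) <= radiusKm;
    }
}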
3.4.3 Keyword- and Semantic-Based Five Senses Multimedia Retrieval Module
Multimedia search and retrieval has become an active research field thanks to the increasing demand from many new practical applications, including large-scale multimedia search engines on the Web, media asset management systems in corporations, audio-visual broadcast servers, and personal media servers for consumers. The diverse requirements derived from these applications impose great challenges and incentives for research in this field. In this paper, we designed and implemented an LBS-based five senses multimedia retrieval module comprising a Java-based keyword retrieval interface and a semantic-based retrieval interface using the five senses multimedia ontology. These retrieval modules are called by the LBS-based intelligent agent. Fig. 6 shows the integrated block diagram of the Java-based keyword retrieval interface and the semantic-based retrieval interface.

Java-Based Keyword Retrieval Interface: We adopt the Apache Lucene library, a high-performance, full-featured text search engine library written entirely in Java. Lucene has a very flexible and powerful search capability that supports a wide array of queries - AND, OR, and NOT, fuzzy-logic searches, proximity searches, wildcard searches, and range searches - to locate indexed items, and it is suitable for nearly any application that requires full-text search, especially cross-platform ones. The interface first creates an IndexSearcher object pointing to the directory where the contents have been indexed by title, location, and the user's annotations and tags, and then creates a StandardAnalyzer object. The StandardAnalyzer is passed to the constructor of a QueryParser along with the name of the default field for the search; this field is used whenever the user does not specify a field in the search criteria. The parser then parses the actual search criterion, giving a Query object that is run against the IndexSearcher. This returns a Hits object, a collection of all the contents that met the specified criteria [24].
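The retrieval flow just described maps directly onto the classic Lucene API covered in [24] (the Hits class was removed from later Lucene versions). A hedged sketch with an illustrative index path, field names, and query follows.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

// Keyword retrieval sketch using the classic Lucene API from [24].
public class KeywordSearch {
    public static void main(String[] args) throws Exception {
        // Open the index built over title, location, annotations, tags.
        IndexSearcher searcher = new IndexSearcher("/index/fivesenses");
        // "title" is the default field when none is given in the query;
        // users may still qualify fields, e.g. "location:Suwon AND tag:haptic".
        QueryParser parser = new QueryParser("title", new StandardAnalyzer());
        Query query = parser.parse("cherry AND blossom");
        Hits hits = searcher.search(query);  // all contents meeting the criteria
        for (int i = 0; i < hits.length(); i++) {
            Document doc = hits.doc(i);
            System.out.println(doc.get("title"));
        }
        searcher.close();
    }
}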
Fig. 5. Simple examples for social network-based user detection interface
Semantic-Based Retrieval Interface Using the Five Senses Multimedia Ontology: We adopted a semantic annotation and phrase-based five senses multimedia retrieval method using semantic relationship queries over the multimedia title and administrative district name. The implemented semantic-based retrieval interface comprises three major steps: 1) a term mapping step that maps the input query language onto ontology resources; 2) a query graph construction step using the relations between the ontology resources mapped in step 1; and 3) a SPARQL (SPARQL Protocol and RDF Query Language) query conversion step using the query graph created in step 2. After these three steps, the constructed SPARQL query is used to search the knowledge in the five senses ontology and to form an XML document from the search result. Term mapping is the process of connecting the input query to ontology resources: the query is split on spaces, and each token is mapped when it matches the name or label of a class or instance in the ontology. The mapped ontology resources are used to construct all possible query graphs using a 'Minimum Spanning Tree' algorithm, which yields a spanning tree whose weight is less than or equal to the weight of every other spanning tree [25].
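The paper does not show the generated queries, but a step-3 query for a request like "five senses contents located in Suwon" would plausibly take the following form; the namespace and property names are hypothetical, not taken from the implemented ontology.

# Hypothetical SPARQL query of the kind produced in step 3: find
# five senses contents whose location lies in a given administrative
# district.
PREFIX fs: <http://example.org/fivesenses#>

SELECT ?content ?title
WHERE {
  ?content  fs:title      ?title ;
            fs:locatedIn  ?district .
  ?district fs:name       "Suwon" .
}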
Fig. 6. Integrated block-diagram of keyword- and semantic-based retrieval interface
4 Experiments and Results

4.1 WiBro-Based Real-Time Field Tests and Performance Evaluation

We designed and implemented a prototype of the WiBro Net.-based five senses multimedia technology using mobile Mash-up. The application is implemented in C# and C++ under Microsoft Visual Studio 2005, using the Microsoft .NET Compact Framework 2.0 and the Windows Mobile 5.0 Pocket PC SDK. The experimental environment consisted of a Bluetooth module for data communication between the mobile station and the GPS unit, together with WiBro mobile stations and individual sense recognizers for the ubiquitous-oriented mobile-sensor network. Three WiBro mobile stations with GPS units were used during the experiments: MSM 6500 (EV-DO), 520 MHz CPU-based Samsung SPH-M8200 units. This is a PDA-type WiBro station that runs the Windows Mobile 6.0 OS and supports WiBro and CDMA connectivity. The WiBro Net.-based real-time field tests of this prototype, covering the authoring and representation module of LBS-based five senses multimedia contents and the LBS-based intelligent agent module on the WiBro mobile stations, were performed in the Seoul and Suwon areas of South Korea, since the KT (Korea Telecom) WiBro service is available in Seoul (including the subway) and 19 cities in the metropolitan area. Fig. 7 shows simple examples of the WiBro mobile station's screen captures for the authoring and representation module of LBS-based five senses multimedia contents, and examples of the screen dumps for the user-centric automatic five senses multimedia recommender interface and the social network-based user detection interface of the LBS-based intelligent agent module are shown in Fig. 8.
Fig. 7. WiBro mobile station's screen captures for the convergence (creation) experiment of five senses multimedia; (A) Acquisition step of contents that include five senses information, user's message (annotation), and title, and user-selected category information and convergence options, (B) Acquisition step of location information using GPS unit, (C) Transmission step of acquired five senses and location information into multimedia convergence server, (D) Mixed mobile map-based display step on WiBro mobile station of concentrated LBS-based five senses multimedia contents
Fig. 8. Screen dumps and captures on the WiBro mobile station for the LBS-based intelligent agent; (A) distance settings for user-interest locations and initial detection results depending on those settings, (B) the result, on the mixed mobile map, of the automatic multimedia recommender agent under the entered user settings - in this case a 5 km radius around the user-interest locations, (C) and (D) screen dumps on the mixed mobile map for the results of the user-centric automatic five senses multimedia recommender interface and the social network-based user (registered friend) detection interface on the WiBro mobile station
As a result, the WiBro Net.-based application provided a higher data rate in the up/down air link (on average, 904 Kbps (113 KB/s) up and 1.4 Mbps (176 KB/s) down) and a wider coverage region than conventional AP-based network technology and service (a PDA (HP-RX4540)-based KT NESPOT wireless access service was adopted for comparison), whose outermost limit in these experiments was a 42-meter radius around the AP position, at a radio receiving sensitivity of approximately -91 dBm.

4.2 Surveys on Personal Satisfaction by Actual Experience

For a qualitative usability test and performance evaluation of the suggested WiBro Net.-based five senses multimedia technology using mobile Mash-up, we conducted two different surveys on personal satisfaction by actual experience: 1) the System Usability Scale (SUS) method using ISO 9241/11 questionnaires [26], and 2) a suitability (quality) survey based on the software-ergonomics questionnaires of ISO 9241/10. In these surveys, 37 subjects were interviewed and evaluated the proposed system relative to the "Photo Album Service" of KTF (Korea Telecom Freetel), a South Korean telecommunications firm specializing in cellular (mobile) phones.

4.2.1 System Usability Scale (SUS) Method Using ISO 9241/11 Questionnaires
The SUS is a simple ten-item scale, given in Table 1, providing a global view of subjective assessments of usability. Items were selected so that the common response to half of them is strong agreement and to the other half strong disagreement. To calculate the SUS score, first sum the score contributions from each item; each contribution ranges from 0 to 4. For items 1, 3, 5, 7, and 9, the contribution is the scale position minus 1; for items 2, 4, 6, 8, and 10, it is 5 minus the scale position. Multiply the sum of the scores by 2.5 to obtain the overall SUS value. SUS scores range from 0 to 100 (a small routine implementing this rule follows Table 1).
Table 1. ISO 9241/11 questionnaire-based System Usability Scale items (each scored on a 1-5 scale)

1. I think that I would like to use this system frequently
2. I found the system unnecessarily complex
3. I thought the system was easy to use
4. I think that I would need the support of a technical person to be able to use this system
5. I found the various functions in this system were well integrated
6. I thought there was too much inconsistency in this system
7. I would imagine that most people would learn to use this system very quickly
8. I found the system very cumbersome to use
9. I felt very confident using the system
10. I needed to learn a lot of things before I could get going with this system
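As a concrete check of the scoring rule in Sect. 4.2.1, the small routine below computes the overall SUS value from one respondent's ten answers. It merely restates the published formula; it is not code from the study.

// Computes the overall SUS score from ten answers (each in 1..5):
// odd items contribute (position - 1), even items (5 - position),
// and the sum is multiplied by 2.5 to give a 0..100 score.
public class SusScore {
    static double score(int[] answers) { // answers[0] is item 1
        if (answers.length != 10) {
            throw new IllegalArgumentException("SUS has exactly 10 items");
        }
        int sum = 0;
        for (int i = 0; i < 10; i++) {
            sum += (i % 2 == 0) ? answers[i] - 1   // items 1,3,5,7,9
                                : 5 - answers[i];  // items 2,4,6,8,10
        }
        return sum * 2.5;
    }

    public static void main(String[] args) {
        System.out.println(score(new int[] {5,1,5,2,5,1,4,2,5,1})); // 92.5
    }
}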
The results of the SUS survey using the ISO 9241/11 questionnaires are shown in Fig. 9 as average SUS scores for the ten items. The suggested applicative technology's average SUS score was estimated at 83.58, and it is relatively superior to the commercialized multimedia service on item scales 5 and 9.

4.2.2 Suitability (Quality) Survey Using ISO 9241/10 Questionnaires
The other survey was based on the software-ergonomics questionnaires of ISO 9241/10, given in Table 2. The interviews were used to perform a qualitative analysis of the "suitability (quality)" of the suggested applicative technology, with some modifications made to adapt the questionnaires to this system. Fig. 10 shows the results of the ISO 9241/10 questionnaires, indicating the qualitative "suitability (quality)" of the suggested applicative technology. The survey reports results for two user groups: a beginners group and an experienced users group.
Fig. 9. SUS survey results: average SUS scores for the ten ISO 9241/11 questionnaire items
Table 2. ISO 9241/10 questionnaire-based survey scale (each item scored on a 1-7 scale)

1. Suitability for the task
2. Suitability for MMS (Multimedia Messaging Service)
3. Suitability for individualization
4. Self-descriptiveness (description)
5. Controllability (stabilization)
6. Conformity with user expectations
7. Error tolerance
Fig. 10. Survey results using ISO 9241/10 questionnaires: Qualitative “Suitability(Quality)”
All users had already worked with PDAs, WiBro mobile stations, and touch screens. For both the beginners group and the experienced users group, the suggested applicative technology is rated relatively superior to the commercialized multimedia service on item scales 3 and 6.
5 Conclusions

In this study, we suggested and implemented an enhanced WiBro Net.-based five senses multimedia technology using mobile Mash-up. It includes a five senses and location-based multimedia convergence module and an LBS-based intelligent agent module with three major functions: 1) a five senses multimedia retrieval module, 2) a social network-based user (registered friend) detection module, and 3) an automatic five senses multimedia indication and recommender module. In the experimental results, the application provided a higher data rate in the up/down air link and a wider coverage region. Its average System Usability Scale (SUS) score was estimated at
83.58, and it held a relative competitive advantage on specific item scales such as system integration, individualization, and conformity with user expectations. In contrast to other proposed static or active media-based mobile multimedia studies, our approach is unique in four aspects: 1) this study, which converges the five senses information acquired from ubiquitous-oriented multiple devices into a single form, can expand into more personal and realistic multimedia services that reflect individual sensory information; 2) it satisfies and reflects Web 2.0's core concepts, such as mobile-based indirect experiences, enhanced creativity, information sharing, and collaborative work; 3) user-specific MMS (Multimedia Messaging Service) is possible within the relatively low memory and storage capacity available on mobile stations, since our approach orients user-centric multimedia technology around the LBS-based intelligent agent; and 4) since this study used the WiBro system for high-speed broadband wireless Internet access, it can deliver a variety of location information and five senses multimedia contents to various user terminals more seamlessly and quickly than conventional WiFi-based wireless LANs. In the future, we will pursue the personalization of realistic five senses multimedia via location-based context-awareness. This will be extended to various application fields, such as enhanced education, entertainment, and realistic MMS services using location-based collective intelligent five senses multimedia technology, together with an implementation of five senses representation devices such as a portable olfactory exhaling apparatus.
Acknowledgement This research was supported by MIC, Korea under ITRC IITA-2009-(C1090-09020046), and the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (No. 20090058909).
References

1. Boll, S., et al.: Personalized Mobile Multimedia Meets Location-Based Services. In: Proc. Multimedia Inf. Syst. Workshop at the 34th Annual Convention of the German Informatics Society, pp. 64–69 (2004)
2. Choi, Y.B., et al.: Applications of "human factors" in wireless telecommunications service delivery. International Journal of Services and Standards 1(3) (2005)
3. Moreno, R.: Learning in High-Tech and Multimedia Environments. Current Directions in Psychological Science 15(2) (2005)
4. He, J., et al.: Extending WLAN coverage using infrastructureless access points. In: High Performance Switching and Routing (HPSR), pp. 162–166. IEEE, Los Alamitos (2005)
5. Zemlianov, A., et al.: Cooperation and decision-making in a wireless multi-provider setting. In: INFOCOM 2005, 24th Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 1, pp. 386–397. IEEE, Los Alamitos (2005)
6. Oxford University Press: Oxford English Dictionary (2007)
7. Encyclopaedia Britannica, Inc.: Encyclopædia Britannica (2007)
8. Flew, T.: New Media: An Introduction. Oxford University Press, Melbourne (2008)
9. Jenkins, H.: Fans, Bloggers and Gamers: Exploring Participatory Culture. New York University Press, New York (2006)
10. van Ossenbruggen, J.R., et al.: Cuypers: A Semi-automatic Hypermedia Presentation System. Technical Report INS-R0025, CWI, Netherlands (2000)
11. Lemlouma, T., Layaïda, N.: Adapted Content Delivery for Different Contexts. In: Conf. on SAINT 2003 (2003)
12. Lemlouma, T., Layaïda, N.: Context-Aware Adaptation for Mobile Devices. In: IEEE Int. Conf. on Mobile Data Management (2004)
13. Metso, M., et al.: Mobile Multimedia Services Content Adaptation. In: 3rd Intl. Conf. on Information, Comm. and Signal Processing (2001)
14. Scherp, A., et al.: Generic support for personalized mobile multimedia tourist applications. In: Proceedings of the 12th Annual ACM International Conference on Multimedia, pp. 178–179 (2004)
15. Lee, H.-H., et al.: Design and Implementation of a Mobile Devices-based Real-time Location Tracking. In: Proc. of UBICOMM 2008 (2008)
16. Floyd, I.R., Jones, M.C., Rathi, D., Twidale, M.B.: Web mash-ups and patchwork prototyping: User-driven technological innovation with Web 2.0 and Open Source software. In: Proceedings of HICSS. IEEE Computer Society, Los Alamitos (2007)
17. Wong, J., Hong, J.I.: Making mashups with Marmite: Towards end-user programming for the Web. In: ACM CHI (2007)
18. Jhingran, A.: Enterprise information mashups: Integrating information, simply. In: ACM VLDB 2006 (2006)
19. Wilde, E.: Knowledge Organization Mashups. TIK Report 245, ETH Zurich (Swiss Federal Institute of Technology) (2006)
20. Murthy, S., Maier, D., Delcambre, L.M.L.: Mash-o-matic. In: ACM Symposium on Document Engineering (2006)
21. Kim, J.H., et al.: Distributed-computing-based multimodal fusion interface using VoiceXML and KSSL for wearable PC. Electronics Letters 44(1), 58–60 (2008)
22. Kim, J.H., et al.: MMSDS: Ubiquitous Computing and WWW-based Multi-modal Sentential Dialog System. In: Sha, E., Han, S.-K., Xu, C.-Z., Kim, M.-H., Yang, L.T., Xiao, B. (eds.) EUC 2006. LNCS, vol. 4096, pp. 539–548. Springer, Heidelberg (2006)
23. Cheon, B., et al.: Implementation of Floral Scent Recognition System Using Correlation Coefficients. In: Proc. APIC-IST 2008 (2008)
24. Paul, T.: The Lucene Search Engine: Adding Search to Your Applications, http://www.javaranch.com/journal/2004/04/Lucene.html
25. Pettie, S., Ramachandran, V.: An Optimal Minimum Spanning Tree Algorithm. JACM 49(1), 16–34 (2002)
26. Tullis, T.S., et al.: A Comparison of Questionnaires for Assessing Website Usability. In: UPA 2004 (2004)
Overlay Ring Based Secure Group Communication Scheme for Mobile Agents

Hyunsu Jang1, Kwang Sun Ko2, Young-woo Jung3, and Young Ik Eom1

1 Dept. of Computer Engineering, Sungkyunkwan University, Republic of Korea
{jhs4071,yieom}@ece.skku.ac.kr
2 Financial Security Agency, Republic of Korea
[email protected]
3 Semiconductor Business, Samsung Electronics Co., Ltd., Republic of Korea
[email protected]
Abstract. Among the various inter-agent communication models proposed for multi-agent systems so far, several group communication schemes have been proposed to guarantee transparent communication among the agents. However, in mobile agent environments, where each agent has mobility in the network, those schemes are not sufficient to fully handle the topological changes caused by the migration of mobile agents. Moreover, such group communication schemes must be secure in order to be practical. In this paper, we propose a secure group communication scheme based on a hierarchical overlay ring structure of mobile agents. The proposed scheme uses the ring channel in order to cope adaptively with changes of the ring topology. The ring channel holds the fundamental information needed to construct the ring and is managed only by the mobile agent platforms. Therefore, each mobile agent does not need to handle the ring channel directly and can perform group communication regardless of the changing ring topology.

Keywords: Mobile agents, Group communication, Ring topology, Group shared key.
1 Introduction
A mobile agent is a program that represents a user in a network and can migrate autonomously among the nodes of the network in order to perform computational operations on the user's behalf. With the use of mobile agents, we gain several advantages: reduction of network traffic, asynchronous and autonomous execution of computational operations, and dynamic adaptation capability [1]. Mobile agents have been used in various distributed applications such as distributed information retrieval, network management, e-commerce, and so on [2][3][4].
This research was supported by MKE, Korea under ITRC IITA-2009-(C1090-09020046).
In this kind of network environment, communication among the mobile agents is one of the most fundamental issues, and various communication schemes have been proposed for these environments [5][6]. In particular, group communication among mobile agents is important and widely used, as it guarantees that communication messages are transferred transparently. Two major issues have to be considered: adaptation to the topology changes of the mobile agents, and the authentication of mobile agents upon migration [7]. In this paper, we propose a secure group communication scheme for mobile agents using overlay rings. The scheme uses ring channels (RCs) to flexibly handle the changes of ring topology caused by the migration of mobile agents. An RC is an object that represents a ring topology of the mobile agents and is managed by the platforms. The roles of the platform are to authenticate mobile agents joining a group, to generate and distribute a group shared key for each ring, and to reconstruct the ring topology using RCs. Therefore, in this scheme, mobile agents can perform group communication without considering the detailed mechanism used to construct the rings. The rest of the paper is organized as follows. Existing group communication protocols are reviewed in Section 2. Section 3 describes the architecture of the secure group communication scheme using the overlay ring proposed in this paper. Section 4 gives the mechanism needed to manage the ring topology, and Section 5 describes the authentication and key distribution mechanisms of our scheme. Section 6 analyzes the safety of our scheme, and we conclude the paper in Section 7.
2 Overlay Ring Based Group Communication
In this Section, we describe two well-known group communication protocols designed for distributed system environments: the HFTRP (Hierarchical Fault Tolerant Ring Protocol) [8] and the SSRP (Secure Synchronous Ring Protocol) [9]. The HFTRP is a hierarchical communication protocol for real-time distributed systems, whose overall structure is depicted in Figure 1. In the HFTRP, a ring consists of normal nodes (N), representative nodes (R), and a leader node (L). These nodes form a hierarchical structure based on the role of each node. Each ring is organized with a representative node and some normal nodes. A representative node manages the normal nodes included in its inner ring, and also connects its inner ring to the outer ring, which is virtually located one step higher in the logical hierarchy. A representative node periodically creates an empty token and sends it around the ring. The token stores the information received from each visited node while passing through the ring and finally returns to the representative node, which then sends it to the ring at the next higher level, where it goes through the same procedure. This process is repeated until the token finally reaches the leader node. In this protocol, the depth of the ring hierarchy can be determined by considering the group size, communication cost, and computation overhead.
Fig. 1. The Hierarchical Fault Tolerant Ring Protocol
The SSRP is a fault-tolerant group communication protocol that operates in a synchronous manner. The structure of the SSRP is hierarchical, and it uses the Cliques GDH key management protocol [10] and a Diffie-Hellman-based encryption algorithm for safe group communication. By integrating a security capability into the group communication layer, it guarantees that both the application messages of the group and the corresponding information of the group communication layer are protected from outside attacks. The SSRP uses GDH as its key management protocol, which incurs minimal cost when performing membership operations. The structure of GDH is similar to the shape of the ring, and it also performs key management through the representative nodes. Moreover, the SSRP makes a group key for each ring independently while performing membership operations such as member join/leave and group join/leave, which makes the system more secure. Even though these group communication protocols show excellent performance, they still have limitations in supporting group communication among mobile agents. In particular, the topological changes incurred by the migration of mobile agents cause many problems in the authentication for group communication between mobile agents, the creation and distribution of the group shared key, and the reorganization of the topology; these problems make group communication difficult. In this paper, we suggest a secure group communication scheme using a hierarchical overlay ring to ensure secure communication among the mobile agents.
3 System Architecture
In this Section, we describe the overlay ring topology for group communication among the mobile agents. We also introduce the concept of the ring channel, which provides the information necessary to build the ring topology.
Fig. 2. Architecture of hierarchical overlay ring topology
3.1 Ring Topology
We assume that all members of each group are mobile agents that can migrate to another local network, and that each member knows the range of networks to which it can migrate. The proposed scheme is based on a hierarchical ring structure composed of two kinds of logical ring: the Local Ring (LR) and the Global Ring (GR). An LR is a logical connection of the mobile agents in each local network and is composed of several normal members (NMs) and one gateway member (GM). The GR is a global ring that connects the LRs constructed in each local network; the GM of each LR participates in the GR as a group member. The overall system architecture is shown in Figure 2.

3.2 Ring Channel
The members of each ring keep their connections in the ring channel (RC) object. The RC holds the basic information of the ring and is managed by the mobile agent platform. When a mobile agent wants to join a group, the mobile agent platform creates a new RC for the mobile agent, connects the new RC with the existing RCs of the LR, and provides the group communication facility to the mobile agent. Therefore, the mobile agent does not need to know about the existence of the RC or the lower-level mechanisms, such as topology changes, reconstruction of the ring, and management of the secure group key. The RC object contains the following information:

RC = {RCID | PID | MAID | RCtype | GroupID | KG | PLink | NLink}
· RCID - the ID of the ring channel
· PID - the ID of the platform that manages this ring channel
· MAID - the ID of the mobile agent that belongs to this ring channel
· RCtype - the type of the ring (LR or GR)
· GroupID - the group ID of the logical ring
· KG - the group shared key of the ring
· PLink - link to the previous member in the ring
· NLink - link to the next member in the ring

The PLink and NLink are structures that include RCID and PID fields for linking the neighbor members of a group.
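For concreteness, the RC tuple can be transcribed directly into a class. The following minimal sketch keeps the field names from the list above; the field types are assumptions, since the paper specifies only the names.

// Direct transcription of the RC object; RingLink bundles the
// (RCID, PID) pair used by PLink/NLink. Field types are assumed.
public class RingChannel {

    public enum RingType { LR, GR }

    public static final class RingLink {
        String rcId;  // RCID of the neighbor's ring channel
        String pId;   // PID of the platform hosting that neighbor
    }

    String   rcId;     // RCID  - ID of this ring channel
    String   pId;      // PID   - platform managing this ring channel
    String   maId;     // MAID  - mobile agent bound to this channel
    RingType rcType;   // LR or GR
    int      groupId;  // group ID of the logical ring
    byte[]   kg;       // KG - group shared key of the ring
    RingLink pLink;    // previous member in the ring
    RingLink nLink;    // next member in the ring
}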
4 Ring Management
In this Section, we describe the ring management scheme for members joining and leaving a group. We also explain how the scheme deals with the changes of the ring topology caused by the migration of its members.

4.1 Joining a Group
A mobile agent has to request the "join" operation from the mobile agent platform if it wants to participate in group communication. Figure 3 shows the join algorithm. When a mobile agent requests to join a group, the platform first looks for the LR that the mobile agent wants to connect to. The platform creates a new RC and looks for a mobile agent on the same platform that already participates in the group. If such a mobile agent exists, the platform performs a simple RC connection that links the new RC with the existing ones of the group.

void join(MobileAgent ma, int groupID) {
    // Create a ring channel for the joining agent.
    RingChannel new_lrc = makeRingChannel(ma, groupID);
    // Is a member of this group already hosted on this platform?
    boolean bexist = Search_LR(groupID);
    if (bexist == true) {
        RingChannel rc = getRingChannel(groupID);
        Simple_RC_Connection(rc, new_lrc);  // link new RC into the LR
    } else {
        // No local member: search the local network for the LR.
        DiscoveryMessage msg = discoveryLR(groupID);
        if (msg.bfind == true) {
            joinLR(new_lrc, msg.previousLink, msg.nextLink);
        } else {
            // No LR found: become gateway member of a new LR
            // and join the global ring.
            makeGateway(ma, groupID);
            msg = discoveryGR(groupID);
            if (msg.bfind == true) {
                RingChannel new_grc = makeRingChannel(ma, groupID, GR);
                joinGR(new_grc, msg.previousLink, msg.nextLink);
            }
        }
    }
}
Fig. 3. The join algorithm
If no mobile agent of the group is found on the platform, the platform performs LR discovery: it generates an LR_DISCOVERY message and broadcasts it. The LR_DISCOVERY message has the following format:

<LR_DISCOVERY, GroupID, PlatformID, MobileAgentID>

If the LR exists in the local network, the GM receives the LR_DISCOVERY message and authenticates the mobile agent as a new member. If the new member is authenticated, the GM distributes the group shared key KG, and the mobile agent is connected to the group members by link authentication. We describe member authentication and link authentication in Section 5. When the platform cannot find an LR by LR discovery, it makes the mobile agent the GM of a new LR. In this case, the platform performs GR discovery with a GR_DISCOVERY message, which proceeds like LR discovery. After GR discovery, the mobile agent becomes a member of the GR.

4.2 Leaving a Group
When a member wants to leave a group, it requests the "leave" operation from the mobile agent platform. Let us assume that a mobile agent αi on the platform Pi wants to leave a group G. Platform Pi sends a LINK_REFRESH message to αP and αN, the mobile agents connected by the PLink and NLink of αi, respectively. The LINK_REFRESH message has the following format:

<LINK_REFRESH, New_Link_Target>

In this message, New_Link_Target is the field used for linking to another neighbor member of the group. For example, the platform PP on which αP is located receives a LINK_REFRESH message including the NLink of the RC of αi and modifies the RC of αP accordingly; the platform PN of αN performs the same process. Afterwards, the platform Pi removes the RC of αi. If the mobile agent αi is a GM, the platform Pi delegates the role of GM to αN: Pi transfers the member list of the LR managed by αi and the RC needed to join the GR in the LINK_REFRESH message. As soon as PN joins the GR through the RC received from Pi, the delegation is completed. A sketch of this procedure is given below.
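The paper gives no listing for the leave operation, but it mirrors the join algorithm of Fig. 3. The sketch below, with illustrative helper names, summarizes the steps just described:

// Sketch of the leave operation of Sect. 4.2; helper names are
// illustrative. The platform splices the ring by cross-linking the
// two neighbors and, if the leaving member is a GM, delegates that
// role to the next member.
void leave(MobileAgent ma, int groupID) {
    RingChannel rc = getRingChannel(ma, groupID);
    // LINK_REFRESH: each neighbor re-targets its link past the leaver.
    sendLinkRefresh(rc.pLink, rc.nLink); // to the previous member
    sendLinkRefresh(rc.nLink, rc.pLink); // to the next member
    if (isGateway(ma, groupID)) {
        // Delegate the GM role: hand over the LR member list and the
        // RC needed to participate in the GR.
        delegateGateway(rc.nLink, memberListOf(groupID),
                        grRingChannel(groupID));
    }
    removeRingChannel(rc);
}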
4.3 Reconstruction of Ring Topology
The migration of a mobile agent may affect the topology of the LR and the GR in two cases: migration within its local network, and migration to another network. When a mobile agent α migrates from platform Pi to platform Pj within its local network, the logical topology of the LR does not change. The platform Pi sends the RC of the mobile agent α to the platform Pj. After the migration completes, the platform Pj sets the PID field of the RC to the PID of platform Pj and requests the neighbor members to change the PID in the corresponding (NLink or PLink) fields of their RCs.
When a mobile agent α migrates from the platform Pi to a platform Pj in another network LRk, α requests "leave" from the platform Pi. The platform Pi sends the RC of α to the platform Pj and removes α from the current group using LINK_REFRESH messages. The platform Pj then registers the mobile agent α as a normal member of LRk via the join mechanism explained in Section 4.1. In the proposed scheme, the mobile agent does not have to consider the changes in the ring topology due to its migration, because it does not directly handle the RCs composing the ring topology; the RCs are maintained only by the mobile agent platforms.
5 Authentication and Key Management
In this Section, we describe the authentication scheme for new group members and the key management scheme for secure group communication. The proposed authentication scheme includes member authentication and link authentication. Member authentication authenticates a new member who wants to join a group; if the new member is authenticated, the GM distributes the group shared key to it. Link authentication verifies the connection between a new member and the existing group members.

5.1 The Member Authentication and Key Distribution
We assume that the migration of a mobile agent is performed through a secure channel. When a mobile agent joins a group during migration, the member authentication process is performed. When the join proceeds through LR discovery, the platform PGM, managing the mobile agent GM, performs the member authentication and distributes the group shared key KG. Figure 4 shows the member authentication protocol.

Fig. 4. The member authentication protocol

The platform Pi sends the LR_DISCOVERY message to the platform PGM.
Fig. 5. The link authentication protocol
When PGM receives the LR_DISCOVERY message, it generates a temporary secret key KT and transfers KT and a nonce encrypted with the public key KPi+ of the platform Pi, denoted EKPi+(KT, nonce). The platform Pi decrypts EKPi+(KT, nonce) with its private key KPi to extract KT. Pi also encrypts the GroupID and the nonce with KT, denoted EKT(GroupID, nonce), and sends it to PGM. PGM decrypts EKT(GroupID, nonce) with KT and checks whether the GroupID in the message equals its own GroupID. If the two values are the same, PGM sends an acceptance message including the group shared key KG to Pi, which finishes the member authentication process.

5.2 The Link Authentication
The link authentication is performed after the member authentication is completed. Assume that the mobile agent α wants to connect with the mobile agents αP and αN. The platforms PP and PN of αP and αN perform the link authentication for α, which checks the validity of the connection between the new member and the existing group members through a LINK_AUTHENTICATION message. Figure 5 shows the link authentication protocol. The platform P sends a LINK_AUTHENTICATION message that includes the GroupID and a nonce encrypted with the group shared key KG. When the platform PN receives the LINK_AUTHENTICATION message, it decrypts the GroupID and nonce with KG and checks whether the GroupID equals that of the RC for αN. If the two values are the same, P and PN exchange the information needed for the connection; otherwise, PN transfers a rejection message to P.
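As an illustration of the receiver side of this exchange, the sketch below decrypts the payload with the group shared key and compares the group IDs. The cipher choice (AES) and the message layout (a 4-byte group ID followed by the nonce) are assumptions; the paper does not fix concrete algorithms.

import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Receiver-side sketch of the link authentication in Fig. 5.
public class LinkAuthCheck {

    static boolean accept(byte[] encryptedPayload, byte[] kg,
                          int localGroupId) throws Exception {
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(kg, "AES"));
        byte[] plain = cipher.doFinal(encryptedPayload);
        int groupId = ((plain[0] & 0xFF) << 24) | ((plain[1] & 0xFF) << 16)
                    | ((plain[2] & 0xFF) << 8)  |  (plain[3] & 0xFF);
        // With the wrong key, decryption fails or garbles the group ID,
        // so a forged or transformed message is rejected (cf. Theorem 3).
        return groupId == localGroupId;
    }
}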
6 Safety Analysis
In this Section, we present the safety analysis of the overlay ring based secure group communication scheme. Before presenting the analysis, we assume that an encrypted message is safe against message distortion attacks. Table 1 lists the notations used in the safety analysis. The ring topology in our scheme is as follows:
Table 1. Notations for the safety analysis of our proposed scheme

Notation              Description
EncKUi(m)             Encryption of message m with the public key of platform i
DecKPi(m)             Decryption of message m with the private key of platform i
KG                    Group shared key
KT                    Temporary secret key
PMA                   Platform P managing mobile agent MA
R(a1, a2, ..., an)    A ring topology including members a1, a2, ..., an
M(m1, m2, ..., mk)    Creation of the message including m1, m2, ..., mk
Ext(m, ai)            Extraction of ai from message m
Adv(m)                An attack on the message m
mState                State of the message m
A → B : m             A sends a message m to B
Network N = {LR1, LR2, ..., LRn}
LRi = R(nm1, nm2, ..., nmk, gmi)
GR = R(gm1, gm2, ..., gml)

Theorem 1. The member authentication protocol guarantees a safe distribution of the group shared key.

Proof
Pi : m0 = M(LR_DISCOVERY, PID, MAID)
Pi → PGM : m0
PGM : mA = EncKUi(KT, nonce)
PGM → Pi : mA
Pi : Ext(DecKPi(mA), KT)
Pi : mA = EncKT(GroupID, nonce)
Pi → PGM : mA
PGM : Compare(Ext(DecKT(mA), GroupID), GroupID of RC for GM)
PGM : mG = EncKT(KG, nonce)
PGM → Pi : mG
Pi : Ext(DecKT(mG), KG)
if madv = Adv(mG) then
PGM → Pi : madv
Pi : Ext(DecKT(madv), KG)

However, Pi cannot decrypt madv, so Pi can detect the message transformation. Therefore, Pi sends m0 to PGM again to get KG.

Theorem 2. The member authentication protocol is safe against message replay attack.

Proof
Adversary : mcaptured = Adv(mA) or Adv(mG)
Adversary → PGM : mcaptured
PGM : Compare(Ext(DecKT(mcaptured), nonce), TIME)

Therefore, PGM can detect the captured message and rejects the member authentication.

Theorem 3. The link authentication protocol guarantees a safe connection between the new member and the existing group members.

Proof
Pi → PiN : m0 = M(LINK_AUTHENTICATION, EncKG(GroupID, nonce))
PiN : Compare(Ext(DecKG(m0), GroupID), GroupID of RC for Pi)
PiN : mAccept = EncKG(GroupID, nonce)
PiN → Pi : mAccept
Pi : mLink = EncKG(RCID, PID, n_link of yours, nonce)
Pi → PiN : mLink
PiN : Ext(DecKG(mLink), n_link of yours)
PiN : mLink = EncKG(n_link, nonce)
PiN → Pi : mLink

if madv = Adv(m0) then
Pi : Compare(Ext(DecKG(madv)), GroupID of RC for PiN)
However, PiN cannot decrypt madv with KT, so PiN can detect the message transformation, and PiN requests m0 from Pi again.

PiN : mreject = EncKG(madv, nonce)
PiN → Pi : mreject
Pi : Compare(Ext(mreject, madv), m0)
Pi : m0 = M(LINK_AUTHENTICATION, EncKG(GroupID, nonce))

Therefore, PiN can detect the transformation of message m0 and requests retransmission of the message m0.

Theorem 4. The link authentication protocol is safe against message replay attack.

Proof
Adversary : mcaptured = Adv(m0) or Adv(mLink)
Adversary → PiN : mcaptured
PiN : Compare(Ext(DecKG(mcaptured), nonce), TIME)

Therefore, PiN can detect the captured message and rejects the link authentication.

Theorem 5. The group communication based on the overlay ring is safe.

Proof
1. All members of a group can obtain the group shared key through the secure channel. (Theorem 1 and Theorem 2)
2. Each member of a group can connect with its neighbor members safely. (Theorem 3 and Theorem 4)
3. A message encrypted by the group shared key is safe against message distortion attack. (assumption)

Hence a message MKG encrypted with the group shared key KG is safely transferred to the group members through the safe linkage, which is constructed as an overlay ring structure.
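Theorems 2 and 4 both rest on the nonce/TIME freshness check. A minimal sketch of such a check follows; the 30-second window and the data structures are illustrative choices, not taken from the paper.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Nonce/timestamp freshness check: a platform remembers recently seen
// nonces and rejects any message whose nonce was already used or
// whose timestamp is stale, so captured (replayed) messages fail.
public class ReplayGuard {

    private static final long MAX_AGE_MS = 30_000;
    private final Map<Long, Long> seenNonces = new ConcurrentHashMap<>();

    public boolean isFresh(long nonce, long sentAtMs) {
        long now = System.currentTimeMillis();
        if (now - sentAtMs > MAX_AGE_MS) {
            return false;                  // stale: possible replay
        }
        // putIfAbsent returns non-null if the nonce was seen before.
        return seenNonces.putIfAbsent(nonce, now) == null;
    }
}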
7 Conclusion
In this paper, we proposed a secure group communication scheme for mobile agents that uses a hierarchical overlay ring structure. With this scheme, mobile agents can transfer messages transparently and securely to
other mobile agents located in different networks by using the hierarchical overlay ring. Our scheme uses the ring channel to flexibly handle the changes of ring topology caused by the migration of mobile agents. Mobile agents can perform group communication without considering such topology changes, because they do not need to directly handle the ring channel used to construct the ring topology. Furthermore, our scheme cuts down the computation cost of group communication by performing the ring reconstruction not in the mobile agents but in the platform, which is faster than the mobile agents.
References

1. Cho, K., Hattori, M., Ohsuga, A., Honiden, S.: Picoplangent: An Intelligent Mobile Agent System for Ubiquitous Computing. In: Barley, M.W., Kasabov, N. (eds.) PRIMA 2004. LNCS (LNAI), vol. 3371, pp. 43–56. Springer, Heidelberg (2005)
2. Aneiba, A., Rees, S.J.: Mobile Agents Technology and Mobility. In: Proc. of the 5th Annual Postgraduate Symposium on the Convergence of Telecommunications, Networking, and Broadcasting (June 2004)
3. Takashio, K., Soeda, G., Tokuda, H.: A Mobile Agent Framework for Follow-Me Applications in Ubiquitous Computing Environment. In: Proc. of the 21st International Conference on Distributed Computing Systems Workshop (2001)
4. Ledoux, T., Bouraqadi-Saadani, N.: Adaptability in Mobile Agent Systems using Reflection. In: Proc. of the Workshop on Reflective Middleware (RM 2000), New York (2000)
5. Mahmoud, Q.H.: Understanding Network Class Loaders. Developer Technical Articles & Tips (2004)
6. Mihailescu, P., Binder, W., Kendall, E.: MAE: Mobile Agent Environment for Resource Limited Devices. In: Magnusson, B. (ed.) ECOOP 2002. LNCS, vol. 2374. Springer, Heidelberg (2002)
7. Jung, Y., Choi, J.H., Ko, K.S., Kim, G.S., Eom, Y.I.: A Secure Group Communication Scheme for Mobile Agents using the Hierarchical Overlay Ring. The KIPS Transactions: Part A 14-A(6), KIPS (2007)
8. Tunali, T., Erciyes, K., Soysert, Z.: A Hierarchical Fault-Tolerant Ring Protocol for Distributed Real-Time Systems. In: Special Issue of Parallel and Distributed Computing Practices on Parallel and Distributed Real-Time Systems (2000)
9. Saglam, O., Dalkilic, M., Erciyes, K.: Design and Implementation of a Secure Group Communication Protocol on a Fault Tolerant Ring. In: Proc. of Computer and Information Sciences (2003)
10. Steiner, M., Tsudik, G., Waidner, M.: Key Agreement in Dynamic Peer Groups. IEEE Trans. on Parallel and Distributed Systems 11(8), 769–781 (2000)
Enhanced Multiple-Shift Scheme for Rapid Code Acquisition in Optical CDMA Systems

Dahae Chong, Taeung Yoon, Youngyoon Lee, Chonghan Song, Myungsoo Lee, and Seokho Yoon

School of Information and Communication Engineering, Sungkyunkwan University, 300 Chunchun-dong, Jangan-gu, Suwon, Gyeonggi-do, 440-746, Korea
{lvjs1019,ytw0201,news8876,starsong83,maxls813,syoon}@skku.edu
Abstract. In this paper, we propose a novel code acquisition scheme called enhanced multiple-shift (EMS) for optical code division multiple access (CDMA) systems. By using multiple thresholds, the proposed EMS scheme provides a shorter mean acquisition time (MAT) than that of the conventional multiple-shift (MS) scheme. The simulation results demonstrate that the MAT of the EMS scheme is shorter than that of the MS scheme in both single-user and multi-user environments.
1 Introduction
In code division multiple access (CDMA) based systems, data demodulation is possible only after code synchronization is completed. Therefore, code synchronization is one of the most important tasks in CDMA based systems [1]. Generally, code synchronization consists of two stages: code acquisition and tracking. Achieving the code synchronization is called code acquisition and maintaining it is called tracking [2]; the former is dealt with in this paper. In the code acquisition process, the most significant performance measure is the mean acquisition time (MAT), i.e., the mean time that elapses prior to acquisition. An optical CDMA system uses a spreading code called the optical orthogonal code (OOC) proposed by Salehi [3]. Due to its good auto-correlation and cross-correlation properties, the OOC has been widely used for various CDMA based systems, including optical CDMA systems [4], [5]. Keshavarzian and Salehi introduced the serial-search (SS) scheme [4] using the OOC, which is simple; however, its MAT increases as the code length becomes longer. Thus, the SS scheme is not suitable for rapid acquisition of the long codes that are essential for multi-user environments. In order to overcome this drawback, in [5], the same authors proposed the multiple-shift (MS) scheme using the OOC, which consists of two stages and offers a shorter MAT than that of the SS scheme.
This research was supported by the Information Technology Research Center program of the Institute for Information Technology Advancement, under Grant IITA-2009-C1090-0902-0005, with funding from the Ministry of Knowledge Economy, Korea. Corresponding author.
In this paper, we propose a novel code acquisition scheme called enhanced multiple-shift (EMS). The EMS scheme also consists of two stages like the MS scheme; however, by using multiple thresholds and a modified local code, the EMS scheme provides a shorter MAT than that of the MS scheme. The remainder of this paper is organized as follows. Section 2 describes the system model. In Section 3, we present the conventional MS and proposed EMS schemes. Section 4 analyzes the MAT performance of the EMS scheme. In Section 5, the simulation results show the MATs of the MS and EMS schemes in single-user and multi-user environments. Section 6 concludes this paper.
2 System Model
In an optical CDMA channel, there exist various kinds of impairments such as noise, multipath signals, and multiple access interference (MAI). The influences of noise and multipath signals can be almost completely mitigated by using a fiber-optic medium; however, that of MAI should be alleviated in the receiver [6], [7]. In this paper, thus, we consider a multi-user environment without the influences of noise and multipath signals. Then, the received signal $r(t)$ can be written as

$$ r(t) = \sum_{n=1}^{N} s^{(n)}(t - \tau^{(n)}), \qquad (1) $$
where $s^{(n)}(t)$ is the transmitted signal of the $n$-th user; $\tau^{(n)} \in [0, T_b)$ denotes the time delay of the $n$-th user with bit duration $T_b$; and $N$ is the number of total users. We consider on-off keying (OOK) modulation and assume that the bit rate is the same for all users. Thus, the transmitted signal $s^{(n)}(t)$ can be expressed as

$$ s^{(n)}(t) = \sum_{i=-\infty}^{\infty} b_i^{(n)} c^{(n)}(t - iT_b), \qquad (2) $$

where $b_i^{(n)}$ is the $i$-th binary data bit of the $n$-th user and $c^{(n)}(t) = \sum_{j=0}^{F-1} a_j^{(n)} p(t - jT_c)$ is the OOC of the $n$-th user with chip duration $T_c$ and sequence $a_j^{(n)} \in \{0, 1\}$ of length $F$ and weight $K$ (the total number of '1's in $a_j^{(n)}$), with the rectangular pulse $p(t)$ of length $T_c$ defined as

$$ p(t) = \begin{cases} 1, & 0 \le t < T_c, \\ 0, & \text{otherwise}. \end{cases} \qquad (3) $$
Generally, the OOC can be denoted by its parameters, i.e., (F, K, λa , λc ), where λa and λc are auto-correlation and cross-correlation constraints, respectively [3]. For ideal strict orthogonality of OOC, both λa and λc have to be zero; however, since an OOC consists of 0 and 1, the ideal strict orthogonality cannot be satisfied. Thus, in this paper, both λa and λc are set to 1.
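To make the $(F, K, \lambda_a, \lambda_c)$ notation concrete, the following sketch checks the periodic auto- and cross-correlation constraints of a small hand-picked pair of weight-3 codewords; the codewords themselves are illustrative and not taken from [3].

```python
import numpy as np

def periodic_corr(a, b):
    """Periodic correlation of two 0/1 sequences of length F."""
    return np.array([np.sum(a * np.roll(b, s)) for s in range(len(a))])

# A small (13, 3, 1, 1) OOC pair: F = 13, weight K = 3
c1 = np.zeros(13, dtype=int); c1[[0, 1, 4]] = 1
c2 = np.zeros(13, dtype=int); c2[[0, 2, 7]] = 1

auto = periodic_corr(c1, c1)
cross = periodic_corr(c1, c2)
print(auto[0])          # in-phase peak = K = 3
print(auto[1:].max())   # lambda_a = 1 (off-peak autocorrelation constraint)
print(cross.max())      # lambda_c = 1 (cross-correlation constraint)
```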
3 Code Acquisition Schemes

3.1 Multiple-Shift (MS) Scheme
In the MS scheme, the total $F$ cells in the search space are divided into $Q$ groups, each of which contains $M$ cells. The relation between $Q$ and $M$ is given by

$$ Q = \left\lceil \frac{F}{M} \right\rceil, \qquad (4) $$

where $\lceil \cdot \rceil$ denotes the ceiling operation. Note that the integer just above the ratio $F/M$ is chosen as the value of $Q$ when $M$ is not a divisor of $F$. The MS scheme consists of two stages. In the first stage, the received signal $r(t)$ is correlated with the first-stage local code shown in Fig. 1. The correlation is repeated on a group-by-group basis. If the correlation value corresponding to a certain group exceeds a given threshold $TH_{MS,first}$, the group is declared to be the correct group having the time delay $\tau^{(n)}$ and the process is transferred to the second stage. In the second stage, the correlation-based search is performed again with the second-stage local code (the original OOC) on a cell-by-cell basis over the $M$ cells in the correct group. As in the first stage, when the correlation value corresponding to a certain cell exceeds a given threshold $TH_{MS,second}$, which is different from the value used in the first stage, the cell is declared to be an estimate of the time delay $\tau^{(n)}$.

Fig. 1. The local codes of the MS scheme when the OOC (32, 4, 1, 1) is used: (a) the first-stage local code (M = 2); (b) the first-stage local code (M = 3); (c) the second-stage local code (original OOC)

Using the definition in [5], the MAT of the MS scheme $T_{MS}$ is given by

$$ T_{MS} = \frac{Q+1}{2} + \frac{M+1}{2}. \qquad (5) $$

From (4) and (5), we can find that the minimum value of $T_{MS}$ equals $\sqrt{F}$ when $M = \sqrt{F}$.
3.2 Enhanced Multiple-Shift (EMS) Scheme
The EMS scheme proposed in this paper also consists of two stages, like the MS scheme. However, using multiple thresholds and a modified local code, the EMS scheme has a shorter MAT than that of the MS scheme. In the first stage, the first-stage local code shown in Fig. 2 is used instead of the conventional local code. The first-stage local code consists of large and small chips. When $M$ is an even number, the number of large chips is equal to the number of small chips; otherwise, the number of large chips is larger by 1 than the number of small chips.

Fig. 2. The local codes of the EMS scheme when the OOC (32, 4, 1, 1) is used: (a) the first-stage local code (M = 2); (b) the first-stage local code (M = 3); (c) the second-stage local code (original OOC)

The power of the small chips is determined by the following conditions:

$$ \begin{cases} \alpha < 1, \\ \alpha > \lambda_a / K, \\ \alpha > \lambda_c / K, \end{cases} \qquad (6) $$

where $\alpha$ represents the power of the small chips. On the other hand, the power of the large chips is always set to 1. The correlation is repeated on a group-by-group basis as in the first stage of the MS scheme. If the correlation value corresponding to a certain group exceeds one of the given thresholds $TH_{EMS,first}$ or $th_{EMS,first}$, the group is declared to be the correct group having the time delay $\tau^{(n)}$ and the process is transferred to the second stage. In the second stage, the correlation-based search is performed again with the second-stage local code (the original OOC) on a cell-by-cell basis over the $M$ cells in the correct group. As in the first stage, when the correlation value corresponding to a certain cell exceeds a given threshold $TH_{EMS,second}$, the cell is declared to be an estimate of the time delay $\tau^{(n)}$.
Fig. 3. The local codes of the MS and EMS schemes when $M > t_{min}$: (a) the original OOC ($t_{min}$ = 3 chips); (b) the first-stage local code of the MS scheme (M = 4); (c) the first-stage local code of the EMS scheme (M = 4)
In the first stage, if the correlation value of the correct group is equal to or larger than $TH_{EMS,first}$, we only search the first half of the $M$ cells of the correct group in the second stage; otherwise, if the correlation value of the correct group is equal to or larger than $th_{EMS,first}$, the search is performed over the second half of the $M$ cells of the correct group in the second stage. In the EMS scheme, the two thresholds $TH_{EMS,first}$ and $th_{EMS,first}$ can be determined based on the powers of the large and small chips in the first-stage local code. Using the definition in [5], the MAT of the EMS scheme $T_{EMS}$ is given by

$$ T_{EMS} = \frac{Q+1}{2} + \frac{M+2}{4}. \qquad (7) $$

As we can see in (7), the MAT in the second stage is reduced by half, and the minimum value of $T_{EMS}$ equals $\sqrt{F/2} + 1$ when $M = \sqrt{2F}$. For a correct operation of the proposed EMS and conventional MS schemes, $M$ has to be equal to or smaller than $t_{min}$, where $t_{min}$ is the minimum chip interval of the OOC. Otherwise, chips of the local code overlap as shown in Fig. 3, where $t_{min} = 3$ and $M = 4$. In this case, the EMS scheme cannot guarantee a correct operation and good performance.
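The trade-off between (5) and (7) is easy to verify numerically. The short sketch below, which assumes the ceiling relation (4), scans M and reproduces the optima quoted above for the F = 200 code used later in Section 5.

```python
import math

def mat_ms(F, M):     # Eq. (5) with Q from Eq. (4)
    return (math.ceil(F / M) + 1) / 2 + (M + 1) / 2

def mat_ems(F, M):    # Eq. (7) with Q from Eq. (4)
    return (math.ceil(F / M) + 1) / 2 + (M + 2) / 4

F = 200  # code length used in Section 5
best_ms = min(range(1, F + 1), key=lambda M: mat_ms(F, M))
best_ems = min(range(1, F + 1), key=lambda M: mat_ems(F, M))
print(best_ms, mat_ms(F, best_ms))     # flat minimum around M ~ sqrt(F) ~ 14
print(best_ems, mat_ems(F, best_ems))  # M = 20 = sqrt(2F), MAT = 11 = sqrt(F/2) + 1
```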
4 Performance Analysis
In this section, we derive the MAT expression of the EMS scheme by modeling the acquisition process as a discrete-time Markov process [1] with the circular flow graph diagram shown in Figs. 4 and 5. In Fig. 4, 'ACQ' and 'FA' represent the acquisition and false alarm states, respectively, and $P_D$ ($p_d$) and $P_{FA}$ ($p_{fa}$) denote the detection and false alarm probabilities in the first (second) stage, respectively. Fig. 5 represents the transition process diagrams of $H_{det}(z)$ in Fig. 4. In Fig. 4, states $Q_1, Q_2, \cdots, Q_{Q-1}$ correspond to the incorrect groups and state $Q_Q$ corresponds to the correct group, where the acquisition can be achieved. States $s_1, s_2, \cdots, s_{M/2}$ and states $s_{(M/2)+1}, s_{(M/2)+2}, \cdots, s_M$ correspond to cells in the second stage when the correlation value of the correct group is large (correlation value $\ge TH_{EMS,first}$) and small ($th_{EMS,first} \le$ correlation value $< TH_{EMS,first}$), respectively. The gains $H_{det}(z)$ and $H_{miss}(z)$ represent the transition gains from $Q_Q$ to ACQ (detection) and from $Q_Q$ to $Q_1$ (miss), respectively. The gains $H(z)$ and $h(z)$ represent the transition gains between successive incorrect groups and between successive incorrect cells, respectively. $L$ is the penalty time factor due to a false alarm. The gains $h(z)$, $H(z)$, $H_{det}(z)$, and $H_{miss}(z)$ can be expressed as follows:

$$ h(z) = (1 - p_{fa})z + p_{fa}z^{L+1}, \qquad (8) $$

$$ H(z) = (1 - P_{FA})z + P_{FA}\,z\,h^{M/2}(z), \qquad (9) $$

$$ H_{det}^{(v)}(z) = p_d P_D z^2 h^{v-1}(z), \qquad (10) $$
Fig. 4. Markov state model for the EMS scheme
and

$$ H_{miss}(z) = (1 - P_D)z + P_D(1 - p_d)z^2 h^{(M/2)-1}(z), \qquad (11) $$

where $v \in \{1, 2, \cdots, M/2\}$ is distributed uniformly over $[1, M/2]$ and represents the correct state. Let us assume that the search is started at $Q_i$; then the transfer function between the $Q_i$ and ACQ nodes can be written as

$$ U_i^{(v)}(z) = \frac{H^{Q-i}(z)\, H_{det}^{(v)}(z)}{1 - H_{miss}(z)\, H^{Q-1}(z)}. \qquad (12) $$

In (12), $i \in \{1, 2, \cdots, Q\}$ is assumed to be distributed uniformly over $[1, Q]$. Thus, after averaging $U_i^{(v)}(z)$ over the probability density functions (pdfs) of $i$ and $v$, we can re-write (12) as

$$ U(z) = \frac{H_{det}(z)}{1 - H_{miss}(z)\, H^{Q-1}(z)} \cdot \frac{1}{Q} \sum_{i=1}^{Q} H^{Q-i}(z), \qquad (13) $$

where $H_{det}(z) = \frac{2}{M} \sum_{v=1}^{M/2} H_{det}^{(v)}(z)$. Using the moment generating function, we can obtain the following relationship between $U(z)$ and $T_{EMS}$:

$$ U(z) = E(z^{T_{EMS}}), \qquad (14) $$
Fig. 5. Transition process diagrams of $H_{det}(z)$: (a) large group from $Q_Q$ to $Q_1$; (b) small group from $Q_Q$ to $Q_1$
where $E(\cdot)$ denotes the statistical expectation operation. In (14), $E(T_{EMS})$ can be computed from the moment generating function as

$$ E(T_{EMS}) = \left.\frac{dU(z)}{dz}\right|_{z=1} = U'(1), \qquad (15) $$

and thus can be obtained as

$$ E(T_{EMS}) = \frac{H'_{det}(1) + H'_{miss}(1)}{p_d P_D} + (Q-1)\,H'(1)\,\frac{2 - p_d P_D}{2 p_d P_D}, \qquad (16) $$

where $H'(1)$, $H'_{det}(1)$, and $H'_{miss}(1)$ can be expressed as

$$ H'(1) = 1 + \frac{M}{2}\, P_{FA}(1 + L p_{fa}), \qquad (17) $$
$$ H'_{det}(1) = 2 p_d P_D + \frac{(M/2) - 1}{2}\, p_d P_D (1 + L p_{fa}), \qquad (18) $$

and

$$ H'_{miss}(1) = (1 - P_D) + P_D(1 - p_d)\left[(M/2) + 1 + \{(M/2) - 1\}\, L p_{fa}\right], \qquad (19) $$

respectively. When $p_d = P_D = 1$ and $p_{fa} = P_{FA} = 0$ (i.e., the single-user case), (16) reduces to

$$ E(T_{EMS}) = \frac{Q+1}{2} + \frac{M+2}{4}. \qquad (20) $$

After differentiating (20), we finally obtain the optimum $M$ and the minimum $T_{EMS}$ as $\sqrt{2F}$ and $\sqrt{F/2} + 1$, respectively, when $F \gg 1$.
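As a sanity check on the closed form (16)-(19), the sketch below evaluates it directly and confirms that it collapses to (20) in the single-user case.

```python
import math

def e_t_ems(F, M, PD, pd, PFA, pfa, L):
    """E(T_EMS) assembled from Eqs. (16)-(19)."""
    Q = math.ceil(F / M)
    H1 = 1 + (M / 2) * PFA * (1 + L * pfa)                               # (17)
    Hdet1 = 2 * pd * PD + ((M / 2) - 1) / 2 * pd * PD * (1 + L * pfa)    # (18)
    Hmiss1 = (1 - PD) + PD * (1 - pd) * ((M / 2) + 1 + ((M / 2) - 1) * L * pfa)  # (19)
    return (Hdet1 + Hmiss1) / (pd * PD) \
        + (Q - 1) * H1 * (2 - pd * PD) / (2 * pd * PD)                   # (16)

F, M, L = 200, 20, 10
single_user = e_t_ems(F, M, PD=1, pd=1, PFA=0, pfa=0, L=L)
Q = math.ceil(F / M)
assert abs(single_user - ((Q + 1) / 2 + (M + 2) / 4)) < 1e-9  # matches Eq. (20)
print(single_user)  # 11.0 for F = 200, M = 20
```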
5 Simulation Results
In this section, we compare the MAT of the conventional MS scheme with that of the proposed EMS scheme in single-user and multi-user environments. The simulation parameters are as follows: $F = 200$; $K = 5$; $\lambda_a = \lambda_c = 1$; $N = 1 \sim 4$; $\alpha = 0.75$; $TH_{MS,first} = TH_{MS,second} = TH_{EMS,first} = TH_{EMS,second} = K$; $th_{EMS,first} = \alpha K$; and $L = 10$. We assume that each user transmits the data 1 or 0 with equal probability and that the chip timing is perfectly synchronized. Fig. 6 shows the MAT performance of the MS and EMS schemes as a function of $M$ in a single-user environment.

Fig. 6. The MATs of the MS and EMS schemes in a single-user environment
Fig. 7. The MATs of the MS and EMS schemes in a multi-user environment
In Fig. 6, the dotted and solid lines represent the simulation results of the MS and EMS schemes, respectively, and the markers $\nabla$ and $\circ$ represent the theoretical results of the MS and EMS schemes, respectively. From Fig. 6, we can observe that the MAT of the EMS scheme is shorter than that of the MS scheme, and confirm that the MS and EMS schemes provide the minimum MAT when $M = 14$ ($\sqrt{F} \approx 14$) and $M = 20$ ($\sqrt{2F} = 20$), respectively. The difference in MAT between the MS and EMS schemes increases as $M$ increases. Fig. 7 shows the MAT performance of the MS and EMS schemes as a function of $M$ in a multi-user environment. From Fig. 7, we can observe that the MAT of the EMS scheme is shorter than that of the MS scheme, and that the difference in MAT between the MS and EMS schemes increases as $M$ increases, as in the single-user environment.
6 Conclusion
In this paper, we have proposed a novel code acquisition scheme called EMS for optical CDMA systems. Exploiting multiple thresholds and a modified local code, the proposed EMS scheme provides a shorter MAT than the MS scheme. The performance of the EMS scheme has been analyzed using a circular flow graph diagram. The simulation results have confirmed that the EMS scheme offers a shorter MAT than the MS scheme in both single-user and multi-user environments.
References

1. Polydoros, A., Weber, C.L.: A unified approach to serial-search spread spectrum code acquisition – Part I: General theory. IEEE Trans. Commun. 32(5), 542–549 (1984)
2. Chong, D., Lee, B., Kim, S., Joung, Y.-B., Song, I., Yoon, S.: Phase-shift-network-based differential sequential estimation for code acquisition in CDMA systems. Journal of Korea Inform. and Commun. Society 32(3), 281–289 (2007)
3. Salehi, J.A.: Code division multiple-access techniques in optical fiber networks – Part I: Fundamental principles. IEEE Trans. Commun. 37(8), 824–833 (1989)
4. Keshavarzian, A., Salehi, J.A.: Optical orthogonal code acquisition in fiber-optic CDMA systems via the simple serial-search method. IEEE Trans. Commun. 50(3), 473–483 (2002)
5. Keshavarzian, A., Salehi, J.A.: Multiple-shift code acquisition of optical orthogonal codes in optical CDMA systems. IEEE Trans. Commun. 53(3), 687–697 (2005)
6. Salehi, J.A., Brackett, C.A.: Code division multiple-access techniques in optical fiber networks – Part II: Systems performance analysis. IEEE Trans. Commun. 37(8), 834–842 (1989)
7. Stok, A., Sargent, E.H.: Lighting the local area: Optical code-division multiple access and quality of service provisioning. IEEE Network 14(6), 42–46 (2000)
AltBOC and CBOC Correlation Functions for GNSS Signal Synchronization

Youngpo Lee, Youngyoon Lee, Taeung Yoon, Chonghan Song, Sanghun Kim, and Seokho Yoon

School of Information and Communication Engineering, Sungkyunkwan University, 300 Chunchun-dong, Jangan-gu, Suwon, Gyeonggi-do, 440-746, Korea
{leeyp204,news8876,ytw0201,starsong83,ksh7150,syoon}@skku.edu
Abstract. Binary offset carrier (BOC) signal synchronization is based on the correlation between the received and locally generated BOC signals. Thus, the multiple side-peaks in the BOC autocorrelation are one of the main error sources in synchronizing BOC signals. Recently, new correlation functions with no side-peak were proposed by the authors for sine-phased and cosine-phased BOC signal synchronization [3]. In this paper, we propose new correlation functions with no side-peak for alternative BOC (AltBOC) and composite BOC (CBOC) signals by using an approach similar to that of the previous work.
1 Introduction
In binary offset carrier (BOC) modulation, the BOC signal is created by the product of the data signal, a pseudo random noise (PRN) code, and a sub-carrier. Due to its capability to resist multipath and its spectrum being separated from that of the global positioning system (GPS) signal [1], BOC has been adopted as the modulation method for global navigation satellite systems (GNSSs), including the European Galileo and the GPS III systems [2]. In GNSSs, a timing error in the synchronization process can result in a critical positioning error. Thus, timing synchronization is crucial for reliable GNSS-based communications. BOC signal synchronization is generally carried out in two stages: acquisition and tracking. In the acquisition stage, the time phase of the locally generated BOC signal is first aligned with that of the received signal within the allowable tracking range; then, fine adjustment is performed in the tracking stage to achieve synchronization. Fig. 1 shows the autocorrelation and early-late discriminator output for a BOC signal in the acquisition and tracking stages, respectively, where $T_b$ is the PRN code chip duration, $d$ denotes the early-late spacing, and a false alarm is the event that an autocorrelation value outside the allowable tracking range exceeds a specified threshold.
This research was supported by the Information Technology Research Center program of the Institute for Information Technology Advancement, under Grant IITA-2009-C1090-0902-0005, with funding from the Ministry of Knowledge Economy, Korea. Corresponding author.
Fig. 1. The ambiguity problem in BOC signal synchronization
From the figure, we can see that the autocorrelation has multiple side-peaks, which would increase the false alarm probability and, consequently, might cause the synchronization process to converge to a false lock point. This is called the ambiguity problem. In order to solve it, new correlation functions with no side-peak were proposed for the original BOC signals, including sine-phased BOC (SinBOC) and cosine-phased BOC (CosBOC) [3], whose sub-carriers are the sine- and cosine-phased square waves taking binary ±1 values, respectively. However, the correlation functions with no side-peak proposed for the original BOC signals cannot be employed for the alternative BOC (AltBOC) and composite BOC (CBOC) signals specialized for use on the Galileo E5 band and the Galileo open service, respectively [2]. In this paper, thus, we propose new correlation functions for the synchronization of AltBOC and CBOC signals with an approach similar to that in [3]. The remainder of this paper is organized as follows. Section 2 describes the AltBOC and CBOC signal models and proposes new correlation functions for AltBOC and CBOC signals. Section 3 presents simulation results, and finally, Section 4 concludes this paper.
2 Proposed Correlation Functions

2.1 Correlation Function with No Side-Peak for AltBOC Signals
A general AltBOC signal AltBOC(kn, n), where $n$ represents the ratio of the PRN code rate to 1.023 MHz and $k$ is the integer ratio of the PRN chip duration to the period of the square wave sub-carrier, can be expressed as

$$ S_{alt}(t) = c(t)\,d(t)\,c_{alt}(t), \qquad (1) $$
Fig. 2. Waveforms of the conventional and proposed sub-carrier signals for AltBOC(kn, n)
where $c(t)$ and $d(t)$ denote the PRN code with duration $T_b$ and the navigation data, respectively, and $c_{alt}(t)$ is the square wave sub-carrier for the AltBOC signal with period $T_c$, whose waveform is shown in Fig. 2. With this sub-carrier, AltBOC provides a higher degree of spectral separation than SinBOC and CosBOC, and thus has the advantage that multiple signals can easily be processed at the same time, allowing low hardware complexity. In this paper, we assume that $d(t) = 1$, as in a pilot channel. A GNSS often includes a pilot channel to achieve rapid synchronization in the absence of data modulation on the transmitted signal [4]. The proposed correlation function is generated by using correlations between the received AltBOC(kn, n) signal and the newly designed partial sub-carrier signals $c_{alt}^0(t), c_{alt}^1(t), \cdots, c_{alt}^{2k-1}(t)$ shown in Fig. 2. Each partial sub-carrier with duration $T_c/2$ is generated by being separated from the AltBOC sub-carrier, and so the total duration of $c_{alt}^0(t), c_{alt}^1(t), \cdots, c_{alt}^{2k-1}(t)$ is equal to the PRN duration $T_b$. Fig. 3 shows the process of generating the proposed correlation function for AltBOC(kn, n), where $R_{alt}^0, R_{alt}^1, \cdots, R_{alt}^{2k-1}$ denote the correlations between the received AltBOC(kn, n) signal and the partial sub-carrier signals $c_{alt}^0(t), c_{alt}^1(t), \cdots, c_{alt}^{2k-1}(t)$ over $T_b$, respectively.

Fig. 3. The process of generating the proposed correlation function for AltBOC(kn, n)

From the figure, we can see that the operation $|R_{alt}^q| + |R_{alt}^{2k-q-1}|$ for $q = 0, 1, \cdots, k-1$ creates a correlation function with $4k - 2q - 1$ peaks, including a main-peak and $2q$ side-peaks of the same shape as the main-peak. On the other hand, the operation $|R_{alt}^q - R_{alt}^{2k-q-1}|$ creates a correlation function with $4k - 4q - 2$ side-peaks only, where the main-peak and the $2q$ side-peaks are removed. Thus, with the operation $|R_{alt}^q| + |R_{alt}^{2k-q-1}| - |R_{alt}^q - R_{alt}^{2k-q-1}|$, we can obtain a correlation function with a main-peak and $2q$ side-peaks of the same shape as the main-peak. Finally, removing the $2q$ side-peaks and increasing the main-peak magnitude, we can obtain a correlation function with no side-peak as
$$ R_{alt}^{proposed} = \sum_{q=0}^{k-1} \left\{ |R_{alt}^q| + |R_{alt}^{2k-q-1}| - |R_{alt}^q - R_{alt}^{2k-q-1}| \right\}. \qquad (2) $$
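The combining rule in (2) acts only on the partial correlations, so it can be sketched independently of the receiver front end. The toy arrays below are hypothetical stand-ins for $R_{alt}^q$; the key identity is that $|x| + |y| - |x - y|$ keeps only components on which the paired correlations agree in sign.

```python
import numpy as np

def combine_no_sidepeak(R):
    """Eq. (2): sum over q of |R^q| + |R^(2k-q-1)| - |R^q - R^(2k-q-1)|."""
    k = len(R) // 2
    out = np.zeros_like(R[0], dtype=float)
    for q in range(k):
        out += np.abs(R[q]) + np.abs(R[2 * k - q - 1]) \
             - np.abs(R[q] - R[2 * k - q - 1])
    return out

# Hypothetical partials (k = 1): the main-peak (middle lag) has the same sign
# in both partial correlations, while the side-peaks have opposite signs.
R0 = np.array([0.0, -0.5, 1.0, 0.5, 0.0])
R1 = np.array([0.0, 0.5, 1.0, -0.5, 0.0])
print(combine_no_sidepeak([R0, R1]))  # [0. 0. 2. 0. 0.]: side-peaks cancel
```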
We also consider the case where the ratio of the PRN chip duration to the period of the square wave sub-carrier is not an integer, e.g., AltBOC(1.5n, n). For such AltBOC signals, the approach used in the derivation of (2) cannot be fully adopted; alternatively, we can obtain the correlation function with no side-peak as

$$ R_{alt}^{proposed} = \frac{1}{2}\left\{ |R_{alt}^0| + |R_{alt}^2| - |R_{alt}^0 + R_{alt}^2| \right\} \times |R_{alt}^1|. \qquad (3) $$

2.2 Correlation Function with No Side-Peak for CBOC Signals
The CBOC signal CBOC(u, v, γ) is obtained from a weighted sum of two BOC signals SinBOC(u, 1) and SinBOC(v, 1) with the power split ratio γ, where SinBOC(kn, n) is defined as the product of the navigation data, a PRN code, and a square wave sub-carrier $\mathrm{sgn}\{\sin(2\pi knt \times 1.023 \times 10^6)\}$. For example, CBOC(6, 1, 1/11) is generated through the combination of SinBOC(6, 1) and SinBOC(1, 1) with spectrum components given by

$$ G_{CBOC}(f) = \frac{1}{11}\, G_{SinBOC(6,1)}(f) + \frac{10}{11}\, G_{SinBOC(1,1)}(f), \qquad (4) $$

where $G_{SinBOC(\cdot,\cdot)}(f)$ is the unit power spectral density of a SinBOC signal defined in [5]. Thus, it is natural to consider $u$ and $v$ with $u/v$ ($u > v$) an even number, to guarantee orthogonality between the two SinBOC (fundamentally, sub-carrier) signals.
Fig. 4. Waveforms of the conventional and proposed sub-carrier signals for CBOC(u, v, γ)
Fig. 5. The proposed correlation function and autocorrelation function of CBOC(6, 1, 1/11)
Then, the CBOC(u, v, γ) signal can be expressed as

$$ S_c(t) = c(t)\,d(t)\,c_c(t) = c(t)\,d(t)\left\{ \sqrt{\gamma}\, c_{sin(u,1)}(t) + \sqrt{1-\gamma}\, c_{sin(v,1)}(t) \right\}, \qquad (5) $$
where the CBOC sub-carrier $c_c(t)$ is the weighted sum of the two square wave sub-carriers $c_{sin(u,1)}(t)$ ($= \mathrm{sgn}\{\sin(2\pi ut \times 1.023 \times 10^6)\}$) and $c_{sin(v,1)}(t)$ ($= \mathrm{sgn}\{\sin(2\pi vt \times 1.023 \times 10^6)\}$). Fig. 4 shows the waveform of CBOC(u, v, γ), where $T_c$ denotes the period of the sub-carrier $c_{sin(v,1)}(t)$. Assuming $d(t) = 1$ and using the same approach as in Subsection 2.1, we can obtain a correlation function with no side-peak as

$$ R_c^{proposed} = \sum_{q=0}^{v-1} \left\{ |R_c^q| + |R_c^{2v-q-1}| - |R_c^q - R_c^{2v-q-1}| \right\}, \qquad (6) $$

where $R_c^0, R_c^1, \cdots, R_c^{2v-1}$ denote the correlations between the received CBOC(u, v, γ) signal and the partial sub-carrier signals over $T_b$, respectively. Fig. 5 shows the proposed correlation function and the autocorrelation function of CBOC(6, 1, 1/11); we can observe that the side-peaks are completely removed with the proposed correlation function.
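For reference, a short sketch that renders the CBOC sub-carrier of (5) over one period; the sampling grid is an arbitrary choice for illustration.

```python
import numpy as np

def cboc_subcarrier(t, u=6, v=1, gamma=1 / 11):
    """c_c(t) from Eq. (5): weighted sum of two sgn(sin) square waves."""
    f0 = 1.023e6
    return (np.sqrt(gamma) * np.sign(np.sin(2 * np.pi * u * f0 * t))
            + np.sqrt(1 - gamma) * np.sign(np.sin(2 * np.pi * v * f0 * t)))

Tc = 1 / 1.023e6                              # period of c_sin(v,1)
t = (np.arange(1000) + 0.5) * Tc / 1000       # mid-sample grid over one period
levels = np.unique(np.round(cboc_subcarrier(t), 3))
print(levels)  # the four CBOC amplitude levels, approx. +/-0.652 and +/-1.255
```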
3 Simulation Results
In this section, the proposed correlation functions are compared with the autocorrelation functions in terms of the power ratio between the main-peak and all peaks (including the main-peak), defined as

$$ \text{power ratio} = \frac{\text{power in the main-peak}}{\text{power in all peaks}}. \qquad (7) $$
It should be noted that a power ratio of 1 means that a correlation function does not have any side-peak. For the simulation, we assume the two-path channel model [6], whose impulse response is defined as

$$ h(t) = \delta(t) + \alpha\,\delta(t - \beta), \qquad (8) $$
where $\alpha$ ($\alpha < 1$) denotes the attenuation factor of the second path, which is set to 0.5 in this paper; $\beta$ ($\beta > 0$) is the time difference between the first and second paths; and $\delta(t)$ represents the Dirac delta function. Fig. 6 shows the power ratio of the proposed correlation and autocorrelation functions for AltBOC(kn, n) as a function of $k$ in an interference-free environment. From the figure, we can observe that the proposed correlation function has a power ratio of 1 for the whole range of $k$ values shown, implying that there is no side-peak. Figs. 7 and 8 show the power ratio of the proposed correlation and autocorrelation functions for AltBOC(n, n) and AltBOC(1.5n, n), respectively, in a two-path channel. As we can see from the figures, the proposed correlation function has a better power ratio than that of the AltBOC autocorrelation function in the two-path channel. Figs. 9 and 10 show the power ratio of the proposed correlation and autocorrelation functions for CBOC(6, 1, 1/11) and CBOC(6, 1, 4/33), respectively, in a two-path channel. From the figures, we can see that the proposed CBOC correlation function has a better power ratio than that of the CBOC autocorrelation function in the two-path channel.

Fig. 6. Power ratio of the proposed correlation and autocorrelation functions for AltBOC(kn, n) as a function of k in an interference-free environment

Fig. 7. Power ratio of the proposed correlation and autocorrelation functions for AltBOC(n, n) in a multipath channel
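The metric (7) and the channel (8) used in these figures can be reproduced with the sketch below; the local-maximum peak rule and the toy correlation values are our assumptions, since the paper does not spell out a peak detector.

```python
import numpy as np

def power_ratio(R):
    """Eq. (7) with peaks taken as local maxima of |R| (an assumption)."""
    m = np.abs(R)
    peaks = [i for i in range(1, len(m) - 1)
             if m[i] > m[i - 1] and m[i] >= m[i + 1] and m[i] > 0]
    p = np.array([m[i] ** 2 for i in peaks])
    return p.max() / p.sum()

def two_path(R, alpha=0.5, beta=3):
    """Eq. (8): add an attenuated copy delayed by beta samples."""
    delayed = alpha * np.roll(R, beta)
    delayed[:beta] = 0.0
    return R + delayed

R = np.array([0.0, 0.2, 0.0, 1.0, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0])
print(power_ratio(R))            # ~0.926 in the clean channel
print(power_ratio(two_path(R)))  # ~0.769: multipath lowers the power ratio
```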
Fig. 8. Power ratio of the proposed correlation and autocorrelation functions for AltBOC(1.5n, n) in a multipath channel
Fig. 9. Power ratio of the proposed correlation and autocorrelation functions for CBOC(6, 1, 1/11) in a multipath channel
Fig. 10. Power ratio of the proposed correlation and autocorrelation functions for CBOC(6, 1, 4/33) in a multipath channel
4 Conclusion
In this paper, new correlation functions with no side-peak have been proposed for AltBOC and CBOC signal synchronization. We first created new sub-carrier signals by dividing the conventional sub-carrier signals of the AltBOC and CBOC signals. Then, new correlation functions with no side-peak were obtained by combining the correlations between the received AltBOC/CBOC signal and the divided sub-carrier signals. In an interference-free channel, the proposed correlation functions have a power ratio of 1, i.e., they have no side-peak. In a multipath channel, on the other hand, the proposed correlation functions have a power ratio less than 1; however, they still have much better power ratios than those of the autocorrelation functions.
References

1. Julien, O., Macabiau, C., Cannon, M.E., Lachapelle, G.: ASPeCT: Unambiguous sine-BOC(n,n) acquisition/tracking technique for navigation applications. IEEE Trans. Aerospace and Electronic Systems 43, 150–162 (2007)
2. Fantino, M., Marucco, G., Mulassano, P., Pini, M.: Performance analysis of MBOC, AltBOC and BOC modulations in terms of multipath effects on the carrier tracking loop within GNSS receivers. In: Proc. IEEE/ION Position Location and Navigation Symposium (PLANS), Monterey, CA (May 2008)
3. Kim, S., Chong, D., Joung, Y.-B., Ahn, S., Lee, Y., Kim, S.Y., Yoon, S.: A novel unambiguous BOC signal synchronization scheme for global navigation satellite systems. In: Proc. IASTED Commun. Internet, Inform. Technol. (CIIT), Banff, AB, Canada, pp. 43–48 (July 2007)
4. Lohan, E.S., Lakhzouri, A., Renfors, M.: Complex double-binary-offset-carrier modulation for a unitary characterisation of Galileo and GPS signals. IEE Proc.-Radar Sonar Navig. 153, 403–408 (2006)
5. Betz, J.W.: Binary offset carrier modulations for radionavigation. Journal of The Institute of Navigation 48, 227–246 (Winter 2001–2002)
6. Soubielle, J., Fijalkow, I., Duvaut, P., Bibaut, A.: GPS positioning in a multipath environment. IEEE Trans. Signal Process. 50, 141–150 (2002)
Performance Enhancement of IEEE 802.11b WLANs Using Cooperative MAC Protocol

Jin-Seong Kim and Tae-Jin Lee

School of Information and Communication Engineering, Sungkyunkwan University, 440-746, Suwon, South Korea, +82-31-290-7149
{kimjinseong,tjlee}@ece.skku.ac.kr
Abstract. The IEEE 802.11 Distributed Coordination Function (DCF) guarantees fair transmission opportunities to stations. Channel occupation by low rate stations in an IEEE 802.11 Wireless Local Area Network (WLAN) causes the performance anomaly, resulting in degradation of the overall network throughput. Using cooperative communication between a high rate station and a low rate station, the performance anomaly can be mitigated in an IEEE 802.11 WLAN. In this paper, we propose a new cooperative Medium Access Control (MAC) protocol to remedy the performance anomaly and to improve transmission opportunities for low rate stations. We show that the proposed cooperative MAC protocol can enhance overall performance (e.g., throughput and MAC delay).
1 Introduction

IEEE 802.11 is widely used as a major standard for Wireless Local Area Networks (WLANs). IEEE 802.11a/b/g supports multiple rates for the stations in a network. Using a link adaptation algorithm, IEEE 802.11a/b/g virtually groups the stations by their rates in a WLAN. IEEE 802.11b provides four transmission rates, for groups 1, 2, 3, and 4, i.e., 11 Mbps, 5.5 Mbps, 2 Mbps, and 1 Mbps, respectively [1]. The IEEE 802.11 Medium Access Control (MAC) supports Carrier Sense Multiple Access / Collision Avoidance (CSMA/CA) for the Distributed Coordination Function (DCF) in contention-based access, and it also supports the Point Coordination Function (PCF) in contention-free access. CSMA/CA operates with the Binary Exponential Backoff (BEB) mechanism [2]. IEEE 802.11 CSMA/CA guarantees fair transmission opportunities to stations. If all stations have the same size of data frames for transmission, low rate stations consume more channel occupation time than high rate stations. Thus, the decreased channel occupation time of high rate stations causes network performance degradation. This phenomenon caused by low rate stations is known as the performance anomaly [3]. In this paper, we propose a new cooperative communication scheme to solve the performance anomaly and to enhance throughput. Cooperative communications have been researched in order to improve overall throughput with the help of relays. CoopMAC was proposed for enhancement of throughput and mitigation of service delays to stations [4]. For reliable transmission between the Access Point (AP) and stations, Cooperative Diversity MAC (CD-MAC) was suggested [5]. Cooperative Communication MAC (CMAC) was proposed to support robust transmission of stations with adaptive Forward Error Control (FEC)
schemes via cooperation [6]. In this paper, we propose a new cooperative protocol. We solve the performance anomaly with the proposed cooperative MAC protocol, which is a relay-assisted protocol; it improves throughput and reduces MAC delay. The organization of this paper is as follows. We propose a cooperative MAC protocol and present its operating procedures in Section 2. Then, we describe the simulation environment and measure system performance (e.g., throughput, fairness, and MAC delay) by simulations in Section 3. Finally, we conclude in Section 4.
2 Proposed Cooperative MAC Protocol for IEEE 802.11b

The proposed cooperative MAC protocol operates as in Fig. 1. We define low rate stations as the stations located in group 4 with the transmission rate of 1 Mbps. Helper stations are the stations in group 3 with the transmission rate of 2 Mbps. There are three components for cooperative communication in the proposed cooperative MAC protocol: an AP, a helper station, and a low rate sending station. Suppose that the channel quality between the helper station and the low rate station is good enough to support the 11 Mbps rate, because those stations are located near each other. A helper station selects a low rate station among the stations placed within its 11 Mbps transmission range. Helper stations can work as relays for low rate stations in the network. A helper station forwards a data frame of the low rate station to the AP using the Decode-and-Forward (DF) cooperation technique. The proposed cooperative MAC protocol is triggered by a helper station. In this paper, we assume that there are sufficient helper stations in a WLAN and that a helper station recognizes and associates with a proper low rate station for cooperative communication. To resolve the hidden node problem, we use the four-way handshaking method with Request-To-Send / Clear-To-Send (RTS/CTS) messages. The operating procedure among a helper station, a low rate station, and an AP is illustrated in Fig. 2. A helper station sends a HELLO frame to the selected low rate station. If the helper station receives an Acknowledgement (ACK) frame from the low rate station, the helper station knows that the low rate station demands cooperative communication.
Fig. 1. Proposed cooperative MAC protocol operating scenario in IEEE 802.11b WLAN
Fig. 2. Operating procedure of the proposed cooperative MAC protocol
The helper station then transmits a cooperation RTS (coopRTS) frame to the AP. When the AP receives the coopRTS frame from the helper station successfully, the AP transmits a cooperation CTS (coopCTS) frame to the helper station and the low rate station. The helper station and the low rate station, on receiving the coopCTS frame, recognize that a channel is established for cooperative communication. The low rate station sends a data frame to the helper station at the 11 Mbps transmission rate. After the reception of the data frame from the low rate station, the helper station forwards the data frame at the 2 Mbps transmission rate using the DF cooperation technique. If the low rate station receives a cooperation ACK (coopACK) frame and the helper station overhears the coopACK frame toward the low rate station, the helper station sends its own data frame to the AP. When the AP receives the data frame of the helper station successfully, the AP transmits a coopACK frame to the helper station. The helper station, which receives this coopACK frame, and the low rate station, which overhears it, terminate the proposed cooperative MAC protocol. Most low rate stations are expected to benefit from cooperative communication, because they can get a higher effective transmission rate through cooperation than through direct transmission. However, demands for cooperative communication are an overhead for helper stations. For efficient control of cooperative communication, helper stations determine whether cooperative communication is used or not; hence, we propose a cooperative MAC protocol triggered by helper stations. To guarantee transmission opportunities for low rate stations, we make helper stations operate the proposed cooperative MAC protocol with a selected low rate station with probability 1/2. Channel establishment is needed for reliable cooperative communication between the low rate station and the helper station, as well as between the helper station and the AP. Thus, the proposed cooperative MAC protocol defines four new control frames, as shown in Fig. 3. Using the broadcast nature of wireless communications, the proposed cooperative MAC protocol minimizes the time for the channel establishment process.
Fig. 3. Proposed control frames for the cooperative MAC protocol (field sizes in bytes):

(a) HELLO:   Frame control (2) | Duration (2) | Receiver Address – low rate station (6) | Transmitter Address – helper station (6) | FCS (4)
(b) coopRTS: Frame control (2) | Duration (2) | Receiver Address – AP (6) | Transmitter Address – helper station (6) | Optional Address – low rate station (6) | FCS (4)
(c) coopCTS: Frame control (2) | Duration (2) | Receiver1 Address – helper station (6) | Receiver2 Address – low rate station (6) | FCS (4)
(d) coopACK: Frame control (2) | Duration (2) | Receiver Address – helper or low rate station (6) | FCS (4)
3 Simulation Results

We consider a network environment as in Fig. 4. The number of stations in group 1 is 5, in group 2 is 5, and in group 3 is 10. As we increase the number of low rate stations from 0 to 10, we evaluate the performance of the proposed cooperative MAC protocol. We assume that the channel condition is ideal, in which path loss and interference do not affect the channel quality. The stations are assumed to be fixed, and we assume a saturated traffic condition, which means that the transmission queues of all stations are always full. Helper stations are located in group 3 and already have corresponding low rate stations for cooperative communication.
Fig. 4. Network topology for the cooperative MAC protocol in IEEE 802.11b WLAN
The number of groups is G = 4, and i denotes the group index, 1 ≤ i ≤ G. The transmission rate of a station in group i is denoted as $r_i$; for i = 1, 2, 3, and 4, $r_i$ is 11 Mbps, 5.5 Mbps, 2 Mbps, and 1 Mbps, respectively. In this simulation, we use the frame sizes and parameters in accordance with the IEEE 802.11b standard (Tables 1 and 2). The size of the MAC header is $S_{MACheader}$, and $T_{MACheader,i}$ is the time duration to transmit a MAC header by a station in group i. The time to transmit the MAC Service Data Unit (MSDU) by a station in group i is defined as $T_{MSDU,i}$, and $S_{MSDU}$ is the size of the MSDU.

Table 1. Size of frames (unit: bytes)

Frame        Value
PLCP header  24
MAC header   28
ACK          14
RTS          20
CTS          14
HELLO        20
coopRTS      26
coopCTS      20
coopACK      14
MSDU         2304

Table 2. Simulation parameters (unit: μsec)

Parameter        Value
T_PLCP           192 bits / 1 Mbps
T_RTS            T_PLCP + 20×8 bits / 1 Mbps
T_CTS            T_PLCP + 14×8 bits / 1 Mbps
T_ACK            T_PLCP + 14×8 bits / 1 Mbps
T_MACheader,i    T_PLCP + S_MACheader×8 bits / r_i Mbps
T_MSDU,i         S_MSDU×8 bits / r_i Mbps
T_HELLO          T_PLCP + 20×8 bits / 1 Mbps
T_coopRTS        T_PLCP + 26×8 bits / 1 Mbps
T_coopCTS        T_PLCP + 20×8 bits / 1 Mbps
T_coopACK        T_PLCP + 14×8 bits / 1 Mbps
T_sifs           10
T_difs           50
T_slot           20
δ                1
Physical Layer Convergence Procedure (PLCP) headers and control frames are transmitted at the rate of 1 Mbps for reliability. $T_{PLCP}$, $T_{RTS}$, $T_{CTS}$, and $T_{ACK}$ represent the time durations to transmit a PLCP header, an RTS frame, a CTS frame, and an ACK frame, respectively. $T_{HELLO}$, $T_{coopRTS}$, $T_{coopCTS}$, and $T_{coopACK}$ are the time durations to transmit a HELLO frame, a coopRTS frame, a coopCTS frame, and a coopACK frame. The propagation delay of each frame is defined as δ. The time durations of the Short Interframe Space (SIFS), the DCF Interframe Space (DIFS), and a slot time are $T_{sifs}$, $T_{difs}$, and $T_{slot}$, respectively. Using these parameters, we compute $T_{succ,i}$, the time required for a successful transmission by a station in group i using the conventional IEEE 802.11b RTS/CTS mechanism:

$$ T_{succ,i} = T_{RTS} + T_{sifs} + \delta + T_{CTS} + T_{sifs} + \delta + T_{MACheader,i} + T_{MSDU,i} + T_{sifs} + \delta + T_{ACK} + T_{difs} + \delta, \quad 1 \le i \le G. \qquad (1) $$

$T_{coll,i}$ is the time consumed when a collision occurs for a station in group i using the conventional IEEE 802.11b RTS/CTS mechanism:

$$ T_{coll,i} = T_{RTS} + T_{difs} + \delta, \quad 1 \le i \le G. \qquad (2) $$

The successful transmission time for a station in group i using the proposed cooperative MAC protocol is

$$ T_{succ,i}^{coop} = T_{HELLO} + T_{sifs} + \delta + T_{ACK} + T_{sifs} + \delta + T_{coopRTS} + T_{sifs} + \delta + T_{coopCTS} + T_{sifs} + \delta + T_{MACheader,coop} + T_{MSDU,coop} + T_{sifs} + \delta + T_{MACheader,i} + T_{MSDU,i} + T_{sifs} + \delta + T_{coopACK} + T_{MACheader,i} + T_{MSDU,i} + T_{sifs} + \delta + T_{coopACK} + T_{difs} + \delta, \quad i = G - 1. \qquad (3) $$

When the proposed cooperative MAC protocol is operating,

$$ T_{MACheader,coop} = T_{PLCP} + S_{MACheader} \times 8 / r_{coop} \qquad (4) $$

is the time duration to send a MAC header from the low rate station to the helper station. The time duration $T_{MSDU,coop}$ to forward an MSDU from the low rate station to the helper station is

$$ T_{MSDU,coop} = S_{MSDU} \times 8 / r_{coop}, \qquad (5) $$

where the transmission rate $r_{coop}$ is used when a low rate station transmits a MAC header and a data frame to the helper station in the proposed cooperative MAC protocol, i.e., $r_{coop} = 11$ Mbps. The time consumed when a collision occurs for a station in group i using the proposed cooperative MAC protocol is

$$ T_{coll,i}^{coop} = T_{HELLO} + T_{difs} + \delta, \quad i = G - 1. \qquad (6) $$
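A direct transcription of Table 2 and Eqs. (1) and (3), assuming the Table 1 frame sizes; it reports the air time (in μs) of a direct 1 Mbps transmission and of the relayed exchange.

```python
T_PLCP = 192.0                       # 192 bits at 1 Mbps -> 192 us
SIFS, DIFS, DELTA = 10.0, 50.0, 1.0  # T_sifs, T_difs, propagation delay (us)
S_MAC, S_MSDU = 28, 2304             # MAC header / MSDU sizes (bytes)

def ctrl(nbytes):                    # control frames are sent at 1 Mbps
    return T_PLCP + nbytes * 8 / 1

T_RTS, T_CTS, T_ACK = ctrl(20), ctrl(14), ctrl(14)
T_HELLO, T_cRTS, T_cCTS, T_cACK = ctrl(20), ctrl(26), ctrl(20), ctrl(14)

def data(r):                         # MAC header + MSDU at rate r Mbps
    return (T_PLCP + S_MAC * 8 / r) + S_MSDU * 8 / r

def t_succ(r):                       # Eq. (1): conventional RTS/CTS exchange
    return (T_RTS + SIFS + DELTA + T_CTS + SIFS + DELTA + data(r)
            + SIFS + DELTA + T_ACK + DIFS + DELTA)

def t_coop_succ(r_helper=2, r_coop=11):  # Eq. (3): relayed exchange
    return (T_HELLO + SIFS + DELTA + T_ACK + SIFS + DELTA
            + T_cRTS + SIFS + DELTA + T_cCTS + SIFS + DELTA
            + data(r_coop) + SIFS + DELTA + data(r_helper)
            + SIFS + DELTA + T_cACK
            + data(r_helper) + SIFS + DELTA + T_cACK + DIFS + DELTA)

print(t_succ(1), t_coop_succ())  # ~19892 us vs ~23072 us
```

Note that the relayed exchange takes about 23.1 ms but delivers two MSDUs (the low rate station's and the helper's), i.e., roughly 11.5 ms per MSDU versus about 19.9 ms for a single direct 1 Mbps transmission.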
The saturation throughput of the stations in group i is defined as $S_{th,i}$, $T_{total}$ is the total time of the simulation, and $N_{succ,i}$ is the average number of successful transmissions in group i. In this simulation, $T_{total}$ is 10,000 seconds and $N_{succ,i}$ is averaged over 1,000 runs.

$$ S_{th,i} = \frac{N_{succ,i}\, S_{MSDU}}{T_{total}}, \quad 1 \le i \le G. \qquad (7) $$

The total saturation throughput $S_{th}$ is then

$$ S_{th} = \sum_{i=1}^{G} S_{th,i}. \qquad (8) $$
We define the fairness index F as

$$ F = \frac{\left( \sum_{i=1}^{G} N_{succ,i} \right)^2}{G \times \sum_{i=1}^{G} (N_{succ,i})^2}. \qquad (9) $$
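Eqs. (7)-(9) in code, with hypothetical per-group success counts; F is Jain's fairness index computed over the four rate groups.

```python
def total_throughput(n_succ, s_msdu_bits=2304 * 8, t_total=10_000):
    """Eqs. (7)-(8): per-group and total saturation throughput (bit/s)."""
    per_group = [n * s_msdu_bits / t_total for n in n_succ]
    return per_group, sum(per_group)

def fairness_index(n_succ):
    """Eq. (9): Jain's fairness index; 1.0 means perfectly fair groups."""
    g = len(n_succ)
    return sum(n_succ) ** 2 / (g * sum(n * n for n in n_succ))

n_succ = [3200, 3100, 3000, 1500]  # hypothetical successes for groups 1..4
_, s_th = total_throughput(n_succ)
print(s_th / 1e6, fairness_index(n_succ))  # total Mbps and F in (0, 1]
```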
(9)
The closer to one F is, the fairer the transmission opportunities of each groups. Fig. 5 describes the saturation throughput of the network. The proposed cooperative MAC protocol can dramatically improve the saturation throughput. When the number of stations is 31, the saturation throughput of the proposed cooperative MAC protocol is enhanced by 37.5% compared with the conventional IEEE 802.11b RTS/CTS mechanism. The saturation throughput is the best for the proposed cooperative communications, although the number of low rate stations is increased. It provides an evidence that 2.6 Conventional IEEE 802.11b DCF Conventional IEEE 802.11b RTS/CTS CoopMAC Proposed cooperative MAC protocol
2.4
Throughput(Mbps)
2.2
2
1.8
1.6
1.4
1.2
1 21
22
23
24
25 26 27 Number of Stations
28
29
30
31
Fig. 5. Saturation throughput of the proposed cooperative MAC protocol, the CoopMAC, the conventional IEEE 802.11b DCF and the RTS/CTS mechanism (n1 =5, n2 =5, n3 =10, n4 =0∼10, SM SDU =2304 bytes)
342
J.-S. Kim and T.-J. Lee 1 Conventional IEEE 802.11b DCF Conventional IEEE 802.11b RTS/CTS CoopMAC Proposed cooperative MAC protocol
Fig. 6. Fairness index of the proposed cooperative MAC protocol, the CoopMAC, the conventional IEEE 802.11b DCF, and the RTS/CTS mechanism (n1 = 5, n2 = 5, n3 = 10, n4 = 0∼10, S_MSDU = 2304 bytes)
These results provide evidence that the proposed cooperative MAC protocol makes the network robust against the performance anomaly. Our proposed cooperative MAC protocol limits the transmission opportunities of low rate stations. On the other hand, CoopMAC allows direct transmission from a low rate station to the AP: even if a low rate station does not receive an HTS (Helper ready To Send) frame from a helper station, the low rate station can transmit its data frame after receiving a CTS frame from the AP [4]. In CoopMAC, since the channel occupation time of low rate stations is larger than in our proposed cooperative MAC protocol, the saturation throughput is decreased by about 13%. The fairness index of CoopMAC, however, approaches that of the conventional IEEE 802.11b RTS/CTS mechanism (see Fig. 6). In Fig. 6, we compare the proposed cooperative MAC protocol, CoopMAC, and the conventional IEEE 802.11b RTS/CTS mechanism with respect to the fairness index. Both the IEEE 802.11b RTS/CTS and basic DCF mechanisms operate using the BEB algorithm, and thus provide equal transmission opportunities to the stations in a network. In the proposed cooperative MAC protocol, the transmission opportunities of low rate stations are half of those of helper stations. Although the fairness index of the proposed cooperative MAC protocol is less than that of the conventional IEEE 802.11b RTS/CTS mechanism, the proposed protocol still guarantees 90% of the transmission chances of the conventional mechanism. Moreover, the fairness index of the proposed cooperative MAC protocol is close to that of the conventional IEEE 802.11b RTS/CTS mechanism when the number of low rate stations is 10. We thus observe that the fairness index improves as the amount of cooperative communication increases.
Fig. 7. MAC delay of the proposed cooperative MAC protocol, the CoopMAC, the conventional IEEE 802.11b DCF, and the RTS/CTS mechanism (n1 = 5, n2 = 5, n3 = 10, n4 = 0∼10, S_MSDU = 2304 bytes)
MAC delay is the average time until a transmission by a station in the network succeeds. As the number of stations increases, more collisions occur due to contention for channel occupation under the BEB algorithm. Since low rate stations do not perform the BEB procedure in the proposed cooperative MAC protocol, the MAC delay of the proposed cooperative MAC protocol is the smallest compared with those of the conventional IEEE 802.11b RTS/CTS mechanism and CoopMAC, as shown in Fig. 7. The increased MAC delay of CoopMAC is caused by the direct transmissions of low rate stations. The proposed cooperative MAC protocol reduces the MAC delay by up to 37.1% compared with the conventional IEEE 802.11b RTS/CTS mechanism.
4 Conclusion

In this paper, we have proposed a cooperative MAC protocol triggered by helper stations. The proposed cooperative MAC protocol not only mitigates the performance anomaly, but also guarantees transmission opportunities for low rate stations in IEEE 802.11b WLANs. The proposed cooperative MAC protocol improves the saturation throughput by up to 37.5% and reduces the MAC delay by up to 37.1%. Although the fairness index decreases when using the proposed cooperative MAC protocol, this can be overcome by controlling the degree of cooperative communication of the helper stations. Moreover, low rate stations can obtain transmission opportunities as fair as those of the conventional IEEE 802.11b RTS/CTS mechanism when the number of helper stations is sufficient for cooperative communication. Future work will generalize the proposed cooperative MAC protocol to other IEEE 802.11 standards and consider a realistic network topology with a channel model in WLANs.
Acknowledgement. This research was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement) (IITA-2009-C1090-0902-0005).
References

1. Gast, M.S.: 802.11 Wireless Networks. O'Reilly, Sebastopol (2002)
2. IEEE 802.11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specification (1999)
3. Heusse, M., Rousseau, F., Berger-Sabbatel, G., Duda, A.: Performance Anomaly of 802.11b. In: Proc. INFOCOM 2003, San Francisco, USA, pp. 836–843 (2003)
4. Liu, P., Tao, Z., Narayanan, S., Korakis, T., Panwar, S.S.: CoopMAC: A Cooperative MAC for Wireless LANs. IEEE Journal on Selected Areas in Communications 25(2), 340–354 (2007)
5. Moh, S., Yu, C., Park, S.-M., Kim, H.-N., Park, J.: CD-MAC: Cooperative Diversity MAC for Robust Communication in Wireless Ad Hoc Networks. In: Proc. ICC 2007, Glasgow, Scotland, pp. 3636–3641 (2007)
6. Shankar, S., Chou, C.-T., Ghosh, M.: Cooperative Communication MAC (CMAC) – A New MAC Protocol for Next Generation Wireless LANs. In: Proc. IEEE International Conference on Wireless Networks, Communications, and Mobile Computing, Maui, Hawaii, pp. 1–6 (2005)
Authentication Scheme Based on Trust and Clustering Using Fuzzy Control in Wireless Ad-Hoc Networks

Seong-Soo Park¹, Jong-Hyouk Lee¹, and Tai-Myoung Chung²

¹ Department of Computer Engineering, Sungkyunkwan University, 440-746, Suwon, Korea, +82-31-290-7222
{sspark,jhlee}@imtl.skku.ac.kr
² School of Information & Communication Engineering, Sungkyunkwan University, 440-746, Suwon, Korea, +82-31-290-7222
[email protected]
Abstract. In wireless ad-hoc networks, all nodes need to be able to authenticate each other, so it is critical to know the trustworthiness of each node in such environments. This paper proposes an authentication system for ad-hoc networks based on a clustering and trust model. First, methods of solving the authentication problems that occur in ad-hoc networks are discussed. Next, fuzzy logic inference is introduced, which is used as a trust evaluation method for each node and provides both local and global trust values. Reputation is used in the calculation of the trust value; using the trust value of a trustor, a more sophisticated trust value can be computed. If a node moves from one cluster to another, the trust level of the node is determined by the certificate issued by the previous introducers. In addition, it is demonstrated that the proposed model can monitor and isolate malicious nodes in order to enhance the network's overall authentication rate.

Keywords: authentication, trust, clustering, fuzzy control, ad-hoc networks.
1 Introduction

As the use of personal computers and the need for mobility increase, ad-hoc networks have become an important subject in the future of computing and networking architecture. Ad-hoc networks allow communication among mobile hosts, mobile terminals, and sensor nodes, which cannot be supported in a fixed network architecture [1]. Trust plays an important role when any decision is made based on unstable, uncertain information. This functionality has recently been introduced in e-commerce and virtual communities, where trust represents a subjective degree of belief [2]. In a conventional network environment, trust is set and recognized through the access control procedure. The nodes, to which fixed roles are allocated, can be replaced by various decisions in an open network environment. In this setting, reputation is an important element affecting decisions. Reputation represents a positive prediction level for an object's future behaviors, and greatly affects the decision to trust. There are two ways to obtain information regarding reputation: the subject's own experience with an agent and recommendations from objects around the subject [3].
Trust affects certain decisions, e.g., access control and the selection of a public key in the Public Key Infrastructure (PKI). Trust is available to supplement the PKI when an object decides whether to accept a public key based on the other object's trust level [4]. This paper does not assume the existence of any globally trusted entity: on the contrary, everything depends on the individual nodes of the network. The cluster-head, which has the highest trust value and is elected as the head of the group, is not a trusted third party (TTP), but operates the same as the other nodes in the same group. Basically, the cluster-head's role is to keep the cluster's information and to broadcast information to the members. Also, any node may issue a certificate binding an agent's key and identity. It is impossible to obtain an appropriate trust value if there is a lack of interaction among the nodes, as in existing trust models. However, the proposed cluster-based trust model can quickly determine the trust values, even in situations where no experience data is available, with the help of neighboring nodes or "introducers". In a real network environment, a node may turn from being trustworthy to becoming malicious under a sudden attack. The ability to detect such misbehavior, and to isolate malicious nodes, is important in public key authentication [5]. This study proposes a cluster-based network model and a trust model using fuzzy control that can evaluate trust values in order to provide an authentication service in ad-hoc environments where there is no centrally controlled server. This model is then used to discuss methods of effectively and correctly calculating the trust value of a node newly entering a cluster, methods to identify selfish nodes in a cluster, and methods to protect against other possible attacks. The main contributions of this research include the following:

- A network model based on clustering and a trust model based on fuzzy control.
- Application of the network model and the trust model to the authentication service.
- A trust model that supports both authentication and trust evaluation.

Section 2 introduces previous relevant work, including the existing concept of trust, the role of trust, the clustering mechanism, the behavior of malicious nodes, the considerations in a trust system, and fuzzy logic inference and trust management. Section 3 proposes an authentication architecture and models. A clustering-based trust evaluation model is described in Section 4, fuzzy logic in Section 5, and security operations in Section 6. In Section 7, a performance analysis is presented, and conclusions and suggested tasks for future study are described in Section 8.
2 Related Works

2.1 Trust Concept

As the concept of trust was derived from the social sciences, there are no corresponding definitions in the field of distributed computer networking. In [6], Josang described two common definitions of trust: "reliability trust" and "decision trust". Reliability trust is defined by Gambetta in [7] as "a particular level of the
subjective probability with which an agent assesses that another agent or group of agents will perform a particular action, both before he can monitor such action (or independently of his capacity ever to be able to monitor it) and in a context in which it affects his own action." However, trust can be more complex than Gambetta's definition indicates. Hence, McKnight et al. proposed that decision trust is "the extent to which one party is willing to depend on something or somebody in a given situation with a feeling of relative security, even though negative consequences are possible" [8].

In this study, trust is defined as the relationship created by a specific behavior between two objects; it may be monitored by all neighboring nodes in the same cluster (intra-cluster) and given by introducers in a different cluster (inter-cluster). One object expects that the other object will take the correct action, although he or she may misbehave. This trust relation is denoted {subject: agent, action}, and trust is defined as the certainty, in the subject's view, that the agent will take the action. The value is not absolute but is determined by the subject's subjectivity; the trust value may vary even when the same agent and the same action are involved in the calculation [9].

2.2 The Role of Trust

Many researchers recognize trust as an essential element in security solutions for distributed systems [6]. Here, the roles trust can play in ad-hoc networks are synthesized as follows: 1) Assistance in decision making to improve security and robustness: with a prediction of the behaviors of other entities, a network entity can avoid collaborating with untrustworthy entities, which greatly reduces the chance of being attacked. 2) Adaptation to risk, leading to flexible security solutions: the prediction of nodes' future behavior directly determines the risk faced by the network, and given this risk, the network can adapt its operation accordingly. 3) Misbehavior detection: trust evaluation leads to a natural security policy whereby network participants with low trust values are investigated or eliminated. 4) Quantitative assessment of system-level security properties: with an assessment of the trustworthiness of individual network entities, it is possible to evaluate the trustworthiness of the entire network [10].

2.3 Mechanism of Clustering

The proposed scheme adopts a clustering-based network model. Amis et al. previously proposed a cluster formation approach in which a node is either a cluster-head or at most d hops away from a cluster-head, namely the Max-Min d-hop formation algorithm [11]. Jin et al. [12] propose the following construction of a cluster in an ad-hoc environment: nodes use their experience to evaluate the trust values of neighboring nodes, and each node elects the node with the highest value as a guarantor. The selected node then becomes a cluster-head, while the other nodes become members of the cluster. If the selected node is a member of another cluster, the node with the second highest value is elected as cluster-head. This method is adopted here for the selection of the cluster-head, and in this manner a 1-hop cluster is formed. All nodes in the 1-hop cluster evaluate the trust values of neighboring nodes by monitoring their behavior and
recommending it to other nodes. A cluster-head evaluates the trust values of all nodes in the cluster. If any node in the cluster requests a trust value, the head node issues a certificate containing the subject node's trust value, and a member node uses the certificate to show its reliability when initiating communication with other nodes. All nodes in a cluster periodically monitor the neighboring nodes to update trust values, and experience data is retained as long as the node remains available in the cluster. However, this mechanism imposes too heavy a duty on the cluster-head; a novel method to solve this problem is therefore proposed in this paper.

2.4 Malicious Node Behavior

Authentication in the proposed network relies on public key certificates signed by some nodes, and all users may issue a certificate in the network. However, dishonest users under the control of an attacker may issue false certificates. The behaviors of malicious nodes fall into two types: signing false certificates for other nodes [5], and recommending false reputations to attack the network.

2.5 Considerations of a Trust System

To evaluate the trust of a specific object, a node depends on its own experience within the cluster (intra-cluster) and on information received from introducers in a different cluster for which it may hold trust values. The side that evaluates the trust and provides the resources cannot fully trust external information, because such information is created by the information provider with its own evaluation algorithm, based on criteria and materials unknown to the subject. Thus, a subject faces the problem of having no reliable information for the evaluation of trust; this is known as the "Black Box Problem" [13]. When a new user accesses the cluster to request a service (the use of information and resources), the "Cold Start Problem" is encountered: how to recognize and evaluate the user in the absence of available user information. An environment with a conventional access control architecture will not provide service to an unknown user. However, in Internet-based virtual communities and e-commerce, this kind of interaction is often necessary because unknown users are inevitable; hence, methods for recognizing and evaluating them are essential [13][14].

2.6 Fuzzy Control and Reputation System

Fuzzy control for trust management, as proposed by Song et al. in [15], is important for applications in network security. It is a new approach supporting approximate reasoning, and is used here for trust management; the fuzzy inference method is useful in manipulating imprecise or uncertain information. Indeed, there are five common features between a reputation system and fuzzy control:
- Imprecise input: a node's records in the ad-hoc network can contain untruthful information. Managing this is similar to handling noisy signals in fuzzy control.
- Linguistic expertise knowledge: in evaluating reputation, human knowledge forms the fundamental input. This is similar to the use of linguistic information in fuzzy control.
- Prior information: imprecise records accumulated in the past need to be used. This resembles the learning process in evolving a fuzzy controller rule set.
- Dynamic environment: a node's behavior changes dynamically, and such changes must be tracked efficiently. This parallels the handling of dynamically changing environmental signals in a control application.
- Capture of feedback: feedback from previous reputation evaluations must be used, which is similar to calibrating a fuzzy controller using system feedback.
3 Architecture and Models

This section describes the authentication system architecture and its models, which include a network model based on clustering, a trust model based on fuzzy control, and security operations such as malicious node detection in wireless ad-hoc networks.

3.1 Architecture of the Proposed Authentication System

The authentication service aims to provide secure public key certification. The proposed authentication system is composed of four architectural layers: mobile hosts, the network model, the trust model, and security operations. Wireless ad-hoc networks contain a large number of mobile hosts, each with a transmission range that is small relative to the network size. The network is divided into different regions, and nodes in the same region form a cluster [5]. The Max-Min d-cluster formation algorithm [11] is adopted as the clustering method, although with some modification. Each node evaluates the trust values of the others in order to form a cluster and to perform security operations within the cluster (intra-cluster). Between different clusters, a node requests recommendations from introducers for which it may hold trust values, in order to evaluate trust. Two kinds of trust relationship are thus defined in the clustered network: the relationship of two nodes in the same cluster, and the relationship of two nodes in different clusters. The security operations, such as the identification and isolation of malicious nodes and the trust update, are presented in Section 6.

3.2 The Network Model

The Max-Min cluster formation algorithm has been adopted with some modification. In the original formation algorithm, every node in the ad-hoc network nominates as cluster-head the highest-ID node it can communicate with (including itself). Nominated nodes then form clusters with their nominators; at the end of the algorithm, a node is either selected as a cluster-head or is a member at most d hops away from its cluster-head. In the original algorithm, the node ID is the main factor in cluster formation; however, since the node ID has no special meaning for protecting the network's security, the trust value is used as the cluster formation factor instead [5]. The Max-Min cluster algorithm is applied to divide the network into small groups, which then become clusters by the means proposed in the previous section. In the head-selection process, the most trustworthy node is selected, and it then has the duty of maintaining the cluster.
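To make the head-selection step concrete, the sketch below elects cluster-heads by trust value within 1-hop neighborhoods, in the spirit of the modified formation described above. The dictionary-based graph representation and the tie-breaking by node identifier are illustrative assumptions, not the paper's exact procedure.

```python
def elect_cluster_heads(neighbors, trust):
    """neighbors: {node: set of 1-hop neighbor nodes};
    trust: {node: trust value in [0, 1]}.
    Each node nominates the most trustworthy node it can hear
    (including itself); nominated nodes become cluster-heads."""
    heads = {}
    for node, nbrs in neighbors.items():
        candidates = {node} | nbrs
        # Trust replaces node ID as the formation factor;
        # ties fall back to node ID for determinism (an assumption).
        heads[node] = max(candidates, key=lambda n: (trust[n], n))
    return heads

# Example: node 'c' has the highest trust, so all three nominate 'c'.
neighbors = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b'}}
trust = {'a': 0.4, 'b': 0.6, 'c': 0.9}
print(elect_cluster_heads(neighbors, trust))  # {'a': 'c', 'b': 'c', 'c': 'c'}
```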
A cluster-head maintains the cluster information, such as the cluster size and cluster ID, and sends control messages to its cluster members when requested. Cluster members periodically send and receive a signal that includes the cluster ID and cluster size in order to determine which cluster they belong to; the cluster is determined from the received signals by majority voting.

3.3 The Trust Model

Wireless ad-hoc networks have no centralized server for trust and key management. In this environment, a Certificate Authority (CA) can move to another cluster, so mechanisms have been proposed to distribute the certificate authority function [5]. In this research, any node can act as a certifying authority as well as a cluster-head. All nodes acting as CAs in the network must be verified legitimately; this is performed not by the network model but by the trust model. The node with the highest trust value can be verified more accurately.
4 Clustering-Based Trust Evaluation Model

Using the methods described in Section 2.3, the proposed cluster-based trust model is created in ad-hoc environments. The model detects malicious behavior that may occur when a new node enters the cluster to locate a target node for communication, and it determines the trust value even when the target node has no experience information about the new node. A solution is also proposed to counteract nodes that threaten the entire network.
Fig. 1. Cluster-based Trust Model
4.1 Trust Creation Scenario in the Intra-cluster

A trust creation method is proposed for the case where a new node enters the same cluster. In Fig. 1, a new node N enters the cluster in one of two cases: 1) the node is new to the network altogether, or 2) the node arrives from another cluster. In the former case, as the new node has no certificate of trust value or recommendation, all nodes in the
cluster monitor the new node for a certain time and evaluate its trust value using some algorithm. In the latter case, the new node provides the new cluster-head or members with a certificate of trust value or a recommendation obtained from the previous cluster-head or members. The new cluster's nodes then use the certificate to authenticate the node and to assign a default trust value to the new member, even though they have no experience with it [12]. By this method, all nodes in the previous cluster can issue a certificate to verify a trust value, but a new node N uses the certificates issued by the previous cluster's introducers, since they have relatively high trust values. Next, a scenario is discussed in which a node enters the cluster and finds a valuable target to communicate with.

1) A new node N determines the trust value of a target node T. Upon entering the cluster, the new node broadcasts a request signal to obtain a valuable target, and members notify it which nodes are available for communication. Neighboring nodes recommend as the target the node that has the highest trust value (after the head). The new node selects a trust value by majority voting, and then recognizes the value 0.7 as the final value used to select the target node in both cases above.

2) A target node T determines the trust value of a new node N. A solution for evaluating the trust of a newly entering node is provided in the cluster-based trust model. If a new node requests a target node with which to initiate communication, the target node needs to confirm the trust value of the new node. In the latter case, node T receives the certificate of node N issued by previous nodes, and accepts the given trust value as the final trust value by majority voting. In the former case, node T monitors node N's behavior for a certain time to calculate the trust value based on its own experience, and receives monitoring results from the neighboring nodes to determine the final trust.

We next describe how all members collect recommendations from the neighboring nodes to provide reputation information in the intra-cluster. Members store the recommendation results of a newly entering node N in their buffers to calculate the reputation value, and transfer the reputation information of node N when requested. Node T stores the reputation values received from members in its reputation buffer and calculates the final trust value based on its monitoring records and the stored reputation values [16].

4.2 Trust Creation Scenario in the Inter-cluster

This section discusses how to evaluate a trust value across clusters. A requester selects some nodes in another cluster as introducers, for which it has trust values resulting from monitoring experience and which have relatively high trust. The trust value is evaluated based on the reputation certificates received from many introducers. The reason for selecting numerous introducers in another cluster is that the trust value may be calculated inaccurately when a certificate from a single cluster-head or introducer is fragile. The requester accepts a certificate value by majority voting, and nodes that send inaccurate certificates are regarded as malicious by this method.
5 Fuzzy Logic

5.1 Fuzzy Logic for Trust Management

The trust relationships among nodes in ad-hoc networks are difficult to assess. Fuzzy logic is used for trust management, and the architecture of such a fuzzy system is proposed in [17]. Two advantages of using fuzzy logic to quantify trust in ad-hoc networks are as follows: 1) fuzzy inference is capable of quantifying imprecise data or uncertainty in measuring the trust values of nodes; 2) different membership functions and inference rules can be developed for different ad-hoc environments without changing the fuzzy inference engine. In the proposed scheme, the local trust value Γ is determined by the public key certification rate Δ and the recommended reputation rate Φ of each resource pair, and the global trust value is computed using the evaluator's trust value as a weight.
Fig. 2. Variation of the trust value Γ
Fig. 2 plots the variation of the trust value of an ad-hoc network as the certification rate and the reputation rate are varied from low to high values. The trust value increases when both contributing factors increase, and it can decrease after a node falsely gives a certificate or an inaccurate reputation. These security attributes are treated as fuzzy variables, characterized by the membership functions in Fig. 3. In fuzzy logic, the membership function μ(x) for a fuzzy element x specifies its degree of membership in a fuzzy set. It maps an element x into the interval [0,1], where 1 represents full membership and 0 represents no membership. Fuzzy logic can thus handle imprecise data or uncertainty in the trust value of a node.

5.2 Membership Function for Different Levels of Trust Value

Figure 3(a) shows a "high" membership function for modeling the local score Γ: a node with a trust value of at least 0.75 is considered high-trust. Figure 3(b) shows five levels of membership functions of trustworthiness, Figure 3(c) shows five levels of honest public key certification rates, and Figure 3(d) depicts three ascending degrees of the correctly recommended reputation rate.

Fig. 3. Membership functions for different levels of Γ with public key certification rate Δ and accurate reputation rate Φ. (a) High trust value Γ. (b) Five levels of trust value Γ. (c) Public key certification rate Δ. (d) Reputation rate Φ.

The inference rules are subject to the designer's choices. Fuzzy inference is a process for assessing the trust value in five steps:
1) Register the initial values of the honesty rates Δ and Φ.
2) Use the membership functions to generate membership degrees for Δ and Φ.
3) Apply the fuzzy rule set to map the input space (the Δ-Φ space) onto the output space (the Γ space) through fuzzy 'AND' and 'IMPLY' operations.
4) Aggregate the outputs from each rule.
5) Derive the trust index through a defuzzification process [17].
The details of these five steps and the fuzzy inference rules can be found in the Fuzzy Logic Toolbox User's Guide by The MathWorks, Inc. [18].

5.3 Trust Value Aggregation

The fuzzy inference system aggregates the local trust scores collected from all nodes to produce a global reputation for each node, using fuzzy inference to obtain the global reputation aggregation weights. Here, the global reputation is calculated using the following formula:
Tv_i = \sum_{j \in S} \frac{w_j}{\sum_{j \in S} w_j}\, t_{ji} = \frac{\sum_{j \in S} w_j\, t_{ji}}{\sum_{j \in S} w_j}    (1)
where Tv_i is the trust value (global reputation) of node i, S is the set of nodes with which node i has conducted communications, t_{ji} is the local trust value of node i rated by node j, and w_j is the aggregation weight of t_{ji}. The global aggregation process iterates until each Tv_i converges [15].
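A minimal sketch of the iterative aggregation in Equation (1). The paper obtains the weights w_j from fuzzy inference; here, as a simplifying assumption, w_j is taken to be node j's current global trust value, an eigen-trust-like choice.

```python
def aggregate_global_trust(local, n_nodes, iters=20):
    """local[j][i]: local trust t_ji that rater j assigns to node i.
    Returns the global trust Tv_i for each node i after iteration."""
    tv = [1.0 / n_nodes] * n_nodes            # uniform initial reputation
    for _ in range(iters):                    # iterate until Tv converges
        new = []
        for i in range(n_nodes):
            raters = [j for j in range(n_nodes) if i in local.get(j, {})]
            if not raters:
                new.append(tv[i])             # no interactions: keep old value
                continue
            num = sum(tv[j] * local[j][i] for j in raters)  # sum w_j * t_ji
            den = sum(tv[j] for j in raters)                # sum w_j
            new.append(num / den if den else tv[i])
        tv = new
    return tv
```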
6 Security Operations

This section covers the detailed operations of the trust update and of the malicious node detection scheme.

6.1 Trust Update

All nodes in the cluster modify trust values through monitoring in order to calculate the final trust value in an ad-hoc environment. Equation (2) calculates the updated trust value from the old value and the present information.
Tv_{new} = w \cdot Tv_{old} + (1 - w)\, T_{ij}    (2)
T_{ij} is the new security stimulus between nodes i and j at a given time instant. The weighting factor w is a variable in the range (0,1), computed as the number of completed trials divided by the total trial count set by a node.

Algorithm 1: Trust_Update
1) R_i calculates the honest signing rate of R_j: Φ = (number of honestly signed certificates) / (all trials);
2) R_i calculates the successful packet delivery rate of R_j: Δ = (number of successful packet deliveries) / (all trials);
3) Calculate the stimulus value: T_{ij} = Fuzzy_inference(Φ, Δ);
4) Calculate the new trust value: Tv_{new} = w · Tv_{old} + (1 − w) T_{ij};
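Below is a sketch of Algorithm 1 combined with a toy Fuzzy_inference step: two Mamdani-style rules over triangular membership functions, with centroid defuzzification producing the stimulus T_ij. The membership breakpoints and rules are invented for illustration; the paper leaves them to the designer (Section 5).

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function rising at a, peaking at b, falling at c."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

def fuzzy_inference(phi, delta):
    """Step 3: map the honest signing rate phi and delivery rate delta to T_ij.
    Illustrative rules: (phi high AND delta high -> high), (low AND low -> low)."""
    g = np.linspace(0.0, 1.0, 101)                 # output (stimulus) space
    # Fuzzy 'AND' (min) fires each rule; 'IMPLY' clips the output set.
    fire_hi = min(tri(phi, 0.5, 1.0, 1.5), tri(delta, 0.5, 1.0, 1.5))
    fire_lo = min(tri(phi, -0.5, 0.0, 0.5), tri(delta, -0.5, 0.0, 0.5))
    agg = np.maximum(np.minimum(fire_hi, tri(g, 0.5, 1.0, 1.5)),
                     np.minimum(fire_lo, tri(g, -0.5, 0.0, 0.5)))
    # Centroid defuzzification; fall back to a neutral value if no rule fires.
    return float((g * agg).sum() / agg.sum()) if agg.sum() > 0 else 0.5

def trust_update(tv_old, honest_signs, good_deliveries, done, planned):
    """Algorithm 1: one Trust_Update step of rater R_i on node R_j,
    with w = completed trials / total trials planned by the node."""
    phi = honest_signs / done          # step 1: honest signing rate
    delta = good_deliveries / done     # step 2: successful delivery rate
    t_ij = fuzzy_inference(phi, delta) # step 3: stimulus value
    w = done / planned
    return w * tv_old + (1 - w) * t_ij # step 4: Equation (2)

print(round(trust_update(0.8, honest_signs=9, good_deliveries=8,
                         done=10, planned=20), 3))
```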
When node i communicates with node j, issues a public key certificate to it, or recommends a reputation for it, node i updates the trust value Tv using Equation (2). With fuzzy trust quantification, the stimulus value T_{ij} is determined first, and the new trust index Tv is then updated accordingly. Each node in the cluster retains the modified trust value record and removes the record after receiving a revocation signal. Based on the trust records periodically received from all members in the cluster, all nodes determine and store the trust values of all nodes in memory, and then issue a certificate of the trust value when any member in the cluster requests the trust value of a specific node [17].

6.2 Malicious Node Detection and Isolation Scheme

The trust evaluation of a node is based on the values in the certificates of trust and recommendations. Those certificates are issued by all members and must be verified. A selfish node has a lower trust value in the certificates issued by the members or introducers. With such easy identification, other nodes can restrict the access of a selfish node and hence prevent the waste of resources, or the loss of information, that might otherwise occur due to a wrong authenticating certificate.
The process of identifying a malicious node is as follows: when a new node enters the created cluster, refer to its certificate (if any); when no certificate is available, calculate its trust value. If the calculated trust value is less than the threshold value, the node is considered malicious: its access is controlled, and its requests for authentication are rejected.
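The admission decision itself reduces to a simple threshold test, sketched below. The threshold value and the certificate format are deployment choices, not fixed by the paper.

```python
def admit(trust_value, threshold=0.5, certificate=None):
    """Authenticate a node entering the cluster: prefer a certified trust
    value when a certificate exists, otherwise use the locally calculated
    one; below-threshold nodes are treated as malicious and rejected."""
    value = certificate["trust"] if certificate is not None else trust_value
    return value >= threshold
```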
7 Performance Analysis

This section evaluates the proposed service in comparison with the eigen-trust system in ad-hoc networks. Kamvar et al. proposed the eigen-trust algorithm in [19], which captures reputation as the rate of honest certifications and then normalizes it over all participating nodes. The algorithm aggregates the scores by a weighted sum of all raw reputation scores. Eigen-trust is fully distributed using a DHT overlay network, and the system uses majority voting to check for faulty reported reputation scores [13].

7.1 Performance Analysis Parameters

The proposed system is compared with the eigen-trust system using the following two performance metrics:
- Total overhead: the sum of the cost incurred by authentication failures and the cost incurred in managing authentication information in the nodes.
- Detection rate of malicious nodes: the rate at which maliciously behaving nodes are detected in a given environment.
If the global trust values accurately reflect each node's actual behavior, the number of inauthentic nodes should be minimized. The network settings applied to the performance analysis are summarized in Table 1, where malicious nodes act for their own profit, neighboring nodes are in the same cluster, and introducers are in a different cluster and have some experience with which to calculate a trust value. The performance analysis shows that malicious nodes are efficiently detected and isolated by the proposed system, and that the proposed system provides better performance than the eigen-trust system.

Table 1. Simulation settings

Attributes                                Evaluation criteria
Network size                              1,000 m × 1,000 m
No. of nodes                              100 ~ 5,000
% of malicious nodes in intra-cluster     0 ~ 30 %
% of malicious nodes in inter-cluster     0 ~ 30 %
% of neighboring nodes in intra-cluster   10 ~ 30 %
% of introducers in inter-cluster         10 ~ 30 %
Mobility pattern                          Random waypoint protocol
Average pause time                        50 sec
Max. speed                                10 m/s
Min. speed                                0.5 m/s
Clustering algorithm                      Max-Min d-cluster formation with trust
7.2 Total Overhead Calculation

In the experiments, the total overhead required for the proposed authentication system is calculated. Figure 4(a) plots the total overhead against an increasing number of introducers, with a total network size of 1,000 nodes; it is assumed that no malicious behavior exists in the network. As seen in Figure 4(a), the total overhead decreases as the number of introducers increases: the received reputations are weighted into the trust calculation, so the trust value becomes more accurate and the authentication success rate rises. However, if the number of introducers grows too large, each node must perform more updates to maintain the certificates, so the total overhead increases again. As the cluster size is reduced, the total overhead is also reduced, and regardless of the number of introducers, the minimum value is obtained when the cluster size is 20 ~ 25.

In Figure 4(b), the total overhead is estimated for different residence times of mobile nodes in the cluster; the network size is again assumed to be 1,000 nodes, with no malicious nodes. As shown in Figure 4(b), if a node moves rapidly to another cluster, the certificate revocation overhead increases, so the total overhead also increases. Regardless of the mobile node speed, the proposed system shows the best performance with a cluster size of 20 ~ 25, which means the mobile node speed does not affect the selection of the cluster size.

Fig. 4. Total overhead calculation. (a) Inter-cluster overhead versus the number of introducers (1,000 nodes; cluster sizes 10, 30, 50). (b) Intra-cluster overhead versus the number of nodes in the cluster (1,000 nodes; residence times 50, 100, 200, 400).
7.3 Malicious Node Detection Rate

The malicious node detection rate is measured using the proposed trust evaluation method. When a node aggregates the global trust of neighboring nodes in the intra-cluster and inter-cluster, three methods are compared: 1) using only its own experience to evaluate the trust value; 2) using neighboring nodes' or introducers' reputations evaluated by the eigen-trust algorithm; and 3) using reputation as a local trust value with weights applied to determine the final trust value, as in the proposed fuzzy-trust scheme. In Fig. 5, a stage denotes one iteration of the evaluation.

Fig. 5. Malicious node detection rate. (a) Types of global trust aggregation (1,000 nodes, m = 30%). (b) Network sizes N = 100, 500, 1,000, 5,000 nodes (fuzzy-trust based, m = 30%).

Figure 5(a) plots the measured malicious node detection rate as a function of time, represented by the number of stages in the evaluation process. In this experiment, malicious nodes make up 30% of the network. The experiment shows that using the neighborhood's reputation is superior to using only a node's own experience data: the own-experience method detects more than 90 percent of malicious nodes only after the fifth iteration, whereas the eigen-trust and fuzzy-trust systems detect more than 90 percent after three iterations. Therefore, the global trust value, which aggregates the local trust values, may be used in the authentication system for better efficiency. Even though both consider neighbors' reputations, the proposed authentication system achieves a higher detection rate than the eigen-trust system because it incorporates expert experience.

Figure 5(b) plots the experimental results when the proposed fuzzy-trust authentication is applied repeatedly, with the rate of malicious nodes in the network set to 30%. As the number of stages increases, the detection rate of malicious nodes also increases; the average time per evaluation cycle is 100 ~ 120 sec. Moreover, the smaller the number of nodes in the network, the higher the detection rate. Accordingly, the proposed authentication system can detect all malicious nodes within a few evaluation cycles.
8 Discussion and Conclusions

The use of a cluster-based trust model with fuzzy control in ad-hoc environments has been proposed. The following subsections discuss the advantages and applications of using clustering and fuzzy control as in the proposed model.
8.1 Advantages of Fuzzy Control for Trust Management

The trust value of a trustee becomes more accurate when the trust value of the trustor is applied to the evaluation as a weight, rather than simply accepting the exact mean evaluation of neighboring nodes or ignoring neighboring nodes' reputations. With the weighted trust value, a node can precisely identify the nodes that exhibit malicious behavior in the network, and can flexibly handle authentication and security in ad-hoc environments. By applying an expert's experience to the membership functions and fuzzy rule set, an intelligent system can be modeled.

8.2 Clustering and Authentication Based on the Certificate

As all nodes in the cluster participate in the calculation of the trust value during the creation of a cluster, this method can serve as a solution to the "Black Box Problem", which occurs when the process of calculating the trust value is unreliable from the perspective of the side receiving the information. When a new node enters the cluster, it cannot interact with any node if no experience or trust information about it is available. On the other hand, a new node can be authenticated based on a certificate if it holds a certificate of its trust value issued by the previous members. This is a solution to the "Cold Start Problem" related to the authentication of a newly appearing node with no available information.

8.3 Flexible Model against Attacks, and Future Work

A malicious node may issue a false public key certificate or recommend an inaccurate reputation, either because it is under the control of an attacker or for other reasons. When malicious nodes carry bad reputations and low trust values, the authentication ties to such nodes become loose; the effectiveness of the entire network is enhanced when all nodes in the cluster avoid authenticating them. The types of attack are discussed next. A newcomer attack occurs when a malicious node re-registers as a new node; by re-registering, the malicious node can easily remove its bad records. To defend against this attack, there must be a reliable authentication and access control mechanism in which entering a new or fake ID is difficult, rather than relying on the trust evaluation system alone. In the black hole attack, all packets are discarded once they reach a specific position; in the gray hole attack, a special case of the black hole attack, only specific packets, such as data packets or control-related packets, are selectively discarded [7]. The proposed model can also identify such attackers to prevent these attacks. Future work will investigate methods of studying security properties in distributed systems. Belief logic can be represented as a set of rules, and many authentication protocols can be formalized within a trust theory that can be used to reason about the trust of agents; methods of solving security problems using trust theory and belief logic will be investigated. Furthermore, it is necessary to analyze intensively the elements affecting the trust value.
Acknowledgements This study was supported by a grant from the project related to Open Source Software Intellectual Property Rights, Ministry of Culture, Sports and Tourism, Republic of Korea.
References

1. Shieh, S.W., Wallach, D.S.: Guest Editors' Introduction: Ad Hoc and P2P Security. IEEE Internet Computing 9(6), 14–15 (2005)
2. Zhong, Y., Bhargava, B.: Authorization based on evidence and trust. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds.) DaWaK 2002. LNCS, vol. 2454, pp. 94–103. Springer, Heidelberg (2002)
3. Chadwick, D.W.: Operational Models for Reputation Servers. In: Herrmann, P., Issarny, V., Shiu, S.C.K. (eds.) iTrust 2005. LNCS, vol. 3477, pp. 108–115. Springer, Heidelberg (2005)
4. Theodorakopoulos, G., Baras, J.S.: On trust models and trust evaluation metrics for ad hoc networks. IEEE Journal on Selected Areas in Communications 24(2), 318–328 (2006)
5. Ngai, E.C.H., Lyu, M.R.: An authentication service based on trust and clustering in wireless ad hoc networks: description and security evaluation. In: IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing, 1st edn. (2006)
6. Josang, A., Ismail, R., Boyd, C.: A survey of trust and reputation systems for online service provision. Decision Support Systems 43(2), 618–644 (2007)
7. Gambetta, D.G.: Can we trust trust? In: Gambetta, D.G. (ed.), ch. 13, pp. 213–237. Basil Blackwell, New York (1988)
8. McKnight, D.H., Chervany, N.L.: The Meanings of Trust. Technical Report MISRC Working Paper Series 96-04, University of Minnesota, Management Information Systems Research Center (1996)
9. Sun, Y.L., Yu, W., Han, Z., Liu, K.J.R.: Information theoretic framework of trust modeling and evaluation for ad hoc networks. IEEE Journal on Selected Areas in Communications 24(2), 305–317 (2006)
10. Sun, Y.L., Han, Z., Liu, K.J.R.: Defense of trust management vulnerabilities in distributed networks. IEEE Communications Magazine 46(2), 112–119 (2008)
11. Amis, A.D., Prakash, R., Vuong, T.H.P., Huynh, D.T.: Max-Min D-cluster Formation in Wireless Ad Hoc Networks. In: Proc. of the 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2000), pp. 32–41 (March 2000)
12. Jin, S.-H., Park, C.-I., Choi, D.-S., Chung, K.-I., Yoon, H.-S.: Cluster-Based Trust Evaluation Scheme in an Ad Hoc Network. ETRI Journal 27(4) (2005)
13. Massa, P.: Trust-aware Decentralized Recommender Systems: PhD research proposal. Technical Report No. T04-06-07, Istituto Trentino di Cultura (2004)
14. Schein, A.I., Popescul, A., Ungar, L.H., Pennock, D.M.: Methods and Metrics for Cold-Start Recommendations. In: ACM SIGIR 2002, Finland (August 2002)
15. Song, S., Hwang, K., Zhou, R., Kwok, Y.-K.: Trusted P2P transactions with fuzzy reputation aggregation. IEEE Internet Computing 9(6), 24–34 (2005)
16. Park, S.-S., Lee, J.-H., Chung, T.-M.: Cluster-Based Trust Model against Attacks in Ad-Hoc Networks. In: 2008 Third International Conference on Convergence and Hybrid Information Technology (ICCIT), vol. 1, pp. 526–532 (2008)
17. Song, S., Hwang, K., Macwan, M.: Fuzzy Trust Integration for Security Enforcement in Grid Computing. In: Jin, H., Gao, G.R., Xu, Z., Chen, H. (eds.) NPC 2004. LNCS, vol. 3222, pp. 9–21. Springer, Heidelberg (2004)
18. Fuzzy Logic Toolbox User's Guide, The MathWorks, Inc. (2001), http://vyuka.fel.zcu.cz/kae/anf/fuzzy_tb.pdf
19. Kamvar, S., Schlosser, M., Garcia-Molina, H.: The EigenTrust Algorithm for Reputation Management in P2P Networks. In: Proc. ACM Conference World Wide Web (WWW 2003), pp. 640–651. ACM Press, New York (2003)
On Relocation of Hopping Sensors for Balanced Migration Distribution of Sensors

Moonseong Kim and Matt W. Mutka
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
[email protected], [email protected]
Abstract. When some sensors in Wireless Sensor Networks (WSNs) fail or become energy-exhausted, redundant mobile sensors might be moved to cover the sensing holes created by the failed sensors. Within rugged terrains where wheeled sensors are unsuitable, other types of mobile sensors, such as hopping sensors, are needed. In this paper, we address the problem of relocating hopping sensors to the detected sensing holes. Recent work on this problem considers the shortest path to relocate the sensors; however, even when multiple suppliers are considered, previous works merely apply the shortest path algorithm repeatedly. As a result, the migration distribution is imbalanced, since specific clusters on the obtained paths can be used repeatedly. In this paper, we first analyze the impact of using multiple suppliers to relocate the hopping sensors, and then propose a Relocation Algorithm using the Most Disjointed Paths (RAMDiP). Simulation results show that the proposed RAMDiP guarantees a balanced movement of hopping sensors and a higher movement success ratio of requested sensors than the representative relocation scheme, MinHopsExt.

Keywords: Mobile Sensors, Hopping Sensors, Relocation, Wireless Sensor Networks (WSNs).
1 Introduction

To accurately observe the phenomena of the requested tasks in Wireless Sensor Networks (WSNs), sensor nodes must be deployed suitably at the outset [1]. When some sensor nodes become energy-exhausted, redundant mobile sensor nodes might be moved to the specific urgent places created by the failed sensors [2] [3]. Early work on mobile sensors merely focuses on designing algorithms to deploy them [4] [5] [6]. As the network condition changes, sensor nodes may need to be redeployed in order to recover from failures. In [7], the authors implement wheeled mobile sensors with Mica2/TinyOS. In practice, however, wheeled mobility is unsuitable for migration in many environments, because areas such as remote harsh terrains, hostile territory, toxic regions, or disaster areas are rugged. Moreover, sensor mobility is limited by the physical environment: even if a sensor decides to move to a desired location, it cannot migrate without limits on the movement distance [8]. In
order to overcome these limitations, a class of Intelligent Mobile Land Mine units (IMLM) [9] to be deployed across battlefields has been developed by DARPA. The IMLM is based on a hopping mechanism. A hopping sensor is a class of mobile sensor whose mobility design is more adaptable to the previously mentioned areas where wheeled mobility is not possible. A hopping sensor with a bionic mobility design, like a grasshopper, throws itself high and toward the target direction. A prototype minefield hopping robot is mentioned in [10] [11]: the prototype is 12 cm in diameter, 11 cm tall, and weighs 1.8 kg; it can make 100 hops without refueling and can hop as high as 3 m. Additionally, some literature has discussed hopping mobility; for instance, planetary exploration is described in [12] [13], and powering a small jumping sensor using a rubber band is proposed in [14].

In the lifetime of a WSN, if the energy of the sensors in a certain area is depleted faster than that of other areas, such areas are called sensing holes. Redundant sensors are allocated initially in the sensor field through a well-planned deployment; thus, if a sensing hole is detected later, some sensors can be moved to it. In this paper, we study the problem of relocating hopping sensors to the detected sensing holes. In [15], when a static sensor node fails, a wheeled sensor node can move to the place of the failed node to replace it temporarily. In [16], the authors propose a shortest-path-based relocation scheme for hopping movement and first analyze the impact of wind under aerodynamic conditions. They also mention multiple suppliers; however, they never analyze the impact of using them. Hence, in this paper, we consider multiple suppliers and analyze the impact of using them. In addition, we use a shortest-path-based scheme as in [16]; however, we take account of the most disjointed paths to provide the redundant sensors.

The remainder of this paper is organized as follows. Section 2 explains previous work, Section 3 presents details of the proposed protocol, Section 4 evaluates our proposal, and finally, Section 5 concludes this paper.
2 Preliminaries

2.1 System and Sensor Models

A hierarchical model is widely used in WSN design. In [17], the authors assume that a cluster head is available to coordinate the sensor deployment based on virtual forces. In [2], the authors adopt a Grid-Quorum solution to locate the closest redundant sensors in a prompt manner. Quorum- or broadcast-based approaches can be used to match the cluster containing redundant sensors and the sensing-hole cluster, which are called the supplier and the consumer, respectively. In our hopping model, we assume that the WSN field contains a set of clusters and that the sensors are attached to each cluster. A cluster head is responsible for properly distributing the sensors, detecting sensor deficiency, and selecting redundant sensors among the clusters. The problem of detecting sensing holes is studied in [6] [18] [19]. We assume that hopping sensors are capable of adjusting their hopping direction; the sensors are also assumed to have a fixed propelling force for hopping and a localization capability. Compared with wheeled mobile sensors, hopping sensors are not as accurate, so the movements of the sensors should be modeled probabilistically. The next subsection explains the hopping inaccuracy model.
2.2 Hopping Inaccuracy Model

Compared with wheeled mobility, hopping sensors lack accuracy of movement. In [16], the authors first analyze the impact of wind under aerodynamic conditions and show that wind factors do not heavily affect the performance; nevertheless, hopping movement is clearly more susceptible to air disturbance than wheeled mobility. In addition, hopping sensors can be more adaptable than wheeled sensors in environments such as harsh terrains. Here, probabilistic methods are used to express the movement inaccuracies along the hopping course. To model the landing accuracy of a hop, we use a multivariate normal distribution. Let T and L be the targeted location and the actual landing location vectors, respectively. The displacement vector D can be expressed as D = T − L. Here, D is modeled by the two-dimensional normal distribution with mean (0,0), standard deviations (σ_x, σ_y), and correlation ρ. The probability density function (PDF) of D is given in Equation (1), and Fig. 1 shows an example of a hopping movement under this model:

f_{XY}(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left(-\frac{1}{2(1-\rho^2)}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} - \frac{2\rho x y}{\sigma_x\sigma_y}\right)\right)    (1)
Fig. 1. An example of movement with the displacement vector D
We define an acceptable landing area as a disk S around the targeted location. As shown in Fig. 2, the radius of S is given as nσ, where n is a multiplying factor. Hence, the probability that the hopping sensor lands in the acceptable landing area S can be represented as follows:
P(S) = \iint_S f_{XY}(x, y)\, dx\, dy    (2)
Fig. 2. Modeling the hopping accuracy using two-dimensional normal distribution
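Equation (2) has no simple closed form for general ρ, but P(S) is easy to estimate numerically. The sketch below samples the displacement vector D from the bivariate normal of Equation (1); taking the disk radius as n·max(σ_x, σ_y) is one reading of "nσ" and is an assumption here.

```python
import numpy as np

def landing_probability(sigma_x, sigma_y, rho, n, samples=200_000, seed=1):
    """Estimate P(S): the chance that a hop lands within the disk of
    radius n*sigma around the target (Equation (2)), with the
    displacement D drawn from the bivariate normal of Equation (1)."""
    rng = np.random.default_rng(seed)
    cov = [[sigma_x**2, rho * sigma_x * sigma_y],
           [rho * sigma_x * sigma_y, sigma_y**2]]
    d = rng.multivariate_normal([0.0, 0.0], cov, size=samples)
    radius = n * max(sigma_x, sigma_y)   # disk radius "n*sigma"; an assumption
    return float(np.mean(np.hypot(d[:, 0], d[:, 1]) <= radius))

# For a circular distribution, a 2-sigma disk captures about 86% of landings.
print(landing_probability(0.2, 0.2, 0.0, n=2))
```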
Let l be the distance between clusters. The upper bound of the number of hops is as follows.
N_u = \left\lceil \frac{l}{r - n\sigma} \right\rceil    (3)
where r is the hopping range. Therefore, if a consumer cluster needs R sensors, it can request E sensors from its previous cluster, where E is calculated as follows:

E = \left\lceil R \cdot P(S)^{-N_u} \right\rceil    (4)
The path from a supplier cluster (C_0) to a consumer cluster (C_k) is denoted by C_0 → C_1 → ⋯ → C_k, and the distance between C_{i−1} and C_i (1 ≤ i ≤ k) is denoted by l(i−1, i). If the consumer cluster needs R sensors, the number of sensors (E) requested from the supplier cluster is calculated as follows:

E = \left\lceil R \cdot P(S)^{-\frac{1}{r - n\sigma}\sum_{i=1}^{k} l(i-1, i)} \right\rceil    (5)
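A small sketch of Equations (3)-(5): the worst-case hop count N_u credits each hop only r − nσ of guaranteed progress, and the request sizes inflate R to offset per-hop landing failures. The sample numbers (P(S) = 0.95, two 10 m inter-cluster distances, nσ = 0.5 m) are illustrative only.

```python
import math

def hops_upper_bound(l, r, n, sigma):
    """Equation (3): upper bound on hops to cover distance l, crediting
    each hop only r - n*sigma of guaranteed forward progress."""
    return math.ceil(l / (r - n * sigma))

def request_from_prev(R, p_s, l, r, n, sigma):
    """Equation (4): sensors to request from the previous cluster so that
    about R survive N_u lossy hops, each succeeding with probability P(S)."""
    return math.ceil(R * p_s ** (-hops_upper_bound(l, r, n, sigma)))

def request_from_supplier(R, p_s, path_lengths, r, n, sigma):
    """Equation (5): the same idea over the whole path C_0 -> ... -> C_k,
    with the exponent being the total path length over r - n*sigma."""
    exponent = sum(path_lengths) / (r - n * sigma)
    return math.ceil(R * p_s ** (-exponent))

# With the simulation value r = 3 m: request 37 sensors to deliver ~24.
print(request_from_supplier(24, 0.95, [10.0, 10.0], r=3.0, n=2, sigma=0.25))
```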
We also suppose that there exist m suppliers to provide the R sensors requested by the consumer. Then R = \sum_{i=1}^{m} R_i, and E_i sensors are requested from each supplier cluster by the consumer cluster simultaneously.

2.3 Minimum Hop-Based Relocation Schemes
If some sensing holes occur, the redundant sensors can be moved; at this time, a system usually considers the geographically optimal path in order to cover the sensing holes. For wheeled mobility, the authors of [15] implement a mobile sensor that recovers a failed static sensor node. For hopping mobility, the authors of [16] first propose two shortest-path-based relocation schemes, called MinHopsExt. To transport the requested sensors, the first scheme (γ = 0 in MinHopsExt) uses Dijkstra's shortest path algorithm with the number of hops between clusters as the edge weight. The second scheme modifies the first by adding an adjusting process with a parameter γ (0 ≤ γ ≤ 1): it minimizes a combination of a fraction of the sum of the edge weights (the numbers of hops between clusters) along the path and a fraction of the difference between the maximum and minimum edge weights along the path.
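For reference, the adjusted MinHopsExt path metric can be written as a blend of the hop-weight sum and the max-min spread of the edge weights; the exact normalization used in [16] may differ, so treat this as an interpretation.

```python
def minhopsext_cost(hop_weights, gamma=0.0):
    """Path cost blending total hops with the spread of per-edge hops;
    gamma = 0 reduces to the plain minimum-hop (shortest-path) metric."""
    return ((1 - gamma) * sum(hop_weights)
            + gamma * (max(hop_weights) - min(hop_weights)))
```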
3 Route Planning of Hopping Sensors for Multiple Suppliers 3.1 Hopping Strategies
As shown in Fig. 3, there are two possible migration strategies. The first strategy is to move the sensors directly from the supplier cluster (C_0) to the consumer cluster (C_3), as in Fig. 3(a); however, each sensor's hopping capability may deteriorate due to the long-distance movement. To overcome this, the second strategy uses intermediate clusters as relay clusters. As described in Fig. 3(b), the sensor in C_0 moves to C_1, the sensor in C_1 moves to C_2, and the sensor in C_2 moves to C_3, in regular sequence. In this way, the number of hops executed by the sensors can be balanced.
(a) Direct hopping movement
(b) Relayed hopping movement

Fig. 3. Two types of hopping movements
3.2 A Relocation Scheme for Multiple Suppliers
In order to relocate the requested mobile sensors, using multiple suppliers instead of a single supplier can contribute to a balanced movement of hopping sensors; in the next section, we simulate and analyze the impact of using multiple suppliers. Early research on relocation schemes tends to merely apply the shortest path algorithm repeatedly for multiple suppliers, without further consideration. For instance, the previously mentioned MinHopsExt [16] simply runs Dijkstra's algorithm once per supplier; it does not consider the relationship between the obtained shortest paths. When intermediate clusters on the obtained paths relocate the requested sensors from each supplier, specific clusters among them can be used repeatedly. As a result, the mobility distribution is imbalanced; some sensors' hopping capability may deteriorate quickly, and another sensing hole may easily occur.
(a) A typical shortest path-based relocation
(b) A relocation using the most disjointed paths

Fig. 4. Two relocation examples
In Fig. 4, the number of hops needed to move between clusters is shown on each edge. As shown in Fig. 4(a), cluster B relocates the requested sensors three times and cluster C provides sensors twice, while clusters A and D are not used at all. This means that clusters A and D clearly retain redundant sensors, while there is a strong possibility that clusters B and C will become sensing holes later. On the contrary, if we take into account the number of relocations of each cluster for a consumer, the entire mobility distribution becomes balanced: every intermediate cluster relocates the requested sensors exactly once, as depicted in Fig. 4(b). Hence, we should strive to reduce the repeated use of each cluster as much as possible. Before describing the proposed algorithm for the relocation of hopping sensors, the Relocation Algorithm using the Most Disjointed Paths (hereafter, RAMDiP), we first define the network model; the detailed pseudocode follows. A WSN can be represented by a weighted graph G(V, W), where V is a set of clusters and W is a set of edges. Each edge is associated with the estimated number of hops needed, as indicated in Equation (3). k_hard represents the total number of hops a sensor can make without
refueling. In the RAMDiP, two clusters are connected if the number of hops needed to migrate between them is less than k_hard. Finally, S = {s_0, s_1, …, s_|S|} is a set of supplier clusters, and t is a consumer cluster.

RAMDiP(G(V, W), k_hard, S, t, E)
01. p;      // a relocation path for each supplier
02. T;      // a relocation tree for multiple suppliers
03. (I, J); // the edge at which p first joins T
04. W′ ← delete edges whose weights are larger than k_hard from W;
05. T ← Dijkstra(G(V, W′), s_0, t);
06. For ∀ s_i ∈ S \ {s_0}
07.     G(V, W″) ← G(V, W′); d_old ← ∞; p ← ∅;
08.     While(1)
09.         p′ ← Dijkstra(G(V, W″), s_i, t);
10.         If(p′ ≠ ∅)
11.             d_new ← the number of duplicated clusters in the combined tree T ∪ p′;
12.             If(d_new < d_old)
13.                 Check (I, J) for p′ and T;
14.                 W″ ← delete the edge (I, J) from W″;
15.                 d_old ← d_new;
16.                 p ← p′;
17.             Else break;
18.         Else break;
19.     T ← T ∪ p;
20. Return T;
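Below is a minimal Python sketch of RAMDiP under one interpretation of the pseudocode: (I, J) is taken as the edge at which the candidate path first joins the current tree, and the duplicated-cluster count excludes the consumer t, which every path must share. The graph representation and helper names are illustrative.

```python
import heapq

def dijkstra(adj, src, dst):
    """Shortest path by total hop weight; returns a node list or None."""
    dist, prev, pq = {src: 0}, {}, [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return path[::-1]
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    return None

def ramdip(edges, k_hard, suppliers, t):
    """Build a relocation tree joining each supplier to consumer t,
    reusing as few intermediate clusters as possible."""
    adj = {}
    for (u, v), w in edges.items():            # line 04: drop heavy edges
        if w < k_hard:
            adj.setdefault(u, {})[v] = w
            adj.setdefault(v, {})[u] = w
    tree = set(dijkstra(adj, suppliers[0], t) or [])   # line 05
    for s in suppliers[1:]:                            # line 06
        work = {u: dict(n) for u, n in adj.items()}    # W'' (line 07)
        d_old, best = float("inf"), None
        while True:                                    # lines 08-18
            p = dijkstra(work, s, t)                   # line 09
            if p is None:
                break
            d_new = len([c for c in p if c in tree and c != t])  # line 11
            if d_new >= d_old:
                break
            d_old, best = d_new, p                     # lines 15-16
            join = next((i for i, c in enumerate(p) if c in tree), None)
            if join is None or join == 0:
                break                                  # path already disjoint
            i_node, j_node = p[join - 1], p[join]      # edge (I, J), line 13
            work.get(i_node, {}).pop(j_node, None)     # line 14: retry without it
            work.get(j_node, {}).pop(i_node, None)
        if best:
            tree.update(best)                          # line 19: T = T ∪ p
    return tree                                        # line 20
```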
4 Performance Evaluation

We analyze numerical results comparing the performance of the proposed RAMDiP and MinHopsExt [16]. For the comparison, both algorithms are implemented in C, and the main parameters are described in Table 1. Here, γ is assumed to be 0 in MinHopsExt; that is, MinHopsExt merely considers the minimum number of hops, as mentioned above. We generate ten different random sensor networks. For simplicity, the probability that a hopping sensor lands in the acceptable landing area is assumed to be 1. Events are generated continuously; for each event, a consumer and some supplier clusters are chosen at random. Finally, the assumed hopping strategy is relayed hopping.
Table 1. Simulation variables

Network size                                               300 m × 300 m
Network density unit                                       clusters/m²
Number of total hopping capability without refueling       15
Hopping capability per sensor initially                    30
Hopping range                                              3 m
Sensors per cluster initially deployed                     45
Number of sensors requested by the consumer per event (R)  24
Number of sensors requested from each supplier (E_i)       12, 6, 3, 2
Number of suppliers (|S|)                                  2, 4, 8, 12
Fig. 5. Normalized sensors still alive when |S| = 8 and E_i = 3
Fig. 5 shows each scheme's normalized number of sensors still alive over the generated events, for different network densities; here |S| and E_i are set to 8 and 3, respectively. As the density increases, the RAMDiP outperforms the MinHopsExt slightly. The reason is that when the network density is high, the probability that all obtained paths join each other is low, so there are relatively fewer repeatedly used clusters than in the low-density case. Under the relayed hopping environment, it is necessary to reduce the number of repeatedly used clusters; otherwise, the movement success ratio of the requested sensors might decrease. When the network density is especially low, since the RAMDiP tries to avoid path collisions as much as possible, it clearly outperforms the MinHopsExt, as depicted in Fig. 6.

Fig. 6. Movement success ratio of the requested 24 sensors when |S| = 8 and E_i = 3

Subsequently, results are obtained in the same way as above, in terms of the number of suppliers |S|. Figs. 7 and 8 show the impact of using multiple suppliers. As might be expected, the advantages of using multiple suppliers are a balanced migration distribution and a higher movement success ratio of the requested sensors.
Fig. 7. Normalized sensors still alive when the network density is 0.005 clusters/m² for RAMDiP
Fig. 8. Movement success ratio of the requested sensors when the network density is 0.005 clusters/m² for RAMDiP
5 Conclusion

Hopping sensors are more adaptable than wheeled mobile sensors to many potential working environments, such as remote harsh terrains, toxic regions, disaster areas, and hostile territory. In the lifetime of a WSN, sensing holes may often occur. In order to provide the required sensors to a sensing hole, this paper shows that the use of multiple suppliers can guarantee a balanced movement of hopping sensors and a higher movement success ratio of requested sensors. Furthermore, since early research on relocation schemes tends to merely apply the shortest path algorithm repeatedly for multiple suppliers (e.g., MinHopsExt), specific clusters on the obtained paths can be used repeatedly; as a result, the mobility distribution is imbalanced, some sensors' hopping capability may deteriorate quickly, and another sensing hole may easily occur. The proposed RAMDiP takes into account the number of relocations of each cluster in order to avoid path collisions as much as possible; thus, the entire mobility distribution is balanced and the success ratio of the requested sensors also increases.

Acknowledgments. This research was supported in part by NSF (USA) Grants No. OCI-0753362, CNS-0721441, and CNS-0551464.
References

1. Choi, W., Das, S.K.: A Novel Framework for Energy-Conserving Data Gathering in Wireless Sensor Networks. In: Proceeding of INFOCOM, vol. 3, pp. 1985–1996. IEEE Computer Society Press, Los Alamitos (2005)
2. Wang, G., Cao, G., Porta, T.L., Zhang, W.: Sensor Relocation in Mobile Sensor Networks. In: Proceeding of INFOCOM, vol. 4, pp. 2302–2312. IEEE, Los Alamitos (2005)
3. Ma, K., Zhang, Y., Trappe, W.: Managing the Mobility of a Mobile Sensor Network Using Network Dynamics. Transactions on Parallel and Distributed Systems 19(1), 106–120 (2008)
4. Wang, G., Cao, G., Porta, T.L.: A Bidding Protocol for Sensor Deployment. In: Proceeding of ICNP, pp. 315–324. IEEE Computer Society Press, Los Alamitos (2003)
5. Goldenberg, D.K., Lin, J., Morse, A.S., Rosen, B.E., Yang, Y.R.: Towards mobility as a network control primitive. In: Proceeding of MobiHoc, pp. 163–174. ACM Press, New York (2004)
6. Wang, G., Cao, G., Porta, T.: Movement-Assisted Sensor Deployment. Transactions on Mobile Computing 5(6), 640–652 (2006)
7. Teng, J., Bolbrock, T., Cao, G., Porta, T.L.: Sensor Relocation with Mobile Sensors: Design, Implementation, and Evaluation. In: Proceeding of MASS, pp. 1–9. IEEE Computer Society Press, Los Alamitos (2007)
8. Chellappan, S., Bai, X., Ma, B., Xuan, D., Xu, C.: Mobility Limited Flip-Based Sensor Networks Deployment. Transactions on Parallel and Distributed Systems 18(2), 199–211 (2007)
9. http://www.darpa.mil/STO/smallunitops/SHM/sandia.html
10. http://www.darpa.mil/STO/smallunitops/SHM/briefings/Hopcomb.WMV
11. http://www.darpa.mil/STO/smallunitops/SHM/briefings/IMLBreach.WMV
12. Hale, E., Schara, N., Burdick, J., Fiorini, P.: A minimally actuated hopping rover for exploration of celestial bodies. In: Proceeding of ICRA, vol. 1, pp. 420–427. IEEE Computer Society Press, Los Alamitos (2000)
13. Confente, M., Cosma, C., Fiorini, P., Burdick, J.: Planetary Exploration Using Hopping Robots. In: Proceeding of ESA Workshop on Advanced Space Technologies for Robotics and Automation (2002)
14. Bergbreiter, S., Pister, K.: Design of an Autonomous Jumping Microrobot. In: Proceeding of ICRA, pp. 447–453. IEEE Computer Society Press, Los Alamitos (2007)
15. Luo, R.C., Huang, J.-T., Chen, O.: A Triangular Selection Path Planning Method with Dead Reckoning System for Wireless Mobile Sensor Mote. In: Proceeding of ICSMC, vol. 1, pp. 162–168. IEEE Computer Society Press, Los Alamitos (2006)
16. Cen, Z., Mutka, M.W.: Relocation of Hopping Sensors. In: Proceeding of ICRA, pp. 569–574. IEEE Computer Society Press, Los Alamitos (2008)
17. Zou, Y., Chakrabarty, K.: Sensor Deployment and Target Localization in Distributed Sensor Networks. Transactions on Embedded Computing Systems 3(1), 61–91 (2004)
18. Ghrist, R., Muhammad, A.: Coverage and hole-detection in sensor networks via homology. In: Proceeding of IPSN, pp. 254–260. IEEE, Los Alamitos (2005)
19. Fang, Q., Gao, J., Guibas, L.J.: Locating and Bypassing Routing Holes in Sensor Networks. In: Proceeding of INFOCOM, vol. 4, pp. 2458–2468. IEEE, Los Alamitos (2004)
Hybrid Hard/Soft Decode-and-Forward Relaying Protocol with Distributed Turbo Code

Taekhoon Kim and Dong In Kim
School of Information and Communication Engineering, Sungkyunkwan University, Suwon 440-746, Korea
{thkulguy,dikim}@ece.skku.ac.kr
Abstract. This paper proposes a hybrid hard/soft decode-and-forward (DF) relaying protocol with distributed turbo code (DTC), based on error detection in cooperative communications. In order to improve the performance in the outage case of the source-relay channel in uplink transmission, the relay decides whether to hard- or soft-decode the received signal, based on error detection, before forwarding it to the destination. Our simulation results show that the proposed hybrid hard/soft DF relaying outperforms conventional relaying protocols, in terms of bit-error rate (BER) performance, in uplink cooperative communications with DTC. This is because the proposed scheme neither amplifies the noise nor propagates errors to the destination: it decodes adaptively with the help of an error detection code or a known threshold on the signal-to-noise ratio (SNR).
Keywords: Cooperative diversity, decode-and-forward (DF), relay network, distributed turbo code (DTC), hybrid relaying protocol.
1 Introduction
In wireless communications, severe distortions due to multipath propagation and Doppler shift cause a strong degradation in received signal quality. These distortions can be mitigated by the use of diversity. Specifically, spatial diversity is obtained when antennas are separated far enough to experience independent fading. In a cellular network, cost and hardware limitations due to the limited size of a mobile terminal make it difficult to equip it with multiple antennas. To overcome this limitation, a new diversity technique, namely cooperative diversity, has recently been proposed. For this, the terminals share their antennas to generate a virtual array through distributed transmission [1], [2]. Signal transmission assisted by a number of relays is a simple example of cooperative communications. The relay listens to the source's transmission and forwards the received signal to the destination.
This research was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement) (IITA2009-C1090-0902-0005).
Several relaying protocols have been proposed in the literature. The amplify-and-forward (AF) and the decode-and-forward (DF) protocols were proposed in [3]. AF can achieve the available diversity with maximal-ratio combining (MRC). However, it requires the storage of analog waveforms at the relay, which may lead to impractical hardware implementations [4]. DF is simple, but it loses diversity gain and error performance due to error propagation unless a reliable channel between source and relay is assured [5]. Moreover, to overcome the error propagation, a hybrid relaying protocol combining AF and DF (i.e., hybrid AF/DF) was proposed, which is based on error detection at the relay [6], [7]. The relay detects whether it successfully receives the source's transmission (e.g., using the CRC bits). If successful, it re-encodes and transmits the message to the destination as in the DF relaying protocol. Otherwise, the relay amplifies and forwards the message as in the AF relaying protocol. However, in the outage case, in which the majority of the received signals are impaired, the amplified signals transmitted from the relay can also lead to performance degradation. To increase the channel capacity and improve the error performance in a relay network, some distributed coding schemes have been developed, such as distributed space-time block codes [8] and distributed turbo codes (DTCs) [9]. The DTC scheme, using a recursive systematic convolutional (RSC) code [10], has been shown to perform close to the theoretical outage probability bound. However, it may not be effective when the channel between source and relay is in outage, since most DTC schemes, called perfect DTC, assume that the relay correctly decodes the received signals from the source. To remedy this, distributed turbo coding with soft information relaying (DTC-SIR), which forwards soft symbol estimates calculated from a posteriori probabilities (APPs) rather than hard decisions on the received signals at the relay, has been proposed in [11]. DTC-SIR offers a dramatic performance gain, especially when the source-relay channel is at low SNR, compared to the DTC without the assumption of perfect decoding at the relay [12]. In this paper, we propose a hybrid hard/soft DF relaying protocol with DTC, for which the relay adaptively selects the relaying protocol, either hard or soft decoding, based on error detection at the relay node. To determine whether the received signals are decoded correctly or not, we can simply check the CRC bits appended to a frame in real systems. However, for simplicity and to allow analysis, we consider an SNR-threshold based model for the link between source and relay. If the received SNR is above the threshold SNR, it is assumed that the relay can decode correctly. Compared with conventional AF/DF relaying protocols, our proposed scheme does not need to store the analog waveforms nor to amplify a damaged signal in which errors have already been detected at the relay. Moreover, the diversity gain can be preserved by the soft decision at the relay, unlike in DF, even when the link between source and relay is weak. Our simulation results show that the proposed scheme has a much better BER performance than the conventional schemes stated above. The rest of the paper is organized as follows. Section 2 describes the cooperative system model considered herein. The conventional and proposed relaying
protocols are analyzed in Sections 3 and 4, respectively. In Section 5, simulation results are presented to compare their bit-error rate (BER) performance. Conclusions are stated in Section 6.
2 System Model
We consider a single-cell cellular network with a fixed relay station (RS) to assist mobile stations (MSs) in accessing the base station (BTS) via a relaying strategy for uplink transmission. The RSs are positioned around the BTS in such a way that the wireless channels on the relay link (from the RSs to the BTS) are line-of-sight (LOS). For simplicity, a general two-hop relay system, consisting of MSs (sources), RSs (relays) and one BTS (destination), is assumed, as depicted in Fig. 1. A set of binary information transmitted from the source is represented by

$B = (b_1, \ldots, b_k, \ldots, b_l)$  (1)

where $b_k$ is the kth information bit and $l$ is the frame size. First, the information sequence B is encoded by a channel encoder. In this paper we consider an RSC code with code rate 1/2. Let C represent the corresponding codeword as

$C = (C_1, \ldots, C_k, \ldots, C_l)$  (2)

where $C_k = (i_k, p_k)$, $i_k, p_k \in \{0, 1\}$, is the codeword of $b_k$; $i_k$ is the information symbol and $p_k$ is the corresponding parity symbol. The binary codeword stream C is then mapped into a modulated signal stream S. For simplicity, we consider BPSK modulation. The modulated signal S is given by

$S = (S_1, \ldots, S_k, \ldots, S_l)$  (3)

where $S_k = (s_{i,k}, s_{p,k})$, $s_{i,k}, s_{p,k} \in \{-1, 1\}$, are the modulated information and parity symbols transmitted by the source. For simplicity, we assume that the source and the relay transmit data in time-division multiplexing (TDM), for which the source and the relay transmit in separate time slots.
[Figure: two-hop uplink topology with links h_{SR} (source to relay), h_{SD} (source to destination) and h_{RD} (relay to destination).]
Fig. 1. An example topology of source, relay and destination in cooperative uplink transmission
In the first time slot, the received signals at the relay and the destination from the source can be written, respectively, as

$y_{SR} = \sqrt{P_S}\, h_{SR}\, s + n_{SR}$  (4)

$y_{SD} = \sqrt{P_S}\, h_{SD}\, s + n_{SD}$  (5)

where $P_S$ is the transmit signal power at the source, and $h_{SR}$ and $h_{SD}$ are the fading coefficients from the source to the relay and to the destination, respectively. They are modeled as zero-mean circularly symmetric complex Gaussian random variables and assumed to be quasi-static, so that the fading coefficients remain constant within a frame. $n_{SR}$ and $n_{SD}$ are independent complex Gaussian noises with two-sided power spectral density $N_0/2$ per dimension. At the destination, the corresponding received signal from the relay can be represented by

$y_{RD} = \sqrt{P_R}\, h_{RD}\, x + n_{RD}$  (6)

where $x$ is the transmit signal at the relay, $P_R$ is the transmit signal power, and $h_{RD}$ and $n_{RD}$ are the fading coefficient and noise from the relay to the destination, respectively, modeled as above.
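As a concrete illustration of this signal model, the short Python sketch below generates one BPSK frame and the three received signals of (4)-(6). It is not part of the paper; the frame size, powers and noise level are assumed values, and the relay signal x is a placeholder.

import numpy as np

rng = np.random.default_rng(0)
l, PS, PR, N0 = 130, 1.0, 1.0, 0.5           # assumed frame size, powers, noise

def cgauss(n, var):
    """Zero-mean circularly symmetric complex Gaussian samples."""
    return np.sqrt(var / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

b = rng.integers(0, 2, l)                    # information bits
s = 1.0 - 2.0 * b                            # BPSK mapping: 0 -> +1, 1 -> -1

# quasi-static fading: one coefficient per link, constant over the frame
hSR, hSD, hRD = (cgauss(1, 1.0)[0] for _ in range(3))

ySR = np.sqrt(PS) * hSR * s + cgauss(l, N0)  # eq. (4), received at the relay
ySD = np.sqrt(PS) * hSD * s + cgauss(l, N0)  # eq. (5), received at the destination
x = s                                        # placeholder relay transmit signal
yRD = np.sqrt(PR) * hRD * x + cgauss(l, N0)  # eq. (6), relay to destination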
3 Conventional Relaying Protocol
In this section, we review the conventional relaying protocols mentioned earlier, namely AF, DF and hybrid AF/DF. All conventional relaying protocols are assumed to use the RSC code. In AF the relay multiplies the received signal by an amplification factor, whereas in all other protocols the signals are decoded, interleaved and re-encoded at the relay.

3.1 Amplify-and-Forward Relaying
In the AF relaying protocol, the relay simply amplifies the received signals from the source and forwards them to the destination. At the destination, the received signals from each node are multiplied by the corresponding weight factors for MRC combining. First, the signal transmitted from the relay is given by

$x_{AF} = \beta y_{SR}$  (7)

where $\beta$ is the amplification factor at the relay, defined by

$\beta \le \sqrt{\dfrac{P_R}{|h_{SR}|^2 P_S + N_0}}$.  (8)

Combining (6) with (4) and (7) yields

$y_{RD,AF} = \sqrt{P_R}\, h_{RD}\, \beta \left(\sqrt{P_S}\, h_{SR}\, s + n_{SR}\right) + n_{RD}$.  (9)
The destination combines the two signals, one from the relay (9) and the other from the source (5); the combined signal can be expressed as

$y_{MRC,AF} = \omega_{SD}\, y_{SD} + \omega_{RD}\, y_{RD,AF}$  (10)

where $\omega_{SD}$ and $\omega_{RD}$ are the MRC weight factors for the received signals from the source and the relay, respectively:

$\omega_{SD} = h^{*}_{SD}$  (11)

$\omega_{RD} = \dfrac{h^{*}_{SR}\, h^{*}_{RD}\, \beta}{|h_{RD}|^2 \beta^2 + 1}$.  (12)
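The following Python fragment sketches one realization of the AF path, eqs. (7)-(12), reusing the signals and channel coefficients from the system-model sketch above. Taking (8) with equality is an assumption made only for illustration.

import numpy as np

def af_relay_and_mrc(ySR, ySD, hSR, hSD, hRD, PS, PR, N0, rng):
    beta = np.sqrt(PR / (np.abs(hSR) ** 2 * PS + N0))   # eq. (8), with equality
    xAF = beta * ySR                                     # eq. (7)
    nRD = np.sqrt(N0 / 2) * (rng.standard_normal(ySR.size)
                             + 1j * rng.standard_normal(ySR.size))
    yRD_AF = np.sqrt(PR) * hRD * xAF + nRD               # eq. (9)
    wSD = np.conj(hSD)                                   # eq. (11)
    wRD = (np.conj(hSR) * np.conj(hRD) * beta
           / (np.abs(hRD) ** 2 * beta ** 2 + 1))         # eq. (12)
    return wSD * ySD + wRD * yRD_AF                      # eq. (10)

Usage follows directly from the earlier sketch, e.g. yMRC = af_relay_and_mrc(ySR, ySD, hSR, hSD, hRD, PS, PR, N0, rng).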
As shown in (9), AF amplifies the received signal by the amplification factor, thereby also amplifying the S-R link noise.

3.2 Decode-and-Forward Relaying
For the conventional DF relaying protocol with DTC, the received signals from the source are first binary decoded by the relay, then interleaved and re-encoded, and forwarded to the destination. A set of the decoded bits at the relay is given by

$\hat{B} = (\hat{b}_1, \ldots, \hat{b}_k, \ldots, \hat{b}_l)$  (13)
where $\hat{b}_k$ is the kth information bit decoded by the relay. The decoded binary information stream is then interleaved, re-encoded and modulated into $\hat{S}$, denoted by

$\hat{S} = (\hat{S}_1, \ldots, \hat{S}_k, \ldots, \hat{S}_l)$  (14)
where $\hat{S}_k = (\hat{s}_{i,k}, \hat{s}_{p,k})$, $\hat{s}_{i,k}, \hat{s}_{p,k} \in \{-1, 1\}$, is the modulated signal transmitted by the relay. Since $x$ in (6) can be substituted by $\hat{s}$, the received signal at the destination from the relay becomes

$y_{RD,DF} = \sqrt{P_R}\, h_{RD}\, \hat{s} + n_{RD}$.  (15)

At the destination, the signals from the source and the relay enter the maximum a posteriori probability (MAP) decoder for iterative decoding.

3.3 Hybrid AF/DF Relaying
For the hybrid AF/DF relaying protocol, the relay chooses between the two relaying protocols, AF and DF, depending on whether the decoding at the relay is correct or not. The received signal at the destination can be expressed as

$y_{RD,Hybrid\,AF/DF} = c\, y_{RD,AF} + (1 - c)\, y_{RD,DF}$  (16)
where $c \in \{0, 1\}$ is a constant determined by the conditional probability as follows:

$c = \begin{cases} 0, & P(\gamma_{SR} > \gamma_t \mid y_{SR}) > P(\gamma_{SR} < \gamma_t \mid y_{SR}) \\ 1, & \text{otherwise} \end{cases}$  (17)

where $\gamma_t$ is the threshold SNR, $P(\gamma_{SR} > \gamma_t \mid y_{SR})$ is the conditional probability that the received SNR at the relay exceeds the threshold SNR, and $P(\gamma_{SR} < \gamma_t \mid y_{SR})$ is its complement.
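A minimal sketch of this selection rule follows; as an assumption for illustration, a point estimate of the instantaneous S-R SNR stands in for the conditional probabilities of (17).

def hybrid_af_df(yRD_AF, yRD_DF, gamma_SR_est, gamma_t):
    # eq. (17): c = 0 (forward the DF signal) when the estimated S-R SNR
    # exceeds the threshold, c = 1 (forward the AF signal) otherwise
    c = 0 if gamma_SR_est > gamma_t else 1
    return c * yRD_AF + (1 - c) * yRD_DF    # eq. (16)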
4 Hybrid Hard/Soft Decode-and-Forward with Distributed Turbo Code
In this section, we propose the hybrid hard/soft DF with DTC scheme, where the relay decides to either hard- or soft-decode the received signals from the source by comparing the received SNR at the relay with a specific threshold SNR. If the received SNR is above the threshold, the relay decodes the received signals with hard decisions $\hat{b}_k \in \{0, 1\}$. Otherwise, parity symbol soft estimates (calculated from the APPs of the received signals) obtained via soft decision are forwarded to the destination. The block diagram of the proposed hybrid relaying protocol is depicted in Fig. 2.

4.1 Hard Decode-and-Forward Relaying
Let

$\tilde{B} = (\tilde{b}_1, \ldots, \tilde{b}_k, \ldots, \tilde{b}_l)$  (18)

represent the interleaved version of the binary information stream B. $\tilde{B}$ is then re-encoded and modulated, respectively, as follows:

$\tilde{C} = (\tilde{C}_1, \ldots, \tilde{C}_k, \ldots, \tilde{C}_l)$  (19)

$\tilde{S} = (\tilde{S}_1, \ldots, \tilde{S}_k, \ldots, \tilde{S}_l)$  (20)

where $\tilde{C}_k = (\tilde{i}_k, \tilde{p}_k)$, $\tilde{i}_k, \tilde{p}_k \in \{0, 1\}$, is the codeword of $\tilde{b}_k$; $\tilde{i}_k$ is the information symbol, $\tilde{p}_k$ is the corresponding parity symbol, and $\tilde{S}_k = (\tilde{s}_{i,k}, \tilde{s}_{p,k})$, $\tilde{s}_{i,k}, \tilde{s}_{p,k} \in \{-1, 1\}$, is the modulated symbol stream of $\tilde{C}_k$. The relay then forwards the modulated signals to the destination, namely

$x_{hard\,DF} = \tilde{s}$  (21)

and the received signal from the relay can be written as

$y_{RD,hard\,DF} = \sqrt{P_R}\, h_{RD}\, \tilde{s} + n_{RD}$.  (22)
Fig. 2. A block diagram of hybrid hard/soft DF relaying protocol with DTC
4.2 Soft Decode-and-Forward Relaying
If the received SNR at the relay is lower than the threshold SNR, the relay transmits soft decoded signals to the destination. The soft decoded signal is the parity symbol soft estimate of the interleaved information, calculated as follows [11]. First of all, the relay calculates the APPs of the received signal, denoted by $P(b_k = v \mid y_{SR})$, $v \in \{0, 1\}$ [13]:

$P(b_k = v \mid y_{SR}) = h \sum_{m, m' = 0,\ b(k) = v}^{M_s - 1} \alpha_{k-1}(m')\, \beta_k(m)\, \gamma_k(m, m')$  (23)
where $h$ is a constant that normalizes the sum of the APPs to 1, $m$ and $m'$ are the pair of states connected by the input information bit $b_k$, $M_s$ is the number of states in the trellis, and $\gamma_k(m, m')$, $\alpha_k(m')$ and $\beta_k(m)$ are the branch, feed-forward and feedback metrics, which can be expressed as

$\gamma_k(m, m') = \exp\left(-\dfrac{\|y_{SR} - \sqrt{P_S}\, h_{SR}\, s_k\|^2}{N_0}\right)$  (24)

$\alpha_k(m') = \sum_{m} \alpha_{k-1}(m)\, \gamma_k(m, m')$  (25)

$\beta_{k-1}(m) = \sum_{m'} \beta_k(m')\, \gamma_k(m, m')$.  (26)

The boundary values of $\alpha_k(m')$ and $\beta_k(m)$ can be set as $\alpha_0(0) = 1$ and $\alpha_0(m) = 0$ for $m \neq 0$, and $\beta_l(0) = 1$ and $\beta_l(m) = 0$ for $m \neq 0$, when zero termination is assumed for each codeword.
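For concreteness, the recursions (23)-(26) can be written as one generic forward-backward pass over a trellis. The sketch below is illustrative only: the trellis structure and the per-step branch-metric matrices for the code actually used are assumed to be supplied by the caller.

import numpy as np

def app_decode(gamma, branches, Ms, l):
    """APPs of eq. (23) via the forward-backward recursions (25)-(26).

    gamma[k] is an (Ms, Ms) array whose entry [m, mp] is the metric of the
    branch from state m at step k to state mp at step k+1; branches[v]
    lists the (m, mp) pairs driven by input bit v.
    """
    alpha = np.zeros((l + 1, Ms)); alpha[0, 0] = 1.0   # zero-termination boundaries
    beta = np.zeros((l + 1, Ms)); beta[l, 0] = 1.0
    for k in range(1, l + 1):                          # eq. (25)
        alpha[k] = alpha[k - 1] @ gamma[k - 1]
        alpha[k] /= alpha[k].sum()                     # scaling for stability
    for k in range(l - 1, -1, -1):                     # eq. (26)
        beta[k] = gamma[k] @ beta[k + 1]
        beta[k] /= beta[k].sum()
    app = np.zeros((l, 2))
    for k in range(l):                                 # eq. (23)
        for v in (0, 1):
            app[k, v] = sum(alpha[k, m] * gamma[k][m, mp] * beta[k + 1, mp]
                            for m, mp in branches[v])
        app[k] /= app[k].sum()                         # the normalization h
    return app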
Let

$P_B = \{P(b_k = v \mid y_{SR}),\ v = 0, 1,\ k = 1, \ldots, l\}$  (27)

represent the set of the APPs of the information symbols; the interleaved version of $P_B$ is denoted by $P_{\tilde{B}}$. The probability of the state $m$ at time $k$ is then obtained as

$P[g(k) = m \mid y_{SR}, P_{\tilde{B}}] = \sum_{m'} P[b(m, m') \mid y_{SR}] \times P[g(k-1) = m' \mid y_{SR}, P_{\tilde{B}}]$  (28)
where $b(m, m')$ represents the input information symbol resulting in the transition from state $m'$ to $m$, $P[b(m, m') \mid y_{SR}]$ is the APP of the information symbol $b(m, m')$, and $P[g(k-1) = m' \mid y_{SR}, P_{\tilde{B}}]$ denotes the probability of the state $m'$ at time $k-1$. The APP of $\tilde{c}_k$, the parity symbol of $\tilde{b}_k$, given $P_{\tilde{B}}$ can be calculated from (28) as

$P(\tilde{c}_k = \tilde{v}_c \mid y_{SR}, P_{\tilde{B}}) = \sum_{m \in U(\tilde{c}_k = \tilde{v}_c)} P(\tilde{b}_k = \tilde{v}_b \mid y_{SR}) \times P[g(k-1) = m \mid y_{SR}, P_{\tilde{B}}],\quad \tilde{v}_c \in \{0, 1\}$  (29)
where $U(\tilde{c}_k = \tilde{v}_c)$ is the set of branches for which the output parity symbol is equal to $\tilde{v}_c$, $\tilde{v}_b$ is the output information symbol corresponding to the parity symbol $\tilde{v}_c$, and $P(\tilde{b}_k = \tilde{v}_b \mid y_{SR})$ represents the APP of the information symbol $\tilde{v}_b$ at time $k$. With BPSK modulation, the binary symbols 0 and 1 are mapped to 1 and $-1$, respectively. Thus, the parity symbol soft estimate of $\tilde{c}_k$, denoted by $\hat{s}^p_k$, can be calculated as

$\hat{s}^p_k = P(\tilde{c}_k = 0 \mid y_{SR}, P_{\tilde{B}}) \cdot 1 + P(\tilde{c}_k = 1 \mid y_{SR}, P_{\tilde{B}}) \cdot (-1)$  (30)
Here, we can write $\hat{s}^p_k$ as $\hat{s}^p_k = \tilde{s}^p_k (1 - \bar{n}_k)$, where $\tilde{s}^p_k$ is the exact parity symbol of $\tilde{b}_k$ and $\bar{n}_k$ is an equivalent noise with mean and variance, respectively,

$\mu_{\bar{n}} = \dfrac{1}{l} \sum_{k=1}^{l} \bar{n}_k = \dfrac{1}{l} \sum_{k=1}^{l} |\hat{s}^p_k - \tilde{s}^p_k|$  (31)

$\sigma^2_{\bar{n}} = \dfrac{1}{l} \sum_{k=1}^{l} \left(1 - \hat{s}^p_k \tilde{s}^p_k - \mu_{\bar{n}}\right)^2$  (32)
The signal transmitted from the relay is then given by

$x_{soft\,DF} = \alpha \hat{s}^p_k = \alpha \tilde{s}^p_k (1 - \bar{n}_k)$  (33)
where $\alpha$ is a normalization factor calculated from the transmit power constraint at the relay as

$\alpha \le \sqrt{\dfrac{P_R}{(1 - \mu_{\bar{n}})^2 + \sigma^2_{\bar{n}}}}$.  (34)

By substituting (33) into (6), the received signal from the relay can be represented as

$y_{RD,soft\,DF} = \sqrt{P_R}\, h_{RD}\, \alpha \tilde{s}^p_k (1 - \bar{n}_k) + \bar{n}_{RD}$  (35)

where $\bar{n}_{RD}$ is the equivalent noise at the destination for soft DF, with zero mean and variance $\sigma^2_E = N_0 + |h_{RD} \alpha|^2 \sigma^2_{\bar{n}}$.
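The following sketch ties (30)-(34) together: it forms the parity soft estimates from the APPs, evaluates the equivalent-noise statistics, and scales the relay signal. As in the text, the exact parity symbols of the interleaved stream are used to evaluate (31)-(32); all inputs are assumed to come from the APP computation above, and (34) is taken with equality.

import numpy as np

def soft_relay_signal(app_parity0, s_tilde_p, PR):
    """app_parity0[k] = P(c~_k = 0 | ySR, P_B~); s_tilde_p in {-1, +1}."""
    s_hat_p = app_parity0 * 1.0 + (1.0 - app_parity0) * (-1.0)  # eq. (30)
    n_bar = 1.0 - s_hat_p * s_tilde_p          # from s^p = s~p (1 - n)
    mu = n_bar.mean()                          # eq. (31)
    var = ((n_bar - mu) ** 2).mean()           # eq. (32)
    alpha = np.sqrt(PR / ((1.0 - mu) ** 2 + var))   # eq. (34), with equality
    return alpha * s_hat_p                     # eq. (33)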
5 Simulation Results
In uplink transmission the signal power from the relay to the destination is fixed on the LOS link, while the powers from the source to the relay and to the destination are varied. A frame of 130 symbols is sent over quasi-static fading channels. We use a four-state RSC code with code rate 1/2 and generator matrix (1, 5/7), and the CRC-16-CCITT is adopted for all simulations. The branch metric associated with $y_{RD}$ is somewhat different from (24), depending on the relaying scheme:

$\gamma_{k,AF}(m, m') = \exp\left(-\dfrac{\|y_{RD} - \sqrt{P_S P_R}\, h_{SR} h_{RD}\, \beta s_k\|^2}{P_R |h_{RD}|^2 \beta^2 + N_0}\right)$
[Figure: BER curves (from 10^0 down to 10^{-6}) versus the S-R link SNR (0-20 dB) for AF, DF, hybrid AF/DF and hybrid hard/soft DF, all at 15 dB.]
Fig. 3. BER performance comparison at 15dB in γRD
[Figure: BER curves (from 10^0 down to 10^{-6}) versus the S-R link SNR (0-20 dB) for AF, DF, hybrid AF/DF and hybrid hard/soft DF, all at 20 dB.]
Fig. 4. BER performance comparison at 20dB in γRD
$\gamma_{k,DF}(m, m') = \exp\left(-\dfrac{\|y_{RD} - \sqrt{P_R}\, h_{RD}\, s_k\|^2}{N_0}\right)$

$\gamma_{k,soft-DF}(m, m') = \exp\left(-\dfrac{\|y_{RD} - \sqrt{P_R}\, h_{RD}\, \alpha \tilde{s}^p_k (1 - \bar{n}_k)\|^2}{\sigma^2_E}\right)$.
Figs. 3 and 4 compare the BER performance of the three conventional relaying protocols and the proposed hybrid hard/soft DF relaying protocol, as a function of the S-R link SNR, when the assumed LOS R-D link SNR is fixed to 15dB and 20dB, respectively. First of all, we see that the proposed hybrid hard/soft DF relaying benefits from a significant coding gain offered by the DTC, compared to the distributed coding with AF relaying, which is also contained in hybrid AF/DF. Furthermore, it overcomes the detrimental effects of error propagation due to imperfect decoding at the relay in pure DF relaying. Finally, unlike the other schemes, AF and DF, the BER of the proposed scheme improves markedly with the increased R-D link SNR. This is because the relay operates adaptively, conditioned on the S-R link quality, which helps to avoid outage events.
6 Conclusion
In this paper, we proposed a hybrid hard/soft relaying protocol with DTC for a general two-hop relay network consisting of one relay. Depending on whether the relay can decode correctly or not, the relaying protocol adaptively chooses between hard and soft decoding for DF. In uplink transmission with the R-D link SNR fixed (and assumed LOS) while the S-R link SNR varies, the proposed scheme is far better than conventional relaying schemes in BER performance, as observed in our simulation results. Since it has better diversity gain than AF
with less delay or complexity than DF via soft decoding at the relay, the proposed scheme can be an attractive relaying protocol in cooperative communications.
References 1. Sendonaris, A., Erkip, E., Aazhang, B.: User cooperation diversity - part 1: System description. IEEE Trans. on Communications 51, 1927–1938 (2003) 2. Nosratinia, A., Hunter, T., Hedayat, A.: Cooperative communication in wireless networks. IEEE Communications Magazine 42, 74–80 (2004) 3. Laneman, J.N., Tse, D.N.C.: Cooperative diversity in wireless networks: efficient protocols and outage behavior. IEEE Trans. on Inform. Theory 50, 3062–3080 (2004) 4. Zimmermann, E., Herhold, P., Fettweis, G.: On the performance of cooperative diversity protocols in practical wireless systems. In: IEEE VTC-Fall, Orlando, vol. 4, pp. 2212–2216 (2003) 5. Wang, T., Cano, A., Giannakis, G., Laneman, J.: High-performance cooperative demodulation with decode-and-forward relays. IEEE Trans. on Communications 55, 1427–1438 (2007) 6. Yu, M., Li, J.: Is amplify-and-forward practically better than decode-and-forward or vice versa? In: IEEE ICASSP, Philadelphia, vol. 3, pp. 365–368 (2005) 7. Li, Y., Vucetic, B., Yuan, J.: Distributed turbo coding with hybrid relaying protocols. In: IEEE PIMRC, Cannes, pp. 1–6 (2008) 8. Yiu, S., Schober, R., Lampe, L.: Distributed space-time block coding. IEEE Trans. on Communications 54, 1195–1206 (2006) 9. Zhao, B., Valenti, M.C.: Distributed turbo codes: towards the capacity of the relay channel. In: IEEE VTC-Fall, Florida, vol. 1, pp. 322–326 (2003) 10. Berrou, C., Glavieux, A.: Near optimum error correcting coding and decoding: Turbo-codes. IEEE Trans. on Communications 44, 1261–1271 (1996) 11. Li, Y., Vucetic, B., Wong, T., Dobler, M.: Distributed turbo coding with soft information relaying on multi-hop relaying networks. IEEE J. Select Area Communications 24, 2040–2050 (2006) 12. Sneessens, H.H., Vandendorpe, L.: Soft decode and forward improves cooperative communications. In: IEE Int. Conf. 3G and Beyond, London, pp. 1–4 (2005) 13. Bahl, L.R., Cocke, J., Jelinek, F., Raviv, J.: Optimal decoding of linear codes for minimizing symbol error rate. IEEE Trans. on Inf. Theory 20, 284–287 (1974)
Design of an Efficient Multicast Scheme for Vehicular Telematics Networks

Junghoon Lee1, In-Hye Shin1, Hye-Jin Kim1, Min-Jae Kang2, and Sang Joon Kim3

1 Dept. of Computer Science and Statistics, 2 Dept. of Electronic Engineering, 3 Dept. of Computer Engineering, Cheju National University, 690-756, Jeju Do, Republic of Korea
{jhlee, ihshin76, hjkim82, minjk, sjlee}@jejunu.ac.kr
Abstract. This paper proposes and measures the performance of an error control scheme for reliable message multicast on vehicular telematics networks, aiming at improving the successful delivery ratio for safety applications. Periodically triggered by an access point on the mobile gateway, the error recovery procedure collects error reports from the reachable vehicles and decides which packet to retransmit, considering the available network bandwidth. The control scheme basically selects the message vehicles have missed most, giving additional precedence to messages belonging to a vehicle that is about to leave the gateway. The performance measurement result, obtained via simulation using a discrete event scheduler, shows that the proposed scheme can enhance the number of recovered messages by up to 12 % compared with the maximum selection scheme, showing a better recovery ratio over almost all ranges of the given parameters.
Keywords: vehicular telematics network, mobile gateway, multicast, retransmission, vehicle mobility.
1 Introduction
The vehicular telematics network consists of a large number of vehicles moving fast along road segments, keeping users connected even in a car [1]. The deployment of vehicular networks is accelerated by ever-growing wireless communication technologies, which offer a variety of options for selecting an appropriate network. Using this network, diverse location-based services can be provided, including real-time traffic information, route recommendation, and local map updates [2]. Moreover, users can generate queries on location information about the area they are passing by, for example, the local transportation schedule, famous
This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement). (IITA2009-C1090-0902-0040). Corresponding author.
restaurants, and also their seat availability. Besides, the vehicular network draws attention in relation to safety issues such as accident information propagation, road condition alerts, and so on [1]. For these services, reliable multicast or broadcast is of the greatest importance. While multicast is one of the essential communication primitives, wireless channels are subject to unpredictable location-dependent and time-varying errors [3]. In the case of multiple nodes, each one will have different channel conditions, processing powers, and only limited feedback channel capabilities. The design of an efficient protocol must take into account the underlying network architecture and application requirements. However, the mobility of vehicles makes it quite difficult to apply existing error control mechanisms to multicast in the vehicular network; the vehicles move fast, and they can be frequently connected to and disconnected from the network. There are many kinds of telematics networks, ranging from infrastructure-based networks to fully ad hoc networks. First, the infrastructure-based network equips every vehicle with a cellular network interface such as GSM (Global System for Mobile), allowing each vehicle to ubiquitously access the global network. Second, in the vehicular ad-hoc network, or VANET, every vehicle exchanges its messages in a fully ad-hoc manner according to the DSRC (Dedicated Short Range Communication) protocol [4]. While this network works without a centralized coordination function and eliminates communication cost, it suffers from connection instability and throughput fluctuation. To overcome the high cost of the infrastructure network and the possibly low connection coverage of the VANET, in a hybrid architecture which compromises between the two, some components play the role of gateway nodes that have two interfaces and relay between the two network types [5]. While the mobile gateway always maintains a connection to the infrastructure network, other nodes can access the global network only through this mobile gateway. The connection between the mobile gateway and other vehicles mainly depends on the transmission range of the wireless communication interfaces. This network generally does not support multi-hop connections, due to the unpredictable spatial distribution of vehicles. As the protocol within this network follows the IEEE 802.11 series MAC [6], the error control within a mobile gateway boundary must be built on top of this protocol. Error control essentially requires additional messages, including reports of lost frames and the corresponding retransmissions. Therefore, error control should be tightly controlled by a coordinator such as an AP (Access Point). In addition, not all damaged packets can be recovered, due to lack of available bandwidth. Accordingly, it is important to decide which packet the control procedure is to retransmit at a given time instant. One of the most basic options is to pick the packet vehicles have missed most. This scheme is simple but works very well in most cases. Even though this scheme is reasonable in the vehicular network, it does not consider the mobility of vehicles. If a vehicle gets out of the gateway range, there is no chance to retransmit a damaged packet. Consequently, the mobility must be considered in selecting the packet to retransmit. In this
regard, this paper proposes an error control scheme for multicast messages on the vehicular network and analyzes its performance. The rest of this paper is organized as follows: after Section 2 introduces some related work and background, Section 3 explains basic assumptions on the system and mobility models. Section 4 describes the proposed error recovery scheme in detail, focusing on how to select the message to retransmit, and Section 5 shows the results of the performance measurement. Finally, Section 6 concludes this paper with a brief introduction of future work.
2 Related Work
As an example of MAC layer error control for multicast, the SPEED system proposed a real-time area multicast protocol that directly considers geographic information in designing a multicast protocol for ad-hoc networks [7]. End-to-end real-time communication guarantees are achieved by a novel combination of feedback control and non-deterministic QoS-aware geographic forwarding with a bounded hop count. In addition, Lu et al. proposed a timestamp-based content-aware adaptive retry mechanism [8]. The MAC dynamically determines whether to send or discard a packet by its retransmission deadline, which is assigned to each packet according to its temporal relationship and error propagation characteristics with respect to other packets within the same multicast group. However, their scheme is too complex to be exploited in the WLAN standard, as it crosses the protocol layer boundaries. An efficient retransmission scheme (ER) was developed for wireless LANs [9]. Instead of retransmitting the lost packets in their original forms, ER codes packets lost at different destinations and uses a single retransmission to potentially recover multiple packet losses. This scheme can best be explained by an example. Consider two clients C1 and C2 associated with an AP. The AP has two packets to send, p1 and p2. Here, C1 may lose p1 but receive p2; similarly, C2 may lose p2 but receive p1. ER reduces the number of transmissions by letting the AP retransmit p1+p2, which is p1 xor-ed with p2, instead of sending p1 and p2 separately. Then C1 can extract p1 by xoring p2 with p1+p2, and similarly C2 can extract p2 by xoring p1 with p1+p2 (see the short sketch at the end of this section). In this way, the AP reduces the number of transmissions, and the saved bandwidth can be used for more error recovery. It is a very efficient scheme, but it lacks consideration of the mobility factor. However, its coding concept can be integrated into other error control schemes to reduce the number of retransmission messages. Kang et al. proposed an error control scheme for an information framework where the main control policy of a garbage collecting task follows the work of an Ant-style routing scheme [10, 11]. This work concentrated on a route multicast scheme for a networked robot system, in which each robot multicasts its path history whenever it returns to the base, to share the experience with other robots. With the mappings of route multicast to CFP (Contention Free Period), error reports to CP (Contention Period), and retransmission to overallocated bandwidth over the IEEE 802.11 WLAN, this scheme can eliminate the
interference to the guaranteed stream transmission under the complete control of the AP. In addition, the message size field contained in each frame enables the timely report of an error list as long as at least one frame arrives at the receiver for each message. Finally, according to its current position, along with the start point of the route multicast stream, each robot decides whether to participate in the error control of a specific route multicast stream.
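The XOR coding step of the ER scheme described above can be illustrated with a few lines of Python; the packet contents and lengths below are arbitrary byte strings chosen only for the example.

def xor_packets(p1: bytes, p2: bytes) -> bytes:
    # pad to equal length, then XOR byte by byte
    n = max(len(p1), len(p2))
    a, b = p1.ljust(n, b'\0'), p2.ljust(n, b'\0')
    return bytes(x ^ y for x, y in zip(a, b))

p1, p2 = b"packet-one", b"packet-two!"
coded = xor_packets(p1, p2)                  # the single retransmission p1 + p2
# C1 already holds p2 and recovers p1; C2 holds p1 and recovers p2
assert xor_packets(coded, p2)[:len(p1)] == p1
assert xor_packets(coded, p1)[:len(p2)] == p2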
3 System Model
First of all, Figure 1 shows the architecture of a mobile gateway cell. The emerging telematics technology equips each vehicle with a telematics device, which is basically made up of a positioning component, computing power, and a wireless communication interface. As a vehicle moves around over a vast area, its location keeps changing and is generally represented by a coordinate. GPS (Global Positioning System) is the most commonly used positioning method these days [2]. This in-vehicle telematics device also has enough computing capability to perform such operations as path finding and elementary map functions, providing the results to the driver through the display unit [12]. Finally, the wireless interface enables a vehicle to communicate with other vehicles as well as with a central information service, forming a vehicular telematics network. Mobile gateways are equipped with a cellular network interface on the global network side and a DSRC interface on the cell side. First, the CDMA (Code Division Multiple Access) carrier, in the Republic of Korea, provides a ubiquitous connection to a global network and global information servers [13]. This interface provides a limited network bandwidth, while the communication cost is burdensome, reaching tens of US dollars monthly for each vehicle. Mobile gateways cannot interact directly with each other, but only via the cellular telephony network. On the other hand, within the AP transmission range, multiple vehicles are connected via the DSRC interface. The AP periodically broadcasts a message to notify nodes in its vicinity of its existence. The bandwidth of this network amounts to as much as tens of Mbps, offering a chance of vigorous interactions between the vehicles, even though a prospective application is not yet developed.

Fig. 1. Mobile gateway network
[Figure: (a) a three-state transition diagram over S1, S2, S3; (b) the matrix setting, reproduced below.]

Transition matrix (row = current state, column = next state):
     S1   S2   S3
S1   0.7  0.3  0.0
S2   0.1  0.5  0.4
S3   0    1−c  c

Leave probability: S1 0.1, S2 0.2, S3 0.95

(a) State transition  (b) Matrix setting

Fig. 2. Process control architecture
Within a cell, the AP can operate a CP or a CFP. For the CFP, the nodes must perform explicit join and leave operations [14] every time a node gets in and out of the transmission range of a gateway. Otherwise, CSMA/CA-based CP operation is enough for the transmission of broadcasts from the gateway. Basically, a vehicle communicates with other vehicles via the DSRC interface within its transmission range. The causes of frame loss are collisions or overflows of the interface queue. To deal with this situation, for each data frame, the MAC layer transmits several control frames for the purpose of collision avoidance. Furthermore, if these control frames are lost due to a collision or bad propagation, the MAC layer will try to retransmit them. However, according to the standard, the automatic MAC layer ACK from the receiver is not mandatory in multicast. Figure 2 describes the connection model of a node to the gateway. We obtained the probability values empirically from the location history data of the Jeju telematics system [15]. This state diagram shows the state transitions a node experiences from when it becomes connected to a mobile gateway until it leaves the cell. A vehicle having a message not yet recovered may leave a cell and join again in a very short time. However, that vehicle may have joined another mobile gateway and received messages with a different sequence. In addition, a vehicle can hand off from one gateway to another. How to handle this situation belongs to other research areas. Hence, we assume that the message queue of a vehicle is reset when it joins a new mobile gateway cell. In addition, each transition occurs at each unit time, which indicates the relation between the mobility of a vehicle and the number of broadcast messages. It is decided by the speed and moving pattern of vehicles and the message frequency. The state diagram consists of three states, and the operation of state transition is straightforward. Above each arrow, the transition probability is shown. For example, a node changes its state from S1 to S2 with a probability of 0.7 at the next time unit. In particular, in S3, a node keeps staying in this state with probability c, a tunable parameter. As shown in Figure 2(b), a node in S1 leaves the mobile gateway with a probability of 0.1, in S2 with 0.2, and in S3 with 0.95. The higher c, the more likely a node leaves a cell. When c is 0.5, the average duration a node stays in a gateway range is 4.95, and we consider this value as the reference stay time. In addition, the inter-arrival time of each vehicle is aligned with this value. A node is associated with a channel which has either of two states, namely, an error state and an error-free state, at any time instant. A channel is defined between each mobile and the AP, and can be modeled as a Gilbert channel [3]. We can denote
the transition probability from state good to state bad by p, and the one from bad to good by q, as shown in Fig. 2. The pair of p and q representing a range of channel conditions is generally obtained by trace-based channel estimation. The average error probability and the average length of a burst of errors are p/(p+q) and 1/q, respectively. A packet is received correctly if the corresponding channel remains in the good state for the whole duration of the packet transmission. Otherwise, it is received in error.
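These two closed-form quantities are easy to confirm with a small Monte-Carlo sketch of the Gilbert channel; the parameter values below are arbitrary and chosen only for the example.

import random

def gilbert_stats(p, q, steps=200000, seed=1):
    """Empirical error probability and mean error-burst length."""
    rng, bad, prev_bad = random.Random(seed), False, False
    errors, bursts = 0, 0
    for _ in range(steps):
        if bad:
            errors += 1
            if not prev_bad:
                bursts += 1                      # a new error burst begins
        prev_bad = bad
        # transition for the next step: stay bad w.p. 1-q, go bad w.p. p
        bad = (rng.random() >= q) if bad else (rng.random() < p)
    return errors / steps, errors / max(bursts, 1)

# theory: error probability p/(p+q) = 0.111..., mean burst length 1/q = 2.5
print(gilbert_stats(p=0.05, q=0.4))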
4 Error Control Scheme

4.1 Control Message Exchange
The first design issue for the error control scheme is how to send feedback to the sender, indicating the packets a client has received and missed. There are some existing methods available for the error report. COPE and ER make a node send reception reports to inform which set of packets it has recently received [9, 16]. Those schemes use selective/cumulative ACKs to minimize the impact of ACK losses. Bitmap-based ACKs can significantly reduce the size and number of messages compared with per-packet ACKs, as illustrated in the sketch below. For the reliability of feedback transmission, they send feedback using MAC-layer unicast, which will automatically retransmit lost feedback. In addition, Kang's scheme also eliminated the per-frame response from the receiver by constructing the error report, as such a response increases the total number of control frames, leading to a substantial increase in the network load [10]. The receiver initializes an error frame list when it receives a frame of a message for the first time. As the receiver hears the Beacon and a stream sends one frame per polling round, the receiver can recognize a frame loss. This scheme makes all frames include a field specifying the number of frames, say u, in the message they belong to, so that the receiver can decide when to report an error frame list as long as it receives at least one frame. When the counter reaches u, the receiver sends an error report back to the sender via a contention period, which can prevent the transmission of error reports from interfering with other messages. Next, the error control scheme should decide when to retransmit the requested message. According to Rozner's work, if the sender retransmits a packet whenever the retransmission queue is non-empty, it achieves the lowest retransmission delay [9]. On the other hand, such aggressive retransmission would reduce or even eliminate coding opportunities. For example, if there is only one packet in the retransmission queue, the packet has to be sent by itself, resulting in zero coding gain. For a good balance between low delay and high coding gain, it is desirable to retransmit the packets when the retransmission queue reaches a certain threshold or the packets in the retransmission queue time out. In addition, the retransmission bandwidth can be reserved in advance by the AP, according to the reliability level required by an application. After all, error reporting and retransmission are well covered by previous work, so we do not have to redesign this part of the error control scheme for vehicular networks.
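A minimal bitmap report over a fixed sequence window might look as follows; the window size and field layout are illustrative assumptions, not a format defined by the cited schemes.

def make_report(window_start, received, window=32):
    """Bit i of the report is set iff packet (window_start + i) was received."""
    bitmap = 0
    for seq in received:
        if 0 <= seq - window_start < window:
            bitmap |= 1 << (seq - window_start)
    return bitmap

def missing(window_start, bitmap, window=32):
    return [window_start + i for i in range(window) if not (bitmap >> i) & 1]

report = make_report(100, {100, 101, 103, 105}, window=8)
print(missing(100, report, window=8))   # -> [102, 104, 106, 107]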
4.2 Selection Criteria
Based on the message exchange procedure, the mobile gateway triggers the error control procedure periodically. We assume that the number of messages in each period is an application-specific parameter. After collecting the error reports, the error control scheme decides which packet to retransmit. Due to the inherent nature of broadcast, a single message can be received by multiple vehicles. Intuitively, it looks desirable to retransmit the message nodes have missed most. Accordingly, the first step is to order the n messages (M_1, M_2, ..., M_n) broadcast during the last time window by F(M_i), the frequency with which M_i appears in the error reports from the nodes in a cell. The number of retransmissions depends on the bandwidth available for error recovery in the period. Here, in a single period, a message can be retransmitted just once, even if sufficient bandwidth remains. That is, there is no retransmission of an already retransmitted message within a single period, as that would need another feedback round from the receivers. We call this scheme maximum selection. However, the maximum selection scheme does not take into account the mobility of vehicles. If a vehicle having many messages it wants retransmitted gets disconnected from the gateway, there is no chance to recover them. On the contrary, even though a message is requested by many vehicles, if they are likely to stay until the next retransmission period, it can be safe to defer its retransmission. As a result, the mobility factor of a requester should be considered by the gateway in deciding which message to retransmit. So, this section proposes that the messages be ordered by the following criterion:

$F(M_i) + \alpha N(V_j \mid M_i)$  (1)
where $V_j \mid M_i$ denotes a vehicle $V_j$ that requested $M_i$, while $N(V_j \mid M_i)$ is the total number of such vehicles $V_j$. We found that a value of $\alpha = 0.5$ works quite well over the whole range of system configurations.
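A small sketch of this selection step follows, under the reading that N(V_j | M_i) counts the requesters of M_i that are about to leave the gateway range, consistent with the precedence rule stated above; the data structures and the credit limit are illustrative assumptions.

def select_for_retransmission(error_reports, leaving, alpha=0.5, credits=2):
    """error_reports: {message_id: set of requesting vehicle ids};
    leaving: ids of vehicles judged about to exit the cell."""
    def score(item):
        msg, requesters = item
        return len(requesters) + alpha * len(requesters & leaving)  # eq. (1)
    ranked = sorted(error_reports.items(), key=score, reverse=True)
    return [msg for msg, _ in ranked[:credits]]

reports = {7: {"v1", "v2"}, 9: {"v3"}, 4: {"v4", "v5", "v6"}}
print(select_for_retransmission(reports, leaving={"v1", "v3"}))  # -> [4, 7]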
5 Performance Measurement
This section measures the performance of the proposed error control scheme via simulation using SMPL, which basically provides a discrete event scheduler [17]. For the simplicity of the simulation, we assume that every vehicle has common error characteristics, to concentrate on the performance of the retransmission scheme. The first experiment compares the recovery ratios of the maximum selection scheme and the proposed scheme. In this experiment, we set the average credit to 1, 2, and 3, and each case is plotted in Figure 3. If a period has n credits, the gateway can retransmit up to n messages in that period. The average inter-arrival time, λ, is parameterized in terms of the average connection duration time, μ, and is set to 0.5μ. The transition probability is set to 0.7. We changed the loss ratio from 0 % to 5 % and measured the number of recovered messages for the specified cases. The number of recovered messages for
1.2 "Max" "cr=1" "cr=2" "cr=3"
1.15
Recovery
1.1 1.05 1 0.95 0.9 0
1
2 3 Error rate(%)
4
5
Fig. 3. Process control architecture
1.1 "Max" "err=1%" "err=3%" "err=5%"
Recovery
1.05
1
0.95
0.9 0
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Transition probability
1
Fig. 4. Process control architecture
the maximum selection scheme is set to 1.0 and is shown as a horizontal line in the figure. Those for the proposed scheme are normalized by this value. The result reveals that the proposed scheme does not always show better performance than the maximum selection scheme. However, more points are marked above 1.0, regardless of the credit value. The improvement reaches up to 12 % when the credit value is 2 and the loss rate is 6 × 10^{-3}. The gap becomes insignificant as the error rate increases. Figure 4 plots the number of recovered packets according to the probability of the state transition from S2 to S3. In this experiment, λ is also set to 0.5μ, and the number of recoveries for each case is also normalized to the maximum selection case. The AP broadcasts 10 messages and has 2 credits on average per unit time.
As in Figure 3, more points are marked above 1.0 in Figure 4. The higher the transition probability, the more nodes stay in state S3; that is, the more nodes are about to exit the transmission range. When the probability is around 0.5, S3 nodes and the others are appropriately mixed, so the proposed scheme can enhance the recovery ratio the most. When the transition probability approaches 1.0, most vehicles are near the boundary of a cell. In this case, all curves are above 1.0, even though the performance gap is not remarkable.
6 Conclusion
This paper has proposed and measured the performance of an error control scheme for multicast messages on vehicular telematics networks, especially considering the mobile gateway network. The proposed procedure consists of collecting error reports from the member vehicles, deciding which packets to retransmit according to the available bandwidth, and finally retransmitting those messages. Periodically triggered by an access point on the mobile gateway, the control scheme basically selects the message vehicles have missed most, giving additional precedence to messages belonging to a vehicle that is about to leave the gateway. This compensates for the probability that a message loses its chance to be recovered after the vehicle is disconnected from the gateway. The performance measurement results, obtained via simulation using a discrete event scheduler, show that the proposed scheme can enhance the number of recovered messages by up to 12 % compared with the maximum selection scheme, showing a better recovery ratio over almost all ranges of the given parameters. As future work, we are planning to begin research on delegating an error control agent to the error-prone part of vehicular ad-hoc networks to reduce the size of the control loop, as VANET is considered a good candidate for vehicular safety applications in diverse areas and still has many ongoing research issues for reliable multicast.
References
1. US Depart of Transportation. Vehicle safety communication project-final report. Technical Report HS 810 591 (2006), http://www-nrd.nhtsa.dot.gov/departments/nrd-12/pubs_rev.html
2. Hazas, M., Scott, J., Krumm, J.: Location-Aware Computing Comes of Age. IEEE Computer Magazine 37(2), 95–97 (2004)
3. Shah, S., Chen, K., Nahrstedt, K.: Dynamic bandwidth management for single-hop ad hoc wireless networks. ACM/Kluwer Mobile Networks and Applications (MONET) Journal 10, 199–217 (2005)
4. Society of Automotive Engineers: Dedicated short range communication message set dictionary. Tech. Rep. Standard J2735, SAE (2006)
5. Namboodiri, V., Agrawal, M., Gao, L.: A study on the feasibility of mobile gateways for vehicular ad-hoc networks. ACM VANET, 66–75 (2004)
6. IEEE 802.11-1999: Part 11 - Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications (1999), http://standards.ieee.org/getieee802
7. He, T., Stankovic, J., Lu, C., Abdelzaher, T.: SPEED: A real-time routing protocol for sensor networks. University of Virginia Tech. Report CS-2002-09 (2002)
8. Lu, A., Chen, T., Steenkiste, P.: Video streaming over 802.11 WLAN with context-aware adaptive retry. In: IEEE International Conference on Multimedia and Expo (2005)
9. Rozner, E., Padmanabha, A., Mehta, Y., Qiu, L., Jafry, M.: ER: Efficient Retransmission Scheme for Wireless LANs. In: Proc. of CoNext (2007)
10. Kang, M., Kim, S., Park, G., Kang, M., Kwak, H., Kwon, H., Lee, J.: A location-aware error control scheme of route multicast for moving objects. In: Nguyen, N.T., Grzech, A., Howlett, R.J., Jain, L.C. (eds.) KES-AMSTA 2007. LNCS, vol. 4496, pp. 982–989. Springer, Heidelberg (2007)
11. Vaughan, R.T., Stoy, K., Sukhatme, G.S., Mataric, M.J.: Whistling in the dark: cooperative trail following in uncertain localization space. In: Proc. Int. Conf. Autonomous Agents, pp. 187–194 (2000)
12. Green, P.: Driver Distraction, Telematics Design, and Workload Managers-Safety Issues and Solutions. In: Proceedings of the 2004 International Congress on Transportation Electronics, pp. 165–180 (2004)
13. Lee, J., Park, G., Kang, M.: A message scheduling scheme in hybrid telematics networks. In: Gervasi, O., Murgante, B., Laganà, A., Taniar, D., Mun, Y., Gavrilova, M.L. (eds.) ICCSA 2008, Part I. LNCS, vol. 5072, pp. 769–779. Springer, Heidelberg (2008)
14. Song, S., Han, S., Mok, A., Chen, D., Nixon, M., Lucas, M., Pratt, W.: WirelessHART: Applying Wireless Technology in Real-Time Industrial Process Control. In: The 14th IEEE Real-Time and Embedded Technology and Applications Symposium, pp. 377–386 (2008)
15. Lee, J., Park, G., Kim, H., Yang, Y., Kim, P., Kim, S.: A telematics service system based on the Linux cluster. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4490, pp. 660–667. Springer, Heidelberg (2007)
16. Katti, S., Rahul, H., Hu, W., Katabi, D., Medard, M., Crowcroft, J.: XORs in the air: Practical wireless network coding. In: Proc. of ACM SIGCOMM (2006)
17. MacDougall, M.: Simulating Computer Systems: Techniques and Tools. MIT Press, Cambridge (1987)
ODDUGI: Ubiquitous Mobile Agent System

SungJin Choi1, Hyunseung Choo1, MaengSoon Baik2, HongSoo Kim3, and EunJoung Byun4

1 School of Information and Communication Engineering, Sungkyunkwan University
[email protected], [email protected]
2 SAMSUNG SDS
[email protected]
3 SAMSUNG ELECTRONICS CO.
[email protected]
4 Kibo Technology Fund
[email protected]
Abstract. A mobile agent is regarded as an attractive technology for developing distributed applications in mobile and ubiquitous computing environments. In this paper, we present ODDUGI, a Java-based ubiquitous mobile agent system. The ODDUGI mobile agent system provides fault tolerance, security, location management and message delivery mechanisms in a multi-region mobile agent computing environment. We describe the architecture, design concepts and main features of ODDUGI. In addition, we present the One-Touch Campus Service application, developed on the basis of ODDUGI, in mobile and ubiquitous computing environments.
1 Introduction
In distributed systems, nodes connected by networks use various communication paradigms, such as message passing, RPC (Remote Procedure Call) and remote execution, in order to exchange information and share their resources with one another [1, 3, 52]. Recently, the extension of computer networks to portable devices such as cellular phones, PDAs and laptops has led to the integration of wired and wireless networks. In other words, the development of wireless technology lets nomadic users access computing services anywhere, enabling the advent of mobile and ubiquitous computing [5]. However, mobile and ubiquitous computing encounters some problems, such as relatively poor resources and low-reliability communication channels (i.e., low bandwidth, high latency or intermittent disconnection). Thus, when developing distributed applications in such an environment, we need to adopt a new communication paradigm (i.e., mobile agent technology) in order to adapt to such an integrated and heterogeneous computer network. A mobile agent is a software program that migrates from one node to another while performing given tasks on behalf of a user [1–4, 7]. A mobile agent has
Corresponding author.
some benefits, as follows [1–3, 5–8]: 1) a mobile agent can reduce network load and latency; 2) it can cope with frequent and intermittent disconnection; 3) it enables dynamic service customization and software deployment; 4) it can adapt to heterogeneous environments as well as dynamic environmental changes. In all respects, mobile agent technology has been used widely in distributed computing, mobile computing, grid computing and ubiquitous computing. It has been used for applications such as distributed information retrieval, electronic commerce, distributed network management and parallel computing [1–3, 8, 17]. Recently, mobile agents have been largely used for resource monitoring and management as well as scheduling mechanisms in grid computing environments [20], resource discovery in peer-to-peer computing environments [21], and service discovery, composition and context awareness in ubiquitous computing environments [22, 23]. A mobile agent system is a platform that can create, interpret, execute, transfer, and terminate mobile agents [4, 7, 8]. Some distinguished mobile agent systems have been developed: Telescript [7], Aglets [8], Voyager [9], Concordia [10], Mole [11], JAMES [12], Ajanta [13], Ara [14], D'Agent [15], MobileSpaces [16], MAP [17], MESSENGERS [18], TACOMA [19], and so on. Most mobile agent systems are implemented in the Java language, except Telescript (Telescript), Ara (C, C++, Tcl), MESSENGERS (M0), D'Agent (Java, Tcl, Scheme), and TACOMA (C). Java is the language most often selected for mobile agent systems [8–13, 15, 16]. In this paper, we present a ubiquitous mobile agent system called "ODDUGI"1. The ODDUGI mobile agent system [53] was developed using Java. ODDUGI aims to support high reliability, security, and efficient location management and message delivery mechanisms between mobile agents. ODDUGI is deployed in a multi-region mobile agent computing environment organized into many regions. We have also developed One-Touch Campus Service applications on the basis of ODDUGI to show how ODDUGI can be used in mobile and ubiquitous computing environments. The rest of the paper is structured as follows. Section 2 describes the architecture and key concepts of the ODDUGI mobile agent system. Section 3 presents some notes about the implementation of the ODDUGI system. Section 4 illustrates the One-Touch Campus Service applications. Section 5 concludes the paper.
2 ODDUGI Mobile Agent System
In this section, we first present the architecture of the ODDUGI mobile agent system. Then, we describe the multi-region mobile agent computing environment in which ODDUGI is deployed. Finally, we illustrate the main design concepts: fault tolerance, security, location management and message delivery.

1 ODDUGI is a Korean word which means a self-righting toy. A self-righting toy doll can rock over and then right itself. Our project team seeks to bring the spirit of the self-righting toy doll into the realm of mobile agent computing, in order to convey that our mobile agent system guarantees the reliable, secure and fault tolerant execution of mobile agents.
2.1 Architecture
The ODDUGI mobile agent system is a platform that can create, execute, transfer, and terminate mobile agents. ODDUGI provides all the primitives needed for the development and management of mobile agents. It provides basic primitives such as creation, execution, cloning, migration, activation, deactivation, and termination. It also provides extended primitives such as communication, fault tolerance, security, location management and message delivery, and monitoring tools. ODDUGI consists of agent, place and runtime layers, as shown in Figure 1. The runtime layer initiates the place and various managers: the resource manager, communication manager and security manager. The communication manager is the daemon entity that listens on a certain port, waiting for mobile agents or messages from other nodes. It cooperates with the location management & message delivery and messaging parts in the place layer. The resource manager controls the system properties and resources used in the ODDUGI mobile agent system. The security manager is responsible for the security of the ODDUGI mobile agent system and cooperates with the security part in the place layer. The place layer is an execution environment for mobile agents. It provides core functionalities such as creation, execution, cloning, migration, retraction, activation, deactivation, termination and messaging, as well as enhanced mechanisms such as fault tolerance, security, and location management and message delivery of mobile agents. It also provides the GUI-based system manager, which makes it easy to manage mobile agent systems. The agent layer provides application developers with APIs such as migration, clone, activation and communication
[Figure: three-layer architecture. Agent layer: application API over a mobile agent object (code, data, information, mobility metadata). Place layer: GUI-based system manager, fault tolerance, security, location management & message delivery, and the create, clone, migrate, retract, activate, deactivate, terminate and messaging primitives. Runtime layer: resource manager, security manager and communication manager.]
Fig. 1. Architecture of the ODDUGI mobile agent system
when implementing mobile agents. It allows application developers to create, execute, clone, migrate, retract, activate, deactivate and terminate mobile agents, as well as to interact with other agents. A mobile agent object, which consists of code, data, various information (i.e., identifier, creator, time, codebase, etc.) and mobility metadata (i.e., itinerary, state, results, etc.), can be manipulated in this layer.

2.2 Multi-region Mobile Agent Computing Environment
In this paper, we assume a multi-region environment organized into many regions. The environment is composed of five components: mobile agent, node (i.e., place), region, region server (RS), and lookup service server (LSS). Figure 2 shows a multi-region mobile agent computing environment.
− Mobile Agent: A mobile agent is a mobile object that consists of code, data, state, and mobility metadata. The mobile agent migrates from one node to another while performing a task autonomously on behalf of a user. When a mobile agent migrates, it chooses the next destination according to an itinerary, that is, a predefined travel plan, or dynamically according to execution results. A path from a source to the last destination is called a migration path.
[Figure: regions Region1 through RegionN connected through the Internet; a lookup service server (LSS) provides the global lookup, each region has a region server (e.g., RS_{k+2}), and each region contains nodes Node0 through NodeM; a mobile agent follows a migration path across nodes and regions.]
Fig. 2. Multi-region mobile agent computing environment
− Node: A node (i.e., place) is the execution environment for mobile agents; that is, a mobile agent system is installed at the node. A node offers a specific service, and a mobile agent executes tasks on a sequence of nodes. The node that first creates a mobile agent is called its home node.
− Region: A region is a set of nodes that have the same authority.
− Region Server: A region server (RS) is responsible for the authority of its region [4]. The RS provides a naming service for mobile agents created within its region, cooperating with its LSS. It performs location management for the mobile agents located within its region by maintaining their location information.
− Lookup Service Server: A lookup service server (LSS) provides the lookup service for mobile agents and nodes. The LSS maintains the location information for the mobile agents created in all regions. In addition, it maintains the services offered by each node.

In such an environment, a mobile agent executes tasks on a sequence of nodes while migrating among nodes in several regions. Each action that a mobile agent performs on a node is called a stage. The execution of a mobile agent on a node results in a new internal state of the mobile agent as well as, potentially, a new state of the place. Therefore, the mobile agent in a previous stage is different from the one in the current stage.
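As a compact illustration of the location bookkeeping implied by these components, the following sketch models an RS tracking agents within its region and an LSS tracking agents across regions. All class names and fields here are illustrative assumptions, not ODDUGI code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative data model for the components described above.
class Node   { String id; String regionId; }        // a place running an agent system
class Region { String id; String regionServerId; }  // a set of nodes with the same authority

// A region server tracks which node each agent in its region occupies.
class RegionServer {
    private final Map<String, String> agentToNode = new HashMap<>(); // agentId -> nodeId
    void register(String agentId, String nodeId) { agentToNode.put(agentId, nodeId); }
    String locate(String agentId)                { return agentToNode.get(agentId); }
}

// The lookup service server keeps global, region-level location information.
class LookupServiceServer {
    private final Map<String, String> agentToRegion = new HashMap<>(); // agentId -> regionId
    void register(String agentId, String regionId) { agentToRegion.put(agentId, regionId); }
    String regionOf(String agentId)                { return agentToRegion.get(agentId); }
}
```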
2.3 Fault Tolerance
Fault tolerance is essential to the development of a reliable mobile agent system in order to guarantee continuous execution of mobile agents [24–29]. During the execution of mobile agents, link and node failures lead to the partial or complete loss of mobile agents or to the blocking of their executions. Fault-tolerant protocols therefore guarantee that the executions of mobile agents proceed in spite of failures; they also ensure that mobile agents eventually arrive at their destinations, keeping the final results of their executions. It is difficult to build a fault-tolerant mobile agent system, since not only the failures of nodes to which mobile agents migrate, but also the failures of communication between any two nodes, must be dealt with [30, 31, 51]. Moreover, asynchronous distributed systems such as the Internet increase the difficulty: since there is no bound on communication delays in an asynchronous distributed system, it is impossible to distinguish between a slow processor and a failed processor [51, 52]. In other words, it is impossible to distinguish a mobile agent that has failed due to node or link failures from one that is merely slow due to a slow communication link or processor.

The ODDUGI mobile agent system provides two fault-tolerant protocols in a multi-region mobile agent computing environment: the optimistic temporal replication based approach (OTRBA) [32, 33] and the region-based stage construction (RBSC) protocol [34]. In OTRBA, we classify the exactly-once property for fault-tolerant mobile agent execution into three categories: the strictly exactly-once, one-free exactly-once, and k-free exactly-once properties. ODDUGI provides protocols that can accommodate the requirements of various applications.
This increases the flexibility of the fault-tolerance guarantee. In particular, OTRBA guarantees the exactly-once property by observing the one-free exactly-once property in each stage. In addition, to solve the scalability problem incurred when existing fault-tolerant protocols are applied to a multi-region mobile agent computing environment, OTRBA uses a region server to audit the exactly-once property when a mobile agent migrates to a node in a different region.

When existing spatial-replication-based approaches are applied to a multi-region mobile agent computing environment, they incur a high overhead of stage work because they construct stages regardless of regions. RBSC solves this problem by using quasi-participants and substages to bring places located in different regions within a stage into the same region. RBSC decreases the overhead of stage work and thereby reduces the total execution time of mobile agents.
2.4 Security
Security must be taken into account in developing a mobile agent system in order to provide reliable and secure execution of mobile agents [35–38]. In a mobile agent computing environment, it is more difficult to implement security mechanisms than in traditional distributed systems, because a mobile agent is a mobile entity [35, 38]. Moreover, it acts on behalf of a user, and many identities are involved, such as the agent developer, the agent owner and the agent dispatcher (i.e., a user or the host the mobile agent visited last).

The ODDUGI mobile agent system provides various security mechanisms. Like most mobile agent systems, ODDUGI provides challenge-response based authentication, ACL-based authorization, encryption, and one-way hashing, not only to protect mobile agents from insecure nodes, but also to protect a node from malicious mobile agents. In particular, ODDUGI provides a protocol that protects the sensitive data of a mobile agent from malicious nodes in a multi-region mobile agent computing environment [39]. In this protocol, the encrypted sensitive data of a mobile agent is separated from the components required for execution at a node, and the encrypted sensitive data itself consists of a pre-image and a residual-image. A mobile agent can proceed without fully using its sensitive data; that is, a mobile agent carrying only the pre-image can proceed without using the residual-image. This pattern of agent execution is called post-confirmation, and the protection protocol makes use of it: when a host needs more sensitive data (i.e., the residual-image) during the execution of a mobile agent, it reserves the sensitive data, and it can execute the mobile agent again once the data has been obtained. As a result, the protection protocol prevents the leakage and improper usage of the sensitive data of mobile agents. In addition, to apply it to a multi-region mobile agent computing environment, the protection protocol also uses a region server when a mobile agent migrates to a node in a different region; consequently, it solves the scalability problem that arises in such an environment.
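A minimal sketch of the pre-image/residual-image split and the post-confirmation pattern described above, under the assumption that the encrypted sensitive data can simply be partitioned at a chosen split point; the real protocol in [39] additionally involves the encryption details and the region server.

```java
import java.util.Arrays;

// Illustrative split of encrypted sensitive data; names are our assumptions.
class SensitiveData {
    final byte[] preImage;       // carried by the agent, enough to proceed
    final byte[] residualImage;  // withheld until the host confirms it is needed

    SensitiveData(byte[] encrypted, int split) {
        this.preImage      = Arrays.copyOfRange(encrypted, 0, split);
        this.residualImage = Arrays.copyOfRange(encrypted, split, encrypted.length);
    }
}

class PostConfirmationHost {
    // The host first executes the agent with the pre-image only.
    boolean executeWithPreImage(byte[] preImage) {
        return preImage.length > 0;  // placeholder for real agent execution
    }

    // Only if more data is needed does the host reserve the residual image
    // (in the real protocol this request is mediated by the region server).
    byte[] reserveResidual(SensitiveData d) { return d.residualImage; }
}
```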
2.5 Location Management and Message Delivery
Location management and message delivery protocols are fundamental to the further development of mobile agent systems in a multi-region mobile agent computing environment, not only to control mobile agents but also to guarantee message transfer between them [40–48]. It is more difficult to implement location management and message delivery in a multi-region mobile agent computing environment because of the frequent mobility of mobile agents, as well as the limited network bandwidth in a widely distributed computing environment such as the Internet. We propose two new location management and message delivery protocols for a multi-region mobile agent computing environment: a Broadcast-based Message Transferring protocol [49] and a Reliable Asynchronous Message Delivery (RAMD) protocol [50].

The Broadcast-based Message Transferring protocol [49] is based on the previous broadcast model; however, it broadcasts a notification that a message has been received instead of the message itself, which reduces the communication cost. In this protocol, the home node (HN) maintains the identity of the current RS in whose region a mobile agent is located, and each RS maintains the names of the mobile agents located in its region. The location registration phases are performed differently according to the type of migration: in the case of intra-region migration there are no location update messages, whereas in the case of inter-region migration a mobile agent updates its location information at the previous RS, the current RS and the HN. In the message delivery phase, the sending mobile agent (i.e., the sender) first sends a request message to the LSS to obtain the location information of the corresponding mobile agent; the LSS replies with the address of that agent's HN, and the sender sends the message to the HN. If the mobile agent is located in the home region, the HN broadcasts the receiving notification of this message to all nodes in the home region; the nodes store the notification for a fixed period of time in order to handle a mobile agent in transit between two nodes. When the HN receives a reply from the node where the mobile agent is located, it forwards the message to that node (i.e., to the mobile agent). If, on the other hand, the mobile agent is located in a region other than the home region, the HN sends the message to the current RS; the RS stores the message, broadcasts the receiving notification to all nodes in its region, and forwards the message to the node from which it receives a reply. In this way, the proposed protocol makes use of the LSS, RS and HN in a multi-region mobile agent computing environment, which solves the scalability problem. A sketch of the notification-based delivery step follows.
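The sketch below illustrates the core idea: only a small notification is broadcast, and the stored message is forwarded to the single node that replies. All names are illustrative assumptions; the actual implementation resides in the LSS, RS and HN code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Sketch of the notification-broadcast delivery step described above.
class HomeNode {
    private final List<RegionNode> regionNodes;
    HomeNode(List<RegionNode> regionNodes) { this.regionNodes = regionNodes; }

    void deliver(String agentId, String message) {
        // 1. Broadcast only the notification, not the message itself.
        Optional<RegionNode> hit = regionNodes.stream()
                .filter(n -> n.notifyPending(agentId))   // a node replies if it hosts the agent
                .findFirst();
        // 2. Forward the stored message only to the node hosting the agent.
        hit.ifPresent(n -> n.receive(agentId, message));
    }
}

class RegionNode {
    private final Set<String> localAgents = new HashSet<>();
    void host(String agentId)             { localAgents.add(agentId); }
    boolean notifyPending(String agentId) { return localAgents.contains(agentId); }
    void receive(String agentId, String m) { System.out.println(agentId + " <- " + m); }
}
```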
The RAMD protocol [50] consists of a location management procedure and a blackboard-based asynchronous message delivery procedure. To deliver messages reliably and asynchronously, RAMD exploits a blackboard (i.e., a shared information space for message exchange) placed in each region, and message delivery is tightly coupled with agent migration. The protocol comprises a creation & migration phase and a message delivery phase. The creation & migration phase registers the location of a mobile agent with its LSS, HN, or RS when the agent is created or when it migrates to another node. A migration is either intra-region (the mobile agent migrates within the same region) or inter-region (the mobile agent migrates to another region), and the location registration procedure differs accordingly: for an intra-region migration, the mobile agent sends a location update message only to the RS with which it is associated; for an inter-region migration, it sends location update messages to the HN, the previous RS and the current RS. The message delivery phase delivers a message to a mobile agent after locating it. First, a sender finds the address of the HN by contacting the LSS and sends a message to the HN. The HN then finds the address of the current RS and forwards the message to it, and the RS puts the message on its blackboard. Finally, when the RS receives a location update message from a mobile agent, it checks its blackboard; if a message is present, the RS retrieves it from the blackboard and delivers it to the mobile agent. In this way, the RAMD protocol decreases the cost of location management and message delivery, and it solves the aforementioned problems with low communication cost. Furthermore, the RAMD protocol handles the location management and message delivery of cloned, parent and child mobile agents, so it guarantees message delivery for these agents as well.
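The following sketch illustrates the two RAMD procedures, assuming a simplified region server that keeps both the location table and the blackboard: an intra-region migration updates only the current RS, an inter-region migration also updates the HN and the previous RS, and pending messages are drained on each location update. Names and signatures are our assumptions, not the ODDUGI classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified RS combining a location table with a per-agent blackboard.
class RamdRegionServer {
    private final Map<String, String> agentToNode = new HashMap<>();
    private final Map<String, Deque<String>> blackboard = new HashMap<>();

    void postMessage(String agentId, String m) {           // message delivery phase
        blackboard.computeIfAbsent(agentId, k -> new ArrayDeque<>()).add(m);
    }

    // Called on every location update; pending messages are drained here.
    List<String> updateLocation(String agentId, String nodeId) {
        agentToNode.put(agentId, nodeId);
        Deque<String> q = blackboard.remove(agentId);
        return q == null ? Collections.emptyList() : new ArrayList<>(q);
    }
}

class RamdMigration {
    // Creation & migration phase: registration differs by migration type.
    static void migrate(String agentId, String toNode, RamdRegionServer currentRs,
                        RamdRegionServer previousRs, RamdRegionServer homeNode,
                        boolean interRegion) {
        List<String> pending = currentRs.updateLocation(agentId, toNode);
        if (interRegion) {                      // also notify HN and previous RS
            previousRs.updateLocation(agentId, toNode);
            homeNode.updateLocation(agentId, toNode);
        }
        pending.forEach(m -> System.out.println("delivered to " + agentId + ": " + m));
    }
}
```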
3 Implementation
The ODDUGI mobile agent system has been implemented in the Java language (JDK 1.3.1 or later). It provides fault tolerance, security, and location management & message delivery.

Fault tolerance. The ODDUGI mobile agent system provides fault-tolerant execution of mobile agents in a multi-region mobile agent computing environment. The fault-tolerant protocols are implemented in the LSS, RS, and HN. In addition, ODDUGI provides a GUI-based fault tolerance manager offering activation, deactivation and snapshot functionality, as shown in Figure 3. Deactivation enables a user to suspend the execution of a mobile agent and to store the agent as a stream of bytes on the local disk, as shown in Figure 3(a). The deactivation of a mobile agent is carried out with the method deactivateMobileAgent in the OddugiPlace class and the method deactivate in the MobileAgent class at the application level. ODDUGI also provides a snapshot mechanism, which is similar to deactivation except that the mobile agent continues running after its state has been saved, as shown in Figure 3(a). The snapshot of a mobile agent is carried out with the method snapshotMobileAgent in the OddugiPlace class and the method snapshot in the MobileAgent class at the application level.
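Hypothetical usage of the fault-tolerance primitives just named; the method names come from the text, but the parameter lists and the stub OddugiPlace shown here are our assumptions.

```java
// Stand-in so the sketch compiles; the real OddugiPlace offers many more methods.
class OddugiPlace {
    void snapshotMobileAgent(String id)   { save(id); /* agent keeps running */ }
    void deactivateMobileAgent(String id) { save(id); /* then suspend the agent */ }
    void activateMobileAgent(String id)   { /* deserialize the agent and run it again */ }
    private void save(String id)          { /* serialize agent 'id' to local disk */ }
}

class AgentStateDemo {
    static void checkpoint(OddugiPlace place, String agentId) {
        place.snapshotMobileAgent(agentId);    // persist state, agent continues
        place.deactivateMobileAgent(agentId);  // suspend and store as a byte stream
        place.activateMobileAgent(agentId);    // later: deserialize and resume
    }
}
```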
Fig. 3. GUI-based fault tolerance manager
Activation enables a user to reactivate a mobile agent that was previously deactivated, as shown in Figure 3(b): the serialized mobile agent is deserialized and then run again. The activation of a mobile agent is carried out with the method activateMobileAgent in the OddugiPlace class and the method activate in the MobileAgent class at the application level.

Security. The ODDUGI mobile agent system provides a GUI-based security manager that facilitates the configuration of the security policy, as shown in Figure 4. The Agent Permission protects the mobile agent from malicious nodes; agent security is carried out with the methods checkPermission, checkAgentPermission, applyAgentPermission and removeAgentPermission in the OddugiPlace class, together with the AgentPermission and OddugiSecurityManager classes in the runtime layer. The Place Permission protects the node (i.e., place) from malicious mobile agents; place security is carried out with the methods applyPlacePermission and removePlacePermission in the OddugiPlace class, together with the PlacePermission and OddugiSecurityManager classes in the runtime layer.
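Hypothetical configuration of the two permission types; the class and method names (AgentPermission, PlacePermission, OddugiSecurityManager, applyAgentPermission, applyPlacePermission, checkPermission) come from the text, while the constructors, permission format and policy store shown here are our assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class AgentPermission { final String agent, action;
    AgentPermission(String a, String act) { agent = a; action = act; } }

class PlacePermission { final String agent, action; final boolean allowed;
    PlacePermission(String a, String act, boolean ok) { agent = a; action = act; allowed = ok; } }

class OddugiSecurityManager {
    private final Map<String, Boolean> placePolicy = new HashMap<>();
    void applyAgentPermission(AgentPermission p) { /* grant p.action to p.agent */ }
    void applyPlacePermission(PlacePermission p) {
        placePolicy.put(p.agent + ":" + p.action, p.allowed);
    }
    boolean checkPermission(String agent, String action) {
        return placePolicy.getOrDefault(agent + ":" + action, false); // deny by default
    }
}

class SecurityDemo {
    public static void main(String[] args) {
        OddugiSecurityManager sm = new OddugiSecurityManager();
        sm.applyAgentPermission(new AgentPermission("agent-42", "read-results"));
        sm.applyPlacePermission(new PlacePermission("agent-42", "open-socket", false));
        System.out.println(sm.checkPermission("agent-42", "open-socket")); // false
    }
}
```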
Fig. 4. GUI-based security manager
Figure 4(a) shows how to configure the security policy of a mobile agent; Figure 4(b) shows how to configure the security policy of the place.

Location management & message delivery. The ODDUGI mobile agent system implements the RAMD protocol and the Broadcast-based Message Transferring protocol in a multi-region mobile agent computing environment. The protocols are implemented in the LSS, RS and HN, and ODDUGI provides a GUI-based location manager. The LSS maintains a list of mobile agents with entries such as Home Place, AgentID, AgentName, and Home RS, as shown in Figure 5(a). The RS maintains a list of mobile agents with entries such as AgentID, AgentName, Home Place, Current Place and Home RS, as shown in Figure 5(b); in addition, it maintains a list of places. The HN maintains a list of mobile agents with entries such as AgentID, AgentName, Current Place, Home RS, and Current RS, as shown in Figure 5(c).

Location management is implemented in the OddugiFinderRMIImpl and OddugiPlace classes. OddugiFinderRMIImpl can act as an LSS, an RS or an HN according to its parameter settings. Methods such as registerAgent(), registerPlace(), registerRegion() and registerService() in the OddugiFinderRMIImpl class are used during the creation procedure.
Fig. 5. GUI-based location manager
Methods such as updateAgentLocation() and updateAgentLocationToPreviousRS() in the OddugiFinderRMIImpl class are used during the migration procedure, and the class provides methods such as unRegisterAgent(), unRegisterPlace(), unRegisterRegion() and unRegisterService() to remove location information. The OddugiPlace class provides functions that create, interpret, execute, clone, activate, deactivate, transfer and terminate mobile agents. When an agent is created or migrates, it registers, updates and deletes its information at the LSS, RS and HN using methods such as registerAgentToHN(), registerAgentToRS(), registerAgentToLS(), registerPlace(), updateAgentLocationToRS(), updateAgentLocationToPreviousRS(), updateAgentLocationToHN(), removeAgentRegisterFromLS(), removeAgentRegisterFromRS() and removeAgentRegisterFromHN(). Methods such as registerServiceToLS(), registerRegionToLS(), removeServiceRegisterFromLS() and removeRegionRegisterFromLS() are used to register and remove services and regions.

Message delivery is implemented in the OddugiFinderRMIImpl, OddugiPlace and MobileAgent classes. To find the current location of an agent, methods such as lookupAgent() and lookupPlace() in the OddugiFinderRMIImpl class are used, along with methods in the OddugiPlace class such as lookupCurrentRSFromHN(), lookupCurrentPlaceFromRS() and lookupCurrentRSFromHomeRS(). ODDUGI provides synchronous and asynchronous communication between mobile agents. Synchronous communication is achieved through methods such as sendMessage(), sendMulticast() and receiveMessage() in the OddugiPlace class, as well as sendMessage(), multicast() and handleMessage() in the MobileAgent class, which provides an API for users to implement messaging at the application level. Asynchronous communication is performed through methods such as sendAsyncMessage() and sendAsyncMulticast() in the OddugiPlace class, as well as sendAsyncMessage() and asyncMulticast() in the MobileAgent class. To maintain messages on the blackboard in an RS, methods such as putMessage(), getMessages() and withdrawMessages() in the OddugiFinderRMIImpl class are used.
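Hypothetical messaging calls based on the method names listed above; the argument lists and the interface shown are our assumptions.

```java
// Stand-in for the MobileAgent messaging API named in the text.
interface MobileAgentApi {
    void sendMessage(String to, String body);      // synchronous
    void sendAsyncMessage(String to, String body); // asynchronous, via the blackboard
    void multicast(String[] to, String body);
}

class MessagingDemo {
    static void talk(MobileAgentApi agent) {
        // Synchronous: the call is delivered to the peer's handleMessage().
        agent.sendMessage("agent-7", "status?");
        // Asynchronous: queued on the region blackboard and delivered on the
        // receiver's next location update, as in the RAMD protocol.
        agent.sendAsyncMessage("agent-7", "partial-result=0.42");
        agent.multicast(new String[] {"agent-7", "agent-9"}, "abort");
    }
}
```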
4 One-Touch Campus Services Applications in Ubiquitous Computing
In this section, we present the One-Touch Campus Service applications to show how ODDUGI can be applied in mobile and ubiquitous computing environments. The One-Touch Campus Service applications automatically provide various services for students, such as certificate application, class information, book reservation, email checking, and so on.
4.1 Scenario
CheolSoo is a student at Korea University. He is now at home, preparing to go to school. Before leaving, he wants to check the schedules and notices of his classes as well as his email. In addition, he has a lot of things to do at
school today. First, he has to apply for a certificate of school record and a certificate of graduation at the one-stop service center; second, he needs to borrow books from the library. CheolSoo submits these requests to a campus mobile agent using his cellular phone or PDA. On his way to school, his handheld shows the results, covering not only class and email information, e.g., a cancelled class or appointment, but also the certificate applications and book reservations, so he can arrange his schedule without any problems. As soon as he enters the one-stop service center, he can immediately pick up the certificate of school record and the certificate of graduation that the campus mobile agent has already applied for, and he can immediately borrow the books that the campus mobile agent has already reserved.
4.2 Implementation
We have implemented the One-Touch Campus Services applications on the basis of the ODDUGI mobile agent system. We constructed five services to support them: a service center, a certificate service, a library service, a class service, and an email service, as presented in Figure 6. The service center receives requests from a user and creates a campus mobile agent that can carry out the requested tasks; when the campus mobile agent comes back after performing the tasks, the service center reports the results to the user. The certificate service issues various certificates, such as school record, graduation and studentship certificates. The library service is responsible for finding and reserving a requested book, or ordering it otherwise. The class service maintains lecture information such as homework, changed classrooms, changed class hours, cancelled classes and so on. Finally, the email service provides a method to check the list of emails. These services were implemented at nodes running ODDUGI mobile agent systems.
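A sketch of how the service center might assemble the campus agent's itinerary from a user's request; the node names and the class are illustrative, not the application's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Builds the ordered list of service nodes the campus agent should visit.
class CampusServiceCenter {
    List<String> buildItinerary(boolean certificates, boolean books) {
        List<String> stops = new ArrayList<>();
        stops.add("class-service-node");                 // always check class notices
        stops.add("email-service-node");                 // always check email
        if (certificates) stops.add("certificate-service-node");
        if (books)        stops.add("library-service-node");
        return stops;                                    // the agent visits these in order
    }

    public static void main(String[] args) {
        System.out.println(new CampusServiceCenter().buildItinerary(true, true));
    }
}
```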
[Figure: the user contacts the service center, which dispatches a campus mobile agent to the certificate, class, email and library services.]
Fig. 6. One-Touch Campus Service application
5 Conclusions
In this paper, we described the ODDUGI mobile agent system, a Java-based platform for constructing mobile agents in a multi-region mobile agent computing environment, focusing on its architecture, design concepts and main features. The ODDUGI mobile agent system emphasizes fault tolerance, security, and location management and message delivery mechanisms; it therefore guarantees fault-tolerant and secure execution of mobile agents, and it can locate mobile agents efficiently and transfer messages between them reliably. In addition, we developed the One-Touch Campus Services applications to show how ODDUGI can be applied in mobile and ubiquitous computing environments.
Acknowledgment. This research was supported by MKE, Korea, under ITRC IITA-2009-(C1090-0902-0046), and by MEST, Korea, under the WCU Program supervised by KOSEF (No. R31-2008-000-10062-0).
References
[1] Fuggetta, A., Picco, G.P., Vigna, G.: Understanding Code Mobility. IEEE Transactions on Software Engineering 24(5), 342–361 (1998)
[2] Maes, P., Guttman, R.H., Moukas, A.G.: Agents That Buy and Sell. Communications of the ACM 42(3), 81–91 (1999)
[3] Wong, D., Paciorek, N., Moore, D.: Java-based Mobile Agents. Communications of the ACM 42(3), 92–102 (1999)
[4] Object Management Group: Mobile agent system interoperability facilities specification. OMG TC Document orbos/97-10-05 (1997)
[5] Spyrou, C., Samaras, G., Pitoura, E., Evripidou, P.: Mobile Agents for Wireless Computing: The Convergence of Wireless Computational Models with Mobile-Agent Technologies. The Journal of Mobile Networks and Applications 9(5), 517–528 (2004)
[6] Cardoso, R.S., Kon, F.: Mobile Agents: A Key for Effective Pervasive Computing. In: OOPSLA 2002 (2002)
[7] White, J.: Mobile Agents White Paper. General Magic (1996)
[8] Lange, D., Oshima, M.: Programming and Deploying Java Mobile Agents with Aglets. Addison-Wesley, Reading (1998)
[9] Object Space Inc.: Voyager core package technical overview. Technical Report (1997)
[10] Wong, D., Paciorek, N., Walsh, T., DiCelie, J., Young, M., Peet, B.: Concordia: An Infrastructure for Collaborating Mobile Agents. In: Rothermel, K., Popescu-Zeletin, R. (eds.) MA 1997. LNCS, vol. 1219. Springer, Heidelberg (1997)
[11] Baumann, J., Hohl, F., Rothermel, K., Straßer, M.: Mole - Concepts of a Mobile Agent System. World Wide Web 1(3), 123–137 (1998)
[12] Silva, L.M., Simoes, P., Soares, G., Martins, P., Batista, V., Renato, C., Almeida, L., Stohr, N.: JAMES: A Platform of Mobile Agents for the Management of Telecommunication Networks. In: Albayrak, Ş. (ed.) IATA 1999. LNCS, vol. 1699, pp. 77–95. Springer, Heidelberg (1999)
[13] Karnik, N.M., Tripathi, A.R.: Design Issues in Mobile-Agent Programming Systems. IEEE Concurrency, 52–61 (1998)
[14] Peine, H.: Application and programming experience with the Ara mobile agent system. Software-Practice and Experience 32, 515–541 (2002)
[15] Gray, R.S., Cybenko, G., Kotz, D., Peterson, R.A., Rus, D.: D'Agents: Applications and performance of a mobile-agent system. Software-Practice and Experience 32, 543–573 (2002)
[16] Satoh, I.: MobileSpaces: A Framework for Building Adaptive Distributed Applications using a Hierarchical Mobile Agent System. In: ICDCS 2000, pp. 161–168 (April 2000)
[17] Puliafito, A., Tomarchio, O., Vita, L.: MAP: Design and implementation of a mobile agents' platform. Journal of Systems Architecture 46, 145–162 (2000)
[18] Fukuda, M., Bic, L.F., Dillencourt, M.B., Merchant, F.: MESSENGERS: Distributed Programming Using Mobile Agents. Transactions of the Society for Design and Process Science (SDPS) 5(4) (December 2001)
[19] Johansen, D., Lauvset, K.J., van Renesse, R., Schneider, F.B., Sudmann, N.P., Jacobsen, K.: A TACOMA retrospective. Software-Practice and Experience 32, 605–619 (2002)
[20] Fukuda, M., Tanaka, Y., Suzuki, N., Bic, L.F.: A Mobile-Agent-Based PC Grid. In: Autonomic Computing Workshop, AMS 2003, pp. 142–150 (June 2003)
[21] Dunne, C.R.: Using mobile agents for network resource discovery in peer-to-peer networks. ACM SIGecom Exchanges 2(3), 1–9 (2001)
[22] Bagci, F., Petzold, J., Trumler, W., Ungerer, T.: Ubiquitous Mobile Agent System in a P2P Network. In: Ubisys 2003 (October 2003)
[23] Stevenson, G., Nixon, P., Ferguson, R.I.: A General Purpose Programming Framework for Ubiquitous Computing Environments. In: Ubisys 2003 (October 2003)
[24] Mohindra, A., Purakayastha, A., Thati, P.: Exploiting non-determinism for reliability of mobile agent systems. In: DSN 2000, pp. 144–153 (June 2000)
[25] Silva, L.M., Batista, V., Silva, J.: Fault-tolerant execution of mobile agents. In: DSN 2000, pp. 135–143 (June 2000)
[26] Schneider, F.: Towards fault-tolerant and secure agentry. In: The 11th International Workshop on Distributed Algorithms (September 1997)
[27] de Assis Silva, F.M., Popescu-Zeletin, R.: An approach for providing mobile agent fault tolerance. In: Rothermel, K., Hohl, F. (eds.) MA 1998. LNCS, vol. 1477, pp. 14–25. Springer, Heidelberg (1998)
[28] Rothermel, K., Strasser, M.: A fault-tolerant protocol for providing the exactly-once property of mobile agents. In: The 17th SRDS, pp. 100–108 (October 1998)
[29] Strasser, M., Rothermel, K.: Reliability concepts for mobile agents. Int'l Journal of Cooperative Information Systems 7(4), 355–382 (1998)
[30] Pleisch, S., Schiper, A.: Fault-Tolerant Mobile Agent Execution. IEEE Transactions on Computers 52(2), 209–222 (2003)
[31] Pleisch, S., Schiper, A.: Approaches to Fault-Tolerant Mobile Agent Execution. IBM Research Report RZ 3333 (2001)
[32] Baik, M., Choi, S., Kim, C., Hwang, C.: Optimistic Temporal Replication Based Approach to Fault-Tolerant Mobile Agent Execution. In: The 5th International Conference on Advanced Communication Technology (2003)
[33] Baik, M., Kang, I., Kang, Y., Hwang, C.: Optimistic Fault-tolerant Approach for Mobile Agent in Multi-region Mobile Agent Computing Environment. In: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (2003)
[34] Choi, S., Baik, M., Kim, H., Yoon, J., Shon, J., Hwang, C.: Region-based Stage Construction Protocol for Fault-tolerant Execution of Mobile Agents. In: AINA 2004, vol. 2, pp. 499–502 (March 2004)
[35] Greenberg, M.S., Byington, J.C.: Mobile Agents and Security. IEEE Communications Magazine, 76–85 (July 1998)
[36] Karnik, N.M., Tripathi, A.R.: Security in the Ajanta mobile agent system. Software-Practice and Experience, 301–329 (2002)
[37] Ono, K., Tai, H.: A security scheme for Aglets. Software-Practice and Experience, 497–514 (2002)
[38] Claessens, J., Preneel, B., Vandewalle, J.: (How) Can Mobile Agents Do Secure Electronic Transactions on Untrusted Hosts? ACM Transactions on Internet Technology 3(1), 28–48 (2003)
[39] Baik, M., Lee, M., Kim, H., Hwang, C.: Protection Protocol for Sensitive Data of Mobile Agent in Multi-region Mobile Agent Computing Environment. In: Proceedings of the International Conference on Communications Systems and Applications (2003)
[40] Baumann, J.: A comparison of mechanisms for locating mobile agents. IBM Research Report 3333 (August 1999)
[41] Wojciechowski, P.T.: Algorithms for location-independent communication between mobile agents. Technical Report 2001/13, Communication Systems Department, EPFL (March 2001)
[42] Deugo, D.: Mobile agent messaging models. In: The 5th International Symposium on Autonomous Decentralized Systems, pp. 278–286 (March 2001)
[43] Lingnau, A., Drobnik, O.: Agent-User Communications: Requests, Results, Interaction. In: Rothermel, K., Hohl, F. (eds.) MA 1998. LNCS, vol. 1477, pp. 209–221. Springer, Heidelberg (1998)
[44] Domel, P., Lingnau, A., Drobnik, O.: Mobile agent interaction in heterogeneous environments. In: Rothermel, K., Popescu-Zeletin, R. (eds.) MA 1997. LNCS, vol. 1219, pp. 136–148. Springer, Heidelberg (1997)
[45] Cabri, G., Leonardi, L., Zambonelli, F.: Mobile-agent coordination models for internet applications. IEEE Computer 33(2), 82–89 (2000)
[46] Baumann, J., Rothermel, K.: The Shadow approach: An orphan detection protocol for mobile agents. In: Rothermel, K., Hohl, F. (eds.) MA 1998. LNCS, vol. 1477, pp. 2–13. Springer, Heidelberg (1998)
[47] Stefano, A.D., Santoro, C.: Locating mobile agents in a wide distributed environment. IEEE Transactions on Parallel and Distributed Systems 13(8), 153–161 (2002)
[48] Murphy, A.L., Picco, G.P.: Reliable communication for highly mobile agents. Autonomous Agents and Multi-Agent Systems 5, 81–100 (2002)
[49] Baik, M., Yang, K., Shon, J., Hwang, C.: Message Transferring Model between Mobile Agents in Multi-Region Mobile Agent Computing Environment. In: The 2nd Int'l Human.society@Internet Conference (2003)
[50] Choi, S., Baik, M., Kim, H., Byun, E., Hwang, C.: Reliable Asynchronous Message Delivery for Mobile Agents. IEEE Internet Computing 10(6), 16–25 (2006)
[51] Jalote, P.: Fault Tolerance in Distributed Systems. Prentice-Hall, Englewood Cliffs (1994)
[52] Tanenbaum, A.S., van Steen, M.: Distributed Systems: Principles and Paradigms. Prentice-Hall, Englewood Cliffs (2002)
[53] ODDUGI Mobile Agent System, http://oddugi.korea.ac.kr/
Determination of the Optimal Hop Number for Wireless Sensor Networks

Jin Wang and Young-Koo Lee*

Department of Computer Engineering, Kyung Hee University, Korea
[email protected],
[email protected]
Abstract. Energy efficiency is one of the primary challenges to the successful application of wireless sensor networks (WSNs), since sensors cannot easily be recharged once they are deployed. By carefully selecting the hop number, the energy consumed during the routing process can be largely reduced and network lifetime can be prolonged. Although it is commonly agreed that multi-hop transmission is more energy efficient than direct transmission, especially when the source node is far away from the sink node, how to determine the optimal hop number under practical constraint conditions, in both theoretical and practical network environments, is still a nontrivial problem. In this paper, we focus on the theoretical derivation of the optimal hop number in a one-dimensional linear network. We then extend the derived results to practical sensor networks, and we provide a criterion for selecting a sub-optimal hop number in practical sensor networks where the sensors are randomly deployed. Preliminary simulation results show that our optimal-hop-number-based routing algorithm can save much more energy than many popular routing algorithms for WSNs.
1 Introduction

Wireless sensor networks (WSNs) are composed of hundreds or thousands of tiny and inexpensive sensor nodes which can effectively monitor their surrounding environment and then send their observed data to a remote sink node or base station (BS) through direct or multi-hop transmission. Due to their wide potential applications in military surveillance, industrial and agricultural monitoring, environmental protection, healthcare, etc. [1], WSNs have attracted considerable attention in recent years.

One of the primary challenges to the successful application of WSNs is energy consumption, since it is not practical to recharge the limited battery once the sensors are deployed, for example dropped from an airplane. Energy consumption usually consists of three parts: the energy consumed during the sensing, processing and communication processes. Here, we focus only on the energy consumed during communication, since it prevails over the other two processes; it is well known that communicating a one-bit message over a wireless medium consumes around 1000 times more energy than processing it.
* Corresponding author.
It is commonly agreed that multi-hop transmission is usually more energy efficient than direct (one-hop) transmission, especially when the source node is far away from the sink node. However, it is not clear how many hops are needed and how to determine the corresponding intermediate nodes. Up to now, little work has been done purely from the optimal-hop-number point of view to improve energy efficiency and extend network lifetime. This is the motivation of this paper.

The rest of the paper is organized as follows. Related work is presented in Section 2. In Section 3, the theoretical optimal hop number is derived for a one-dimensional sensor network under the constraint imposed by the hardware parameters; a selection criterion for a sub-optimal hop number in a two-dimensional practical sensor network is then provided and explained. The preliminary simulation results in Section 4 validate the performance of our optimal-hop-number-based routing algorithm in comparison with other popular routing algorithms for WSNs. Section 5 concludes the paper.
2 Related Work

Energy-efficient routing protocols and algorithms for WSNs have been studied for many years and numerous research papers have been published; however, only a few of them approach the routing process from the optimal-hop-number point of view. The authors in [2] present a taxonomy of the most prominent routing protocols for WSNs and categorize them into three classes: data-centric [5-7], hierarchical [8-10] and location-based [11, 12] protocols.

Data aggregation (a.k.a. data fusion) is an important technique adopted by data-centric routing protocols [5-7]. Since many nearby sensor nodes may sense and collect similar information, there is more or less similarity among the collected sensor data; through aggregation, both the data size and the number of packets can be largely reduced. SPIN (Sensor Protocols for Information via Negotiation) [5] can be viewed as the first data-centric routing protocol; it uses data negotiation among sensor nodes to reduce data redundancy and save energy. Directed Diffusion [6] is a famous and representative data-centric routing protocol for WSNs in which the data generated by sensor nodes is named by attribute-value pairs. Once the BS inquires about a certain type of information (such as a four-legged animal) in a sub-area during a certain time interval, the observed data of the same type and sub-area can be aggregated and then transmitted to the BS through multi-hop transmission. In addition, load balancing is achieved by forwarding the data on different paths based on probability. Rather than always using the lowest-energy paths, the authors in [7] occasionally use sub-optimal paths, so that network lifetime is increased by 40% compared to [6].

Hierarchical routing protocols [8-10] are very suitable for WSNs, since they not only provide scalability for hundreds or thousands of sensors, but the cluster heads can also perform data aggregation and coordination within each cluster. LEACH [8] is one of the most famous and representative hierarchical routing protocols for WSNs; it can prolong network lifetime 8 times longer than ordinary routing protocols such as direct transmission and minimum-transmission-energy routing. However, 5% of the nodes are randomly chosen as cluster heads, and they only use direct transmission
to the remote sink node in their small-scale sensor network. Power Efficient Gathering in Sensor Information Systems (PEGASIS) [10] is a chain-based routing protocol that can save more energy than LEACH: messages are aggregated along the chain and finally sent to the sink node via direct transmission by one of the nodes on the chain. Its main shortcoming is that PEGASIS requires global knowledge of the whole network, which is hard to achieve in practice.

Location-based routing protocols [11, 12] usually require sensor location information, which can be obtained either through global positioning system (GPS) devices or through estimation algorithms based on received signal strength. Minimum Energy Communication Network (MECN) [11] provides a minimum-energy network for WSNs with the help of low-power GPS. [12] is an extension of [11] that considers possible obstacles between any pair of nodes.

Recently, the authors in [13] studied the selection of the transmission manner from a probabilistic viewpoint: they give a closed form for the probability $1 - P_i$ of transmitting data directly to the BS and the probability $P_i$ of transmitting data to the BS through multi-hop. The authors in [14] also study the energy consumption under both direct and multi-hop transmission; they claim that the superiority of the multi-hop scheme depends on the source-to-sink distance and the reception cost, which is consistent with our algorithm.

It can be seen that the hop number plays only a secondary role in most of the above-mentioned protocols, and its influence on energy consumption has not been carefully studied; up to now, little work has examined the relationship between hop number and energy consumption except [3, 4, 13, 14]. In this paper, we thoroughly study this relationship from both theoretical and experimental aspects: we derive the optimal hop number for a linear network environment under hardware parameter constraints, and we present a sub-optimal hop number selection criterion for practical sensor networks where the theoretical optimal hop number cannot be attained.
3 Determination of the Optimal Hop Number

3.1 Energy Consumption Model

Figure 1 shows a one-dimensional sensor network with $n$ sensor nodes placed along a line to the sink node. A one-dimensional linear sensor network is typical of highway applications such as traffic monitoring and congestion control. The distance between adjacent sensor nodes is $r_i$. Once there is some data to
[Figure: a line of sensor nodes spaced $r$ apart, spanning total distance $d$ to the sink node.]
Fig. 1. One-dimensional linear network
send from a source node to the remote sink node, either direct transmission or multi-hop transmission can be chosen, based on the source-to-sink distance $d$ as well as the hardware parameters of the energy model. We begin by studying the energy consumption model.

A commonly used energy consumption model is the first-order radio model [8-10]. According to this model, the radio consumes the following amount of energy $E_{Tx}$ to transmit an $l$-bit message over distance $d$:

$$E_{Tx}(l,d) = \begin{cases} l \cdot E_{elec} + l \cdot \varepsilon_{fs} \cdot d^2, & \text{if } d < d_0 \\ l \cdot E_{elec} + l \cdot \varepsilon_{mp} \cdot d^4, & \text{if } d \ge d_0 \end{cases} \qquad (1)$$

the amount of energy $E_{Rx}$ to receive this message:

$$E_{Rx}(l) = l \cdot E_{elec}, \qquad (2)$$

and the amount of energy $E_{Fx}$ to forward this message:

$$E_{Fx}(l,d) = E_{Tx}(l,d) + E_{Rx}(l) = \begin{cases} 2l \cdot E_{elec} + l \cdot \varepsilon_{fs} \cdot d^2, & \text{if } d < d_0 \\ 2l \cdot E_{elec} + l \cdot \varepsilon_{mp} \cdot d^4, & \text{if } d \ge d_0 \end{cases} \qquad (3)$$
The definitions of these radio parameters are listed in Table 1.

Table 1. Radio parameters

| Parameter | Definition | Value |
|-----------|------------|-------|
| $E_{elec}$ | Energy dissipation rate to run the radio | 50 nJ/bit |
| $\varepsilon_{fs}$ | Free space model of transmitter amplifier | 10 pJ/bit/m² |
| $\varepsilon_{mp}$ | Multi-path model of transmitter amplifier | 0.0013 pJ/bit/m⁴ |
| $l$ | Data length | 2000 bits |
| $d_0$ | Distance threshold, $\sqrt{\varepsilon_{fs}/\varepsilon_{mp}}$ | ≈ 87.7 m |
Thus, transmitting 1 bit of information over an $n$-hop route consumes a total amount of energy $E(n)$ along the multi-hop route:

$$E(n) = \begin{cases} (E_{elec} + \varepsilon_{fs} \cdot r_1^2) + \sum_{i=2}^{n} \varepsilon_{fs} \cdot r_i^2 + 2(n-1) \cdot E_{elec}, & r_i < d_0 \\ (E_{elec} + \varepsilon_{mp} \cdot r_1^4) + \sum_{i=2}^{n} \varepsilon_{mp} \cdot r_i^4 + 2(n-1) \cdot E_{elec}, & r_i \ge d_0 \end{cases} \qquad (4)$$

Eq. (4) is the objective function to be optimized, with the hop number and the intermediate distances as variables. It is worth mentioning that the optimal hop number might not be attainable under the constraint $r_i < d_0$ or $r_i \ge d_0$ in (4); sometimes we have to find a sub-optimal hop number and the corresponding intermediate distances in a real sensor network environment. Finally, we can generalize Eq. (4) into the following objective function:
$$E(n) = (2n-1) \, E_{elec} + \sum_{i=1}^{n} \varepsilon_{amp} \cdot r_i^{\alpha} \qquad (5)$$

Here, $\varepsilon_{amp} = \varepsilon_{fs}$ when $\alpha = 2$ and $\varepsilon_{amp} = \varepsilon_{mp}$ when $\alpha = 4$.

3.2 Optimal Hop Number for a Linear Network
For a fixed $\sum_{i=1}^{n} r_i = d$, the expression $\sum_{i=1}^{n} r_i^{\alpha}$ attains its minimal value when $r_1 = r_2 = \cdots = r_n = d/n$. Then $E(n)$ equals:

$$E(n) = (2n-1) \cdot E_{elec} + \varepsilon_{amp} \cdot n^{1-\alpha} \cdot d^{\alpha} \qquad (6)$$

Eq. (6) attains its minimum when $E'(n) = 0$, i.e., $2 E_{elec} + \varepsilon_{amp} \cdot (1-\alpha) \cdot (d/n)^{\alpha} = 0$. Finally, we obtain the optimal theoretical hop number:

$$n^*_{opt} = d \cdot \left( \varepsilon_{amp} \cdot (\alpha - 1) / (2 E_{elec}) \right)^{1/\alpha}. \qquad (7)$$

From Eq. (7), it is easy to get the optimal hop number $n^*_{opt} = d \cdot \sqrt{\varepsilon_{fs} / (2 E_{elec})}$ for the free space model ($\alpha = 2$), with corresponding intermediate distance $r_i = d / n^*_{opt} = \sqrt{2 E_{elec} / \varepsilon_{fs}} = 100$ m based on the hardware parameters in Table 1. Similarly, for the multi-path model ($\alpha = 4$) the optimal hop number is $n^*_{opt} = d \cdot \left( 3 \varepsilon_{mp} / (2 E_{elec}) \right)^{1/4}$, with intermediate distance $r_i = d / n^*_{opt} \approx 71$ m.
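As a numerical check of Eqs. (6) and (7), the following self-contained sketch evaluates the closed forms with the Table 1 parameters; it reproduces $r_i = 100$ m for the free space model and $r_i \approx 71$ m for the multi-path model. The code is our illustration, not code from the paper.

```java
public class OptimalHop {
    static final double E_ELEC = 50e-9;       // J/bit
    static final double EPS_FS = 10e-12;      // J/bit/m^2
    static final double EPS_MP = 0.0013e-12;  // J/bit/m^4

    // Eq. (7): n*_opt = d * (eps_amp * (alpha - 1) / (2 * E_elec))^(1/alpha)
    static double nOpt(double d, double epsAmp, double alpha) {
        return d * Math.pow(epsAmp * (alpha - 1) / (2 * E_ELEC), 1.0 / alpha);
    }

    // Eq. (6): per-bit energy with n equal hops of length d/n
    static double energy(double d, int n, double epsAmp, double alpha) {
        return (2 * n - 1) * E_ELEC + epsAmp * Math.pow(n, 1 - alpha) * Math.pow(d, alpha);
    }

    public static void main(String[] args) {
        double d = 500;
        System.out.printf("free space: n*=%.2f, r=%.1f m%n",
                nOpt(d, EPS_FS, 2), d / nOpt(d, EPS_FS, 2));   // r = 100 m
        System.out.printf("multi-path: n*=%.2f, r=%.1f m%n",
                nOpt(d, EPS_MP, 4), d / nOpt(d, EPS_MP, 4));   // r ~ 71 m
    }
}
```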
Figure 2(a) shows the energy consumption under the free space model. Given the source-to-sink distance $d$ and the hardware parameters listed in Table 1, we divide $d$ into $n$ hops with $n$ sensor nodes equally placed along $d$. We can see in both Figures 2(a) and 2(b) that there exists an optimal hop number with minimal energy consumption. The larger $d$ is, the larger the optimal hop number becomes, since the corresponding intermediate distance is kept constant. Also, the larger $d$ is, the more energy is consumed for the same hop number, but usually more energy can also be saved by multi-hop transmission than by direct transmission. Figure 2(b) shows similar behavior under the multi-path model: the larger $d$ is, the larger the corresponding optimal hop number, while the intermediate distance $r_i$ remains the same.
[Figure: per-bit energy consumption versus hop number (1-10) for several source-to-sink distances; each curve has a minimum at the optimal hop number. (a) free space model, d = 200-500; (b) multi-path model, d = 200-300.]
Fig. 2. Energy consumption under the two models
However, $n^*_{opt}$ cannot be attained if we further consider the constraint condition $r_i < d_0$ for $\alpha = 2$, since $r_i = 100 > d_0 \approx 88$; the same holds for $\alpha = 4$ under the constraint $r_i \ge d_0$. This is because the hardware parameters are determined by factors such as the electronic circuit, antenna height and receiver sensitivity [9]. Thus, we choose the integer nearest to $n^*_{opt}$ in Eq. (7) that simultaneously satisfies $r_i < d_0$ or $r_i \ge d_0$; we call it the sub-optimal hop number ($n_{opt}$) in this paper.

[Figure: minimal energy consumption versus source-to-sink distance (100-350 m) under the constraint condition, comparing the free space (d²) and multi-path (d⁴) models. (a) hop number as the nearest decimal value; (b) hop number as the nearest integer.]
Fig. 3. Energy consumption under the constraint condition
Figure 3(a) shows the minimal energy consumption for different source-to-sink distances $d$ under the constraint condition $r_i < d_0$ or $r_i \ge d_0$, where the hop number is not an integer but the nearest decimal value; Figure 3(b) is the case where the hop number is chosen as the nearest integer ($n_{opt}$). From Figure 3 we find that in most cases the free space model consumes less energy than the multi-path model. Especially when the hop number is an integer (Figure 3(b)), the free space model is usually much more energy efficient: for example, when $d = 240$, 3-hop free space transmission is more energy efficient than 2-hop multi-path transmission.
3.3 Sub-optimal Hop Number for Practical Network
Given the hardware parameters and the source-to-sink distance, we have studied above how to determine the optimal hop number under the constraint condition, from both theoretical and experimental points of view. We now study how to determine the sub-optimal hop number, as well as the intermediate nodes, in a real sensor network environment. It is worth mentioning that we cannot directly use the optimal theoretical hop number of Eq. (7), for two reasons. First, the hop number should be an integer rather than a decimal value. Second, in a practical sensor network it is very hard to find intermediate nodes spaced equally by $r_i$ along the multi-hop route.
When $d < d_0$, it is natural to use direct transmission with distance $r = d$. When $d \in (d_0, 2d_0)$, we can use either direct multi-path transmission or the 2-hop free space transmission manner. Let $f(d) = E_{Direct} - E(2) \ge 0$; then

$$f(d) = (E_{elec} + \varepsilon_{mp} \cdot d^4) - (3E_{elec} + \varepsilon_{fs} \cdot d^2 / 2) = \varepsilon_{mp} \cdot d^4 - \varepsilon_{fs} \cdot d^2 / 2 - 2E_{elec} \ge 0. \qquad (8a)$$

Eq. (8a) always holds when

$$d \ge \sqrt{\frac{\varepsilon_{fs}/2 + \sqrt{\varepsilon_{fs}^2/4 + 8\,\varepsilon_{mp} \cdot E_{elec}}}{2\,\varepsilon_{mp}}}, \qquad (8b)$$

and the critical distance is $d_c \approx 104$ m when $n = 2$. So, if the source-to-sink distance satisfies $d < d_c$, we still choose direct transmission with $r = d$. If $d \ge d_c$, we choose multi-hop transmission with an integer value $n_{opt} \in (d/d_0, (d/d_0) + 1]$ as the sub-optimal hop number, which can be derived from the last row of Table 2, with intermediate distance(s) $r_i \approx d / n_{opt}$.

Table 2. Selection criterion of the sub-optimal hop number
| $d$ | $r_i$ | Hop number |
|-----|-------|------------|
| $(0, d_c)$ | $r_1 < d_c$ | 1 |
| $[d_c, 2d_0)$ | $r_1, r_2 < d_0$ | 2 |
| ... | ... | ... |
| $[(n-1) d_0, n d_0)$ | $r_1, \ldots, r_n < d_0$ | n |
The reason is that when $d = N \cdot d_0$ (where $N$ is an integer), it is generally not possible to find $N$ intermediate nodes that divide $d$ equally in a real sensor network; it is easier to find $N + 1$ intermediate nodes with similar distances $r_i \approx d/(N+1)$. We therefore try to choose intermediate nodes whose distances are close to $r_i \approx d / n_{opt}$ and that lie as close to the direct line from the source to the sink node as possible. It is worth emphasizing that when $d > d_c$, we can use either $n_1$-hop multi-path transmission with individual distance $r_1$ or $n_2$-hop free space transmission with individual distance $r_2$, where $n_1 \cdot r_1 = n_2 \cdot r_2 = d$ and $r_1 > d_0 > r_2$. According to the analysis of Figure 3, we always choose the $n_2$-hop ($n_2 > n_1$) free space transmission, because free space transmission is always more energy efficient than multi-path transmission in a real sensor network with randomly placed sensors.
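The selection criterion above lends itself to a compact routine. The following sketch, with the Table 1 constants hard-coded, returns the sub-optimal hop count for a given source-to-sink distance; it is our illustration of the criterion, not code from the paper.

```java
public class HopSelector {
    static final double E_ELEC = 50e-9, EPS_FS = 10e-12, EPS_MP = 0.0013e-12;
    static final double D0 = Math.sqrt(EPS_FS / EPS_MP);        // ~87.7 m

    // Critical distance from Eq. (8b) (~104 m for these parameters).
    static final double DC = Math.sqrt(
        (EPS_FS / 2 + Math.sqrt(EPS_FS * EPS_FS / 4 + 8 * EPS_MP * E_ELEC))
        / (2 * EPS_MP));

    static int subOptimalHops(double d) {
        if (d < DC) return 1;                 // direct transmission
        return (int) Math.floor(d / D0) + 1;  // smallest n in (d/d0, d/d0 + 1]
    }

    public static void main(String[] args) {
        for (double d : new double[] {80, 120, 240, 500})
            System.out.printf("d=%.0f -> %d hop(s)%n", d, subOptimalHops(d));
        // d=240 yields 3 hops, matching the 3-hop free space example above.
    }
}
```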
4 Performance Evaluation

For the performance evaluation of our optimal-hop-number-based algorithm, we use the MATLAB simulator. 300 sensor nodes are randomly deployed in an area of 500×500 m². The sink node is placed either inside or outside the WSN area, and the transmission radius can be adjusted from 150 to 250 meters to evaluate the performance of the different routing algorithms. It is worth emphasizing that for different parameter values in Table 3 the optimal or sub-optimal hop number differs, but our methodology is suitable for all sets of parameters.

Table 3. Simulation environment
| Parameter | Value |
|-----------|-------|
| Network size | 500×500 m² |
| Node number | 300 |
| Radius | [150, 250] m |
| Sink node location | Inside or outside |
| Data size | 2000 bits |
| Initial energy | 2 J |
| $E_{elec}$ | 50 nJ/bit |
| $\varepsilon_{fs}$ | 10 pJ/bit/m² |
| $\varepsilon_{mp}$ | 0.0013 pJ/bit/m⁴ |
| $d_0$ | $\sqrt{\varepsilon_{fs}/\varepsilon_{mp}}$ ≈ 87.7 m |
The traffic pattern is as follows: in each round, every node transmits a 2000-bit message to the sink node, using either direct transmission or multi-hop transmission depending on the routing algorithm; in the multi-hop case, the intermediate nodes consume additional energy to forward the message. Table 3 lists the parameters used in the simulation environment. We compare our algorithm with the following four routing algorithms (a next-hop selection sketch follows the list):

− Direct transmission: each node transmits its data directly to the sink node, assuming its transmission radius is large enough.
− Greedy algorithm: each node chooses as next hop the neighbor nearest to the sink node; the average hop number is usually small, and energy consumption is proportional to the transmission radius.
− Maximal remaining energy (MRE) algorithm: each node chooses as next hop a neighbor closer to the sink node with maximal remaining energy.
− Our algorithm: our optimal-hop-number-based algorithm.
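As a minimal illustration of the greedy and MRE next-hop rules compared here, the sketch below picks a next hop from a neighbor list; the node representation and method names are our assumptions.

```java
import java.util.List;

// Illustrative node with position and remaining energy.
class SimNode {
    final double x, y; double energy;
    SimNode(double x, double y, double e) { this.x = x; this.y = y; energy = e; }
    double distTo(double sx, double sy) { return Math.hypot(x - sx, y - sy); }
}

class NextHop {
    // Greedy: the neighbor nearest to the sink.
    static SimNode greedy(List<SimNode> neighbors, double sx, double sy) {
        SimNode best = null;
        for (SimNode n : neighbors)
            if (best == null || n.distTo(sx, sy) < best.distTo(sx, sy)) best = n;
        return best;
    }

    // MRE: among neighbors closer to the sink than the current node,
    // pick the one with maximal remaining energy.
    static SimNode mre(List<SimNode> neighbors, double myDistToSink,
                       double sx, double sy) {
        SimNode best = null;
        for (SimNode n : neighbors)
            if (n.distTo(sx, sy) < myDistToSink
                    && (best == null || n.energy > best.energy)) best = n;
        return best;
    }
}
```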
Figures 4 and 5 show the average energy consumption for a 500×500 m² network with 300 sensor nodes, where the sink node (also called the base station) is placed inside at (250, 250) in Figure 4 and outside at (250, 550) in Figure 5. Column groups 1, 2 and 3 correspond to transmission radii R = 150, 200 and 250, respectively.
[Figure: bar chart of average energy consumption for Direct Trans., Greedy Algo., MRE Algo. and Our Algo. at transmission radii 150, 200 and 250 m.]
Fig. 4. Average energy consumption with sink node at (250, 250)
From the two figures, we can see that direct transmission always consumes the largest amount of energy; it does not change as the transmission radius R increases, since the distance from each source node to the sink node is fixed. The energy consumption of our optimal-hop-number-based routing algorithm is the smallest; it also does not change, because the transmission radius ensures that the intermediate distances corresponding to the sub-optimal hop number can be found in the practical sensor network. The average energy consumption of the greedy and MRE algorithms lies in the middle and increases with R, because these algorithms tend to choose next hops at larger distances, which causes more average energy consumption.
We can also see that direct transmission consumes almost 10 times more energy than our algorithm in Figure 4; in Figure 5, with the sink node placed outside at (250, 550), the ratio is about 51. This is because the average source-to-sink distance is much larger in Figure 5 than in Figure 4, so more energy is consumed on average. Taking R = 150 as an example, the energy consumption of the greedy algorithm is about 2.5 times that of our algorithm in Figure 4 and 2.2 times in Figure 5; the MRE algorithm consumes 1.9 times more energy than our algorithm in Figure 4 and 1.8 times in Figure 5, and these ratios grow as R increases. It is worth mentioning that the energy consumption of our algorithm does not change with R, because the sub-optimal hop number and the corresponding intermediate distances are fixed according to our derivation of the optimal hop number in both theoretical and practical network environments.
[Figure: bar chart of average energy consumption (Direct Trans., Greedy Algo., MRE Algo., Our Algo.) versus transmission radius.]
Fig. 5. Average energy consumption with sink node at (250, 550)
5 Conclusion

The optimal hop number can not only improve energy efficiency for WSNs, but also plays an important role in many other network metrics, such as latency, interference and routing overhead. In this paper, we primarily focused on a theoretical study of the relationship between the optimal hop number and energy consumption. We derived the theoretical optimal hop number for a linear network environment, and we provided a selection criterion for the sub-optimal hop number, as well as the corresponding intermediate nodes, in a practical sensor network environment. The preliminary simulation results verify that our optimal-hop-number-based algorithm is more energy efficient than other popular routing algorithms for WSNs. In the near future, we will extend our work by studying the influence of the optimal hop number on other aspects of network performance, such as latency and packet delivery ratio. We also plan to integrate our optimal-hop-number-based algorithm with other mechanisms, such as data aggregation and clustering.
Acknowledgement. This research was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Advancement) (IITA-2009-(C1090-0902-0002)).
References
1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless Sensor Networks: A Survey. Computer Networks 38(4), 393–422 (2002)
2. Akkaya, K., Younis, M.: A Survey of Routing Protocols in Wireless Sensor Networks. Elsevier Ad Hoc Networks 3(3), 325–349 (2005)
3. Stojmenovic, I., Lin, X.: Power-aware Localized Routing in Wireless Networks. IEEE Transactions on Parallel and Distributed Systems 12(11), 1121–1133 (2001)
4. Haenggi, M.: Twelve reasons not to route over many short hops. In: Proceedings of the 60th IEEE Vehicular Technology Conference (VTC), pp. 3130–3134 (September 2004)
5. Kulik, J., Heinzelman, W.R., Balakrishnan, H.: Negotiation-based protocols for disseminating information in wireless sensor networks. In: ACM/IEEE Int. Conf. on Mobile Computing and Networking (August 1999)
6. Intanagonwiwat, C., Govindan, R., Estrin, D.: Directed diffusion: A scalable and robust communication paradigm for sensor networks. In: Proceedings of the 6th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom 2000), Boston, MA (August 2000)
7. Shah, R.C., Rabaey, J.M.: Energy Aware Routing for Low Energy Ad Hoc Sensor Networks. In: Proc. IEEE Wireless Comm. and Networking Conf., pp. 350–355 (March 2002)
8. Heinzelman, W., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication protocol for wireless sensor networks. In: Proceedings of the Hawaii International Conference on System Sciences, Hawaii (January 2000)
9. Heinzelman, W.: Application-Specific Protocol Architectures for Wireless Networks. Ph.D. thesis, Massachusetts Institute of Technology, pp. 84–86 (2000)
10. Lindsey, S., Raghavendra, C.S.: PEGASIS: Power-Efficient Gathering in Sensor Information Systems. In: Proceedings of the IEEE Aerospace Conference, Big Sky, Montana (March 2002)
11. Rodoplu, V., Ming, T.H.: Minimum energy mobile wireless networks. IEEE Journal on Selected Areas in Communications 17(8), 1333–1344 (1999)
12. Li, L., Halpern, J.Y.: Minimum energy mobile wireless networks revisited. In: Proceedings of the IEEE International Conference on Communications (ICC 2001), Helsinki, Finland (June 2001)
13. Efthymiou, C., Nikoletseas, S., Rolim, J.: Energy Balanced Data Propagation in Wireless Sensor Networks. Wireless Networks, Special Issue: Selected Papers from WMAN 2004, 12(6), 691–707 (2006)
14. Fedor, S., Collier, M.: On the Problem of Energy Efficiency of Multi-Hop vs One-Hop Routing in Wireless Sensor Networks. In: The 21st International Conference on Advanced Information Networking and Applications Workshops (AINAW), pp. 380–385 (May 2007)
Localization in Sensor Networks with Fading Channels Based on Nonmetric Distance Models

Viet-Duc Le, Young-Koo Lee, and Sungyoung Lee

Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, Yongin-si, Gyeonggi-do 446-701, Korea
{levietduc,sylee}@oslab.khu.ac.kr,
[email protected]
Abstract. Wireless Sensor Network (WSN) applications are an emerging avenue in which sensor localization is an essential and crucial issue. Many algorithms have been proposed to estimate the coordinates of sensors in WSNs; however, the accuracy attained in real-world applications is still far from the theoretical lower bound, the Cramér-Rao Lower Bound (CRLB), due to the effects of fading channels. In this paper, we propose a very simple and lightweight statistical model for range-based localization schemes, in particular for the most typical localization algorithms based on received signal strength (RSS) and time-of-arrival (TOA). Our proposed method infers only the order, or ranking, of the given distances from the measurement data, to avoid the significant bias caused by fading channels or shadowing. In this way, it radically reduces the effects of the degradation and performs better than existing algorithms. By simulating fading channels and irregular noise for both RSS-based and TOA-based measurements, we analyze and verify both the benefits and the drawbacks of the proposed models and the localization scheme.

Keywords: Localization, Nonmetric, Fading channel, Shadowing.
1 Introduction

Node localization plays a fundamental role in Wireless Sensor Network applications, which are growing rapidly, especially in applications for dynamic environments such as manufacturing logistics, asset tracking and context-aware computing, where sensors may change their locations from time to time without notice. In addition, in large-scale WSN applications where sensors are deployed randomly over vast areas, for instance environment monitoring, conservation biology and precision agriculture, the measured information must be coupled with location information for the data to be meaningful. Hence, it is critical that self-localization or node localization be implemented in those applications.

Many localization algorithms have been proposed to balance the trade-offs between low cost, low energy consumption, and high accuracy and robustness. These algorithms are categorized into range-based schemes and range-free schemes. Range-based schemes estimate the locations of sensors based on
given pairwise distances transformed from measurements such as received signal strength (RSS), time-of-arrival (TOA), time-difference-of-arrival (TDOA), or angle-of-arrival (AOA). Usually, TOA and TDOA are suitable for applications requiring high accuracy (on the order of centimeters), but such measurements are costly to equip. Meanwhile, if the first priority is low cost, RSS is the best candidate. However, the drawback of using RSS is that it is difficult to overcome the bias due to the irregular distribution of the radio range. To improve the accuracy in the RSS case, Patwari et al. proposed a novel localization algorithm based on maximum likelihood estimation (MLE) [1]. Recently, Costa et al. [2] introduced a scalable, distributed weighted multidimensional scaling (dwMDS) algorithm. The main issue of range-based schemes is the added hardware cost of measuring the distance. Therefore, in recent years, other scholars have proposed range-free schemes. The locations of sensors in range-free schemes are estimated from the connectivity or the number of hops between each pair of sensors. Thus, range-free schemes do not require any hardware to determine the pairwise distances, which significantly reduces cost and power consumption as well. Consequently, range-free schemes are suitable for resource-limited WSNs. Notable work on the range-free scheme includes MDS-MAP [3], Isomap [4], the area-based approach [5], DV-based positioning [6], and mobile and static sensor network localization [7]. Recently, some novel range-free approaches were proposed, such as the distributed localization algorithm with improved grid-scan and vector-based refinement [8]. Naturally, range-free schemes take precedence over range-based schemes when cost and energy are the main concerns. However, the range-free scheme has its own drawback: it is very hard to obtain high accuracy, particularly in real-world applications with fading channels and unpredictable noise. To deal with the effects of fading channels and irregular noise in measuring distances, some novel approaches have been proposed recently. Vivekanandan and Wong [9] improved MDS-MAP [3] by using ordinal multidimensional scaling (MDS) instead of classical MDS. Ordinal MDS, as used in MDS-MAP(O), only requires a monotonicity constraint between the shortest-path distance and the Euclidean distance for each pair of nodes, and the results show that ordinal MDS gives higher accuracy than classical MDS. However, MDS-MAP(O) [9] still uses the metric model as the input of the algorithm. Patwari and Agrawal [10] built an algorithm that infers localization information from link correlations in order to avoid the significant effects of correlated shadowing on links, on connectivity, on localization, and on radio tomographic imaging. Even so, the current literature estimates the coordinates of sensors mostly from a given matrix of pairwise distances. This kind of approach inherits the bias of the measurements as well as of the conversion of measurements into Euclidean distances. Unlike previous work, in this paper we propose a very simple model based on nonmetric MDS [11] to reduce the effects of fading channels and irregular noise. This method is technically similar to averaging (AR) in signal processing.
In this way, the major errors of the measured distances are somewhat canceled. For that reason, when integrated with range-based algorithms as the input, the model performs significantly better than the stand-alone algorithms do. We also remark that most existing work considers only Gaussian noise in the localization problem. We, however, focus on how to cope with non-Gaussian noise, fading channels, and artifacts. In this paper, we first define the problem formulation; then we propose the nonmetric distance model, the so-called NoDis model, for the most common measurements today, the received signal strength (RSS) and the time-of-arrival (TOA). In the following sections, we show and analyze the performance of our proposed scheme via simulations of the NoDis model with MDS-MAP [3] and MDS-MAP(O) [9]. Finally, we end the paper with our conclusions and future work.
2 Problem Formulation
In this section, we first introduce the mathematical localization problem. Then we discuss models of RSS and TOA in realistic and simulated WSNs. Finally, we explain the challenges that fading channels pose in real applications and how to model such phenomena.
2.1 Localization Problem
In this paper, we consider a network that includes n sensors (normal nodes) randomly deployed in d-dimensional space (d = 2 or 3) without location information, and a few m beacons (m << n) with location information. Let N = n + m denote the total number of sensors in the WSN under consideration, X = {xi : i = 1..N}, xi ∈ R^d, be the actual coordinates of the sensors, and X̂ = {x̂i : i = 1..N}, x̂i ∈ R^d, be the estimated coordinates. The localization problem in wireless sensor networks is formalized as follows: given n normal nodes, m beacons, and a set of pairwise distances {δij : i, j = 1..N}, the locations of the normal nodes must be estimated. We assume that all measured pairwise distances {δij : i ≠ j and i, j = 1..N} are available and that δij = δji (‖·‖ being the 2-norm where distances are taken from vector differences). This assumption does not restrict the applicability of the proposed algorithms; the more pairwise distances are given, the higher the achieved accuracy. Note that our method is developed to accommodate any type of range measurement, for instance RSS, TOA, or AOA. However, in this paper we mainly discuss the RSS and TOA models because of their low cost and because they are the most typically used in WSN applications.
2.2 No Fading Channels
Most existing work solves the localization problem with Gaussian noise on the distance only. In other words, there is no noise that is much greater than the data containing the location information. In this paper, we consider only the two typical measurement models, received signal strength (RSS) and time-of-arrival (TOA), because of their popularity.
Fig. 1. Time of Arrival illustrated
In the RSS-based approach, distances are measured by converting the received radio signal power with the following formula:

dij = d0 · 10^((P0 − Pij)/(10 np)),   (1)

where dij is the converted distance between sensors i and j, P0 and Pij are the powers in decibel-milliwatts at distances d0 and dij respectively, and np is the path-loss exponent, which depends on the environment and can be known from calibration and measurement. Naturally, Equation (1) reflects the degradation of the received signal strength with distance. Furthermore, the variability of the RF channel measurements of Pij is mainly constant over path length [Rappaport 1996], [1]. Thus, it is possible to model Pij as Gaussian:

Pij ∼ N(P̄ij, σdB²),   (2)

where P̄ij is the mean value of the signal power received at distance dij and σdB² is the variance due to the irregular distribution of the radio range. In the TOA-based approach, sensors are commonly equipped with a hardware ranging mechanism, such as a speaker and a microphone, or ultrasound. The mathematical transformation of physical measurements into Euclidean distances is independent of the particular hardware. To measure the distance between sensors i and j, sensor i first sends a radio message and waits some interval of time, tdelay, to ensure that node j receives the message; then node i emits a sound. Node j notes the current time tradio upon receiving the radio signal; when node j hears the sound, it again notes the current time, tsound. Using the fact that a radio signal travels very much faster than sound in the air, the distance between nodes i and j is simply computed as
dij = (vradio − vsound) ∗ (tsound − tradio − tdelay) = v ∗ t,   (3)

where vradio and vsound are the speeds of radio and sound traveling in the air, respectively. Assuming the air is uniform and there is no obstacle on the traveling path of the sound, the bias in the TOA case is mostly caused by the error in the time measurement. Therefore, t can be modeled as
Fig. 2. Effects on measured data in real applications: (a) from a fading channel; (b) from an irregular radio range
t ∼ N(t̄, σs²),   (4)

where t̄ is the mean value of t and σs² is the variance of the traveling time of the sound.
2.3 Fading Channel Models
The physical locations of sensors are critical for both network operation and data gathering. Network communication in WSNs today mostly uses radio signals to transmit and receive data, and many localization algorithms rely on the connectivity or the propagation of the transmissions. However, connectivity and hop counting must contend with fading channels caused by unexpected obstacles in real-world environments. In Fig. 2a, sensor A and sensor B cannot directly communicate with each other because of the obstacle between them; the number of hops, or the distance estimated between sensors A and B through the path A → C → D → E → B, is consequently much greater than the actual distance. Fig. 2b illustrates the effect of an irregular radio range in WSNs. In practice, the radio range is not symmetric, so based on the measured data one would conclude that node B is closer to node A than node C is, even though that is not true. To simulate these phenomena (non-Gaussian noise or artifacts), we propose the following method. For the RSS model, the radio signal is significantly degraded when there are obstructions on the link, and the ratio of sigma over path loss, σdB/np, represents this phenomenon of fading channels or artifacts. Consequently, if we increase σdB/np on some random links in a wireless sensor network by adding a constant C, their corresponding measured distances become significantly larger than the true values. To simulate a fading channel in the TOA case, we analogously multiply a random subset of the measured time values t by a fixed coefficient α:

d*ij = α ∗ v ∗ t.   (5)
In summary, by creating fading channels on some random links in this way, the corresponding measured distances become radically different from the actual distances. In order to cope with both non-fading channels (Gaussian noise) and fading channels (non-Gaussian noise), we propose a new model based on nonmetric models, described in the next section.
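As a concrete illustration of the measurement and fading models above, the following Python sketch simulates RSS- and TOA-derived distances with the Gaussian models (2) and (4) and injects fading channels by inflating the measured distances of a random subset of links, as done later in the experiments. The numeric defaults d0, P0, np, σdB, σs, and the speed of sound are illustrative placeholders, not values prescribed by the paper.

import numpy as np

rng = np.random.default_rng(0)

def rss_distance(d_true, d0=1.0, p0=-40.0, n_p=3.0, sigma_db=4.0):
    """Distance recovered from a shadowed RSS reading, cf. Eqs. (1)-(2)."""
    p_bar = p0 - 10.0 * n_p * np.log10(d_true / d0)   # mean received power (dBm)
    p_ij = rng.normal(p_bar, sigma_db)                # Gaussian shadowing, Eq. (2)
    return d0 * 10.0 ** ((p0 - p_ij) / (10.0 * n_p))  # invert Eq. (1)

def toa_distance(d_true, v=343.0, sigma_s=0.05):
    """Distance recovered from a noisy travel-time reading, cf. Eqs. (3)-(4)."""
    t_bar = d_true / v                                # mean travel time of the sound
    t = rng.normal(t_bar, sigma_s * t_bar)            # Gaussian timing error, Eq. (4)
    return v * t

def with_fading(d_measured, fading_ratio=0.1, alpha=2.0):
    """Turn a random fraction of links into fading channels, cf. Eq. (5)."""
    d = np.asarray(d_measured, dtype=float).copy()
    faded = rng.random(d.shape) < fading_ratio
    d[faded] *= alpha                                 # measured distance blows up
    return d

# e.g., 100 RSS range measurements for links of true length 10 m, 10% of them faded
ranges = with_fading([rss_distance(10.0) for _ in range(100)])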
3 Our Proposed Model
3.1 Nonmetric Distance Modeling
The input {δij : i ≠ j and i, j = 1..N} of a localization scheme is a set of pairwise distances among the sensors in a network, which forms a metric space, a so-called metric distance model. That is, it supports the usual algebraic operations on the data (addition, subtraction, multiplication, division). The pairwise distances are converted from other measurements such as the RF signal (RSS), time-of-arrival (TOA), or hop distances (connectivity). Because of the presence of fading channels and unpredictable obstacles in real-world applications, some measured distances are significantly different from their actual values. This phenomenon consequently biases the output of a localization scheme, the coordinates of the normal nodes. To lessen the effect of the noise, we convert metric distance models into nonmetric distance models [11]. Nonmetric distance models preserve only the rank, or the order, of the metric data. For example, if δ12 = 12 and δ34 = 9, a nonmetric distance model maintains only the property δ̃12 > δ̃34. Therefore, we construct a new model of the given distances, δij → δ̃ij = f(δij), such that

if δij > δkl then δ̃ij > δ̃kl.   (6)
Note that we could also require the monotonicity δij ≥ δkl; however, neither δij > δkl nor δij ≥ δkl strengthens the model in practice, because one can always add a very small number to one side of the inequality. There are many ways to obtain a nonmetric distance model δ̃ij satisfying (6). In this paper, we propose the following simple and lightweight method and name it the NoDis model. First we create a vector v = {vi : i = 1..n(n−1)/2} from the elements below the diagonal of the given pairwise distance matrix δij. Then we derive an index vector u = {ui : i = 1..n(n−1)/2} by sorting the vector v in ascending order. Each element ui contains the index of the corresponding vi, and ui ≤ n(n−1)/2, i = 1..n(n−1)/2. We also stress that the index of u gives the order of the elements of v. The next step is to construct a nonmetric vector ṽ = {ṽi : i = 1..n(n−1)/2}, where

ṽ(ui) = i,  i = 1..n(n−1)/2.   (7)
Finally, one easily builds the nonmetric distance model D̃ by converting ṽ into a square, symmetric matrix, in which d̃ij denotes the nonmetric distance between the ith and jth sensors in the given metric distances and satisfies (6). We use this nonmetric model D̃, a square symmetric matrix with all diagonal elements zero, as the input of algorithms based on the MDS technique.
Table 1. Symbolic metric distances δij, order (index of v), nomination vector v, index vector u, nonmetric nomination vector ṽ, and symbolic nonmetric distances δ̃ij

δij  Order  v   u   ṽ  δ̃ij
d12  1      9   3   2  d̃12
d13  2      3   5   3  d̃13
d14  3      5   7   4  d̃14
d23  4      7   9   1  d̃23
d24  5      15  12  6  d̃24
d34  6      12  15  5  d̃34
A very simple example with only 4 nodes (n = 4), given in Table 1, illustrates the conversion of the given metric distances into nonmetric distances. Let d24 and d34 be affected by fading channels; the discrepancies between them and the other distances are clearly reduced in the vector ṽ. This sort of transformation weakens the effect of large biases in the input data, and it consequently improves both the accuracy and the convergence speed.
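The NoDis construction above is simply a rank transform of the lower-triangular distance entries. A minimal NumPy sketch (the function name is ours):

import numpy as np

def nodis(D):
    """NoDis model: replace metric pairwise distances by their ranks, Eqs. (6)-(7).

    D is the (n, n) symmetric matrix of measured pairwise distances; the result
    is a square symmetric matrix with zero diagonal whose entries are the ranks
    1..n(n-1)/2 of the corresponding distances, preserving only their order.
    """
    n = D.shape[0]
    rows, cols = np.tril_indices(n, k=-1)  # metric vector v: entries below the diagonal
    v = D[rows, cols]
    u = np.argsort(v)                      # index vector u from sorting v ascending
    v_tilde = np.empty(len(v), dtype=float)
    v_tilde[u] = np.arange(1, len(v) + 1)  # Eq. (7): v_tilde[u_i] = i
    D_tilde = np.zeros_like(D, dtype=float)
    D_tilde[rows, cols] = v_tilde
    D_tilde[cols, rows] = v_tilde          # symmetric, zero diagonal, used as MDS input
    return D_tilde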
3.2 Model Limitations
We note that our proposed NoDis model has the following limitations. First, NoDis is developed mainly for range-based schemes, and we only test the NoDis model with range-based algorithms in our simulations; however, it is possible to apply our proposal to range-free schemes by calculating pairwise distances analogously to DV-HOP [6]. Second, the NoDis model is most appropriate for localization algorithms that perform well on networks in which sensors are randomly deployed. In this paper, we analyze and investigate our proposed models (5) and (6) with classical MDS and ordinal MDS; with some other localization algorithms, our proposed model may not suffice to improve performance. Finally, the NoDis model does not yet incorporate the effects of multipath, which impair the given distances. The power received at a receiving sensor may contain multipath components transmitted by many sensors, not only the sensor paired with the receiver; NoDis assumes that the receiver receives power from only one of its neighbors at a time. An updated model that considers the correlation between multipath components should be developed to achieve a better solution. The proposed nonmetric distance model does not perfectly represent all cases of noise in wireless sensor network localization; however, it does simplify the existing algorithms and makes them applicable to real-world applications.
4 Experimental Results
In our experiments, we assess the performance of the proposed NoDis model when using it as the input for MDS-MAP [3] and ordinal MDS [9].
(a) 100 sensors, marked as "◦", are randomly deployed in a 20m x 20m area; the 5 beacons are shown in red.
(b) The result of localization with TOA measurement, with a noise ratio of 25%; the estimated locations are marked in the figure. RMSE = 0.0592m.
Fig. 3. Illustration of 100 randomly deployed sensors and their estimated locations
To compare the performance of our proposed localization scheme, we also implement MDS-MAP and ordinal MDS (MDS-MAP(O)). For a fair comparison, all experiments run on the same network topology: 100 nodes are deployed randomly in a 20m x 20m square, and 5 reference nodes (beacons) are placed, four at the corners and one at the center of the area (see Fig. 3a). For instance, Fig. 3b shows the result of the localization problem in the TOA case with a noise ratio of 25%; the obtained RMSE is 0.0592m. Noises and artifacts are generated according to the (non-)fading channel models for both RSS and TOA measurements. The initial coordinates of the sensors are randomly assigned for each trial of our simulations. We conduct the simulations in two phases: the former without fading channels and the latter including fading channels.
4.1 Simulations without Fading Channels
To model the errors in the simulated network, we add Gaussian noise to the received signal strength for the RSS model and to the distance for the TOA model. For the RSS case, the ratio of sigma over path loss, σdB/np, varies from 1 to 2. For the TOA case, the noise ratio σs varies from 5% (0.05) to 25% (0.25). We run 10 trials for each of the algorithms considered; the root mean square error (RMSE) of each algorithm is plotted in Fig. 4.
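For reference, the RMSE reported below can be computed per trial as follows; localize stands for whichever estimator is under test (e.g., MDS-MAP with or without the NoDis input) and is deliberately left undefined here:

import numpy as np

def rmse(X_true, X_est):
    """Root mean square error between true and estimated sensor coordinates."""
    return np.sqrt(np.mean(np.sum((X_true - X_est) ** 2, axis=1)))

# averaged over 10 trials with random initial coordinates, as in the experiments:
# errors = [rmse(X_true, localize(nodis(D))) for _ in range(10)]
# print(np.mean(errors))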
Fig. 4. Root Mean Square Error for the simulations without fading channels: (a) RSS measurement; (b) TOA measurement
From the graph, it is easy to see that the performance of our proposals, i.e., the RMSE of MDS-MAP-NoDis and MDS-MAP(O)-NoDis, is similar to that of MDS-MAP and MDS-MAP(O) when the noise ratio is low. However, NoDis gives better accuracy as the noise increases, particularly with the MDS-MAP algorithm under the RSS model. The reason is that MDS-MAP suffers from large bias as the noise increases, but NoDis does not. On the other hand, the NoDis model also reduces the resolution of the TOA measurement, so MDS-MAP-NoDis in Fig. 4b cannot reach as high an accuracy as MDS-MAP can.
4.2 Simulations with Fading Channels
To analyze the performance of the algorithms under fading, we add fading channels to the above network topology, varying their percentage in the network from 5% (0.05) to 25% (0.25). The channels (links) selected for transformation into fading channels are chosen at random.
Fig. 5. Root Mean Square Error for the simulations with fading channels: (a) RSS measurement; (b) TOA measurement
To transform a normal channel into a fading channel, we fix the parameters σdB/np and σs at 1.6 and 20% respectively and double the distances calculated by (1) and (3); only the number of fading channels varies. Again, we run 10 trials for each of the algorithms considered. Fig. 5 shows the results for the fading channel cases. This time, MDS-MAP-NoDis is much better than MDS-MAP, achieving results as good as those of MDS-MAP(O) or MDS-MAP(O)-NoDis. Clearly, the NoDis model works very well when there are many fading channels, so it is suitable for real-world applications with many unpredictable obstacles. We also remark that the basic MDS-MAP [3] cannot work properly when the number of fading channels exceeds 10% (0.1) of the total channels in the network. This limitation can be explained by the fact that optimizing the mean square error (MSE) in a metric space cannot perfectly remove artifacts or fading channels; that is why its curves are irregular in Fig. 5.
4.3 Convergence
As we have seen, the accuracies attained by MDS-MAP(O) [9] and by our proposed MDS-MAP(O)-NoDis are almost the same both with and without fading channels. However, when the irregular noise is high, and especially when many fading channels occur, MDS-MAP(O)-NoDis converges much faster than MDS-MAP(O) does. The reason is that the NoDis model discards the significantly deviating distances in the given pairwise distances δij. In addition, MDS-MAP algorithms are technically based on minimizing a mean square error, which depends heavily on the magnitude of the bias, and MDS-MAP(O) is no exception. Thus, MDS-MAP(O) requires many more iterations to converge than MDS-MAP(O)-NoDis does.
Fig. 6. Convergence vs. level of noise and fading channels: (a) RSS, non-fading; (b) TOA, non-fading; (c) RSS, fading channels; (d) TOA, fading channels
Their results are plotted in Fig. 6; the iteration counts are averaged over 10 trials.
5 Conclusions
We propose a new approach to localization that works well in networks containing fading channels. The proposed model, when used as the input for range-based schemes or even range-free schemes, can radically eliminate or reduce the effects of fading channels and artifacts. Previous methods often use the metric pairwise distances to estimate the coordinates of the sensors, which makes it hard to overcome the problems caused by fading channels. Our approach does not have this limitation: it estimates the locations of the sensors from the rank, or order, of the pairwise distances in a nonmetric space. Simulations using various network measurements and different noise levels illustrate that our proposal achieves higher accuracy and faster convergence than previous work, especially when there are many fading channels.
Acknowledgment. This research was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Advancement) (IITA-2009-(C1090-0902-0002)).
References
1. Patwari, N., Hero III, A., Perkins, M., Correal, N., O'Dea, R.: Relative location estimation in wireless sensor networks. IEEE Trans. Signal Processing 51(8), 2137–2148 (2003)
2. Costa, J., Patwari, N., Hero III, A.: Distributed weighted-multidimensional scaling for node localization in sensor networks. ACM Transactions on Sensor Networks 2(1), 39–46 (2006)
3. Shang, Y., Ruml, W., Zhang, Y., Fromherz, M.: Localization from connectivity in sensor networks. IEEE Transactions on Parallel and Distributed Systems 15(11), 961–974 (2004)
4. Shang, Y., Ruml, W., Zhang, Y., Fromherz, M.: Localization from mere connectivity. In: Proc. MobiHoc, pp. 201–212 (2003)
5. He, T., Huang, C., Blum, B., Stankovic, J., Abdelzaher, T.: Range-free localization schemes for large scale sensor networks. In: Proc. ACM MobiCom, pp. 81–95 (2003)
6. Niculescu, D., Nath, B.: DV based positioning in ad hoc networks. Telecommunication Systems 22, 267–280 (2003)
7. Rudafshani, M., Datta, S.: Localization in wireless sensor networks. In: Proc. Information Processing in Sensor Networks (IPSN), pp. 51–60 (2007)
8. Sheu, J.P., Chen, P.C., Hsu, C.S.: A distributed localization scheme for wireless sensor networks with improved grid-scan and vector-based refinement. IEEE Transactions on Mobile Computing 7(9), 1110–1123 (2008)
9. Vivekanandan, V., Wong, V.: Ordinal MDS-based localization for wireless sensor networks. In: Proc. IEEE Vehicular Technology Conference, pp. 1–5 (2006)
10. Patwari, N., Agrawal, P.: Effects of correlated shadowing: Connectivity, localization, and RF tomography. In: Proc. Information Processing in Sensor Networks, pp. 82–93 (2008)
11. Borg, I., Groenen, P.J.: Modern Multidimensional Scaling: Theory and Applications. Springer, Heidelberg (2005)
A Performance Comparison of Swarm Intelligence Inspired Routing Algorithms for MANETs Jin Wang and Sungyoung Lee∗ Department of Computer Engineering, Kyung Hee University, Korea {wangjin, sylee}@oslab.khu.ac.kr
Abstract. Swarm Intelligence (SI) inspired routing algorithms have become a research focus in recent years due to their self-organizing nature, which is well suited to the routing problems in Mobile Ad hoc Networks (MANETs). Network topology information can be collected and updated in a distributed and autonomous way via local interactions among ant-like agents inspired by SI. In this paper we present a comprehensive survey of various SI inspired routing algorithms for MANETs. These algorithms are explained and compared in detail based on a number of network metrics, including packet delivery ratio, delay, routing overhead, delay jitter, goodput, and throughput. It is our hope that readers can get some hints for their future research in the realm of SI inspired routing from the discussion and simulation results we provide in this paper.
1 Introduction
Mobile Ad hoc Networks (MANETs) have attracted much attention in recent years due to the rapid advances in Micro-Electro-Mechanical Systems (MEMS). MANETs consist of many mobile nodes (e.g., PDAs, notebooks) or sensors which can autonomously form a network without relying on any existing infrastructure, so they have wide potential applications such as battlefield surveillance and disaster rescue missions. In recent years, a new type of Swarm Intelligence (SI) inspired routing paradigm has become a research focus for MANET routing algorithms. Different from the traditional routing protocols [1-3], the SI inspired algorithms [4-12] are self-organized in nature. By adopting the concept of stigmergy [13], an indirect communication mechanism among different ant agents (software packets), the network information can be collected and updated in a decentralized and dynamic way. Through the localized interactions among the various ant agents, global network performance, such as energy consumption, load balancing, and routing overhead, can be optimized. Our contribution in this paper lies in the following two aspects. First, we present some state-of-the-art SI inspired routing algorithms for MANETs. Second, we give a comprehensive comparison of these algorithms on various network metrics, such as
∗ Corresponding author.
packet delivery ratio, delay, routing overhead, delay jitter, goodput, and throughput. Extensive simulation results are provided with detailed analysis and explanation.
2 Related Work
In MANETs, routing protocols can be categorized into proactive routing protocols (e.g., DSDV [1]), reactive routing protocols (e.g., DSR [2], AODV [3]), and hybrid routing protocols, which combine both. Nevertheless, both proactive and reactive protocols have intrinsic disadvantages. For example, proactive routing protocols may suffer from heavy communication overhead, especially when the network scale is large or the nodes move very fast. On the other hand, reactive routing protocols suffer from longer latency, even though they are scalable and effective in reducing routing overhead. To tackle the disadvantages above, a variety of Swarm Intelligence (SI) inspired routing algorithms addressing different performance metrics have been proposed in recent years, as shown in Table 1. ABC [4] and AntNet [5] are two of the earliest works on SI based routing for wired networks. The main purpose of ABC is to avoid traffic congestion and achieve load balancing in circuit-switched networks by introducing a dynamic pheromone updating and aging mechanism. AntNet [5] is a mobile agent based Monte Carlo system targeting packet-switched networks. [6] provides a survey and some new directions for Ant Colony Optimization (ACO) based routing in wired networks. ARA [7] is the first routing algorithm for MANETs based on the concept of SI, and especially on the ant colony based metaheuristic; in ARA, the routing table is maintained through data packets so as to reduce routing overhead. The uniqueness of PERA [8] is that, for a certain percentage of the time, the probability of selecting the next hop is uniformly distributed rather than pheromone based. As with [5], the authors of AntHocNet [9] extended their rich experience in wired network routing to MANET routing problems; it is a hybrid routing algorithm which combines a reactive route setup phase with a proactive route maintenance phase. Similarly, the authors of BeeAdHoc [10] are the same as those of BeeHive [11], whose application target is fixed networks; their latest work can be found in [14]. Finally, in [12], an SI inspired routing algorithm is for the first time applied to hybrid ad hoc networks, which include
Table 1. Various SI inspired routing algorithms and main metrics
Algorithm       Packet delivery ratio  Delay  Routing overhead  Delay jitter  Goodput  Throughput
ABC (96)                                                                               YES
AntNet (98)     YES                    YES                                             YES
ARA (02)        YES                           YES
PERA (03)       YES                    YES    YES                             YES      YES
AntHocNet (04)  YES                    YES    YES               YES
BeeAdHoc (05)   YES                    YES    YES                                      YES
ANSI (06)       YES                    YES    YES               YES
both pure MANETs and other highly capable networks, such as mesh networks or cellular networks. Thus, the routing strategy can be either proactive or reactive, depending on whether a node is connected to a highly capable network or not.
3 Swarm Intelligence Inspired Routing Algorithms in MANETs
In order to understand the characteristics of Swarm Intelligence, we first look at the SI inspired routing problem in MANETs with an example. Then we explain the self-organizing nature of SI in the context of routing for MANETs.
3.1 Swarm Intelligence Inspired Routing Procedure
3.1.1 Route Setup Phase
As shown in Fig. 1, each node has a routing table as well as a pheromone table. Once node 1 has packets to send, it first checks its routing table (Table 2), in which the rows represent its neighbors and the columns represent the destination nodes. If there is no information about the destination node 5, it initiates the route setup phase: it broadcasts a route request packet (an ant agent) to its neighbors to find information about node 5. If a neighbor node does not have information about the destination, it broadcasts again, until the request finally reaches the destination. During this process, the intermediate nodes are recorded in the request packet in sequence, e.g. P = {1, 2, 5}, and a sequence number is used to avoid loops. Once the request packet reaches node 5, a route reply packet is sent along the reverse of route P. It is worth noting that the route and pheromone information is not updated until this backward process, so as to reflect the latest network situation. The hop count and time stamp from destination node 5 to each intermediate node are then recorded in their routing and pheromone tables based on a heuristic function. The entry in the routing table can be a combination of both delay and hop count, so that entries with a shorter delay and smaller hop count can later be selected as the next hop toward the destination.
Fig. 1. Illustration of the SI based routing problem
Table 2. Routing table of node 1 (rows: neighbors; columns: destinations)

     2     3     4     5
2    p2,2  p2,3  p2,4  p2,5
3    p3,2  p3,3  p3,4  p3,5
4    p4,2  p4,3  p4,4  p4,5
5    p5,2  p5,3  p5,4  p5,5
Finally, the route is established and the data packets can be sent from node 1 to node 5.
3.1.2 Route Maintenance Phase
It is worth mentioning that during the route setup phase, multiple paths may be built thanks to the broadcasting mechanism, e.g. another path P' = {1, 3, 5}. The selection of the next hop can then be based on the pheromone table, whose entry p_i,j is a probability calculated as follows [13]:
p_i,j = ( (τ_i,j)^α · (η_i,j)^β ) / ( Σ_{k∈N_i} (τ_k,j)^α · (η_k,j)^β )   (1)
Here, τ_i,j is the pheromone value and η_i,j is an index of link quality, which can simply be represented as 1/d_i,j; N_i is the set of neighbors of node i (here, the source node 1); and α and β are two tunable parameters which control the convergence of the algorithm. A neighbor with a higher probability has a greater chance of being selected as the next hop. From Eq. (1) we can see that nodes with a shorter distance to the previous node, or with a higher pheromone value, are more likely to be chosen as the next hop. Suppose the pheromone values at nodes 2 and 3 are the same; then node 2 has a higher probability of being chosen as the next hop, since its distance to node 1 is shorter than node 3's. The route information can be maintained through periodic "hello" packets or even the data packets themselves [7]. Since next-hop selection is probability based, the workload on one route can be shared with other routes to achieve load balancing.
3.1.3 Route Failure Handling Phase
Due to the dynamic nature of MANETs, once a link fails because of node movement or energy depletion, the following measures can be taken. Suppose node 2 has moved out of the range of node 1 and the link between nodes 1 and 2 fails. First, node 1 sets the routing table and pheromone table entries related to node 2 to empty. Then it checks its routing table again to find an alternative route; here there are two other candidates, {1, 3, 5} and {1, 4, 5}, so node 1 chooses the one with the larger p_i,j. If there is no alternative route, the route setup phase is initiated again.
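A compact Python sketch of the probabilistic next-hop rule of Eq. (1); the pheromone and link-quality values below are illustrative numbers for the example of Fig. 1, not data from the paper:

import random

def choose_next_hop(neighbors, tau, eta, dest, alpha=1.0, beta=2.0):
    """Pick the next hop toward dest with the probability p_{i,j} of Eq. (1).

    tau[(k, dest)] is the pheromone on the link via neighbor k toward dest;
    eta[(k, dest)] is the link-quality index, e.g. 1/d to neighbor k.
    """
    weights = [(tau[(k, dest)] ** alpha) * (eta[(k, dest)] ** beta)
               for k in neighbors]
    return random.choices(neighbors, weights=weights, k=1)[0]

# Node 1 choosing among neighbors 2, 3, 4 toward destination 5: with equal
# pheromone on nodes 2 and 3, the closer node 2 is chosen more often.
tau = {(2, 5): 0.5, (3, 5): 0.5, (4, 5): 0.3}
eta = {(2, 5): 1 / 10.0, (3, 5): 1 / 15.0, (4, 5): 1 / 12.0}
print(choose_next_hop([2, 3, 4], tau, eta, dest=5))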
3.2 Self-organizing Nature of Swarm Intelligence
Here, we introduce the self-organizing nature of Swarm Intelligence in the context of routing for MANETs. The main ingredients of SI lie in the following four aspects: positive feedback, negative feedback, amplification of fluctuations, and multiple interactions among multiple agents [13].
3.2.1 Positive Feedback
The notion of positive feedback is closely related to reinforcement learning, a branch of machine learning within artificial intelligence. Each time a link is visited, the pheromone value along that link is incremented by a small constant amount; that link then has a higher probability of being revisited based on formula (1), because it has demonstrated better link performance, such as shorter distance, lower latency, or higher remaining energy. However, traffic congestion and link failure are likely to be caused if the pheromone value is increased monotonically, so negative feedback and a thresholding mechanism need to be adopted simultaneously.
3.2.2 Negative Feedback
Analogous to the evaporation of pheromone, when certain links are not visited for some time, this indicates that those links might have a lower priority in terms of energy efficiency or end-to-end delay, so the pheromone along those links needs to be decreased by a certain amount. Either a linear or a non-linear decreasing function can be adopted, depending on the application scenario. The decreasing function needs to be selected carefully, because too fast a decrease will undervalue nodes with higher priority, while too slow a decrease will hinder the convergence of network performance.
3.2.3 Amplification of Fluctuations
This is a critical factor in many self-organizing systems; without it, most self-organized systems would fall back to being static and deterministic rather than dynamic and stochastic. It preserves candidate solutions with lower priority, which might seem inferior at first. For example, node 4 might seem inferior at first, but it still has a probability of being chosen as the next hop of node 1, in which case its pheromone value will be increased. Later on, node 4 might become suitable to share the workload with node 2; in that case, load balancing prevents node 2 from running out of energy quickly.
3.2.4 Multiple Interactions
This is also one of the critical factors ensuring system robustness. As mentioned before, during the route failure phase, alternative routes can be chosen based on the multiple interactions among ant-like agent packets. Moreover, during the route setup phase, even if one of the ant agents fails, a route can still be found by the other ant agents. In the meantime, the convergence rate is also accelerated by this parallel route searching mechanism.
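The positive/negative feedback described in Sections 3.2.1-3.2.2 amounts to a reinforce-and-evaporate rule; the following is a linear-evaporation sketch in which the increment delta, the evaporation rate rho, and the cap tau_max are illustrative parameters of ours:

def update_pheromone(tau, visited_links, delta=0.05, rho=0.02, tau_max=5.0):
    """One maintenance step: reinforce recently used links, evaporate the rest."""
    for link in tau:
        if link in visited_links:
            # positive feedback, capped so one route cannot attract all traffic
            tau[link] = min(tau[link] + delta, tau_max)
        else:
            # negative feedback: linear evaporation of links not visited
            tau[link] = max(tau[link] - rho, 0.0)
    return tau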
4 Performance Evaluation
From Table 1 and Table 3, we find that two major problems exist in the simulation studies of the various SI inspired routing algorithms. First, different network metrics are used for performance evaluation, defined and compared based on different criteria. Second, the network environments, including network size, node number, mobility model, and traffic model, differ between them. We therefore try to compare all of their performances and draw some common conclusions in this survey paper; later on, we can deepen our future study based on the analysis and comparison here.
4.1 Simulation Environment
In the simulation environment, N nodes are randomly deployed in an [X, Y] m² area with a maximum transmission radius of R meters. The most commonly used mobility model is the random waypoint (RWP) model, in which a node moves with a velocity uniformly distributed in the range [Vmin, Vmax]. Here
Vmin is usually set to 0 and Vmax is the maximum velocity. After reaching one place, the node stays there for a certain pause time and then randomly moves to the next place with a new velocity. A total simulation time is set so that the simulation finally converges whether or not the traffic session has finished. A constant bit rate (CBR) traffic model is adopted, and K connections are selected randomly from the N nodes as source and destination pairs. The packet rate can be set to 1, 4, 8, or 16 packet(s)/s, and the packet size is usually defined from 64 bytes to 1024 bytes. Traffic can be initiated and terminated at any time within the simulation time.

Table 3. Simulation environment comparison

Algorithm   [X,Y] (m²)     N        R (m)  Vmax (m/s)  Pause (s)             Sim. time (s)  Conn.  Pkt size (B)  Pkt rate (p/s)
ARA         1500 x 300     50       250    10          0, 30, ..., 120, 300  900            10     64..1024      4
PERA        500 x 500      20       250    20          50, 100               900            4      *             *
AntHocNet1  3000 x 1000    100      300    20          0..480                900            20     64            1
AntHocNet2  1000 x 1000    100      110    20          0..480                900            20     64            8
AntHocNet3  [750², 2250²]  50..500  110    20          30                    900            20     64            8
BeeAdHoc    2400 x 800     50       250    1..20       60                    1000           1      64            10
ANSI        [1100², 2460²] 50..250  250    10          20                    300            N/2    64            1
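A bare-bones version of the RWP mobility model of Section 4.1, with the field size, speed range, and pause time taken as parameters; the fixed-step integration is a simplification of ours:

import random

def random_waypoint(x_max, y_max, v_min, v_max, pause, steps, dt=1.0):
    """Yield (x, y) positions of one node moving under the random waypoint model."""
    x, y = random.uniform(0, x_max), random.uniform(0, y_max)
    tx, ty = random.uniform(0, x_max), random.uniform(0, y_max)  # first waypoint
    v, wait = random.uniform(v_min, v_max), 0.0
    for _ in range(steps):
        if wait > 0:                          # pausing at a reached waypoint
            wait -= dt
        else:
            dx, dy = tx - x, ty - y
            dist = (dx * dx + dy * dy) ** 0.5
            if dist <= v * dt:                # waypoint reached: pause, then repick
                x, y, wait = tx, ty, pause
                tx, ty = random.uniform(0, x_max), random.uniform(0, y_max)
                v = random.uniform(v_min, v_max)
            else:                             # move one step toward the waypoint
                x, y = x + v * dt * dx / dist, y + v * dt * dy / dist
        yield (x, y)

# e.g., one node in the 1500 x 300 m ARA scenario: Vmax = 10 m/s, 30 s pauses
trace = list(random_waypoint(1500, 300, 0, 10, 30, steps=900))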
Fig. 2. Packet delivery ratio comparison
4.2 Simulation Results and Comparison
Here, we do not compare ABC and AntNet with the other routing algorithms, since they are wired-network oriented. It should be noted that in Table 3 the AntHocNet algorithm is further divided into three configurations based on its application scenario. AntHocNet1 deals with light traffic, since 20 CBR traffic pairs send out 1 packet per second; AntHocNet2 deals with heavy traffic, with a packet rate of 8 packets per second; AntHocNet3 is similar to AntHocNet2 except that the node density is kept constant (1 node per 100 x 100 m²). Besides, there are five experiments in ANSI, ranging from a hybrid network with UDP, a hybrid network with TCP, and a large hybrid network with UDP, to a pure MANET with TCP and a pure MANET with UDP. Here we choose the last experiment, the pure MANET with UDP, since our network environment is a pure MANET with UDP CBR traffic.
4.2.1 Packet Delivery Ratio
The packet delivery ratio is the ratio of correctly delivered packets to the total packets sent. From Table 1 we can see that it is one of the most commonly compared network metrics among SI inspired routing algorithms. From Fig. 2(a) we can see that the packet delivery ratio increases with pause time in ARA. Here the performance of DSR is the best and ARA is slightly inferior to DSR, but both stay above 95%. In Fig. 2(b), however, the delivery ratio of AntHocNet decreases with pause time under both light and heavy traffic; the reason is that the network topology is sparse and there are some isolated nodes with no neighbors, so nodes may fail to forward packets to other nodes. The PERA results show that the packet delivery ratio is higher at lower node velocity, although the difference is small. The BeeAdHoc results once again verify the conclusion from PERA that the packet delivery ratio decreases at high node velocity. In BeeAdHoc2, we can see that larger packets can cause traffic congestion and thus decrease the delivery ratio.
Fig. 3. Delay comparison
Once again, DSR shows the best delivery performance, as in the ARA comparison, so we can still draw some common conclusions here. Finally, regarding the relationship between packet delivery ratio and node number, we see from both AntHocNet3 and ANSI that it decreases with the node number, and that both of these SI inspired algorithms scale better than AODV, especially when the node number exceeds 150.
4.2.2 Delay
From Fig. 3(a) we can see that node velocity has little influence on end-to-end delay in the BeeAdHoc experiments. Although DSR has the best packet delivery ratio, here it shows the worst delay. Thanks to its simple data structures and the small number of control packets used, BeeAdHoc achieves good delay performance. The influence of packet size is that larger packets can cause traffic congestion and delay packet transmission. For the PERA algorithm, we can once again conclude that velocity has little impact on delay. However, AODV shows sharp delay spikes at certain times in both the low and high velocity cases; the reason is that AODV needs more time to deal with traffic congestion or link failures, while in PERA the data packets can be transmitted through an alternative path, as mentioned before. In Fig. 3(b), the delay of AntHocNet is about one third to one half of AODV's under light traffic, and the trend persists under heavy traffic; the reason is that AntHocNet is a hybrid routing algorithm with a proactive route maintenance function. The same conclusion can be drawn from the AntHocNet and ANSI results: delay increases with node number. Since the node density is kept constant in both algorithms, the network size also increases with the node number; with more nodes, more nodes are involved in a traffic session, which usually increases the hop count and the delay.
4.2.3 Routing Overhead
Routing overhead is defined in AntHocNet as the average number of control packet transmissions per data packet delivered, while in the other algorithms it is usually defined as the specific number of routing packets sent.
Fig. 4. Routing overhead comparison
From Fig. 4(a), we can see a general trend that more routing overhead is needed under higher mobility; specifically, the longer a node stays in a certain position, the fewer control packets are needed in ARA. Recall that in BeeAdHoc, foragers are only sent back when the destination has packets to send to the source node, and they are carried in the header of the bee swarms; in that case fewer control packets are needed, causing a large reduction in routing overhead. In AntHocNet, however, the routing overhead is a disadvantage: it pays for all the other favorable metrics of AntHocNet, such as delivery ratio, delay, and delay jitter. In contrast to ARA, it uses more control packets to discover and maintain routes, and the longer a node stays, the more packets are sent there. It is also worth noting from Fig. 4(b) that the AntHocNet algorithm gains an advantage over AODV when the network scale is large; the route maintenance and route failure handling mechanisms appear to form a trade-off. Again, having half of the nodes involved as data sources may cause traffic congestion and thus increase the number of control unicast or broadcast packets in ANSI.
4.2.4 Delay Jitter
As an important QoS metric, delay jitter means the average difference in inter-arrival time between packets. From Fig. 5 we observe that the performance of AntHocNet is always better than that of AODV. For AntHocNet, the average delay jitter is smaller under heavy traffic than under light traffic, by its definition, and if nodes move more frequently, the delay jitter is also smaller, as one would expect. Even when the node number increases to 300, AntHocNet's jitter remains very good compared with AODV's. On the other hand, traffic congestion is easily caused in ANSI, since half of the nodes serve as data sources; due to the slow packet rate and large network size (which corresponds to the node number), the delay jitter in ANSI is relatively larger than in AntHocNet.
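Reading that definition as the mean absolute difference between consecutive packet inter-arrival times, jitter can be computed from arrival timestamps as follows (our interpretation, sketched):

def delay_jitter(arrival_times):
    """Average difference in inter-arrival time between consecutive packets."""
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    diffs = [abs(b - a) for a, b in zip(gaps, gaps[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0

print(delay_jitter([0.0, 1.0, 2.1, 3.0, 4.4]))  # -> about 0.267 seconds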
Fig. 5. Delay jitter comparison
Fig. 6. Goodput and throughput comparison
4.2.5 Goodput and Throughput
From Fig. 6 we can see that the goodput of PERA is inferior to that of AODV at both low and high velocities, and that it decreases under high mobility. This is because more control packets are sent out to find new paths when a link breaks due to the fast movement of nodes. The throughput of AODV and PERA is nearly the same under the different mobility settings.
5 Conclusion and Future Work
In this paper, we make a comprehensive comparison and analysis of SI inspired routing algorithms for MANETs. The self-organizing nature of SI and the integration of SI principles with routing mechanisms are illustrated and explained. It is our hope that readers can get some hints for their future research on SI inspired routing problems from the common conclusions we draw as well as from our comparative figures and tables. In the near future, we will study the energy consumption and load-balancing performance of SI inspired routing algorithms. Besides, the heuristic functions that
associate pheromone with probability, and other network metrics such as delay, hop count, and remaining energy, will also be studied.
Acknowledgement. This research was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Advancement) (IITA-2009-(C1090-0902-0002)).
References
1. Perkins, C.E., Bhagwat, P.: Highly Dynamic Destination-Sequenced Distance Vector Routing (DSDV) for Mobile Computers. Computer Communications Rev., pp. 234–244 (October 1994)
2. Johnson, D.B., Maltz, D.A., Hu, Y.C., Jetcheva, J.G.: The Dynamic Source Routing Protocol for Mobile Ad hoc Networks. IETF Internet draft, draft-ietf-manet-dsr-04.txt (November 2000)
3. Perkins, C.E., Royer, E.M., Das, S.R.: Ad hoc On-demand Distance Vector (AODV) routing. IETF Internet draft, draft-ietf-manet-aodv-07.txt (November 2000)
4. Schoonderwoerd, R., Holland, O., Bruten, J., Rothkrantz, L.: Ant-Based Load Balancing in Telecommunications Networks. Adaptive Behavior 5(2), 169–207 (1996)
5. Di Caro, G., Dorigo, M.: AntNet: Distributed Stigmergetic Control for Communications Networks. Journal of Artificial Intelligence Research (JAIR) 9, 317–365 (1998)
6. Sim, K., Sun, W.: Ant Colony Optimization for Routing and Load-balancing: Survey and New Directions. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 33(5), 560–572 (2003)
7. Gunes, M., Sorges, U., Bouazizi, I.: ARA - The Ant-Colony Based Routing Algorithm for MANETs. In: Proceedings of the ICPP Workshop on Ad Hoc Networks (IWAHN 2002), pp. 79–85. IEEE Computer Society Press, Los Alamitos (2002)
8. Baras, J.S., Mehta, H.: A Probabilistic Emergent Routing Algorithm for Mobile Ad hoc Networks. In: WiOpt 2003: Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (March 2003)
9. Di Caro, G., Ducatelle, F., Gambardella, L.M.: AntHocNet: An Ant-based Hybrid Routing Algorithm for Mobile Ad hoc Networks. In: Yao, X., Burke, E.K., Lozano, J.A., Smith, J., Merelo-Guervós, J.J., Bullinaria, J.A., Rowe, J.E., Tiňo, P., Kabán, A., Schwefel, H.-P. (eds.) PPSN 2004. LNCS, vol. 3242, pp. 461–470. Springer, Heidelberg (2004)
10. Wedde, H., Farooq, M.: The Wisdom of the Hive Applied to Mobile Ad-Hoc Networks. In: Proceedings of the IEEE Swarm Intelligence Symposium (SIS), pp. 341–348 (June 2005)
11. Wedde, H.F., Farooq, M., Zhang, Y.: BeeHive: An Efficient Fault-tolerant Routing Algorithm Inspired by Honey Bee Behavior. In: Dorigo, M., Birattari, M., Blum, C., Gambardella, L.M., Mondada, F., Stützle, T. (eds.) ANTS 2004. LNCS, vol. 3172, pp. 83–94. Springer, Heidelberg (2004)
12. Rajagopalan, S., Shen, C.-C.: ANSI: A Swarm Intelligence-based Unicast Routing Protocol for Hybrid Ad hoc Networks. Journal of Systems Architecture, Special Issue on Nature Inspired Applied Systems, 485–504 (2006)
13. Bonabeau, E., Dorigo, M., Theraulaz, G.: Swarm Intelligence: From Natural to Artificial Systems. Oxford University Press, New York (1999)
14. Wedde, H.F., Farooq, M.: A Comprehensive Review of Nature Inspired Routing Algorithms for Fixed Telecommunication Networks. Journal of Systems Architecture, Special Issue on Nature Inspired Applied Systems, 461–484 (2006)
Analysis of Moving Patterns of Moving Objects with the Proposed Framework* In-Hye Shin1, Gyung-Leen Park1,**, Abhijit Saha1, Ho-young Kwak2, and Hanil Kim3 1
Dept. of Computer Science and Statistics, Cheju National University, 2 Dept. of Computer Engineering, Cheju National University, 3 Dept. of Computer Education, Cheju National University, 690-756, Ara 1 Dong Jeju-si, Jeju-do, Republic of Korea {ihshin76, glpark, abhijit298, kwak, hikim}@jejunu.ac.kr
Abstract. This paper proposes an analysis framework which enables us to analyze the moving patterns of moving objects. To show the effectiveness of the framework, we apply it to the analysis of taxi moving patterns based on the real-life location history data accumulated by the Taxi Telematics system developed in Jeju, Korea. The analysis aims at obtaining the value-added information necessary to provide empty taxis with location recommendation services for efficient taxi operations. The proposed framework provides a flow chart which gives a quick overview of the overall analysis process and helps the same or similar analyses be carried out quickly, saving time and cost. The data mining tool used in the framework is Enterprise Miner (E-Miner) in SAS, one of the most widely used statistics packages, which can effectively handle huge amounts of log data. In particular, we perform a refined analysis by repeatedly applying the well-known k-means clustering method under various spatial or temporal conditions. The paper proposes a refined data mining process that 1) extracts the dataset of interest containing meaningful information derived from previous clustering results, 2) performs detailed clustering again on the extracted dataset, and 3) finally extracts value-added information such as good pick-up spots, or 4) returns feedback. As a result, the spatiotemporal pattern analysis within each refined clustering step makes it possible to recommend that empty taxis go to a nearby cluster location with a statistically high pick-up frequency, resulting in a reduction of the empty-taxi ratio.
1 Introduction
The Taxi Telematics system operates an efficient taxi dispatch service using a real-time location tracking function on Jeju Island, Korea [1]. Each taxi, equipped with a GPS receiver, reports its location to the central call server periodically. The call
This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement). (IITA-2009-C1090-0902-0040). ** Corresponding author.
control server keeps these reports from each taxi and handles a call request from a customer by finding and dispatching the taxi closest to the call point. The data accumulated in the Taxi Telematics system can be utilized for various analyses in both research and business areas [2]. In Jeju, for efficient analysis, previous research has developed a data processing framework [3,4], while some researchers have already dealt with the most basic and useful analyses after extracting the pick-up records [5,6]. In particular, [6] analyzed the passenger pick-up pattern using the SAS statistics software package for taxi location recommendation, but it simply provided the spatiotemporal distribution of pick-ups and drop-offs. The empty taxi ratio can clearly be reduced by guiding a taxi to a spot where many passengers are waiting; according to a survey in [5], the empty taxi ratio in the Jeju area is about 80%. Therefore, many taxi businesses are interested in an effective location recommendation service that increases income and decreases fuel consumption; reducing fuel consumption is all the more important today for protecting the environment. Further research on effective pattern analysis based on real location data is still necessary, as different areas have different traffic patterns due to the uniqueness of their facilities and road segment distributions [7]. Pattern analysis on real data using data mining is also a topic of interest because of the importance of the value-added information it yields, as shown in [9],[10]. In this regard, this paper proposes a systematic framework and performs a sophisticated analysis for real applications. The proposed analysis framework consists of a data mining-specific data processing framework and a novel analysis processing framework for refined data mining, aimed at extracting location recommendation information that can be given to an empty taxi the instant a passenger gets off. This framework supports sophisticated yet quick analyses. Compared with the related works [5,6], our analysis has two key points: it employs the SAS data mining model containing powerful analysis tools, and it carries out further spatiotemporal analyses of taxis' moving patterns as well as pick-up patterns. In particular, the paper develops a spatiotemporally refined clustering analysis that takes into account important factors such as the area of interest, the time of interest, and the driving distance (or driving hour) of interest, in keeping with business requirements. This paper is organized as follows. Section 2 proposes the analysis framework for the location history data collected from the Taxi Telematics system. For location recommendation, Sections 3 and 4 exhibit the detailed results obtained from the primary analysis and the various refined analyses of the moving patterns of taxis, respectively, focusing especially on pick-up patterns. Finally, Section 5 summarizes and concludes the paper with a brief introduction of future work.
2 Proposed Analysis Framework

This section proposes the data processing framework and shows how to analyze the moving patterns of taxis effectively. The proposed framework provides a flow chart that gives a quick overview of both the data flow and the overall analysis process. It helps quickly perform the same or similar analyses while saving time and cost.
Fig. 1. Data processing framework
Figure 1 outlines the overall procedure for location data processing. In Jeju, 200 taxis report their location records every minute, and each record includes taxi ID and status fields in addition to the basic GPS data such as timestamp, latitude, longitude, direction, and speed, as shown in the database table. First, an analyzer transforms such records into a data type (such as TXT, DAT, etc.) readable in SAS and stores them as SAS datasets. We analyze the location dataset using the SAS analysis engine, which consists of the program editor, log, output, the data mining analysis tool (Enterprise Miner), and so on. The results obtained from the pattern analysis can be depicted using the SAS output interface and also on the Google Earth map interface to enhance trajectory visualization.

2.1 SAS Analysis Process for Data Mining

Data mining should be applied efficiently, aiming at deducing useful information [10]. Fig. 2 shows the series of steps in the data mining analysis process: the data cleansing process, which collects, explores, and modifies the large amount of data; the modeling process, which analyzes the refined data using a proper method; the assessment process, which evaluates the applied model; the utilization process, which uses the generalized information; and the feedback process, which provides factors for another analysis. Such a process helps obtain more useful information.
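The paper performs this processing in SAS; purely as an illustration of the transformation and filtering step just described, the following Python sketch shows how such raw reports might be loaded and screened. The field layout, the ISO timestamp format, and the Jeju bounding box are assumptions for illustration, not details taken from the paper.

import csv
from datetime import datetime

# Hypothetical field layout for one taxi report, following the description
# above: taxi ID and status plus basic GPS data (timestamp, latitude,
# longitude, direction, speed).
FIELDS = ["taxi_id", "status", "timestamp", "lat", "lon", "direction", "speed"]

def load_reports(path):
    """Read raw text records and keep only well-formed, plausible rows."""
    rows = []
    with open(path, newline="") as f:
        for rec in csv.DictReader(f, fieldnames=FIELDS):
            try:
                rec["timestamp"] = datetime.fromisoformat(rec["timestamp"])
                rec["lat"], rec["lon"] = float(rec["lat"]), float(rec["lon"])
                rec["speed"] = float(rec["speed"])
            except (ValueError, TypeError):
                continue  # drop malformed lines, e.g. from a faulty GPS report
            # crude spatial filter: keep reports inside an assumed Jeju bounding box
            if 33.1 <= rec["lat"] <= 33.6 and 126.1 <= rec["lon"] <= 127.0:
                rows.append(rec)
    return rows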
Fig. 2. SAS procedure for Data Mining
In particular, SAS Enterprise Miner (E-Miner) is a tool for exploring the hidden relations and patterns in a huge amount of data, modeling such relations or patterns, transforming the extracted information so that it is adaptable to a specific task, applying the information to business decisions, and, moreover, supporting a Graphical User Interface (GUI) environment. It supports the overall data mining process of addressing and analyzing the data and extracting valuable information by means of the SEMMA methodology [11]. SEMMA consists of five main steps: 1) Sampling: identifying input data sets, sampling from a larger data set, and partitioning the dataset into training, validation, and test datasets; 2) Exploration: exploring the dataset statistically and graphically and obtaining descriptive statistics; 3) Modification: preparing the data for analysis by creating additional variables, transforming existing variables, and removing outliers; 4) Modeling: fitting a predictive model, i.e., modeling a target variable using a cluster analysis, an association analysis, a regression model, a decision tree, a neural network, or a user-defined model; 5) Assessment: comparing competing predictive models. This paper exploits a clustering method for data modeling. Clustering is an analytic method that divides the data into meaningful clusters by a specific criterion such as spatiotemporal similarity or homogeneity [11].

2.2 Proposed Analysis Process for the Refined Clustering Method

Fig. 3 shows the analysis framework for the refined clustering of taxi records to recommend the best locations, such as good pick-up spots. This framework captures an efficient overall pattern analysis process: if the pattern analysis is done according to the analytic scheme and order in Fig. 3, the analysis cost and time can be reduced. This processing framework could also be adapted to other transport industries similar to the taxi business, such as delivery, by changing or adding some variables and partially modifying the flow chart. Following the procedure in Fig. 3, a data analyzer should first load the taxi records, store them as SAS datasets, and modify the SAS dataset through variable transformation and data filtering to remove abnormal data, which is generally generated by abnormal operation of the GPS receiver or by the driver's mistreatment of the telematics device, as in the Modification step of SEMMA in Fig. 2. The total SAS dataset is divided into two datasets. The first dataset extracts pick-up and drop-off data record by record. The second dataset extracts paired records, each consisting of a pick-up and the corresponding drop-off; it can tell where a passenger who boards a taxi at a specific spot gets off. A primary clustering analysis is then performed to look first at the whole distribution of pick-ups and drop-offs. To examine cluster locations in detail, a refined clustering analysis is necessary. The analyzer can clearly analyze a factor of interest through data classification or extraction, isolating significant factors such as a specific area, a specific pick-up or drop-off time, a specific driving hour (or driving distance), a specific weekday, and so on. Each dataset and clustering can be built from join conditions among the factors of interest. There is a recurrence of such a refined analysis
process, adjusting the feedback factors that result either from the clustering process or from the decision information derived through result interpretation.
3 Primary Analysis Results

This section shows the results obtained from the basic clustering analysis, covering pick-up patterns, drop-off patterns, spatial distributions, and temporal distributions. The paper exhibits the analysis results mainly for pick-ups rather than drop-offs, reflecting the current interests expressed by taxi drivers. As shown in Fig. 3, the analysis procedure with data cleaning (data transformation and filtering) creates one dataset with 163,490 records in total, consisting of 81,745 pick-up records and 81,745 drop-off records. The other dataset consists of paired pick-up and drop-off records, 81,656 records in total, excluding abnormal records and those with too long or too short boarding. The paper uses the k-means clustering method, which builds clusters based on spatial distance for a given fixed number (k) of clusters, for the clustering analysis in the E-Miner tool. Fig. 4 and Fig. 5 show the clusters of the main pick-up hotspots and drop-off hotspots, respectively, on the Google Earth map, after performing k-means clustering on two target location variables (longitude and latitude) with k fixed to 100. Each cluster shows the cluster number, the record frequency included in the cluster, and the cluster range made of the maximum and minimum points of the clustering result. Most pick-up and drop-off spots are located near wide intersections such as 4-way or 5-way crossings, crowded stations, main buildings, or famous hotels.
Fig. 3. The analysis framework for the refined data mining of taxi location records
Fig. 4. Cluster locations of pick-up records
Fig. 5. Cluster locations of drop-off records
Note that most clusters are distributed over Jeju City in Jeju Island, because most taxis operate in the city, the administrative center holding a large portion of the population. For further detailed analysis, we divided the Jeju City area into three subareas, the new town area, the old town area, and the airport area, according to the population and vehicle distributions. The clustering analysis reveals that the moving patterns of taxis are most closely related to the population density and the locations of principal facilities. Hence, the refined analysis depends on these three areas of interest. Fig. 6 shows the pick-up frequency of the top 16 clusters by hour of day. The figure indicates that pick-ups are concentrated in commuting hours and around midnight. In particular, we take 19~20 o'clock (7~8 p.m.) as the time of interest. Moreover, from the viewpoint of a taxi business, one important factor is a taxi's driving distance (or driving hour) while carrying a passenger: the longer the boarding hour (or boarding distance), the larger the taxi fare. We are therefore interested in long driving hours, which can differ from long distances due to the complication of computing non-straight driving distances.
Fig. 6. Pick-up clustering distribution according to daily hour
In the end, the paper defines three factors of interest, the pick-up area, the pick-up time, and the driving hour, to further refine the clustering analysis, and uses them for the concentrated analysis in the next section.
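The analysis itself is carried out in SAS E-Miner; the following Python sketch (using scikit-learn's KMeans) only illustrates the two-stage idea of this and the next section: a coarse clustering over all pick-ups followed by a refined clustering restricted to a time window of interest. Field names, k values, and the frequency-ranking step are illustrative assumptions, not the paper's implementation.

import numpy as np
from sklearn.cluster import KMeans

def refined_pickup_clusters(pickups, hour_range=(19, 20), k1=100, k2=50):
    """Two-stage clustering sketch: a coarse pass over all pick-ups,
    then a refined pass restricted to an interesting time window."""
    coords = np.array([(p["lon"], p["lat"]) for p in pickups])
    coarse = KMeans(n_clusters=k1, n_init=10, random_state=0).fit(coords)

    # refined pass: keep only pick-ups inside the interesting hours
    mask = np.array([hour_range[0] <= p["timestamp"].hour < hour_range[1]
                     for p in pickups])
    refined = KMeans(n_clusters=k2, n_init=10, random_state=0).fit(coords[mask])

    # rank the refined clusters by pick-up frequency for recommendation
    counts = np.bincount(refined.labels_, minlength=k2)
    order = np.argsort(counts)[::-1]
    return coarse, refined.cluster_centers_[order], counts[order]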
4 Refined Clustering Analysis Results

This section adds spatial and temporal attributes (the user-defined factors of interest from the previous section) to the clusters, performs the refined clustering analysis under spatial or temporal conditions of interest, and shows the results through the Google Earth interface. The two following figures exhibit the clustering distribution of drop-off hotspots for a specific pick-up spot on the Google Earth map, after performing k-means clustering on the two drop-off location variables (Elongitude and Elatitude) with k=50. Fig. 7 shows the moving pattern of the maximum pick-up cluster, Cluster ID
Fig. 7. Moving pattern of the maximum pick-up cluster in the old town
Fig. 8. Moving pattern of the maximum pick-up cluster in the new town
97, with 2,958 records out of the total 81,745 in Fig. 4. Fig. 8 shows the moving pattern of the second pick-up cluster (the maximum cluster within the new town), Cluster ID 11 with 2,542 records in Fig. 4. The top 13 clusters cover 94.1% (2,740 out of 2,909) of the records in Fig. 7, while the top 9 clusters cover 94.0% (2,187 out of 2,316) in Fig. 8. Taxi movements show a short-distance pattern, mostly roaming within either the old town or the new town. Fig. 9 shows the pick-up locations after performing k-means clustering on the two longitude and latitude variables with k=50 during the time of interest, 7~8 p.m. The result recommends the places where taxis chiefly pick up customers during this hot driving hour. The top 13 clusters shown in Fig. 9 cover 95.0% (5,247 out of 5,521) of the pick-ups. Fig. 10 shows the result of the refined clustering with k=40 for long-driving taxis, recommending the main spots where taxis pick up passengers who need a long drive; 9 hot clusters cover 95.6% (3,836 out of 4,014) of the pick-ups. Note that the smaller the size of the clusters, the more accurate the locations of the pick-up spots. That is, the more refined the clustering
Fig. 9. Pick-up locations during the hot driving hour
Fig. 10. Pick-up locations of the long driving taxis
analysis, the more refined the location recommendation for empty taxis. We suggest that the degree of analysis reflect the business requirements as well as the data miner's judgment.
5 Conclusions

This paper presented the development of an analysis framework for tracing the moving patterns of moving objects. To show the real applicability of the framework, it was applied to the real-life location history data collected from the Taxi Telematics system in Jeju Island. The proposed framework provides a flow chart that gives a quick overview of the overall analysis process and helps quickly carry out the same or similar analyses while saving time and cost. The paper analyzed the moving patterns of taxis, focusing on pick-up data, using a practical data mining tool, SAS E-Miner, and showed various results on the Google Earth map interface. That is, the paper performed a refined data mining analysis by repeatedly applying the k-means clustering method under various spatial or temporal conditions. The refined data mining process consists of four main steps: 1) the data extraction step, extracting a SAS dataset of interest associated with meaningful information derived from previous clustering results; 2) the refined clustering analysis step, performing sophisticated clustering on the extracted SAS dataset; 3) the decision deduction step, deriving value-added information such as good pick-up spots according to spatiotemporal conditions or business requirements (a specific area, a specific hour, a specific driving distance, or others); and 4) the feedback step, returning to step 1 or 2 with additional modifying information. As a result, the spatiotemporal pattern analysis within each refined clustering step makes it possible to recommend that empty taxis go to a nearby cluster location with a statistically high pick-up frequency, decreasing the empty taxi ratio. In the near future, under spatiotemporal conditions reflecting business requirements, the higher value-added location information deduced from the various results could be utilized in useful LBS (location-based service) applications such as a location recommendation service in Jeju or even in other areas with related businesses.
References
[1] Lee, J., Park, G., Kim, H., Yang, Y., Kim, P., Kim, S.: A telematics service system based on the Linux cluster. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007. LNCS, vol. 4490, pp. 660–667. Springer, Heidelberg (2007)
[2] Hariharan, R., Toyama, K.: Project Lachesis: Parsing and modeling location histories. In: Egenhofer, M.J., Freksa, C., Miller, H.J. (eds.) GIScience 2004. LNCS, vol. 3234, pp. 106–124. Springer, Heidelberg (2004)
[3] Lee, J., Park, G.: Design and implementation of a movement history analysis framework for the taxi telematics system. In: Asia-Pacific Conference on Communications, pp. 1–4 (2008)
[4] Lee, J., Hong, J.: Design and implementation of a spatial data processing engine for the telematics network. Applied Computing and Computational Science (2008)
[5] Lee, J.: Traveling pattern analysis for the design of location-dependent contents based on the Taxi telematics system. In: International Conference on Multimedia, Information Technology and its Applications (2008)
[6] Lee, J., Shin, I., Park, G.: Analysis of the passenger pick-up pattern for taxi location recommendation. In: International Conference on Networked Computing and Advanced Information Management, vol. 1, pp. 199–204 (2008)
[7] Liao, Z.: Real-time taxi dispatching using global positioning systems. Communications of the ACM, 81–83 (2003)
[8] He, H., Jin, H., Chen, J., McAullay, D., Li, J., Fallon, T.: Analysis of Breast Feeding Data Mining Methods. In: Proc. Australasian Data Mining Conference, pp. 47–52 (2006)
[9] Madigan, E.A., Curet, O.L., Zrinyi, M.: Workforce analysis using data mining and linear regression to understand HIV/AIDS prevalence patterns. Human Resources for Health 6 (2008)
[10] Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann, San Francisco (2006)
[11] Matignon, R.: Data Mining Using SAS Enterprise Miner. Wiley, Chichester (2007)
A User-Defined Index for Containment Queries in XML*

Gap-Joo Na and Sang-Won Lee

School of Information and Communication Engineering, Sungkyunkwan University, Chunchun-dong 300, Jangan-gu, Suwon, Kyounggi, 440-746, Korea
{factory,swkee}@skku.edu
Abstract. Containment queries are one of the most important query types for XML documents, and efficient support for them is therefore crucial for XML databases. Recently, object-relational database management system (ORDBMS) vendors have been trying to store and retrieve XML data in their products. In this paper, we propose an extensible index to support containment queries over XML data stored as a BLOB type in ORDBMSs. That is, we describe how to implement such an index using the extensibility feature of an ORDBMS, and we describe its usage. The main advantage of this index is the user's productivity in handling XML data in the SQL language. Keywords: XML, user-defined index, containment query.
1 Introduction

XML [1] is rapidly growing as the standard for data representation and data exchange over the Web and is, in practice, adopted as a standard data format by many companies. XML can be used to represent information about cars in a car company, book catalog information in a library, or a product catalog on an e-commerce web site. As a result, enormous amounts of XML data will be generated in the near future, and research on efficiently searching XML document databases is urgently needed. To this end, the W3C adopted XQuery [2] as its standard query language for XML documents over the Internet. One of the core components of XQuery is the containment query: a query based on the containment relationships among the elements, attributes, and contents within an XML document. Since the containment query is an essential type of XML query, supporting it efficiently is very important. Recently, there has been much work on containment query support based on relational databases [4,7]. We have also proposed an index scheme for XML containment queries using an RDBMS [6]. We believe that, for these academic approaches to be viable*
This research was supported by the MKE(Ministry of Knowledge Economy), Korea, under the ITRC(Information Technology Research Center) support program supervised by the IITA(Institute of Information Technology Advancement) (IITA-2009-(C1090-0902-0046)).
options for commercial use, it must be possible for users to use the schemes as easily as built-in index types such as B+-tree and bitmap indexes. In this paper, we propose an extensible index type for efficiently processing various containment queries over XML data. Even though brand-new native XML databases are also used for XML storage, we believe that object-relational DBMSs (hereafter ORDBMSs) will be the mainstream databases for XML data in the near future. This prediction is mainly based on the reliability, query optimization, and SQL support of ORDBMSs, which have been developed over the last 30 years. Nevertheless, to support XML data, an ORDBMS needs an efficient XML index structure for containment queries in addition to its traditional B+-tree and bitmap indexes. To this end, we propose in this paper an extensible index for XML containment queries based on an ORDBMS's extensibility framework. Even though we present our idea on a specific ORDBMS, our scheme can be applied to other commercial DBMSs such as IBM DB2 and MS SQL Server, because they also provide their own extensibility frameworks for user-defined indexes. The organization of this paper is as follows. In Section 2, we describe the preliminary knowledge for understanding our extensible index concept, including XML containment queries, the inverted index, and an ORDBMS's extensibility. In Section 3, we explain the implementation details of our extensible index, based on a commercial ORDBMS's extensibility. In Section 4, we show how naïve users create an extensible index on XML data and explain how a containment query can exploit it. Finally, in Section 5, we conclude the paper and list several directions for future work. This paper does not provide a separate section for related work, partly because there is, as far as we know, no prior work on extensible indexes for XML containment queries, and partly because no technical information is publicly available about commercial ORDBMSs' support for XML indexes.
2 Preliminary Knowledge

2.1 XML Containment Queries

Containment queries are a class of queries based on containment and proximity relationships among elements, attributes, and their contents [4]. As containment queries are a crucial part of the queries processed by XML IR systems, finding an effective method for processing them is important. In this paper, we classify containment queries for XML documents into four types, as in [4], and use Figure 1 to show examples of each type.
1. Indirect Containment Query: A query consisting of indirect containment relationships (predecessor-descendant relationships) among elements, attributes, and their contents, e.g., /companies//profile//'scanners'. When an XML document is represented by a tree such as in Figure 1, the leading "/" indicates that "companies" must be a root element, and "//" represents a predecessor-descendant relationship.
<companies>
  <company>
    <symbol>AAPL</symbol>
    <name>Apple Computer, Inc.</name>
    <address>
      <state>Texas</state>
      <city>Austin</city>
    </address>
    <profile>
      <description>Designs, develops, produces, markets and services,
        microprocessor based personal computers, related software and
        peripheral products, including laser printers, scanners, compact
        disk read-only memory drives and other related products</description>
    </profile>
  </company>
</companies>

Fig. 1. Example XML Document
Therefore, this query retrieves XML documents in which the "companies" root element has a "profile" descendant element which, in turn, has the descendant word "scanners".
2. Direct Containment Query: A query consisting of direct containment relationships (parent-child relationships) among elements, attributes, and their contents, e.g., /companies/company/profile/description/'printers'. The "companies" root element must have a "company" child element, which in turn must have a "profile" child element; the "profile" element must have a "description" child element whose content contains "printers". This query extracts all XML documents satisfying these conditions.
3. Tight Containment Query: A query consisting of tight containment relationships among elements, attributes, and their contents, e.g., //symbol='AAPL'. This query retrieves XML documents in which a "symbol" element contains only "AAPL" in its content.
4. k-Proximity Containment Query: A query based on the proximity between two words in the contents, e.g. (k=3): Distance("printers","scanners") < 3. This query retrieves XML documents in which an occurrence of "printers" is within distance k of an occurrence of "scanners".
Hybrid forms of the four basic containment queries can be regarded as the core parts of queries over XML IR systems. For example, "/companies/company//description='plastic'" is a hybrid of direct and indirect containment queries, and "/companies/company//address/city='Austin'" is a hybrid of direct, indirect, and tight containment queries. In this paper, we define the path length of a query as the number of containment relationships in the query.
2.2 RDB-Based Extended Inverted Index Approach to XML Containment Queries

The inverted index, which is very popular in traditional IR systems, is a word-based indexing technique for text that speeds up search [5]. The classic inverted index consists of a text word and its occurrences, which enumerate its positions within each document. To support XML containment queries in a relational database, Zhang et al. [4] proposed a modeling technique based on the following two tables, which extend the inverted index technique:

Elements (term, docno, begin, end, level)
Texts (term, docno, wordno, level)

We have pointed out, however, that this extended index approach, since it requires self-joins of the Elements table to process containment relationships between two elements, has two serious drawbacks [6]. First, the number of joins is proportional to the path length. Second, the relational tables involved in a join operation are large: if this 2-INDEX approach is used for voluminous XML documents, every join is a join between two large relations, since the Elements and Texts relations become huge in general. To overcome these problems, we proposed another technique that extends the inverted index using four relational tables:

Path (path, pathID)
PathIndex (pathID, docID, begin, end)
Term (term, termID)
TermIndex (termID, docID, pathID, position)

We showed in [6] that our technique can improve XML search performance drastically over Zhang's approach. Yoshikawa et al. proposed a similar approach [7] and also showed the superiority of this kind of scheme. According to Joe Celko's definitions of relational modeling for tree and hierarchical data [19], our approach can be categorized as a hybrid of the nested set model and path enumeration.

2.3 Extensibility in Object-Relational DBMS

In the early 1990s, RDBMS vendors invested much effort in supporting the object concept in their products, and now some products such as IBM DB2 and Oracle9i fully support it. We call these DBMSs ORDBMSs (object-relational DBMSs). One of the main reasons a DBMS engine supports the object concept is to effectively support application areas that need new multimedia data types such as spatial data, text, video, and XML. For this, users should be able to register new data types in the DBMS engine, and the engine should efficiently support various operations and indexing techniques for them. We call this kind of ORDBMS implementation infrastructure, covering the creation of new data types and the registration of operators and indexes for the new data types and operators, extensibility [8]. Major ORDBMS vendors now provide these extensibility features in their products [9,10,11].
Fig. 2. Extensibility Concept in ORDBMSs
As shown in Figure 2, Oracle provides extensibility under the name Data Cartridge [9]; its goal is to provide user-defined types with a type system, query processing, and indexing techniques, just as for built-in data types such as integers and strings. IBM DB2 and Informix (merged into IBM in 2002) provide extensibility under the names Data Extender [10] and DataBlade [11], respectively. Based on this extensibility, ORDBMS vendors currently provide many functionalities for new database application areas. Oracle, based on its Data Cartridge, provides functionality for spatial data [13], images [14], texts [15], e-Commerce data [16], and expressions [17].
Fig. 3. Extensibility Interface in Oracle DBMS (schematic: a user application issues CREATE INDEX, INSERT, SELECT, and ANALYZE statements against the Oracle server, which dispatches the corresponding ODCIIndexCreate, ODCIIndexInsert, ODCIIndexStart/Fetch/Close, ODCIStatsSelectivity/IndexCost, and ODCIStatsCollect calls to the user-defined index and user-defined statistics routines; these maintain a logical index backed by a physical index table in the database)
To understand the concept of extensible indexes easily, consider a B+-tree index on an integer type. First, a user can create a B+-tree index on any integer field of a table. Then, when records are inserted, deleted, or updated in the table, the B+-tree is changed accordingly. And when a query with a predicate on the column (for example, >, =, !=) is issued, the index may be used to accelerate the search on the table. These are the features users normally expect from an index. Similarly, with extensible indexes, 1) a user can create indexes for user-defined types and operators, 2) the contents of an extensible index are updated appropriately when the corresponding tables are updated, and 3) the query optimizer can exploit an extensible index for query processing. Figure 3 shows these concepts. As shown on the left side of Figure 3, naïve database users can simply create an extensible index on a user-defined type and insert into, delete from, or update the base table. Upon these operations, the Oracle server internally calls the corresponding functions (denoted ODCIxxxx in Figure 3; ODCI is an acronym for Oracle Data Cartridge Interface) provided by the developers of the extensible index. The physical structure of an extensible index is stored as one or more tables.
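As a rough illustration of how the pieces named in Figure 3 fit together, the DDL sketch below shows the general shape of registering an index type in Oracle's extensibility framework. The type and index names are hypothetical, and this is not the paper's actual implementation.

-- Sketch with hypothetical names; the implementation type xml_idx_impl is
-- assumed to have been created with the ODCIIndex* methods (Create, Insert,
-- Start, Fetch, Close) required by the interface.
CREATE INDEXTYPE xml_containment_idxtype
FOR contains(BLOB, VARCHAR2)
USING xml_idx_impl;

CREATE INDEX company_doc_idx ON company(document)
INDEXTYPE IS xml_containment_idxtype;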
3 Implementation of an Extensible Index for XML Containment Queries

In this section, we describe how we implement an extensible index for XML containment queries. The implementation techniques are based on the concepts explained in the previous section. In this paper, we assume that XML documents are stored as a BLOB or CLOB column in a table and that containment queries on them are expressed using the "contains" operator within SQL statements. For the rest of this paper, we assume the XML documents are stored in the document column, of BLOB type, of the Company table below. The doc_id field is a unique document identifier assigned when the XML document is inserted into the table.

Company (doc_id number, document BLOB)

This section is organized as follows. First, we explain how users express containment queries using the contains operator. Then, we describe the implementation details of our extensible index. Finally, we give an example of how naïve database users can create an extensible index on a user-defined type.

3.1 XML Containment Queries in SQL

The indirect containment query /companies//profile//'scanners' against the document in Figure 1 can be expressed in SQL as follows:

Example Query Q1:
SELECT doc_id FROM company
WHERE contains(document, '/companies//profile//''scanners''') = true;

In the above example, the operator "contains" returns true when an XML document satisfies the containment relationship and false otherwise. When an extensible index is not defined on the document column, the following function
"contains_func" should be registered in the Oracle server, and the Oracle query processor uses this function to check whether an XML document satisfies the query condition. If an extensible index is provided, the query processor considers two options for query processing: 1) a full table scan, applying the contains_func function to each tuple, and 2) using the extensible index. The first option is outside the scope of this paper, and thus we do not consider it further.

/* Creation of the function contains_func */
create function contains_func(document BLOB, condition VARCHAR2)
return BOOLEAN as
begin
  -- 1) Parse the given document
  -- 2) Check whether it satisfies the containment condition
end contains_func;

/* Creation of the operator contains */
create or replace operator contains
binding (BLOB, VARCHAR2) return boolean using contains_func;

As shown above, only when the operator "contains" is declared will the Oracle server recognize it as an operator and treat it like built-in operators such as >, =, and !=.

A step length $\lambda$ is considered acceptable by the Armijo rule if:
• $\theta(\lambda) \leq \hat{\theta}(\lambda)$ (to assure a sufficient decrease of $\theta$), and
• $\theta(\gamma\lambda) \geq \hat{\theta}(\gamma\lambda)$ (to prevent the step size from being too small).

The above rule yields a range of acceptable step lengths. In practice, to find a step length in this range, the Armijo rule is usually implemented in an iterative fashion, using a fixed initial step size $\lambda_0 > 0$:

Step 0: Set $k = 0$, with $\lambda_0 > 0$.
Step k: If $\theta(\lambda_k) \leq \hat{\theta}(\lambda_k)$, choose $\lambda_k$ as the step length; stop. If $\theta(\lambda_k) > \hat{\theta}(\lambda_k)$, let $\lambda_{k+1} \leftarrow \beta\lambda_k$, $k \leftarrow k + 1$.
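A minimal Python sketch of this loop follows, assuming the conventional linear model $\hat{\theta}(\lambda) = \theta(0) + \alpha\lambda\theta'(0)$ (this excerpt does not show the original definition of $\hat{\theta}$, so that form is an assumption):

def armijo_step(theta, theta0, dtheta0, lam0=1.0, alpha=1e-3, beta=0.5):
    """Backtracking Armijo search.
    theta: merit function of the step length lambda;
    theta0, dtheta0: theta(0) and its directional derivative at 0.
    theta_hat(l) = theta0 + alpha*l*dtheta0 is assumed as the acceptance model."""
    lam = lam0
    while theta(lam) > theta0 + alpha * lam * dtheta0:
        lam *= beta  # shrink the step until sufficient decrease holds
    return lam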
This iterative scheme is often referred to as backtracking. Note that, as a result of backtracking, the chosen step length is $\lambda_k = \beta^k \lambda_0$, where $k \geq 0$ is the smallest integer such that $\theta(\beta^k \lambda_0) \leq \hat{\theta}(\beta^k \lambda_0)$. The value of $\beta$ is set to 0.5 by convention. In the following experiments, $\lambda_0$ is variable so that the number of evaluations can be controlled across the different test functions.

3.2 Description of the CLS-PSO Hybrid

In the proposed CLS-PSO algorithm, some particles of the current generation are selected to form a sub-swarm (subswarm-1) and take part in an Armijo line search. These particles may achieve a sufficient increase in their fitness; in that case, we let the swarm parameter $p_g$ immediately reflect the fitness improvement achieved by these particles. The rest of the swarm (subswarm-2) executes the S-PSO algorithm and is also allowed to update $p_g$. Finally, the two sub-swarms are merged into a single swarm for the next iteration. This procedure is repeated until a termination criterion is reached. The following pseudo code shows the operations of CLS-PSO:
INITIALIZE {
    Swarm scale: SIZE;
    Initial states of particles: v_i, x_i;
    Running parameters: c1, c2, w;
    Number of particles expected for line search: PN;
    Constant parameter: G0;
    Armijo parameters: lambda0, beta, alpha;
    Maximum number of generations: IMax;
} ENDINITIALIZE

WHILE (number of generations < IMax) {
    Step 1: Select PN particles from the swarm according to a certain
            strategy as subswarm-1 (here, we employ random selection);
    Step 2: FOR each particle i in subswarm-1 {
                IF (||Gradient(x_i)|| < G0) {
                    Execute the Armijo line search using x_i as the initial
                    point, obtaining a new x_i, and keep v_i unchanged;
                } ENDIF
            } ENDFOR
    Step 3: Evaluate subswarm-1 and update p_i and p_g;
    Step 4: For subswarm-2, execute S-PSO according to equation (1);
    Step 5: Evaluate subswarm-2 and update p_i and p_g;
    Step 6: Merge subswarm-1 and subswarm-2 into a whole swarm;
} ENDWHILE
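A rough, self-contained Python sketch of this loop is given below. It follows the pseudo code's structure (random subswarm-1 selection, the gradient test against G0, Armijo backtracking with the velocity left unchanged, and S-PSO updates for subswarm-2), but everything beyond that, such as bounds handling and the termination safeguard, is an assumption rather than the authors' implementation:

import numpy as np

def cls_pso(f, grad, dim, size=15, pn=3, g0=30.0, imax=100,
            lam0=0.1, alpha=1e-3, beta=0.5, bounds=(-100.0, 100.0), seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(bounds[0], bounds[1], (size, dim))
    v = np.zeros((size, dim))
    pbest = x.copy()
    pbest_f = np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pbest_f)].copy()
    for t in range(imax):
        # inertia weight: 0.9 -> 0.4 linearly over the first 70% of iterations
        w = 0.9 - 0.5 * min(t / (0.7 * imax), 1.0)
        sub1 = rng.choice(size, size=pn, replace=False)   # subswarm-1
        for i in sub1:                                    # Steps 1-2: line search
            d = -grad(x[i])
            if np.linalg.norm(d) < g0:                    # gradient test from the pseudo code
                lam, f0 = lam0, f(x[i])
                while f(x[i] + lam * d) > f0 - alpha * lam * (d @ d) and lam > 1e-12:
                    lam *= beta                           # Armijo backtracking
                x[i] = x[i] + lam * d                     # v[i] left unchanged
        in2 = np.ones(size, dtype=bool)
        in2[sub1] = False                                 # subswarm-2 runs S-PSO
        r1 = rng.random((size, dim)); r2 = rng.random((size, dim))
        v[in2] = (w * v[in2] + 2.0 * r1[in2] * (pbest[in2] - x[in2])
                  + 2.0 * r2[in2] * (g - x[in2]))         # c1 = c2 = 2
        x[in2] = x[in2] + v[in2]
        fx = np.apply_along_axis(f, 1, x)                 # Steps 3/5: evaluate, update
        imp = fx < pbest_f
        pbest[imp] = x[imp]; pbest_f[imp] = fx[imp]
        g = pbest[np.argmin(pbest_f)].copy()              # Step 6: merged swarm continues
    return g, float(pbest_f.min())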
4 Experimental Settings and Results

A set of nonlinear functions is used to evaluate algorithm performance. These functions are defined as follows:

(1) Sphere function (unimodal, global minimum $f(x) = 0$ at $x_i = 0$):
$$f_{Sh}(X) = \sum_{i=1}^{D} x_i^2 \qquad (5)$$

(2) Rosenbrock function (unimodal, global minimum $f(x) = 0$ at $x_i = 1$):
$$f_{Ro}(X) = \sum_{i=1}^{D-1} \left( 100\left(x_{i+1} - x_i^2\right)^2 + (x_i - 1)^2 \right) \qquad (6)$$

(3) Rastrigin function (multimodal, global minimum $f(x) = 0$ at $x_i = 0$):
$$f_{Ra}(X) = \sum_{i=1}^{D} \left( x_i^2 - 10\cos(2\pi x_i) + 10 \right) \qquad (7)$$

(4) Griewank function (multimodal, global minimum $f(x) = 0$ at $x_i = 0$):
$$f_{Gr}(X) = \frac{1}{4000}\sum_{i=1}^{D} x_i^2 - \prod_{i=1}^{D} \cos\left(\frac{x_i}{\sqrt{i}}\right) + 1 \qquad (8)$$

(5) Ackley function (multimodal, global minimum $f(x) = 0$ at $x_i = 0$):
$$f_{Ac}(X) = -20\exp\left(-0.2\sqrt{\frac{1}{D}\sum_{i=1}^{D} x_i^2}\right) - \exp\left(\frac{1}{D}\sum_{i=1}^{D}\cos(2\pi x_i)\right) + 20 + e \qquad (9)$$

(6) Schaffer F6 function (multimodal, global minimum $f(x) = 0$ at $x_i = 0$):
$$f_{Sc}(X) = \frac{\sin^2\sqrt{x_1^2 + x_2^2} - 0.5}{\left(1 + 0.001\left(x_1^2 + x_2^2\right)\right)^2} + 0.5 \qquad (10)$$
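For reference, these six benchmarks are straightforward to code; a NumPy version is sketched below (vector input x, one function per definition above):

import numpy as np

def f_sh(x):  # Sphere, eq. (5)
    return float(np.sum(x**2))

def f_ro(x):  # Rosenbrock, eq. (6)
    return float(np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (x[:-1] - 1.0)**2))

def f_ra(x):  # Rastrigin, eq. (7)
    return float(np.sum(x**2 - 10.0 * np.cos(2 * np.pi * x) + 10.0))

def f_gr(x):  # Griewank, eq. (8)
    i = np.arange(1, x.size + 1)
    return float(np.sum(x**2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i))) + 1.0)

def f_ac(x):  # Ackley, eq. (9)
    d = x.size
    return float(-20.0 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / d))
                 - np.exp(np.sum(np.cos(2 * np.pi * x)) / d) + 20.0 + np.e)

def f_sc(x):  # Schaffer F6, eq. (10); defined on two variables
    s = x[0]**2 + x[1]**2
    return float((np.sin(np.sqrt(s))**2 - 0.5) / (1.0 + 0.001 * s)**2 + 0.5)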
Parameter Settings for PSO: The inertia weight $w$ was decreased linearly from $w_{start} = 0.9$ to $w_{end} = 0.4$ for $I_n = 1, \ldots, 0.7 \times I_{Max}$, and for the remaining iterations $w$ remains at 0.4. The acceleration constants are set to $c_1 = c_2 = 2$.

Parameter Settings for the Armijo Line Search: In our algorithm, if it is anticipated that the fitness of a particle selected for the line search is not going to improve considerably in future generations, we try to exclude that particle from the line search. In other words, a selected particle should start the line search only if it is located on a steep slope. The parameter $G_0$, shown in the above pseudo code, is used for this purpose. Another parameter, $\lambda_0$, is the initial step size required for the Armijo line search. In our experiments, both $G_0$ and $\lambda_0$ are tuned so that approximately the same number of evaluations is allotted to each test function. The other Armijo parameters are fixed at $\alpha = 0.001$, $\beta = 0.1$. For each test function, the initialization intervals and $V_{max}$ values are shown in Table 1.
Table 1. Initialization intervals and maximum velocity

Function   Initialization interval   Vmax
fSh        [-100, 100]               100
fRo        [-2.048, 2.048]           2.048
fRa        [-5.12, 5.12]             5.12
fGr        [-600, 600]               600
fAc        [-30, 30]                 30
fSc        [-100, 100]               100
Tables 2, 3, and 4 summarize the results of the CLS-PSO experiments. In the original tables, results better than those of S-PSO were set in italics, with the best results among them in bold. The notations used in these tables are: DIM (dimension of the function), SIZE (number of individuals in the swarm), GenNum (number of generations), F/M (function / optimal value of the function), BEST (best result), MEDIAN (median of the results), MEAN (mean of the results), WORST (worst result), STD (standard error of the mean), FNum (number of function evaluations), PN (number of particles joining the line search), and VER (version of the algorithm, S-PSO or CLS-PSO). For the two-dimensional case, CLS-PSO delivers the best results for fSh, fRo, fRa, fAc, and fSc when PN is 2, and for fGr when PN is 1 (see Table 2). For the ten-dimensional case, PN = 4 yields the best results for fSh, fRo, fRa, and fAc, but not for fGr (see Table 3). For thirty dimensions, the best results are obtained with PN = 6, except for fGr (see Table 4). Approximately equal computation times are allocated to all functions within the same set of experiments.

Table 2. Functions with 2 DIM using algorithms with SIZE=5 and GenNum=50

F/M    VER      PN  BEST      MEDIAN    MEAN      WORST     STD       FNum
fSh/0  s-PSO    -   0.000203  0.009949  0.050915  0.834574  0.125179  255    G0=2, lambda0=1
       CLS-PSO  1   0.005982  0.027989  -         0.413848  0.065827  249
       CLS-PSO  2   0.008033  0.02659   -         0.440727  0.066135  242
fRo/0  s-PSO    -   0.037134  0.178293  -         4.22069   0.6163    255    G0=2, lambda0=0.01
       CLS-PSO  1   0.000024  0.002306  0.012047  0.161764  0.028923  254
       CLS-PSO  2   0.000031  0.00016   0.006756  0.24979   0.03612   251
fRa/0  s-PSO    -   0.000104  0.998391  0.880693  4.97689   0.870117  255    G0=2, lambda0=0.01
       CLS-PSO  1   0         0.995223  0.86459   2.952167  0.771634  255
       CLS-PSO  2   0.000001  1.004706  0.905111  2.204677  0.636771  254
fGr/0  s-PSO    -   0.000069  0.133698  0.149837  0.365341  0.089461  255    G0=0, lambda0=1
       CLS-PSO  1   0.000135  0.073973  0.117988  0.417868  0.108264  255
       CLS-PSO  2   0.002443  0.122168  0.17001   0.801768  0.155928  255
fSc/0  s-PSO    -   0.020092  0.010790  0.024403  0.079041  0.021807  255    G0=0, lambda0=1
       CLS-PSO  1   0.000003  0.009999  0.025689  0.429723  0.059599  255
       CLS-PSO  2   0.007687  0.012475  0.020901  0.127968  0.021322  255
fAc/0  s-PSO    -   0.009716  0.028048  0.097690  2.581438  0.363392  255    G0=2, lambda0=1
       CLS-PSO  1   0.000110  0.007589  0.139256  0.787129  0.152626  275
       CLS-PSO  2   0.000291  0.003505  0.050880  2.765319  0.489793  328
Table 3. Functions with 10 DIM using algorithms with SIZE=10 and GenNum=80

F/M    VER      PN  BEST      MEDIAN     MEAN       WORST      STD       FNum
fSh/0  S-PSO    -   10.99746  68.294035  79.832484  250.21541  57.91012  810    G0=20, lambda0=0.1
       CLS-PSO  1   2.604126  14.448428  17.206827  83.301085  14.61279  808
       CLS-PSO  2   1.138475  7.670247   8.691129   25.87961   5.105647  803
       CLS-PSO  4   0.151992  2.875198   3.274787   8.8134     1.921044  753
fRo/0  S-PSO    -   6.208669  13.465615  24.618283  81.752671  22.37126  810    G0=20, lambda0=0.001
       CLS-PSO  1   4.332262  8.786142   9.057704   17.426607  2.08105   814
       CLS-PSO  2   2.289102  8.101985   7.938878   10.37776   1.537704  817
       CLS-PSO  4   0.930739  6.309097   6.109603   9.752152   1.87612   814
fRa/0  S-PSO    -   17.9758   43.49918   41.686909  69.701227  11.61018  810    G0=20, lambda0=0.01
       CLS-PSO  1   11.05158  32.668272  33.919513  64.147091  11.47117  810
       CLS-PSO  2   9.714402  29.986856  31.710038  61.714373  12.6663   919
       CLS-PSO  4   7.34792   29.558088  29.921405  52.664531  10.10675  1044
fGr/0  S-PSO    -   1.042606  1.717609   1.829192   4.409106   0.68367   810    G0=0, lambda0=1
       CLS-PSO  1   1.099857  1.678122   1.886202   3.719182   0.674526  810
       CLS-PSO  2   1.151146  2.110503   2.440084   7.237958   1.175986  810
       CLS-PSO  4   1.374584  3.755625   3.803974   7.918195   1.448149  810
fAc/0  S-PSO    -   1.743253  3.499555   3.387758   5.157087   0.910514  810    G0=0, lambda0=1
       CLS-PSO  1   0.068017  2.824217   2.707395   4.963112   1.068068  844
       CLS-PSO  2   0.045864  2.252571   2.348255   5.127834   1.127768  886
       CLS-PSO  4   0.025243  2.398371   2.456063   7.802689   1.904701  994
Table 4. Functions with 30 DIM using algorithms with SIZE=15 and GenNum=100

F/M    VER      PN  BEST      MEDIAN      MEAN        WORST       STD       FNum
fSh/0  S-PSO    -   1408.113  3773.24436  3851.19971  6538.82719  1180.174  1515   G0=30, lambda0=0.1
       CLS-PSO  1   78.72999  334.841966  431.591709  1471.71128  314.3907  1514
       CLS-PSO  3   58.47356  128.987634  163.417036  793.621874  120.1027  1510
       CLS-PSO  6   19.29957  49.449917   51.159679   86.857909   16.53508  1389
fRo/0  S-PSO    -   184.1421  349.18866   361.497842  736.604201  101.1722  1515   G0=30, lambda0=0.001
       CLS-PSO  1   31.93366  62.464877   66.933892   208.118929  31.20958  1531
       CLS-PSO  3   22.39567  29.739173   30.35641    78.446942   7.140461  1539
       CLS-PSO  6   24.23159  28.57671    28.580692   30.604132   1.425963  1521
fRa/0  S-PSO    -   157.2403  228.94188   230.901811  301.384898  34.33205  1515   G0=30, lambda0=0.001
       CLS-PSO  1   101.9565  196.91634   199.15285   290.761195  41.21422  1515
       CLS-PSO  3   77.81324  180.136093  175.316802  242.476217  37.03607  1514
       CLS-PSO  6   77.1869   163.662048  163.890833  233.429565  33.77958  1514
fGr/0  S-PSO    -   17.93947  36.763406   40.103411   85.620382   14.89935  1515   G0=0, lambda0=1
       CLS-PSO  1   16.26624  41.678887   41.74746    63.674032   10.96972  1515
       CLS-PSO  3   21.59679  46.588436   44.20316    70.033391   13.23021  1515
       CLS-PSO  6   27.5395   66.335886   68.060567   113.452199  18.01507  1515
fAc/0  S-PSO    -   7.803321  10.036393   10.004640   12.907506   1.207770  1515   G0=0, lambda0=1
       CLS-PSO  1   8.531718  9.894961    11.972540   12.634181   2.464983  1516
       CLS-PSO  3   7.985930  10.228644   13.692525   11.706500   2.000572  1520
       CLS-PSO  6   7.186708  9.016653    9.106547    11.439774   1.350484  1601
It is clearly observed that the performance of CLS-PSO outperforms that of S-PSO except for fGr. From the above analysis, it appears that CLS-PSO performs better for higher PN, except in the fGr case. It is expected that when PN is increased, the number of function evaluations also increases significantly; however, we have not yet determined the best PN value for problems of different complexity. The reason why CLS-PSO is unable to demonstrate a notably improved performance for the function
fGr is not clear; even for increased PN values, we do not observe a major improvement. The experimental settings are one possible reason, and this will be studied further. Nevertheless, the results presented make it clear that, overall, CLS-PSO outperforms the S-PSO algorithm.
5 Conclusion

In this paper, we proposed a cooperative line search-PSO hybrid algorithm (CLS-PSO) and compared its performance with standard PSO (S-PSO) through numerical experiments. From the experimental results, we conclude that the proposed algorithm outperforms S-PSO. In future work, we intend to conduct more experiments to better understand the behaviour of CLS-PSO. In addition, the collaboration method between the line search and PSO, effective selection strategies for the particles joining the line search, and their implications for the results will be studied further.
Acknowledgments
This work was supported by the National Science Foundation of China (grant No. 60874070), by the National Research Foundation for the Doctoral Program of Higher Education of China (grant No. 20070533131), and by the Scientific Research Foundation of the Hunan Province Education Committee (grant No. 07C319).
Using Meaning of Coefficients of the Reliability Polynomial for Their Faster Calculation

Alexey Rodionov¹, Olga Rodionova², and Hyunseung Choo³

¹ Institute of Computational Mathematics and Mathematical Geophysics SB RAS, Pr. Lavrentieva, 6, Novosibirsk, 630090, Russia
[email protected]
² Siberian University of Consumers' Co-operative System, Pr. Marksa, 26, Novosibirsk, 630087, Russia
[email protected]
³ School of Information and Communication Engineering, Sungkyunkwan Univ., Chunchun-dong 300, Jangan-gu, Suwon 440-746, South Korea
[email protected]
Abstract. We propose some new approaches to the problem of obtaining the reliability polynomial of a random graph. The meaning of the coefficients of the reliability polynomial in one of its representations is used to significantly reduce the amount of calculation, with the factoring method as the underlying technique. Experiments show a significant speed-up compared with the well-known package Maple 11 (up to 2000 times on the standard lattice example). Keywords: reliability polynomial, factoring method.
1 Introduction
Random graphs are widely used in modeling and simulation of different kinds of networks, in particular for their reliability analysis. One of the most popular reliability indices is the probability of a graph's connectivity (the graph's probabilistic connectivity); for short, we refer to it hereafter simply as the graph's reliability. We consider the case when all nodes are reliable while edges can fail independently with equal probability q (p = 1 − q is an edge's reliability). We are interested in the analytical expression describing the dependence of the graph's reliability on p. This function is usually referred to as the reliability function or Reliability Polynomial (RP) [1,2,3]. The RP is an acknowledged tool for comparing different structures of random graphs by the reliability criterion. It also serves for examining a link's importance in networks [4] and for proving the uniform or non-uniform optimality of some structures [5,6]. Uniform optimality, for a given number of nodes and edges, means the existence of a graph possessing maximal reliability for all edge reliabilities p ∈ (0, 1). Obtaining the RP is in general an NP-hard problem, so any result that allows speeding up the calculation of its coefficients is of high importance.
We propose some new approaches that, by taking into account the meaning of the RP coefficients in one of its representations, and by using some vector expressions, allow obtaining them much faster than by traditional algorithms. The rest of the paper is organized as follows: in Section 2 we introduce the main definitions, notations, and preliminary results needed for further considerations. In Section 3 we discuss the factoring method and its modification. Section 4 is devoted to the discussion of graphs that allow direct derivation of their reliability polynomials. Section 5 contains the case study, and Section 6 is a brief conclusion.
2 Notation and Preliminaries

2.1 Main Notations
We consider graphs having a connected structure (if all edges are reliable, then the graph is connected). Several edges between a pair of nodes are possible; thus the graphs under consideration are multi-graphs. Let us first make the following denotations:

G(n, ν) = (V, U, p) – a non-oriented multi-graph with set of nodes V (|V| = n) and set of multi-edges U (|U| = ν);
$e_{ij} \in U$ – a multi-edge with multiplicity $s_{ij}$ that connects nodes $v_i$ and $v_j$, with $\sum_{i=1}^{n}\sum_{j=1}^{n} s_{ij} = m$. Sometimes, when considering a special set of edges, we use $e_k$ (the k-th multi-edge) instead of $e_{ij}$;
p – an edge's reliability;
C(n, S) and T(n, S) – a cycle or a tree of multi-edges with n nodes and vector S of multi-edge multiplicities, respectively;
R(G, p) – the reliability polynomial of a graph G. The second parameter can be omitted if this does not lead to a variant reading.

Vectors are denoted by bold capitals. The subvector of the first k elements of a vector A is denoted by A(k), while the corresponding subvector of its last elements is denoted by A[k] (if the numeration of elements starts from zero, then the vector A(k) has k + 1 elements). The subvector (slice) of elements of a vector A from the i-th to the j-th, i ≤ j, is denoted by A[i,j]. The vector A shifted by k positions to the right (left) is denoted by $A^{k+}$ ($A^{k-}$). L(A) is the number of the last element in A; thus the number of elements in A is L(A) + 1. By Pasc(n) we denote the vector of n + 1 binomial coefficients $C_n^i$, i = 0, ..., n; $S^n(m) = Pasc(m)^{(n-m)+}$ (thus for n ≥ m we have a vector of n + 1 elements whose first n − m elements are zeros and whose last m + 1 elements are binomial coefficients; in the case of n < m we have the slice of the first n + 1 elements of Pasc(m)). $X_n(i)$ is a vector in which all elements but the i-th (0 ≤ i ≤ n) are zeros and the i-th element is 1. If we sum vectors with different numbers of elements, we pad the shorter vectors with zeros on the right.
We define the following useful function that is identically equal to 1:

$$I(n) \equiv 1^n \equiv [p + (1-p)]^n = \sum_{i=0}^{n} C_n^i p^i (1-p)^{n-i} = \sum_{i=0}^{n} C_n^i p^{n-i} (1-p)^i. \quad (1)$$
Some other polynomials are frequently used in our derivations, so we give them special denotations:

$$N(s, p) = (1-p)^s; \quad (2)$$

$$M(s, p) = 1 - N(s, p) = I(s) - (1-p)^s = \sum_{i=0}^{s-1} C_s^i p^{s-i} (1-p)^i; \quad (3)$$

$$D(S, p) = \prod_{i=1}^{L(S)} M(s_i, p); \quad (4)$$

$$Z(S, p) = \sum_{i=1}^{L(S)} N(s_i, p) \prod_{j=1,\, j \neq i}^{L(S)} M(s_j, p). \quad (5)$$
N(s, p) is the probability that a multi-edge with multiplicity s fails completely, while M(s, p) is the probability that at least one of its edges is in the working state. Note that N(k + l) = N(k)N(l). Note also that D(S, p) is the RP of a tree, while D(S, p) + Z(S, p) is the RP of a cycle with vector S of edge multiplicities (these cases are discussed in detail in Section 4). For short, if it does not lead to a variant reading, we use "edge" for "multi-edge" in this paper; where the distinction matters, we write "multi-edge" or "single edge".
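A small Python sketch of (2)-(5) makes the roles of these polynomials concrete; here S is a plain list of multiplicities and p a number, so D(S, p) is the reliability of a chain or tree of multi-edges and D(S, p) + Z(S, p) that of a cycle:

def N(s, p):          # eq. (2): all s parallel edges fail
    return (1 - p)**s

def M(s, p):          # eq. (3): at least one of s parallel edges works
    return 1 - N(s, p)

def D(S, p):          # eq. (4): every multi-edge of the chain/tree survives
    out = 1.0
    for s in S:
        out *= M(s, p)
    return out

def Z(S, p):          # eq. (5): exactly one multi-edge of a cycle fails completely
    return sum(N(S[i], p) * D(S[:i] + S[i+1:], p) for i in range(len(S)))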
2.2 Representation of RP and Why It Is Important
Representation of the Reliability Polynomial. The RP can be presented in different forms (see [2], for example). We choose the following representation:

$$R(G, p) = \sum_{i=0}^{m} a_i (1-p)^i p^{m-i}, \quad (6)$$
where $a_i$ is the number of connected subgraphs (subgraphs with the complete set of nodes) of G with a given number of edges (i edges fail, m − i exist). This meaning is highly useful for our reasoning. The coefficients of the RP in its classic representation

$$R(G, p) = \sum_{i=0}^{m} b_i p^i \quad (7)$$
are connected with those in (6) by the following easily obtained expressions:

$$b_0 = a_m; \qquad b_{m-i} = \sum_{j=i}^{m} (-1)^{i+j} C_j^i a_j, \quad i = 1, \ldots, m. \quad (8)$$

$$a_m = b_0; \qquad a_i = b_{m-i} + \sum_{j=i+1}^{m} (-1)^{i+j-1} C_j^i a_j, \quad i = m-1, \ldots, 0. \quad (9)$$
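A direct Python transcription of these conversions is sketched below; the i = 0 case in a_to_b (giving $b_m$) follows from the same binomial expansion of (6) that yields (8):

from math import comb

def a_to_b(a):
    """Coefficients b_i of (7) from a_i of (6), following (8):
    b_{m-i} = sum_{j=i}^{m} (-1)^{i+j} C(j,i) a_j (i = m gives b_0 = a_m)."""
    m = len(a) - 1
    b = [0.0] * (m + 1)
    for i in range(0, m + 1):
        b[m - i] = sum((-1)**(i + j) * comb(j, i) * a[j] for j in range(i, m + 1))
    return b

def b_to_a(b):
    """Invert (8) per (9): a_m = b_0, then work downward to a_0."""
    m = len(b) - 1
    a = [0.0] * (m + 1)
    a[m] = b[0]
    for i in range(m - 1, -1, -1):
        a[i] = b[m - i] + sum((-1)**(i + j - 1) * comb(j, i) * a[j]
                              for j in range(i + 1, m + 1))
    return a

# sanity check on two parallel edges: R = 2p - p^2, i.e. a = [1, 2, 0]
assert a_to_b([1, 2, 0]) == [0.0, 2.0, -1.0] and b_to_a([0, 2, -1]) == [1.0, 2.0, 0.0]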
The coefficients $b_i$ cannot be interpreted substantively, but representation (7) can be useful for some derivations and algebraic transformations.

Value of Some Coefficients of RP. One important consequence of the meaning of the coefficients in (6) is that all $a_i$ for i > m − n + 1 are zeros: it is impossible to connect all n nodes with fewer than n − 1 edges. If the graph under consideration has a connected structure, then $a_0$ is always 1. The coefficient $a_{m-n+1}$ is equal to the number of spanning trees, and thus can be obtained using the Kirchhoff theorem [7]. If there are at least k edge-independent paths between any pair of nodes, then $a_i = C_m^i$ for all i ∈ [0, ..., k − 1]. In particular, if there are no bridges in the graph, then $a_1 = m$. Note that it is possible to find the maximal k for the initial graph, but it is inexpedient to obtain it for the intermediate graphs arising during calculation by the factoring method. Note also that there is no need to recalculate the binomial coefficients repeatedly: the matrix of $C_j^i$ (Pascal's triangle) of the needed dimension can be computed before the main calculations and treated as known.
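A small NumPy sketch of the matrix-tree computation of $a_{m-n+1}$, with the multi-graph given by its matrix of multiplicities $s_{ij}$:

import numpy as np

def spanning_tree_count(adj):
    """Kirchhoff's matrix-tree theorem: the number of spanning trees equals
    any cofactor of the Laplacian. adj[i][j] holds the multiplicity s_ij
    (0 if there is no multi-edge)."""
    A = np.asarray(adj, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    return round(np.linalg.det(L[1:, 1:]))  # delete first row and column

# a triangle with one doubled edge has 2*1 + 2*1 + 1*1 = 5 spanning trees
print(spanning_tree_count([[0, 2, 1], [2, 0, 1], [1, 1, 0]]))  # -> 5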
3
Factoring Method and Its Usage for Obtaining Coefficients of RP
For obtaining RP of the kind (6) the factoring (branching) method [8, 9] suits excellently: R(G, p) = M (sij , p)R(G∗ (eij )) + N (sij , p)R(G\{eij }),
(10)
where G∗ (eij ) is a graph G contracted by an edge eij , G\{eij } – a graph obtained from G by removal of this edge. Note, that M (sij , p) and N (sij , p) are polynomials of the kind (6). Thus our task is reduced to obtaining right polynomials for the terminal graphs. It is obvious that the factoring method is more effective if we can find terminal graphs on highest possible level and if we can reduce the graph dimension. Terminal graphs are discussed in the next section. Some methods of the reduction of graph’s dimension for the task under consideration are discussed in [2, 3]. Here we present one useful extension of the factoring method that we call branching by chain. We have presented this extension for the case of exact calculation of the probabilistic connectivity in [10]. Let us adapt it to the calculation of RP. Theorem 1. Let a graph G have a simple chain Ch = e1 , e2 , . . . , ek of edges with vector of multiplicities S = (s1 , s2 , . . . , sk ), connecting nodes vu and vt . Then the RP of G is equal to R(G, p) = D(S, p)R(G∗ ) + N (sut , p)Z(S, p)R(G\{e1 . . . , ek })
(11)
Using Meaning of Coefficients of the RP for Their Faster Calculation
561
where graph G∗ is derived from G by contracting nodes vu and vt by a chain, and G\{e1 . . . , ek } is a graph obtained from G by deletion of this chain with nodes (except for terminal ones), and sut is the multiplicity of an edge eut . Proof. Let eut be absent. By applying (10) to ek we obtain by contracting a graph G∗k−1 , in which the chain is shorter by one edge, and by removal – a graph (G\{e1 , . . . , ek }) with the chain of edges e1 , . . . , ek−1 that is attached to it in the node vu (Gk−1 ). RP for this last graph is R(Gk−1 ) = R(G\{e1 . . . , ek })
k−1
P (si , p).
(12)
i=1
To the graph G∗k−1 we apply (10) again, obtaining graph with shortened chain (G∗k−2 ) and graph Gk−2 with attached chain of k−2 edges and so on (see Fig 1). When contracting by the last edge e1 we obtain G∗ . Backtracking process k gives its probability M (si , p). RP for any of Gj is i=1
R(Gj ) =
j
M (si , p)R(G\{e1 . . . , ek }),
(13)
i=1
while its probability is
k i=j+2
M (si , p)N (sj+1 , p).
Thus whole backtracking process gives R(G) =
k i=1
M (si , p)R(G∗ ) +
k
N (si , p)
i=1
P (si , p)R(G\{e1 . . . , ek }).
(14)
j =i
Now let eut have the multiplicity sut ≥ 1. First we make factoring by this edge. By contracting nodes vu and vt we obtain with probability M (sut ) the graph G1 that is G∗ connected with the cycle C of edges e1 , e2 , . . . , ek . The joint node is the articulation point. RP of this graph is ⎡ ⎤ k k k R(G1 ) = R(G∗ )R(C) = R(G∗ ) ⎣ M (si , p) + N (si , p) M (si )⎦ . i=1
i=1
j=1, j =i
(15) By deleting eut we obtain with probability N (sut ) the graph G2 that is the case discussed above. Thus, using (14) we have R(G) = M (sut )R(G1 ) + N (sut )R(G2 ) ⎡ k k = M (sut )R(G∗ ) ⎣ M (si , p) + N (si , p) i=1
i=1
k j=1, j =i
⎤ M (si )⎦ +
562
A. Rodionov, O. Rodionova, and H. Choo
Fig. 1. Illustration to the proof of the theorem 1
⎡ N (sut ) ⎣
k i=1
M (si , p)R(G∗ ) +
k i=1
N (si , p)
j =i
⎤ P (si , p)R(G\{e1 . . . , ek })⎦
Using Meaning of Coefficients of the RP for Their Faster Calculation
=
m
M (si , p)R(G∗ ) + N (sut , p)
i=1
k
N (si , p)
i=1
563
M (si , p)R(G\{e1 . . . , ek }),
j =i
Using (4) and (5) we obtain what was to be proved. In the case when all si are equal to 1 we have
R(G) = pk R(G∗ ) + kR(G\{e1 . . . , ek })
s ut −1
Csiut psut −i+k−1 (1 − p)i+1 .
2
(16)
i=0
4
Graphs That Allow Direct Obtaining of RP
It is known that efficiency of the factoring method highly depends on the level on which the process can be terminated. Standard procedure suggested factoring till obtaining 2-node graph (polynomial for a edge) or disconnected graph (zero). In this section we present RPs for some other graphs whose recognition is not time-consuming. 4.1
3-Node Graph
G(3, m) is connected if not more than 1 edge are absent. Let multiplicities of its edges be s1 , s2 and s3 . We have R(p) = D(S, p) + Z(S, p) s1 m i i m−i = Cm p (1 − p) − Csi1 pi (1 − p)m−i − i=0 s2 i=0
Csi2 pi (1 − p)m−i −
i=0 s3 i=0
(17)
Csi3 pi (1 − p)m−i + 2(1 − p)m .
Thus, in vector form we have A = S m (m) −
3
S m (si ) + 2X m (m).
(18)
i=1
The number of covering trees (case of two single edges that do not belong to one multi-edge) and, correspondingly, am−2 is equal to s1 s2 + s1 s3 + s2 s3 , but in this case it is easier obtain it among other coefficients using (18). Coefficients am−1 (case of single edge) and am (case of no edges) are zeros. Thus, it is enough obtain the following subvector of non-zero coefficients: A(m−2) = (S m (m))(m−2) − (S m (si ))(m−2) . (19) si >1
It is clear that only multi-edges do matter.
564
A. Rodionov, O. Rodionova, and H. Choo
Fig. 2. Disconnected 4-node graphs
4.2
4-Node Graph
Let si , i = 1, . . . , 6 be the multiplicities of edges e12 ,e23 ,e34 ,e14 ,e13 , and e24 , correspondingly. Disconnected 4-node sugraphs are presented in Fig. 2, edges are represented as single ones. After subtraction of RPs of these sugraphs from I(m) and cancellation we obtain the RP for 4-node graph. Omitting the polynomial expression we present the resulting vector of coefficients of RP: A = S m (m) −
7
S m (ui ) + 2
i=1
6
S m (si ),
(20)
i=1
where u 1 = s1 + s3 ;
u 2 = s2 + s 4 ;
u 5 = s1 + s2 + s 6 ;
u 3 = s5 + s6 ;
u 6 = s2 + s3 + s5 ;
u 4 = s 1 + s 4 + s5 ;
(21)
u 7 = s 3 + s 4 + s6 .
It is obvious that am−2 , am−1 and am are zeros. By the Kirchhoff theorem we have that am−3 = (s1 +s4 +s5 )(s1 +s2 +s6 )(s3 +s4 +s6 )−2s1 s4 s6 −(s1 +s2 +s6 )s24 − (22) (s1 +s4 +s5 )s26 −(s3 +s4 +s6 )s21 Similar to the previous case obtaining the am−3 among other coefficients by (20) is simpler. Thus the subvector of non-zero coefficients of RP of the 4-node graph is A(m−3) = S m (m)(m−3) − (S m (ui ))(m−3) + 2 (S m (si ))(m−3) . (23) ui >2
4.3
si >2
Tree of Multi-edges
If we have the n-node tree with single edges, then its RP is simply pn−1 (all edges ought be in the working state). Let the multiplicities of its n−1 edges be
Using Meaning of Coefficients of the RP for Their Faster Calculation
565
si , i = 1, . . . , n−1. A tree is connected if and only if at least one edge in any multi-edge is in the working state. Thus R(T(n, S), p) = D(S, p) =
n−1
M (si , p).
(24)
i=1
For presenting this polynomial in the form (6) we need new denotations: M(m) – set {1, . . . , m}; Ui – subset of i elements from M(m); W (Ui ) = sj . j∈Ui
D(S, p) =
n−1
1 − (1 − p)si
(25)
i=1
= 1 − (1 − p)a − (1 − p)s2 − . . . − (1 − p)sn−1 + (1 − p)a+s2 + . . . + (1 − p)sn−2 +sn−1 − . . . + (−1)n−1 (1 − p)m n−1 =1+ (−1)i (1 − p)W (Ui ) i=1
= I(m) +
=
m
Ui ⊂2M(n−1) n−1
i=1
Ui ⊂2M(n−1)
(−1)i
i i Cm p (1−p)m−i
i=0
+
I(m − W (Ui ))(1 − p)W (Ui )
n−1
i
(−1)
i=1
W (Ui )
Ui ⊂2M(n−1) j=0
j j m−j Cm−W . (Ui ) p (1 − p)
Thus, for obtaining RP of a tree of multi-edges we need search all possible sums of multiplicities of its edges and calculate a number of variants for each sum (more precisely, we need obtain a number of variants with odd and a number of variants even number of items). So we have the following vector of coefficients: A = S m (m) +
n−1
i=1
Ui ⊂2M
(−1)i
S m (m − W (Ui )).
(26)
It is clear that minimal cut corresponds to the edge with minimal multiplicity i (s1 ), so for all i < s1 ai = Cm . The number of spanning trees is obviously am−n+1 =
n−1
si .
(27)
i=1
The subvector of coefficients from as1 to am−n is ⎛ ⎜ A[s1 ,m−n] = ⎜ ⎝S m (m) +
n−1
i=1
Ui ⊂2M W (Ui )≥m−n
(−1)i
⎞
⎟ S m (m − W (Ui ))⎟ ⎠
. (28) [s1 ,m−n]
566
A. Rodionov, O. Rodionova, and H. Choo
Example 1. Let us have the tree with 6 nodes and edges’ multiplicities 1,2,2,3 and 3. Using (24) we obtain (for short we use q for 1 − p)
2
2 R(T, p) = p · 1 − q 2 · 1 − q 3 (29)
2 4 3 5 7 6 8 10 = p · 1 − 2q + q − 2q + 4q − 2q + q − 2q + q = p · I(10) − 2q 2 I(8) + q 4 I(6) − 2q 3 I(7) + 4q 5 I(5) −
2q 7 I(3) + q 6 I(4) − 2q 8 I(2) + q 10 = p · p10 + 10p9 q + 45p8 q 2 + 120p7 q 3 + 210p6 q 4 + 252p5q 5 + 210p4q 6 + 120p3 q 7 + 45p2 q 8 + 10pq 9 + q 10 − 2p8 q 2 − 16p7 q 3 − 56p6 q 4 − 112p5q 5 − 140p4q 6 − 112p3 q 7 − 56p2 q 8 − 16pq 9 − 2q 10 + p6 q 4 + 6p5 q 5 + 15p4 q 6 + 20p3 q 7 + 15p2 q 8 + 6pq 9 + q 10 − 2p7 q 3 − 14p6 q 4 − 42p5 q 5 − 70p4 q 6 − 70p3 q 7 − 42p2 q 8 − 14pq 9 − 2q 10 + 4p5 q 5 + 20p4 q 6 + 40p3 q 7 + 40p2 q 8 + 20pq 9 + 4q 10 − 2p3 q 7 − 6p2 q 8 − 6pq 9 − 2q 10 + p4 q 6 + 4p3 q 7 + 6p2 q 8 + 4pq 9 + q 10 2p2 q 8 −
4pq 9 − 2q 10 + q 10 = p11 +10p10 (1−p)+43p9 (1−p)2 +102p8(1−p)3 +141p7(1−p)4 + 108p6(1−p)5 +36p5 (1−p)6 . Now transfer to the vector form. The following sums of edges’ multiplicities are possible: 1,2,3,4,5,6,7,8,9,10 and 11. 1 is obtained by single way (one item), 2 – by two (one item), 3 is obtained by tow ways as one item and by two ways as sum of two items (hereafter we show number of items in brackets and separate variants by ’+’), 4 – 3(2), 5 – 4(2)+1(3), 6 – 1(2)+4(3), 7 – 3(3), 8 – 2(3)+2(4), 9 – 2(4), 10 – 1(4), 11 – 1(5). According to (26) we have: A=
(1 11 55 165 330 462 462 330 165 55 11 (0 1 10 45 120 210 252 210 120 45 10 2 (0 0 1 9 36 84 126 126 84 36 9 2 (0 0 0 1 8 28 56 70 56 28 8 2 (0 0 0 1 8 28 56 70 56 28 8 3 (0 0 0 0 1 7 21 35 35 21 7 4 (0 0 0 0 0 1 6 15 20 15 6 (0 0 0 0 0 1 6 15 20 15 6 (0 0 0 0 0 0 1 5 10 10 5 4 (0 0 0 0 0 0 1 5 10 10 5 3 (0 0 0 0 0 0 0 1 4 6 4 2 (0 0 0 0 0 0 0 0 1 3 3 2 (0 0 0 0 0 0 0 0 1 3 3 2 (0 0 0 0 0 0 0 0 0 1 2 (0 0 0 0 0 0 0 0 0 0 1 (0 0 0 0 0 0 0 0 0 0 0 = (1 10 43 102 141 108 36 0 0 0 0
1) − 1) − 1) − 1) + 1) + 1) + 1) − 1) + 1) − 1) − 1) − 1) + 1) + 1) + 1) − 1) 0) .
Using Meaning of Coefficients of the RP for Their Faster Calculation
567
If the factors at vectors that corresponds to different sums are calculated before summarizing, then we can reduce the number of operations significantly. These factors are: 1 – 1, 2 – 2, 3 – 0, 4 – 3, 5 – 3, 6 – -3, 7 – -3, 8 – 0, 9 – 2, 10 – 1, 11 – -1). Now refer to the subsection 2.2. First, we know a0 = 1 and a6 = 1 · 2 · 2 · 3 · 3 = 36. Second, as for all i > 6 ai = 0, we need not calculate them, we need coefficients from a1 to a5 only. Note also that all vectors that correspond to sums that exceed 5 have no effect on the result. So we obtain needed coefficients as A[1,5] =
(11 55 165 330 462) − ( 1 10 45 120 210) − 2 ( 0 1 9 36 84) + 3 ( 0 0 0 1 7) + 3 ( 0 0 0 0 1) = (10 43 102 141 108) .
Next, from (24) we have that single edges do not affect on the coefficients of RP, affecting on the RP’s power only (M (1, p) = p). Thus, in the example 1 only the edges with multiplicities 2,2,3 and 3 are of importance. Possible sums are: 2,3,4,5,6,7,8 and 10, but only sums up to 5 are of interest. Factors at corresponding vectors in (26) are -2,-2,1, and 4. The subvector of unknown coefficients is obtained now as: A[1,5] =
(10 45 120 210 252) − 2 ( 0 1 8 28 56) − 2 ( 0 0 1 7 21) + (0 0 0 1 6) + 4 (0 0 0 0 1) = (10 43 102 141 108),
we obtain the same result. On the first sight the number of operations does not change, but note that for obtaining all possible sums in the first case we have examined 25 − 1 = 31 variants (sums of 1,2,. . . ,5 items), while only 24 − 1 = 15 (sums of 1,2,. . . ,4 items) variants are considered now. It is obvious that the effect increases with the number of single edges. 4.4
Cycle of Multi-edges
Cycle is connected if and only if not more than one edge is completely destroyed. Let cycle C have n edges ei with multiplicities si , i = 1, . . . , n. In general case we have that R(C, p) = D(S, p) + Z(S, p). (30) Using derivations similar to the case of the tree, we obtain the following vector equation: n−1 A= (−1)i+1 (i − 1) S m (m − W (Ui )), (31) i=0
Ui ⊂2M(n−1) )
where A has m+1 elements (0 ≤ i ≤ m).
568
A. Rodionov, O. Rodionova, and H. Choo
From (30) we have that RP for a cycle does not depends on placement of edges, so without loss of generality we assume that they are ordered by multiplicities in non-decreasing mode (si ≤ si+1 , 1 ≤ i ≤ n − 1). As minimal cut in a cycle includes pair of edges with minimal multiplicities, its capacity is s1 + s + 2, so i all coefficients up to as1 +s2 −1 are equal to binomial coefficients Cm . As in the previous case, we can obtain the number of spanning trees (am−n+1 ) without using the Kirchhoff theorem. Really, any spanning tree of a cycle is a chain. The whole number of different n-node chains in a cycle is equal to the sum of productions of edges’ multiplicities made by all variants of deletion of a edge: am−n+1 =
n
sj =
i=1 j =i
n
si ·
i=1
n i=1
s−1 i .
Thus we need obtain the following subvector only: ⎛ ⎜ A[s1 +s2 ,m−n] = ⎜ ⎝S m (m)+
n−1
(−1)i+1 (i−1)
i=0
M
(32) ⎞
⎟ S m (m−W (Ui ))⎟ ⎠
Ui ⊂2 W (Ui )≥m−n
.
[s1 +s2 ,m−n]
(33) Example 2. Let us have the cycle with 5 nodes and edges’ multiplicities 1,2,2,3 and 3. Number of spanning trees is 1 · 2 · 2 · 3 · 3 · (1 + 1/2 + 1/2 + 1/3 + 1/3) = 36 · (8/3) = 96. As in the previous case a0 = 1. Minimal cut includes pair of 1 2 edges with minimal multiplicities (1+2=3), so a1 = C11 = 11, a2 = C11 = 55. Before obtaining other coefficients we calculate factors for vectors corresponding to different sums of edges’ multiplicities: 1 and 2 – are not taken into account (one item), 3 – 0 · 2 + 1 · 2 = 2, 4 – 1 · 3 = 3, 5 – 1 · 4 + 2 · 1 = 6, 6 – 1 · 1 + 2 · 4 = 9. Other sums are too large to affect on the result. Thus the subvector of coefficients from a3 to 6 is obtained as: A[3,6] =
(165 330 462 462) − 2 ( 1 8 28 56) + 3 ( 0 1 7 21) + 6 ( 0 0 1 6) − 9 ( 0 0 0 1) = (163 317 433 440) .
and so A = (1 11 55 163 317 433 440 96 0 0 0 0). Let us examine the case when k edges, 1 < k ≤ n, are single. If k = n, then, obviously, R(C, p) = pn + npn−1 (1 − p). Transfer to the case of k < n. As edges are ordered by their multiplicities, these single edges form the chain. Now we use (16) taking this chain as pivot. By removing this chain we obtain the chain of last n − k edges (Ch), and by contracting pair of nodes by it – the cycle of the same edges (C∗ ). Note that there is no closing edge and sut = 0. Thus R(C) = pk R(C∗ ) + kpk−1 (1 − p)R(Ch). For some special cases we can derive finite expressions.
(34)
Using Meaning of Coefficients of the RP for Their Faster Calculation
569
Case of k = n−1. Let us make factoring by the multi-edge. We obtain m−n+1
i
Cm−n+1 pm−i (1−p)i +(n−1)pm−i−1 (1−p)i+1 −
R(C, p) =
(35)
i=0
(n−1)pn−2 (1 − p)m−n+2 , or (for non-zero coefficients) (m−n+1) A(m−n+1) = P asc(m−n+1)+(n−1) P asc(m−n+1)1+ .
(36)
Case of k = n−2. We use (34). When contracting, we obtain edge with multiplicity sn−1 + sn = m − n + 2. Thus we have:
R(C, p) = pn−2 1−N(m−n+2) +(n−2)pn−3 (1−p) 1−N(sn−1 ) 1−N(sn ) (37)
= pn−2 I(m−n+2)−(1−p)m−n+2 +(n−2)pn−3 (1−p) I(m−n+2) −
I(m−n+2−sn−1 )(1−p)sn−1 −I(m−n+2−sn )(1−p)sn +pm−n+2 =
m−n+2
i Cm−n+2 pm−i (1 − p)i +
i=0
(n − 2)
m−n+2
i Cm−n+2 pm−1−i (1 − p)i+1 −
i=0 m−n+2−sn−1
(n − 2) (n − 2)
i=0 m−n+2−s n
(n − 2)p
i Cm−n+2−s pm−i (1 − p)i − n−1
i Cm−n+2−s pm−i (1 − p)i + n
i=0 n−3
(1 − p)m−n+3 .
or (for non-zero coefficients) A(m−n+1) = P asc(m−n+2)(m−n+1) +(n−2) P asc(m−n+2)1+ − (38) (m−n+1) S m−n+2 (m−n+2 − sn−1 ) − S m−n+2 (m−n+2 − sn ) .
5
Case Studies
Here we present some comparative results concerning the use of the proposed techniques for CRP calculation. For case studies we choose cycles of edges (cycles C1 , C2 and C3 consist of 20, 40 and 50 edges with multiplicity 20, correspondingly) and lattices G1 (4x4), G2 (5x5) and G3 (6x6). We compared performance of our program with that
570
A. Rodionov, O. Rodionova, and H. Choo Table 1. Computational time (in seconds) Graph Maple 11 Our algorithm C1 2.8 < 0.01 C2 14.3 < 0.01 C3 32.4 0.01 G1 3.0 < 0.01 G2 1123.0 0.50 G3 > 24 hours 507.2
of Maple 11. Maple 11 we choose for comparison because the task of obtaining CRP for the reliability polynomial presented in the form (7) is included in this package as standard one. We have used the PC with Intel (R) Core(TM) 2 CPU
[email protected]. Results are presented in the table. Note that accuracy of the time measurement for Maple is 0.1 sec while that for our program is 0.01 sec. In Fig.3 the examples of our program output are presented. One can see how often the special equations for RP of terminal graphs are used. Note that 3- and 4-node cycles or trees are considered as just 3- or 4-node graphs.
Fig. 3. Program outputs for the graphs G1 and G2
Using Meaning of Coefficients of the RP for Their Faster Calculation
6
571
Conclusion and Future Work
In this paper we show how usage of meaning of coefficients of the reliability polynomial along with usage of vectors of binomial coefficients allow significantly simplify the process of obtaining reliability polynomial of a random multi-graph. In future we propose develop parallel realization of this approach. It is clear that factoring method itself allows natural parallelization. We expect that this along with usage of vector operations will allow exact obtaining of RP for graphs large enough for solving practical tasks.
References 1. Ayanoglu, E., Chih-Lin, I.: A Method of Computing the Coefficients of the Network Reliability Polynomial. In: GLOBECOM 1989, vol. 1, pp. 331–337. IEEE Press, New York (1989) 2. Colbourn, C.J.: Some open problems on reliability polynomials. In: Proc. Twentyfourth Southeastern Conf. Combin., Graph Theory Computing, Congr. Numer. 93, pp. 187–202 (1993) 3. Chari, M., Colbourn, C.J.: Reliability polynomials: a survey. J. Combin. Inform. System Sci. 22, 177–193 (1997) 4. Page, L.B., Perry, J.E.: Reliability Polynomials and Link Importance in Networks. IEEE Transactions on Reliability 43(1), 51–58 (1994) 5. Kelmans, A.K.: Crossing Properties of Graph Reliability Functions. Journal of Graph Theory 35(9), 206–221 (2000) 6. Colbourn, C.J., Harms, D.D., Myrvold, W.J.: Reliability Polynomials can Cross Twice. Journal of the Franklin Institute 300(3), 627–633 (1993) 7. Cvetkovic, D.M., Doob, M., Sachs, H.: Spectra of Graphs: Theory and Applications, 3rd rev. edn., p. 38. Wiley, New York (1998) 8. Moore, E.F., Shannon, C.E.: Reliable Circuits Using Less Reliable Relays. J. Franclin Inst. 262(4b), 191–208 (1956) 9. Shooman, A.M.: Algorithms for Network Reliability and Connection Availability Analysis. In: Electro/1995 Int. Professional Program Proc., pp. 309–336 (1995) 10. Rodionova, O.K., Rodionov, A.S., Choo, H.: Network Probabilistic Connectivity: Exact Calculation with Use of Chains. In: Lagan´ a, A., Gavrilova, M.L., Kumar, V., Mun, Y., Tan, C.J.K., Gervasi, O. (eds.) ICCSA 2004. LNCS, vol. 3046, pp. 315–324. Springer, Heidelberg (2004)
A Novel Tree Graph Data Structure for Point Datasets Saeed Behzadi, Ali A. Alesheikh, and Mohammad R. Malek Department of GIS, Faculty of Geodesy and Geomatics Engineering K.N. Toosi University of Technology Valiasr Street, Mirdamad Cross, Tehran, Iran
[email protected], {alesheikh, mrmalek}@kntu.ac.ir
Abstract. Numerous data structures are developed to organize data and their relations. Point set data in GIS are managed mostly through TIN (Triangulated Irregular Network) or grid structure. Both methods have some disadvantages which will be discussed in this paper. In order to remove these weaknesses, a novel method will be introduced which is based on tree graph data structure. Tree graph data structure is a kind of data structure which shows the relationship between points by using some tree graphs. This paper assesses the commonly used point structures. It then introduces a new algorithm to address the issues of previous structures. The new data structure is inspired by snow falling process in natural environment. In order to evaluate the proposed data structure, a Digital Train Model (DTM) of sample points is constructed and compared with the generated DTM of TIN model. The RMSE of proposed method is 0.585933 while the one which is obtained by TIN method is 0.748113. The details of which are presented in the paper. Keywords: Data Structure, TIN, Grid, DTM, GIS.
1 Introduction There exists lots of data structure to explain the relationship between points, all of which have some advantages and drawbacks. The drawbacks are mostly attributed to the unreal assumption of component behaviors [3]. In this paper, grid and TIN data structure are scientifically evaluated. Then, the new data structure called snowing layer data structure is introduced. Finally, the advantages of using this new data structure are highlighted.
2 Grid Data Structure In grid data structure, the point is collected in a regular way, which means that the position of points and the distance between points are known [14]. This method has some advantages. For example, the spatially systematic positioning of the points ensures evenness of coverage across the study area. Moreover, the configuration resembles to a matrix structure of any high level programming languages. This characteristic facilitates computer manipulation. There are some weaknesses in this method O. Gervasi et al. (Eds.): ICCSA 2009, Part II, LNCS 5593, pp. 572–579, 2009. © Springer-Verlag Berlin Heidelberg 2009
A Novel Tree Graph Data Structure for Point Datasets
573
Fig. 1. Large grid simplifies the surface and removes important information
as well. For example, the collected points do not follow the shape of the surface [11]. This adds to the uncertainty of any surface representation. Specifying sampling distance is another issue of grid structure [12, 13]. If the distance between points increases, less points is gathered, and the surface will not represented well. Figure (1) shows the results of an abstraction occurs when grid size is large.
3 TIN Data Structure A TIN (Triangulated Irregular Network) is a vector data structure which represents the surface as a collection of non-overlapping triangles [1, 2]. These triangulations are created by joining the neighboring points to each other. Each point has a value beside its coordinate component called Z-component. Once a TIN is created, the third component (Z) of any point on the triangle's continuous surface can be estimated through some spatial interpolation method. Figure (2) shows an example of a Triangulated Irregular Network which is made by x, y, z, survey data [4, 5]. In TIN data structure, the points are collected irregularly from the surface. In this method the distance between points is not important, because the points located in significant position is gathered. Therefore, the method can solve one of the problems occurred in grid method. There are many methods in order to join the neighbor points. One of these methods is through Delaunay triangulation. The Delaunay triangulation of a point set is a collection of edges satisfying an "empty circle" property: for each edge it must find a circle containing the edge's endpoints but not containing any other points. One of the disadvantages of this method is that the points are connected to each other only based on two spatial components (x, y components) [6, 7, 8]. Logically, to estimate the height, the nearest point must be weighted more, therefore Delaunay triangulation try to connect nearest points. It is shown that height information (z component) can be used and enhance the accuracy of surface representation [15]. Properties of Delaunay triangulation are: • Local empty-circle property: The circum-circle of any triangle in Delaunay triangulation does not contain the vertex of the other triangle in its interior. • max-min angle property: In Figure 3, PlPk satisfies the max-min angle property if PlPk is the diagonal of tetrahedral which maximizes the minimum of the six internal angles associated with each of the two possible triangulations of tetrahedral [16].
574
S. Behzadi, A.A. Alesheikh, and Md.R. Malek
Fig. 2. Triangulated Irregular Network
Fig. 3. Angle criterion in Delaunay Triangulation
• Uniqueness: There is a unique Delaunay triangulation from a set of points. • Boundary property: External edges of Delaunay triangulation make the convex hull of the point set. • TIN is computationally complex.
4 Snowing Layer Data Structure In this part, a new method is introduced to describe the relationship between points better. The new method is inspired by the procedure of snow falling, so it is named “Snow Falling Data Structure”. Here, the process of snow falling is explained based on X, Y, Z survey points, then, the relationships between points are specified by the snow layer. In snow falling, snowflakes fall on the earth in a period of time, but in this case, it is assumed that all snowflakes fall instantaneously on the surface. The first point of the surface which meets the snow layer is the highest point as show in figure (4). The process of landing snow layer continues, and then the others points located around the highest point are covered by the layer. By decreasing the height of the layer more points are covered by the layer, in addition, the horizontal distance between peak point and the new covered point increases. This process will continue until the snow layer meets another summit point in the area. This new top points is as high as the others points which are covered by the snow layer, but the horizontal distance between this new peak point and other points with the same height is longer than the distance between other points. This new peak point is shown in figure (5). This procedure will continue until the whole area is covered by the snowing layer, then the procedure will stop.
A Novel Tree Graph Data Structure for Point Datasets
575
(a) Peak
(b) Fig. 4. (a) The surface of the earth (b) the peak point of the area meets the snow layer
The area was covered with snowing layer
The second peak point
Fig. 5. The covered area with the snowing layer and the second peak point
Snowing layer can be explained in a mathematical way. Figure 6 presents the proposed algorithm.
Insert {point data set} Insert {threshold distance (')} N ĸ number of points S ĸ {} S’ ĸ Sort point data set based on third (Z) component For I=1 to N P ĸ S’ (I) If P is near to Points in S Connect P to nearest point in S set End If S = S {P} S’ ĸ S’ – S End Fig. 6. The Algorithm of proposed method
576
S. Behzadi, A.A. Alesheikh, and Md.R. Malek
Peak
S set points
Fig. 7. S set is the covered area with snowing layer
A set of points are surveyed from the area. In order to describe the landing of snow layer, at first all points must be sorted based on z component of the points and in this step the x, y components of points are unimportant. The point with the highest z component is selected as the first peak point of the area. This selected point is put in the S set. (S set is a set in which the covered point in the area are put and the S’ set is a set of uncovered points in the area by the snowing layer). At first step, S set has only one member which is the highest point in the area. Then the second ordered point is selected, and connected to the neatest point in S set. Therefore, at first step only one point is stored in S set then the second ordered point is connected to the highest point. This process continues and the other points in S’ set is selected and connected to the nearest point in the S set. These selected points in S set are shown in figure (7). This process will continue until all points in S’ set are transferred to S set. At each step the distance between the selected points and the nearest point in S set is approximately considered as d±δ(d is the mean distance between selected points and the nearest point in S set, δ is the amount of tolerance). Afterwards, some points from S’ set is selected, if the distance between this new point and the nearest point in S set is near d±δ, then this new point is connected to the nearest point, if not, this new point is another peak point in the area, so this point is added to S set without any connection to the others points in S set. By using these rules other summit points in the surface are discovered. Figure (8) illustrates the process of finding the second peak point in the area.
The second peak of the area
Fig. 8. Snow Falling Data Structure which shows two peak points
A Novel Tree Graph Data Structure for Point Datasets
577
Unlike other methods, in our proposed algorithm all three coordinates of points are considered.
5 DTM by Snowing Layer At the previous step, the relationships between points are defined by snowing Layer. In this part, the third component of unknowns point is computed through Snowing Layer method. The unknown point is located between lots of vector made by Snowing Layer. The two nearest vectors to the unknown point is selected; these two vectors may have three cases with each other in 3D space: intersect, parallel, or avert. By using the following equitation the third component (Z) of unknown point is calculated:
Zx =
d2 d1 ( H1 ) + (H 2 ) d1 + d 2 d1 + d 2
(1)
Where: d1 is the distance between unknown point and vector (V1). d2 is the distance between unknown point and vector (V2). H1 is the height of the nearest point in vector (V1) H2 is the height of the nearest point in vector (V2) H1 and H2 are calculated through any spatial interpolation. Each vector (V1 or V2) has known end points coordinate.
6 Experiments In this part, the proposed data structure and TIN data structure are compared on a set of points which are surveyed from the surface of the earth. Both data structures are executed on the points, then, the relationship between points is described. Figure (9) shows the resulting TIN and snowing data structures. In order to compare these two methods, some unknown points are selected in the area and the third component of these points is calculated by both TIN and Snowing Layer approaches. Then, the results are compared with the true values of third
(a)
(b)
(c)
Fig. 9. Comparing Snowing Data Structure and TIN data structure, (1) Surface, (2) TIN data structure, (3) Snowing data structure
578
S. Behzadi, A.A. Alesheikh, and Md.R. Malek Table 1. The comparison between TIN and Snowing Layer Data Structure
Elevation 1875.18958 1863.27759 1859.69104 1861.91272 1868.84766
TIN method 1876.21191 1862.46582 1860.21912 1862.59863 1869.43494
Snowing layer Method 1876.18545771057 1862.88041499319 1860.1781568575 1862.164729861 1869.36367617288
components which are obtained by direct observation. The results are presented in table (1). The RMSE of calculated value for both methods is specified as:
TIN → RMSE = 0.748113 m Snowing Layer → RMSE = 0.585933 m
7 Conclusions and Recommendations In this paper, TIN and grid data structures were assessed. The structures have several advantages and disadvantages. In order to resolve the problems of these methods, a new data structure called “Snowing Data Structure” was introduced. This method is established based on the characteristic of snow falling in nature. Due to the nature of its creation, the connected lines in the Snowing Data Structure are the most stable ones in the area. The lines can be considered as controls to check the accuracy of TIN/Grid data structures. Snowing Data Structure is a set of tree graphs which describes the relationship between points. This method can be extended to create triangles in TIN. This case will be occurred by adding lines to the tree graph and changing the tree graphs to a connected graph like what is seen in TIN triangles. The results of the test demonstrated that the accuracy of height interpolation in Snowing Data Structure (RMSE=0.59m) is much better than that of TIN (RMSE= 0.75m). More case studies that bear more control points are needed to ascertain the results of this study.
References 1. Burrough, P.A.: Principles of Geographic Information Systems for Land Resource Assessment. Monographs on Soil and Resources Survey, No. 12. Oxford Science Publications, New York (1986) 2. El Sheimy, N., Valeo, C., Habib, A.: Digital Terrain Modeling: Acquisition, Manipulation and Applications. Artech House, Boston (2005) 3. Rolf, A., de Knippers, B.R.A., Sun, Y., Ellis, M.C., Kraak, M.-J., Weir, M.J.C., Georgiadou, Y., Radwan, M.M., van Westen, C.J., Kainz, W., Sides, E.J.: Principles of Geographic Information System, Version of 25th January (2001) 4. Shojaee, D., Helali, H., Alesheikh, A.A.: Triangulation For Surface Modeling. In: 9th Symposium on the 3-D Analysis of Human Movement, Valenciennes, France (2006) 5. Abdul-Rahman, A., Pilouk, M.: Spatial Data Modelling for 3D GIS. Springer, USA (2008)
A Novel Tree Graph Data Structure for Point Datasets
579
6. Li, Z., Zhu, Q., Gold, C.: Digital Terrain Modeling Principles and Methodology. CRC Press, Boca Raton (2004) 7. Alesheikh, A.A., Soltani, M.J., Nouri, N., Khalilzadeh, M.: Land assessment for flood spreading site selection using Geospatial Information System. International Journal of Environmental Science and Technology 5, 455–462 (2008) 8. Chaplot, V., Darboux, F., Bourennane, H., Leguédois, S., Silvera, N., Phachomphon, K.: Accuracy of interpolation techniques for the derivation of digital elevation models in relation to landform types and data density. Geomorphology 91, 161–172 (2007) 9. Carter, J.R.: Digital representations of topographic surfaces. Photogrammetric Engineering and Remote Sensing 54, 1577–1580 (1988) 10. Lu, G.Y., Wong, D.W.: An adaptive inverse-distance weighting spatial interpolation technique. Computers & Geosciences 34, 1044–1055 (2008) 11. Li, W.-j., Li, Y.-j., Liang, Z.-w., Huang, C.-w., Wen, Y.-w.: The Design and Implementation of GIS Grid Services. In: Zhuge, H., Fox, G.C. (eds.) GCC 2005. LNCS, vol. 3795, pp. 220–225. Springer, Heidelberg (2005) 12. Wang, J., Xue, Y., Guo, J., Hu, Y., Wu, C., Zheng, L., Luo, Y., Xie, Y., Liu, Y.: Study on Grid-Based Special Remotely Sensed Data Processing Node in Grid GIS. In: Min, G., Di Martino, B., Yang, L.T., Guo, M., Rünger, G. (eds.) ISPA Workshops 2006. LNCS, vol. 4331, pp. 610–617. Springer, Heidelberg (2006) 13. Jing, Z., Hai, H.: Research on Mass Terrain Data Storage and Scheduling Based on Grid GIS. In: Pan, Z., Cheok, D.A.D., Haller, M., Lau, R., Saito, H., Liang, R. (eds.) ICAT 2006. LNCS, vol. 4282, pp. 1263–1272. Springer, Heidelberg (2006) 14. Longley, P.A., Godchild, M.F., Maguire, D.J., Rind, D.W.: Geographic Information Systems and Sciences, 517 p. Wiley Publications, Chichester (2005) 15. Bowyer, A.: Computing Dirichlet Tesselations. Comp. J. 24, 162 (1981)
Software Dependability Analysis Methodology Beoungil Cho, Hyunsang Youn, and Eunseok Lee 300 Cheoncheon-dong, Jangan-gu, Suwon, Gyeonggi-do 440-746, Korea {mes1an, wizehack, eslee}@ece.skku.ac.kr
Abstract. Dependability can be verified at the integration phase of the software development life cycle. However the dependability verification processes that inspect software dependability in the late period of development have an effect in the development cost (e.g. time, human resources). Therefore, it is a very important issue to verify any nonfunctional requirements in the early stages of the development process. In this paper, we propose a software dependability analysis methodology of distributed component based software by using HQPNs (Hierarchically Combined Queuing Petri Nets) modeling. We prove the validity of the proposed methodology by applying it to a video conference system development. Keywords: Dependability, Reliability, Availability, Hierarchically Combined Queuing Petri Nets.
1 Introduction Successful software development is the satisfaction of a customer’s requirements within an appropriate period. A user requirement is classified as either a functional requirement or a non-functional requirement. Non functional requirement, in particular, are concerned with software quality which consists of performance, security, dependability, usability and maintainability and so on. From the view point of traditional software engineering, the nonfunctional requirement can be verified only at the integration phase of the software development life cycle. Unfortunately, the integration phase is in the late part of the overall development. If the integrated software can not satisfy the non-functional requirement, it could cause serious damage relating to development costs. Furthermore the costs associated with any modification increase exponentially as the development phase continues. Assuming that modification cost at design step is within reason, the modification cost at implementation step will be three times of the design cost, and modification cost at the integration step will be about seventy times that at the initial design phase. So if we can verify the nonfunctional requirement at an earlier stage of the software life cycle, we can reduce the development cost to one-seventieth. Consequently research into the methodology of non functional requirement verification at the early phases of development is a recently highlighted division of the software engineering field. O. Gervasi et al. (Eds.): ICCSA 2009, Part II, LNCS 5593, pp. 580–593, 2009. © Springer-Verlag Berlin Heidelberg 2009
Software Dependability Analysis Methodology
581
Software dependability, especially, remains one of the non functional requirements, while verification contains an availability and reliability analysis. Availability is the probability that a system responds to user requests in reasonable amount of time. Reliability is the ability that software delivers an appropriate service to user.[1] To analyze software dependability at the early stages of the development process, it must be possible to mathematically analyze the software model. The Markov Reward Model (MRM) is a representative mathematical model for analyzing software reliability. This model provides specific analysis means which is easier to analyze than a scenario or event based model. But the modeler must to know about every state of the system. Furthermore numerous system state models are necessary to be designed by modeler him- or herself. Actually the cost of these state modeling methods may be unacceptable. Therefore MRM auto generation methods are often required. Generalized Stochastic Petri Nets (GSPNs) are the modeling methods that are convenient for the system behavior representation. This model is able to represent various system behaviors. And the model can be transformed to the Markov Chain. Thus it can be a mediator between the software system and MRM mentioned above. However the transition from GSPN to the Markov Chain might cause a state explosion problem. In this paper, we propose the methodology for verifying software dependability of non functional requirements at the design phase of software life cycle. We use modeling methods based on Hierarchically Combined Queuing Petri Nets (HQPNs) to solve the GSPNs’ state explosion problem. We propose an MCM induction methodology from the HQPNs for dependability verification. After the methodology introduction, we also show the actual dependability analysis process by using proposed methodology. Through our methodology, developers can predict and analyze software dependability in a quantitative way before the development process reaches the implementation phase. This paper is organized as follows. First we review relevant previous research in Chapter 2. Chapter 2 contains HQPNs, AOPN and MRM. Then Chapter 3 introduces the transformation process from the HQPNs to MCM that can measure dependability. In Chapter 4, the analysis method for verifying availability and reliability by using the model derived from Chapter 3 is introduced. Chapter 5 represents a case study of a video conference system. At the last chapter, we draw conclusions.
2 Related Work In this chapter, we provide a brief presentation about HQPN [3,4], AOPN [5] and MRM [3] that is the fundamental basis of our work. We also discuss necessities of each model. 2.1 HQPN (Hierarchically Combined Queuing Petri Nets) Bause[3] suggests a hierarchical model that is a combination of queuing networks and Petri nets. This hierarchically combined Petri net model has both characteristics of a queuing network and a Petri net. A queuing network can design a model from a
582
B. Cho, H. Youn, and E. Lee
hardware view point and the Petri net can be modeled from the view point of the software. Additionally, the state explosion problem is addressed, because the model is composed hierarchically. For those reasons HQPN is a good model for modeling complex system states. Kounev [4] suggests a methodology for HQPN modeling. The HQPNs for system analysis can be designed through the following steps. Step1. System component and resource modeling: Classify system resources according to token and place. For this classification, hardware and software resources are classified as active resources and passive resources. Hardware resources like the Server, CPU and data base disk are classified as active resources. These resources are represented by a queuing place. Threads, processes, date base connections are passive resources, and they are represented by tokens that exist in ordinary places. Step2. Base component modeling of the work load: Work load models base components of tokens, and grant different colors to token classify each of the work load classes. Step3. Inter component interaction and process modeling: Interactions between components are modeled by transitions. To represent the sub-transaction of each processing step of the system transaction, a token is used. The transaction token is represented by a capital letter and the subtransaction is represented by a small letter. Step4. Model parameter setting: Determine the parameters used in the model. (e.g. the number of tokens exist in the place, service time of a token in the queue of the queuing place, transition firing weight, and firing delay.) By using the queuing network which was obtained by the above 4 steps, a hierarchical state space can be identified and represented abstractly by using subnet place. In this paper, the HQPN model is used for analyzing software dependability. 2.2 AOPNs (Aspect Oriented Petri Nets) There is an inadequate part in HQPN model to verify dependability. Here is an example. A system cannot deliver a service in the right amount of time when the system is not available. This means that HQPN loses its tokens. But HQPN cannot represent a loss of tokens, so the model contains a limitation that is unable to express the unavailable system states. Xu [5] proposes AOPN that contains cross-cutting concerns of aspect oriented programming. This Petri net transfer tokens to a designated state by defining point cuts in an established Petri net transition. The AOPN is able to insert additional Petri nets at a specific aspect, so the AOPN can be used to analyze availability. 2.3 MRM (Markov Reward Model) Goseva-Popstojanova [2] proposes a system reliability analysis methodology by using a Markov Chain that analyzes the various interactions between components. In this methodology, the Markov Chain can be converted to MRM. MRM is able to measure and verify system reliability mathematically. In [2], the CTMC (Continuous Time Markov Chain) is defined as follows. A state space is defined as S when {X(t), 0 ≤ t } and an infinitesimal generator matrix Q=[qij]. The MRM is defined by granting rewards to each state of the CTMC. The reward rate of state i is defined as ri. Then a random variable Z(t)=rx(t) becomes the instantaneous reward rate of the MRM on time t. The instantaneous reward rate is formula (1). If the steady state probability vector is defined as π = lim P(t ) , then the t →∞
expected reward rate in a steady state is same as formula (2).
Software Dependability Analysis Methodology
583
E[ Z (t )] = ∑ ri Pi (t )
(1)
E[ Z (t )] = ∑ riπ i
(2)
i∈S
i∈S
3 Quantified Dependability Analysis Model Framing Method In this chapter, we propose the software modeling and quantified analysis method for software reliability analysis. Fig. 1 shows the process that analyzes reliability and availability in quantified way. To analyze the reliability and availability, first, one must identify system elements and model those elements to HQPN. We follow the HQPN modeling method proposed by Kounev [4]. Then, as a function of the dependability analysis, define a reliability analysis aspect or availability analysis aspect, and insert the token losing place to the existing HQPN. The marking of aspect defined HQPN is equivalent to a state space of the Markov Chain. So, we can obtain a Markov Chain from the HQPN. Upon the conclusion, the Markov Chain is transformed into MRM by adding rewards to the Markov model that created the last step. 3.1 Transformation from HQPN to MC (Markov Chain) The definition of an infinitesimal generator matrix Q0 is necessary to transform HQPN into MC. Each state space of the Petri net is separated to two patterns, one of the two is transformed from the other state space and another of the two is transformed to the other state space. Assuming that the number of state spaces is n, then an n by n infinitesimal generator matrix, Q0, is created by using the two patterns of the state space transition.
Fig. 1. Dependability Analysis Process
Fig. 2. Markov Chain
584
B. Cho, H. Youn, and E. Lee Table 1. Transition rate for MC state 0 1 2
0
1
2
-Ɓ
Ɓ
0
Ɛ 0
ƐƁ
Ɓ
Ɛ
Ɛ
Fig. 3. Queuing place conversion
Fig.2 is the MC of Q0 from Table 1. In Table 1, each of the rows and columns indicate each state space. Respectively, a row means the former state and the column represents the state space that transitioned from the row. Fig.2 shows a simple Markov Chain. There are three states, so the infinitesimal generator matrix is 3 by 3 matrix. Table 1 is the infinitesimal generator matrix of Fig.2. This value is used to transform the Markov Chain. Fig.3 shows queuing place conversion in HQPN. The queuing place represents the hardware in the HQPN. Conversion from the queuing place into a normal place is necessary to transform the HQPN into an MRM. (a) In Fig.3 has queuing place Q1. If we represent the Q1 with timed transition, Q1 can be transformed to (b), shown in Fig.3. The queuing place Q1 can be transformed into two normal places and a one timed transition. Fig.4 shows the process that transforms the HQPN into the Markov Chain. The place P1 in Fig.4(a) has two tokens, A and B. In Fig.4(b) the process that the token A and B are fired to S1 are represented. M0 is the state that has token A and B. In this state, the state is changed to M1 when the token A is fired by transition T1. After that, state M1 is changed to M3 when the token B is fired by T1. The dotted line notes inform which tokens exist in a certain place. Through the above process, HQPN is transformed to the Markov Chain. 3.2 Transformation from MC to MRM In this chapter, we introduce the transformation process from Markov Chain to MRM. The Markov Chain transformed in chapter 3.1, transforms into MRM because we analyzed the dependability from MRM. MRM needs the fault definition of each state and the reward rate of each state.
Software Dependability Analysis Methodology
585
Fig. 4. HQPN transform into the Markov Chain
Table 2 shows the grouping of system fault states. In this table, a predictable system fault means that the fault can be restored in short time by changing system setting. The other side, an unpredictable fault can be restored by rebooting that uses a relatively long time than a predictable fault recovery. Occurrence probability of each fault can be graded differently by through the classification of faults. The reward rate which is used to analyze the reliability, classifies states into up-state and down-state. In this paper, the up-state reward rate is 1 and down state reward rate is 0, as shown in Table 3. Each state’s reliability or availability is separated to ‘yes’ or ‘no’, so the reward rate is a binary value. In this paper, we cannot grade the system reliability or availability partially like 50% satisfied. 3.3 Reward Rate Assessment for Availability Analysis Fig.5 is predictable fault occurrence model of a single processor availability model. Table 4 shows the notation of this model. State 0 and 1 in Fig.5 indicates the number of processors. State y represents an unpredictable fault occurrence in the system. Therefore, if a predictable fault has occurred, then the number of processors is reduced from 1 to 0. And if system is recovered by the probability τ, then the number of processors is recovered to 1 again. If a fault has occurred and this fault is unpredictable, then the system state enters into y. The state y cannot be restored, so there is no restore probability, and the number of processors is never restored to the normal state 1. Table 2. System fault classification System Fault Transient Permanent Recoverable Non Corrupting Unrecoverable Corrupting
Description Occurs only in specific input Occurs in every input Recover without operator Fault not effects on system state or data Recover with operator Fault effect on system sate or date
Predictability o o o o x x
586
B. Cho, H. Youn, and E. Lee
Fig. 5. Markov Chain of Availability Model Table 3. Reward rate State Up-state Down-state
Reward Rate 1 0
Description Reliable, Available Unreliable, Unavailable
Available state is only state 1 that has operating processor in Fig.5. The state 0 and y is an unavailable state because of the definition used in this paper. The available state is the up-state. Table 3 defines the reward rate to up or down state. So we can assess reward 1 to an available state. The unavailable state is the down state, so the reward rate of an unavailable state is 0. Table 5 shows the arrangement of reward rate assessment by Fig.5. 3.4 Reward Rate Assessment for Reliability Analysis Requirements do not affect the availability model, but the reliability analysis model is changed by requirements. An unreliable state in the Markov Chain reliability model means the state cannot transform into another state. Fig.6 shows the Markov Chain reliability model. If the requirement defines the unreliable state as unpredictable fault occurred state, then (a) shown in Fig.6 represents the reliability model. (b) in Fig.6 can change state from every state to every state, in that case the reliability requirement is more easily satisfied than (a). Fig.6 (c) shows a very strict reliability requirement. Fig.6 (c) defines all system faults as an unreliable state. So, an appropriate reward assessment is necessary to analyze reliability. By the reward rate definition in Table 3, state y at (a) of Fig.6 is the down-state, since state y cannot change its state. Naturally state 0 and state 1 are up-state. Consequently (a) of Fig.6’s reward rate assessment is as follows. State y’s reward rate is 0 and state 0 and state 1’s reward rate is 1. For Fig.6 (b), every state can change their state, so all state are up-state and there is no down-state. But For Fig.6 (c), state 0 and state y cannot change their state, so they are down-state and their reward rate is 0. State 1 is up-state and its reward rate is 1. Table 6 shows above states collection. Table 4. MRM symbols Symbol с 1-с γ π
Description Predictable fault occurrence rate Unpredictable fault occurrence rate Fault occurrence rate System recovery rate
Software Dependability Analysis Methodology
587
Table 5. Up and down state of availability Markov Chain model Up-state U={1}
Down-state D={0,y}
Table 6. Reward rate of reliability Markov Chain model Model (a) (b) (c)
Up-state U={0, 1} U={0, 1, y} U={1}
Down-state D={y} D={ } D={0, y}
As mentioned above, the reliability model should care for various circumstances as non-functional requirements such as reliability and abailability. To take care of those requirements for reflected state transitions, we used AOPN and inserted additional states to HQPN. A detailed example will be shown Quantitative Software Dependability Analysis In Chapter 3, we transform the HQPN to the Markov Chain and identify the up and down-state. Also we assess the reward rate for each state. Those rewards assessed with the Markov Chain are called as MRM (Markov Reward Model). This chapter introduces the reliability and availability analysis method by using MRM. 3.5 Quantitative Availability Analysis Quantitative availability analysis uses a reward rate definition contained in Table 5 and formula (1) and (2). U={1} is the up-state, so the instantaneous availability on time t is same as formula (3). Steady state availability is same as formula (4) by formula (2).
Fig. 6. Markov Chain of Reliability Model
588
B. Cho, H. Youn, and E. Lee
A(t ) = E[ Z (t )] = ∑ ri Pi (t ) = ∑ Pi (t ) = P1 (t )
(3)
A = E[ Z ] = ∑ riπ i =∑ π i = π i
(4)
i∈S
i∈U
i∈S
i∈U
3.6 Quantitative Reliability Analysis Instantaneous reliability on time t is the same as formula (5) by formula (2).
R(t ) = E[ Z (t )] = ∑ ri Pi (t ) = i∈S
∑ P (t ) i
i∈U j
(5)
In formula (5), U means the universal set of the up-state. As mentioned at Chapter 3.3, the reliability model should be changed to possess different reliability requirements. Therefore the up-state set, U, should be changed. The reliability model of Fig.6 (a) is same as formula (6) by using formula (5). The reliability model of (b) is same as formula (7). Reliability model of Fig.6 (c) is same as formula (8). U = {0,1}, Ra (t ) = P0 + P1
(6)
U = {0,1, y}, Rb (t ) = P0 + P1 + Py
(7)
U = {1}, Rc (t ) = P1
(8)
In Markov model, a state that cannot change its state called as the absorbing state. This state is regarded as failure state. In that case, the mean time to absorption is equal to mean time to failure. Mean time to absorption on a reliability based MRM is same shown by formula (9). By using formula (9), we can estimate system reliability on certain time period. MTTF = MTTA = E[Y (∞)] =
∑ L (∞)
i∈U j
i
(9)
4 Case Study In this chapter, we confirm the proposed dependability analysis method, by applying our method to a video conference system. The video conference system that is introduced in this chapter can support video and text message communication and transfer files while a conference is held. The CPU can operate smoothly with about 1000 connections. Additionally the thread pool allows for 10 concurrent connecters. We assumed that system dependability goal is 90% of reliability and availability. 4.1 HQPNs of Video Conference System
The constitution of the video conference system is shown in Fig.7. When clients login through the Internet, a load balancer assigns jobs to each server and sends the results
Software Dependability Analysis Methodology
589
Fig. 7. HQPNs of video conference system
of requested job to the client. This system provides text messaging, video messaging and file transfer functions to the user. To model this system, resources are identified as to places and tokens. Token C means a text message, token V means a video message, and token D means a file transfer. The small letter of each tokens, c, v and d mean a sub-transaction of the main transaction C, V and D. As Fig.7 shows, the video conference system is out of one load balancer and three video conference servers. The load balancer and video conference server are represented as queueing the Petri net as included in subnet places. Each load balancer and video conference servers consist of one regular place and queuing place. Modeled HQPN through above analysis is shown as Fig.7. Table 8 shows a partial firing rate of the HQPN. Those firing rates are obtained by the methods for HQPN modeling [5]. 4.2 Modeling for Availability Analysis
To measure the video conference system availability, developer determines the HQPN states as the system behavior. In a video conference system, the unavailable states are Table 7. System specification Component Server
Load Balancer
LAN
Description WebLogic 8.1 Server AMD Athlon 64 X2 4800, 1GB RAM, SuSE Linux8 Commercial HTTP Load Balancer, AMD Athlon 64 X2 4800, 1GB RAM, SuSE Linux 8 1 Gbit Switched Ethernet
590
B. Cho, H. Youn, and E. Lee Table 8. Firing Rate for HQPN of Video Conference System Transition t1, t3, t5
t2, t4, t6 t7, t8 t9, t10
In
Out
Firing Rate
C V D c v d C D V D
c v d C V D C D C D
0.34 0.51 0.15 0.32 0.55 0.13 0.82 0.18 0.87 0.13
the states that client is to terminate abnormally or the states when the server loses video or text messages. If we wanted to model this in the queuing Petri net, a token leak should have occurred. So we define an AOPN and the model token leak state and add it to the existing queuing Petri net. Fig.8 is the HQPN for availability analysis. Place L1, L2 and L3 means leaked token gathering state. We transform this Petri net to MRM to analyze the availability. Fig.9 is a part of the generated MRM. As the proposed method in Chapter 3, the unavailable state reward rate is assigned as 0. The available state reward rate is assigned as 1. By using those reward rates and formula (1), the system dependability is measured as the requirement shown in Table 9. If the message guarantee rate is lower than probability of Table 9, the system is in the unavailable state. Assigning reward rates according to this procedure, and calculating availability as in formula (10) allows us to judge which states disobey the availability requirement. As shown as Table 10, the number of availability model states is too large to represent the entire MRM.
A = \sum_{i \in U} \pi_i = \pi_1 + \pi_2 + \pi_3 + \pi_4 + \pi_5 + \cdots    (10)

Fig. 8. HQPNs for Availability Analysis
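A minimal numerical sketch of formula (10): given the generator of a repairable (irreducible) model, availability is the stationary probability mass of the up states. The generator and the partition into up/down states below are assumptions for illustration.

import numpy as np

# Sketch of formula (10): steady-state availability A = sum_{i in U} pi_i.
# Q is a hypothetical generator for a repairable system (no absorbing state);
# 'up' marks the available states (reward rate 1), the rest get reward 0.
Q = np.array([
    [-0.2,  0.2,  0.0],
    [ 0.5, -0.6,  0.1],
    [ 0.0,  1.0, -1.0],
])
up = [0, 1]

# Stationary distribution: pi Q = 0, sum(pi) = 1 (least-squares solve).
n = Q.shape[0]
Aeq = np.vstack([Q.T, np.ones(n)])
b = np.zeros(n + 1); b[-1] = 1.0
pi, *_ = np.linalg.lstsq(Aeq, b, rcond=None)

availability = pi[up].sum()
print(f"Availability = {availability:.4f}")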
Table 9. Video Conference System Dependability Requirements

Fault State          Probability
Text message loss    0.05
Video image loss     0.20
File data loss       0.05
Fig. 9. Part of generated Markov reward model
Table 10 presents the results of the availability model analysis. The unavailable states account for 8.1% of all states, so the available states account for 91.9%. The video conference system availability is therefore 91.9%, which satisfies the availability requirement of 90.0%.

4.3 Modeling for Reliability Analysis
The HQPN of the video conference system needs additional places to measure reliability. A reliable state must be defined, just as in the availability analysis. As mentioned in Chapter 3, the reliability model depends on the reliability requirement. In this case we adopt requirement (b) of Chapter 3, since video conference system failures can be fixed by retransferring or reconfiguring [8], and the aspect for reliability analysis is defined over those states. Under this assumption, an unreliable state of the video conference system is one in which the system cannot provide service to a customer because there are too many other users. The HQPN in Fig. 8 cannot represent the state in which the number of users grows beyond a certain bound; to represent such states, a new token must flow into the HQPN.

Table 10. Availability Model Analysis

Item                     Availability Model
# of states              3.138E+10
# of unavailable states  2.549E+9 (8.1%)
# of available states    2.884E+10 (91.9%)
Fig. 10. HQPNs for Reliability Analysis
Fig. 10 shows the model extended by using AOPN. Places E1 and E2 are added; they represent the cases in which the service was not provided properly or a request was not transmitted. To capture these states, token E is added. This token represents the growth in customers, so the number of tokens in these places is proportional to the number of service tokens C, V, and D. As service requests increase, the tokens flowing into the additional places increase as well; this represents the reliability reduction caused by large volumes of requests. The HQPN with the additional places can be transformed into an MRM for reliability analysis using the method introduced in Chapter 3. Fig. 11 shows part of the transformed MRM. Services can be recovered through system properties, so the reward rate of a recoverable state is assessed as 1, while the absorbing state B in Fig. 11 is assessed as 0. With these reward rates we can analyze reliability as formula (11) shows. We omit the entire MRM, since there are too many states, as shown in Table 11. Table 11 gives the complete results of the MRM analysis: the unreliable states are 7.8% of the total states, so the system reliability is 92.2% and the reliability requirement is satisfied.
R(t) = P_A + P_C + P_D + \cdots    (11)

Fig. 11. Generated Markov reward model
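For a model with an absorbing failure state, formula (11) can be evaluated numerically as the probability mass remaining in the non-absorbed states at time t. The sketch below, with a hypothetical generator, uses the matrix exponential; it illustrates the computation, not the paper's tool.

import numpy as np
from scipy.linalg import expm

# Sketch of formula (11): reliability R(t) as the probability mass remaining
# in the non-absorbing states of the MRM at time t. Q is hypothetical; the
# last state plays the role of the absorbing failure state B (reward 0).
Q = np.array([
    [-0.30,  0.25,  0.05],
    [ 0.20, -0.30,  0.10],
    [ 0.00,  0.00,  0.00],   # absorbing state B
])
p0 = np.array([1.0, 0.0, 0.0])   # initial distribution

for t in (1.0, 5.0, 10.0):
    pt = p0 @ expm(Q * t)        # transient state probabilities at time t
    R = pt[:-1].sum()            # R(t) = P_A + P_C + ... over recoverable states
    print(f"R({t}) = {R:.4f}")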
Table 11. Reliability Model Analysis

Item                   Reliability Model
# of states            2.824E+11
# of unreliable states 2.203E+10 (7.8%)
# of reliable states   2.604E+11 (92.2%)
5 Conclusion

In this paper, we proposed a mathematical analysis method that can measure software system dependability at the architecture design phase. The proposed process is as follows: 1) identify system elements and model the system as an HQPN; 2) define aspects for the unavailable and unreliable parts of the HQPN and add them to it; 3) derive an MRM from the aspect-extended HQPN; 4) analyze reliability and availability quantitatively on the derived MRM. We applied the method to a video conference system and could predict its dependability. Software developers can thus verify the dependability of a system, a representative nonfunctional requirement, at the early stages of the software lifecycle; this verification cost is considerably lower than the verification cost at the system integration phase. However, the derived MRM has a very large number of states. Our future work will involve building a tool for automatic transformation from AOPN to MRM.

Acknowledgments. This paper was supported by Faculty Research Fund, Sungkyunkwan University, 2008.
References

1. Basili, V., Boehm, B.: Software Defect Reduction Top 10 List. IEEE Computer 34(1), 135-137 (2001)
2. Goseva-Popstojanova, K., Trivedi, K.S.: Stochastic modeling formalisms for dependability, performance and performability. In: Reiser, M., Haring, G., Lindemann, C. (eds.) Dagstuhl Seminar 1997. LNCS, vol. 1769, pp. 403-422. Springer, Heidelberg (2000)
3. Bause, F., Buchholz, P., Kemper, P.: Hierarchically Combined Queueing Petri Nets. Lecture Notes in Control and Information Sciences, vol. 199, pp. 176-182 (1994)
4. Kounev, S.: Performance Modeling and Evaluation of Distributed Component-Based Systems Using Queueing Petri Nets. IEEE Transactions on Software Engineering 32(7), 486-502 (2006)
5. Xu, D.: Threat-Driven Modeling and Verification of Secure Software Using Aspect-Oriented Petri Nets. IEEE Transactions on Software Engineering 32(4), 265-278 (2006)
6. Park, J., Lee, J., Lee, E.: Goal Graph Based Performance Improvement for Self-Adaptive Modules. In: ACM International Conference on Ubiquitous Information Management and Communication (ICUIMC 2008), February 2008, pp. 68-72 (2008)
New Approach for the Pricing of Bond Option Using the Relation between the HJM Model and the BGM Model

Kisoeb Park1, Seki Kim2,*, and William T. Shaw1

1 Department of Mathematics, King's College London, United Kingdom
{kisoeb.park, william.shaw}@kcl.ac.uk
2 Department of Mathematics, Sungkyunkwan University, Korea
[email protected]
Abstract. In this paper, we propose a new approach for the pricing of bond options using the relation between the Heath-Jarrow-Morton (HJM) model and the Brace-Gatarek-Musiela (BGM) model. To derive a closed-form solution (CFS) for bond options on the HJM model with the BGM model, we first review basic concepts of the HJM model, under which a CFS for bond options is hard to achieve. Second, we obtain the bond pricing equation from the fact that the spot rate is equal to the instantaneous forward rate. Furthermore, we derive the formula for the discount bond price using the restrictive condition of Ritchken and Sankarasubramanian (RS). Finally, we obtain a CFS of bond options using the relation between the HJM volatility function σf(t, T) and the BGM volatility function γ(t, T), and give an analytic proof of the bond pricing. In particular, we confirm that humps occur in the bond call option prices, while the bond put option prices are decreasing functions of the maturity, as the values of δ (tenor) and σr (volatility of interest rate) increase, under two scenarios. This result provides a simple and reasonable estimate for the pricing of bond options under the proposed conditions.

Keywords: Heath-Jarrow-Morton (HJM) model, Brace-Gatarek-Musiela (BGM) model, Bond Option.
1 Introduction
Many researchers continue to make efforts to establish a standard model for stochastic interest rates. In particular, the pricing of bond options is a very important problem encountered in complex financial markets [1]. The pioneering term structure models are those of Vasicek [2], Brennan and Schwartz [3], and Cox-Ingersoll-Ross [4]. These models could not be calibrated consistently to the initial yield curve, and the relationship between the model parameters and the
This work was partially supported by the Korea Sanhak Foundation 2007 Research Fund.
* Corresponding author.
observed market features was not made clear in these models. By contrast, the arbitrage-free approach to modelling the term structure of interest rates has its genesis in the Ho and Lee model [5], and Heath, Jarrow, and Morton (HJM) developed a new class of stochastic interest rate models [6]. The HJM class takes into account the time evolution of an entire term structure of interest rates, and the absence of arbitrage opportunities is assured [7]. In the past decade, academics and practitioners have proposed various candidates for HJM volatility functions. However, the main drawback of the HJM framework is that it generally yields models that are non-Markovian, so the techniques from the theory of PDEs no longer apply. To overcome this problem, many researchers have considered settings in which properties of the earlier Markovian models and the HJM model coexist, providing a useful framework for studying interest rate derivatives. Another solution was achieved by the Brace-Gatarek-Musiela (BGM) model [8], which is based on forward LIBOR rates. Jamshidian developed a counterpart of the BGM model by taking swap rates as state variables [9]. The BGM and Jamshidian models are referred to as "market models" because they take traded instruments as state variables. The BGM model can be interpreted as a subset of the HJM model, and the arbitrage-free condition is automatically satisfied. Although the BGM model is attractive to market participants, it also has several drawbacks. We have to pick one particular "forward δ-LIBOR rate", e.g. the 6-month LIBOR rate, and we are obliged to stick to it; and the complicated drift term in the forward LIBOR rate evolution equation under the spot martingale measure can make implementations troublesome. The BGM model formulates the dynamics of forward LIBOR rates under a forward martingale measure rather than a spot martingale measure [10]; thus the forward LIBOR rate follows a lognormal distribution under the forward martingale measure. In the following, we fix δ and simply refer to forward δ-LIBOR rates as forward LIBOR rates. It is well known that obtaining the closed-form solution (CFS) of bond options under the HJM and BGM models is very difficult. Our final goal is to obtain a CFS for bond options on the HJM model with the BGM model. In this paper, we propose a new approach for the pricing of bond options using the relation between the HJM model and the BGM model. To obtain the bond pricing formula, we first use the relation between the HJM volatility function σf(t, T) and the BGM volatility function γ(t, T) to eliminate the risk term. Second, we derive the formula for the discount bond price using the restrictive condition of Ritchken and Sankarasubramanian (RS) [11], who extended Carverhill's results [12] to volatilities of forward rates that are differentiable with respect to the maturity date. We can then determine the CFS of bond options under the proposed conditions. Finally, to compute prices we consider two scenarios, since the CFS of the bond option price is obtained through the relation between the HJM model and the BGM model together with the restrictive condition of RS. By computational analysis, we confirm that the bond call option price on the proposed model is humped as δ (tenor) and σr (volatility of interest rate) increase, while the bond put option prices are
decreasing functions of the maturity under the two scenarios. This is very reasonable for the prediction of bond option prices [13]. Moreover, we explain term structure movements using latent factor models, i.e., how macro variables affect bond option prices and the dynamics of the yield curve [14]. This indicates an accurate estimate of the proposed model in the computational experiments. The structure of the remainder of this paper is as follows. Section 2 introduces basic concepts of the HJM model. Section 3 presents the new approach for the pricing of bond options using the relation between the HJM model and the BGM model. Section 4 examines the performance of the proposed model under two scenarios. Conclusions and future research are given in Section 5.
2 Basic Concepts of the HJM Model
Our model is set up on a given complete probability space (Ω, F, P) with an augmented filtration (F_t)_{t≥0} generated by Wiener processes {W_1(t), W_2(t), ..., W_n(t) | t ≥ 0}. We ignore taxes and transaction costs. A savings account V_t, representing the time-t value of a unit investment made at time 0 and rolled over at the spot rate, is given by

V_t = \exp\left( \int_0^t r(s)\,ds \right).    (1)
We denote by f(t, T) the instantaneous forward rate at time t for instantaneous borrowing at time T (≥ t). The price at time t of a pure discount (zero coupon) bond with maturity T, denoted V(t, T), is then defined as

V(t, T) = \exp\left( -\int_t^T f(t, s)\,ds \right),    (2)
so that V(T, T) = 1. The spot rate r(t) is the instantaneous forward rate at time t (≥ 0):

r(t) = \lim_{T \to t} f(t, T) = f(t, t).    (3)
We consider the multi-factor HJM model, in which the forward rate f is governed by an SDE of the form

df(t, T) = \mu_f(t, T)\,dt + \sum_{i=1}^{n} \sigma_{f_i}(t, T)\,dW_i(t),    (4)
where μ_f(t, T) is the drift function and σ_{f_i}(t, T) are the volatility coefficients; in this stochastic process, n independent Wiener processes determine the stochastic fluctuation of the entire forward rate curve starting from a fixed initial curve. The main insight of the HJM model is that the parameters μ_f(t, T) and σ_{f_i}(t, T) cannot be freely specified: the drifts of forward rates under the risk-neutral probability
are entirely determined by their volatilities and by the market price of risk. We introduce the arbitrage-free condition as follows:

\mu_f(t, T) = -\sum_{i=1}^{n} \sigma_{f_i}(t, T)\left( \lambda_i(t) - \int_t^T \sigma_{f_i}(t, s)\,ds \right),    (5)
where λ_i(t) is the instantaneous market price of risk, independent of the maturity T. By an application of Girsanov's theorem, the dependence on the market price of interest rate risk can be absorbed into an equivalent martingale measure; the Wiener processes become dW_i^Q(t) = dW_i(t) + \lambda_i(t)\,dt. Substituting the no-arbitrage condition (5) into equation (4) and specializing to the one-factor HJM model (n = 1) under the corresponding risk-neutral measure Q, the explicit dependence on the market price of risk can be suppressed, and we obtain the SDE

df(t, T) = \sigma_f^*(t, T)\,dt + \sigma_f(t, T)\,dW^Q(t),    (6)

where \sigma_f^*(t, T) = \sigma_f(t, T) \int_t^T \sigma_f(t, s)\,ds and dW^Q(t) is a standard Wiener process under the risk-neutral measure Q. In stochastic integral form, equation (6) may be written as

f(t, T) = f(0, T) + \int_0^t \sigma_f^*(s, T)\,ds + \int_0^t \sigma_f(s, T)\,dW^Q(s).    (7)
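The integral form (7) lends itself to a simple Euler-Maruyama simulation. The sketch below assumes an RS-style exponentially damped volatility (anticipating equation (22)); the parameter values and volatility choice are illustrative assumptions, not the paper's calibration.

import numpy as np

# Euler-Maruyama sketch of equation (7): simulate f(t, T) for fixed T.
def sigma_f(t, T, sigma_r=0.08, a=0.046):
    # assumed RS-style volatility: sigma_r * exp(-a (T - t))
    return sigma_r * np.exp(-a * (T - t))

def simulate_forward(f0, T, n_steps=1000, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    f = f0
    for k in range(n_steps):
        t = k * dt
        # no-arbitrage drift: sigma_f*(t,T) = sigma_f(t,T) int_t^T sigma_f(t,s) ds
        s_grid = np.linspace(t, T, 50)
        drift = sigma_f(t, T) * np.trapz(sigma_f(t, s_grid), s_grid)
        f += drift * dt + sigma_f(t, T) * np.sqrt(dt) * rng.standard_normal()
    return f

print(simulate_forward(f0=0.058, T=1.0))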
The probability density function of f(t, T) can be derived: the forward rate is normally distributed, with conditional mean and variance given by

f(t, T) \sim N\left( f(0, T) + \int_0^t \sigma_f^*(s, T)\,ds,\; \int_0^t \sigma_f^2(s, T)\,ds \right).    (8)

3 Pricing of Bond Options Using the Relation between the HJM Model and the BGM Model
We summarize our notation and give a brief review of the BGM model before deriving a closed-form solution (CFS) for the bond pricing model. The forward δ-LIBOR rate prevailing at time t over the future time interval [T, T + δ] is denoted by L(t, T) and follows a lognormal distribution under the forward martingale measure. It is related to the bond prices (2) by

1 + \delta L(t, T) = \frac{V(t, T)}{V(t, T + \delta)},    (9)
where the positive number δ is called the tenor of the LIBOR (δ = 0.25, 0.5, ...). We investigate the one-factor HJM model, in which we use the relation between the
HJM model and the BGM model. Integrating both sides of equation (7) from t to T gives

\int_t^T f(t, s)\,ds = \int_t^T f(0, s)\,ds + \int_0^t \int_t^T \sigma_f^*(u, s)\,ds\,du + \int_0^t \int_t^T \sigma_f(u, s)\,ds\,dW^Q(u).

If we substitute this into equation (2), we derive the zero coupon bond price as follows:

V(t, T) = e^{-\int_t^T f(t, s)\,ds}
        = e^{-\left( \int_t^T f(0, s)\,ds + \int_0^t \int_t^T \sigma_f^*(u, s)\,ds\,du + \int_0^t \int_t^T \sigma_f(u, s)\,ds\,dW^Q(u) \right)}
        = \frac{V(0, T)}{V(0, t)}\, e^{-\left( \int_0^t \int_t^T \sigma_f^*(u, s)\,ds\,du + \int_0^t \int_t^T \sigma_f(u, s)\,ds\,dW^Q(u) \right)}.
Since the spot rate r(t) is the instantaneous forward rate at time t (≥ 0), setting T = t in equation (7) gives

r(t) = f(0, t) + \int_0^t \sigma_f^*(s, t)\,ds + \int_0^t \sigma_f(s, t)\,dW^Q(s).    (10)
From equation (10), we get

\int_0^t \sigma_f(s, t)\,dW^Q(s) = r(t) - f(0, t) - \int_0^t \sigma_f^*(s, t)\,ds.    (11)
Substituting equation (10) into equation (2), we obtain

V(t, T) = \frac{V(0, T)}{V(0, t)} \exp\left\{ -\frac{1}{2} \left( \frac{\int_t^T \sigma_f(t, s)\,ds}{\sigma_f(t, T)} \right)^2 \int_0^t \sigma_f^2(s, t)\,ds - \frac{\int_t^T \sigma_f(t, s)\,ds}{\sigma_f(t, T)}\, [f(0, t) - r(t)] \right\}.    (12)
To find the bond price formula, we use the relation between the HJM volatility function σ_f(t, T) and the BGM volatility function γ(t, T). The BGM model formulates the dynamics of the forward δ-LIBOR rates under the spot martingale measure by the SDE

dL(t, T) = L(t, T)\,\gamma(t, T) \left[ \sigma_f(t, T)\,dt + dW^Q(t) \right],    (13)

where the forward δ-LIBOR volatility γ(t, T) is given by

\gamma(t, T) = \frac{1 + \delta L(t, T)}{\delta L(t, T)} \int_T^{T+\delta} \sigma_f(t, u)\,du.    (14)
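Relation (14) can be evaluated numerically for a given HJM volatility by quadrature over [T, T + δ]. In the sketch below, the volatility sigma_f and the flat LIBOR level L are illustrative assumptions.

import numpy as np

# Sketch of relation (14): the BGM volatility gamma(t, T) from the HJM
# volatility by numerical integration over [T, T + delta].
def sigma_f(t, u, sigma_r=0.08, a=0.046):
    return sigma_r * np.exp(-a * (u - t))   # assumed RS-style volatility

def gamma(t, T, L, delta=0.5):
    u = np.linspace(T, T + delta, 100)
    integral = np.trapz(sigma_f(t, u), u)
    return (1.0 + delta * L) / (delta * L) * integral

print(gamma(t=0.0, T=1.0, L=0.05))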
Theorem 1. Let the forward δ-LIBOR rate L(t, T) follow equation (13) and let the forward rate follow the SDE (6). Then we obtain the equivalent model

f(0, T) = r(T) - \int_0^T \sigma_f^*(s, T)\,ds - \frac{\log L(T, T) - \log L(0, T) - \int_0^T \gamma(s, T)\,\sigma_f(s, T)\,ds}{\int_T^{T+\delta} \frac{1 + \delta L(t, u)}{\delta L(t, u)}\,du},    (15)

that is, all forward rates are normally distributed.

Proof. Integrating both sides of equations (6) and (13) from 0 to T, we obtain

f(T, T) - f(0, T) = \int_0^T \sigma_f^*(s, T)\,ds + \int_0^T \sigma_f(s, T)\,dW^Q(s)    (16)

and

\log L(T, T) - \log L(0, T) = \int_0^T \gamma(s, T)\,\sigma_f(s, T)\,ds + \int_0^T \gamma(s, T)\,dW^Q(s).    (17)

Since f(T, T) = r(T), equations (16) and (17) can be rewritten as

\int_0^T \sigma_f(s, T)\,dW^Q(s) = r(T) - f(0, T) - \int_0^T \sigma_f^*(s, T)\,ds    (18)

and

\int_0^T \gamma(s, T)\,dW^Q(s) = \log L(T, T) - \log L(0, T) - \int_0^T \gamma(s, T)\,\sigma_f(s, T)\,ds.    (19)

In equation (19), using equation (14), we rearrange to get

\int_0^T \sigma_f(s, T)\,dW^Q(s) = \frac{\log L(T, T) - \log L(0, T) - \int_0^T \gamma(s, T)\,\sigma_f(s, T)\,ds}{\int_T^{T+\delta} \frac{1 + \delta L(t, u)}{\delta L(t, u)}\,du}.    (20)

Combining equations (18) and (20), we obtain

f(0, T) = r(T) - \int_0^T \sigma_f^*(s, T)\,ds - \frac{\log L(T, T) - \log L(0, T) - \int_0^T \gamma(s, T)\,\sigma_f(s, T)\,ds}{\int_T^{T+\delta} \frac{1 + \delta L(t, u)}{\delta L(t, u)}\,du}.    (21)
Here the forward rate is normally distributed, which means that bond prices are log-normally distributed. When the forward rate volatility has the form σ_f(t, T), the discount bond price V(t, T) is obtained under the restrictive condition of RS, who extended Carverhill's results [12] to forward rate volatilities that are differentiable with respect to the maturity date. In this paper, the volatility of the forward rate is given by

\sigma_f(t, T) = \sigma_r(t)\,\eta(t, T)    (22)

with \eta(t, T) = \exp\left( -\int_t^T a(s)\,ds \right) a deterministic function. Using the relation between the LIBOR rate and the forward rate from Theorem 1, we obtain the bond pricing formula as follows.

Corollary 1. Let the one-factor HJM model for the forward rate f(t, T) be given by the SDE

df(t, T) = \sigma_f(t, T) \int_t^T \sigma_f(t, s)\,ds\,dt + \sigma_f(t, T)\,dW^Q(t),    (23)
where dW^Q is a Gaussian process generated by the equivalent martingale measure Q and σ_f(t, T) is as given in equation (22). Then the discount bond price V(t, T) is given by the formula

V(t, T) = \frac{V(0, T)}{V(0, t)} \exp\left\{ -\frac{1}{2} \left( \frac{\int_t^T \sigma_f(t, s)\,ds}{\sigma_f(t, T)} \right)^2 \int_0^t \sigma_f^2(s, t)\,ds - \frac{\int_t^T \sigma_f(t, s)\,ds}{\sigma_f(t, T)}\, [f(0, t) - r(t)] \right\}    (24)

with equation (15). We now derive a CFS of bond options using the relation between the HJM model and the BGM model, considering the value of a European option on the bond price (2) under the proposed model. The price of a call option with exercise price K and maturity T on a discount bond maturing at T_V (T < T_V) is

C(t, T_V) = V(t, T)\,E[\max(\bar{V} - K, 0)] = V(t, T_V)\,N(h_1) - K\,V(t, T)\,N(h_2),    (25)

where

h_1 = \frac{\log([V(t, T_V)/V(t, T)]/K) + \sigma_V^2/2}{\sigma_V}, \qquad h_2 = h_1 - \sigma_V,
and \sigma_V^2 = \mathrm{Var}[\log V(T, T_V)], where E denotes the expected value in a world that is forward risk neutral with respect to a zero coupon bond maturing at time T_V, and N(x) is the standard normal cumulative distribution function. The price of a put option on the discount bond is

P(t, T_V) = V(t, T)\,E[\max(K - \bar{V}, 0)] = V(t, T)\left( K\,N(-h_2) - E[\bar{V}]\,N(-h_1) \right).    (26)
The resemblance between the bond option pricing formulas and the Black-Scholes-Merton formula is obvious: in both cases the underlying random variable is lognormal, resulting in similar formulas.
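For concreteness, a direct transcription of formulas (25) and (26) reads as follows. The input bond prices and sigma_V are left as parameters, and the forward bond price E[V-bar] = V(t, T_V)/V(t, T) is used in the put, as in (26). This is a sketch of the formulas, not the authors' Mathematica code; the numeric inputs are illustrative.

import numpy as np
from scipy.stats import norm

# Sketch of the option formulas (25)-(26) on a discount bond.
def bond_call(V_t_T, V_t_TV, K, sigma_V):
    h1 = (np.log((V_t_TV / V_t_T) / K) + 0.5 * sigma_V**2) / sigma_V
    h2 = h1 - sigma_V
    return V_t_TV * norm.cdf(h1) - K * V_t_T * norm.cdf(h2)

def bond_put(V_t_T, V_t_TV, K, sigma_V):
    h1 = (np.log((V_t_TV / V_t_T) / K) + 0.5 * sigma_V**2) / sigma_V
    h2 = h1 - sigma_V
    # E[V-bar] = V(t, T_V) / V(t, T) under the T_V-forward measure
    return V_t_T * (K * norm.cdf(-h2) - (V_t_TV / V_t_T) * norm.cdf(-h1))

print(bond_call(V_t_T=0.97, V_t_TV=0.94, K=0.8, sigma_V=0.02))
print(bond_put(V_t_T=0.97, V_t_TV=0.94, K=1.2, sigma_V=0.02))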
4 Performance of the Proposed Model
In this section, a special case of the general stochastic volatility framework introduced in Section 2 is considered to investigate the pricing of bond options on the HJM model with the BGM model. We estimate the CFS of bond options under the proposed conditions using Mathematica [15], as shown in Figs. 1-4. To compute prices we consider two scenarios, since the CFS of the bond option price is obtained through the relation between the HJM model and the BGM model and the restrictive condition of RS. In Experiments 1-4, the parameter values are assumed to be r0 = 0.058, rt = 0.0342, a = 0.046, σr = 0.08, t0 = 0, t = 0.05, T = TV − 0.5, and TV = 1, ..., 10. Let K be the strike price of a bond
Fig. 1. [Experiment 1] Bond call option prices on the HJM model with the BGM model (δ = tenor)
Fig. 2. [Experiment 2] Bond put option prices on the HJM model with the BGM model (δ = tenor)
option; then the K = 0.8 case is the bond call option and the K = 1.2 case is the bond put option. In Figs. 1 and 2, the bond option prices are plotted as the value of δ (tenor) increases, with the other parameters fixed as above. As shown in Figs. 3 and 4, the bond option prices are plotted as the value of σr (volatility of interest rate) increases, with the other parameters fixed as above. As can be seen from Figs. 1 and 3, humps occur in the bond call option prices, while the bond put
Fig. 3. [Experiment 3] Bond call option prices on the HJM model with the BGM model (σr = volatility of interest rate)
Fig. 4. [Experiment 4] Bond put option prices on the HJM model with the BGM model (σr = volatility of interest rate)
option prices are decreasing functions of the maturity as the values of δ (tenor) and σr (volatility of interest rate) increase, as shown in Figs. 2 and 4.
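A rough reproduction of the experiment grid can be scripted as below. The sigma_V proxy (an extended-Vasicek-style bond volatility built from a and sigma_r) and the flat discount curve are assumptions of this sketch; the paper computes prices in Mathematica from the closed-form solution.

import numpy as np
from scipy.stats import norm

# Sketch of the experiment loop with the stated parameters
# (r0 = 0.058, a = 0.046, sigma_r = 0.08, t = 0.05, T = TV - 0.5).
a, sigma_r, t, r0 = 0.046, 0.08, 0.05, 0.058

for TV in range(1, 11):
    T = TV - 0.5
    B = (1.0 - np.exp(-a * (TV - T))) / a   # bond volatility factor (assumed)
    sigma_V = sigma_r * B * np.sqrt((1.0 - np.exp(-2 * a * (T - t))) / (2 * a))
    V_t_T = np.exp(-r0 * (T - t))           # flat-curve discount bonds (illustrative)
    V_t_TV = np.exp(-r0 * (TV - t))
    for K, kind in ((0.8, "call"), (1.2, "put")):
        h1 = (np.log((V_t_TV / V_t_T) / K) + 0.5 * sigma_V**2) / sigma_V
        h2 = h1 - sigma_V
        if kind == "call":
            price = V_t_TV * norm.cdf(h1) - K * V_t_T * norm.cdf(h2)
        else:
            price = K * V_t_T * norm.cdf(-h2) - V_t_TV * norm.cdf(-h1)
        print(TV, kind, round(price, 6))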
5 Conclusion and Future Research
Stochastic interest rate models have long been of interest to researchers in complex financial markets. In particular, the HJM approach provides a very general interest rate framework, but its main drawback is that it generally yields models that are non-Markovian (i.e., path dependent): the models depend on the entire history of forward rates, so PDE techniques cannot be used to obtain a CFS, and numerical solutions are required where a CFS is not available. In this paper, to overcome this main drawback of the HJM model, we proposed a new approach for the pricing of bond options using the relation between the HJM model and the BGM model. To obtain the CFS of bond options under the proposed conditions, we first used the relation between the HJM volatility function and the BGM volatility function, and then derived the CFS of bond options on pure discount bonds under the restrictive condition of RS. In particular, we confirmed that humps occur in the bond call option prices, while the bond put option prices are decreasing functions of the maturity, as the values of δ (tenor) and σr (volatility of interest rate) increase, under two scenarios. This result provides a simple and reasonable estimate for the pricing of bond options under the proposed conditions. Many problems remain for further research, for instance: (i) comparison of the proposed model with actual market data; (ii) a dynamic prediction algorithm to predict bond option prices using actual bond data sets.
References

1. Das, S.R., Foresi, S.: Exact solutions for bond and option prices with systematic jump risk. Review of Derivatives Research 1 (2005)
2. Vasicek, O.A.: An Equilibrium Characterization of the Term Structure. Journal of Financial Economics 5, 177-188 (1977)
3. Brennan, M., Schwartz, E.: A Continuous Time Approach to the Pricing of Interest Rates. Journal of Banking and Finance 3, 133-155 (1979)
4. Cox, J.C., Ingersoll, J., Ross, S.: A Theory of the Term Structure of Interest Rates. Econometrica 53, 385-407 (1985)
5. Ho, T.S., Lee, S.: Term Structure Movements and Pricing Interest Rate Contingent Claims. Journal of Finance 41, 1011-1028 (1986)
6. Heath, D., Jarrow, R., Morton, A.: Bond Pricing and the Term Structure of Interest Rates. Econometrica 60, 77-105 (1992)
7. Chiarella, C., Kwon, O.K.: Classes of Interest Rate Models under the HJM Framework. Asia-Pacific Financial Markets 8, 1-22 (2001)
8. Brace, A., Gatarek, D., Musiela, M.: The market model of interest rate dynamics. Mathematical Finance 7, 127-155 (1997)
9. Jamshidian, F.: LIBOR and swap market models and measures. Finance and Stochastics 1, 293-330 (1997)
10. Musiela, M., Rutkowski, M.: Martingale Methods in Financial Modelling, 2nd edn. Springer, Heidelberg (2005)
11. Ritchken, P., Sankarasubramanian, L.: Volatility Structures of Forward Rates and the Dynamics of the Term Structure. Mathematical Finance 5, 55-72 (1995)
12. Carverhill, A.: When is the Spot Rate Markovian? Mathematical Finance 4, 305-312 (1994)
13. Buraschi, A., Jiltsov, A.: Habit Formation and Macroeconomic Models of the Term Structure of Interest Rates. Journal of Finance 62, 3009-3063 (2007)
14. Ang, A., Piazzesi, M.: A no-arbitrage vector autoregression of term structure dynamics with macroeconomic and latent variables. Journal of Monetary Economics 50, 745-787 (2003)
15. Wolfram MathWorld, http://mathworld.wolfram.com
Measuring Anonymous Systems with the Probabilistic Applied Pi Calculus

Xiaojuan Cai1 and Yonggen Gu2

1 BASICS Lab, Department of Computer Science, Shanghai Jiao Tong University, Shanghai
[email protected]
2 Department of Computer Science and Technology, Huzhou Teachers College, Zhejiang
[email protected]
Abstract. In [1] a formulation of anonymity based on metrics over Probabilistic Applied Pi processes was proposed. As an extension of [1], we neglect all internal interactions in the definition of the metric: the new metric between two processes is 0 when the two processes are weakly bisimilar (strongly bisimilar in [1]). On top of the metric, the degree of probabilistic anonymity is modelled for general anonymous systems in which there are no explicit senders or receivers. In addition, we devise an algorithm to calculate the metric between two finite processes. As a running example, we analyze a classical anonymous protocol, the Probabilistic Dining Cryptographers Problem, to illustrate the effectiveness of our approach.

Keywords: Anonymity, Probabilistic Applied Pi Calculus, Dining Cryptographers Problem.
1 Introduction
Anonymity has been getting more and more attention in the past few years as the Internet becomes part of people's everyday lives. Some web activities, such as web surfing, file sharing, and request sending, require a reasonable degree of anonymity; otherwise computer users may be reluctant to engage in them. Recently, many systems have been proposed to implement anonymity for various kinds of network communication, such as the FOO electronic voting protocol [2] and the untraceable electronic cash protocol [3]. These anonymous systems are non-probabilistic, and formal methods have been applied to analyze their anonymity. For example, Chothia [4] analyzed the MUTE anonymous file sharing system using the pi calculus and weak bisimulation.
This work is supported by the National 973 Project (2003CB317005), the National Natural Science Foundation of China (60573002, 60703033), the Natural Science Foundation of Huzhou City, Zhejiang Province (2007YZ10), and the Sci-Tech Brainstorm Stress Project of Huzhou City, Zhejiang Province (2007G22).
Probabilistic anonymous systems have been proposed for circumstances where absolute anonymity is hard to obtain; probabilistic protocols include Crowds [5], variants of DCP [6], and others. Instead of absolute anonymity, the degree of anonymity is defined and investigated. In [1] a metric on pairs of processes was defined to measure the degree of difference between two Probabilistic Applied Pi processes; it equals 0 when the two processes are strongly bisimilar. In the seminal work of Comon-Lundh and Cortier, the computational soundness of observational equivalence in the Applied Pi Calculus was proved under the IND-CPA security and INT-CTXT assumptions. Therefore, in this paper we extend the metric of [1] to the setting of weak bisimulation, which coincides with observational equivalence in the Probabilistic Applied Pi Calculus. The main contributions of this paper are as follows.
1. We give a definition of a metric on pairs of processes to measure the degree of difference between two Probabilistic Applied Pi processes. It equals 0 when the two processes are weakly bisimilar. Our definition extends the one in [7] with a more complex label set, and frame equivalence is taken into account.
2. On top of the metric, we define beyond suspicion, probable innocence, and possible innocence, three degrees of anonymity from strong to weak. Differently from [1], we formulate the anonymity degree for general anonymous systems in which there are no explicit senders or receivers, such as FOO voting protocols and electronic cash protocols.
3. We devise an algorithm to calculate the metric between two finite PAPi processes. Symbolic techniques are integrated into the algorithm to eliminate the infinite branching of input prefixes.
4. As a running example, we model the Probabilistic Dining Cryptographers Problem (PDCP for short), in which the coins are biased. When the number of cryptographers is in {3, 4, 5, 6}, we show that the relation between the degree of anonymity and the coin-tossing distribution is d = |1 − 2p|, where d is the degree and p denotes the probability of "head".
Outline of the Paper. The rest of the paper is organized as follows. Section 2 briefly introduces the Probabilistic Applied Pi Calculus. Section 3 defines the metric between two processes and gives the algorithm for metric calculation. Section 4 defines the degree of anonymity based on the metric. Section 5 gives the running example of PDCP. Finally, Section 6 concludes.
2 Probabilistic Applied Pi Calculus
In this section, we briefly recall some key concepts of probabilistic Applied Pi Calculus (PAPi for short). More details can be found in [8]. A signature Σ = {(f1 , a1 ), · · · , (fn , an )} is a finite set of function symbols fi each with the arity ai . The syntax of PAPi adds to the applied pi calculus the nondeterministic choice (+) and probabilistic choice (⊕p ) in the definition
of plain processes. Terms, plain processes, and extended processes of PAPi are defined by the following grammar:

M, N ::= a, b, c, ...                                                    (Terms)
       | x, y, z, ...
       | f(M_1, ..., M_l)

P, Q ::= 0 | \bar{u}\langle M \rangle.P | u(x).P | P + Q | P \oplus_p Q  (Plain Processes)
       | P | Q | !P | \nu n.P | if M = N then P else Q

A, B ::= P | {M/x} | A | B | \nu n.A | \nu x.A                           (Extended Processes)
In terms, a, b, c are names, x, y, z are variables, and f(M_1, ..., M_l) denotes function application if (f, l) ∈ Σ. In plain processes, 0 is the null process, which does nothing. \bar{u}\langle M \rangle.P outputs the term M on channel u and then behaves like P, while u(x).P inputs a term on channel u and then behaves like P with the received message replacing x. The nondeterministic choice P + Q denotes a process that may act like P or like Q, while the probabilistic choice P \oplus_p Q denotes a process that acts like P with probability p and like Q with probability 1 − p. P | Q denotes the parallel composition of P and Q. !P is infinitely many copies of P running in parallel. \nu n.P binds the fresh name n in P, and the conditional process if M = N then P else Q behaves like P if M = N and like Q otherwise. The extended processes extend plain processes with active substitutions {M/x}: the process P | {M/x} means the variable x in P can be replaced with the term M. We write fv(A), bv(A), fn(A), and bn(A) for the free and bound variables and the free and bound names of A. We use a, b, c, ... to range over the set of channel names, m, n, ... over the set of random numbers, and k, k_1, ... over the set of keys. Act is the set of all possible actions under the labeled rules in Appendix A. We denote equivalence between terms by M = N. Every signature Σ is equipped with an equational theory E, and we write M =_E N to mean that M = N is derivable in E. For example, consider the equational theory E given by

fst(pair(x, y)) = x,  snd(pair(x, y)) = y,  dec(enc(x, y), y) = x.

The meaning of the first two equations is clear, while the third one captures symmetric encryption and decryption. Within this equational theory, the following equations are easy to derive:

fst(pair(M_1, N)) = snd(pair(M_2, M_1)),  dec(enc(m, k), k) = m.
An extended process A's frame φ_A is obtained by replacing all the plain processes in A with 0. It can be written in the form φ_A ≜ \nu \tilde{n}.\{\tilde{M}/\tilde{x}\}, where \tilde{n} ⊆ n(\tilde{M}).
If \tilde{M} = ∅ and \tilde{x} = ∅, then φ_A ≡ 0. If \tilde{M} = {M_1, M_2, ..., M_n} and \tilde{x} = {x_1, x_2, ..., x_n}, then φ_A ≡ \nu \tilde{n}.({M_1/x_1} | {M_2/x_2} | ... | {M_n/x_n}). Frames can be viewed as the static knowledge exposed by A to its environment. Static equivalence (≈_E) is an equivalence relation on closed extended processes that captures the indistinguishability of their frames: if two processes are statically equivalent, the knowledge they expose to the environment is indistinguishable. The formal definition is given in Appendix B.
3 Metric
Since bisimulation equivalence is not a suitable concept in the presence of probabilities, a metric analogue of weak bisimulation was developed [7]. In this section, we apply this approach to PAPi, taking into account static equivalence and a more complex set of labels. We give an algorithm for automatically calculating the metric between finite probabilistic applied pi processes; the elimination of the infinite branching is due to the symbolic approach we adopt.

3.1 The Definition
We consider pseudo-metrics on the set of all closed extended processes.

Definition 1 (Pseudo-metrics). M is the class of 1-bounded pseudo-metrics on extended processes with the ordering m_1 ⪯ m_2 if (∀A, B)(m_1(A, B) ≥ m_2(A, B)). We denote the top element of M by ⊤, i.e. (∀A, B)(⊤(A, B) = 0); the bottom element is denoted by ⊥, i.e. ⊥(A, B) = 1 if A ≢ B and ⊥(A, B) = 0 otherwise.

One easily checks that (M, ⪯) is a complete lattice. Since the labeled transition system of PAPi (see Appendix A) is a labeled concurrent Markov chain, every process becomes a distribution over processes after a probabilistic transition, so we lift the metric to the set of distributions. The definition is based on the Hutchinson metric on probability measures; we have merely simplified it for our context of discrete finite-state distributions.

Definition 2. Suppose m ∈ M and μ, μ' are distributions on extended processes. Then m(μ, μ') is given by the solution of the following linear program:

maximize:   Σ_i (μ(A_i) − μ'(A_i))(a_i − a_K)
subject to: ∀i. 0 ≤ a_i, a_K ≤ 1
            ∀i, j. a_i − a_j ≤ m(A_i, A_j)

Definition 3. For closed extended processes A, B, the action set of A corresponding to B, denoted L(A, B), is the set

{α | α ∈ Act ∧ fv(α) ⊆ dom(A) ∧ bn(α) ∩ fn(B) = ∅}.
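Definition 2 is an ordinary finite linear program and can be solved with any LP solver. The sketch below uses scipy.optimize.linprog on a hypothetical three-point state space with an assumed pairwise metric m.

import numpy as np
from scipy.optimize import linprog

# Sketch of Definition 2: lift a pseudo-metric m on processes to
# a metric on distributions via the linear program above.
mu  = np.array([0.5, 0.3, 0.2])
mu_ = np.array([0.2, 0.5, 0.3])
m   = np.array([[0.0, 0.4, 1.0],
                [0.4, 0.0, 0.7],
                [1.0, 0.7, 0.0]])   # assumed m(A_i, A_j)

n = len(mu)
# variables: a_0..a_{n-1}, a_K; maximize sum_i (mu_i - mu'_i)(a_i - a_K)
c = -np.concatenate([mu - mu_, [-(mu - mu_).sum()]])   # linprog minimizes

# constraints: a_i - a_j <= m_ij for all i, j
A_ub, b_ub = [], []
for i in range(n):
    for j in range(n):
        row = np.zeros(n + 1); row[i], row[j] = 1.0, -1.0
        A_ub.append(row); b_ub.append(m[i, j])

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(0.0, 1.0)] * (n + 1))
print("m(mu, mu') =", -res.fun)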
L(A, B) is the set of actions A can perform, where the bound names in these actions are renamed so as not to clash with the free names of B. We now define a functional F on M that closely resembles the usual functional for weak bisimulation.

Definition 4. Given two closed extended processes A, B, a functional F on M is defined as follows:
– if A ≉_E B, then F(m)(A, B) = 1;
– if A ≈_E B, then

F(m)(A, B) = max( max_{α ∈ L(A,B)} sup_{A ⇒^α μ} inf_{B ⇒^α μ'} m(μ, μ'),  max_{α ∈ L(B,A)} sup_{B ⇒^α μ'} inf_{A ⇒^α μ} m(μ, μ') ),

where inf ∅ = 1 and sup ∅ = 0.

Similarly to [7], F is monotonic on M, and therefore F has a fixed point m_f, given as the limit of the chain m_0 = ⊤, m_{i+1} = F(m_i). Definition 4 gives us an intuition of how to calculate the maximum metric between processes: let m_0 = ⊤ and iterate F enough times to get the maximum fixed point m_f; the distance between two processes is m_f(A, B). In the rest of this paper, "the metric between A and B" means m_f(A, B). The following main theorem characterizes the metric: our definition is consistent with the weak bisimulation relation. We postpone the proof to Appendix C.

Theorem 1. For any two closed extended processes A, B, A and B are weakly bisimilar iff m_f(A, B) = 0, where m_f is the fixed point of F given m_0 = ⊤.

3.2 Automatic Calculation
The metric is defined for any two PAPi processes, but since PAPi is Turing complete, there is no algorithm to calculate it in general. If we restrict our attention to finite processes, however, we can devise an algorithm for automatic calculation. Finite processes are those that terminate in finitely many steps without infinite branching; in other words, their transition graphs are finite trees. When the process is finite, we need not iterate the calculation of F(m): one recursive function is adequate. We illustrate this by the following proposition, whose proof is direct.

Proposition 1. Given two finite closed extended processes A, B, the metric m(A, B) defined as follows equals m_f(A, B):
– if A ≉_E B, then m(A, B) = 1;
– if A ≈_E B, then

m(A, B) = max( max_{α ∈ L(A,B)} sup_{A ⇒^α μ} inf_{B ⇒^α μ'} m(μ, μ'),  max_{α ∈ L(B,A)} sup_{B ⇒^α μ'} inf_{A ⇒^α μ} m(μ, μ') ).
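Proposition 1 translates directly into a recursion over the finite transition trees. The sketch below assumes hypothetical helpers: each process node offers statically_equivalent and steps() (returning pairs of an action and a distribution), and dist_metric lifts the metric to distributions via Definition 2's linear program (e.g. the LP sketch above).

# Sketch of Proposition 1 as a recursion over finite transition trees.
# The process interface and dist_metric are hypothetical stand-ins.
def metric(A, B, dist_metric):
    if not A.statically_equivalent(B):
        return 1.0

    def one_side(X, Y):
        worst = 0.0
        for action, mu in X.steps():
            # inf over Y's matching moves; inf of the empty set is 1
            candidates = [dist_metric(mu, mu2)
                          for act2, mu2 in Y.steps() if act2 == action]
            worst = max(worst, min(candidates, default=1.0))
        return worst

    return max(one_side(A, B), one_side(B, A))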
However, the "finite" closed PAPi processes are very limited. For example, the process a(x).0 is not finite, because for any b we have a(x).0 →^{a(b)} 0; it is infinitely branching. In the analysis of security protocols, the protocol instance is modelled by some PAPi process, say P. If P →^{a(b)} P' for some a, b, and P', it means that the adversary sends a message b on the public channel a. The infinite branching of input actions represents the adversary's ability to forge arbitrary messages. To eliminate this source of infinite behavior, we impose the following restrictions.
– All the messages an adversary can send are obtained by a finite computation from the messages he/she already has. In PAPi, the frame represents all the messages the adversary knows; for example, the frame νn, k.{m/x, enc(n, k)/y} means the adversary has two messages, a nonce m and a ciphertext enc(n, k). The computational ability of the adversary is modeled by the functions in the signature, such as enc(m, enc(n, k)), pair(m, enc(n, k)), and so on. The restriction we impose is that the number of nested function applications is bounded by a constant (a sketch of such a bounded enumeration is given after this list).
– The adversary can choose a random number and add it to the frame. This random number is used when the frame is empty, since the adversary is assumed to be able to generate random numbers. We let the random number generation be deterministic in order to get rid of the infinite branching caused by infinitely many random numbers. This does not influence the result of the analysis, since different newly generated random numbers are indistinguishable from each other in the Dolev-Yao model [9].
Proposition 1 and the above restrictions give us a recursive algorithm to calculate the metric between any terminating protocol instances. The pseudo-code of the main algorithm is given in Appendix D.
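The first restriction can be realized by closing the frame's terms under the signature's functions up to a bounded nesting depth, as in the following sketch; the function symbols and the string representation of terms are illustrative assumptions.

from itertools import product

# Sketch: bounded closure of the adversary's knowledge under the signature.
SIGNATURE = {("pair", 2), ("enc", 2), ("fst", 1), ("snd", 1), ("dec", 2)}

def adversary_terms(frame_terms, depth):
    known = set(frame_terms)
    for _ in range(depth):
        new = set()
        for f, arity in SIGNATURE:
            for args in product(known, repeat=arity):
                new.add(f + "(" + ", ".join(args) + ")")
        known |= new
    return known

print(len(adversary_terms({"m", "enc(n, k)"}, depth=1)))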
4 Description of Anonymity
For each type of anonymity, Reiter and Rubin [5] defined six degrees of anonymity, three of which have attracted the most attention. Take sender anonymity as an example:
1. Beyond suspicion: though the attacker can see evidence of a sent message, the sender appears no more likely to be the originator of that message than any other potential sender in the system.
2. Probable innocence: from the attacker's point of view, the sender appears no more likely to be the originator than not to be the originator.
3. Possible innocence: from the attacker's point of view, there is a nontrivial probability that the real sender is someone else.
The degree of sender anonymity has been formulated based on metrics in [1]. It can easily be extended to the setting of weak bisimulation by using the metrics
defined in Section 3. In this paper, we consider general anonymous systems in which there are no explicit senders or receivers. In [10] the three concepts are extended to general anonymous systems: beyond suspicion means the actual user (i.e. the user who performed the action and wants to remain anonymous) is not more likely to have performed the action than any other user; probable innocence means the actual user has probability less than 1/2; possible innocence means there is a nontrivial probability that another user could have performed the action. Suppose A is a probabilistic anonymous system; then A(i) stands for an instance of A in which the actual user's identity is i. Let k_i denote the private knowledge of participant i, such as private keys and newly generated random numbers, and write Chn_i for the set of all channels related to i. For a set of corrupted participants T, we use A^T(i) to represent the instance in which A(i) is instantiated with all the private knowledge of the corrupted parties in T revealed to the environment and all their private channels under the control of the environment. In other words, the colluding participants make the environment more powerful. Formally,

A^T(i) ≜ A'(i) | ∏_{j ∈ T} {k_j / x_j},
where A'(i) is the process obtained from A(i) by opening the private channels of the participants in T. Now we come to the formal definition of the degree of anonymity.

Definition 5. For any anonymous system A with possible user set I, we say A satisfies
– beyond suspicion under corrupted parties T, if d^T = 0,
– probable innocence under corrupted parties T, if d^T < 0.5,
– possible innocence under corrupted parties T, if d^T < 1,
where d^T ≜ max_{i,j ∈ I\T} {m_f(A^T(i), A^T(j))}. As stated above, A^T(i) denotes the instance in which the actual user is i, and A^T(j) the one in which it is j. Both run alongside a group of attackers T and an environment that controls not only all the public channels and public keys but also the attackers' private channels and private keys. In the next section we will see a more illustrative example of what A^T(i) looks like. The degree of anonymity d^T equals the maximal metric between two instances with different non-corrupted actual users. If d^T = 0, the actual user is not more likely to have performed the action than any other user, since each pair A^T(i), A^T(j) is hardly distinguishable. If d^T < 0.5, the adversary can observe some differences between some user k and the others, but cannot conclude that k is the actual user. Similarly, when d^T < 1, the adversary may identify the actual user k of an instance with probability 0.999 yet still cannot be certain it is k. However, when d^T = 1, the system does not satisfy anonymity.
5 Probabilistic Dining Cryptographer Problem
The Dining Cryptographers Problem (DCP) is described as follows. A number of cryptographers are having dinner at a round table. The representative of their organization (the master) may or may not pay the bill. If he does not, he selects exactly one cryptographer and secretly orders him to pay. The cryptographers would like to learn whether the bill is paid by the master or by one of them; if it is paid by one of them, the identity of the payer should remain unknown. Chaum [11] gives a solution: associate a coin toss with each pair of adjacent cryptographers. After the coins are tossed, each cryptographer sees the results of the two coins associated with him/her and answers one question, "Are the results of the two coins the same?", with either "same" or "different". A cryptographer who is not the payer answers honestly; the payer lies. It is proved in [11] that the payer is among the cryptographers if the number of "different" answers is odd. The Probabilistic Dining Cryptographers Problem (PDCP for short) takes the coins to be probabilistic, so the methods for non-probabilistic anonymous systems no longer work, and we need additional techniques to cope with these cases.

5.1 Modeling
We use the set A = {0, 1, ..., n − 1} to represent the n cryptographers, and C_m = {c_{m,i} | i ∈ A} (resp. C_c = {c_{c,i} | i ∈ A}) for the channels between the master (resp. the coin system) and each cryptographer. Let C = C_m ∪ C_c. The cryptographers output their answers through a set of public channels pub = {pub_i | i ∈ A}; we make all channels private except those in pub. There are three types of participants in PDCP (see Figure 1):
1. The master (Master_i) chooses some cryptographer i to pay, then sends PAY on the channel c_{m,i} and NOTPAY on the channels {c_{m,j} | j ∈ A ∧ j ≠ i}.
2. The coin system (Coin) tosses the coins, then sends the results to each pair of adjacent cryptographers i and [i + 1], where [i + 1] means (i + 1) mod n; p is the probability of each coin turning up HEAD.
3. Cryptographer i (Crypt_i) gets the message x from the master and x_1, x_2 from the coin system. If x = PAY, cryptographer i compares x_1 with x_2 and gives his/her lying answer on channel pub_i; if x = NOTPAY, he/she gives the honest answer on pub_i.
5.2 Analysis
In PDCP, the cryptographers are the actual users who need to remain anonymous. Its instances are PDCP(k) ≜ νC.(Master_k | Coin | ∏_{i∈A} Crypt_i) for each k.
def
M asteri = cm,i PAY | ( def
M aster =
613
cm,j NOTPAY)
j∈A\{i}
M asteri
i∈A def
Crypti = cm,i (x).cc,i (x1 ).cc,i (x2 ). if x = PAY then if x1 = x2 then pubi DIFFERENT else pubi SAME else if x1 = x2 then pubi SAME def
Coin =
else pubi DIFFERENT (cc,i HEAD | cc,[i+1] HEAD) ⊕p (cc,i TAIL | cc,[i+1] TAIL)
i∈A def
P DCP = νC.(M aster | Coin |
Crypti )
i∈A
Fig. 1. Modeling PDCP in PAPi
Since there is no encryption, we have k_i = ∅. For any set of corrupted parties T ⊊ A,

PDCP^T(i) ≜ νC^T.(Master_i | Coin | ∏_{i∈A} Crypt_i),
where C^T = C \ {c_{m,j}, c_{c,j} | j ∈ T}. We first consider the case n = 3. There can be only one corrupted party, since with two corrupted parties the third one cannot remain anonymous. W.l.o.g., suppose cryptographer 0 is the corrupted party. Then

d^0 = max_{i,j ∈ {1,2}} {m(PDCP^0(i), PDCP^0(j))} = m(PDCP^0(1), PDCP^0(2)).

The function d^0(p) is depicted in Figure 2. It is obvious that when p = 0.5 (the coin-tossing distribution is HEAD: 0.5, TAIL: 0.5), d^0 equals 0: PDCP is beyond suspicion when the coins are unbiased. When the coins are biased with 0.25 < p < 0.75, we get d^0 < 0.5 and PDCP satisfies probable innocence. In the cases p = 0 or p = 1, PDCP is not anonymous, because d^0 = 1. These cases are consistent with our intuition: if the coin toss always lands HEAD and the payer is cryptographer 1, the corrupted party 0 will find that the outputs on pub0, pub1, pub2 are always SAME, DIFFERENT, SAME; if cryptographer 2 is the payer, they are always SAME, SAME, DIFFERENT.
Fig. 2. The relation of d0 and p (n=3)
So the corrupted party can easily learn which one is the payer. When the coins are unbiased, the corrupted party learns nothing more in the case where cryptographer 1 is the payer than in the case where 2 is. Similarly, we analyze the case n = 4: we calculate d^0 and d^{0,1} and find that their graphs as functions of p are exactly the same as Figure 2 for n = 3. We keep on checking n = 5 and n = 6 and get the same graph whenever each good party has at least one good neighbor; no good party may have two corrupted neighbors, or its anonymity is not assured. So, for n = 3, 4, 5, 6, if each good party has at least one good neighbor, the anonymity degree of PDCP depends on the coin-tossing distribution: formally, d = |1 − 2p|, where p is the probability of turning HEAD up. This result is a little different from that of [6], where entropy is used to measure the anonymity degree, giving

d(p) = |1 − 2p|    if n = 3
d(p) = |1 − 2p|^2  if n = 4
d(p) = |1 − 2p|^3  if n = 5
d(p) = |1 − 2p|^4  if n = 6
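The relation d = |1 − 2p| can be probed empirically. The Monte-Carlo sketch below compares, for n = 3 and corrupted party 0, the announcement distributions when cryptographer 1 versus cryptographer 2 pays; the total variation distance between them is only a rough proxy for the metric-based degree, but it tracks |1 − 2p|.

import random

# Monte-Carlo sketch: compare announcement distributions for payer 1 vs payer 2.
def announcements(payer, p, rng):
    coins = [rng.random() < p for _ in range(3)]     # coin i between i and i+1
    out = []
    for i in range(3):
        same = coins[i] == coins[(i - 1) % 3]        # the two coins crypto i sees
        out.append(same != (i == payer))             # the payer lies
    return tuple(out)

def distance(p, trials=100_000, seed=1):
    rng = random.Random(seed)
    freq = {1: {}, 2: {}}
    for payer in (1, 2):
        for _ in range(trials):
            o = announcements(payer, p, rng)
            freq[payer][o] = freq[payer].get(o, 0) + 1
    keys = set(freq[1]) | set(freq[2])
    # total variation distance between the two observation distributions
    return 0.5 * sum(abs(freq[1].get(k, 0) - freq[2].get(k, 0)) / trials
                     for k in keys)

for p in (0.5, 0.6, 0.75):
    print(p, round(distance(p), 3), "vs |1-2p| =", abs(1 - 2 * p))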
In [6] the degree of anonymity depends on the information given by the observable actions, and the observable actions in PDCP are taken to be the output actions on pub. In our model, however, the coin-tossing results on the private channels of corrupted participants can also be detected by the environment (attacker). We applied their method to the case n = 4, taking the coin-tossing
results of the corrupted parties to be part of the observables, and obtained exactly the same graph as ours here.
6 Conclusion
As an extension of [1], we gave a new metric by neglecting all internal interactions. This definition turns out to be consistent with the labeled bisimulation relation: the metric between two processes equals 0 if and only if they are bisimilar. We then defined the degree of anonymity based on this metric and gave an algorithm to calculate the metric of finite processes, which gives rise to an automatic calculation tool. The probabilistic version of the Dining Cryptographers Problem was analyzed as an illustrating example. As future work, we would like to extend the algorithm and the automatic tool to finite-state processes, which have the expressive power to model protocols that may not terminate.
References

1. Cai, X., Gu, Y.: Measuring Anonymity. In: Bao, F., Li, H., Wang, G. (eds.) ISPEC 2009. LNCS, vol. 5451, pp. 183-194. Springer, Heidelberg (2009)
2. Fujioka, A., Okamoto, T., Ohta, K.: A Practical Secret Voting Scheme for Large Scale Elections. In: Zheng, Y., Seberry, J. (eds.) AUSCRYPT 1992. LNCS, vol. 718, pp. 244-251. Springer, Heidelberg (1993)
3. Chaum, D., Fiat, A., Naor, M.: Untraceable Electronic Cash. In: Proceedings on Advances in Cryptology, pp. 319-327. Springer, New York (1990)
4. Chothia, T.: Analysing the MUTE Anonymous File-Sharing System Using the Pi Calculus. In: Najm, E., Pradat-Peyre, J.-F., Donzeau-Gouge, V.V. (eds.) FORTE 2006. LNCS, vol. 4229, pp. 115-130. Springer, Heidelberg (2006)
5. Reiter, M., Rubin, A.: Crowds: Anonymity for Web Transactions. ACM Transactions on Information and System Security 1(1), 66-92 (1998)
6. Deng, Y., Palamidessi, C., Pang, J.: Weak Probabilistic Anonymity. Electronic Notes in Theoretical Computer Science 180(1), 55-76 (2007)
7. Desharnais, J., Jagadeesan, R., Gupta, V., Panangaden, P.: The Metric Analogue of Weak Bisimulation for Probabilistic Processes. In: Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science, pp. 413-422 (2002)
8. Goubault-Larrecq, J., Palamidessi, C., Troina, A.: A Probabilistic Applied Pi-Calculus. In: Shao, Z. (ed.) APLAS 2007. LNCS, vol. 4807, pp. 175-190. Springer, Heidelberg (2007)
9. Dolev, D., Yao, A.: On the Security of Public Key Protocols. IEEE Transactions on Information Theory 29(2), 198-208 (1983)
10. Palamidessi, C.: Probabilistic and Nondeterministic Aspects of Anonymity. Electronic Notes in Theoretical Computer Science 155, 33-42 (2006)
11. Chaum, D.: The Dining Cryptographers Problem: Unconditional Sender and Recipient Untraceability. Journal of Cryptology 1(1), 65-75 (1988)
A Operational Semantics of PAPi
The operational semantics of PAPi consists of three sets of rules.

Structural Rules

Par-0:    A ≡ A | 0
Par-A:    A | (B | C) ≡ (A | B) | C
Par-C:    A | B ≡ B | A
Repl:     !P ≡ P | !P
New-0:    νn.0 ≡ 0
New-C:    νu.νv.A ≡ νv.νu.A
New-Par:  A | νu.B ≡ νu.(A | B)   if u ∉ fv(A) ∪ fn(A)
Alias:    νx.{M/x} ≡ 0
Subst:    {M/x} | A ≡ {M/x} | A{M/x}
Rewrite:  {M/x} ≡ {N/x}           if Σ ⊢ M = N

Internal Reduction Rules

Comm:  ā⟨x⟩.P | a(x).Q −→ P | Q
Then:  if M = M then P else Q −→ P
Else:  if M = N then P else Q −→ Q   if Σ ⊬ M = N

Labeled Rules

In:         a(x).P −→^{a(M)} P{M/x}
Out-Atom:   ā⟨u⟩.P −→^{ā⟨u⟩} P
Open-Atom:  if A −→^{ā⟨u⟩} A' and u ≠ a, then νu.A −→^{νu.ā⟨u⟩} A'
Scope:      if A −→^{α} A' and u ∉ n(α) ∪ v(α), then νu.A −→^{α} νu.A'
Par:        if A −→^{α} A' and bv(α) ∩ fv(B) = bn(α) ∩ fn(B) = ∅, then A | B −→^{α} A' | B
Struct:     if A ≡ B, B −→^{α} B', and B' ≡ A', then A −→^{α} A'
B Weak Bisimulation
We assume a discrete probability distribution μ : 2X → [0, 1] over the set X to have following properties:
Measuring Anonymous Systems
617
– μ(X) = 1; – μ(∪i Xi ) = i μ(Xi ). Let δx be the dirac measure on x. We define the probabilistic addition operation on distributions μ = μ1 +p μ2 as μ(Y ) = p · μ1 (Y ) + (1 − p) · μ2 (Y ). Let ExecA be the set of all labeled executions of A, def
α
α
1 k ExecA = {e | e = A −→ μ1 A1 · · · −→μk Ak , k ∈ N }
α
α
1 k For any finite labeled execution e = A −→ μ1 A1 · · · −→μk Ak , we define some unary functions on it.
e↑ = {e | e is a prefix of e } last(e) = Ak |e| = k label(e) = α1 α2 · · · αk Given a scheduler F , we write ExecF A for the set of executions starting from A and driven by the scheduler F . For each e ∈ ExecF A , some additional unary functions are defined on it as follows. PAF (e) = μ1 (A1 ) · . . . · μk (Ak ) F P robF A (e↑) = PA (e)
The set of executions starting from A that cross a process in the set H is F F F denoted as ExecF A (H). The probability P robA (H) = P robA (ExecA (H)). Let F ∗ ∗ ExecA (τ ατ , H) be the set of executions which start from A and lead to a process in H via an execution that consists of an α action preceded and followed ∗ ∗ by an arbitrary number of τ steps. The probability P robF A (τ ατ , H) is defined F F ∗ ∗ as P robA (ExecA (τ ατ , H)). We write dom(φ) for the domain of φ, a set of variables which appear in active substitutions in φ but not under a variable restriction. The domain of an extended process is the domain of its frame. Definition 6. We say that two terms M and N are equal in the frame φ, and write (M = N )φ, if and only if φ ≡ ν n .σ, M σ = N σ, and n ∩(f n(M )∪f n(N )) = ∅ for some names n and substitution σ. Definition 7. We say that two closed frame φ and ψ are statically equivalent, denoted by φ ≈s ψ, if dom(φ) = dom(ψ) and for all terms M and N , we have (M = N )φ if and only if (M = N )ψ. Definition 8 (Static equivalence). We say that two closed extended processes are statically equivalent ≈E , if their frames are statically equivalent.
618
X. Cai and Y. Gu
Weak bisimulation is defined under the labeled rules in Appendix A. That two closed extended processes are weak bisimilar implies that their operations are similar. In [8], weak bisimulation is proved to be same as observational equivalence, an equivalent relation considering the impossibility to differentiate two processes by any observer. Definition 9 (Weak bisimulation). ≈l is the largest symmetric relation R between closed extended processes with the same domain such that whenever ARB implies: – A ≈E B; F – ∀F, ∀C ∈ AC /R, there exists F , s.t. P robF A (C) = P robB (C); F ∗ ∗ – ∀F, ∀α = τ, ∀C ∈ AC /R, there exists F , s.t. P robF A (α, C) = P robB (τ ατ , C) with f v(α) ⊆ dom(A) and bn(α) ∩ f n(B) = ∅.
C
Proof of Theorem 1
Theorem 1. For any two closed extended processes A, B, A and B are weakly bisimilar iff mf (A, B) = 0, where mf is the fixed point of F . Before actual proof, we introduce some auxiliary definitions. Definition 10 (State metric). A metric m on closed extended processes is a state metric if for all processes A, B and c ∈ [0, 1), m(A, B) ≤ c implies, α
α
– if A =⇒ μ, then there exists some μ such that B =⇒ μ and m(μ, μ ) ≤ c. An obvious lemma follows the definition. Lemma 1. mf is a state metric. The relation between weak bisimulation and state metric is stated in the next lemma. Lemma 2. Given a binary relation R on closed extended processes, and a pseudo-metric m such that 0 if ARB, m(A, B) = 1 otherwise, R is a weak bisimulation iff m is a state metric. Proof. Given two distributions μ, μ . First let us consider how to compute m(μ, μ ) if R is an equivalence relation. We may assume that C1 , C2 , · · · be the equivalence classes of closed extended processes under R. Recall the linear program in Section 3: m(μ, μ ) = (μ(Ai ) − μ (Ai ))(ai − aK ) Maximalize: i
∀i.0 ≤ ai , aK ≤ 1, ∀i, j.ai − aj ≤ m(Ai , Aj ).
Subject to:
If s_{j1}, s_{j2} are in the same equivalence class, then m(s_{j1}, s_{j2}) = 0 and a_{j1} = a_{j2}. We can therefore rewrite the linear program in the following form:

Maximize: m(μ, μ′) = \sum_i (μ(C_i) − μ′(C_i))(a_i − a_K)
Subject to: ∀i. 0 ≤ a_i, a_K ≤ 1;  ∀i, j. a_i − a_j ≤ m(A_{C_i}, A_{C_j}).

Here μ(C_i) = \sum_{A_j ∈ C_i} μ(A_j), and A_{C_i} denotes the representative of the equivalence class C_i.

(⟹) If R is a weak bisimulation, it is of course an equivalence relation, and if m(A, B) = 0 then ARB. By the definition of weak bisimulation, for any α ∈ L(A, B) and A ⟹^α μ, there exists μ′ such that B ⟹^α μ′ and ∀j. μ(C_j) = μ′(C_j). So m(μ, μ′) = 0 ≤ m(A, B), and m is a state metric.

(⟸) Let m be as defined in the hypothesis. It is clear that R is an equivalence relation; we need to show that R is a weak bisimulation. First of all, A must be statically equivalent to B. Suppose ARB, so m(A, B) = 0 by definition. If A ⟹^α μ, then there exists some μ′ such that B ⟹^α μ′ and m(μ, μ′) ≤ m(A, B) = 0. Assume there exists l such that μ(C_l) ≠ μ′(C_l); w.l.o.g. assume μ(C_l) > μ′(C_l). Let a_j = 0 for all j ≠ l; then a_l = m(A_{C_j}, A_{C_l}). Because A_{C_j} and A_{C_l} are in different equivalence classes, a_l = m(A_{C_j}, A_{C_l}) = 1, and

\sum_i (μ(C_i) − μ′(C_i))(a_i − a_K) ≥ (μ(C_l) − μ′(C_l))(1 − 0) > 0,

which contradicts m(μ, μ′) ≤ m(A, B) = 0. One can conclude that R is a weak bisimulation, and we are done.

Back to the proof of the main theorem.

Proof. There are two directions.
– (⟹) If A ≈_l B, by Lemma 2 there exists a state metric m such that m(A, B) = 0. Then, since m_f ⊑ m, we have m_f(A, B) ≤ m(A, B) = 0.
– (⟸) We construct a pseudo-metric m with m(A, B) = 0 if m_f(A, B) = 0, and 1 otherwise. By Lemma 1, m_f is a state metric, and so is m. Then we construct a binary relation R: for any two closed extended processes, ARB iff m(A, B) = 0. According to Lemma 2, R is a weak bisimulation. So if m_f(A, B) = 0, then m(A, B) = 0 and thus ARB and A ≈_l B. We are done.
D The Main Algorithm
Algorithm 1. Metric(A, B)
  if A is not statically equivalent to B then
    return 1
  end if
  max ⇐ 0
  kA ⇐ the number of A's descendants
  kB ⇐ the number of B's descendants
  for i = 0 to kA − 1 do
    a ⇐ the action of A's i-th descendant
    dA ⇐ the distribution of A's i-th descendant
    min ⇐ 1
    for j = 0 to kB − 1 do
      if the action of B's j-th descendant equals a then
        dB ⇐ the distribution of B's j-th descendant
        m ⇐ DisMetric(dA, dB)
        if min > m then min ⇐ m end if
      end if
    end for
    if max < min then max ⇐ min end if
  end for
  return max

DisMetric(dA, dB)
  nA ⇐ the number of processes s with dA(s) > 0
  nB ⇐ the number of processes s with dB(s) > 0
  n ⇐ nA + nB
  initialize A[n], p[n], r[n][n]
  k ⇐ 0
  for i = 0 to nA − 1 do
    A[k] ⇐ the i-th process in dA
    p[k] ⇐ the probability of the i-th process in dA
    k ⇐ k + 1
  end for
  for i = 0 to nB − 1 do
    A[k] ⇐ the i-th process in dB
    p[k] ⇐ 0 − the probability of the i-th process in dB
    k ⇐ k + 1
  end for
  for i = 0 to n − 1 do
    for j = i to n − 1 do
      r[i][j] ⇐ Metric(A[i], A[j]); r[j][i] ⇐ r[i][j]
    end for
  end for
  return linearP(p, r), the solution of the linear program
    max \sum_i p[i](a_i − a_K)
    subject to ∀i: 0 ≤ a_i, a_K ≤ 1 and ∀i, j: a_i − a_j ≤ r[i][j].
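The linear program solved by linearP is a standard LP and can be handed to any off-the-shelf solver. The following Python sketch is our own illustration (not part of the paper), assuming SciPy is available; it sets up exactly the program max Σ_i p[i](a_i − a_K) under the bound and difference constraints above.

import numpy as np
from scipy.optimize import linprog

def dis_metric_lp(p, r):
    # Solve: max sum_i p[i]*(a_i - a_K) s.t. 0 <= a_i, a_K <= 1 and
    # a_i - a_j <= r[i][j]. Variables: a_0..a_{n-1} plus a_K at index n.
    n = len(p)
    c = np.zeros(n + 1)            # linprog minimizes, so negate the objective
    c[:n] = -np.asarray(p)
    c[n] = np.sum(p)               # coefficient of a_K after negation
    A_ub, b_ub = [], []
    for i in range(n):
        for j in range(n):
            if i != j:             # difference constraint a_i - a_j <= r[i][j]
                row = np.zeros(n + 1)
                row[i], row[j] = 1.0, -1.0
                A_ub.append(row)
                b_ub.append(r[i][j])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0.0, 1.0)] * (n + 1))
    return -res.fun                # value of the original maximization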
YAO: A Software for Variational Data Assimilation Using Numerical Models

Luigi Nardi 1,2, Charles Sorror 1, Fouad Badran 1,2, and Sylvie Thiria 1

1 LOCEAN, Laboratoire d'Océanographie et du Climat: Expérimentations et Approches Numériques, Unité Mixte de Recherche 7159 CNRS/IRD/Université Pierre et Marie Curie/MNHN, Institut Pierre Simon Laplace, 4, place Jussieu, 75252 Paris Cedex 05, France
2 CNAM-CEDRIC, Centre d'Etude et De Recherche en Informatique du CNAM, 292, rue St Martin, 75141 Paris Cedex 03, France
{luigi.nardi, charles.sorror, fouad.badran, sylvie.thiria}@locean-ipsl.upmc.fr
http://www.locean-ipsl.upmc.fr/
Abstract. Variational data assimilation consists in estimating control parameters of a numerical model in order to minimize the misfit between the forecast values and some actual observations. Gradient-based minimization methods require multiplying the transposed Jacobian matrix (the adjoint model), which is of huge dimension, by the derivative vector of the cost function at the observation points. We present a method based on a modular graph concept and two algorithms that avoid these expensive multiplications. The first step of the method is a propagation algorithm on the graph that computes the output of the numerical model and its tangent linear model; the second is a backpropagation on the graph that computes the adjoint model. The YAO software implements these two steps using appropriate algorithms. We present a brief description of YAO's functionalities. Keywords: Data assimilation, numerical model, modular graph, adjoint model, automatic differentiation, backpropagation.
1 Introduction
Numerical models are widely used as a tool for studying physical phenomena. A direct numerical model is a discretization of the equations that represent the physical phenomenon under study. The spatial structure of the phenomenon can be 1D, 2D or 3D; often an additional dimension is added to represent the time evolution. Most of the time the model is used to forecast or analyse the evolution of the phenomenon. Since the model is imperfect, the discrepancy between its forecast values and actual observations may be significant due to model parametrization,
numerical discretization, and uncertainty in the initial and boundary conditions. New methods which use both the direct model of the phenomenon and inverse problem methods have been introduced to overcome this difficulty: the so-called data assimilation. Data assimilation seeks a good compromise between the actual observations at some points of the space/time domain and the corresponding outputs of the numerical model. The observations constrain the control parameters (initial conditions, model parameters, ...) in order to force the direct model to reproduce the desired behavior. One can distinguish between two types of data assimilation methods: sequential and variational ([1,2,3,4,5,6]). The present paper is dedicated to variational methods, which are better suited for observations that are not given regularly in time and space. Variational methods consist in defining a cost function J which measures the misfit between the direct numerical model outputs and the observations. Their aim is to minimize the function J, which depends on the control parameters. This can be done by operating a local minimization using a gradient method. Normally, the user programs the direct (dynamic) model, computes the gradient of the cost function by programming the adjoint model, and schedules the minimization operations according to the selected scenario. From the data-processing point of view this yields two types of problems:
– If a direct model program has been implemented, it is necessary, for assimilation purposes, to implement the program which provides the adjoint model and sometimes also its tangent linear model.
– Once all these models have been implemented, it is necessary to schedule the various calculations according to a certain scenario and to the chosen minimization method.
The first problem leads to the use of automatic differentiation software ([7,8,9]), and the second to the design of specific software ([10]). The YAO software, which we develop at the LOCEAN laboratory, concerns these two tasks and aims to deal with both problems simultaneously. With YAO, the user specifies, using specific directives, the type of discretization and the specification of the direct model. YAO generates the direct model, the tangent linear model and the adjoint model. It also allows the user to choose, according to the specific scenario, an implementation for the minimization of the function J. In the following, section 2 deals with the notations of variational data assimilation; section 3 introduces the modular graph, which is YAO's basic formalism; section 4 presents the basic algorithms used by YAO; section 5 deals with the decomposition of an application into a modular graph; section 6 gives a brief presentation of YAO's functionalities and some applications already implemented; section 7 presents a simple example showing how to derive a YAO graph from a direct model.
2 Theoretical Principles and Notations
Variational assimilation requires the knowledge of a numerical model, the so-called direct model M, which describes the evolution of the physical phenomenon under
study. If we take for example a geophysical problem, the direct model links together the geophysical variables and the observations. The assimilation consists in modifying the control parameters so that the model reproduces the observations as well as possible. The control parameters can be, for example, the initial conditions or unknown parameters of M. In this section, we present the formal mathematical notations. We adopt the formalism and the notations presented in [11]:
– M: direct model describing the evolution (in general nonlinear) between two discretization time steps t_i, t_{i+1}.
– x(t_0): initial input state vector; we suppose thereafter that it corresponds to the control variables.
– M_i(x(t_0)) or M(t_0, t_i): state of the model at time t_i starting from t_0. We will denote x(t_i) = M_i(x(t_0)).
– M(t_i, t_{i+1}): tangent linear matrix, which is the Jacobian of the model M calculated at x(t_i).
– The tangent linear matrix of the model M_i calculated at x(t_0) is defined by:

M_i(x(t_0)) = \prod_{j=i-1}^{0} M(t_j, t_{j+1}).

– The adjoint matrix of the model M_i calculated at x(t_0) is defined by:

M_i(x(t_0))^T = \prod_{j=0}^{i-1} M(t_j, t_{j+1})^T.
– x^b: a background vector, which is an estimate of the vector x(t_0).
– y^o: set of observations at different times. The vector y_i^o represents the observations at time t_i; this vector can be empty if there are no observations at time t_i.
The model M allows quantities that are generally observed to be estimated through the observation operator H. In the field of geophysics this operator allows, for example, the outputs of a model M that calculates the sea surface temperature to be compared with observations recorded by a satellite radiometer. We denote:
– H: observation operator which allows one to calculate, starting from the direct model outputs at x(t_i), y_i = H(x(t_i)), the quantity y_i being the equivalent of the observation variables y_i^o. It is supposed thereafter that y_i^o = y_i + ε_i (ε_i is a random variable with zero mean).
– H_i: tangent linear matrix of the H operator calculated at x(t_i).
The assimilation consists in minimizing a cost function J which measures the misfit between the direct numerical model outputs and the observations by improving the control variables. Generally the cost function is defined as:

J(x(t_0)) = \frac{1}{2} \sum_{i=1}^{n} (y_i − y_i^o)^T R^{-1} (y_i − y_i^o) + \frac{1}{2} (x(t_0) − x^b)^T B^{-1} (x(t_0) − x^b).   (1)
Fig. 1. Basic iteration of the variational data assimilation: the misfit between the direct model output y and the observations y^o is defined by the cost function J. The derivative vector ∇_y J is used to compute the matrix product defined by the first term of expression (2); x_0 is the vector x(t_0) at the first iteration.
Here y_i = H(x(t_i)), R is the estimated covariance matrix of the observation errors ε_i, B is the estimated covariance matrix of the error on the background vector x^b, and n is the total number of time intervals. The gradient ∇_{x(t_0)} J is equal to:

∇_{x(t_0)} J = \sum_{i=1}^{n} M_i^T(x(t_0)) H_i^T [R^{-1}(y_i − y_i^o)] + B^{-1}(x(t_0) − x^b).   (2)
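As a toy illustration of (1) and (2) — our own sketch, with small dense matrices that one could never afford to form in a real application, which is precisely what motivates the graph algorithms of section 4 — the cost and its gradient can be evaluated with one forward sweep and one adjoint sweep (all names are hypothetical):

import numpy as np

def cost_and_gradient(x0, xb, M_list, H, R_inv, B_inv, y_obs):
    # Toy evaluation of J (eq. 1) and its gradient (eq. 2) with explicit
    # one-step model matrices M_list[i] = M(t_i, t_{i+1}) and a linear H.
    n = len(y_obs)
    J = 0.5 * (x0 - xb) @ B_inv @ (x0 - xb)
    grad = B_inv @ (x0 - xb)
    x, misfits = x0.copy(), []
    for i in range(n):
        x = M_list[i] @ x                    # forward sweep: x(t_{i+1})
        d = H @ x - y_obs[i]                 # y_i - y_i^o
        J += 0.5 * d @ R_inv @ d
        misfits.append(H.T @ (R_inv @ d))    # H_i^T R^-1 (y_i - y_i^o)
    adj = np.zeros_like(x0)
    for i in reversed(range(n)):             # adjoint sweep: apply M_i^T
        adj = M_list[i].T @ (adj + misfits[i])   # one time step at a time
    return J, grad + adj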
The minimization procedure, which is a gradient-type method, is carried out by choosing a particular implementation among those proposed by optimization techniques ([12]). These methods need to calculate the cost function gradient (2) with respect to the control parameters; Fig. 1 depicts the basic iteration. In order to facilitate convergence in some specific problems, we can use an approximate gradient descent method, called the incremental algorithm [13], which consists in modifying the function J locally. The minimization of J entails initializing the minimization algorithm with an initial state x^g(t_0), so that the control parameter vector we look for can be defined by x(t_0) = x^g(t_0) + δx(t_0). We use the tangent linear approximation:

y_i = H(M_i(x(t_0))) ≈ H(M_i(x^g(t_0))) + H_i(M_i(x^g(t_0))) δx(t_0).   (3)

If we set d_i = y_i^o − H(M_i(x^g(t_0))), we can express J as:

J[δx(t_0)] = \frac{1}{2} \sum_{i=1}^{n} [H_i M_i(x^g(t_0)) δx(t_0) − d_i]^T R^{-1} [H_i M_i(x^g(t_0)) δx(t_0) − d_i] + \frac{1}{2} [δx(t_0) − (x^b − x^g(t_0))]^T B^{-1} [δx(t_0) − (x^b − x^g(t_0))].   (4)

This new formulation of J is a quadratic expression with respect to δx(t_0) and corresponds to an approximation of J(δx(t_0)) around x^g(t_0). The problem is to minimize J(δx(t_0)) as a function of the vector δx(t_0) only, since the
other function terms of (4) are constant. This minimization phase is denoted the internal loop. In each internal-loop iteration, we have to calculate the gradient:

∇_{δx(t_0)} J = \sum_{i=1}^{n} M_i^T(x^g(t_0)) H_i^T R^{-1} [H_i M_i(x^g(t_0)) δx(t_0) − d_i] + B^{-1} [δx(t_0) − (x^b − x^g(t_0))].   (5)

After processing an internal loop (minimization iterations) to compute δx(t_0), we restart the minimization algorithm at the new initial state x^g(t_0) + δx(t_0) and run the internal loop again for some more iterations. This initial-state setting phase is called the external loop. In this paper we present the incremental method, which uses a two-stage scenario for the minimization process. Other scenarios, such as the dual formalism, which takes into account the possible error of the model M, lead to other and sometimes more sophisticated computations ([14,15]). The goal of this paper is not to present all the possible scenarios but to focus on the complexity of the algorithms needed for their implementation. Equations (2) and (5) show that gradient calculations require matrix products of the form M_i^T δy and M_i δx for any i. However, the matrices M_i^T and M_i being of huge dimensions, it is not easy to compute these products. We present in section 4 two algorithms to compute these matrix products without explicit knowledge of the matrices.
3 Modular Graph
We define the following terms:
– Module: a module F is an entity of computation; it receives an input vector and calculates an output vector. A module receives inputs from other modules or from the external graph context, and it transmits outputs to other modules or to the external graph context.
– Basic connection: if the j-th output of a module F_p is transmitted to the i-th input of a module F_q, we can model this transmission by a connection between output j of module F_p and input i of module F_q, which we denote thereafter by (i, q) and (j, p). We call this connection a basic connection. Data transmission towards the external context of the graph is represented by a basic connection starting from a module output and ending at a special node called an output data point. Data transmission towards the interior of the graph is represented by a basic connection starting from a special node called an input data point and ending at an input of a module.
A modular graph is a set of several interconnected modules. The modules are the graph vertices; an arc from module F_p to module F_q means that there exists at least one basic connection from F_p to F_q (Fig. 2a). The modular graph thus describes the scheduling of the module executions (Fig. 2b).
Fig. 2. (a) Basic connections between modules, with input data points x^T = (x_{1p}, x_{2p}, x_{2l}) and output data points y^T = (y_{1q}, y_{1r}, y_{2r}). (b) Corresponding modular graph.
The modular graph summarizes the sequential order of the computations: an arc from F_p to F_q indicates that F_q must start its execution only after F_p has been executed. The modular graph is acyclic, so it contains three types of vertices:
– The modules with no predecessors in the graph receive data from input data points only.
– The modules with no successors in the graph transmit outputs only to output data points.
– The internal graph modules necessarily receive inputs from one or several other modules, and possibly from input data points, and transmit results to their successor modules or to output data points.
The input data set of a module F_p constitutes a vector denoted x_p, and its output data set constitutes a vector denoted y_p (y_p = F_p(x_p)). As a consequence, a module F_q can be executed only if its input vector x_q has already been processed, which implies that all its predecessor modules F_p have been executed beforehand. Since the modular graph is acyclic, it is then possible to find an order on the modules (a topological order) which respects the following property: if F_p → F_q is an arc of the modular graph, then F_p precedes F_q in the order of computation. The topological order allows the calculation to be propagated correctly through the graph from the input points to the output points, all the graph input data points being initialized by the external context. The propagation of intermediate computations following the topological order is called the forward procedure. This procedure gives the way to produce the correct final value of the global computation of the direct model. If we denote by x the vector corresponding to all the graph input data point values, the forward procedure allows the vector y, corresponding to all graph output data point values, to be calculated. The modular graph defines a global function Γ and makes it possible to compute y = Γ(x).
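As an illustration of the forward procedure, here is a minimal Python sketch — our own, with a hypothetical data layout; module ports are scalars for brevity:

def forward(graph, order, x):
    # graph: name -> (func, sources); sources has one entry per module input,
    # either ("ext", key) for an input data point or ("mod", name, j) for the
    # j-th output of another module. order: module names in topological order.
    out = {}
    for name in order:                       # predecessors are executed first
        func, sources = graph[name]
        args = [x[s[1]] if s[0] == "ext" else out[s[1]][s[2]]
                for s in sources]
        out[name] = func(*args)              # y_p = F_p(x_p), a tuple of outputs
    return out                               # y = Gamma(x), read at output points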
4 Computation of the Tangent Linear and the Adjoint of the Global Function Γ

4.1 Tangent Linear Computation
We now denote by Γ a graph (composed of its F_q modules) and assume that each F_q can compute the matrix product dy_q = F_q dx_q, where dx_q is the perturbation of x_q and F_q is the Jacobian matrix of F_q calculated at x_q. It is then possible to compute, as for the forward procedure, the product dy = Γ dx, where dx represents the perturbation associated with all the input vectors x_q and Γ the Jacobian matrix of Γ computed at an input vector x. This computation is done on the modular graph using the following algorithm.

Lin forward Algorithm. Before determining the tangent linear of the global function Γ for a given input x, all the inputs x_{ip} of the different modules have to be determined. This is done by running the forward procedure with x as input.
1. Initialize, from the external context, the perturbation dx by assigning to each input data point i of the graph its corresponding perturbation dx_i.
2. Pass through all the modules following the topological order. For each module F_q, assemble its input perturbation vector dx_q by collecting the computed perturbations from the outputs of its predecessor modules or those initialized by the external context. Then compute dy_q = F_q dx_q (the Jacobian F_q is computed at the point x_q).
3. Recover the result vector dy at the graph output data points. This vector represents the perturbation of the global function Γ.
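A sketch of the Lin forward Algorithm, reusing the layout of the forward() sketch in section 3 (our own illustration; jac[name] is assumed to hold the Jacobian of the module at its forward inputs):

import numpy as np

def lin_forward(graph, order, jac, dx_in):
    # Step 2 of the algorithm: propagate perturbations in topological order.
    d_out = {}
    for name in order:
        _, sources = graph[name]
        dxq = np.array([dx_in[s[1]] if s[0] == "ext" else d_out[s[1]][s[2]]
                        for s in sources])   # assemble dx_q from predecessors
        d_out[name] = jac[name] @ dxq        # dy_q = F_q dx_q
    return d_out                             # dy, read at the output data points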
4.2 Adjoint Computation
As in the case of the tangent linear model, we suppose that for each module F_p, with an input vector x_p and receiving at its output points a perturbation vector dy_p, we can calculate the matrix product dx_p = F_p^T dy_p, where F_p^T is the transposed Jacobian matrix of the module F_p calculated at the point x_p. One can prove that it is then possible to compute the matrix product dx = Γ^T dy, where Γ^T is the transposed Jacobian of the global model Γ calculated at the input vector x and dy is a perturbation vector defined at each output data point. This computation is done by passing through the modules of the graph in the reverse topological order (backpropagation). The backward algorithm requires, for each module F_p, the definition of two vectors α_p and β_p. The vector α_p has the same dimension as the input vector of module F_p, and we denote by α_{ip} its i-th input element. The vector β_p has the same dimension as the output vector of module F_p, and we denote by β_{jp} its j-th output element. Moreover, the α parameters are also defined for the output data points, and the β parameters are defined for the input data points. We denote by α_i the parameter of the output data point i and by β_j the parameter of the input data point j.
If (j, p) is the index of the j-th output of F_p, we denote by:
– SUCCM(j, p) the set of indices (i, q) — i-th input of module F_q — which take the value of (j, p) as input.
– SUCCO(j, p) the set of all the output data points which take the value of (j, p) as input.
If j is an input data point, we denote by SUCCI(j) the set of indices (i, q) — i-th input of the modules F_q — which take their value from the input data point j.

Backward Algorithm. Before running this algorithm, all inputs x_{ip} of all modules F_p must have been calculated; for that, it is necessary to run the forward algorithm with the input vector x. For every output data point j, we suppose that its corresponding perturbation dy_j is already defined.
1. Initialize the parameters α_i relative to the graph output data points i by assigning α_i = dy_i.
2. Pass through all the modules in the reverse topological order. For each module F_p, compute β_p and α_p as follows:
   – For all its output indices (j, p), perform the following operations (in order to compute β_p):
     • Assign β_{jp} = 0.
     • If SUCCM(j, p) is not empty, then compute β_{jp} = \sum_{(i,q)∈SUCCM(j,p)} α_{iq}.
     • If SUCCO(j, p) is not empty, then compute β_{jp} = β_{jp} + \sum_{i∈SUCCO(j,p)} α_i.
   – Compute α_p = F_p^T β_p, where F_p^T is the transposed Jacobian matrix computed at point x_p.
3. For each input data point j, compute β_j = \sum_{(i,q)∈SUCCI(j)} α_{iq}.
The vector dx, whose components are the β_j of all the graph input data points, verifies dx = Γ^T dy, where dy is the vector whose components are the values dy_i defined at the graph output data points.

Remark 1. The two algorithms lin forward and backward suppose that we can compute the tangent linear and the adjoint of each module F_p. Modules can have very different complexities. In a simple case, where the module is an analytical function, we can calculate the Jacobian matrix F_p explicitly and compute the products F_p dx_p and F_p^T dy_p. Moreover, it is important to define the modular graph with "small" modules (small entities of computation), so that the analytical calculation of F_p becomes easy. For more complex modules, we can use programs which perform these computations (e.g., code obtained using automatic differentiation software [7,8,9], or even another modular graph). Thus the modular graph formalism, and its related algorithms, makes it possible to merge different numerical codes to build complex numerical models.
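And the corresponding backward sketch. Instead of gathering over the SUCC sets, it pushes each α contribution upstream, which computes the same sums (again our own illustration with the conventions of the previous sketches):

import numpy as np

def backward(graph, order, out, jac, dy):
    # dy: {(module name, output j): perturbation at that output data point}.
    beta = {name: np.zeros(np.size(out[name])) for name in graph}
    beta_in = {}                             # adjoints of the input data points
    for (name, j), v in dy.items():          # step 1: alpha_i = dy_i
        beta[name][j] += v
    for name in reversed(order):             # step 2: reverse topological order
        alpha = jac[name].T @ beta[name]     # alpha_p = F_p^T beta_p
        for i, s in enumerate(graph[name][1]):
            if s[0] == "ext":                # step 3: accumulate beta_j
                beta_in[s[1]] = beta_in.get(s[1], 0.0) + alpha[i]
            else:                            # push onto predecessor's output
                beta[s[1]][s[2]] += alpha[i]
    return beta_in                           # dx = Gamma^T dy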
5 Representation of an Application with a Modular Graph
Running simulations or data assimilation using an operational numerical model M_i(x(t_0)) = M(t_{i-1}, t_i) ◦ M(t_{i-2}, t_{i-1}) ◦ ... ◦ M(t_0, t_1) requires the definition of a modular graph representing the sequence of the computations. A numerical model operates on a discrete grid, where the physical process is computed at each time step and at each grid point. As the phenomenon under study is essentially the same at each grid point, there is a large amount of repetition. So the modular graph Γ associated with the numerical model M must take this repetition into account:
– A modular sub-graph (Γg) describes all the computations needed at time t for a given grid point (Fig. 3a).
– The (1D, 2D, 3D) graph is thus a modular graph whose vertices are the sub-graphs Γg, and whose arcs represent the information exchanges between them (Fig. 3b).
– The complete graph Γ, which takes a time interval into account, is obtained by duplicating the graph as many times as necessary.
As defined previously, we use the same concept of input and output data points for each module of the complete graph. The basic connections coming from the external context of Γ could be, for example, initializations or boundary conditions. Outgoing basic connections transmit their values to compute, for example, a cost function.
Fig. 3. Two graph abstraction levels: at the lower level (a), we build the graph Γg; at the space level (b), the same graph Γg is repeated for each grid point (2D in this example). The space connections between the Γg graphs correspond to the basic connections between the modules.
6 YAO Presentation
We have presented in the above sections the basic concepts and algorithms of the YAO software. We now present briefly the overall structure of YAO and the various directive files that allow applications to be generated. YAO permits the user to work with 1D, 2D and 3D problems and the time dimension. The user focuses on the application design and does not have to deal with the hard programming work: the problem is specified using a specific language (the YAO description language), and YAO automatically generates the associated modular graph. As already indicated, the description of the modular graph together with the three related algorithms is equivalent to a program providing the direct numerical model, the tangent linear and the adjoint. Moreover, YAO allows specific minimization scenarios to be defined. Figure 4 gives a schematic representation of the basic YAO architecture. The description and instructions files have to be written by the user of YAO, as well as the module specifications and the Jacobian of each module. YAO takes these files into account and generates an executable program. We briefly present the various YAO aspects.

The Description File. This file contains the YAO directives which define the characteristics of the direct numerical model. In particular, it is necessary:
– To define the numerical model time steps (denoted as the trajectory).
– To define the discrete space grid mesh (denoted spaces) and its dimension (1D, 2D, 3D).
– To introduce all the information related to the cost function: observations, covariance matrices, ...
– To define the modules, specifying for each one the number of inputs/outputs, its participation in the cost function, ...
– To build the graph by defining the basic connections between the modules.
– To indicate the computational order (the order in which the grid points have to be considered).
The description file contains all this information, which is used by YAO to generate the executable code.
Fig. 4. Schematic representation of YAO: code generation starting from the description file (which is a model specification) and the module files. The generated code (executable program) runs with an instructions file which controls the production of results.
The Instructions File. This file contains specific instructions (in the YAO instruction language) for running the model in a dedicated configuration (duration of the simulation, time increments, physical size of the space, initial values of the parameters, ...). The YAO instruction language allows the user to control the execution flow, to modify some parameters at runtime, and to introduce a background for the cost function.

The Module Files. These files contain the source code in which the physical laws of the numerical model, the input parameters and the Jacobian are programmed.

Figure 4 displays the YAO architecture: the part inside the large rectangular frame contains the YAO procedures for generating the modular graph. YAO also uses the description file and the module files presented above to generate the executable program (outside the frame in Fig. 4). Once the executable program is created, the instructions file is used to execute the user instructions. Modifying a model consists in changing some modules or some YAO directives in the description file; modifying an execution of a YAO application consists in changing the YAO instructions in the instructions file. Although YAO works in C/C++, it is nevertheless possible to link with other languages. Since YAO does not constrain the size of each module, the user may choose the model's decomposition. From a practical point of view, this choice may depend on the module decomposition and also on how the gradient of a module is computed. As already indicated, it is possible, in addition to YAO, to use an automatic differentiator rather than coding the Jacobian matrix manually. YAO provides some functionalities of an integrated tool. For example, it manages interfacing with a minimizer such as M1QN3 ([16]); it can deal with multilayer perceptrons; and it includes a general cost function taking background and covariance operators into account. Moreover, YAO manages multiple trajectories and multi-dimensional (up to 3D) computations. YAO has already been tested with success on several models in oceanography. It was applied in the following applications:
– Ocean color: variational inversion of multi-spectral satellite ocean color measurements for the restitution of the chlorophyll-a [17,18].
– Marine acoustics: variational inversion of sound speed profiles and retrieval of geoacoustic parameters (celerity, density, attenuation, ...) [19,18].
– PISCES: ocean color variational data assimilation in a biogeochemical model [20].
7 Numerical Example: The Shallow-Water
This section presents, on a simple example, the necessary steps to get the YAO graph of a direct model. At the end of the specification, YAO can generate the direct code in C++, the tangent linear code and the adjoint code, and is able to run some assimilation experiments. We chose the two-dimensional (2D) shallow-water model in the horizontal plane (x, y), also called the Saint-Venant model, which
arises from the vertical integration of the three-dimensional (3D) Navier-Stokes equations. This model describes the linear flow of a nonviscous fluid in a shallow-water environment with a free surface. In the present study, the focus is on the internal mode of a two-layer fluid whose densities are slightly different. The evolution is described by the following system of partial differential equations:

∂u/∂t = −g* · ∂h/∂x + f · v − γ · u
∂v/∂t = −g* · ∂h/∂y − f · u − γ · v
∂h/∂t = −H · (∂u/∂x + ∂v/∂y)

where:
– u and v are the horizontal velocities on axes x, y;
– h is the amplitude (the height) of the free surface;
– g* is the reduced gravity;
– f is the Coriolis parameter;
– γ is a dissipation coefficient;
– H is the average height of the water.
The system of partial differential equations is resolved spatially using the Arakawa C grid. For the temporal axis we use a leap-frog discretization followed by an Asselin filter to ensure stability. The spatial discretization is based on a regular 2D grid. After initialization, the direct numerical model is the following:

– Dynamic variables:

u_{ijt} = û_{ijt−2} + 2Δt ( (−g*/Δx) [h_{i+1jt−1} − h_{ijt−1}] + (f/4) [v_{ijt−1} + v_{ij+1t−1} + v_{i+1jt−1} + v_{i+1j+1t−1}] − γ · û_{ijt−2} )   (6)

v_{ijt} = v̂_{ijt−2} + 2Δt ( (−g*/Δy) [h_{ijt−1} − h_{ij−1t−1}] − (f/4) [u_{i−1j−1t−1} + u_{i−1jt−1} + u_{ij−1t−1} + u_{ijt−1}] − γ · v̂_{ijt−2} )   (7)

h_{ijt} = ĥ_{ijt−2} − 2Δt · H ( (u_{ijt−1} − u_{i−1jt−1})/Δx + (v_{ij+1t−1} − v_{ijt−1})/Δy )   (8)

– Asselin filter:

û_{ijt} = u_{ijt} + α(û_{ijt−1} − 2u_{ijt} + u_{ijt+1})   (9)
v̂_{ijt} = v_{ijt} + α(v̂_{ijt−1} − 2v_{ijt} + v_{ijt+1})   (10)
ĥ_{ijt} = h_{ijt} + α(ĥ_{ijt−1} − 2h_{ijt} + h_{ijt+1})   (11)
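For readers who prefer code to stencils, the following sketch applies one leap-frog step of (6)–(8) on the interior of the grid. It is our own illustration, not YAO-generated code; u_f, v_f, h_f stand for the Asselin-filtered fields at t−2, and boundaries are left untouched.

import numpy as np

def leapfrog_step(u, v, h, u_f, v_f, h_f, dt, dx, dy, gs, f, gamma, Hm):
    un, vn, hn = u.copy(), v.copy(), h.copy()
    for i in range(1, u.shape[0] - 1):
        for j in range(1, u.shape[1] - 1):
            un[i, j] = u_f[i, j] + 2*dt*(-gs*(h[i+1, j] - h[i, j])/dx
                + (f/4)*(v[i, j] + v[i, j+1] + v[i+1, j] + v[i+1, j+1])
                - gamma*u_f[i, j])                       # eq. (6)
            vn[i, j] = v_f[i, j] + 2*dt*(-gs*(h[i, j] - h[i, j-1])/dy
                - (f/4)*(u[i-1, j-1] + u[i-1, j] + u[i, j-1] + u[i, j])
                - gamma*v_f[i, j])                       # eq. (7)
            hn[i, j] = h_f[i, j] - 2*dt*Hm*((u[i, j] - u[i-1, j])/dx
                + (v[i, j+1] - v[i, j])/dy)              # eq. (8)
    return un, vn, hn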
where ĥ, û and v̂ are intermediate variables. In the following we present the directives of the YAO description file, which define the basic connections related to (6–11) and allow the shallow-water modular graph to be generated. We assume that the model is defined on a 50 × 50 grid (Δx = Δy = 5000 meters) and on a 100-time-step trajectory (Δt = 1500 seconds, about 1 day and 17 hours).

traj shallow_trajectory 1 100
space shallow_space 50 50 shallow_trajectory

modul Hfil space shallow_space input 3 output 1 tempo cout target
modul Ufil space shallow_space input 3 output 1 tempo
modul Vfil space shallow_space input 3 output 1 tempo
modul Hphy space shallow_space input 5 output 1 tempo
modul Uphy space shallow_space input 7 output 1 tempo
modul Vphy space shallow_space input 7 output 1 tempo

ctin Hfil 1 from Hfil 1 i j t-1
ctin Hfil 2 from Hphy 1 i j t-1
ctin Hfil 3 from Hphy 1 i j t
ctin Ufil 1 from Ufil 1 i j t-1
ctin Ufil 2 from Uphy 1 i j t-1
ctin Ufil 3 from Uphy 1 i j t
ctin Vfil 1 from Vfil 1 i j t-1
ctin Vfil 2 from Vphy 1 i j t-1
ctin Vfil 3 from Vphy 1 i j t
ctin Hphy 1 from Hfil 1 i j t-1
ctin Hphy 2 from Uphy 1 i j t-1
ctin Hphy 3 from Uphy 1 i-1 j t-1
ctin Hphy 4 from Vphy 1 i j t-1
ctin Hphy 5 from Vphy 1 i j+1 t-1
ctin Uphy 1 from Ufil 1 i j t-1
ctin Uphy 2 from Hphy 1 i j t-1
ctin Uphy 3 from Hphy 1 i+1 j t-1
ctin Uphy 4 from Vphy 1 i j t-1
ctin Uphy 5 from Vphy 1 i j+1 t-1
ctin Uphy 6 from Vphy 1 i+1 j t-1
ctin Uphy 7 from Vphy 1 i+1 j+1 t-1
ctin Vphy 1 from Vfil 1 i j t-1
ctin Vphy 2 from Hphy 1 i j-1 t-1
ctin Vphy 3 from Hphy 1 i j t-1
ctin Vphy 4 from Uphy 1 i-1 j-1 t-1
ctin Vphy 5 from Uphy 1 i-1 j t-1
ctin Vphy 6 from Uphy 1 i j-1 t-1
ctin Vphy 7 from Uphy 1 i j t-1

order shallow_space
forder YA1
forder YA2
order Hphy Uphy Vphy Hfil Ufil Vfil
The directive traj defines the trajectory called shallow_trajectory, composed of an initialization phase (1 time step) and a set of 100 time steps during which the model runs. The directive space defines a space called shallow_space; the space is defined by two axes of dimension 50 × 50 and is linked to the trajectory shallow_trajectory. YAO references the axes by YA1 for the first and YA2 for the second; these references are used thereafter to indicate the order in which the space is traversed in the order directive. The directive modul declares the modules. The YAO grammar defines keywords for the attributes that characterize a module: the keyword modul is followed by the name of the module (for instance Hfil), then the keyword space followed by the name of the space linked with the module. The input and output attributes specify the number of inputs and outputs of the module. tempo indicates that YAO must store the computed states over all the time steps of the trajectory; this is useful for derivative computation and for referencing previous time steps. The keyword target indicates that the outputs of this module are controlled, i.e., that it is the target of our assimilation process. The term cout means that the output of this module is related to some observations and will take part in the cost function computation. The implementation of the modules is done in the module files, as shown in Fig. 4. In this example the modules (Hphy, Uphy, Vphy, Hfil, Ufil, Vfil) represent (h, u, v, ĥ, û, v̂). The directive ctin creates the basic connections of the graph: the input of the first module, at a current point (i, j, t) of the discretized space, takes its value from the output of the second module at a specified point (given after the module name). The number after the first module name indicates the index of the specific input; the number after the second module name indicates the index of its output. The directive order defines the execution order of the modules that belong to a space. This directive coordinates the computation of the various modules, i.e., a module is computed only if all its inputs coming from the predecessor modules have already been computed. The axes referred to by YAi (YA1 for the first axis and YA2 for the second) are fixed and traversed in the mentioned order. These directives allow YAO to generate the global graph (in space and time), and the order directive gives a topological order of the graph.
8 Conclusion
The YAO software is dedicated to variational data assimilation in numerical models. It allows a user dealing with a discrete numerical model representing a physical phenomenon to describe the basic computation at each grid point. The user has to declare the space size and the number of time steps, specify the initial conditions and other parameters, present observations in the space/time domain, define the cost function and the scenario chosen for its minimization, etc. The YAO software then generates an executable program that enables the user to run data assimilation sessions. The current version of YAO can deal with
real applications. We presented the modular graph concept, which is the core of YAO, and the algorithms that have been implemented in the YAO software. The modular graph structure opens up prospects for research and improvement: an analysis of the modular graph's structure makes it possible to address the automatic generation of the topological order and the automatic parallelization of the presented algorithms.
References
1. Data Assimilation, Concepts and Methods. In: Training Course Notes of ECMWF, European Center for Medium-range Weather Forecasts, vol. 53, UK (1997)
2. Le Dimet, F.-X., Talagrand, O.: Variational Algorithms for Analysis and Assimilation of Meteorological Observations: Theoretical Aspects. J. Tellus, Series A, Dynamic Meteorology and Oceanography 38(2) (1986)
3. Louvel, S.: Etude d'un Algorithme d'Assimilation Variationnelle de Données à Contrainte Faible. Mise en Oeuvre sur le Modèle Océanique aux Equations Primitives MICOM (in French). PhD thesis, Université de Toulouse III, France (1999)
4. Sportisse, B., Quélo, D.: Assimilation de Données. 1ère Partie: Eléments Théoriques (in French). Technical report, CEREA (2004)
5. Talagrand, O.: The Use of Adjoint Equations in Numerical Modelling of the Atmospheric Circulation. In: Griewank, A., Corliss, G. (eds.) Workshop on Automatic Differentiation of Algorithms: Theory, Implementation and Applications, Breckenridge, Colorado, USA (1991)
6. Talagrand, O.: Assimilation of Observations, an Introduction. J. Meteorological Society Japan 75, 191–209 (1997)
7. Hascoët, L., Pascual, V.: TAPENADE 2.1 User's Guide. Technical report, INRIA, France (2004), http://www-sop.inria.fr/tropics/papers/Hascoet2004T2u.html
8. Naumann, U., Utke, J., Wunsch, C., Hill, C., Heimbach, P., Fagan, M., Tallent, N., Strout, M.: Adjoint Code by Source Transformation with OpenAD/F. In: Wesseling, P., Périaux, J., Oñate, E. (eds.) Proceedings of the European Conference on Computational Fluid Dynamics (ECCOMAS CFD 2006), TU Delft (2006), http://proceedings.fyper.com/eccomascfd2006/documents/35.pdf
9. Giering, R., Kaminski, T.: Recipes for Adjoint Code Construction. ACM Transactions on Mathematical Software 24(4), 437–474 (1998), http://www.FastOpt.com/papers/racc.pdf
10. Fouilloux, A., Piacentini, A.: The PALM Project: MPMD Paradigm for an Oceanic Data Assimilation Software. In: Amestoy, P.R., Berger, P., Daydé, M., Duff, I.S., Frayssé, V., Giraud, L., Ruiz, D. (eds.) Euro-Par 1999. LNCS, vol. 1685, p. 1423. Springer, Heidelberg (1999)
11. Ide, K., Courtier, P., Ghil, M., Lorenc, A.: Unified Notation for Data Assimilation: Operational, Sequential and Variational. Special Issue J. Meteorological Society Japan, Data Assimilation in Meteorology and Oceanography: Theory and Practice 75, 181–189 (1997)
12. Fletcher, R.: Practical Methods of Optimization, 2nd edn. John Wiley, New York (1987)
13. Courtier, P., Thépaut, J., Hollingsworth, A.: A Strategy for Operational Implementation of 4D-VAR Using an Incremental Approach. Q. J. R. Meteorological Society 120, 1367–1387 (1994)
14. Louvel, S.: Implementation of a Dual Variational Algorithm for Assimilation of Synthetic Altimeter Data in the Oceanic Primitive Equation Model MICOM. J. Geophys. Res. 106(C5), 9199–9212 (2001)
15. Weaver, A., Deltel, C., Machu, E., Ricci, S., Daget, N.: A Multivariate Balance Operator for Variational Ocean Data Assimilation. Q. J. Royal Meteorological Society (2005)
16. Gilbert, J., Lemaréchal, C.: Some Numerical Experiments with Variable-storage Quasi-Newton Algorithms. Mathematical Programming 45, 407–435 (1989), http://www-rocq.inria.fr/estime/modulopt/optimization-routines/m1qn3/m1qn3.html
17. Brajard, J., Jamet, C., Moulin, C., Thiria, S.: Use of a Neuro-variational Inversion for Retrieving Oceanic and Atmospheric Constituents from Satellite Ocean Colour Sensor: Application to Absorbing Aerosols. Neural Networks 19(2), 178–185 (2006)
18. Badran, F., Berrada, M., Brajard, J., Crépon, M., Sorror, C., Thiria, S., Hermand, J.P., Meyer, M., Perichon, L., Asch, M.: Inversion of Satellite Ocean Colour Imagery and Geoacoustic Characterization of Seabed Properties: Variational Data Inversion Using a Semi-automatic Adjoint Approach. J. of Marine Systems 69, 126–136 (2008)
19. Hermand, J.P., Meyer, M., Asch, M., Berrada, M.: Adjoint-based Acoustic Inversion for the Physical Characterization of a Shallow Water Environment. J. Acoust. Soc. Am. 119(6), 3860–3871 (2006)
20. Kane, A., Thiria, S., Moulin, C.: Développement d'une Méthode d'Assimilation de Données in Situ dans une Version 1D du Modèle de Biogéochimie Marine PISCES (in French). Master's thesis, LSCE/IPSL, CEA-CNRS-UVSQ laboratories, France (2006)
21. YAO: Home Page, http://www.locean-ipsl.upmc.fr/~yao/
CNP: A Protocol for Reducing Maintenance Cost of Structured P2P

Yu Zhang, Yuanda Cao, and Baodong Cheng

Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing 100081, PRC
[email protected], [email protected], [email protected]
Abstract. Being highly dynamic, structured P2P systems incur very high maintenance costs. In this paper we propose a Clone Node Protocol (CNP) to reduce the maintenance cost of structured P2P systems through a mechanism of clone nodes. In order to verify the efficiency of CNP, we implement a Clone Node Chord structure based on CNP, i.e., CNChord. Furthermore, we implement a bidirectional CNChord (BCNChord) in order to reduce the query time of CNChord. Theoretical analysis and experimental results show that CNChord can greatly reduce the cost of maintaining the P2P structure, and that BCNChord can effectively improve the query speed. In a word, CNP can effectively reduce the maintenance cost of structured P2P systems. Keywords: maintenance cost, structured P2P, CNChord, BCNChord.
1 Introduction
Although structured P2P systems have many advantages, such as scale-free topology, resilience, self-organization and an equalized distribution of the ID space, nodes in such systems need considerable bandwidth and incur high cost to maintain the structure. There is little research on the maintenance cost of the structured P2P system itself; most research is about improving query efficiency by changing the structure of the P2P overlay network, such as the various transformations of Chord [1]. Karger [2] proposed a version of the Chord peer-to-peer protocol that allows any subset of nodes in the network to jointly offer a service without forming their own Chord ring. Montresor [3] introduced T-Chord, which can build a Chord network efficiently starting from a random unstructured overlay; after jump-starting, the structured overlay can be handed over to the Chord protocol for further maintenance. Zols [4] presented the hybrid chord protocol (HCP), which groups shared objects into interest groups, and used it in mobile scenarios. Joung [5] proposed a Chord2 structure, observing that nodes in P2P systems are not equivalent: some nodes, known as "super peers", are more powerful and stable than the others, and can serve as index servers to speed up queries. Kaiping [6] designed a FS-Chord structure. EpiChord [7] is a DHT lookup algorithm demonstrating that the O(log n)-state-per-node restriction on existing DHT topologies can be removed to achieve significantly better
lookup performance. Approaches in [8–10] adopted physical topologies to reduce maintenance cost. In fact, the maintenance cost of the structure is very large, especially in a highly dynamic P2P network, and the maintenance may affect the usability of the system. This article focuses on reducing the maintenance cost of the structured P2P system itself and proposes a new Clone Node Protocol, CNP for short. Chord is one of the most cited structured P2P systems; in this paper we take Chord as an example and realize a Clone Node Chord based on CNP, CNChord for short. Moreover, the route table of CNChord is improved to support bidirectional searching, and the CNChord with this improvement is called BCNChord.
2 Clone Node Protocol

2.1 Protocol Description
The basic idea of CNP is to clone every node in the structured P2P system. The clone node takes over part of the role of the original node after the original node leaves the system. When the original node joins the system again, the changes are synchronized from the clone node back to the original node. We first take Chord as an example and realize a Clone Node Chord based on CNP, CNChord for short.

2.2 Structure of CNChord
Definition 1 (Main Node). Every node in the system is assigned a basic role, called the Main Node, noted Node, e.g., Node_x.

Definition 2 (Successor). Node_x's successor node, say Node_s, is the first node behind Node_x along the clockwise direction in the Chord ring, expressed as successor(Node_x) = Node_s.

For a network with n nodes, every node is first composed into a Chord ring. In clockwise sequence, the nodes are denoted Node_1, Node_2, ..., Node_n. For any Node_x, 1 ≤ x ≤ n, if successor(Node_x) = Node_s, then Node_s is cloned to Node_x, so that Node_x keeps the information of both Node_x and Node_s. After this process, the CNChord is established.

Definition 3 (Predecessor). A predecessor node of Node_x, called Node_p, is the first node along the anti-clockwise direction in the Chord ring, presented as predecessor(Node_x) = Node_p.

Definition 4 (Clone Node). In the process above, after Node_s is cloned to Node_x, the mirror of Node_s is called a Clone Node, denoted CNode; e.g., CNode_s means the Clone Node of Node_s.

Figure 1 shows a typical Chord structure with 4 nodes and the corresponding CNChord structure. Main nodes are drawn in black ('N' abbreviates "Node"), while clone nodes are drawn in gray ('C' abbreviates "Clone").
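The construction can be summarized in a few lines. The sketch below is our own illustration with a hypothetical representation: given the ring in clockwise order, it computes which node hosts each clone.

def build_cnchord(ring):
    # ring: node IDs in clockwise order. Per Definitions 2 and 4, every node
    # Node_x stores a clone of its successor Node_s = successor(Node_x).
    n = len(ring)
    return {ring[i]: ring[(i + 1) % n] for i in range(n)}  # host -> cloned node

For the 4-node ring of Fig. 1, build_cnchord(["N1", "N2", "N3", "N4"]) maps N1 to a clone of N2, N2 to a clone of N3, and so on around the ring.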
Fig. 1. (a) Chord with 4 nodes. (b) CNChord corresponding to (a).
2.3 Initialization Process
CNChord adopts a ring structure similar to Chord's. When setting up CNChord, the nodes first compose a Chord ring according to [1]; then every node performs the clone node process (see Algorithm 1) to establish the CNChord overlay network. If many nodes in the Chord ring perform the clone process simultaneously, a large amount of replication traffic is generated; this may sharply increase the message flow in a short period and even block the normal functioning of the overlay network. To avoid this problem, we propose a Pull Model Based Clone algorithm to complete the clone process.

Definition 5 (Basic Unit of Time). The period of one hop in the overlay network, measured in milliseconds. We denote the Basic Unit of Time by T; T is determined by the status of the Chord ring on which CNChord is based.

Definition 6 (Current Idle Time). The Current Idle Time, denoted Tnoact, is the period from the end of a node's previous action to the present.

Definition 7 (Idle Trigger Threshold). The Idle Trigger Threshold, denoted Tts, is the maximum period from the end of the previous action to the occurrence of the next action.

When Node_x is idle, we use the Pull Model Based Clone algorithm to complete the cloning of Node_s; the procedure Clone(Node_s, Node_x) clones the parameter Node_s to Node_x. The following algorithm shows the detailed process.

Algorithm 1. Pull Model Based Clone Algorithm
Pull_Clone (Node_s) { // passive clone algorithm on Node_x
  if (Tnoact [...]

[Source gap: the rest of this paper is missing; the text resumes mid-definition in "Mining Spread Patterns of Spatially Prevalent Co-occurrence Zones" (F. Qian, Q. He, and J. He).]

...> θ within a zone z and its buffer (z ∪ buffer(z)) for a maximal closed set of continuing time slots T = [τ_i, τ_j] (τ_i ≤ τ_j), where θ is the spatial prevalence threshold. A spread element E^{z_c}_{t[τ_{ci}, τ_{cj}]} can be child-linked to another spread element E^{z_p}_{t[τ_{pi}, τ_{pj}]} if and only if z_c ∈ R_z(z_p) and τ_{pi} ≤ τ_{ci} ≤ τ_{pj}. The implementation of the linkage is denoted E^{z_c}_{t[τ_{ci}, τ_{cj}]} →_{Lc} E^{z_p}_{t[τ_{pi}, τ_{pj}]}.

As figure 2 shows, each grid with a time interval marked can be understood as the corresponding co-occurrence's prevalent persistence on that zone for the marked time slots, organized as a spread element in Definition 2. For example, E^{z(4,0)}_{t[0,1]} and E^{z(4,1)}_{t[1,1]} are two spread elements of {A,B}, and the linkage can be carried out as E^{z(4,1)}_{t[1,1]} →_{Lc} E^{z(4,0)}_{t[0,1]}. Similarly, E^{z(3,2)}_{t[2,2]} and E^{z(2,3)}_{t[3,5]} are two spread elements of {A,C}, and the linkage can be carried out as E^{z(2,3)}_{t[3,5]} →_{Lc} E^{z(3,2)}_{t[2,2]}. The child linkage potentially describes the spreading of co-occurrence patterns between the zones over time.

Definition 3. The Spread Pattern Tree (SP-Tree) of a co-occurrence C is a tree in which each node (except the root) corresponds to a spread element of C. The parental relationship of the nodes follows the linkage Lc. The process of building a spread pattern tree is as follows. Initially, the tree is empty with only a dummy root. Then, any inserted node is appended to its parents if it can be Lc-linked to them; if the node cannot find any parent, it is appended to the root. If any existing node in the tree can also be Lc-linked to the inserted node, it is appended to it. The SP-Tree(C) is an expression of SPCOZ(C) over STF. The size of the SP-Tree(C) is defined as the size of C. Figure 3 gives the SP-Tree({A,B}) and SP-Tree({A,C}) according to the example in figure 2. Apparently, the children of the root indicate the starting points of the spreading. In a more complex situation, a node may have more than one parent due to spreading from more than one neighbor zone. This does not
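To make the insertion rule of Definition 3 concrete, here is a compact sketch — our own, with hypothetical representations; rz maps a zone to its neighbor set R_z, and multiple parents are allowed, so the "tree" may become an acyclic graph as noted below:

class SPNode:
    # One spread element: a zone and a closed time-slot interval.
    def __init__(self, zone, t_start, t_end):
        self.zone, self.t, self.children = zone, (t_start, t_end), []

def can_link(child, parent, rz):
    # Child linkage Lc of Definition 2: neighboring zones, and the child's
    # start slot falls inside the parent's interval.
    return (child.zone in rz[parent.zone]
            and parent.t[0] <= child.t[0] <= parent.t[1])

def insert(root, node, rz):
    nodes = list(iter_nodes(root))
    parents = [p for p in nodes if can_link(node, p, rz)]
    for p in (parents or [root]):            # no parent -> under the dummy root
        p.children.append(node)
    for c in nodes:                          # existing nodes linkable to node
        if can_link(c, node, rz):
            node.children.append(c)
            if c in root.children:           # c is no longer a starting point
                root.children.remove(c)

def iter_nodes(root):
    stack, seen = list(root.children), set()
    while stack:
        n = stack.pop()
        if id(n) not in seen:
            seen.add(id(n))
            yield n
            stack.extend(n.children)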
[Fig. 3 panels: SP-Tree({A,B}) and SP-Tree({A,C}) with their zone/time-slot node labels; the overlap part of SP-Tree({A,B}) and SP-Tree({A,C}); a pair of corresponding overlap nodes given as an example; the node generated from that pair; and the possible SP-Tree({A,B,C}) candidate generated from the overlap of the two trees.]
effect our expression of spread structures even if a SP-Tree may not be a pure tree (probably be a acyclic graph). Based on its definition, we state the structures of the SP-Trees are monotonic with their size. Before that, we give the definition of affiliation for spread pattern trees first. Definition 4. A SP-Tree(C ) Belongs To a SP-Tree(C’ ) which is denoted as SP-Tree(C ) ⊆sp SP-Tree(C’ ), if for each subtree of SP-Tree(C ), there exist a subtree of SP-Tree(C’ ) where there’s an overlap of its full structure (including nodes and links) according to merely the tree nodes’ zone label and for each pair z z of corresponding overlap nodes, denoted as (Et[τp ci ,τcj ] , Et[τq ,τ ] ), they satisfy c i c j that zp = zq , and τc i ≤ τci ≤ τcj ≤ τc j . Lemma 1. Given a co-occurrence Ck , SP-Tree(Ck ) ⊆sp SP-Tree(Ck−1 ) where Ck−1 ⊂ Ck and k is the size of co-occurrences. zm Proof. For each node of SP-Tree(Ck ) with Et[τ , Ck is prevalent on zm and i ,τj ] its buffer from time slot τi to time slot τj . Because of the monotonic property of Pi [1], Ck−1 is also prevalent on zm and its buffer from τi to τj for every Ck−1 ⊂ Ck . Hence, we can simply generate the incomplete SP-Tree(Ck−1 ) by keep a copy of the original SP-Tree(Ck ). And the actual SP-Tree(Ck−1 ) may have some extends with the time intervals for some tree nodes, also with some additional tree nodes appended. Since in the process of building a spread pattern tree, the inserted nodes do not split any existing linkage, and the order of the inserting process dose not effect the final tree, the overlap must can be found for every subtree of SP-Tree(Ck ) in SP-Tree(Ck−1 )’s subtrees. Hence, SP-Tree(Ck ) belongs to any SP-Tree(Ck−1 ) for every Ck−1 ⊂ Ck .
In figure 3, we pick up the overlap of SP-Tree({A,B}) and SP-Tree({A,C}) (the grey part) to construct a possible SP-Tree({A,B,C}). For each pair of corresponding overlap nodes, such as E^{z(2,3)}_{t[3,4]} with {A,B} and E^{z(2,3)}_{t[3,5]} with {A,C}, we generate the node E^{z(2,3)}_{t[3,4]} for {A,B,C}, which captures the minimum intersection of the time slots. Obviously, the generated possible SP-Tree({A,B,C}) belongs to both SP-Tree({A,B}) and SP-Tree({A,C}) according to Definition 4, and it is a
superset of the actual SP-Tree({A,B,C}) due to the monotonic property presented in Lemma 1. Thus, the possible SP-Tree({A,B,C}) can be treated as a candidate for SPCOZs when it also belongs to SP-Tree({B,C}), which we do not illustrate in the example to save space. Figure 2 also illustrates the possible spread structure of {A,B,C} with respect to the generated possible SP-Tree({A,B,C}). The SPCOZ-Miner we propose in the next section applies a similar strategy to prune the search space, benefiting from the monotonic property of the spread pattern trees presented in Lemma 1.
3 Mining SPCOZs
In this section, we discuss the process of mining SPCOZs. Two approaches are proposed: a straightforward approach and the SPCOZ-Miner. We first give the pseudocode, shown as Algorithm 1, for mining size 2 SPCOZs.
Algorithm 1: Mining size 2 SPCOZs
Inputs: STF with ST = {z_1, ..., z_n} and TF = {τ_1, ..., τ_m}; the neighbor relation between instances of features, R; the neighbor relation between zones, R_z; a set of features F = {f_1, ..., f_l} with their instances embedded in STF; the spatial prevalence threshold, θ.
Output: A set of size 2 SPCOZs represented as SP-Trees.
Variables: size 2 candidate co-occurrences, C; the value of Pi for C in z_i at τ_j, pi; spread pattern trees, SP-Trees(C); a set of time slots, T.
1  C = gen_candidate_co-occ(F);
2  for each zone z ∈ ST do
3    for each time slot τ ∈ TF in sorted order do
4      pi = apply_join_approach(C, z ∪ buffer(z), τ, R);
5      if pi > θ then
6        T = T ∪ {τ}; continue to line 3;
7      else
8        insert_node(SP-Trees(C), z, T, R_z); T = ∅;
9  return the SP-Trees(C) whose depth ≥ 1;
(Line 1) The candidate generation approach is borrowed from apriori-gen [12], which generates the size 2 co-occurrences from the size 1 co-occurrences; here the size 1 prevalent co-occurrences are initialized to F. (Lines 2–4) The algorithm then finds the spread elements of the candidates by searching each zone and each time slot using the join approach [1]. (Lines 5–6) If the value of the prevalence index is greater than the prevalence threshold θ, the time slot is recorded. (Lines 7–8) When a maximal closed set of continuing time slots is formed, the spread element is generated and inserted into the spread pattern tree, following the strategy presented in Definition 3. (Line 9) Finally, the algorithm returns the spread pattern trees whose roots have at least one child.
A straightforward approach to finding size k SPCOZs can simply borrow the process of mining size 2 SPCOZs presented above. In each iteration, the size k candidate co-occurrences are generated from the size k−1 SPCOZs, and the method terminates when no candidate is generated. The process in each iteration of mining size k SPCOZs is similar to Algorithm 1; we skip the presentation of the straightforward approach to save space. In contrast, a novel approach is proposed below to find the size k (k ≥ 3) SPCOZs based on the monotonic property of the spread pattern trees.

Algorithm 2: Mining size k SPCOZs (k ≥ 3)
Statement: All the parameters are inherited from Algorithm 1.
1   k = 3;
2   while SP-Tree(C_{k−1}) is not empty do
3     C_k = gen_candidate_co-occ(SP-Tree(C_{k−1}));
4     SP-Tree_p(C_k) = gen_possible_tree(C_k);
5     for each node E^z_{t[T]} in SP-Tree_p(C_k) do
6       T = validate_node_with_subset_trees(C_k, E^z_{t[T]});
7       if T ≠ ∅ then
8         for each time slot τ ∈ T in sorted order do
9           pi = apply_join_approach(C_k, z ∪ buffer(z), τ, R);
10          if pi [...]

[Source gap: the remaining lines of Algorithm 2 (through line 15) and the following pages are missing; the text resumes in "SVM Based Decision Analysis and Its Granular-Based Solving" (T. Yang et al.), section 2.]
Statement: All the parameters are inherited from algorithm 1. k = 3; while SP-Tree(Ck−1 ) not empty do Ck =gen candidate co-occ(SP-Tree(Ck−1 )); SP-Tree p (Ck )=gen possible tree(Ck ); z for each node Et[T ] in SP-Treep (Ck ) do z T =validate node with subset trees(Ck ,Et[T ] ); if T = φ then for each time slot τ ∈ T by sorted order do pi=apply join approach(Ck , z ∪ buf f er(z), τ, R); if pi 0 σ k>0,c>0
2.2 Soft Margin of SVM

In real applications, since there is noise in the input data, it is often too hard to find a hyperplane that perfectly separates the samples. To solve this problem, a 'soft margin' method was presented in [12]. It adds a slack variable $e_i$ to each constraint to implement a soft classification. Thus a constraint can be rewritten as:

$c_i(\omega \cdot x_i - b) + e_i \ge 1$
A 'penalty function' $f(E)$ (where $E = \{e_i\}$) can be added to the objective function, which then becomes

$\min\left(\tfrac{1}{2}\omega^T\omega + C \cdot f(E)\right)$

This constrained programming problem can be solved using Lagrange multipliers. The parameter C is called the 'penalty factor', because it determines the SVM's sensitivity to classification errors. A widely used penalty function gives

$\min\left(\tfrac{1}{2}\omega^T\omega + C \sum_{i=1}^{n} e_i^2\right)$

This is called the Least Squares Support Vector Machine (LSSVM) [13]. It has been proved that the time complexity of solving this type of SVM equals that of solving a system of equations of the same size, so it is more efficient.
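The paper does not spell out that system of equations; for concreteness, the following Python sketch solves the standard LSSVM dual system of the least-squares formulation behind [13]. It is a generic illustration, not the authors' implementation:

```python
import numpy as np

def lssvm_train(X, y, C, kernel):
    """Train an LSSVM classifier by solving a single (n+1)x(n+1) linear
    system (the dual KKT conditions of the least-squares formulation)."""
    n = len(y)
    K = np.array([[kernel(xi, xj) for xj in X] for xi in X])
    Omega = np.outer(y, y) * K                 # Omega_ij = y_i * y_j * K(x_i, x_j)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y                               # first row/column encode sum(alpha_i * y_i) = 0
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / C          # I/C comes from the squared slack penalty
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                     # bias b and multipliers alpha

def lssvm_predict(X, y, alpha, b, kernel, x_new):
    s = sum(a * yi * kernel(xi, x_new) for a, yi, xi in zip(alpha, y, X))
    return np.sign(s + b)
```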
3 Decision Oriented LSSVM and Decision Analysis

3.1 Decision Oriented LSSVM

As mentioned before, a decision analysis problem is to predict the label of a decision candidate. It is easy to see that there are generally two error indicators for a decision prediction:
1. The rate at which an acceptable decision is marked as 'unacceptable';
2. The rate at which an unacceptable decision is marked as 'acceptable'.
Normal SVMs have no preference for either class. In a two-class case, this means that, from a statistical point of view, the rates of the two types of error are approximately the same. However, in a decision-analysis context, the situation is different. Both the result 'this decision is absolutely acceptable' and the result 'this decision is absolutely unacceptable' are more meaningful than 'this decision may be unacceptable'. Therefore, the classifier may tolerate some loss when it marks samples with one class, as long as it makes no mistake when it marks a sample with the other class. We name this type of SVM, which has a 'preference' between the two classes, 'Decision Oriented SVM' (DO-SVM). In the following sections, we call the SVMs with these two preferences the '1-Decision Oriented LSSVM' (1-DO-LSSVM) and the '2-Decision Oriented LSSVM' (2-DO-LSSVM). In a normal SVM with a soft margin, the constraints are

$c_i(\omega \cdot x_i - b) + e_i \ge 1,$

thus

$\omega \cdot x_i - b + e_i \ge 1$
$\omega \cdot x_i - b - e_i \le -1$

To meet the requirement of a DO-LSSVM, the constraints can be rewritten as

$\omega \cdot x_i - b \ge 1$
$\omega \cdot x_i - b - e_i \le -1$
Fig. 3. Normal LSSVM classifier
Fig. 4. 1-DO-LSSVM classifier
The objective function can also be chosen as

$\min\left(\tfrac{1}{2}\omega^T\omega + C \sum_{i=1}^{n} e_i^2\right)$

Then the DO-LSSVM can be written as

$\min\left(\tfrac{1}{2}\omega^T\omega + C \sum_{i=1}^{n} e_i^2\right) \quad \text{s.t.} \quad C(V, \omega, E) \ge 0$
In this constraint model, a result that does not satisfy $\omega \cdot x_i - b \le -1$ may be accepted, but a result that does not satisfy $\omega \cdot x_i - b \ge 1$ will absolutely not be accepted. We can construct the 1-DO-LSSVM and 2-DO-LSSVM according to this principle. Fig. 3 - Fig. 5 illustrate the difference between a DO-LSSVM and a normal LSSVM. Samples with wrong class labels are marked in the figures. We can see from these figures that a normal LSSVM has two types of wrong labels, but a DO-LSSVM only has one.

3.2 DO-LSSVM Based Decision Analysis

Since the DO-LSSVM has a preference between the classes, we can use the DO-LSSVM to predict whether a decision will be acceptable, as follows:
1. Construct the 1-DO-LSSVM, the 2-DO-LSSVM and the LSSVM, and train them separately.
2. For a decision X in condition C, predict its class label using the DO-LSSVMs. If the result of the 1-DO-LSSVM is 'acceptable' or the result of the 2-DO-LSSVM is 'unacceptable', return this result directly. Otherwise, re-predict the class label using the LSSVM and output the result.

Figures 6-8 illustrate these conditions for the decision whose label needs to be predicted.
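A minimal Python sketch of this cascade is given below; `predict_1do`, `predict_2do` and `predict_lssvm` stand for the three trained classifiers and are assumed names, not from the paper:

```python
def do_lssvm_decide(x, predict_1do, predict_2do, predict_lssvm):
    """Cascade prediction with the two biased classifiers, falling back
    to the normal LSSVM when neither gives a one-sided answer."""
    if predict_1do(x) == "acceptable":      # 1-DO-LSSVM never mislabels this side
        return "acceptable"
    if predict_2do(x) == "unacceptable":    # 2-DO-LSSVM never mislabels this side
        return "unacceptable"
    return predict_lssvm(x)                 # otherwise use the unbiased classifier
```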
Fig. 5. 2-DO-LSSVM classifier
Fig. 6. Prediction with 1-DO-LSSVM
Fig. 7. Prediction with 2-DO-LSSVM
Fig. 8. Prediction with DO-LSSVM
4 Granular-Based LSSVM Solving

One shortcoming of all SVMs is that the computational complexity rises rapidly with the size of the sample set. Let n be the number of samples; an LSSVM has to solve a system of equations of size n, so if n is very large, the performance gets worse. However, in industry, the process usually stays stable. Thus, in a short time slice there are only small changes in both system input and output, and the time during which the process is in a 'stable' status is much longer than that of the other statuses. We can therefore presume that most samples lie in some small areas that represent the 'stable' statuses. Since an SVM can achieve a good result even without a large number of samples, especially when the samples are evenly distributed [4], we use a 'central sample' to represent its neighbors to reduce the running time. In the next section we discuss this approach.

4.1 Granular-Based Constraint Solving

As mentioned before, solving an LSSVM is a constraint solving problem. In this section we introduce a granular-based approach to constraint solving. A constraint solving problem is usually presented as follows:
$\min(H(V)) \quad \text{s.t.} \quad M(V) \ge 0$ (Problem 1)
where V is the set of variables, $M(V) \ge 0$ represents the constraints and $H(V)$ is the optimization function. Let $V_d = \{v_i \mid \frac{\partial H}{\partial v_i} \equiv 0,\ v_i \in V\}$, $V_f = V - V_d$, and $E_f = \{e_{ij} \mid v_i \in V_f \text{ or } v_j \in V_f\}$; then

$H(V) = \sum_{v_i \in V_f} \int \frac{\partial H}{\partial v_i}\,dv_i + \ldots + \sum_{v_j \in V_d} \int \frac{\partial H}{\partial v_j}\,dv_j + C = \sum_{v_i \in V_f} \int \frac{\partial H}{\partial v_i}\,dv_i + C,$

where C is a constant. Let $F(V_f) = \sum_{v_i \in V_f} \int \frac{\partial H}{\partial v_i}\,dv_i + C$. We can use $\min(F(V_f))$ as the optimization objective instead of $\min(H(V))$. Thus Problem 1 can be rewritten as:
$\min(F(V_f)) \quad \text{s.t.} \quad M(V) \ge 0$ (Problem 2)
Let R be an equivalence relation defined on V. Then V can be partitioned into classes by R. We represent each equivalence class $C_k$ with a new variable $v^q_{dk}$. A function $g_k$ is defined to describe this mapping, hence $v^q_{dk} = g_k(C_k)$. In a real application, $g_k$ will be defined according to domain knowledge and may not be exact. Let $V^q = V^q_d \cup V_f$ and $G = \{g_k\}$; then G maps V to a granulated space $V^q$. If the size of $V_d$ is much bigger than that of $V_f$, the granulated problem can be much smaller. Choosing $F(V_f)$ as the optimization objective, the granulated model can be represented as Problem 3:

$\min(F(V_f)) \quad \text{s.t.} \quad M^q(V^q) \ge 0$ (Problem 3)
Let N be a solution of Problem 2, where $N = N_f \cup N_d$, and let the corresponding objective value be $obj = F(N_f)$; similarly, let $N^q = N^q_f \cup N^q_d$ be a solution of Problem 3, with corresponding objective value $obj^q = F(N^q_f)$. We have the following lemma:
Lemma 1. If $M(V) \ge 0 \Rightarrow M^q(V^q) \ge 0$, then $obj^q \le obj$, while the sufficient condition of equality is: $M(V) \ge 0 \Leftrightarrow M^q(V^q) \ge 0$.
Proof: Let $N^{q0}_d = \{g_k(C_k)\}$. If $M(V) \ge 0 \Rightarrow M^q(V^q) \ge 0$, then $M^q(N_f \cup N^{q0}_d) \ge 0$. Since $N^q$ is a solution of Problem 3 and $N^q = N^q_f \cup N^q_d$, $F(N^q_f)$ must be the minimal objective value, hence $obj^q = F(N^q_f) \le F(N_f) = obj$. Similarly, if $M^q(V^q) \ge 0 \Rightarrow M(V) \ge 0$, then $obj \le obj^q$. Hence

$(M(V) \ge 0 \Leftrightarrow M^q(V^q) \ge 0) \Rightarrow obj^q = obj.$
Furthermore, let $V^+_d$ be the upper bound of $V^q_d$ and $V^-_d$ be the lower bound of $V^q_d$, and let $N = N_f \cup N_d$ be a solution of the following problem:

$\min(F^q(V_f)) \quad \text{s.t.} \quad M^q(V^q) \ge 0,\ V^-_d \le V_d \le V^+_d$ (Problem 4)

Let $obj^* = F^q(N_f)$.

Lemma 2. A sufficient condition of $obj^* \le obj$ is $M(V) \ge 0 \Rightarrow M^q(V_f \cup V^q_d) \ge 0,\ V^q_d \in [V^-_q, V^+_q]$ (2-1), where the sufficient condition of equality is that $[V^-_q, V^+_q]$ is the exact bound of $V^q_d$, and $M(V) \ge 0 \Leftrightarrow M^q(V_f, V^q_d) \ge 0,\ V^q_d \in [V^-_q, V^+_q]$. The proof of Lemma 2 is similar to Lemma 1's.
According to Lemma 2, if $obj^* \ge E$ then $obj \ge E$, where E is a constant.

4.2 Granular-Based LSSVM Solving

According to the lemmas in the section above, the time complexity of solving an LSSVM can be reduced by defining an appropriate mapping and transferring the original sample space into a granulated space of much smaller size. In an industrial context, the status of a process usually stays stationary over a short time slice [1]. Therefore we can use a granular-based approach to reduce the scale of the sample set. In this approach we first split the sample set into cells, then use a 'central sample' to represent each cell, and finally solve the LSSVM in the granulated sample space. Assuming all used properties of all samples have been normalized, the steps are as follows (a code sketch is given after the list):

1. Split the original sample space into cells. Each cell has a size of $\varepsilon^n$, where n is the dimension of the original sample space.
2. If all samples in a cell u and its near neighbors (those within a distance of less than 2ε) have the same label, then use one sample in u to represent all samples in u. This sample is named the 'central sample' of u.
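Under the stated assumptions (normalized samples, cell edge ε), a possible Python sketch of this granulation step is shown below. The helper names and the choice of the cell mean as the 'central sample' are our own, since the paper does not fix how the representative is picked:

```python
import numpy as np
from itertools import product
from collections import defaultdict

MIXED = object()  # sentinel for cells containing more than one label

def granulate(X, y, eps):
    """Grid the (normalized) sample space with cell edge eps; if a cell and
    all of its adjacent cells carry one single label, replace the cell's
    samples by one 'central sample'; otherwise keep the samples as they are."""
    cells = defaultdict(list)
    for i, x in enumerate(X):
        cells[tuple((x // eps).astype(int))].append(i)

    def cell_label(key):
        labels = {y[i] for i in cells.get(key, ())}
        if not labels:
            return None                        # empty cell
        return labels.pop() if len(labels) == 1 else MIXED

    dim = X.shape[1]
    Xg, yg = [], []
    for key, idx in cells.items():
        lab = cell_label(key)
        pure = lab is not None and lab is not MIXED
        if pure:
            # near neighbors (< 2*eps away) live in the adjacent cells
            for off in product((-1, 0, 1), repeat=dim):
                nl = cell_label(tuple(k + o for k, o in zip(key, off)))
                if nl is not None and nl != lab:
                    pure = False
                    break
        if pure:
            Xg.append(X[idx].mean(axis=0))     # the 'central sample' of the cell
            yg.append(lab)
        else:
            Xg.extend(X[idx])
            yg.extend(y[i] for i in idx)
    return np.array(Xg), np.array(yg)
```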
In contrast to grid-based clustering algorithms, our approach only granulates samples that are far from the classification line (at a distance of at least ε). Samples near the classification line are preserved. Let V be the original sample set and $V^q$ be the granulated sample set. Since the samples in $V^q$ are also samples in the original sample space, we must have:
$C(V, \omega, E) \ge 0 \Rightarrow C(V^q, \omega^q, E^q) \ge 0$

According to Lemma 2,

$obj^q = \tfrac{1}{2}(\omega^q)^T\omega^q + C^q \sum_{i=1}^{n} (e^q_i)^2 \le obj = \tfrac{1}{2}\omega^T\omega + C \sum_{i=1}^{n} e_i^2.$

If appropriate $C^q$ and $C$ are chosen, it will hold that $\tfrac{1}{2}(\omega^q)^T\omega^q \le \tfrac{1}{2}\omega^T\omega$. Thus

$\frac{2}{\|\omega^q\|^2} \ge \frac{2}{\|\omega\|^2}.$
Therefore the class margin in the granulated sample space is an upper bound of the margin in the original sample space. For a DO-LSSVM, this means its preference is enhanced.

4.3 Choice of Kernel Function and Parameters

To use an SVM, we first have to choose a kernel function, and then we should select appropriate values for the parameters. It is widely agreed that the kernels listed in Table 1 differ little and that their performance depends on the application. In our DO-SVM we choose the Gaussian Radial Basis Function, which is suggested in [6], as the kernel. According to the analysis above, the kernel selection will not influence the preference of the DO-SVM or its granular-based solving. Another advantage of
choosing the Gaussian Radial Basis Function is that the parameters can be estimated more easily. Let $v_2$ and $v_1$ be the images in the high-dimensional space of the original samples $x_2$ and $x_1$; the distance between $v_2$ and $v_1$ is

$\|v_2 - v_1\| = \sqrt{v_2 \cdot v_2 - 2 v_2 \cdot v_1 + v_1 \cdot v_1} = \sqrt{\exp\left(-\frac{\|x_2 - x_2\|}{2\sigma^2}\right) - 2\exp\left(-\frac{\|x_2 - x_1\|}{2\sigma^2}\right) + \exp\left(-\frac{\|x_1 - x_1\|}{2\sigma^2}\right)} = \sqrt{2 - 2\exp\left(-\frac{\|x_2 - x_1\|}{2\sigma^2}\right)}$
That means $\|v_2 - v_1\|$ can be calculated from the distance between $x_2$ and $x_1$ in the original sample space. In the granulated sample space, the distance between a 'granulated sample' (an original sample represented by its 'central sample') and its 'central sample' is at most ε. Considering the geometric meaning of the SVM [16, 17, 18], the difference between the cross-class distance in the granular-based solution and that in the original solution is at most

$\sqrt{2 - 2\exp\left(-\frac{\|x_2 - x_1\|}{2\sigma^2}\right)} = \sqrt{2 - 2\exp\left(-\frac{\varepsilon}{2\sigma^2}\right)}$
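The derivation says the feature-space distance is a function of the input-space distance alone. Below is a small sketch computing it directly from the paper's kernel (note the un-squared norm in the exponent, as written above); the function names are illustrative:

```python
import numpy as np

def rbf_kernel(x1, x2, sigma):
    # The paper's Gaussian RBF uses the (un-squared) norm in the exponent.
    return np.exp(-np.linalg.norm(x1 - x2) / (2.0 * sigma ** 2))

def feature_space_distance(x1, x2, sigma):
    """||v2 - v1|| = sqrt(K(x1,x1) - 2 K(x1,x2) + K(x2,x2)) = sqrt(2 - 2 K(x1,x2))."""
    return np.sqrt(2.0 - 2.0 * rbf_kernel(x1, x2, sigma))
```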
In the Gaussian Radial Basis Function

$K(x, x') = \exp\left(-\frac{\|x - x'\|}{2\sigma^2}\right),$

σ has an important influence on the accuracy of the classifier. In general, the greater σ is, the larger the margin will be. In the granular sample space of the last section, the average distance between samples is ε. Therefore the initial value of σ can be set to ε, and the exact value can be found using the method presented in [15]. More accurate methods include cross-validation [19, 20, 21]. Cross-validation has been proved to work well in machine learning: it splits the sample set into k groups randomly, then uses some of them for training and the others to validate the result. In a granular-based model we can construct the groups by splitting the samples in a cell into different groups randomly and merging groups across different cells (one group per cell). In an LSSVM, the parameter C is usually called the 'penalty factor'. It represents the classifier's sensitivity to classification errors. A greater C will cause the classifier to choose a smaller class margin to reach a lower error rate. In a DO-LSSVM, the preference is independent of this parameter. According to the discussion in the last section, we choose C to be the average number of samples in each granule, so that the factor $C \sum_{i=1}^{n} e_i^2$ in the granulated sample space is approximately equal to that in the original sample space.
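A trivial restatement of these two heuristics as code (illustrative names; σ is initialized to ε and C to the average granule size, as just described):

```python
def initial_parameters(eps, n_samples, n_granules):
    """Initial kernel width and penalty factor for the granulated LSSVM:
    sigma starts at eps (the average inter-sample distance after granulation)
    and C at the average number of samples per granule."""
    return eps, n_samples / n_granules
```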
5 Performance Analysis

In an SVM-based approach, most of the running time is spent on solving the SVM. In this section we discuss the time complexity of the granular-based SVM solving
approach. Assume that the average time complexity of the original constraint solving problem and that of computing the bound of a granule are each related only to their own size. Let n be the total number of variables in a soft sensing problem, m be the average size of a granule and k be the number of granules; then n = k·m. Additionally, let s(n) be the complexity function of solving the original SVM and h(m) be the average complexity function of computing the bound of a granular variable. The total time complexity of the granular problem is $O_q = s(k) + k \cdot h(m)$, while the time complexity of the original problem is $O_r = s(n) = s(k \cdot m)$. Therefore, if h(n)

k, the algorithm terminates. The results are: the optimum path first goes to hospital $P_2$, then heads to checkup clinic $P_3$ and GP office $P_8$, and finally arrives at pharmacy $P_{12}$ for the medicine.

5.3 Optimum Path Multiple-Object-Type Nearest Neighbors (PM NN)
The optimum path for a multiple-object-types query is similar to the 2nd query except that the object types can be visited in any sequence. In this query, the length of the whole path is the assessment criterion. As multiple 1 NN searches cannot guarantee that the final path is the shortest one, this approach differs from the IM NN approach in the last section. More details can be given based on Example 3 in Section 4. In Example 3, the sequence of interest points is unimportant because posting a letter, depositing a cheque and so on are independent tasks, and it does not matter which task the user does first. In addition, the objective of this query is to make the whole path short, not to find any particular nearest object. There may be an instance where, after choosing the nearest post office, the path to the other places becomes longer; choosing the second or even third nearest post office may be better. Moreover, how to arrange the sequence of interest points is another issue that needs to be solved. The following steps illustrate the process of the approach. Firstly, generate the NVD considering all objects as polygon generators. Then invoke the "contain" function to get the nearest generator P. Check its type and record P as the first object type that the user will visit. For all of P's adjacent neighbors, sort them in ascending order of their distance to P. Check their types one by one; if the path has not yet visited that object type, record it as the next P. From this P, repeat the same operation as for the first P until all types have been found and the path is complete. The above operation cannot guarantee that this path is the shortest, but it does set a boundary for the
query ($d_{max}$), which means that once an expansion exceeds this boundary, it should be terminated. Secondly, every object whose distance to q is smaller than $d_{max}$ can be treated as a potential first interest point. Sort them in a queue by their distance to q. Thirdly, for each interest point in the queue, pop it out, find its closest neighbor whose type has not been covered, and then from that neighbor do the same until all types of interest points have been covered. If, in the process of the expansion, the distance exceeds the boundary, terminate it directly. If the path is complete, compare its path length with the boundary and update the boundary if the length is smaller. Terminate the algorithm when no interest point remains in the queue. The optimum path shows how the user can pass multiple object types in an arbitrary sequence. The approach is expressed in Algorithm 3.
Algorithm 3. PM NN(k, query point)
1: Generate Voronoi diagram using all interest points within the given types
2: dmax = ∞
3: Initialize TS = {type1, type2, ..., typek}
4: R = {distq, ∅1, ∅2, ..., ∅k}
5: RL = ∅
6: S = ∅
7: P1 = 1st NN = contain(q)
8: tp1 = Check_type(P1); suppose typei = tp1, remove it from TS
9: R = {distq, P1, ∅2, ..., ∅k}
10: Initialize NP (neighbor points) = P1's adjacent generators
11: Calculate the distance from P1 to each P in NP & sort them in ascending order
12: NP = {(P1, dist(q, P1)), ..., (Pi, dist(q, Pi))}
13: Pop out the first P in NP; suppose it is Pj
14: tpj = Check_type(Pj)
15: if tpj is in TS then
16:   update tpj in R & remove tpj from TS
17:   if TS is not ∅ then
18:     add Pj's neighbors into NP & go to step 11
19:   else
20:     if distq < dmax then
21:       update dmax = distq & RL = R
22:     else ignore it
23:     end if
24:   end if
25: else
26:   add Pj's adjacent neighbors into NP & go to step 11
27: end if
28: Expand q within this polygon & record the distances from q to the border points
29: Update S = {all objects whose distance to q < dmax, sorted in ascending distance order}
30: for each P in S do
31:   Pop out the first P & initialize TS = {type1, type2, ..., typek}
32:   t = Check_type(P); R = {distq, P, ∅2, ..., ∅k}
33:   Initialize NP = P's adjacent generators
34:   Calculate the distance from P to each Pi in NP & wipe out each P with dist(q, P) > dmax
35:   NP = {(P1, dist(q, P1)), ..., (Pi, dist(q, Pi))}
36:   if NP is not ∅ then
37:     Pop out the first P in NP; suppose it is Pj
38:   else go to step 30
39:   end if
40:   tpj = Check_type(Pj)
41:   if tpj is in TS then
42:     update tpj in R & remove tpj from TS
43:     if TS is not ∅ then
44:       add Pj's neighbors into NP & go to step 34
45:     else
46:       if distq < dmax then
47:         update dmax = distq & RL = R & wipe off each P in S with dist(P) > dmax
48:       end if
49:       go to step 30
50:     end if
51:   else
52:     add Pj's adjacent neighbors into NP & go to step 34
53:   end if
54: end for
To clarify the algorithm, a case study will fully illustrate how it works.
Fig. 7. Example 3 - One NVD for all objects
– Generate the NVD as in Fig. 7. White triangles, black dots, black triangles and white dots indicate post offices, banks, shops and dry cleaners, respectively.
– Initialize dmax = ∞, TS = {post office, bank, shop, dry cleaner}, R = {distq, ∅1, ∅2, ..., ∅k}, RL = ∅.
– Use the contain() function to locate P1, which is the 1st NN of q.
– As Type(P1) = dry cleaner, update TS = {post office, bank, shop} by removing it from TS, & R = {1, P1, ∅2, ..., ∅k} // suppose the distance from P1 to q is 1
– From P1, find the nearest neighbor whose type is in TS; suppose it is P4. As Type(P4) = bank, update TS = {post office, shop} & R = {5, P1, P4, ..., ∅k} // suppose the distance from P4 to P1 is 4
– Do the same to P4 as in the step above until TS = ∅. Suppose R = {15, P1, P4, P11, P53}; update dmax = 15.
– Expand q within P1's polygon and record all distances from q to the borders.
– Initialize S = {P4, P3, P6, P5, P7, P2, ..., Pn} // objects whose distance to q is within dmax
– Pop out P4 & update TS = {post office, dry cleaner, shop} & R = {3, P4, ∅2, ..., ∅k}. Search for P4's nearest neighbor whose type is in TS, and do the same for the remaining interest points iteratively until TS = ∅. Suppose R = {12, P4, P3, P12, P11}; then update dmax = 12. Wipe out each P in S with distq > 12.
– Do the same operation for every P in S until S is empty. In the process, once the distance exceeds dmax, terminate the expansion for this path.
The result finally comes out as R = {10, P3, P12, P13, P14}: the user first goes to post office P3, then heads to dry cleaner P12, after that towards bank P13, and finally arrives at shop P14; the length of the final path is 10.
6 Performance Evaluation
In the experiments, the Melbourne city map and the Frankston map in Australia were chosen from the whereis website [14]. In these maps, shops and restaurants represent the high-density scenario of interest points; on the other hand, hospitals and shopping centers represent the low-density scenario. All interest points are real-world data. The performance of our approaches is analyzed in terms of runtime for different diversities of interest points and different interest-point densities. For the nearest neighbor for multiple object types, the processing time increases with the number of object types (M). In Fig. 8(a), the dashed line indicates the performance of the one-NVD-for-each-object-type approach and the solid line indicates the performance of the one-NVD-for-all-objects approach. In this case, we use different types of shops as candidate types and the average density is 5/km². From Fig. 8(a), we can easily tell that one NVD for each object type performs better than one NVD for all objects if the number of object types is small, especially smaller than 4. Otherwise, one NVD for all objects is the better choice because it saves the time of generating NVDs. We can also tell that with an increasing number of object types, the processing time increases sharply because more polygon expansions are invoked and more NVDs must be generated. For the incremental nearest neighbors for sequential multiple object types query, the processing time likewise increases with the number of object types (M). Here a definition is introduced: the density relative rate (DRR). DRR is the ratio of the highest density to the lowest density over all object types; as a result, DRR is not smaller than 1. The closer DRR is to 1, the more evenly the objects are distributed. For example, if the user is concerned with 4 object types whose densities are 5.5/km², 3.6/km², 2.5/km² and 1.1/km² respectively, then this scenario's DRR is (5.5/km²) / (1.1/km²) = 5 (restated as code below). In Fig. 8(b), the first two bars indicate the processing time of the one-NVD-for-each-object-type approach and the last two bars indicate the processing time of the one-NVD-for-all-objects approach. The first and third bars are for a low-DRR scenario (DRR = 1) and the second and fourth bars are for a high-DRR scenario (DRR = 10). Fig. 8(b) illustrates that the processing time increases if DRR increases. In addition, the higher DRR is, the closer the two approaches (one NVD for each & one NVD for all) perform. Moreover, one NVD for all interest points generally performs better than one NVD for each object type because generating and loading NVDs are time-consuming tasks. For the optimum path for multiple object types query, the processing time also increases with the number of object types (M). In Fig. 8(c), the dashed line indicates the performance of the query when the density relative rate
Fig. 8. Processing Time Comparison: (a) M NN, (b) iM NN, (c) PM NN
(DRR) = 1 and the solid line indicates the performance of the query when the density relative rate (DRR) = 5. From Fig. 8(c), with an increasing number of object types, the processing time increases sharply because more polygon expansions are invoked. Moreover, DRR is another critical factor for the performance of the approach: the processing time increases more sharply if DRR increases from 1 to 5.
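Restating DRR as code, with the densities from the example above (the function is illustrative, the values are the paper's):

```python
def drr(densities):
    """Density relative rate: highest density over lowest density (always >= 1)."""
    return max(densities) / min(densities)

print(drr([5.5, 3.6, 2.5, 1.1]))  # the example above: 5.5 / 1.1 = 5.0
```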
7 Conclusion and Future Work
This paper introduced novel KNN searches involving multiple object types. The first query (nearest neighbor for multiple object types) provides a solution if the user wants to get the 1 NN for each category of interest points. The second query (incremental nearest neighbors for sequential multiple object types) helps the user find the shortest path passing through multiple object types in a pre-defined sequence. The last query (optimum path for multiple object types) provides an optimum path for users who want to pass multiple object types without any sequential constraint. These approaches investigate novel KNN over multiple object types using the network Voronoi diagram, which enriches the content of our mobile navigation system and gives more benefits to mobile users. In the future, we are going to incorporate intelligence techniques and context-awareness in mobile navigation and mobile query processing [5,4,6,1,3,8]. In addition, range and kNN search combined with a dynamic query point will also be investigated [15]. Performance in mobile query processing is always an issue, and we plan to examine the performance issues of mobile query processing more thoroughly; the use of data broadcast techniques also deserves further investigation [13].

Acknowledgments. This research has been partially funded by the Australian Research Council (ARC) Discovery Project (Project No: DP0987687).
References
1. Aleksy, M., Butter, T., Schader, M.: Architecture for the development of context-sensitive mobile applications. Mobile Information Systems 4(2), 105–117 (2008)
2. Bohl, O., Manouchehri, S., Winand, U.: Mobile information systems for the private everyday life. Mobile Information Systems 3(3,4), 135–152 (2007)
3. Doci, A., Xhafa, F.: A wireless integrated traffic model. Mobile Information Systems 4(3), 219–235 (2008)
4. Goh, J.Y., Taniar, D.: Mobile data mining by location dependencies. In: Yang, Z.R., Yin, H., Everson, R.M. (eds.) IDEAL 2004. LNCS, vol. 3177, pp. 225–231. Springer, Heidelberg (2004)
5. Goh, J., Taniar, D.: Mining frequency pattern from mobile users. In: Negoita, M.G., Howlett, R.J., Jain, L.C. (eds.) KES 2004. LNCS, vol. 3215, pp. 795–801. Springer, Heidelberg (2004)
6. Gulliver, S.R., Ghinea, G., Patel, M., Serif, T.: A context-aware tour guide: User implications. Mobile Information Systems 3(2), 71–88 (2007)
7. Kolahdouzan, M.R., Shahabi, C.: Voronoi-based k nearest neighbor search for spatial network databases. In: Proc. of 30th VLDB, Toronto, Canada, pp. 840–851. Morgan Kaufmann Publishers Inc., San Francisco (2004)
8. Luo, Y., Xiong, G., Wang, X., Xu, Z.: Spatial data channel in a mobile navigation system. In: Gervasi, O., Gavrilova, M.L., Kumar, V., Laganà, A., Lee, H.P., Mun, Y., Taniar, D., Tan, C.J.K. (eds.) ICCSA 2005. LNCS, vol. 3481, pp. 822–831. Springer, Heidelberg (2005)
9. Okabe, A., Boots, B., Sugihara, K., Chiu, S.N.: Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, 2nd edn. John Wiley and Sons Ltd., Chichester (2000)
10. Papadias, D., Zhang, J., Mamoulis, N., Tao, Y.: Query processing in spatial network databases. In: Proc. of 29th VLDB, Berlin, Germany, pp. 802–813. Morgan Kaufmann Publishers Inc., San Francisco (2003)
11. Roussopoulos, N., Kelley, S., Vincent, F.: Nearest neighbor queries. In: Proc. of ACM SIGMOD, San Jose, California, pp. 71–79. ACM Press, New York (1995)
12. Safar, M.: K nearest neighbor search in navigation systems. Mobile Information Systems 1(3), 1–18 (2005)
13. Waluyo, A.B., Srinivasan, B., Taniar, D.: Optimal broadcast channel for data dissemination in mobile database environment. In: Zhou, X., Xu, M., Jähnichen, S., Cao, J. (eds.) APPT 2003. LNCS, vol. 2834, pp. 655–664. Springer, Heidelberg (2003)
14. Telstra Corporation: whereis Melbourne (February 2006), http://www.whereis.com
15. Xuan, K., Zhao, G., Taniar, D., Srinivasan, B.: Continuous range search query processing in mobile navigation. In: Proceedings of the 14th ICPADS 2008, Melbourne, Victoria, Australia, pp. 361–368 (2008)
16. Zhao, G., Xuan, K., Taniar, D., Srinivasan, B.: Incremental k-nearest-neighbor search on road networks. Journal of Interconnection Networks (JOIN) 9(4), 455–470 (2008)
RRPS: A Ranked Real-Time Publish/Subscribe Using Adaptive QoS

Xinjie Lu 1,4, Xin Li 3, Tian Yang 1,4, Zaifei Liao 1,4, Wei Liu 1, and Hongan Wang 1,2

1 Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
[email protected]
2 State Key Lab. of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
3 Department of Computer Science and Technology, Shandong University, Jinan, Shandong 250101, China
4 Graduate University of the Chinese Academy of Sciences, Beijing 100049, China

Abstract. The Publish-Subscribe paradigm has been widely employed in real-time applications. However, the existing technologies and models only support a simple binary concept of matching: an event either matches a subscription or it does not; for instance, a production monitoring event will either match or not match a subscription for a production anomaly. Based on adaptive Quality of Service (QoS) management, we propose a novel publish/subscribe model, which is implemented as a critical service in the real-time database Agilor. We argue that publications have different relevance to a subscription. On the premise of guaranteeing a deadline d, a subscriber approximately receives the k most relevant publications, where k and d are parameters defined by each subscription. After the architecture of our model is described, we present the negotiations between components and scalable strategies for adaptive QoS management. Then, we propose an efficient algorithm to select different strategies adaptively depending on an estimation of the current QoS. Furthermore, we experimentally evaluate our model on real production data collected from the manufacturing industry to demonstrate its applicability in practice.
1 Introduction
Many complex distributed applications require sophisticated processing and sharing of an extensive amount of data under critical constraints such as timing and storage space. These applications need to acquire data from various systems or devices, process the acquired data in the context of historical data, and provide a timely response. The Publish-Subscribe (pub/sub for short) paradigm has the provision of data acquisition and dissemination as its main purpose. For some specific systems, such as an MES (Manufacturing Execution System), real-time monitoring provides an important means for production safety in a
This work was supported in part by the National High Technology Research and Development Program ("863" Program) of China under Grant No. 2007AA040702.
manufacturing industry. Determining how to provide the most relevant results to a subscriber before the deadline through a pub/sub model is extremely necessary in such critical environments.

Real-time Production Monitoring: A real-world example involves a production device, such as an air compressor, disseminating its run parameters (pressure or temperature, for example) to monitoring programs that have registered their interest in some specific event, e.g. retrieving the ten pressure readings closest to 5 MPa and the ten temperature readings closest to 300. Since the status of the whole production system is uncertain, a fault may happen in any part at any instant. So the ideal pub/sub service should adaptively provide the top-k most fitting items for a client in the proper situation. The identification of the proper situation depends on the QoS model of the run-time environment of the pub/sub service. Observations from the real world show that for those production devices which deserve special attention, the deadline is more important than receiving strictly k publications. Suppose there is an abnormal situation in the production process: the pressure of a certain device D keeps increasing and gradually approaches the threshold, 5 MPa. At this moment, an adaptive pub/sub service should automatically allocate system resources to ensure the deadline for such an emergency, because this situation usually indicates that there will be a mechanical failure in device D. Even if the system does not have idle memory or computing capacity, the pub/sub service should return as many items as possible to the client before the deadline. For an engineer, receiving some timely interesting items is far better than getting strictly k items after the deadline. As an efficient means, adaptive QoS management can achieve such an approximate top-k pub/sub mechanism.

A great deal of traditional pub/sub models and systems [1,2,3,4,5] only support a simple binary concept of matching: an event either matches a subscription or it does not. In the above-mentioned instance, a production monitoring event with real-time pressure and temperature information will either match or not match a subscription for a production anomaly. However, more and more applications need a more sophisticated dissemination, which requires some "second best" events. If these "second best" events can be received by engineers through the production monitoring applications, they may take measures to avoid accidents. In addition, these relevant event records will be helpful for mining frequent patterns preceding production accidents. Thus, we propose a Ranked Real-time Pub/Sub model called RRPS using adaptive Quality of Service (QoS) management [6,7,8,9,10]. RRPS can disseminate events with different relevance to one subscription. On the premise of guaranteeing a deadline d, a subscriber approximately receives the k most relevant publications, where k and d are parameters defined by each subscription. The main contributions of our work can be summarized as follows:

– We propose a novel top-k ranked real-time pub/sub model named RRPS, which uses the relevance between publications and subscriptions as the matching criterion and adaptively chooses approximately the top-k ranked publications before the deadline of each publication. Our model is conveniently achieved
using adaptive QoS management, and the adaptivity of the QoS is based on the definitions of several of the most important QoS dimensions and the computation of their estimations. We also show the components and the negotiations between them in detail.
– We present a reference implementation of RRPS, implemented as a critical service in a real-time database. Moreover, extensive experimental results with different workloads of topics and subscribers show that our model is applicable in practice.

The remainder of this paper is organized as follows. We formally define our model in Section 2. Section 3 presents analytical models of QoS dimensions and the adaptive QoS mechanism. The reference implementation of RRPS and a case study are described in Section 4. Section 5 evaluates the performance of RRPS under different workloads. An overview of related work is presented in Section 6 and, finally, Section 7 gives some conclusions and suggestions for further research.
2 RRPS Model
In this section, we formally define the model of RRPS. In RRPS, a subscriber approximately receives the first k publications, ordered by the relevance between publication and subscription, before the deadline d, where k and d are parameters defined by each subscription. We define a quintuple R = (P, S, C, Q, RF), where P is the set of publications, S is the set of subscriptions, C is the set of clients, and Q is the set of QoS profiles used in RRPS. RF is the set of functions used to compute the relevance between a subscription and a publication.

Definition 1. (Subscription): A subscription s ∈ S, which is expressed as part of a QoS profile q ∈ Q, contains information describing the interest of a client c ∈ C. The deadline d and top-k, as two required fields, must be assigned in each s.

Definition 2. (Publication): A publication p ∈ P is an event with information for some subscriptions. A timestamp identifies the occurring time of p.

Definition 3. (QoS profile): A QoS profile q ∈ Q is a profile in the form of XML, which has the requisite parameters for QoS management. The detailed contents of a QoS profile can be seen in Section 3.1.

Definition 4. (Relevance): We define the relevance function rf ∈ RF between a subscription s and a publication in two ways: default functions and user-defined functions. For some common fields (attributes) in a subscription, RRPS computes the relevance by default relevance functions. However, for unusual fields, a subscription must specify the formula when it is submitted.
The problem of computing the relevance between two objects is well studied in various communities. In this paper, we focus on modeling the publications and subscriptions as points in a multidimensional space and use the Euclidean distance as the approximate relevance between them.

Definition 5. (Approximate Top-k Publications): Let k' denote the number of publications that actually reach the client according to a subscription. Due to the importance of the deadline for a subscription, we define the approximate top-k publications as the top k' publications, ordered by the relevance between subscriptions and publications, delivered before the deadline d assigned in each subscription, where k' is determined by the QoS of RRPS at run time. In other words, k' is the number of publications that can be sent to the client before the corresponding deadline d. For the above example, the QoS of RRPS may only allow finding six pressure readings and eight temperature readings at a certain time. Then RRPS will just send the "not enough" data to the client. But these data are enough for engineers to make a timely decision.
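A sketch of the Euclidean relevance of Definition 4 and the approximate top-k delivery of Definition 5 follows; the timing model and function names here are assumptions for illustration, not the Agilor implementation:

```python
import math, time

def relevance(pub, sub):
    """Default relevance: Euclidean distance between the publication and the
    subscription, both modeled as points in a multidimensional space
    (a smaller distance means a more relevant publication)."""
    return math.dist(pub, sub)

def approx_top_k(publications, sub, k, deadline_s):
    """Deliver up to k publications ordered by relevance, stopping at the
    deadline, so the client receives k' <= k items (Definition 5)."""
    start = time.monotonic()
    result = []
    for p in sorted(publications, key=lambda p: relevance(p, sub)):
        if len(result) >= k or time.monotonic() - start > deadline_s:
            break
        result.append(p)
    return result
```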
3 Adaptive QoS Management

3.1 QoS Expression by DDS
The Data Distribution Service (DDS) is a newly adopted specification from the Object Management Group (OMG). DDS is aimed at a diverse community of users requiring data-centric pub/sub communications. Compared with previous approaches, the highlight of DDS is that it enumerates and provides formal definitions for the QoS policies, which can be used to configure the service conveniently. DDS defines 22 QoS policies in all. According to their characteristics, the QoS policies concerning timeliness and resource management can be divided into three groups: Timeliness, Resources, and Reliability. More details about each policy can be found in [11]. We describe the QoS of the real-time pub/sub model by DDS's QoS policies and adaptively adjust these QoS policies to provide the top-k ranked publications before the deadline for each subscriber. In order to transmit these QoS policies efficiently between components, we propose a QoS profile containing the QoS policies; its schema file is shown as follows:
...... ......

As we can see from the above schema, each QoS policy is described as a simple or complex element in the QoS profile, and its type is either a simple type, e.g. long or string, or a user-defined data type, such as Duration_t or HistoryQoSKind. In all of these QoS profiles, the subscription deadline and k are the essential parts. Once a subscription is submitted, RRPS will dynamically adjust system parameters to satisfy the user's requirements.

3.2 RRPS in Agilor and Its Components
The architecture of Agilor in Figure 1 shows the position of RRPS. RRPS defines several QoS profiles, listed in Table 1 below.
Fig. 1. Agilor Architecture and the position of RRPS
Fig. 2. Interactions between Publisher and Subscriber
Table 1. QoS profiles in RRPS

Profile Name        | Meaning
qos_max             | Max QoS provided by RRPS without any other workloads.
qos_active          | Current QoS provided by RRPS according to current workloads.
qos_active_new      | Current QoS provided by RRPS after parameter setting.
qos_request         | QoS requested by the application.
qos_predictive      | QoS estimated by the Prediction Service.
qos_registered_list | QoS profiles list registered in the QoS Manager.
qos_profile         | Each profile in qos_registered_list.
RRPS provides an implicit separation of concerns, dividing responsibilities and functionalities among the defined components. These components are:

– QoS Negotiator: responsible for receiving qos_request from the application and determining whether qos_max satisfies qos_request.
– QoS Manager: responsible for managing every registered qos_profile in qos_registered_list and invoking the QoS Monitor.
– QoS Monitor: responsible for monitoring each qos_profile in qos_registered_list at regular intervals.
– Prediction Service: responsible for estimating all kinds of QoS profiles as the foundation for setting parameters adaptively by invoking the QoS Scheduler.
– QoS Scheduler: responsible for re-allocating resources by setting key parameters of the system.

3.3 QoS Profile Negotiations
Figure 3 illustrates the procedure necessary for the adaptive QoS management execution of a QoS request. Each numbered circle marks a step in the sequence of profile negotiation.

Fig. 3. Sequence diagram for QoS profile negotiation

The first step consists of sending the request to the QoS
Negotiator (Step 1), and the QoS Negotiator determines whether RRPS can provide qos_request based on qos_active (Step 2). If yes, the QoS Negotiator registers qos_request in qos_registered_list (Step 3a); otherwise, the QoS Negotiator returns qos_max and qos_active to the application (Step 3b). The QoS Manager has the QoS Monitor verify whether every qos_profile in qos_registered_list is guaranteed (Step 4). The verification by the QoS Monitor is done using the Prediction Service (Step 5). The next step is to estimate qos_predictive and return it to the QoS Monitor (Step 6); this is further discussed in Section 3.5. For each qos_profile in qos_registered_list, if it is not guaranteed (Step 7), the QoS Monitor sends the qos_profile to the QoS Scheduler (Step 8b). The QoS Scheduler then sets parameters for allocating resources, e.g. communication, computing and storage resources (Step 9), and returns qos_active to the QoS Monitor (Step 10). After every qos_profile in qos_registered_list is guaranteed, the QoS Monitor sends qos_registered_list to the QoS Manager (Step 8a). The QoS Negotiator replaces qos_active with qos_active_new from the QoS Manager (Step 11). Finally, the application receives qos_active_new (Step 12), which is the best QoS performance provided by the system after satisfying qos_request.

3.4 QoS Analytical Model in RRPS
In Figure 2, p1 to p5 denote the steps in a typical publication and s1 to s5 the steps in a typical subscription in RRPS. Steps p1 and p2 of Figure 2 show the creation of the Publisher and the DataWriter, respectively. Steps p3 and p4 show the user application writing data to the DataWriter and the DataWriter writing data to the Topic Queue. Step p5 shows the status changes listened for by a Listener according to its current policy. Step s1 of Figure 2 shows the Subscriber's creation. Step s2 shows the use of a SubscriberListener: it must first be created and attached to the Subscriber. Then, when a notification arrives, it is made available to each related DataReader, and the SubscriberListener is triggered (s3). The application must get the list of affected DataReader objects; then it can read the data directly from the Topic Queue (s4). Step s5 shows the user application reading data from the DataReader. To abstract the presentation of the analytical estimation, each step (p1 to p5, s1 to s5) in Figure 2 can be referred to as a task. We focus on the following three aspects (the QoS groups of Section 3.1) for each task and refine the task QoS model proposed in [12].

Task Timeliness (T): includes four constituent elements. Delay time (DT) is the time associated with end-to-end message delays in a network. Queue time (QT) is the waiting time before being processed by a task, e.g. queuing delay. Initialization time (IT) is the time associated with the initialization of the task. Process time (PT) is the time for a task to be processed.

$T(t) = DT(t) + QT(t) + IT(t) + PT(t)$ (1)
Task Resources (RS): defines the cost associated with the execution of tasks. It has two constituent components: enactment cost and realization cost. The Memory Cost (MC) is the memory used by the pub/sub system and by the monitoring of the QoS of instances. The Storage Cost (SC) is the disk space used by the runtime execution of the task. The Communication Cost (CC) is the network cost incurred by the pub/sub system.

$RS(t) = MC(t) + SC(t) + CC(t)$ (2)
Task Reliability (RA): includes two main components: storage failures and transmission failures. Storage failures consist of data saving and accessing failures that lead to an abnormal task termination. Transmission failures consist of network transmission exceptions that lead to an anomalous termination of a task. Deadline Missing failures are deadline-missing exceptions: a task has been completed but its end time misses the deadline. The Storage failure rate (SFR) is the ratio between the number of times a task did not perform for its users and the number of times the task was called for execution, i.e. Number(unsuccessful executions)/Number(called for execution). The Transmission failure rate (TFR) relates the number of failed transmissions to the total, defined as Number(failed)/(Number(failed) + Number(done)). The Deadline Missing failure rate (DMFR) is the ratio between the number of times a task did not finish before its deadline and the number of times the task was called for execution, i.e. Number(deadline missing executions)/Number(called for execution).

$RA(t) = 3 - (SFR(t) + TFR(t) + DMFR(t))$ (3)
To facilitate a more comprehensive understanding of the QoS, we need a quantitative approach to characterize the QoS in a particular dimension [13]. We define a succinct index that can summarize many aspects of QoS, denoted the Quality Index (QI). To get a robust and effective QI, we first normalize the above three dimensions (T, RS, RA) using SUM [14] (shift the minimum to 0, scale the sum to 1). Then, we combine the three normalized constituent indexes into QI by CombSUM [15] (the sum of the individual sub-indexes). Thus the three dimensions are computed as follows:

$T_{norm} = (-1) \cdot \frac{T - T_{min}}{\sum_{i=1...N} (T_i - T_{min})}$ (4)

$RS_{norm} = (-1) \cdot \frac{RS - RS_{min}}{\sum_{i=1...N} (RS_i - RS_{min})}$ (5)

$RA_{norm} = \frac{RA - RA_{min}}{\sum_{i=1...N} (RA_i - RA_{min})}$ (6)

where N is the number of samples. Since we express a high QoS level with a high QI value, -1 has been inserted as a multiplier into Eqs. (4, 5). Based on Eqs. (4-6), QI can be defined as follows:

$QI = w_T \cdot T_{norm} + w_{RS} \cdot RS_{norm} + w_{RA} \cdot RA_{norm}$ (7)
where $w_T$, $w_{RS}$ and $w_{RA}$ are the weights for $T_{norm}$, $RS_{norm}$ and $RA_{norm}$, respectively. QI ranges from -1 to 1, and these weights represent the importance of the three dimensions to QI. Different adaptation strategies use different settings of these weights because each strategy has its own emphasis (see Section 3.6).
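A sketch of the SUM normalization of Eqs. (4)-(6) and the CombSUM combination of Eq. (7) is shown below; the guard against a zero denominator is our own addition:

```python
import numpy as np

def sum_normalize(values, invert=False):
    """SUM normalization: shift the minimum to 0, scale the sum to 1.
    invert=True flips the sign, so 'lower is better' dimensions
    (timeliness, resources) contribute positively to quality (Eqs. 4-5)."""
    v = np.asarray(values, dtype=float)
    shifted = v - v.min()
    total = shifted.sum() or 1.0   # guard against all-equal samples (our own addition)
    norm = shifted / total
    return -norm if invert else norm

def quality_index(T, RS, RA, weights=(1.0, 1.0, 1.0)):
    """CombSUM quality index over the three QoS dimensions (Eq. 7)."""
    wT, wRS, wRA = weights
    return (wT * sum_normalize(T, invert=True)
            + wRS * sum_normalize(RS, invert=True)
            + wRA * sum_normalize(RA))
```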
3.5 QoS Estimations
The QoS conditions can be inferred by analyzing a real-time data flow. For example, the Delay Time of Section 3.4, corresponding to end-to-end message delays in a network, may be described by a probability density function (pdf) using statistics. We note that the data model of each aspect of a task can be so complex that it cannot be described in terms of simple well-known probabilistic distributions. So the task runtime behavior specification is composed of two classes of information [12]: basic and distributional. The basic class associates with each task's QoS dimension the minimum, average and maximum values. The second class is a distributional class, usually defined as a constant or a distribution function (such as Exponential, Normal, Weibull, or Uniform). For example, Table 2 shows the QoS estimations for tasks.

Table 2. Task QoS estimations

QoS Dimensions  | Min value | Avg value | Max value | Distribution function
Timeliness (ms) | 13.333    | 17.261    | 22.727    | Normal(17.261, 3.594)
Resources (MB)  | 10.953    | 15.197    | 28.804    | Normal(15.197, 6.414)
Reliability (%) | 94        | 98.429    | 100       | Normal(98.429, 2.299)
For a model that describes a known data distribution, the problem is simplified to estimating the unknown parameters of a known model from the available data. In our reference implementation of RRPS, we used the method of Maximum Likelihood Estimation (MLE), which is considered one of the most robust techniques for parameter estimation [2,16]. The estimations are based on data collected at runtime. Once the most suitable distribution (together with its parameters) has been identified, its statistical properties can be exploited to find a QoS policy setting that satisfies the objective of keeping a constant QoS level.
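For a Normal model like those in Table 2, MLE reduces to the sample mean and standard deviation; a minimal sketch (illustrative, not the Agilor code):

```python
import numpy as np

def mle_normal(samples):
    """MLE for a Normal(mu, sigma) model: the sample mean and the
    (1/N, i.e. maximum-likelihood) standard deviation."""
    x = np.asarray(samples, dtype=float)
    return x.mean(), x.std()

# e.g. fit runtime delay-time samples collected by the QoS Monitor
mu, sigma = mle_normal([13.3, 17.5, 16.8, 22.7, 15.9])
```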
3.6 QoS Strategies and Adaptive Selection Algorithm
There are four typical strategies defined in RRPS. Their emphases among the QoS aspects, the corresponding QoS policies and the weights are described in Table 3. When RRPS is running and instances are executed, log files are generated to save records describing the runtime performance on the three QoS dimensions. The QoS Monitor is an independent component, which records all of the events for the whole process of publication and subscription by RRPS.

Table 3. List of QoS Strategies

Strategy Name            | Emphasis    | Weights                 | QoS Policies
Completely Fair (CF)     | None        | wT=1, wRS=1, wRA=1      | All Policies
Timeliness-Biased (TB)   | Timeliness  | wT=2, wRS=0.5, wRA=0.5  | Deadline, Latency Budget
Resources-Biased (RSB)   | Resources   | wT=0.5, wRS=2, wRA=0.5  | Resource Limits, Time Based Filter
Reliability-Biased (RAB) | Reliability | wT=0.5, wRS=0.5, wRA=2  | Durability, Lifespan, History, Reliability

After qos_predictive is computed by the Prediction Service, the QoS Monitor will adjust the QoS strategy if some of the key QoS policies cannot be satisfied. The following algorithm gives the kernel code for adaptive QoS strategy selection, returning a result list containing the adjustment result and the transformed strategy for each TopicID.

Algorithm 1. Adaptive Selection
1: Input: Alarming Topic ID set TopicList;
2: Output: Execution Results List ExecResultList.
3: Body: QOS_REQUEST qos_quest_tmp;
4: QOS_PREDICTIVE qos_predictive_tmp;
5: for ∀TopicID ∈ TopicList do
6: qos_quest_tmp = GetRequestInfoByID(TopicID);
7: qos_predictive_tmp = GetPredictiveInfoByID(TopicID);
8: if qos_quest_tmp.LatencyBudget
Fig. 3. Flowchart of EBDA (branch decisions leading to Accept or Reject outcomes)
Table 2 compares the packet losses for different queues in DA, DADT and EBDA algorithms. For comparison purposes, we have used six applications, average traffic network load (refer to Section 6), buffer size 600 packets, bursty uniform traffic model, and average dequeue time of 14 clock cycles for the burst of 10 packets. Table 2. Comparison of packet losses for DA, DADT, and EBDA
Queue | Packet Size | DA      | DADT    | EBDA
0     | 8           | 550486  | 337482  | 427936
1     | 2           | 46901   | 15841   | 64376
2     | 8           | 547978  | 335285  | 426873
3     | 1           | 16796   | 4549    | 28265
4     | 4           | 163126  | 79563   | 158210
5     | 16          | 1421350 | 1858333 | 1248060
Total |             | 2746637 | 2631053 | 2353720
As seen from Table 2, with our new equation for the threshold value, we have been able to reduce the overall packet losses significantly. Also, for application 5, which has the highest packet loss under DA, DADT and EBDA, we have been able to reduce the packet loss significantly, which provides fairness to all the applications. In this way, EBDA distributes the packet losses more evenly among the different applications.
5 Simulation Model

We developed a simulation model for a packet buffer using VHDL, as shown in Figure 4. In Figure 4, the Traffic Generator block produces packets according to the specifications provided in the configuration file (config file). The config file specifies the traffic model and the "load on each port" [6].
Fig. 4. Simulation model for the packet buffer (a Traffic Generator, consisting of a config file, the SIM simulator and a converter, feeding a controller and n FIFO output queues in a packet buffer; M: buffer space, RA: read address, WA: write address)
In Figure 4, we used a traffic model with bursts of packets in busy-idle periods, with destinations uniformly distributed packet-by-packet or burst-by-burst over all the output ports; this is called the "Bursty Uniform Traffic Model." The "load on each port" (ρ) is determined by the ratio of the number of packets in the busy-idle periods [8] and is given by the following equation:

$\rho = \frac{L_b}{L_b + L_{idle}}$ (7)
where $L_b$ is the mean burst length and $L_{idle}$ is the mean idle length. The Traffic Generator produces packets with a mean inter-arrival time and a mean burst length [6]. The "SIM" simulator in [15] is used to produce a trace of packets, which is written to an output file through the converter.
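A sketch of a bursty uniform source consistent with Eq. (7) is given below; the exponentially distributed burst and idle lengths are an assumption (the paper does not fix the distributions), and all names are illustrative:

```python
import random

def bursty_uniform_source(mean_burst, mean_idle, n_ports, n_slots):
    """Yield one destination port per busy slot and None per idle slot.
    The load on each port follows Eq. (7): rho = mean_burst / (mean_burst + mean_idle)."""
    slots = 0
    while slots < n_slots:
        burst = max(1, round(random.expovariate(1.0 / mean_burst)))
        dest = random.randrange(n_ports)            # destination fixed burst-by-burst
        for _ in range(min(burst, n_slots - slots)):
            yield dest
            slots += 1
        idle = round(random.expovariate(1.0 / mean_idle))
        for _ in range(min(idle, n_slots - slots)):
            yield None
            slots += 1

# e.g. rho = 0.7 per port with mean burst length 10: mean_idle = 10 * (1 - 0.7) / 0.7
```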
6 Simulation Results and Analysis

As discussed, we have used the "Bursty Uniform Traffic Model" for simulations of the network traffic loads, since it is the most commonly used model [16][18][19]. In addition, we have used six applications and an average dequeue time of 14 clock cycles for a burst of 10 packets.
We implemented a traffic mix with the average network traffic loads according to [5]. First of all, we determined the optimum 'α' values for DA and DADT; the optimum alpha is the value for which DA gives the minimum packet loss ratio. Table 3 shows the packet sizes of the different applications in bytes, based on the average network traffic load flows in [5]. For our simulation of the average traffic load, we have used these packet sizes for the different applications.

Table 3. Queue properties for average traffic load

                               Q0  | Q1 | Q2  | Q3 | Q4  | Q5
Size in bytes                | 256 | 64 | 256 | 32 | 128 | 512
Packet unit # (32 bytes/unit)|   8 |  2 |   8 |  1 |   4 |  16
Table 4 shows the packet losses for different variations of the alpha and gamma values for EBDA, for a buffer size of 600 packets and a load of 70% on each queue.

Table 4. Variation of (alpha1, alpha2, gamma1, and gamma2) vs. total packet loss for EBDA

alpha1, alpha2, gamma1, gamma2 | Total Packet Loss
16, 4, 64, 64                  | 2353720
16, 2, 64, 64                  | 2356962
4, 16, 64, 32                  | 2360745
16, 8, 8, 16                   | 2422177
16, 8, 16, 32                  | 2376250
16, 8, 8, 64                   | 2385378
16, 8, 32, 64                  | 2361126
16, 8, 64, 64                  | 2353922
From Table 4, the optimum values of alpha1, alpha2, gamma1, and gamma2 come out to be those of variation 1. For comparison purposes, we will therefore use the values 16, 4, 64, and 64 for alpha1, alpha2, gamma1, and gamma2, respectively. Figure 5 shows the performance of the three algorithms (EBDA, DA, and DADT) for different loads; the load is varied from 0.5 to 0.9. As seen in Figure 5, EBDA has the lowest packet loss ratio for all loads. The packet loss ratio increases for all the algorithms with an increasing load on the queues. Notice that the performance difference widens at higher loads. As the load is increased, applications with larger packet sizes tend to grow their queue lengths beyond their threshold values more frequently. Since EBDA utilizes the buffer space more efficiently by providing fairness to all the applications, it can reduce the packet loss ratio significantly.
Fig. 5. Packet Loss Ratio vs. Load for EBDA, DADT, DA for the average traffic load

Fig. 6. Packet Loss Ratio vs. Buffer Size
Figure 6 shows the performance of the three algorithms DA, DADT and EBDA as the buffer size varies from 500 to 800 packets. With an increase in buffer size, the packet loss ratio decreases for all three algorithms, because each queue gets more space to accommodate packets. Table 5 shows the improvement in packet loss ratio for 'EBDA over DA' and 'EBDA over DADT' as the load varies from 0.5 to 0.9.

Table 5. Improvement ratio of EBDA over DA and DADT for the average traffic load
Load | Improvement ratio (%) (EBDA/DA) | Improvement ratio (%) (EBDA/DADT)
0.5  | 15.3                            | 6.57
0.6  | 16.6                            | 9.59
0.7  | 16.7                            | 11.8
0.8  | 15.9                            | 12.2
0.9  | 15.1                            | 11.6
7 Conclusions

This paper proposed the Evenly Based Dynamic algorithm (EBDA) to reduce the number of packets dropped at the packet buffer and to distribute the packet loss evenly among different applications in a network terminal. A buffer management algorithm decides the amount of space for each output queue in the packet buffer. Three buffer management algorithms were implemented for our simulations: 1) the Dynamic algorithm (DA); 2) the Dynamic Algorithm with Dynamic Threshold (DADT); and 3) the Evenly Based Dynamic algorithm (EBDA). EBDA provides more fairness to all the applications and utilizes buffer space efficiently, which distinguishes it from DA and DADT. The simulations considered a buffer size of 600 packets, 6 output queues (0-5), the bursty uniform traffic model, a dequeue time of 14 clock cycles for a burst of 10 packets, and a uniform load on all the output queues. For the traffic mix with the average network traffic loads [5], EBDA improves the packet loss ratio by more than 15% compared to the conventional dynamic algorithm.
References

1. Tanenbaum, A.: Computer Networks, 4th edn. Prentice-Hall, Englewood Cliffs (2002)
2. Henriksson, T., Nordqvist, U., Liu, D.: Embedded Protocol Processor for fast and efficient packet reception. IEEE Proceedings on Computer Design: VLSI in Computers and Processors 2, 414–419 (2002)
3. Paxson, V.: End-to-End Internet packet dynamics. In: Proceedings of ACM SIGCOMM, vol. 27, pp. 13–52 (October 1997)
4. Henriksson, T.: Intra-Packet Data-Flow Protocol Processor. PhD Dissertation, Linköping University (2003)
5. Nordqvist, U., Liu, D.: Power optimized packet buffering in a protocol processor. In: Proceedings of the 2003 10th IEEE International Conference on Electronics, Circuits and Systems, vol. 3, pp. 1026–1029 (2003)
6. Arpaci, M., Copeland, J.A.: Buffer Management for Shared Memory ATM Switches. IEEE Communication Surveys (First Quarter 2000)
7. Choudhury, A.K., Hahne, E.L.: Dynamic Queue Length Thresholds for Shared-Memory Packet Switches. IEEE/ACM Transactions on Networking 6(2), 130–140 (1998)
8. Tobagi, F.A.: Fast Packet Switch Architectures for Broadband Integrated Services Digital Networks. Proceedings of the IEEE 78, 133–167 (1990)
9. Irland, M.: Buffer Management in a Packet Switch. IEEE Transactions on Communications COM-26(3), 328–337 (1978)
10. Foschini, G.J., Gopinath, B.: Sharing Memory Optimally. IEEE Transactions on Communications COM-31(3), 352–360 (1983)
11. Kamoun, F., Kleinrock, L.: Analysis of Shared Finite Storage in a Computer Network Node Environment under General Traffic Conditions. IEEE Transactions on Communications COM-28, 992–1003 (1980)
12. Wei, S.X., Coyle, E.J., Hsiao, M.T.: An Optimal Buffer Management Policy for High-Performance Packet Switching. In: Proceedings of IEEE GLOBECOM 1991, vol. 2, pp. 924–928 (December 1991)
13. Thareja, A.K., Agarwal, A.K.: On the Design of Optimal Policy for Sharing Finite Buffers. IEEE Transactions on Communications COM-32(6), 737–780 (1984)
14. Henriksson, T., Nordqvist, U., Liu, D.: Specification of a configurable general-purpose protocol processor. IEE Proceedings on Circuits, Devices and Systems 149(3), 198–202 (2002)
15. Iyer, S.: SIM: A Fixed Length Packet Simulator, http://klamath.stanford.edu/tools/SIM
16. Chu, Y., Rajan, V.: An Enhanced Dynamic Packet Buffer Management. In: Proceedings of the 10th IEEE Symposium on Computers and Communications (ISCC 2005), Cartagena, Spain (June 2005)
17. Cisco Systems, http://www.cisco.com/warp/public/473/lan-switch-cisco.shtml (accessed: February 15, 2009)
18. McCreary, S., Claffy, K.: Trends in Wide Area IP Traffic Patterns: A View from Ames Internet Exchange. In: ITC Specialist Seminar on IP Traffic Measurement, Modeling, and Management, Monterey, California (September 2000)
19. Manthorpe, S.: http://lrcwww.epfl.ch/people/sam/research_protlevels.html (accessed: February 22, 2009)
On a Construction of Short Digests for Authenticating Ad Hoc Networks

Khoongming Khoo¹, Ford Long Wong², and Chu-Wee Lim¹

¹ DSO National Laboratories, Singapore, [email protected]
² University of Cambridge, United Kingdom
Abstract. In pervasive ad hoc networks, there is a need for devices to be able to communicate securely, despite the lack of a priori shared security associations and the absence of an infrastructure such as a PKI. Previous work has shown that, through the use of short verification strings exchanged over manual authentication channels, devices can establish secret session keys. We examine a construction of such a cryptographic digest function for deriving the verification string, and propose an improved construction with weaker assumptions. We further provide a concrete instantiation, based on finite fields, which is efficient.

Keywords: authentication, ad hoc networks, digest, hash.
1 Introduction
The strength of a pseudorandom function used for cryptographic purposes often depends on the absolute number of bits of output. For a cryptographic hash function, the resistance of the function to pre-image attack and collision attack is clearly related to the output length, in addition to its other mathematical properties. Clearly, the longer its output length, the more work an attacker needs to do to attack it successfully. Such properties of cryptographic hash functions lend them to convenient usage in authentication codes. However, for certain security usages, it may not be essential to use a full-length cryptographic hash function to have adequate security, nor is it necessarily desirable. If used as a one-way function, a hash is required to be practically irreversible; for some usages this requirement is not strictly necessary. A hash function has the considerable disadvantage of relatively high computational complexity, in turn a by-product of having a high output length and the many operations it entails; this is an issue of concern in constrained devices. It is possible in certain security situations to obviate the need for the extensive set of security properties of cryptographic hash functions, and opt instead for pseudorandom functions which are algorithmically more efficient and whose outputs are shorter than traditional cryptographic hashes, for example a digest function. We briefly overview the security protocols which would utilize such functions. We then review a proposal for such a function, point out some
incompleteness in its development, and offer a more complete characterization. From this, we derive our improved construction for an efficient digest function, based on finite fields. We provide a concrete instantiation, to show that this design is practical and can be efficiently realized.

1.1 Related Work - Key Agreement Protocols in an Ad Hoc Environment
Various security protocols have been proposed by researchers to address the problem of bootstrapping key agreement between two or more ad hoc devices which do not have a prior security association, in the absence of a trust infrastructure such as a PKI. The nature of the environment provides an untrusted though high-bandwidth wireless channel for radio transmissions, as well as various possibilities of smaller-bandwidth auxiliary channels; the latter offer stronger security properties than the radio channel, such as data-origin authenticity. Early work [16,2,8] makes the assumption that the auxiliary channels are either private (i.e., confidential), or else too high in bandwidth to be attacked, vis-à-vis adversarial brute-force computational capabilities. More recent work [9,19,18,11,13,20] makes less stringent assumptions than before on the auxiliary channel, assuming that the channels are merely 'authentic' (to various degrees), 'data-origin-authentic', or 'unspoofable'. This body of work develops various protocols similar in concept; they basically have Diffie-Hellman key contributions and bit commitments sent over the insecure radio channel between participants, and then carry out authentication by exchanges of short data strings over more secure, authentic channels (such as machine-visual, human-manual, or audio channels) to verify the commitments. The role of the Diffie-Hellman component is to achieve resistance against a passive-only adversary, while the exchange and verification of short strings over the authentic channels provides resistance against an active (man-in-the-middle) adversary. (To this end, one of the authors of this paper has also highlighted, in previous work, the particular utility of authentic one-bit messages [19,20].)

1.2 Separating the Requirements on the Pseudorandom Function
These protocol proposals mainly utilize cryptographic hash functions, Message Authentication Code (MAC) functions, or other constructions for bit commitments. This is because these protocols have a universal need for such a pseudorandom function to apply to the DH contributions, having the property of (using hash function terminology) strong first and second pre-image resistance, or (using bit commitment terminology) strong hiding and binding properties. In some two-party protocols, and especially in multi-party settings, there would be performance benefits in using a more efficient pseudorandom function, in addition to a hash or MAC chosen for bit commitments, instead of just using the same function at two separate stages of the protocol. Such a case has been made in [14], in which it was argued that, in view of the large size of the input and the resulting performance impact of calculating its long hash or MAC
output even when just a short output is really required, it is worthwhile to consider a digest function in which the computational complexity is quantifiably lower, provided certain less stringent mathematical criteria are fulfilled.

1.3 Secure Key Agreement Utilizing a Digest Function
As an illustration, we reproduce the Symmetrised HCBK Protocol [14, Section 4] for a two-party case. This is depicted in Protocol 1.

Protocol 1. Two-Party Case of the Symmetrised Group Key Agreement Protocol

  User Up:                                  User Uq:
    Pick up ∈ {0, ..., |G| - 1}               Pick uq ∈ {0, ..., |G| - 1}
    Yp ← g^up                                 Yq ← g^uq
    Pick kp ∈ {0,1}^r                         Pick kq ∈ {0,1}^r
    Hp ← Hash(Up, Yp, kp)                     Hq ← Hash(Uq, Yq, kq)

  1. Up → Uq: Up, Yp, Hp
  2. Uq → Up: Uq, Yq, Hq
  3. Up → Uq: kp              (Uq verifies Hp)
  4. Uq → Up: kq              (Up verifies Hq)

  Up computes Dp ← Digest(kp ⊕ kq, Yp||Yq); Uq computes Dq ← Digest(kp ⊕ kq, Yp||Yq)
  5. [Sent over 'authentic' channel]: Dp or Dq; Up checks Dq == Dp, Uq checks Dp == Dq
The first four messages are sent over high-bandwidth open channels – essentially a bidirectional radio link. To begin, the initiator entity Up would generate randomly a Diffie-Hellman private exponent up , calculate the corresponding Diffie-Hellman exponential Yp , generate randomly a long r-bit key kp , and hash its identifier Up together with Yp and kp to output Hp . The initiator sends the values Up , Yp and Hp to the responder entity Uq . The responder Uq would generate its own set of Diffie-Hellman private exponent, Diffie-Hellman exponential and key. It then sends its identifier Uq together with the Diffie-Hellman exponential Yq and hash output Hq to entity Up . On receipt of Uq ’s message, Up would send its key kp to Uq . Next, Uq uses the received key and hashes it with the received Up and Yp values, to verify that he can derive the same Hp value as he has received. If the verification is successful, Uq would in turn send its key kq to Up . On receipt of this message, Up would similarly verify that he can derive a value of Hq equal to that which he has received earlier. The two sets of verifications so far, if successful, only show that within each set, the received hash
output is consistent with the received key. By themselves, these do not verify that the messages are received from the actual desired party (and not from some active man-in-the-middle attacker). That role falls upon the final message and the accompanying verification. At this point, both parties compute the digest output of the two concatenated Diffie-Hellman exponentials, using as the digest key the XOR of the two keys kp and kq, hence Digest(kp ⊕ kq, Yp||Yq). The digest output is transmitted as the short fifth and final message over a lower-bandwidth human-mediated channel, which is 'unspoofable', possessing the property of (data-origin) authenticity. An example is a human-visual channel, i.e., one human operator (or both) may read a displayed digest output Dp (or Dq) off a device's screen and ascertain that the data string is equal to the one displayed on the other device's screen, Dq (or Dp). If the strings match, the two human operators would be assured that both parties hold matching key contributions with high probability, and they may press buttons on their devices to allow the session to proceed, and to calculate a shared secret key. This key would be the Diffie-Hellman shared key, g^(up·uq) (or be derived from that, combined perhaps with other inputs such as the entities' identifiers, using a mutually known key derivation function).

Security against Passive Adversaries

Against a passive-only eavesdropping adversary, this protocol is secure to the extent that the Computational Diffie-Hellman Problem (CDHP) is hard in the chosen group, i.e., given g^up and g^uq, find g^(up·uq). As G would be chosen to have a large group order, the CDHP is not expected to be computationally tractable; hence the protocol is secure against passive adversaries.

Security against Active Adversaries

Against an active attacker (i.e., one which can intercept and modify messages on the radio channel, but who still cannot do more than eavesdrop on the channel defined as authentic), the protocol is secure to the extent that the hash outputs and the keys (kp and kq) are long, and that the digest function fulfils certain properties. Firstly, the purpose of a key and the corresponding hash (e.g., kp with Hp) is to ensure a commitment to the submitted Diffie-Hellman exponential, as well as making it difficult for the adversary to guess the key when given the hash and the exponential. Secondly, if the adversary inserts his own set of corresponding key and hash output (derived from a Diffie-Hellman exponential whose exponent he knows) to replace one legitimate party's messages (say Up's), he hopes to ensure that the resulting digest, Digest(k′p ⊕ kq, Y′p||Yq), would still match the legitimate digest's value, even though the digest key and the digest input message have been modified – in effect the adversary has only a 'one-shot' guess. Informally speaking, the probability of two digest outputs matching should be negligibly low when their inputs and keys differ arbitrarily. Hence the protocol is secure against active adversaries.
Proof Sketch. Two cases can be deduced: the attacker attempting to modify Up's message(s) could try to (a) reverse the hash output Hp so as to recover the actual kp, or (b) ignore Hp and kp and just use his own k′p and corresponding Y′p and H′p. For (a), the complexity of recovering kp or finding a second preimage would be approximately exponential in the length of kp or the length of the hash output, respectively, whichever is shorter – this attack is considered computationally intractable if the bit-lengths are large (e.g., ≫ 100 bits). For (b), the attack is difficult if the probability Pr(Digest(k′p ⊕ kq, Y′p||Yq) = Digest(kp ⊕ kq, Yp||Yq)) is negligible. (Some bit-lengths for such a string to be exchanged over manual channels have been analyzed and suggested by Gehrmann et al. [8]. We may select a digest length of, say, 32 bits, to keep the probability of a match of outputs from arbitrary digest inputs negligibly low.) The construction of the digest function, to be described further in the following sections, ensures that the probability of such an attack succeeding is negligibly low. The hash function used in the protocol is allowed to be any standard collision-resistant cryptographic hash function (such as SHA-1 or SHA-256), and we focus on just the digest function.
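To make the message flow of Protocol 1 concrete, the sketch below simulates the five passes between Up and Uq on honest inputs. It is an illustration under loud assumptions: the group is a toy modular group far too small for real use, Hash is SHA-256, and digest32 is a keyed truncated hash standing in for the digest function constructed in the following sections (not the authors' construction).

```python
# Simulation of Protocol 1 on honest inputs; toy parameters only.
import hashlib, secrets

p, g = 2**127 - 1, 3           # assumed toy group (a real run needs a large group)
r = 128                        # bit length of the committed keys

def Hash(*parts):
    h = hashlib.sha256()
    for x in parts:
        h.update(repr(x).encode())
    return h.hexdigest()

def digest32(key, msg):
    return Hash(key, msg)[:8]  # 8 hex chars = 32 bits (stand-in only)

# Passes 1 and 2: commitments over the open radio channel.
up, uq = secrets.randbelow(p), secrets.randbelow(p)
Yp, Yq = pow(g, up, p), pow(g, uq, p)
kp, kq = secrets.randbits(r), secrets.randbits(r)
Hp, Hq = Hash("Up", Yp, kp), Hash("Uq", Yq, kq)

# Passes 3 and 4: keys revealed, commitments verified.
assert Hash("Up", Yp, kp) == Hp and Hash("Uq", Yq, kq) == Hq

# Pass 5: short strings compared over the authentic channel.
Dp = digest32(kp ^ kq, (Yp, Yq))
Dq = digest32(kp ^ kq, (Yp, Yq))
assert Dp == Dq                # the human operators compare these

shared = pow(Yq, up, p)        # g^(up*uq), same on both sides
assert shared == pow(Yp, uq, p)
```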
2 An Idealized Short Digest
The following definition specifies the desirable properties that a digest function should possess.

Definition 1. Let A, B be message spaces where |A| = 2^n and |B| = 2^b. Let K be a key space where |K| = 2^r. A short digest is a keyed function digest : K × A → B such that:

1. When m ∈ A is fixed, digest(k, m) is uniformly distributed, i.e., $\Pr_{k \in K}(\mathrm{digest}(k, m) = \beta) = 1/2^b$ for any $\beta \in B$.
2. When θ ∈ K and m ≠ m′ ∈ A are fixed, $\Pr_{k \in K}(\mathrm{digest}(k, m) = \mathrm{digest}(k \oplus \theta, m')) = 1/2^b$.

In [14], the authors defined the following message digest:

$$\mathrm{digest}(k, m) = R_k \cdot m = \begin{pmatrix} R_{1,1}(k) & \cdots & R_{1,n}(k) \\ \vdots & \ddots & \vdots \\ R_{b,1}(k) & \cdots & R_{b,n}(k) \end{pmatrix} \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix} = \begin{pmatrix} d_1 \\ \vdots \\ d_b \end{pmatrix},$$

where m = (m_1, ..., m_n) ∈ GF(2)^n and R_k is a b × n matrix with entries R_{i,j}(k), which are independent Boolean-valued random variables based on k ∈ GF(2)^r.
Although this is not defined explicitly in [14], we interpret it to mean: even if we know R_{i′,j′}(k) for all (i′, j′) ≠ (i₀, j₀), we still have no information on the output of R_{i₀,j₀}(k), i.e.

$$\Pr_{k \in K}\bigl(R_{i_0,j_0}(k) = 0 \mid R_{i',j'}(k)\ \forall (i',j') \neq (i_0,j_0)\bigr) = \frac{1}{2}. \qquad (1)$$
However, as remarked in [14], for the entries R_{i,j}(k) to be truly independent, k needs to have far more bits than it actually does. In this case, Nguyen and Roscoe call it an idealized digest, which may not be realizable in practice. They proved the following result on constructing idealized digest functions based on linear functions of k.

Proposition 1. (Nguyen-Roscoe [14, Theorem 3]) The idealized digest function digest(k, m) = R_k · m defined above satisfies the specifications of a digest function in Definition 1, provided that each entry R_{i,j}(k) is a linear function of k for all i, j.

Remark 1. Proposition 1 is valid only for the idealized case when the R_{i,j}(k) are independent Boolean functions. In fact, when the R_{i,j}(k) are linear functions, the independence assumption in equation (1) is equivalent to the condition that the sum of any subset of the entries R_{i,j}(k) is balanced, i.e., a non-zero linear function. In the next sections, we give a more concrete characterization and construction which relies on a weaker assumption.
3 A Non-idealized Characterization of Short Digest
In this section, we give a characterization of a short digest based on an R_k whose entries are (not necessarily independent) linear functions of k. We shall need the following well-known fact on vectorial Boolean functions:

Proposition 2. [3] A Boolean function F : GF(2)^r → GF(2)^b is balanced, i.e., uniformly distributed, if and only if any non-zero linear combination of output bits v · F(k), 0 ≠ v ∈ GF(2)^b, is balanced.

Suppose we construct digest(k, m) = R_k · m and let all the entries of R_k be linear functions. The first condition of Definition 1 translates to:

$$\mathrm{digest}(k, m) = \left( \bigoplus_{j=1}^{n} R_{1,j}(k)\,m_j,\ \ldots,\ \bigoplus_{j=1}^{n} R_{b,j}(k)\,m_j \right)$$

is uniformly distributed. By Proposition 2, this means any linear combination of the outputs of the above vectorial Boolean function is balanced, i.e., for any non-zero v ∈ GF(2)^b,

$$\bigoplus_{i=1}^{b} v_i \left( \bigoplus_{j=1}^{n} R_{i,j}(k)\,m_j \right)$$
is balanced. Since the above sum is a linear function in k, we just require it to be a non-zero function. To satisfy the second condition, we need the equation

$$R_k \cdot m = R_{k \oplus \theta} \cdot m' = R_k \cdot m' \oplus R_\theta \cdot m'$$

to have 2^{r−b} roots in k for each θ ∈ GF(2)^r and m ≠ m′ ∈ GF(2)^n. We can write the condition as: R_k · m″ = θ′ has exactly 2^{r−b} roots in k for all θ′ ∈ GF(2)^r and m″ ≠ 0 ∈ GF(2)^n, where m″ = m ⊕ m′ and θ′ = R_θ · m′. This is automatically satisfied when condition 1 of Definition 1 is satisfied. Let us summarize our discussion in the following theorem.

Theorem 1. (Non-idealized version of Proposition 1) The function digest(k, m) = R_k · m satisfies the specifications of a digest function in Definition 1, provided that each entry R_{i,j}(k) is a linear function of k and $\bigoplus_{i=1}^{b} v_i \left( \bigoplus_{j=1}^{n} R_{i,j}(k)\,m_j \right)$ is not the zero function for any non-zero v and m, i.e., the entries of every submatrix of R_k do not sum to the zero function.

Remark 2. By Remark 1, we see that our construction is more general than that of Proposition 1, because we only require that no submatrix of R_k sums to the zero function, whereas Proposition 1 requires that no subset of the entries of R_k sums to the zero function.
4 Instantiation of a Non-idealized Short Digest Based on Finite Fields
In this section, we provide a concrete construction of a digest function based on Theorem 1. We shall identify the entries of R_k with the elements of the finite field GF(2^r), via a fixed choice of basis S of GF(2^r) over GF(2). That is, let {α_1, α_2, ..., α_r} be a basis of GF(2^r); then the correspondence is given by:

$$x = x_1 \alpha_1 + x_2 \alpha_2 + \cdots + x_r \alpha_r \ \leftrightarrow\ x_1 k_1 \oplus x_2 k_2 \oplus \cdots \oplus x_r k_r = l(k), \qquad (2)$$

where x_i ∈ GF(2). Thus every element x ∈ GF(2^r) corresponds to a linear function l : GF(2)^r → GF(2). Pick any two linearly independent sets {v_1, v_2, ..., v_b} and {w_1, w_2, ..., w_n} of the field GF(2^r) over GF(2). Let the matrix R_k be defined by letting entry R_{i,j}(k) be (the linear function corresponding to) v_i w_j ∈ GF(2^r), where the product refers to multiplication in GF(2^r). For any submatrix of R_k obtained from rows i_1, ..., i_p and columns j_1, ..., j_q, the sum of the entries is:

$$\bigoplus_{\alpha=1}^{p} \bigoplus_{\beta=1}^{q} v_{i_\alpha} w_{j_\beta} = \left( \bigoplus_{\alpha=1}^{p} v_{i_\alpha} \right) \cdot \left( \bigoplus_{\beta=1}^{q} w_{j_\beta} \right).$$
Since GF(2^r) is a field, this is non-zero if both $\bigoplus_\alpha v_{i_\alpha}$ and $\bigoplus_\beta w_{j_\beta}$ are non-zero. On the other hand, we picked our v_i's and w_j's to be linearly independent, so both values are indeed non-zero. This shows that our matrix R_k satisfies the desired condition: every submatrix sums up to a non-zero value. Let us summarize our discussion in the following theorem.

Theorem 2. Let b < n ≤ r and let {v_1, v_2, ..., v_b}, {w_1, w_2, ..., w_n} be linearly independent sets in GF(2^r). Define the matrix R_k by letting each entry R_{i,j}(k) be the linear function corresponding to the element v_i w_j ∈ GF(2^r), by equation (2). Then R_k : GF(2)^n → GF(2)^b is a digest function.

4.1 A Concrete Example of a Non-idealized Digest
Let us demonstrate Theorem 2 for the case where b = 3, n = 4 and r = 4. Let us take the field GF(16) := GF(2)[t]/⟨t⁴ + t + 1⟩. To construct the matrix, we pick the following linearly independent sets:

(v_i) = (1 + t, 1 + t², 1 + t³), (w_j) = (1 + t, t + t², t² + t³, t³).

The 3 × 4 matrix is then given by:

$$\begin{pmatrix} 1 + t^2 & t + t^3 & 1 + t + t^2 & 1 + t + t^3 \\ 1 + t + t^2 + t^3 & 1 + t^2 + t^3 & 1 + t^3 & t + t^2 + t^3 \\ t^3 & 1 + t & t + t^2 & t^2 \end{pmatrix}.$$

When written in terms of the key k = (k_0, k_1, k_2, k_3), R_k is:

$$\begin{pmatrix} k_0 \oplus k_2 & k_1 \oplus k_3 & k_0 \oplus k_1 \oplus k_2 & k_0 \oplus k_1 \oplus k_3 \\ k_0 \oplus k_1 \oplus k_2 \oplus k_3 & k_0 \oplus k_2 \oplus k_3 & k_0 \oplus k_3 & k_1 \oplus k_2 \oplus k_3 \\ k_3 & k_0 \oplus k_1 & k_1 \oplus k_2 & k_2 \end{pmatrix}.$$

It can easily be checked that every submatrix indeed sums up to a non-zero value. Moreover, it is easy to see that the entries are not independent Boolean functions, e.g. R_{1,1}(k) ⊕ R_{3,1}(k) ⊕ R_{2,2}(k) = (k_0 ⊕ k_2) ⊕ k_3 ⊕ (k_0 ⊕ k_2 ⊕ k_3) = 0. Thus our construction is more general than that of Nguyen-Roscoe in Proposition 1, because we do not need independence of the entries of R_k.
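As a sanity check, this example is small enough to verify exhaustively. The sketch below rebuilds the 3 × 4 matrix with carry-free polynomial multiplication modulo t⁴ + t + 1 and confirms that every submatrix of R_k XORs to a non-zero field element, as Theorem 1 requires; the bitmask encoding of field elements is a convention chosen for this sketch.

```python
# Exhaustive check of the 3x4 example over GF(16). Elements are 4-bit
# masks (bit i = coefficient of t^i).
from functools import reduce
from itertools import combinations

MOD = 0b10011  # t^4 + t + 1

def gf16_mul(a, b):
    """Carry-free multiply in GF(2)[t], then reduce modulo t^4 + t + 1."""
    res = 0
    for i in range(4):
        if (b >> i) & 1:
            res ^= a << i
    for i in range(6, 3, -1):           # reduce degrees 6..4
        if (res >> i) & 1:
            res ^= MOD << (i - 4)
    return res

v = [0b0011, 0b0101, 0b1001]            # 1+t, 1+t^2, 1+t^3
w = [0b0011, 0b0110, 0b1100, 0b1000]    # 1+t, t+t^2, t^2+t^3, t^3
R = [[gf16_mul(vi, wj) for wj in w] for vi in v]

ok = all(
    reduce(lambda a, b: a ^ b, (R[i][j] for i in rows for j in cols)) != 0
    for m in range(1, 4) for rows in combinations(range(3), m)
    for n in range(1, 5) for cols in combinations(range(4), n)
)
print(ok)  # expected: True, i.e. Theorem 1's condition holds
```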
5 An Efficient Short Digest Construction for Constrained Environment
When we implement the short digest from Theorem 2 in a constrained environment, we would like to minimize the number of XOR gates used. One way to achieve this is to define GF(2^r) by a trinomial:

GF(2^r) := GF(2)[t]/⟨t^r + t^s + 1⟩,
and to construct the matrix by Theorem 2, where we pick the following linearly independent sets:

(v_i) = (1, t, t², ..., t^{b−1}), (w_j) = (1, t, t², ..., t^{n−1}).

Then the entries of the matrix R_k will be of the form (with 0-based indices):

R_{i,j}(k) = k_{i+j}, if i + j < r;
R_{i,j}(k) = k_{i+j−r} ⊕ k_{i+j−r+s}, if i + j ≥ r.

Thus the matrix R_k will consist of entries with 0 or 1 XOR gate. Let us summarize our discussion in the following theorem:

Theorem 3. Let b < n ≤ r and let GF(2^r) be defined by a trinomial, GF(2)[t]/⟨t^r + t^s + 1⟩. Define the matrix R_k by letting each entry R_{i,j}(k) be the linear function corresponding to the element t^{i+j} ∈ GF(2^r), by equation (2). Then R_k : GF(2)^n → GF(2)^b defines a digest function where each entry of R_k is a linear function with 0 or 1 XOR gate.

A good resource for finding irreducible trinomials over GF(2) is given in [5].

5.1 Concrete Examples of Optimal Short Digest
Let us demonstrate Theorem 3 for the case where b = 3, n = 4 and r = 4. Let us take the field GF(16) := GF(2)[t]/⟨t⁴ + t + 1⟩. The 3 × 4 matrix is then given by:

$$\begin{pmatrix} 1 & t & t^2 & t^3 \\ t & t^2 & t^3 & 1 + t \\ t^2 & t^3 & 1 + t & t + t^2 \end{pmatrix}.$$

When written in terms of the key k = (k_0, k_1, k_2, k_3), R_k is:

$$\begin{pmatrix} k_0 & k_1 & k_2 & k_3 \\ k_1 & k_2 & k_3 & k_0 \oplus k_1 \\ k_2 & k_3 & k_0 \oplus k_1 & k_1 \oplus k_2 \end{pmatrix}.$$

We can see that there are only 3 XOR gates in the matrix R_k. In comparison, the matrix R_k for the short digest presented in Section 4.1 uses 16 XORs. In practical applications, the numbers may not be that small. Consider Protocol 1 in this paper, where we assume secure key agreement is computed over a 163-bit elliptic curve (as recommended by NIST) and the digest is computed for the message Yp||Yq, where Yp and Yq are 163-bit strings corresponding to points on the curve. A reasonable choice for the output size b of the digest function is
32 bits, and the input size is n = 163 × 2 = 326. Since n ≤ r in Theorem 3, we choose r = 327 because it is the smallest number larger than n = 326 for which an irreducible trinomial exists. Note that this means 327 bits are transmitted in passes 3 and 4 of Protocol 1; this number of bits is not a problem, since the earlier passes have accommodated more bits. We use the finite field:

GF(2³²⁷) := GF(2)[t]/⟨t³²⁷ + t³⁴ + 1⟩.

In that case, the matrix R_k is:

$$\begin{pmatrix}
k_0 & k_1 & \cdots & k_{323} & k_{324} & k_{325} \\
k_1 & k_2 & \cdots & k_{324} & k_{325} & k_{326} \\
k_2 & k_3 & \cdots & k_{325} & k_{326} & k_0 \oplus k_{34} \\
k_3 & k_4 & \cdots & k_{326} & k_0 \oplus k_{34} & k_1 \oplus k_{35} \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
k_{31} & k_{32} & \cdots & k_{27} \oplus k_{61} & k_{28} \oplus k_{62} & k_{29} \oplus k_{63}
\end{pmatrix},$$

and it will consist of 1 + 2 + ··· + 30 = 465 entries with 1 XOR gate and 32 × 326 − 465 = 9967 entries with 0 XOR gates. To make the implementation even more compact, we can implement the matrix R_k with 1 XOR gate, a 32-bit register S and a 327-bit linear feedback shift register (LFSR) with one tap point, as follows:

1. Fill up a 327-bit LFSR, defined by the feedback relation s_{327+i} = s_{34+i} ⊕ s_i, with the key (k_0, ..., k_{326}). Initialize a 32-bit register S to 0.
2. Take the dot product of the leftmost 326 bits of the LFSR (denoted by LFSR[0, 1, ..., 325]) and the message Yp||Yq in Protocol 1. Update S by XORing this dot product into S and shifting left by 1, i.e., S := (S ⊕ (LFSR[0, 1, ..., 325] · Yp||Yq)) ≪ 1.
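The register-level description translates directly into software. The sketch below models the procedure with Python integers standing in for the LFSR and the 32-bit register S; the clocking loop and the final read-out of S are our assumptions, since the remaining steps of the procedure do not appear in this copy.

```python
# Software model of the LFSR-based digest (r = 327, s = 34, b = 32,
# n = 326). Python ints serve as bit vectors: bit i of `state` is s_i.

R_BITS, TAP, B_OUT, N_IN = 327, 34, 32, 326

def lfsr_digest(key, msg):
    """key: 327-bit int (k_0..k_326); msg: 326-bit int (Yp || Yq)."""
    state, S = key, 0
    for _ in range(B_OUT):
        window = state & ((1 << N_IN) - 1)         # LFSR[0, 1, ..., 325]
        dot = bin(window & msg).count("1") & 1     # GF(2) dot product
        S = ((S ^ dot) << 1) & ((1 << B_OUT) - 1)  # step 2: XOR into S, shift
        fb = ((state >> TAP) ^ state) & 1          # s_{327} = s_{34} xor s_0
        state = (state >> 1) | (fb << (R_BITS - 1))
    return S
```

Each pass of the loop produces one row of R_k on the fly, so the storage cost is just the LFSR and S, matching the hardware description above.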
P (Ci )P (X|Ci ) . P (X)
(3)
The most probable page category visited at the start of the sequence C1 = c1 is then given by c1 = arg max P (C1 = c)P (X|C1 = c). (4) c
This fixes the start state of the Markov chain. The subsequent states can be found by combining the predictions of the Bayes classifier (Equation 3) and the Markov model. According to the Markovian property, for a given visit session X the posterior probability of page category Ci visited in position i (i = 2, . . . , N ) depends only on Ci−1 and can be expressed as P M (Ci |Ci−1 , X) =
P (X|Ci , Ci−1 )P (Ci |Ci−1 ) . P (X|Ci−1 )
(5)
The page category visited at position i (i = 2, . . . , N ) is then given by ci = arg max P B (Ci = c|X)P M (Ci = c|ci−1 , X). c
(6)
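Read together, Equations (4) and (6) define a greedy two-step decoder: Equation (4) picks the start category with the Bayes classifier, and Equation (6) extends the sequence one position at a time by multiplying the Bayes score with the Markov transition score. A minimal sketch, assuming the probability tables have already been estimated (Section 3.5) and using illustrative dictionary names:

```python
# Greedy decoder for Equations (4) and (6); all names are illustrative.

def predict_first_n(x, categories, N, p_c, p_x_given_c, p_trans, p_x_given_cc):
    # p_c[c]                     ~ P(C = c)
    # p_x_given_c[(x, c)]        ~ P(X | C = c)
    # p_trans[(c, prev)]         ~ P(C_i = c | C_{i-1} = prev)
    # p_x_given_cc[(x, c, prev)] ~ P(X | C_i = c, C_{i-1} = prev)
    c1 = max(categories,                      # Eq. (4): start state
             key=lambda c: p_c.get(c, 0.0) * p_x_given_c.get((x, c), 0.0))
    seq = [c1]
    for _ in range(1, N):
        prev = seq[-1]
        def score(c):                         # Eq. (6): Bayes x Markov;
            bayes = p_c.get(c, 0.0) * p_x_given_c.get((x, c), 0.0)
            markov = (p_x_given_cc.get((x, c, prev), 0.0)
                      * p_trans.get((c, prev), 0.0))
            return bayes * markov             # shared denominators dropped
        seq.append(max(categories, key=score))
    return seq
```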
Equation (6) is based on the assumption that the predictions of the Bayes and Markov models are independent. Notice that in evaluating Equation (6), we do not need to estimate the probabilities in the denominators of Equations (3) and (5). If a visit session X is described by user ID U and timestamp T, the naïve Bayes assumption can be invoked to simplify the expressions, as shown for pattern 1 above.

3.3 Pattern 3: Range of Number of Page Views Per Page Category Visited

This pattern captures the range of the number of page views for the first N page categories visited in a visit session. It provides information regarding the interests of users in specific page categories. We divide the number of page visits into three ranges: 1 page view corresponds to the first range, 2 to 3 page views correspond to the second range, and more than 3 page views correspond to the third range. The page categories c_i (i = 1, 2, ..., N) visited have been determined as part of pattern 2. We use a Bayes classifier to predict the range R_i = r_i of page views made at position i (i = 1, 2, ..., N) in visit session X as

$$r_i = \arg\max_r P(R_i = r \mid C_i = c_i)\,P(X \mid R_i = r, C_i = c_i), \qquad (7)$$

where the page category c_i is the one predicted as part of pattern 2.

3.4 Pattern 4: Rank of Page Categories in Visit Sessions

Ranking of the page categories visited in the first N positions is another key pattern. As a specific case, this pattern enables prediction of the most probable (popular) page category visited given a user ID, timestamp, or both. Pattern 4 is different from pattern 2 in that it disregards the order of occurrence of page categories in the sequence. Page categories can be ranked by ordering the posterior probability of a page category C = c observed in the first N positions given X, i.e., P(C = c|X). This probability can be calculated by applying the Bayes rule as

$$P(C = c \mid X) = \frac{P(X \mid C = c)\,P(C = c)}{P(X)}. \qquad (8)$$

Since the denominator is the same for all page categories, it can be dropped from the equation when using it for ordering purposes. The most probable category c observed in the first N positions is then given by

$$c = \arg\max_c P(X \mid C = c)\,P(C = c). \qquad (9)$$

As discussed for pattern 1, the naïve Bayes assumption can be invoked to replace P(X|C = c) with P(U|C = c)P(T|C = c). Similarly, if only T or U is known, then the corresponding probability term drops out.

3.5 Estimating the Probabilities

The Bayesian models presented in the previous sections are learned by estimating the various probabilities on the right-hand sides of Equations (1) to (9). These probabilities are estimated from the historical training data by maximum likelihood estimation.
Since all variables are observed in the training data, the maximum likelihood estimates are equivalent to frequencies in the data. Specifically, the probability estimate of P(X = x|Y = y) is given by

$$P(X = x \mid Y = y) \approx \frac{\text{no. of examples with } X = x,\ Y = y}{\text{no. of examples with } Y = y}. \qquad (10)$$

For an unconditional probability, the denominator is the total number of examples in the training data. To estimate the transition probabilities in Equation (5), we count an example if it contains the given transition at any position of the visit session.
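Equation (10) amounts to frequency counting over the training sessions, which is why the models can be learned in a single pass. A minimal sketch, assuming each training example is a (user, timeslot, category sequence) tuple; the dictionary layout is our own:

```python
from collections import Counter

def estimate(train):
    """train: iterable of (user, timeslot, [c1, c2, ...]) visit sessions.
    Returns a lookup function for the conditional estimates of Eq. (10)."""
    joint, marginal = Counter(), Counter()
    for user, tslot, cats in train:
        c1 = cats[0]
        joint[("U", user, c1)] += 1            # counts for P(U | C1 = c1)
        joint[("T", tslot, c1)] += 1           # counts for P(T | C1 = c1)
        marginal[("C1", c1)] += 1
        # An example is counted once per distinct transition it contains.
        for prev, cur in set(zip(cats, cats[1:])):
            joint[("trans", cur, prev)] += 1
            marginal[("prev", prev)] += 1
    def p(kind, value, given):
        key = ("prev", given) if kind == "trans" else ("C1", given)
        return joint[(kind, value, given)] / marginal[key] if marginal[key] else 0.0
    return p
```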
4 Evaluation Setup

We evaluate the effectiveness and efficiency of our models for Web navigation patterns on real navigation data. The evaluations are performed on a desktop PC with an Intel 2.4 GHz Pentium 4 processor and 512 MB of memory. The implementation is done in Java using the Eclipse development environment. The subsequent sections describe the data and the evaluation criteria.

4.1 Data and Its Characteristics

We use the data provided by the 2007 ECML/PKDD Discovery Challenge [10]. The data were collected by Gemius SA, an Internet market research agency in Central and Eastern Europe, over a period of 4 weeks through the use of scripts placed in the code of the monitored Web pages. Web users were identified using cookie technology. The first 3 weeks of data are used for training (learning the models) while the last week of data is reserved for testing (evaluating the predictions of the learned models). The data records are individual visit sessions described by the fields: path id, user id, timestamp, {category id, no. of page views}, ... . An example visit session is shown below:

path id  user id  timestamp   path = (category id, no. of page views)...
27       1        1169814548  7,1 3,2 17,9 3,1 ...
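Each record thus splits into three fixed fields followed by a variable-length path. A minimal parser sketch (our own, matching the layout above):

```python
def parse_session(line):
    """'27 1 1169814548 7,1 3,2 ...' -> (path_id, user_id, ts, [(cat, views), ...])"""
    toks = line.split()
    path_id, user_id, ts = int(toks[0]), int(toks[1]), int(toks[2])
    path = [tuple(map(int, t.split(","))) for t in toks[3:]]
    return path_id, user_id, ts, path

print(parse_session("27 1 1169814548 7,1 3,2 17,9 3,1"))
# (27, 1, 1169814548, [(7, 1), (3, 2), (17, 9), (3, 1)])
```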
The timestamp field records the time at which a visit session starts, and the category ID field identifies a group of Web pages with a similar theme, such as entertainment, technology, or news. There are 20 page categories in the data. The entire data set contains 545,784 visit sessions, of which the first 379,485 are used for training and the remaining 166,299 are used for testing. There are 4,882 distinct users in the data. An analysis of the training and test data reveals a non-uniform data distribution. The minimum and maximum numbers of visits by a user in the training data are 7 and 497, respectively, with an average of 77.7 visits per user. The minimum and maximum numbers of visits by a user in the test data are 1 and 215, respectively.
[Fig. 1. Non-uniform distribution of categories [11] — probability (y-axis, 0–0.7) of each category (x-axis, 0–25) in the training and test data]
Similarly, the distribution of page categories is uneven: some categories are visited more frequently than others. This is evident from Figure 1, which shows the probability of the categories in the training and test data. About 73% of the visit sessions in the training and test data are short, i.e., visits where only one category is visited. These statistics confirm that the data distributions of the test and training sets are similar.

4.2 Evaluation Criteria

The performance of our models is determined by computing a percent score on the test data for each pattern after learning from the training data. We learn and predict the first 3 positions of the sequences only; that is, N = 3 for patterns 2, 3, and 4. We define three values for the range variable as R = {(1 page view), (2–3 page views), (> 3 page views)}. Pattern 1 represents a two-class classification problem. The classification accuracy, defined as the ratio of correct to total classifications, is used to evaluate this problem. Pattern 2 represents a sequence prediction problem. We evaluate this by computing a percent score. The score is the sum of the scores of each position prediction, where each position prediction score is defined as follows. The position prediction score is the sum of the weights assigned to the N predicted categories. If the first, second, and third categories are predicted correctly, then weights 5, 4, and 3, respectively, are assigned to these positions. If a prediction is incorrect for the category in the first position, then it is assigned a weight of 4 if that category occurs in the second position, 3 if it occurs in the third position, 2 if it occurs in the fourth position, 1 if it occurs in position five or beyond, and zero if it does not occur at all. The weight assigned cannot be greater than the maximum possible for that position (e.g., the weight assigned to position 2 cannot be greater than 4). The percent score is obtained by dividing the score by the maximum possible score that can be achieved on the test data through a perfect classification.
Notice that this score definition penalizes predictions that are incorrect, and the amount of penalty depends on the correct position. Pattern 3 is also evaluated by computing a percent score. This score computation is identical to that for pattern 2, except that the weights are incremented by one if the predicted range of page views is correct; otherwise they are not incremented. Pattern 4 represents a ranking problem. We evaluate this by calculating two scores: the percent score of the top ranked category and the percent score of the top 3 ranked categories. The score calculation is the same as defined for pattern 2 above. For all patterns, higher evaluation values signify better performance. The maximum possible score for each problem is also given in our results.

5 Results and Discussion

Using the evaluation setup described in the previous section, we present prediction results for patterns 1, 2, 3, and 4 under four settings: (1) considering both the user ID and timestamp (X = {U, T}), (2) considering only the user ID (U), (3) considering only the timestamp (T), and (4) considering no input (i.e., unconstrained or global pattern G). We discretize the timestamp field into four values: weekday-day, weekday-night, weekend-day, and weekend-night. Daytime starts at 8 AM and ends at 6 PM. We tried several discretizations for the timestamp but present results for the above-defined discretization only. The results for patterns 1, 2, 3, and 4 for all four settings are given in Table 1.

Table 1. Prediction performance of our models (in percent score and as a score ratio). X = User ID + Timestamp; U = User ID only; T = Timestamp only; G = no input or global.

Setting  Pattern 1               Pattern 2                Pattern 3                Pattern 4 Top Ranked    Pattern 4 Top 3 Ranked
X        76.6% (127383/166299)   83.17% (902849/1085494)  72.42% (957235/1321706)  79.86% (664009/831495)  81.56% (885319/1085494)
U        76.64% (127457/166299)  83.21% (903270/1085494)  72.54% (958780/1321706)  80.57% (669904/831495)  83.1% (902010/1085494)
T        73.31% (121919/166299)  64.32% (698199/1085494)  54.17% (716012/1321706)  52.08% (433002/831495)  66.3% (719568/1085494)
G        73.31% (121919/166299)  64.32% (698199/1085494)  54.17% (716012/1321706)  52.08% (433002/831495)  66.3% (719568/1085494)

For pattern 1 (short and long visit sessions), the highest prediction accuracy of 76.64% is obtained when only the user ID is used as input. When both user ID and timestamp are used, the accuracy drops slightly to 76.6%. Thus, the knowledge of the visit sessions' timestamps actually degrades prediction performance. The prediction accuracy obtained when only the timestamp (T) is used is equal to that obtained when no information is given regarding the visit sessions (the G or global setting). This shows that the global pattern is identical to the timestamp-conditioned pattern. We also learn this pattern using a support vector machine (SVM) [12]. The highest prediction accuracy given by the SVM (using a linear kernel and empirically tuned parameters) is 76.68%, when both user ID and timestamp are used. Although this accuracy is slightly better than that reported
by our model, our approach is significantly more efficient. On our hardware setup, our approach takes less than 1 minute to learn from the training data and classify the test data. In contrast, the SVM takes several hours to learn. For pattern 2 (page categories visited in the first 3 positions), the percent scores obtained for the first two settings (X and U) are practically identical at 83.17% and 83.21%, respectively. Here again, the addition of the timestamp information to the user ID does not improve prediction performance. However, these scores are significantly better than those obtained for the timestamp (T) and global (G) settings. These results show that Web navigation behavior, for short page category sequences, is strongly correlated with user behavior. On the other hand, we do not find any correlation between Web navigation behavior and time in our data. On our hardware setup, it takes about 15 and 6 minutes to learn and predict this pattern for the first and second settings, respectively. Similar observations can be made for the learning and prediction of pattern 3 (range of number of page views per page category). The percent score drops slightly from 72.54% when only the user ID (U) is considered to 72.42% when both timestamp and user ID (X) are considered. Similarly, the running time decreases from about 1.5 minutes to about 1 minute from the first to the second setting. We evaluate pattern 4 (rank of page categories in the first 3 positions of visit sessions) in two ways, using the same score definition as for pattern 2. First, we determine the percent score of predicting the top ranked page category (the favorite page category), and second, we compute a percent score based on the top 3 ranked categories (the favorite 3 categories). These results are shown in the rightmost two columns of Table 1. The highest percent scores of 80.57% and 83.1% are obtained when only the user ID is given. Notice that when no information is provided about the visit sessions (the global pattern), this pattern is poorly defined. Thus, just globally ranking the page categories seen in historical data is ineffective. A consistent observation from the results on this data is that knowledge of the start time of visit sessions does not improve prediction performance. In fact, for all four patterns, the prediction performance for the global pattern and that conditioned on the timestamp is identical. We tried several discretizations of the timestamp field, with similar results. It is worth pointing out that our models are generative in nature. As such, we can analyze the patterns by studying the probability distributions that generate the patterns (the probability terms on the right-hand sides of Equations 1 to 9). The results for patterns 1, 2 and 3 are also reported in [13] and [14]. Dembczyński et al. [13] presented trend-prediction and auto-regression methods for predicting pattern 1, and empirical risk minimization for predicting patterns 2 and 3. They claim slightly better results, but our approach is simpler, more efficient, and generative. Lee [14] used a frequent-items approach for predicting patterns 1, 2 and 3. Their results are similar to ours for patterns 1 and 3, while for pattern 2 our results are better.

5.1 Computational Complexity

The time complexity of our models for all four patterns is O(D), where D is the total number of visit sessions in the data. This is because the models are learned in a single pass over the training data, and constant time is required to predict each test example, as all the probabilities have been pre-computed.
The space complexity of our models is defined by the number of probability estimates required. The number of probabilities required is a sum of products of the variables' cardinalities. For example, the space complexity of pattern 2 given X, which is the highest among all patterns, is: (c × p) + (u × c × p) + (c²) + (u × c²) + (t × c × p) + (t × c²), where c is the number of page categories, p is the number of sequence positions, t is the number of timestamp values, and u is the number of distinct users. All of these terms correspond to the probabilities in Equation (3) and Equation (5). In general, u, c, t, and p are much smaller than D, and as D grows, u, c, t, and p remain constant or grow very slowly.
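For concreteness (our own back-of-the-envelope numbers, using the data set of Section 4.1): with c = 20 page categories, p = 3 positions, t = 4 timestamp values, and u = 4,882 users, this sum is 20·3 + 4,882·20·3 + 20² + 4,882·20² + 4·20·3 + 4·20² = 60 + 292,920 + 400 + 1,952,800 + 240 + 1,600 ≈ 2.25 million probability entries, dominated by the u × c² transition table.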
6 Conclusion

In this paper, we present Bayesian models for learning and predicting key Web navigation patterns. Instead of modeling the general problem of Web navigation, we focus on key navigation patterns that have practical value. Furthermore, instead of developing complex models, we present intuitive probabilistic models for learning and prediction. The patterns that we consider are: short and long visit sessions, page categories visited in the first N positions, range of page views per page category, and rank of page categories in the first N positions. We learn and predict these patterns under four settings corresponding to what is known about the visit sessions (user ID and/or timestamp). Our models are accurate and efficient, as demonstrated by evaluating them on 4 weeks of data collected from Web sites in Central and Eastern Europe. In particular, our model for learning and predicting short and long visit sessions has the same prediction accuracy as an SVM but is orders of magnitude faster. We also find that incorporating the start time of visit sessions does not have any practical impact on prediction accuracy. Learning and predicting Web navigation patterns is of immense commercial value. We believe that a direct approach of first identifying key patterns and then building models for these patterns is more likely to be used by business people. As part of future work, we will explore the impact of behavior clustering prior to developing prediction models.
References

1. Huberman, B.A., Pirolli, P.L.T., Pitkow, J.E., Lukose, R.M.: Strong regularities in World Wide Web surfing. Science 280(5360), 95–97 (1998)
2. Srivastava, J., Cooley, R., Deshpande, M., Tan, P.N.: Web usage mining: discovery and applications of usage patterns from web data. SIGKDD Explor. Newsl. 1(2), 12–23 (2000)
3. Borges, J., Levene, M.: Data mining of user navigation patterns. In: Masand, B., Spiliopoulou, M. (eds.) WebKDD 1999. LNCS, vol. 1836, pp. 92–112. Springer, Heidelberg (2000)
4. Manavoglu, E., Pavlov, D., Giles, C.L.: Probabilistic user behavior models. In: ICDM 2003: Proceedings of the Third IEEE International Conference on Data Mining, Washington, DC, USA, p. 203. IEEE Computer Society, Los Alamitos (2003)
5. Deshpande, M., Karypis, G.: Selective Markov models for predicting Web page accesses. ACM Transactions on Internet Technology 4(2), 163–184 (2004)
6. Eirinaki, M., Vazirgiannis, M., Kapogiannis, D.: Web path recommendations based on page ranking and Markov models. In: WIDM 2005: Proceedings of the 7th Annual ACM International Workshop on Web Information and Data Management, pp. 2–9. ACM, New York (2005)
7. Lu, L., Dunham, M., Meng, Y.: Discovery of significant usage patterns from clusters of clickstream data. In: Proceedings of WebKDD (2005)
8. Wu, J., Zhang, P., Xiong, Z., Sheng, H.: Mining personalization interest and navigation patterns on portal. In: Zhou, Z.-H., Li, H., Yang, Q. (eds.) PAKDD 2007. LNCS, vol. 4426, pp. 948–955. Springer, Heidelberg (2007)
9. Awad, M., Khan, L., Thuraisingham, B.: Predicting WWW surfing using multiple evidence combination. The VLDB Journal 17(3), 401–417 (2008)
10. Nguyen, H.S.: ECML/PKDD Discovery Challenge. In: Proceedings of ECML/PKDD: Discovery Challenge (2007), http://www.ecmlpkdd2007.org/challenge
11. Hassan, M.T., Junejo, K.N., Karim, A.: Bayesian inference for web surfer behavior prediction. In: Proceedings of ECML/PKDD: Discovery Challenge (2007), http://www.ecmlpkdd2007.org/challenge
12. Joachims, T.: Making large-scale support vector machine learning practical. In: Advances in Kernel Methods: Support Vector Learning, pp. 169–184. MIT Press, Cambridge (2007)
13. Dembczyński, K., Kotłowski, W., Sydow, M.: Effective prediction of web user behaviour with user-level models. In: Proceedings of ECML/PKDD: Discovery Challenge (2007), http://www.ecmlpkdd2007.org/challenge
14. Lee, T.Y.: Predicting user's behavior by the frequent items. In: Proceedings of ECML/PKDD: Discovery Challenge (2007), http://www.ecmlpkdd2007.org/challenge
A Hybrid IP Forwarding Engine with High Performance and Low Power*

Junghwan Kim**, Myeong-Cheol Ko, Hyun-Kyu Kang, and Jinsoo Kim

Department of Computer Science, Konkuk University, 322 Danwol-dong, Chungju-si, Chungbuk 380-701, Korea
{jhkim,cheol,hkkang,jinsoo}@kku.ac.kr
Abstract. Many IP forwarding engines employ TCAM for lookup operations due to its capability of storing variable-sized network prefixes. Power consumption has been one of the most important issues in TCAM-based engines. In contrast, SRAM operates with low power and latency, but it cannot store variable-sized data directly. While SRAM may need several accesses per lookup, TCAM needs only one access because of its parallel search. In this paper we propose a hybrid IP forwarding engine which elaborately combines TCAM and SRAM to allow both low power and high throughput. It consists of three stages based on different kinds of memories. Each stage may or may not be operated, depending on a given IP address, to maximize performance and save energy. Experimental results show that the proposed engine is at least 7.3 times faster than a normal TCAM-based engine while using only 1.8% of that engine's energy.

Keywords: IP address lookup, power saving, TCAM partitioning, hybrid architecture.
* This work was supported by Konkuk University.
** Corresponding author.

1 Introduction

The IP address lookup operation of a router is to search a forwarding table for the best matching network prefix for a destination IP address. The router can then forward each incoming packet to the next-hop router corresponding to the selected matching prefix. The IP lookup operation became more complex after variable-sized prefixes were extensively employed through CIDR (Classless Interdomain Routing) [1]. In contrast to the classful addressing used in the earlier stage of the Internet, there are possibly several matching prefixes for a single IP address under CIDR. The router has to select the longest matching prefix among the matching prefixes, so the IP lookup operation has a great performance impact on a router. Many researchers have proposed fast IP lookup schemes for high-performance routing [2, 3, 4]. Most of the schemes can be classified into trie-based approaches and TCAM (Ternary Content Addressable Memory)-based approaches. Trie-based IP lookup schemes employ SRAM/DRAM and usually require several memory accesses
per lookup. On the other hand, TCAM-based schemes can perform a lookup operation in a single cycle, because TCAM is a fully associative memory that can store variable-sized prefixes. Therefore, TCAM has received much attention in recent years, although it requires higher latency per single memory access than SRAM.

TCAM has more transistors per cell than SRAM, and it needs to activate a huge number of memory cells for a search. It consumes an enormous amount of power to provide high throughput in IP address lookup. Therefore, the reduction of power consumption has been a critical issue for TCAM-based IP lookup schemes. Energy saving has been accomplished by table compaction or table partitioning in many studies.

The table compaction schemes [5, 6, 7, 8] try to minimize the number of prefix entries by eliminating the redundancy among the prefixes in the table. Table compaction results in a reduction of the power consumption, since fewer entries are activated. The prefix pruning [5] and prefix overlapping [6] techniques eliminate the redundancy among prefixes which have an ancestor-descendant relation and the same next hop. In [7], the redundancy between sibling prefixes with the same next hop has been removed. Many schemes have exploited the mask extension technique [5] and logic minimization algorithms as well.

The table partitioning schemes [8-14] divide a forwarding table into several TCAM blocks and enable only the block required in a lookup operation. As a result, the power consumption per lookup can be reduced. In general, these schemes have to include indexing mechanisms to select the suitable TCAM block containing the matching prefixes. One of the major issues in those schemes is the simplicity and efficiency of the indexing mechanism, which should allow fairly even prefix and traffic distribution over the TCAM blocks.

We propose an SRAM/TCAM hybrid architecture for IP lookup to provide high performance and low power consumption. Our architecture is composed of three stages, and each stage is based on either SRAM or TCAM. The first stage has SRAM containing commonly accessed information, because SRAM has low latency and low power. The second stage handles the great majority of the prefixes, which are contained in TCAM. The TCAM is partitioned into several blocks for power saving and low latency. We propose a novel partitioning scheme which evenly distributes the prefixes over the partitioned blocks. Moreover, our scheme reduces the word size of a TCAM block in the second stage by an elaborate mapping algorithm. This also contributes to power saving and low latency by reducing the number of cells in the TCAM. The third stage also uses TCAM, but its size is not critical because the prefixes it handles are not numerous.

The rest of this paper is organized as follows. Section 2 describes the general structure of partitioned TCAMs. It also explains the background of TCAM partitioning in terms of the indexing mechanism. In Section 3 we propose an IP address lookup architecture, explaining how it achieves high performance and power saving with two main ideas. Section 4 evaluates the performance of our IP lookup scheme through simulation results. We finally conclude this paper in Section 5.
2 Background

This section explains an abstract structure of partitioned TCAMs with an indexing mechanism. We categorize and compare the various indexing mechanisms used in the
previous partitioning schemes. Based on our observation of the previous schemes, we also describe what affects balanced partitioning.

2.1 Partitioned TCAM

Fig. 1 shows an abstract structure shared by most TCAM partitioning schemes for reducing energy consumption. The indexing mechanism selects a few TCAM blocks according to the destination address of an incoming IP packet. A part of the destination address is used for indexing. Since only the selected blocks are enabled, the required energy per lookup can be reduced in the partitioning scheme. For example, if an indexing scheme selects just 'TCAM block 2' in Fig. 1, the total energy consumption can be roughly decreased by a factor of n, provided that every block size is nearly equal.
[Fig. 1. Abstract structure of partitioned TCAMs — the destination address feeds an indexing mechanism that selects among TCAM blocks 1 through n]
2.2 Indexing Scheme

The indexing mechanism is to find a block ID from a given IP address. These mechanisms can be classified into fixed indexing and variable indexing, based on the length of the index. The typical fixed indexing schemes are bit-selection-based [10, 11] and range-based [9, 14]. The bit-selection-based schemes use selected bits of the IP address as an index. In the range-based schemes, a block ID is obtained from a range covering the IP address. These schemes require complicated logic to find which range includes the IP address. The variable indexing mechanism, called the trie-based scheme, uses an index TCAM to contain variable-sized indexes [10, 12, 13]. It can accommodate any indexes without restriction on their length. It may evenly distribute prefixes over the data TCAMs. However, the index TCAM may need to include an excessive number of indexes to achieve even distribution in some cases. Some data prefixes even need to be duplicated and contained in more than one data TCAM.

2.3 Balanced Partitioning

The degree of energy saving highly depends on the effectiveness of the partitioning scheme, which may be evaluated by how evenly it distributes prefixes over the blocks. The smaller the block size achieved by the distribution, the less energy it consumes.
The partitioning can be thought of as a mapping from a chunk, i.e., a set of prefixes sharing the same index, onto a TCAM block. It is easy to balance the TCAM blocks if there are a lot of chunks and many-to-one mapping is utilized. We introduce a fixed indexing scheme which uses selected bits of an IP address as an index. In our scheme the index size is 16 bits and there are 65,536 chunks. It is expected that those chunks can be effectively distributed over 256 blocks using many-to-one mapping. For the many-to-one mapping, SRAM is used as a direct mapping table which is directly addressed by an index. The key idea of our scheme is to construct the direct mapping table so that prefixes are evenly distributed over the TCAM blocks. We will describe a novel mapping algorithm to construct the direct mapping table in the next section.
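As a generic illustration of such a many-to-one direct mapping table (not the authors' algorithm, which follows in the next section), chunks can be assigned greedily, largest first, to the currently lightest block; this classic heuristic already yields fairly balanced TCAM blocks:

```python
import heapq

def greedy_map(chunk_sizes, num_blocks=256):
    """chunk_sizes[i] = number of prefixes sharing 16-bit index i
    (65,536 chunks). Returns a direct mapping table index -> block id."""
    heap = [(0, b) for b in range(num_blocks)]   # (current load, block id)
    heapq.heapify(heap)
    table = [0] * len(chunk_sizes)
    # Largest chunks first, each placed in the currently lightest block.
    for idx in sorted(range(len(chunk_sizes)),
                      key=chunk_sizes.__getitem__, reverse=True):
        load, blk = heapq.heappop(heap)
        table[idx] = blk
        heapq.heappush(heap, (load + chunk_sizes[idx], blk))
    return table
```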
3 Proposed Architecture

The proposed architecture evolves from the abstract structure of Fig. 1. In this section we explain its detailed structure, including the indexing and partitioning schemes.

3.1 Overall Architecture

Our design is based on the principle that frequently accessed prefixes should be searched as fast as possible compared to other prefixes. Also, a small amount of energy should be consumed especially for those prefixes, in order to save overall energy. As a consequence, our architecture pursues low latency and energy saving for a lookup operation in the average case. It is reasonable to assume that larger networks may be accessed more frequently. Some literature [15, 16] remarks that the prefixes whose lengths are shorter than 17 bits tend to be accessed highly frequently. However, the prefixes from 17 to 24 bits long account for more than 90% of the entire set of prefixes, although each of them is not highly frequently accessed. The prefixes longer than 24 bits are a small portion of the whole set of prefixes and are also infrequently accessed. This characteristic of prefixes inspired us to consider three groups of prefixes in our architecture: P0-16, P17-24 and P25-32. Pα-β denotes the set of prefixes such that, for every prefix p in the set, α ≤ leng(p) ≤ β, where leng(p) is the length of prefix p. For example, P17-24 is the set of prefixes such that 17 ≤ leng(p) ≤ 24. Fig. 2 shows an example of prefixes which belong to different groups according to their lengths: p, r ∈ P0-16; q, t ∈ P17-24; and s, u ∈ P25-32.
prefix  leng  group    bit string
p       15    P0-16    1110 0000 1101 110*
q       20    P17-24   1110 0000 1101 1101 0010*
r       16    P0-16    1110 0000 1101 111*
s       27    P25-32   1110 0000 1101 1110 0101 0011 001*
t       18    P17-24   1110 0000 1101 1111 01*
u       26    P25-32   1110 0000 1101 1111 1000 0001 11*

Fig. 2. An example of prefixes (the original figure also shows these prefixes on a binary trie)
Those three prefix groups are stored in different memories and utilized according to their properties. The proposed architecture consists of three stages named S1, S2 and S3, each of which has a different kind of memory, as shown in Fig. 3. Each prefix group is handled by S1, S2 and S3 respectively. Note that it is not required to operate all stages for every lookup. For a given IP address, the engine operates S1 with the upper 16 bits. However, S2 and S3 may or may not be operated according to the result of S1.
Fig. 3. Overall architecture of the proposed IP lookup engine
Note that TCAM can store variable-length data but SRAM cannot. The prefixes of P0-16 are not actually stored in the SRAM in S1, while those of P17-24 and P25-32 are stored in the TCAMs in S2 and S3 respectively. The SRAM just contains the lookup results associated with the prefixes. Each prefix of P0-16 is expanded to 16 bits, and the lookup results are stored in the entries addressed by the 16-bit strings. This technique is known as prefix expansion[17]. For example, a prefix p, 1110 0000 1101 110*, is expanded to two 16-bit strings: 1110 0000 1101 1100 and 1110 0000 1101 1101. The next-hop information of the prefix p is duplicated and stored in the two entries. There are 65,536 entries in the SRAM since its address line is 16 bits wide. Since SRAM is much faster than TCAM and also consumes less energy, it is suitable for the highly accessed prefix group, P0-16. The second prefix group, P17-24, contains a great part of the prefixes, so a TCAM containing all of P17-24 would require huge power as well as long latency per lookup. To address this problem, P17-24 is partitioned into 256 blocks and only one block is enabled for a lookup. The block index is obtained from a matched entry in the SRAM of S1. The third prefix group, P25-32, covers a very small portion of the whole prefixes. We assign a separate TCAM to it, but do not partition it because it is sufficiently small. It consumes little energy and requires short latency per access.
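To make the expansion step concrete, the following is a minimal sketch of how P0-16 prefixes could be expanded into the 65,536-entry SRAM table; the helper name and table representation are our own illustration, not the authors' implementation. For correct longest-prefix matching, shorter prefixes should be expanded before longer ones, so that longer ones overwrite the shared entries.

# A sketch of prefix expansion (assumed helper; the paper gives no code).
def expand_prefix(bits: str, next_hop: str, table: dict) -> None:
    """Expand a prefix of at most 16 bits to every 16-bit string it covers
    and store the next-hop information in the corresponding SRAM entries."""
    free = 16 - len(bits)                 # number of unspecified (wildcard) bits
    base = int(bits, 2) << free           # smallest covered 16-bit value
    for offset in range(1 << free):       # one entry per covered 16-bit string
        table[base + offset] = next_hop

sram = {}
expand_prefix("111000001101110", "nh1", sram)   # prefix p (15 bits) fills 2 entries
assert sram[0b1110000011011100] == "nh1"
assert sram[0b1110000011011101] == "nh1"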
3.2 Key Design 1 – Coupling of Indexing and Lookup

Many studies have used an indexing mechanism as shown in Fig. 1; however, in those architectures every lookup must go through both indexing and an access to a TCAM block. In our architecture a lookup may be finished at the first stage, without access to a TCAM block, for some prefixes. The entry of the SRAM provides the lookup result as well as index information. As shown in Fig. 4, each entry of the SRAM consists of a nexthop, a block index for S2, and flags v2 and v3. Flags v2 and v3 are used to determine whether S2 or S3 should be looked up, respectively. The block index indicates a TCAM block in S2 when v2 = 1, i.e., when S2 needs to be searched. Since there is only one TCAM in S3, no index is needed in case v3 = 1. Now let us consider an example with the prefixes of Fig. 2. The prefix p is expanded to 16 bits and occupies two entries in the SRAM. One of those entries is shared by the prefix q. Note that v2 = 1 in that entry because q is in P17-24. For a given 32-bit IP address, ip1 = 1110 0000 1101 1100 0000 0000 0000 1100, the engine looks up the SRAM with its upper 16 bits. Since both v2 and v3 are 0 in the corresponding entry, S2 and S3 are not searched further, and nh1 is the final lookup result. With another IP address, ip2 = 1110 0000 1101 1101 0000 0000 0000 1100, the result entry shows v2 = 1 and v3 = 0. This means S2 needs to be searched because there are some descendant prefixes. The index of the TCAM block which needs to be enabled is 23, which is stored in the block index field. If there is a match in block 23, then it will be the final result because the longest matched prefix should be presented. Otherwise, nh1 will be the final lookup result. In this example there is no match in that block, that is, q does not match ip2; so the result is nh1, together with the best matching prefix p. The case of v3 = 1 is handled similarly, except that the block index is not used.
associated prefix   SRAM address          v2  v3  nexthop  block index
...                 0000 0000 0000 0000   ..  ..  ...      ...
p                   1110 0000 1101 1100   0   0   nh1      ⎯      (for ip1)
p, q                1110 0000 1101 1101   1   0   nh1      23     (for ip2)
r, s                1110 0000 1101 1110   0   1   nh2      ⎯
r, t, u             1110 0000 1101 1111   1   1   nh2      125
...                 1111 1111 1111 1111   ..  ..  ...      ...

Fig. 4. Structure of SRAM
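A minimal sketch of this lookup flow follows; the entry fields mirror Fig. 4, while the stage interfaces (search methods, entry objects) are assumptions made for illustration. A hit in S3 overrides a hit in S2 here, since prefixes in P25-32 are longer.

# A sketch of the three-stage lookup (entry layout per Fig. 4; interfaces assumed).
def lookup(ip: int, sram, s2_blocks, s3_tcam):
    """Return the next hop for a 32-bit IP address."""
    entry = sram[ip >> 16]          # S1: direct SRAM access with the upper 16 bits
    result = entry.nexthop          # best match so far, from P0-16
    if entry.v2:                    # a longer match may exist in P17-24
        hit = s2_blocks[entry.block_index].search(ip)   # enable only one block
        if hit is not None:
            result = hit
    if entry.v3:                    # a longer match may exist in P25-32
        hit = s3_tcam.search(ip)
        if hit is not None:
            result = hit            # longest matching prefix wins
    return result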
3.3 Key Design 2 – A Novel Mapping Heuristic

Our second key design balances the TCAM block sizes and reduces the TCAM word size in S2. This affects the latency and energy of the TCAM as well as its cost. While many energy-efficient lookup architectures use a 32-bit TCAM word, our architecture is able to reduce the word size to half, i.e., 16 bits. This is achieved by our novel mapping algorithm.
The following definition describes a set of prefixes having a common MSB16, where MSB16 denotes the most significant 16 bits of a given prefix.

Definition 1. A Common MSB16 Set, CMS, is defined as Ck = { p | p ∈ P17-24 and the MSB16 of the prefix p is k }.

The prefix group P17-24 is partitioned into 65,536 CMSs, from C0 to C65,535, according to their MSB16. The goal of the mapping algorithm is to pack the CMSs into 256 TCAM blocks while balancing the sizes of the blocks, i.e., minimizing the size of the largest block. This is equivalent to the scheduling problem of balancing load over parallel identical machines, which is known to be NP-hard[21]. Longest Processing Time (LPT) first is one famous heuristic for this scheduling problem: whenever a machine becomes available, the longest job among the remaining jobs is placed on it. In our context, LPT puts the largest remaining Ck into the currently smallest block in turn. The mapping algorithm based on LPT is described in Fig. 5.
MAP(C0, C1, ..., C65535)
1. Sort C0, C1, ..., C65535 such that |Cϕ(0)| ≥ |Cϕ(1)| ≥ ... ≥ |Cϕ(65535)|, where ϕ is a permutation of {0, 1, …, 65535}
2. for i ← 0 to 65535 do
3.   block_index[ϕ(i)] = (n such that block_size[n] is the smallest)
4.   block_size[n] = block_size[n] + |Cϕ(i)|
5. endfor
6. return block_index

Fig. 5. A mapping algorithm based on LPT
Since the length of a prefix in every CMS is at most 24, any two distinct prefixes can be distinguished by 24 bits in a TCAM block. However, if there is some restriction on the mapping rule, we can reduce the word size to 16 bits.

Restriction 1. For two given distinct CMSs, Ci and Cj, if (i mod 256) = (j mod 256), then those sets cannot be contained in the same TCAM block.

For example, C1 and C257 cannot be in the same TCAM block because 1 = 0000 0000 0000 0001, 257 = 0000 0001 0000 0001, and these are the same in the lower 8 bits. Theorem 1 shows that it is possible to distinguish any two distinct prefixes in a TCAM block by only 16 bits, excluding the upper 8 bits, if restriction 1 is satisfied.

Theorem 1. If any two distinct prefixes p = p0p1…p23*…* and q = q0q1…q23*…*, where pi, qi ∈ {0, 1, *} for 0 ≤ i ≤ 23, are assigned to a TCAM block and restriction 1 is satisfied, then p8…p23 ≠ q8…q23.
Proof. Suppose p8…p23 = q8…q23, i.e., those prefixes cannot be distinguished. If p and q are in the same CMS, then they have the same MSB16. This means pi = qi for all i such that 0 ≤ i ≤ 15. Consequently p and q are wholly identical because we suppose p8…p23 = q8…q23, which contradicts the assumption that they are distinct. If p and q are in different CMSs, say Ci
and Cj, then they have MSB16 i and j respectively, and i ≠ j. Moreover, (i mod 256) = p8…p15 and (j mod 256) = q8…q15. Consequently, (i mod 256) = (j mod 256), since we suppose p8…p23 = q8…q23. This violates restriction 1. □
By simply modifying line 3 in Fig. 5, we reflect restriction 1 and reduce the TCAM word size, as shown in Fig. 6. Bm is the set of index residuals for the m-th TCAM block and is initialized as the empty set.

3-a. k = i mod 256
3-b. block_index[ϕ(i)] = (m such that block_size[m] is the smallest while satisfying k ∉ Bm)
3-c. Bm = Bm ∪ { k }

Fig. 6. Modification of the mapping algorithm
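The following is a minimal sketch of MAP with the modification of Fig. 6; the data layout (cms[k] holding |Ck|) is our assumption. Restriction 1 is always satisfiable here, because each residual class contains exactly 256 CMSs, matching the 256 blocks.

# A sketch of the LPT-based mapping with restriction 1 (Figs. 5 and 6).
def map_cms_to_blocks(cms, num_blocks=256):
    """cms[k] = |Ck| for k = 0..65535; returns block_index[k] in 0..255."""
    block_size = [0] * num_blocks
    residuals = [set() for _ in range(num_blocks)]       # the sets Bm of Fig. 6
    block_index = {}
    # line 1: visit CMS indices in order of decreasing size (the permutation phi)
    for k in sorted(range(len(cms)), key=lambda j: cms[j], reverse=True):
        r = k % 256                                      # line 3-a: index residual
        # line 3-b: smallest block whose residual set does not yet contain r
        m = min((b for b in range(num_blocks) if r not in residuals[b]),
                key=lambda b: block_size[b])
        block_index[k] = m
        block_size[m] += cms[k]                          # line 4
        residuals[m].add(r)                              # line 3-c
    return block_index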
4 Evaluation

4.1 Experiment Environment

We experimented with four forwarding tables which had been in use at different time periods, all from a single site, Route Views[18]. Each table is loaded onto the three stages for the experiment, and its distribution over the stages is shown in the following section. Energy consumption and lookup delay depend on which stages are activated: S1 is always activated for any IP address, whereas S2 and S3 may or may not be activated according to the IP address. The energy and delay time of S1 are calculated using Cacti [19], and a TCAM model and its tool[20] are used for S2 and S3 in our experiment. All results were produced for 90-nm technology.

4.2 Simulation Results

Table 1 shows the prefix distribution over the three stages. The total number of prefixes steadily increased from 2005 to 2008. Most prefixes are located at S2, which accounts for 92.6%–93.9%. The remaining prefixes are either shorter than 17 bits (S1) or longer than 24 bits (S3).

Table 1. Prefix distribution in stages
Stage   July 2005        July 2006        July 2007        July 2008
S1      9787 (5.4%)      10341 (5.1%)     11419 (4.9%)     12241 (4.4%)
S2      169728 (93.4%)   191990 (93.9%)   218930 (93.5%)   255405 (92.6%)
S3      2182 (1.2%)      2099 (1.0%)      3804 (1.6%)      8088 (2.9%)
Total   181697 (100.0%)  204430 (100.0%)  234153 (100.0%)  275734 (100.0%)
The SRAM of S1 contains the expanded prefixes of P0-16 and the mapping information for P17-24 and P25-32. The distribution of the entries is shown in Table 2. Most entries are used for the expanded prefixes of P0-16, accounting for 79.1%–85.0%. This means that most lookups finish at S1 if all entries are evenly accessed. The fact that P0-16 is frequently searched has also been observed in other studies[15, 16]. This implies that our architecture provides not only high performance but also low energy consumption, because S1 is based on SRAM.

Table 2. Distribution of entries in SRAM of S1
v2  v3  Description   July 2005       July 2006       July 2007       July 2008
0   0   S1 only       55730 (85.0%)   54638 (83.4%)   53210 (81.2%)   51857 (79.1%)
1   0   S1,S2 only    9236 (14.1%)    10267 (15.7%)   11597 (17.7%)   12651 (19.3%)
0   1   S1,S3 only    39 (0.1%)       47 (0.1%)       56 (0.1%)       84 (0.1%)
1   1   S1,S2,S3      531 (0.8%)      584 (0.9%)      673 (1.0%)      944 (1.4%)
        Total         65536 (100.0%)  65536 (100.0%)  65536 (100.0%)  65536 (100.0%)
As described in Section 3.2, the proposed mapping algorithm achieves not only a balanced distribution of prefixes but also a reduced width of the TCAM blocks in S2. Table 3 shows how balanced the number of prefixes per TCAM block is. The result is essentially the best possible distribution. This is because there are many CMSs containing only one prefix, so prefixes can be well balanced over the blocks. This enables not only the best energy saving but also short latency, by activating only a small TCAM block.

Table 3. The number of prefixes per TCAM block in S2
Date                     July 2005  July 2006  July 2007  July 2008
Max. Prefixes per Block  663        750        856        998
Min. Prefixes per Block  663        749        855        997
Our novel mapping heuristic also makes it possible to reduce the width of a TCAM block. Fig. 7 shows the effect of the width reduction on energy consumption: it reduces energy by about 21%. As shown in Fig. 8, the search time per lookup is also reduced by about 2.1%–2.7%, just by the width reduction. There is a very small number of prefixes in S3, as shown in Table 1; its energy consumption is 0.114–0.288 nJ according to our experiments, and it takes 3.096–5.902 ns to search the TCAM in S3. In S1, the energy consumption of the SRAM is 0.065 nJ and the access time is 1.552 ns.
Fig. 7. Energy consumption per lookup in S2 (nJ; 16-bit vs. 24-bit width, forwarding tables 2005–2008)

Fig. 8. Search time per lookup in S2 (ns; 16-bit vs. 24-bit width, forwarding tables 2005–2008)
The overall energy consumption and search time, combining the effects of S1, S2 and S3, are evaluated as follows. We assume that packet traffic is proportional to the size of the destination network, so the probability distribution of accesses to the stages can be read directly from Table 2. Table 4 shows the average energy consumption and search time obtained by combining the results of the individual stages according to these probabilities (see the sketch after Table 4). It also compares our architecture with a normal TCAM-based architecture. The proposed architecture consumes less than 1.8% of the energy of the normal TCAM-based architecture, and its search is at least 7.3 times faster.

Table 4. Comparison of proposed architecture and normal TCAM-based architecture
                          July 2005  July 2006  July 2007  July 2008
Proposed Arch.
  Dynamic Energy (nJ)     0.072      0.074      0.077      0.083
  Search Time (ns)        1.833      1.892      2.004      2.144
Normal TCAM-based Arch.
  Dynamic Energy (nJ)     4.005      5.147      6.607      8.211
  Search Time (ns)        15.351     15.618     15.351     15.624
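The combination in Table 4 can be sketched as a probability-weighted sum of per-stage costs. The sketch below uses the v2/v3 probabilities of Table 2 (July 2005) and treats the S2 cost as an illustrative placeholder, since the paper reports per-stage S2 costs only in figures; the additive model itself is our reading of the evaluation, not the authors' exact formula.

# A sketch of combining per-stage costs by access probability (values illustrative).
def expected_cost(p_s2, p_s3, c_s1, c_s2, c_s3):
    """S1 is always accessed; S2 and S3 only when the SRAM entry sets v2/v3."""
    return c_s1 + p_s2 * c_s2 + p_s3 * c_s3

# July 2005 (Table 2): P(v2=1) = 0.141 + 0.008, P(v3=1) = 0.001 + 0.008
energy = expected_cost(p_s2=0.149, p_s3=0.009,
                       c_s1=0.065,   # SRAM energy per access (nJ, from the text)
                       c_s2=0.05,    # illustrative S2 block energy (nJ)
                       c_s3=0.114)   # S3 TCAM energy (nJ, lower bound from the text)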
5 Conclusion

TCAM can provide high throughput in IP address lookup owing to its capability of parallel lookup, but it requires excessively large power consumption. This paper proposed a hybrid IP lookup architecture that provides low power consumption as well as high throughput by employing both SRAM and TCAM. The proposed architecture is composed of three stages which contain prefixes according to their lengths, and each stage employs either SRAM or TCAM. SRAM needs less power and lower latency than TCAM. The first stage employs SRAM, because it is desirable for the frequently accessed prefixes to be looked up early. Our simulation results show that more than 79% of lookups are completed in the first stage. This results in high throughput and energy saving in our IP lookup scheme.
The second stage employs TCAM, which accommodates a large number of variable-sized prefixes. It needs to be partitioned into several blocks for energy saving. Our mapping algorithm evenly distributes prefixes over the TCAM blocks, and the simulation results show that the distribution is nearly optimal. Our mapping algorithm also decreases the word size of a TCAM block by 8 bits, which contributes additional energy saving. In the third stage, a TCAM accommodates the remaining prefixes, which are rarely accessed. We compared our IP lookup architecture with the traditional TCAM-based one in terms of energy consumption and lookup performance by simulation. Experimental results show that the proposed scheme consumes less than 1.8% of the energy of the traditional TCAM-based scheme, and its lookup speed is at least 7.3 times higher.
References

1. Fuller, V., Li, T., Yu, J., Varadhan, K.: Classless Inter-Domain Routing (CIDR): An Address Assignment and Aggregation Strategy. RFC 1519 (1993)
2. Ruiz-Sanchez, M.A., Biersack, E.W., Dabbous, W.: Survey and Taxonomy of IP Address Lookup Algorithms. IEEE Network 15, 8–23 (2001)
3. Chao, H.J., Liu, B.: High Performance Switches and Routers. Wiley Interscience, Hoboken (2007)
4. Varghese, G.: Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices. Morgan Kaufmann, San Francisco (2005)
5. Liu, H.: Routing table compaction in Ternary CAM. IEEE Micro 22, 58–64 (2002)
6. Ravikumar, V.C., Mahapatra, R.N.: TCAM architecture for IP lookup using prefix properties. IEEE Micro 24, 60–69 (2004)
7. Wang, G., Tzeng, N.-F.: TCAM-Based Forwarding Engine with Minimum Independent Prefix Set (MIPS) for Fast Updating. In: 2006 IEEE ICC, pp. 103–109 (2006)
8. Mahini, A., Berangi, R., Fatemeh, S., Firouzabadi, K., Mahini, H.: Low Power TCAM Forwarding Engine for IP Packets. In: MILCOM 2007, pp. 1–7. IEEE, Los Alamitos (2007)
9. Panigrahy, R., Sharma, S.: Reducing TCAM power consumption and increasing throughput. In: 10th Symposium on High Performance Interconnects (HotI 2002), pp. 107–112 (2002)
10. Narlikar, G.J., Basu, A., Zane, F.: CoolCAM: Power-Efficient TCAMs for Forwarding Engines. In: IEEE INFOCOM 2003, pp. 42–52 (2003)
11. Zheng, K., Hu, C., Lu, H., Liu, B.: A TCAM-based distributed parallel IP lookup scheme and performance analysis. IEEE/ACM Trans. Netw. 14, 863–875 (2006)
12. Koçak, T., Basci, F.: A power-efficient TCAM architecture for network forwarding tables. Journal of Systems Architecture 52, 307–314 (2006)
13. Wang, G., Tzeng, N.-F.: Exact Forwarding Table Partitioning for Efficient TCAM Power Savings. In: IEEE NCA 2007, pp. 249–252 (2007)
14. Lin, D., Zhang, Y., Hu, C., Liu, B., Zhang, X., Pao, D.: Route Table Partitioning and Load Balancing for Parallel Searching with TCAMs. In: IPDPS 2007, pp. 1–10 (2007)
15. Nilsson, S., Karlsson, G.: IP-Address Lookup using LC-Tries. IEEE Journal on Selected Areas in Communications 17, 1083–1092 (1999)
16. Jing, F., Hagsand, O., Karlsson, G.: Performance evaluation and cache behavior of LC-trie for IP-address lookup. In: HPSR, pp. 29–35 (2006)
17. Srinivasan, V., Varghese, G.: Fast Address Lookups Using Controlled Prefix Expansion. ACM Trans. Comput. Syst. 17, 1–40 (1999)
18. University of Oregon Route Views Project, http://www.routeviews.org/
19. Shivakumar, P., Jouppi, N.P.: Cacti 3.0: An Integrated Cache Timing, Power and Area Model. Western Research Lab (WRL) Research Report 2001/2 (2001)
20. Agrawal, B., Sherwood, T.: Ternary CAM Power and Delay Model: Extensions and Uses. IEEE Trans. on Very Large Scale Integration (VLSI) Systems 16, 554–564 (2008)
21. Pinedo, M.: Scheduling: Theory, Algorithms, and Systems. Prentice-Hall, Englewood Cliffs (2002)
Learning Styles Diagnosis Based on Learner Behaviors in Web Based Learning

Nilüfer Atman1, Mustafa Murat Inceoğlu2, and Burak Galip Aslan3

1 Ege University, Department of Computer and Instructional Technologies, 35040 Bornova, Izmir, Turkey
[email protected]
2 Ege University, Department of Computer and Instructional Technologies, 35040 Bornova, Izmir, Turkey
[email protected]
3 Izmir Institute of Technology, Department of Computer Engineering, 35430 Gulbahce, Urla, Izmir, Turkey
[email protected]
Abstract. Individuals have different backgrounds, motivation and preferences in their own learning processes. Web-based systems that ignore these differences have difficulty in meeting learners' needs effectively. One of these individual differences is learning style. To provide adaptivity that incorporates learning styles, the learning styles of learners first have to be identified. There are many different learning style models in the literature. This study is based on Felder and Silverman's Learning Styles Model and investigates only the active/reflective and visual/verbal dimensions of this model. Instead of having learners fill out a questionnaire, learner behaviors are analyzed with the help of a literature-based approach so that the learning styles of learners can be detected.

Keywords: Felder and Silverman's Index of Learning Styles, Web based Education.
1 Introduction

Web-based training with all its potential benefits is growing at a tremendous rate; however, most current systems provide a "one-size-fits-all" approach to the delivery of material [1]. The fundamental problem is that learners inevitably have diverse backgrounds, abilities and motivation – and hence highly individual learning requirements [2]. These individual differences affect the learning process and are the reason why some learners find it easy to learn in a particular course, whereas others find the same course difficult [3]. As e-learning environments evolve, learners have become increasingly demanding of personalized learning which allows them to build their own knowledge pathway [4]. Therefore, it is very crucial to provide the different styles of learners with learning environments that are more preferred and more efficient for them [5]. The
objective of this study is to develop a literature-based approach for diagnosing learning styles of learners with the help of behavior and action patterns on the user interface. This approach is also evaluated accordingly.
2 Related Work

The related studies can be summarized under two main headings: studies about learning styles and studies about adaptive systems.

2.1 Learning Styles

There are many models of learning styles in the literature. Coffield et al.[6] identified 71 models of learning styles and categorized 13 of them as major models. The term learning style is defined as the way in which an individual characteristically acquires, retains, and retrieves information [7]. In 1988, Felder and Silverman [8] defined five dimensions: perception (sensing/intuitive), input (visual/auditory), organization (inductive/deductive), processing (active/reflective), and understanding (sequential/global). Later, inductive/deductive was excluded from the model and the visual/auditory dimension was amended to visual/verbal [8].

• Active learners prefer to do something in the outside world – to discuss, comment and test a number of ways – while reflective learners prefer to make observations and work on the manipulation of information.
• Sensing learners prefer events, data and experiments, while intuitive learners prefer principles and theories. Sensing learners solve problems in a causal way and are not quite fond of surprises, while intuitive learners like to explore new things and do not like repetition.
• Visual learners like pictures, diagrams, flow charts and time charts, while verbal learners prefer words and sounds.
• Sequential learners prefer to go step by step; global learners understand better when they see the whole picture [8].
Felder and Soloman developed the Index of Learning Styles (ILS) [9], a 44-item questionnaire for identifying learning styles based on the Felder and Silverman Learning Style Model. There are four dimensions with 11 items each, and learners have different preferences on each dimension. The ILS is a bipolar scale, so there are two possible answers for each item. Each item is scored +1 or -1, and the total score of a dimension ranges between -11 and +11. The advantage of this model is that the ILS represents an individual's learning style as a tendency, with a third option: a learner can be roughly equal in both directions. If the absolute score is between 9 and 11, there is a strong preference in that dimension; similarly, 5–7 indicates a moderate preference, while 1–3 indicates a balanced preference for that dimension.
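As a small illustration of the ILS scoring just described (the function names are ours, not part of the instrument):

# A sketch of ILS dimension scoring and its 3-level interpretation.
def ils_dimension_score(answers):
    """answers: 11 items, each +1 (one pole) or -1 (the other pole)."""
    assert len(answers) == 11
    return sum(answers)             # odd value in {-11, -9, ..., 9, 11}

def preference_strength(score):
    magnitude = abs(score)
    if magnitude >= 9:
        return "strong"             # 9-11
    if magnitude >= 5:
        return "moderate"           # 5-7
    return "balanced"               # 1-3

print(preference_strength(ils_dimension_score([1] * 8 + [-1] * 3)))  # "moderate" (score 5)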
Although there are many learning style models in the literature, such as Kolb, McCarthy and Myers-Briggs, researchers believe that the Felder and Silverman Learning Style Model is the most appropriate model for hypermedia courseware and adaptive web-based learning systems [10], [11]. Furthermore, the Index of Learning Styles is one of the most frequently used instruments, and it is chosen especially because of its applicability to online learning and its relevance to the principles of interactive learning systems design [12].

2.2 Adaptive Systems

Adaptive hypermedia systems (AHS) build a model of the goals, preferences and knowledge of each individual user, and use this model throughout the interaction with the user in order to adapt to the needs of that user [13]. The goal of the various Adaptive Educational Hypermedia (AEH) systems developed in recent years has been to avoid the "one size fits all" mentality that is all too common in the design of web-based learning systems [2]. Although there are many adaptive educational hypermedia systems incorporating learning style models in the literature, such as CS383 [10], MANIC [14], IDEAL [16], MASPLANG [17], AHA! [18] etc., this study emphasizes the systems based on Felder and Silverman's Learning Styles Model.

CS383 [10] provides adaptivity according to the sensing/intuitive, visual/verbal and sequential/global dimensions of the Felder and Silverman Learning Styles Model. Its authors underlined that a hypermedia system supports both the active and reflective poles: it enables learners to make choices and participate in learning, which supports active learning, and gives reflective learners a chance to stop and think. TANGOW [19] provides adaptivity based on the global/sequential and sensing/intuitive dimensions. Students first fill out the ILS, and the results are mapped onto a 3-level scale, i.e., strong intuitive, balanced, or strong sensing learning style. Balanced learners then receive the default version, while adaptivity is provided to the others. The student model is initialized before learner behaviors are monitored; according to the data from learner behaviors, the information in the student model is updated. In LSAS [20], the sequential/global dimension was studied, and adaptivity is provided by two interfaces: small chunks of information are presented to sequential learners, while more navigational freedom is provided to global learners. To assess the effectiveness of the system, the template appropriate to a learner's style is presented first, and the template not appropriate to the style is presented afterwards. Test–retest results pointed out a significant difference between performances.

One of the studies on diagnosing learning styles belongs to Cha et al. [5]. First, the preferences of learners are identified, then the system adapts the interface according to the learner's preferences. The system is based on the Felder and Silverman Learning Styles Model and uses Hidden Markov Models and Decision Trees to extract behavior patterns. The interface is designed with Macromedia Flash, and the monitored events are recorded. Learning styles are detected by analyzing the interface behaviors of learners instead of having them fill out the Index of Learning Styles.
Another study, analyzing the behaviors of learners in a Moodle course, belongs to Graf [21]. This study is also based on the Felder and Silverman learning styles model. Behavior patterns are determined according to frequent activities in the LMS. The study shows that the diverse learning styles of students result in different behaviors in instructional management systems. Garcia et al. [22] used Bayesian networks to detect the learning styles of students in a web-based education system. The information obtained can then be used by an intelligent agent to provide personalized assistance to students, delivering teaching material that best fits the students' learning styles.
3 Methodology

In this study, a literature-based approach is performed to detect the learning styles of learners by analyzing learner behaviors in a web-based education system. The study is based on the active/reflective and visual/verbal dimensions of the Felder and Silverman Learning Styles Model. The topic of "If Clauses" in an ESL course is chosen as the subject of this study. A screenshot of the web-based learning system is given in Fig. 1.
Fig. 1. Screenshot of the web-based course
In this course, there are four main features, namely Introduction, Form, Example and Exercise. In Introduction, there are conversations supported by pictures. Form includes the main aspects of the subject, supported with interactive features. In the Example
section, just like in the Introduction part, there are if-clause sentences supported with pictures. Exercises include interactive features, and finally, in the test, there are self-assessment questions about the subject. In this course, each content feature is labeled as VisualActive, VisualReflective, VerbalActive or VerbalReflective. There are 9 Introduction contents labeled as VisualReflective and 3 Form contents labeled as VisualActive. The 30 Example contents are labeled as VisualReflective, and finally there are 6 Exercise contents labeled as active, of which one kind is VisualActive and the other VerbalActive. These labels are determined in parallel to the dimensions of the Felder and Silverman Learning Styles Model. The Exercise and Form modules have interactive features, while the Introduction and Example modules have reflective features.

3.1 Patterns

To determine the patterns, the literature was first reviewed; with the help of especially Graf's and other studies, the patterns were defined. Table 1 shows the relevant patterns for the active and visual dimensions.

Table 1. Patterns for [active/reflective] and [visual/verbal] dimensions
Active/Reflective Dimension    Visual/Verbal Dimension
Introduction_visit (-)         Introduction_visit (+)
Introduction_stay (-)          Introduction_stay (+)
Form_visit (+)                 Form_visit (+)
Form_stay (+)                  Form_stay (+)
Example_visit (-)              Example_visit (+)
Example_stay (-)               Example_stay (+)
Exercise_visit (+)             Exercise1_visit (-)
Exercise_stay (+)              Exercise1_stay (-)
Test_visit (+)                 Exercise2_visit (+)
Test_stay (-)                  Exercise2_stay (+)
Test_Results_View (-)
Introduction_visit is the number of visits to Introduction content over the whole course, and Introduction_stay is the total time spent on Introduction content. Introduction is labeled as VisualReflective, so active learners are expected to spend less time there, while visual learners are expected to spend more. Another module is the Form module, which has interactive features and is labeled as VisualActive. It contains drag-and-drop activities and gives explanations about the subject. Form_visit is the total number of visits by an individual, and Form_stay is the amount of time the individual stays in the Form contents. In this case, active and visual learners spend more time on the Form module. Just like the Introduction part, the Example part is labeled as VisualReflective, and visual and reflective learners are expected to spend more time there. Example contents include if-clause sentences supported by pictures; Example contents could also be developed as VerbalReflective, including sentences spoken by native speakers. Example_stay is the total time that
a learner spends in Example contents, and Example_visit is the number of visits to Example contents. There are two kinds of exercise modules, VisualActive and VerbalActive. The VerbalActive exercises are true/false exercises with sentences that the learner reads and listens to; the VisualActive exercises are drag-and-drop exercises where the learner chooses from multiple choices. Test_visit is the count of the questions in the test accessed by the learner. According to the Felder and Silverman Learning Style Model, active learners like to do something in the outside world [8], so the number of questions attempted by a learner gives evidence about whether the learner is active or reflective. Test_stay is the duration of the learner's stay in the test. Test_Results_View is a pattern that shows the time spent reviewing the test results. These patterns give evidence of whether the learner is reflective or not, because reflective learners are patient and careful about what they read; they are expected to spend more time in the test and on the test results page. To detect learning styles, thresholds for each pattern must also be defined. Garcia and Graf used thresholds in their studies. In this study, those threshold levels are also used, but some modifications had to be made because of the nature of the course; for example, the Introduction and Form sections of the course are met first by learners, who may visit these features more than others. The thresholds used in this study are given in Table 2.

Table 2. Thresholds for determined patterns
Pattern              Thresholds
Introduction_stay    75% - 125%
Form_stay            75% - 100%
Example_stay         50% - 75%
Exercise_stay        50% - 75%
Test_stay            50% - 75%
Test_results         75% - 100%
Introduction_visit   75% - 100%
Form_visit           50% - 75%
Example_visit        25% - 50%
Exercise_visit       25% - 50%
Test_visit           25% - 75%
These thresholds are used as evaluation criteria for the values coming from learner actions. First, the expected time spent on the Introduction, Form, Example, Exercise and Test modules and the expected visit frequencies are determined. Then the data from learner actions are monitored, and the values are compared using the thresholds. For example, the time expected to be spent on the Introduction pages is determined first, then the time actually spent on the Introduction pages is extracted from the database; the percentage is derived and compared with the thresholds. Introduction is labeled as VisualReflective, so active learners are supposed to spend less time there. There are also nine Introduction contents in this course. If the percentage of the learner's visits to Introduction content, over all Introduction contents, lies between the thresholds, this indicates balanced evidence. If the
values are less than the threshold, there is a weak preference. The Introduction module has reflective features, so if the learner's percentage is less than the threshold, it can be said that there is weak evidence for a reflective and strong evidence for an active learner. Finally, if the value is higher than the threshold, there is strong evidence for a reflective and weak evidence for an active learner. The threshold for Introduction_visit lies between 75% and 100%: the Introduction part is met first by learners, so visiting 75%–100% of the Introduction pages is expected, and visiting more than the threshold gives a hint of a reflective learner. The Form part is labeled as VisualActive and gives the main points of the subject; just as for the Introduction part, the learner's percentages are determined and compared. The thresholds for the Introduction and Form contents are higher than those for the Example, Exercise and Test modules because the former are met first by learners. Example_stay, Exercise_stay and Test_stay, as well as Example_visit, Exercise_visit and Test_visit, have the same thresholds because they are supporting features for the subject. Table 2 shows the thresholds for the determined patterns.

3.2 Method of Evaluation

First, the expected time spent on each page is determined. Then, the time that the learner spends on each module is recorded. The ratio of these values gives the percentage for each user on each pattern. If the percentage shows a strong preference for the respective dimension, a value of 3 is marked; if the percentage lies between the thresholds, 2 is marked; if there is a weak preference, 1 is marked; and if there is no evidence, 0 is marked. The average of the total hints shows the individual's learning style on the respective dimension, and this value ranges between 1 and 3. For scaling the results of the literature-based approach to a range from 0 to 1, values of 0.25 and 0.75 were used as thresholds. These thresholds are based on experiments showing that using the first and last quarters to indicate learning style preferences for one or the other extreme of the respective dimension, and using the second and third quarters to indicate a balanced learning style, achieves better results than dividing the range into 3 parts [20]. The ILS values were also mapped onto a 3-item scale. This evaluation procedure is summarized in Fig. 2.
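A minimal sketch of this procedure follows. The direction of evidence depends on each pattern's label, so the mapping below is a simplified reading of the text, and the pole assigned to each end of the scaled range is likewise our assumption (the 0 "no evidence" case is omitted for brevity).

# A sketch of the hint-based evaluation (simplified; directions are our reading).
def hint_for_active(ratio, low, high, reflective_feature=True):
    """Map a usage ratio to a hint in {1, 2, 3} for the active pole: low time on
    a reflective feature is strong evidence of an active style, and vice versa."""
    if low <= ratio <= high:
        return 2                                  # balanced evidence
    strongly_active = ratio < low if reflective_feature else ratio > high
    return 3 if strongly_active else 1            # strong vs. weak evidence

def predicted_style(hints):
    avg = sum(hints) / len(hints)                 # value in [1, 3], as in the text
    scaled = (avg - 1) / 2                        # rescale to [0, 1]
    if scaled >= 0.75:
        return "active"
    if scaled <= 0.25:
        return "reflective"
    return "balanced"                             # second and third quarters

print(predicted_style([hint_for_active(0.4, 0.75, 1.25),                            # Introduction_stay
                       hint_for_active(0.9, 0.75, 1.0, reflective_feature=False)])) # Form_stay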
The value of LS predicted refers to predicted learning styles; LS ILS is the value of ILS. If predicted learning styles and ILS values is equal, Sim returns 1. If one the values is balanced and the other is a preferred learning style of the two poles of that dimension, function then returns 0.5. Finally if each of two values differs from each other, the function returns 0. This formula is performed for each student and divided to the number of learners (n).
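A direct sketch of this similarity measure (the label encoding is ours):

# A sketch of the Sim function and the resulting precision measure.
def sim(predicted, ils):
    """Both values on the 3-item scale, e.g. 'active' / 'balanced' / 'reflective'."""
    if predicted == ils:
        return 1.0
    if "balanced" in (predicted, ils):
        return 0.5                  # one value balanced, the other a pole
    return 0.0                      # opposite poles

def precision(predicted_styles, ils_styles):
    n = len(ils_styles)
    return sum(sim(p, i) for p, i in zip(predicted_styles, ils_styles)) / n

print(precision(["active", "balanced"], ["active", "reflective"]))   # 0.75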
Fig. 2. Overview of evaluation process of automatic detection
4 Results and Discussion

Graf's study [20] is based on the active/reflective, sensing/intuitive and global/sequential dimensions of the Felder and Silverman Learning Styles Model and used a literature-based approach to detect learning styles. Graf [20] analyzed student behaviors in the Moodle software, using the features of this LMS package. The 'Object Oriented Modeling' course lasted 7 weeks; data from 75 students were analyzed, and a prediction accuracy of 79.33% was found for the active/reflective dimension. On the other hand, Garcia et al. [21], [22] used a data-driven approach for diagnosing learning styles. They conducted two experiments. In the first experiment, data from 30 students were used to train Bayesian networks and data from 10 students were used for testing; in the second experiment, data from 50 students were used to train the Bayesian network and the system was tested with 27 students. The result for the active/reflective dimension is 62.50%, which is lower than the results for the other dimensions. They attributed this to the rare use of communication tools such as chat and forums. In this study, a web-based education system is proposed in which each module is labeled with its corresponding learning style dimension. This makes the analysis process faster and transferable to other dimensions. A literature-based approach is used. Analysis of the behaviors of 17 college students shows a prediction accuracy of 83.15%. These results belong to a pilot study and also include the detection of the visual/verbal dimension; that work is still in progress.
5 Conclusion

This study is based on the Felder and Silverman Learning Styles Model and investigated the active/reflective and visual/verbal dimensions. In this study, learner behaviors in a
web-based course are analyzed, and the learning styles of learners are predicted with the help of a literature-based approach. For this purpose, learner actions are monitored and stored in a database. Five types of content were developed, namely Introduction, Form, Example, Exercise and Test. Each module is labeled as Visual_Active, Visual_Reflective, Verbal_Active or Verbal_Reflective; this makes analyzing actions easier and more effective. Relevant patterns and thresholds were defined, and a literature-based approach was used for analyzing learner behaviors. To evaluate the system's effectiveness, learners filled out the Index of Learning Styles questionnaire at the beginning of the course. The predicted scores and the ILS scores are compared using the formula developed by Garcia et al. [21], [22]. The very first results show revealing outcomes. The goal of this study is to detect learning styles automatically instead of through questionnaires. In this way, students can focus better on the course and do not spend extra time and effort submitting questionnaires. Additionally, course administrators can monitor the learning styles of learners. Also, in some cases real behavior and answers to questionnaires do not overlap; this approach is an alternative way to overcome these problems. Besides, this approach provides the necessary information for adaptive systems and helps automatically update the information in the learner model.
References

1. Liegle, J.O., Janicki, T.N.: The effect of learning styles on the navigation needs of Web-based learners. Computers in Human Behavior 22, 885–898 (2006)
2. Brown, E., Cristea, A., Stewart, C., Brailsford, T.: Patterns in Authoring of Adaptive Educational Hypermedia: A Taxonomy of Learning Styles. Educational Technology & Society 8(3), 77–90 (2005)
3. Jonassen, D.H., Grabowski, B.L.: Handbook of Individual Differences, Learning, and Instruction. Lawrence Erlbaum Associates, Hillsdale (1993)
4. Sun, L., Ousmanou, K., Williams, S.A.: Articulation of Learners Requirements for Personalised Instructional Design in E-Learning Services. In: Liu, W., Shi, Y., Li, Q. (eds.) ICWL 2004. LNCS, vol. 3143, pp. 427–431. Springer, Heidelberg (2004)
5. Cha, H.J., Kim, Y.S., Park, S.H., Yoon, T.B., Jung, Y.M., Lee, J.-H.: Learning Style Diagnosis Based on User Interface Behavior for the Customization of Learning Interfaces in an Intelligent Tutoring System. In: Ikeda, M., Ashley, K.D., Chan, T.-W. (eds.) ITS 2006. LNCS, vol. 4053, pp. 513–524. Springer, Heidelberg (2006)
6. Coffield, F., Moseley, D., Hall, E., Ecclestone, K.: Should We Be Using Learning Styles? What Research Has to Say to Practice. Learning and Skills Research Centre/University of Newcastle upon Tyne, London (2004)
7. Felder, R.M.: Matters of Style. ASEE Prism 6(4), 18–23 (1996)
8. Felder, R., Silverman, L.: Learning and Teaching Styles. Journal of Engineering Education 94(1), 674–681 (1988)
9. Felder, R.M., Soloman, B.A.: Index of Learning Styles Questionnaire (March 2009), http://www.engr.ncsu.edu/learningstyles/ilsweb.html
10. Carver, C.A., Howard, R.A., Lane, W.D.: Addressing Different Learning Styles through Course Hypermedia. IEEE Transactions on Education 42(1), 33–38 (1999)
11. Kuljis, J., Liu, F.: A Comparison of Learning Style Theories on the Suitability for E-learning. In: Hamza, M.H. (ed.) Proceedings of the IASTED Conference on Web Technologies, Applications, and Services, pp. 191–197. ACTA Press (2005)
12. Baldwin, L., Sabry, K.: Learning Styles for Interactive Learning Systems. Innovations in Education and Teaching International 40(4), 325–340 (2003)
13. Brusilovsky, P.: Methods and techniques of adaptive hypermedia. User Modeling and User-Adapted Interaction 6(2-3), 87–129 (1996)
14. Stern, M.K., Steinberg, J., Lee, H.I., Padhye, J., Kurose, J.: MANIC: Multimedia Asynchronous Networked Individualized Courseware. In: Proceedings of the World Conference on Educational Multimedia/Hypermedia and World Conference on Educational Telecommunications (Ed-Media/Ed-Telecom), Calgary, Canada, pp. 1002–1007 (1997)
15. Shang, Y., Shi, H., Chen, S.-S.: An Intelligent Distributed Environment for Active Learning. ACM Journal of Educational Resources in Computing 1(2), 1–17 (2001)
16. Peña, C.-I., Marzo, J.-L., de la Rosa, J.-L.: Intelligent Agents in a Teaching and Learning Environment on the Web. In: Petrushin, V., Kommers, P., Kinshuk, Galeev, I. (eds.) Proceedings of the International Conference on Advanced Learning Technologies, pp. 21–27. IEEE Learning Technology Task Force, Palmerston North (2002)
17. Stash, N., Cristea, A., de Bra, P.: Authoring of Learning Styles in Adaptive Hypermedia: Problems and Solutions. In: Proceedings of the International World Wide Web Conference, pp. 114–123. ACM Press, New York (2004)
18. Carro, R.M., Pulido, E., Rodriguez, P.: TANGOW: A Model for Internet-Based Learning. International Journal of Continuing Engineering Education and Lifelong Learning 11(1/2), 25–34 (2001)
19. Bajraktarevic, N., Hall, W., Fullick, P.: Incorporating Learning Styles in Hypermedia Environment: Empirical Evaluation. In: de Bra, P., Davis, H.C., Kay, J., Schraefel, M. (eds.) Proceedings of the Workshop on Adaptive Hypermedia and Adaptive Web-Based Systems, pp. 41–52. Eindhoven University, Nottingham (2003)
20. Graf, S.: Adaptivity in Learning Management Systems Focusing on Learning Styles. Unpublished Ph.D. Thesis (2007)
21. García, P., Amandi, A., Schiaffino, S., Campo, M.: Using Bayesian Networks to Detect Students' Learning Styles in a Web-Based Education System. In: Proceedings of the Argentine Symposium on Artificial Intelligence, Rosario, Argentina, pp. 115–126 (2005)
22. García, P., Amandi, A., Schiaffino, S., Campo, M.: Evaluating Bayesian Networks' Precision for Detecting Students' Learning Styles. Computers & Education 49(3), 794–808 (2007)
State of the Art in Semantic Focused Crawlers Hai Dong, Farookh Khadeer Hussain, and Elizabeth Chang Digital Ecosystems and Business Intelligence Institute, Curtin University of Technology, GPO Box U1987 Perth, Western Australia 6845, Australia {hai.dong,farookh.hussain,elizabeth.chang}@cbs.curtin.edu.au
Abstract. Nowadays, research on focused crawlers is moving into the field of the semantic web, along with the appearance of increasing numbers of semantic web documents and the rapid development of ontology mark-up languages. Semantic focused crawlers are a series of focused crawlers enhanced by various semantic web technologies. In this paper, we survey this research field. We discover eleven semantic focused crawlers in the existing literature and classify them into three categories – ontology-based focused crawlers, metadata abstraction focused crawlers and other semantic focused crawlers. By means of a multi-dimensional comparison, we summarize the features of these crawlers and outline the overall state of the art of this field.

Keywords: focused crawlers, semantic web, semantic focused crawlers, ontology-based focused crawlers, metadata abstraction focused crawlers.
1 Introduction

The semantic web is a vision for the future of the web in which information is categorized and made comprehensible by various automated tools [11]. The major mission of the semantic web is to "express meaning". This demands that agents execute more intelligent operations on behalf of users [14]. A crawler is an agent which can automatically search and download webpages [4]. Focused (topical) crawlers are a group of distributed crawlers that specialize in certain specific topics [18]; each crawler analyzes its topical boundary when fetching webpages. The semantic web is an extension of the World Wide Web with the purpose of expressing the meaning of information [21] [22]. The technologies involved in the semantic web include Extensible Markup Language (XML) [24], XML Schema [25], Resource Description Framework (RDF) [26], RDF Schema [27], Web Ontology Language (OWL) [28] and the SPARQL Protocol and RDF Query Language (SPARQL) [23], etc. Among them, the first four are employed to annotate web documents in order to convert them into semantic web documents; SPARQL is an RDF-based query language for querying the annotated documents [23]. Recently, focused crawler approaches are increasingly being used in the semantic web, along with the appearance of increasing numbers of semantic web documents and the rapid development of ontology mark-up languages [9] [10]. We define
semantic focused crawlers as a subset of focused crawlers enhanced by various semantic web technologies. In this paper, we carry out a thorough survey of the literature on semantic focused crawlers and provide directions for future research in this area. Eleven semantic focused crawlers are discovered and classified into three groups. According to their features, we make a comprehensive evaluation of these crawlers along six dimensions, and draw conclusions in the final part.
2 Semantic Focused Crawlers

In accordance with their respective characteristics, we categorize the eleven semantic focused crawlers into three classes – ontology-based focused crawlers [19], metadata abstraction focused crawlers [20], and other semantic focused crawlers – which are defined in Table 1. From Section 2.1 to Section 2.3, we introduce the typical examples within these categories.

Table 1. Classification of semantic focused crawlers

Crawler category                        Definition
Ontology-based focused crawlers         The focused crawlers that utilize ontologies to link a crawled web document with ontological concepts (topics), with the purpose of organizing and categorizing web documents, or filtering irrelevant webpages with regard to the topics [19].
Metadata abstraction focused crawlers   The focused crawlers that can abstract and annotate metadata from the fetched web documents, in addition to fetching relevant documents [20].
Other semantic focused crawlers         The focused crawlers that employ semantic web technologies other than ontology-based filtering and metadata abstraction.
2.1 Ontology-Based Focused Crawlers

In the existing literature there are four ontology-based focused crawlers, namely (a) LSCrawler, (b) the Courseware Watchdog crawler, (c) the crawler proposed by Ganesh et al., and (d) the THESUS crawler. In this section, we present an overview of each of these four crawlers.

Yuvarani et al. [16] propose a new generation of focused crawler – LSCrawler – which makes use of ontologies to analyze the semantic similarity between URLs and topics. In the LSCrawler, an ontology base is built to store ontologies. For each query keyword, a Relevant Ontology Extractor searches the ontology base to find the compatible ontology. The matched ontology is then passed to a Crawler Manager. Meanwhile, a Seed Detector sends the keyword to the three most popular search engines and returns the retrieved seed Uniform Resource Locators (URLs) to the URL Buffer of the Crawler Manager. Based on the matched ontology and the retrieved URLs, the Crawler Manager then generates a multi-threaded crawler to fetch webpages from these URLs. Meanwhile, a Busy Server is configured to prevent repeated visits to
URLs which have already been visited. The fetched webpages are then stored in a Document Repository, and the fetched-URL database is updated. Subsequently, a Link Extractor extracts all URLs and their surrounding texts from the fetched webpages and sends them to a Hypertext Analyzer, while the Porter stemmer algorithm is used to remove stop words and extract terms from the texts. The Hypertext Analyzer then removes the URLs found in the fetched-URL database, and the extracted terms are matched with the concepts in the ontology to determine the relevance of webpages to the keyword. Based on the relevance values, the URLs are ranked and then stored in the URL repository for further visits. To evaluate the framework of the proposed LSCrawler, the authors compare the performance of an LSCrawler-based search engine with a full-text-indexed search engine on the benchmark of recall; the results show that the LSCrawler has a 10% advantage on average.

Tane et al. [15] propose a new ontology management system – Courseware Watchdog – one important component of which is an ontology-based focused crawler. By means of the crawler, a user can specify his/her preference by assigning weights to the concepts of an ontology. By means of the interrelations between concepts within the ontology, the weights of the other concepts can be calculated. Once a webpage is fetched, its text and URL descriptions are matched with the weighted ontological concepts. Thus, the weights of the webpage and its URLs are measured, ranked and clustered according to the concepts. In addition, the webpage relations can be viewed by linking the webpages to the ontology concepts that appear in them.

Ganesh et al. [6] propose a group of metrics with the purpose of optimizing the order of visited URLs for web crawlers. Three metrics are involved: a combination importance metric, an association metric and an ordering metric. First of all, given a webpage p, the combination importance metric CI(p) can be computed as shown below:
CI(p) = a1·IB(p) + a2·IL(p) + a3·IF(p) + a4·IR(p)    (1)
where a1, a2, a3 and a4 are constants, IB(p) is the number of inbound links to webpage p, IL(p) is the location weight of webpage p, IF(p) is the number of outbound links from webpage p, and IR(p) is the PageRank [30, 31] weight of webpage p, which can be computed by (2) shown below:

IR(p) = (1 − d) + d · Σ_{pi ∈ IB(p)} IR(pi) / IF(pi)    (2)
where d is a damping factor. For each URL ui in the webpage p, two association metrics, AS(ui) and AS(p), respectively evaluate the semantic relevance of the URL ui and the semantic relevance of webpage p based on a reference domain-specific ontology. These association metrics can analyze the link strength between parent and child webpages after the latter are downloaded, in order to refine themselves.
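The following is a minimal sketch of metrics (1) and (2); the page-graph representation, the constants, and the fixed-point iteration count are assumptions made for illustration, not part of the cited work.

# A sketch of the combination importance metric with an iterated IR (eqs. 1 and 2).
def combination_importance(pages, a=(0.25, 0.25, 0.25, 0.25), d=0.85, iters=20):
    """pages: {p: {'in': set_of_parents, 'out': set_of_children, 'loc': float}}.
    Returns CI(p) for every page."""
    ir = {p: 1.0 for p in pages}
    for _ in range(iters):                       # iterate eq. (2) toward a fixed point
        ir = {p: (1 - d) + d * sum(ir[q] / max(len(pages[q]['out']), 1)
                                   for q in pages[p]['in'])
              for p in pages}
    return {p: a[0] * len(pages[p]['in']) + a[1] * pages[p]['loc']
               + a[2] * len(pages[p]['out']) + a[3] * ir[p]
            for p in pages}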
Finally, the URLs are ranked according to an ordering metric O(u), which can be mathematically expressed as:

O(u) = b1·CI′(p) + b2·AS(u) + b3·Σ_{pi ∈ P(u)} AS(pi)    (3)
where b1, b2 and b3 are constants, CI′(p) is the combination importance metric that evaluates the downloaded webpage p, and P(u) is a function that returns all the parent pages of a URL u.

THESUS aims to organize online documents by linking their URLs to hierarchical ontology concepts, which are seen as thematic subsets [8]. A web crawler is used in the document acquisition component of the system. The working mechanism of this crawler is as follows: first, the crawler extracts the URLs and their descriptive texts from the initial set of documents; then the descriptive text of a URL is matched with one of the ontological concepts, and the URL is linked to that concept. A threshold on the maximum number of recursions or the maximum number of documents is set in order to ensure that the process does not run endlessly. For a web document dj, the crawler extracts a set of terms ki (i = 1…n) with respective weights ni from the descriptive texts of the URLs pointing to the document. The document dj can then be seen as the following set: {URL, ki, ni} (i = 1…n). The similarity sim_{ct,ki} between a concept ct and the terms ki (i = 1…n) is computed as shown below:
sim_{ct,ki} = ( Σ_{L(ki,ct)} ni·st ) / ( Σ_{L(ki,ct)} ni )    (4)
where L(ki, ct) is a function that returns all available couplings between the terms ki (i = 1…n) and the concept ct, and st is the indication-based weight of each concept ct. Therefore, the web document dj can then be seen as the following set: {URL, ct, sim_{ct,ki}} (t = 1…m), which can be utilized for the subsequent ontology-based document clustering. To evaluate the crawler's framework, the authors compare its clustering efficiency with a keyword-based clustering approach on the benchmarks of F-measure, Rand statistic, preprocessing time and average clustering time. The results indicate that the THESUS crawler has a 0.12 advantage on F-measure and a 0.05 advantage on Rand statistic, is over 40 times faster in preprocessing time, and is 0.4 s faster in average clustering time than the latter.
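A minimal sketch of eq. (4) follows; the coupling representation (a list of (ni, st) pairs returned by L(ki, ct)) is an assumption made for illustration.

# A sketch of the THESUS concept similarity (eq. 4).
def concept_similarity(couplings):
    """couplings: [(n_i, s_t), ...] for all couplings L(ki, ct) between the
    document's terms and the concept ct. Returns sim_{ct,ki}."""
    numerator = sum(n * s for n, s in couplings)
    denominator = sum(n for n, _ in couplings)
    return numerator / denominator if denominator else 0.0

print(concept_similarity([(3, 0.8), (1, 0.5)]))   # (2.4 + 0.5) / 4 = 0.725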
2.2 Metadata Abstraction Focused Crawlers

In the existing literature there are two metadata abstraction focused crawlers, namely the Vertical Portal crawler and the CiteSeer crawler. In this section, we present an overview of these crawlers.

Francesconi and Peruginelli [5] propose Vertical Portal, with the purpose of providing both resources and available solutions and services to satisfy users' requirements within the legal domain. In the system, a metadata abstraction focused crawler is designed by the authors to fetch the domain-specific web documents. A metadata generator then automatically transforms the web documents into
metadata by means of extraction. The focused crawler is implemented by computing the probabilities that URLs are similar to predefined topics. The metadata format is in accordance with the Dublin Core (DC) scheme in its XML version. Then, for the purpose of document clustering, each document d can be represented as a vector of weights (w1, …, wn), in which each weight can be one of the following three types:

• a binary weight δ(w, d) that indicates the presence/absence of a term w in a document;
• a term frequency (tf) weight tf(w, d) that indicates the frequency of a term w appearing in a document;
• a term frequency–inverse document frequency (tfidf) weight tfidf(w, d), which can be computed as shown below:
aij = (freq / maxfreq) · log(N / n)    (5)
where freq is the frequency of term w appearing in a document, maxfreq is the total number of terms appearing in the document, N is the total number of documents, and n is the number of documents in which term w appears. In the next stage, two document classification algorithms – Naive Bayes (NB) and Multiclass Support Vector Machines (MSVM) – are adopted for evaluation purposes; the results show that the latter has a 2.6% advantage over the former in accuracy.

Giles et al. [7] propose a niche search engine for retrieving e-business information, with the integration of the CiteSeer technique. A set of crawling strategies, including brute-force, Inquirus-based and focused crawlers, is used to fetch web documents. The CiteSeer technique is used to parse citations from the downloaded documents and then create metadata based on the documents. To enhance the quality of the metadata, the Support Vector Machine (SVM) algorithm is chosen to extract metadata, in comparison with the Hidden Markov Model (HMM) algorithm. Based on a small training set of words, the SVM model shows better performance than the HMM in accuracy.
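To make the three weighting schemes of this section concrete, here is a minimal sketch; the naive tokenization (documents as term lists) is an assumption for illustration.

# A sketch of the tfidf weight (eq. 5) over a toy corpus.
import math

def tfidf(term, doc_terms, corpus):
    freq = doc_terms.count(term)                 # occurrences of the term
    maxfreq = len(doc_terms)                     # total terms in the document
    N = len(corpus)                              # total number of documents
    n = sum(1 for d in corpus if term in d)      # documents containing the term
    if freq == 0 or n == 0:
        return 0.0
    return (freq / maxfreq) * math.log(N / n)    # eq. (5)

corpus = [["legal", "portal"], ["legal", "case", "law"], ["sports"]]
print(tfidf("legal", corpus[0], corpus))         # (1/2) * log(3/2)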
The following focused crawlers all have their own unique features that differentiate them from the previously discussed ontology-based crawlers and metadata abstraction crawlers, and therefore they cannot be grouped with either category. From the literature, there are five such crawlers, namely the Lokman crawler, the crawler proposed by Liu et al., Web Spider, the Digital Library crawler and BioCrawler. Can and Baykal [2] propose a medical search engine – MedicoPort – which employs a topical web crawler – Lokman. Lokman is responsible for collecting medical information while limiting the scope of the linked URLs. By means of the concepts from the Unified Medical Language System (UMLS) [29], Lokman can identify the links relevant to the medical domain. For each fetched document, a
Document Parser extracts the links from it. For each fetched webpage, its relevance value to the UMLS concepts is estimated based on the concept frequencies, the concept weights, and the relevance values of the contained URLs. The URL relevance values are evaluated by a Link Estimator, based on the relevance between the texts within the URLs and the UMLS concepts. A URL Frontier then determines the order of the URL queue based on these relevance values, and Lokman fetches the URLs in the queue. The performance of Lokman is tested by comparing two re-evaluation algorithms – IncrementValues, which takes the sum of the link relevance values of a link as its link value, and GetGreater, which takes the maximum value as the link value. Two situations, with direct links included in and excluded from the seed URLs, are tested with the two re-evaluation algorithms. In comparison with a simple best-search crawler, Lokman shows significant improvement in both situations.
Liu et al. [12] propose a learned user-model-based approach that assists focused crawlers in predicting relevant links based on users' preferences. Three components are involved in their architecture: User Modeling, Pattern Learning, and Focused Crawling. In User Modeling, the system observes the sequence of user-visited pages with regard to a specific topic. To analyze the user's browsing pattern, a web graph is drawn whose nodes represent the user-visited webpages and whose edges represent the links among the webpages. In addition, nodes are highlighted when the user regards them as relevant. In Pattern Learning, the Latent Semantic Indexing (LSI) model is adopted to cluster the documents into several groups, and to reveal the topic of each cluster and the relationships between the identified topics. Meanwhile, an Expectation-Maximization (EM) algorithm is used to optimize the clusters. A Hidden Markov Model (HMM) is then used to estimate the likelihood of the topics directly or indirectly leading to a target topic. The mathematical representation of the HMM [32] is as follows: let S = (Tn-1…T0) be the n hidden states of reaching a target, and O = (o1…om) the m visible states, associated with two conditional probability distributions P(sj|si) and P(o|sj). Then the Initial Probability Distribution Matrix is P = {P(T0)… P(Tn-1)}; the Transition Probabilities Matrix is A = [aij]n×n, where aij is the probability of being in state Tj at time t+1 given that the observer is in state Ti at time t; and the Emission Probabilities Matrix is B = [bij]n×m, where bij is the probability of seeing cluster j if the observer is in state Ti. The probabilities are estimated by maximum likelihood as ratios of counts, as follows:
$$ a_{ij} = \frac{|L_{ij}|}{\sum_{k=0}^{n-1} |L_{kj}|} \qquad (6) $$

$$ b_{ij} = \frac{|N_{ij}|}{\sum_{k=1}^{m} |N_{kj}|} \qquad (7) $$

where $L_{ij} = \{v \in T_i, w \in T_j : (v, w) \in E\}$ and $N_{ij} = \{C_i : C_i \in T_j\}$.
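The count-ratio estimates in (6) and (7) amount to normalizing count matrices by column sums. A small Python sketch, with hypothetical count matrices standing in for |L_ij| and |N_ij|:

```python
def estimate_hmm_params(link_counts, cluster_counts):
    """Maximum-likelihood count-ratio estimates in the spirit of
    eqs. (6)-(7): each entry is a raw count divided by a column sum.
    link_counts[i][j]    -- |L_ij|, web-graph edges from topic T_i to T_j
    cluster_counts[i][j] -- |N_ij|, clusters C_i observed in state T_j
    """
    def normalize(counts):
        n_rows, n_cols = len(counts), len(counts[0])
        col_sums = [sum(counts[k][j] for k in range(n_rows))
                    for j in range(n_cols)]
        return [[counts[i][j] / col_sums[j] if col_sums[j] else 0.0
                 for j in range(n_cols)] for i in range(n_rows)]

    return normalize(link_counts), normalize(cluster_counts)

A, B = estimate_hmm_params([[4, 1], [2, 3]], [[5, 0], [1, 4]])
print(A, B)  # transition and emission probability estimates
```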
In the focused crawler, the HMM is used to find the most likely state sequence ending in state s at time t+1, given the observed webpage sequence, as shown below:

$$ \delta(s, t+1) = \max_{s'} \delta(s', t) \, P(s \mid s') \, P(o_{t+1} \mid s) \qquad (8) $$

where δ(s′, t) is the maximum probability over all state sequences ending at state s′ at time t, and P(s|s′) and P(o_{t+1}|s) are the transition and emission probabilities.
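Recursion (8) is the familiar Viterbi update. A compact sketch, with hypothetical transition and emission dictionaries:

```python
def viterbi_step(delta_prev, states, trans, emit, obs):
    """One application of eq. (8): for each state s, maximize over
    predecessor states s' and fold in the emission of the new observation.
    delta_prev -- dict state -> max sequence probability at time t
    trans      -- trans[(s_prev, s)] = P(s | s_prev)
    emit       -- emit[(s, obs)]    = P(obs | s)
    """
    return {
        s: max(delta_prev[sp] * trans[(sp, s)] for sp in states)
           * emit[(s, obs)]
        for s in states
    }

states = ["T0", "T1"]
trans = {("T0", "T0"): 0.7, ("T0", "T1"): 0.3,
         ("T1", "T0"): 0.4, ("T1", "T1"): 0.6}
emit = {("T0", "c1"): 0.5, ("T1", "c1"): 0.2}
print(viterbi_step({"T0": 0.6, "T1": 0.4}, states, trans, emit, "c1"))
```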
In Focused Crawling, the focused crawler downloads the page linked to the first URL in its URL queue and computes the page's reduced LSI representation. It then downloads all the children pages and clusters them by means of the K-Nearest Neighbor algorithm to obtain the corresponding Visit Priority Value based on the learned HMM. In comparison with a Best-First search crawler, the crawler shows a significant advantage in precision [13].
Cesarano et al. [3] propose an agent-based semantic search engine. In their methodology, the query keywords are sent to a traditional search engine and the retrieved URLs are returned. One component of the search engine – Web Spider – can download all pages identified by the URLs and then visit all children pages pointed to by those URLs, which traditional search engines cannot reach. The web spider uses a Web Catcher, which follows links to visit web pages. The web pages are then stored in a Web Repository, and the unvisited links parsed from the web pages are visited in the next round. The whole crawling procedure stops when a predefined depth parameter is reached. A Document Postprocessor then extracts the useful information from each downloaded page, including the title, content and description; a Miner Agent ranks these pages according to the similarities between the pages' information and a user-predefined search context. The tool used for computing similarity values is the group of ontologies stored in a Semantic Knowledge Base, which has weighted relations between concepts. The similarity between concepts ci and cj is obtained by (9) shown below:

$$ d(c_i, c_j) = \max_{i \in (1 \ldots n)} \left( \prod_{j=1}^{m_i} P_{ij} \right) \qquad (9) $$
where n is the number of existing paths between ci and cj, mi is the number of edges in path i, and Pij is the weight on each edge. The semantic relevance of a webpage is considered to be a function of single-word concepts and is computed through the following steps:

• The title, content and description of a webpage are extracted as a sequence of concepts.
• The Normalized Probabilistic Distance between each pair of concepts is computed by (10):

$$ NPD(c_i, c_j) = \frac{d(c_i, c_j)}{DIST(c_i, c_j)} \qquad (10) $$
  where DIST(ci, cj) is the distance between the words representing concepts ci and cj.
• The Semantic Grade of a webpage is computed by (11):

$$ SeG = \sum_{h=1}^{NC} \sum_{k=h+1}^{NC} NPD(C_{ik}, C_{ih}) \qquad (11) $$
  where NC is the number of concepts appearing in the webpage.
• The Semantic Grade is then normalized by (12):

$$ NSeG_i = \sum_{h=1}^{NC} \sum_{k=h+1}^{NC} \frac{2 \cdot NPD(C_{ik}, C_{ih})}{NC^2 + NC} \qquad (12) $$
• Finally, the Normalized Semantic Grade of a webpage is computed as follows:

$$ NSeG = \sum_{i \in (t, c, k, d)} \rho_i \cdot NSeG_i \qquad (13) $$

where t is the title of the webpage, c is the content of the webpage, k is the keywords of the webpage, d is the description of the webpage, and ρt + ρc + ρk + ρd = 1.
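The final score in (13) is simply a convex combination of per-part grades. A small illustrative sketch (the weights and grades are made-up values, not from the paper):

```python
def normalized_semantic_grade(part_grades, weights):
    """Eq. (13): weighted sum of normalized semantic grades over the
    page parts t (title), c (content), k (keywords), d (description).
    The rho weights must sum to 1.
    """
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(weights[p] * part_grades[p] for p in ("t", "c", "k", "d"))

grades = {"t": 0.9, "c": 0.4, "k": 0.7, "d": 0.6}   # hypothetical NSeG_i
rhos = {"t": 0.4, "c": 0.3, "k": 0.2, "d": 0.1}     # hypothetical rho_i
print(normalized_semantic_grade(grades, rhos))
```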
Zhuang et al. [17] propose to use publication metadata to guide focused crawlers in collecting missing information in digital libraries. The whole procedure is as follows: when a user sends a request for retrieving the publications of a specific venue, a Homepage Aggregator queries a public metadata repository and returns Metadata Heuristics that help a focused crawler locate the authors' homepages; it also returns a list of URLs to a Homepage URL Database. The focused crawler then fetches the publications by means of the seed URLs and stores them in a Document Database.
Batzios et al. [1] propose a visionary crawler – BioCrawler – working in the semantic web environment. BioCrawler extends the focused crawler into a group of distributed crawlers over the web, each seen as an entity with "vision, moving and communication abilities" and an up-to-date knowledge model for browsing web content. Vision is the scope of domains that one BioCrawler can visit, in the form of webpage link vectors; thus, each BioCrawler's movement is controlled by its vision. A Rule Manager agent is configured to determine the best rule (route) upon a crawler's request, based on the strength parameter of each available route plan. The knowledge model mechanism in BioCrawler is composed of a classifier that stores the information regarding the rules, and a classifier evaluator that calculates the amount of semantic content grabbed by following the rules, also called the rules' strength. In order to evaluate the framework, BioCrawler is compared with a dumb crawler on the benchmark of crawler energy, defined as the amount of webpages crawled per unit of bandwidth. Two experiments are implemented, which respectively compare the crawler energy of the two crawlers over 30,000 websites visited and their average crawler energy over 100 random restarts. Both experimental results show that BioCrawler outperforms the dumb crawler.
3 Comparison of the Semantic Focused Crawlers

In the following sections, we make a comprehensive comparison of the introduced semantic focused crawlers by category. Based on their typical features, we choose the seven dimensions below for comparison: domain of application (e.g., business, medicine, etc.), working environment (e.g., the Google crawler for Google, the Yahoo crawler for Yahoo, etc.), special functions, technologies utilized, evaluation methods, evaluation results, and finally our comments or suggestions on the crawlers.

3.1 Comparison of the Ontology-Based Focused Crawlers
We observe that most of the ontology-based focused crawlers are designed for general domains. Some of them are encapsulated in larger systems, while others are designed as standalone tools. Ontology is mostly used to match the fetched URLs or webpages with the predefined topics (ontological concepts), by computing similarity values between the ontological concepts and the fetched URLs or webpages. In addition, the weights of some ontological concepts can be defined by users to express their preferences. While most of the crawlers do not provide evaluation methods and results, those that do use precision and recall as the primary metrics to measure performance. As a whole, the ontology-based focused crawlers show clear progress compared with full-text crawlers. We suggest that the crawlers' designers provide more technical details regarding their evaluation process and results, in order to consolidate confidence in the crawlers. The detailed comparison results of the ontology-based crawlers are shown in Table 2 (see appendix).

3.2 Comparison of the Metadata Abstraction Focused Crawlers
From this comparison it is found that the metadata abstraction crawlers mostly work in specific domains and are encapsulated in more comprehensive systems. Due to the specialized nature of the documents fetched, they need to convert the domain-specific documents into more meaningful metadata. Various technologies are utilized for document classification and metadata abstraction. While some crawlers do not provide their evaluation details and results, from the existing evidence of this preliminary survey we can still observe their promising performance. It is suggested that the authors disclose their evaluation details and results. The detailed comparison results can be found in Table 3 (see appendix).

3.3 Comparison of the Other Semantic Focused Crawlers
The ungrouped crawlers display the flexibility of applying semantic web technologies in focused crawlers. Most of these crawlers are applied as part of a larger system in a specific domain, such as a search engine, a knowledge portal and so forth. Differing from traditional focused crawlers, they have special functions, such as estimating the similarity values between documents/URLs and ontology concepts or a user-predefined context, indexing unvisited URLs based on ontology concepts and users' preferences, and seeking missing documents in a metadata base. Multiple semantic web technologies are used, including ontologies, similarity and
clustering algorithms, as well as metadata heuristics. The primary evaluation method is to compare the harvest rate or precision with a Best-First or Breadth-First crawler, and nearly all of the crawlers show significant advantages. However, the disadvantages are also obvious: some crawlers contain many complex algorithms or operations which may affect their efficiency; some need more testing, considering the obvious variation across different environments; and some should provide a metadata abstraction function for interdisciplinary knowledge sharing. The detailed comparison results can be found in Tables 4 and 5 (see appendix).

3.4 Conclusions and Recommendations on the Comparison of Semantic Focused Crawlers
From the respective comparisons of the three clusters of crawlers, we draw the following conclusions regarding the features and current state of semantic focused crawlers. First, let us emphasize the features of each category. For the ontology-based focused crawlers, the utilization of semantic web technologies mainly focuses on the use of ontology for linking webpages/URLs with topics (ontological concepts), indexing webpages based on estimated similarity values between webpages and ontological concepts, or analyzing users' preferences in order to provide personalized crawling services. For the metadata abstraction focused crawlers, the utilization focuses on annotating the parsed and extracted web information with ontology mark-up languages. For the category of other semantic focused crawlers, ontology can be used to calculate the similarity values between webpages, or between webpages and queries, etc. Next, we summarize the comparison results along the dimensions of domain, working environment, evaluation method and result. The domains in which the semantic focused crawlers work can be divided into two categories – general and specific. The crawlers are designed either as part of a complex system, such as a search engine, or as a tool that can be used independently or as a plug-in for other systems. The evaluation methods focus on the traditional metrics of information retrieval – precision and recall. Comparison with traditional full-text, Best-First or Breadth-First crawlers directly indicates the significant advantages of this series of semantic crawlers. However, apart from the advantages, the disadvantages are obvious – many proposed models are not tested, which reveals that this field is not yet mature. Against the backdrop of these shortcomings, we recommend that researchers disclose their evaluation details and compare their crawlers with crawlers that lack semantic technological support, in order to validate the feasibility and applicability of their research.
4 Conclusion

In this paper, we carried out a detailed survey of the field of semantic focused crawlers. According to the literature, we classify the existing semantic focused crawlers into three primary categories – ontology-based focused crawlers, which determine the relevance of web documents by analyzing their relevance to ontology concepts; metadata abstraction focused crawlers, which employ ontology mark-up
languages to convert HTML documents into semantic web documents; and other semantic focused crawlers, which have unique applications of semantic web technologies. Based on a thorough literature analysis, we found eleven crawlers in this domain. The working mechanism of each of these crawlers is explained in detail. In order to perform a comparative analysis of these crawlers, we identified six key attributes for comparison and evaluation purposes for each category: the domain, working environment, special functions, technologies utilized, evaluation methods and evaluation results. We observed that the ontology-based focused crawlers focus on using ontology for linking webpages/URLs with topics (ontological concepts), indexing webpages based on estimated similarity values between webpages and ontological concepts, or analyzing users' preferences in order to provide personalized crawling services. Additionally, we observed that the metadata abstraction focused crawlers focus on annotating the parsed and extracted web information with ontology mark-up languages, while the other semantic focused crawlers employ ontology to calculate the similarity values between webpages, or between webpages and queries, etc. By means of this comparison, we came to the conclusion that these semantic focused crawlers have significant advantages over traditional crawlers. However, some researchers do not disclose their evaluation details and results, which indicates that semantic focused crawler research is still at a "blueprint" stage. In conclusion, on the one hand, the application of semantic web technologies has achieved undeniable progress in the field of focused crawler research; on the other hand, researchers are still far from being able to claim success, which characterizes the state of the art in this field.
References

1. Batzios, A., Dimou, C., Symeonidis, A.L., Mitkas, P.A.: BioCrawler: An intelligent crawler for the semantic web. Expert Systems with Applications 35, 524–530 (2008)
2. Can, A.B., Baykal, N.: MedicoPort: A medical search engine for all. Computer Methods and Programs in Biomedicine 86, 73–86 (2007)
3. Cesarano, C., d'Acierno, A., Picariello, A.: An intelligent search agent system for semantic information retrieval on the internet. In: WIDM 2003, pp. 111–117. ACM Press, New Orleans (2003)
4. Cho, J., Garcia-Molina, H.: Parallel Crawlers. In: WWW 2002, pp. 124–135. ACM Press, Honolulu (2002)
5. Francesconi, E., Peruginelli, G.: Searching and retrieving legal literature through automated semantic indexing. In: ICAIL 2007, pp. 131–138. ACM Press, Stanford (2007)
6. Ganesh, S., Jayaraj, M., Kalyan, V., Aghila, G.: Ontology-based web crawler. In: The International Conference on Information Technology: Coding and Computing (ITCC 2004). IEEE CS, Las Vegas (2004)
7. Giles, C.L., Petinot, Y., Teregowda, P.B., Han, H., Lawrence, S., Rangaswamy, A., Pal, N.: eBizSearch: A niche search engine for e-business. In: SIGIR 2003, pp. 213–214. ACM Press, Toronto (2003)
8. Halkidi, M., Nguyen, B., Varlamis, I., Vazirgiannis, M.: THESUS: organizing web document collections based on link semantics. The VLDB Journal 12, 320–332 (2003)
9. Hendler, J.: Agents and the semantic web. IEEE Intelligent Systems 16, 30–37 (2001)
10. Jansen, B.J., Mullen, T., Spink, A., Pedersen, J.: Automated gathering of web information: an in-depth examination of agents interacting with search engines. ACM Transactions on Internet Technology 6, 442–464 (2006)
11. Konopnicki, D., Shmueli, O.: Database-inspired search. In: The 31st VLDB Conference, Trondheim, pp. 2–12 (2005)
12. Liu, H., Milios, E., Janssen, J.: Focused crawling by learning HMM from user's topic-specific browsing. In: The IEEE/WIC/ACM International Conference on Web Intelligence (WI 2004). IEEE CS, Los Alamitos (2004)
13. Liu, H., Milios, E., Janssen, J.: Probabilistic models for focused web crawling. In: WIDM 2004, pp. 16–22. ACM Press, Washington (2004)
14. Stojanovic, L., Stojanovic, N., Volz, R.: Migrating data-intensive web sites into the semantic web. In: SAC 2002, pp. 1100–1107. ACM Press, Madrid (2002)
15. Tane, J., Schmitz, C., Stumme, G.: Semantic resource management for the web: an e-learning application. In: WWW 2004. ACM, New York (2004)
16. Yuvarani, M., Iyengar, N.C.S.N., Kannan, A.: LSCrawler: a framework for an enhanced focused web crawler based on link semantics. In: The 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006). IEEE, Los Alamitos (2006)
17. Zhuang, Z., Wagle, R., Giles, C.L.: What's there and what's not?: Focused crawling for missing documents in digital libraries. In: JCDL 2005, pp. 301–310. ACM Press, Denver (2005)
18. Barfourosh, A.A., Anderson, M.L., Nezhad, H.R.M., Perlis, D.: Information Retrieval on the World Wide Web and Active Logic: A Survey and Problem Definition. Department of Computer Science, University of Maryland, Maryland, pp. 1–45 (2002)
19. Dong, H., Hussain, F.K., Chang, E.: A survey in semantic web technologies-inspired focused crawlers. In: The Third International Conference on Digital Information Management 2008 (ICDIM 2008). IEEE, East London (2008)
20. Dong, H., Hussain, F.K., Chang, E.: State of the art in metadata abstraction crawlers. In: 2008 IEEE International Conference on Industrial Technology (IEEE ICIT 2008). IEEE, Chengdu (2008)
21. W3C Semantic Web Frequently Asked Questions. W3C (2008)
22. Berners-Lee, T.: The semantic web. Scientific American Magazine, May 17 (2001)
23. Rapoza, J.: SPARQL will make the web shine. eWeek, May 2 (2007)
24. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F.: Extensible Markup Language (XML) 1.0, 4th edn.: origin and goals. W3C (2006)
25. Sperberg-McQueen, C.M., Thompson, H.: XML Schema 1.1. W3C (2000)
26. Klyne, G., Carroll, J.J.: Resource Description Framework (RDF): concepts and abstract syntax. In: McBride, B. (ed.) W3C Recommendation. W3C (2004)
27. Brickley, D., Guha, R.V.: RDF Vocabulary Description Language 1.0: RDF Schema. In: McBride, B. (ed.) W3C Recommendation. W3C (2004)
28. Herman, I.: Web Ontology Language (OWL). W3C (2007)
29. Unified medical language system. National Library of Medicine (2008)
30. Cho, J., Garcia-Molina, H., Page, L.: Efficient crawling through URL ordering. In: The Seventh International World Wide Web Conference (WWW 1998), pp. 161–172. ACM Press, New York (1998)
31. Gauch, S., Chaffee, J., Pretschner, A.: Ontology-based personalized search and browsing. Web Intelligence and Agent Systems 1, 219–234 (2003)
32. Rabiner, L.R.: A tutorial on Hidden Markov Models and selected applications in speech recognition. Proceedings of the IEEE 77, 257–285 (1989)
Appendix

Table 2. Comparison of the ontology-based focused crawlers

LSCrawler [16]
- Domain: General
- Working environment: General
- Special functions: Computing and indexing the similarity values between URLs and topics.
- Technologies utilized: Ontology for similarity estimation; Porter Stemmer's algorithm for removing stop words.
- Evaluation methods: Comparing recall rate with a full-text crawler.
- Evaluation results: Nearly 10% improvement in recall rate, compared with the full-text crawler.
- Comments/suggestions: Provide more evaluation details.

Courseware Watchdog crawler [15]
- Domain: E-learning
- Working environment: Courseware Watchdog
- Special functions: Assigning weights to ontological concepts based on users' preferences; weighting, ranking and clustering webpages based on the weighted concepts.
- Technologies utilized: Ontology for weighting, ranking and clustering webpages.
- Evaluation methods: Not provided.
- Evaluation results: Not provided.
- Comments/suggestions: Provide evaluation details.

Crawler proposed by Ganesh [6]
- Domain: General
- Working environment: General
- Special functions: Weighting similarity values between URLs and ontological concepts, and between parent pages and children pages.
- Technologies utilized: Ontology and combination importance, association and ordering metrics for weighting similarity values between URLs and ontological concepts, and between parent pages and children pages.
- Evaluation methods: Not provided.
- Evaluation results: Not provided.
- Comments/suggestions: Provide evaluation details.

THESUS crawler [8]
- Domain: General
- Working environment: THESUS
- Special functions: Linking URLs with ontological concepts.
- Technologies utilized: Ontology and Boolean model for linking URLs with ontological concepts.
- Evaluation methods: Compared with a keyword-based clustering method on F-measure, Rand statistic, etc.
- Evaluation results: 12% higher on F-measure, 5% advantage on Rand statistic.
- Comments/suggestions: None.
Table 3. Comparison of the metadata abstraction crawlers

Vertical Portal crawler [5]
- Domain: Legal
- Working environment: Vertical Portal
- Special functions: Collecting legal documents; abstracting metadata.
- Technologies utilized: NB and MSVM for document classification; DC schema and tfidf for metadata abstraction.
- Evaluation methods: Evaluating the classification accuracy values for NB and MSVM respectively.
- Evaluation results: 82.5% for NB, 85.1% for MSVM.
- Comments/suggestions: Compare with other similar crawlers.

CiteSeer crawler [7]
- Domain: E-business
- Working environment: A niche search engine
- Special functions: Parsing citations and abstracting metadata from downloaded documents.
- Technologies utilized: CiteSeer for parsing citations and abstracting metadata from downloaded documents; HMM for similarity estimation; SVM for metadata abstraction.
- Evaluation methods: Comparing SVM with HMM on accuracy.
- Evaluation results: SVM has better performance than HMM.
- Comments/suggestions: Evaluate with a bigger training set.
Table 4. Comparison of the other semantic focused crawlers (Part 1)

Lokman crawler [2]
- Domain: Medical
- Working environment: MedicoPort
- Special functions: Fetching medical documents; estimating documents' similarity values to UMLS concepts; indexing unvisited URLs.
- Technologies utilized: UMLS for ontology construction.
- Evaluation methods: Comparing the harvest rate of the crawler, under each of its two re-evaluation algorithms, with a Best-First crawler.
- Evaluation results: Better than the Best-First crawler on overall harvest rate.
- Comments/suggestions: Provide semantic functions for the fetched webpages.

Crawler proposed by Liu et al. [12][13]
- Domain: General
- Working environment: General
- Special functions: Predicting relevant links based on users' preferences.
- Technologies utilized: LSI for document clustering; HMM for similarity estimation; K-Nearest Neighbor for children page clustering.
- Evaluation methods: Comparing precision with a Best-First crawler.
- Evaluation results: Significant advantage in precision over the Best-First crawler.
- Comments/suggestions: The many algorithms adopted could affect the overall efficiency.
Table 5. Comparison of the other semantic focused crawlers (Part 2)

Web Spider [3]
- Domain: General
- Working environment: An agent-based semantic search engine
- Special functions: Downloading all children pages; computing similarity values between downloaded webpages and a predefined context.
- Technologies utilized: Algorithms for computing the semantic relevance between concepts and webpages.
- Evaluation methods: Not provided.
- Evaluation results: Not provided.
- Comments/suggestions: Downloading all children pages may affect the overall performance; provide evaluation details.

Digital Library crawler [17]
- Domain: Digital library
- Working environment: CiteSeer
- Special functions: Using metadata heuristics to retrieve missing publications in a digital library.
- Technologies utilized: Metadata heuristics for locating authors' homepages.
- Evaluation methods: Testing harvest level; comparing the crawler with a Breadth-First crawler in precision; comparing the crawler with a Hutch crawler in precision and speed.
- Evaluation results: 0.75 on harvest level; nearly 10% better than the Breadth-First crawler; superior to the Hutch crawler.
- Comments/suggestions: Its performance may vary in different venues.

BioCrawler [1]
- Domain: General
- Working environment: Semantic or non-semantic environment
- Special functions: Weighting the semantic strength of the obtained information based on its internal rules; sharing knowledge between crawlers; periodically revisiting websites to maintain its knowledge model.
- Technologies utilized: Not provided.
- Evaluation methods: Comparing BioCrawler's energy with a dumb crawler over 30,000 websites visited and over 100 random restarts.
- Evaluation results: More knowledgeable than the dumb crawler as the number of websites visited increases.
- Comments/suggestions: Provide technical details.
Towards a Framework for Workflow Composition in Ontology Tailoring in Semantic Grid

Toshihiro Uchibayashi 1, Bernady O. Apduhan 1, Wenny J. Rahayu 2, David Taniar 3, and Norio Shiratori 4

1 Faculty of Information Science, Kyushu Sangyo University, Fukuoka, 813-8503, Japan
2 Dept. of Comp Sc & Comp Eng, La Trobe University, Bundoora, VIC 3086, Australia
3 Clayton School of Information Technology, Monash University, Clayton 3800, Australia
4 Research Institute of Electrical Communication, Tohoku Univ., Sendai, 980-8577, Japan
Abstract. Research and development in Semantic Grid computing have gained much attention in areas such as knowledge discovery, the medical sciences, and the business world, among others. Dealing with a large domain ontology in this computing environment, with all its complexity, becomes a daunting task. The idea of ontology tailoring is to cope with and minimize the complexity of dealing with a large domain ontology, and to provide services for ontology reuse, extension, addition, merging, and replacement. In this paper, we tackle the task of developing the models that provide the above-mentioned services and a framework for workflow composition in ontology tailoring using UML-based design. Workflow validations were conducted towards realizing workable ontology tailoring schemes.
1 Introduction

The evolution of the Semantic Web and the adoption of its technology in the Grid computing environment have created the so-called Semantic Grid [1]. In a Semantic Grid organization, the host computer is enabled to understand the content of "metadata" (which defines the meaning of data) and its relation to other metadata hosted by other computer host(s), and this information is processed automatically. Using ontology technology, information on the Web can be retrieved with improved accuracy and effectiveness. However, as Grid (ontology) databases and computational resources are geographically distributed, processing the required ontology becomes more complex and exhibits relatively low performance [2]. Scientific workflows, in particular, have emerged as a means to formalize the structured composition of workflows in the grid environment [3]. In our continuing study of Semantic Grid computing, this paper describes the workflow models of sub-ontology extraction using UML-based design based on some optimization schemes, towards the development of a framework for workflow composition in ontology tailoring. Procedures were applied to validate the workflow models, and the results are described. In the following, Section 2 describes the experiment environment, while Section 3 discusses some related work. Section 4 introduces some ontology optimization schemes. Section 5 describes the workflow design of the adopted optimization schemes using the UML activity model. Section 6 describes the design and
validation by activity hypergraph. Section 7 gives our concluding remarks and cites future work.
2 Experiment Environment

Figure 1 shows the system organization of the Distributed Ontology Framework (DOF) on a Semantic Grid. The system is composed of a Resource Broker, an OTPR (Ontology Tailoring Processing Resource), an Algorithm Server, an Ontology Server, and a Processing Plant. The Resource Broker takes care of handling user access, security issues, and other policies for accessing heterogeneous resources in a widely distributed environment. It has the role of mediating connections to destinations according to the network traffic situation, the number of users, and other related conditions. Likewise, it has the role of distributing jobs so as to achieve the best service and performance sought by the user. The Ontology Server contains the ontology database and provides the ontology needed for sub-ontology processing, which it can send to the OTPR or directly to the Processing Plant. The Algorithm Server contains the needed algorithm(s), which may be sent to the OTPR or to the selected Processing Plant; the latter can be a high-end processing computer, a cluster computer, or a supercomputer. The OTPR has the role of executing the sub-ontology processing, given the ontology data from the Ontology Server and the required algorithm from the Algorithm Server. Depending on its processing capacity, the OTPR may delegate the sub-ontology processing to a Processing Plant. To facilitate processing and reduce network traffic, the OTPR may instruct the Ontology Server to send the ontology data, and the Algorithm Server to send the required algorithm, directly to the selected Processing Plant. Either way, the OTPR sends the sub-ontology processing results back to the user.
Fig. 1. The system prototype of the Distributed Ontology Framework
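The delegation decision described above can be summarized in a short Python sketch. All names here (the Job fields, the capacity and plant-selection calls) are our own illustrative inventions, not part of the DOF prototype's actual interface.

```python
def dispatch_subontology_job(otpr, job, ontology_server, algorithm_server):
    """Illustrative sketch of the OTPR delegation logic of Section 2:
    execute locally if capacity allows, otherwise delegate to a
    Processing Plant and have the inputs sent there directly.
    All names are hypothetical."""
    if otpr.available_capacity() >= job.estimated_load:
        ontology = ontology_server.fetch(job.ontology_id)
        algorithm = algorithm_server.fetch(job.algorithm_id)
        return otpr.execute(ontology, algorithm)
    plant = otpr.select_processing_plant(job)
    # Reduce network traffic: servers ship inputs straight to the plant.
    ontology_server.send_to(plant, job.ontology_id)
    algorithm_server.send_to(plant, job.algorithm_id)
    return plant.execute_and_return(job)
```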
3 Related Work

The transactional grid workflow service (GridTW) of the ShanghaiGrid deals with guaranteeing the reliability and automation of e-business applications [7]. The authors present a coordination algorithm for the management of transactional grid workflows, and the algorithm is validated using Petri nets. The work in [8] proposes a Petri Net based Grid Workflow Verification and Optimization Toolkit (PGWFT) to help users analyze workflows, and presents examples of verification and optimization analysis. The paper in [12] describes a tool that supports the verification of workflow models specified as UML activity graphs, whereas [13] defines a formal execution semantics for UML activity diagrams that is appropriate for workflow modeling. While our paper also deals with workflow composition and verification and uses UML, we focus on the composition and validation of sub-ontology extraction workflows in a Semantic Grid environment.
4 Ontology Optimization Schemes in Brief

In narrowing down the target ontology/sub-ontology, we developed some optimization schemes, namely the Consistency Validation Checking (CnV) and the Completeness Validation Scheme (CmS) [9,10]. CnV is a combination of four sub-schemes that check for various forms of consistency, and CmS consists of three sub-schemes that check for various forms of semantic completeness. These are described in detail below.

4.1 Consistency Validation Checking (CnV)

As the name implies, CnV checks the consistency of the user-specified requirements of the target ontology, expressed in the form of labeling. CnV itself is a combination of four sub-schemes that check for various forms of consistency; it covers the simpler phases within the entire extraction process and illustrates our distribution scheme for the associated ontological workload. CnV ensures that the requirements as expressed by the user (or any other optimization scheme) are consistent, i.e., there are no contradictory statements in the labeling set up by the user. By ensuring requirements consistency, we eliminate the possibility that no sub-ontology (based on user preferences) is derivable in the first place. CnV is currently the very first rule applied during the extraction process. Moreover, it is also one of the simpler rules within the overall extraction process, both conceptually and from an implementation viewpoint. CnV is a suite of four optimization schemes which, without any implicit ordering or execution priority, we denote as CnV(1)–CnV(4).

CnV(1) If a binary relationship between concepts is selected by the user to be present in the target ontology, the two concepts that the relationship associates cannot be disqualified/deselected from the target ontology.
CnV(2) The CnV(2) rule is similar to CnV(1), with the difference that instead of a binary relationship over the set of concepts, it is applied to a special relationship, called an attribute mapping, that exists between concepts and their attributes. This rule enforces the condition that if an attribute mapping has a selected labeling, the associated attribute as well as the concept it is mapped onto must be "selected" to be present in the target ontology.
CnV(3) This rule imposes a requirement on a more specific characteristic of an attribute mapping. It stipulates that if an attribute mapping has a deselected labeling, its associated attribute must also be disqualified from the target ontology. Basically, no contradicting preferences are allowed between an attribute mapping and the associated attribute; CnV(3) together with CnV(2) imposes this condition.
CnV(4) CnV(4) is relatively more complex than CnV(1)–CnV(3). We utilize the notion of a path, again informally, to illustrate CnV(4). Paths are very important in the specification of ontology views. They provide seemingly new relationships (new information) that are semantically correct, albeit only implicitly present in the ontology definition. A path is defined as a chain of relationships that connect concepts, where the end concept of one relationship is the start concept of the following relationship in the chain. Note that the same relationship can appear only once in the entire path. From a CnV(4) viewpoint, the emphasis of a path lies on the first and last concepts it visits, as these two are connected by the path. However, alternative formulations of the "path concept" have been utilized for other optimization schemes by imposing certain qualification criteria on the concepts that form the connection and/or the relationships that form the chain. Based on the above formulation of a path, we can now introduce the fourth consistency rule: if an attribute is selected, but the concept it "belongs" (is mapped) to is deselected, CnV(4) stipulates that there must be a path from the attribute to another concept that is not deselected. Moreover, the path can only contain relationships with a label other than "deselected".

4.2 Completeness Validation Scheme (CmS)

The idea of the semantic completeness of an ontology can be interpreted in a number of ways. However, for the purposes of sub-ontology extraction, it amounts to the inclusion of the defining elements of the elements selected by the user during requirements specification. A defining element is a concept, relationship or attribute that is essential to the semantics of another element of the ontology. For example, a concept selected to be present in the sub-ontology would be semantically incomplete if its super-concept (the defining element in this case) were deselected at the same time. This can be further generalized to a situation where a set of elements is connected by IS-A relationships to an arbitrary depth. The scenario only gets more complex in the presence of more complex relationships such as multiple inheritance, aggregation, etc. The Completeness Validation Scheme (CmS) exists to guard against such inconsistencies. The CmS sub-schemes are briefly described as follows:
CmS(1) If a concept is selected, all its super-concepts, and the inheritance relationships between the concept and its super-concepts, must be selected.
CmS(2) If a concept is selected, all the aggregate part-of concepts of this concept, together with the aggregation relationships, must also be selected.
CmS(3) If a concept is selected, then all of the attributes it possesses with a minimum cardinality other than zero, and their attribute mappings, should be selected.
5 Workflow Design by UML Activity Diagram

The UML activity diagram originates from the flow chart and is interpreted almost identically in its semantics. The UML activity diagram is used to show behavior: like a flow chart, it shows automatic, chained behavior, in which the completion of one activity automatically triggers the next. The workflows of the above two schemes, i.e., CnV and CmS, are designed using UML. Here, we use the activity diagram constructs action node, wait state node, sub-activity state node, decision/merge, fork/join, and initial and final nodes, as shown in Figure 2.
Fig. 2. Activity diagram constructs
5.1 Consistency Validation Checking (CnV)

Figure 3 shows the design of CnV by activity diagram. This activity diagram can be divided into four steps (step 1–step 4). A minimal labeling check corresponding to the CnV(1) steps is sketched after this list.

Steps in CnV(1):
1. At the beginning, one concept in the pair is determined to be either qualified/selected or disqualified/deselected.
2. If the concept is qualified/selected, the flow proceeds to 3; otherwise, it jumps to 6.
3. The other concept in the pair is then determined to be either qualified/selected or disqualified/deselected.
4. If the concept is qualified/selected, the flow proceeds to 5; otherwise, it jumps to 6.
5. The binary relationship and the consistency of the concept pair are validated.
6. End.
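The following Python sketch mirrors those steps for a single selected relationship; the string-label representation is our own illustrative choice, not the paper's implementation.

```python
SELECTED, DESELECTED = "selected", "deselected"

def cnv1_consistent(relationship_label, concept_a_label, concept_b_label):
    """CnV(1): a selected binary relationship is inconsistent if either
    of the two concepts it associates is deselected."""
    if relationship_label != SELECTED:
        return True  # the rule only constrains selected relationships
    return (concept_a_label != DESELECTED and
            concept_b_label != DESELECTED)

# Example: the user selected the relationship but deselected one endpoint.
print(cnv1_consistent(SELECTED, SELECTED, DESELECTED))  # False
```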
Steps in CnV(2):
1. At the beginning, one concept in the set is determined to be either qualified/selected or disqualified/deselected.
2. If the concept is qualified/selected, the flow proceeds to 3; otherwise, it jumps to 6.
3. The other concepts are also determined to be either qualified/selected or disqualified/deselected.
4. If the concepts are qualified/selected, the flow proceeds to 5; otherwise, it jumps to 6.
5. The attribute mapping of the set of concepts is validated.
6. End.
Steps in CnV(3):
1. At the beginning, one concept in the set is determined to be either qualified/selected or disqualified/deselected.
2. If the concept is disqualified/deselected, the flow proceeds to 3; otherwise, it jumps to 4.
3. The associated attributes of this attribute mapping are also disqualified/deselected.
4. End.
Steps in CnV(4):
1. At the beginning, an attribute is determined to be either qualified/selected or disqualified/deselected.
2. If the attribute is qualified/selected, the flow proceeds to 3; otherwise, it jumps to 5.
3. If the concept the attribute belongs to is deselected, the flow proceeds to 4; otherwise to 5.
4. A path from the attribute to another concept that is not deselected is determined (see the sketch after this list).
5. End.
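Step 4 is a reachability question, and a breadth-first search over the non-deselected part of the ontology graph is one straightforward way to answer it. The graph encoding below is a hypothetical illustration, under the assumption that labels are stored per element.

```python
from collections import deque

def cnv4_path_exists(start, neighbors, concept_label, edge_label):
    """CnV(4) sketch: from a selected attribute whose own concept is
    deselected, search for a not-deselected concept, traversing only
    relationships that are not labeled 'deselected'."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for edge, nxt in neighbors(node):
            if edge_label(edge) == "deselected" or nxt in seen:
                continue
            if concept_label(nxt) != "deselected":
                return True  # a qualifying path exists
            seen.add(nxt)
            queue.append(nxt)
    return False
```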
Fig. 3. Design of CnV by activity diagram

5.2 Design of CmS

Figure 4 shows the design of CmS by activity diagram. This activity diagram can be divided into three steps (step 1–step 3).
Fig. 4. Design of CmS by activity diagram
Steps in CmS(1):
1. At the beginning, a concept is determined to be either qualified/selected or disqualified/deselected.
2. If the concept is qualified/selected, the flow proceeds to 3; otherwise, it jumps to 4.
3. The concept's super-concepts and its inheritance relationships with its super-concepts are selected.
4. End.
Steps in CmS(2):
1. At the beginning, a concept is determined to be either qualified/selected or disqualified/deselected.
2. If the concept is qualified/selected, the flow proceeds to 3; otherwise, it jumps to 4.
3. The concept's aggregate part-of concepts and the aggregation relationships are selected.
4. End.
Steps in CmS(3):
1. At the beginning, a concept is determined to be either qualified/selected or disqualified/deselected.
2. If the concept is qualified/selected, the flow proceeds to 3; otherwise, it jumps to 4.
3. The concept's attributes with minimum cardinality other than zero, and their attribute mappings, are selected.
4. End.
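All three CmS rules are closure-style propagations: selecting a concept forces the selection of certain related elements. Below is a sketch of CmS(1), transitively selecting super-concepts; the ontology accessors are hypothetical.

```python
def cms1_close_superconcepts(selected, super_of):
    """CmS(1) sketch: for every selected concept, select all of its
    super-concepts (transitively) and the inheritance links to them.
    selected -- set of initially selected concept ids
    super_of -- maps a concept id to its direct super-concept ids
    """
    closed, edges, stack = set(selected), set(), list(selected)
    while stack:
        c = stack.pop()
        for sup in super_of.get(c, ()):
            edges.add((c, sup))          # inheritance relationship selected
            if sup not in closed:
                closed.add(sup)
                stack.append(sup)
    return closed, edges

concepts, links = cms1_close_superconcepts(
    {"Sedan"}, {"Sedan": ["Car"], "Car": ["Vehicle"]})
print(concepts)  # {'Sedan', 'Car', 'Vehicle'}
```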
6 Design and Validation by Activity Hypergraph

In mathematics, a hypergraph is a generalization of a graph in which edges can connect any number of vertices; an activity hypergraph is directly related to a hypergraph. An activity hypergraph is a quintuple (Nodes, Edges, Events, Guards, LVar).
Fig. 5. Example of eliminations of pseudo state nodes
The set Nodes is partitioned into a set AN of activity nodes, a set FN of final nodes, and one initial node, Initial. Events are sets of event expressions, and Guards are sets of guard expressions. Actions are sets of action expressions. An activity hypergraph consists of a set of labeled state nodes that are connected by labeled directed hyperedges. The state nodes of an activity hypergraph are action state nodes and final state nodes. An activity hypergraph can have variables. A hyperedge can be labeled with an optional event expression and an optional guard expression, where the latter can refer to variables of the activity hypergraph. Figure 5 shows some of the most common eliminations. For every XOR-node, all its entering and exiting edges map into one compound transition, and for every AND-node, all its entering and exiting edges map into the same compound transition. If AND-nodes are connected to OR-nodes, the mapping becomes slightly more complicated. The syntactic constraints on activity hypergraphs are as follows:

1. Every edge that leaves an action state node is labeled with the empty event NULL.
2. For every edge e that has an action state node a as source, a is the only source of e. This implies that for every edge with multiple sources, none of its sources is an action state node.
3. The disjunction of the guards on the edges leaving an action state node must be a tautology.
4. The initial state node may only occur as the source of an edge; moreover, if it is a source of an edge, it is the only source of that edge. The final state node may only occur as the target of an edge; moreover, if it is the target of an edge, it is the only target of that edge.
5. The edges leaving the initial state node must have no events, and the disjunction of their guard expressions must be a tautology.

We convert the activity diagram into an activity hypergraph and verify the validity of CnV and CmS. The composition of the nodes is easy and their validation is simple
Table 1. Number of former and next elements

Element     Former elements   Next elements
Initial     0                 1
Final       1                 0
Activity    1                 1
Decision    1                 2
Merge       2                 1
if the activity diagram is converted into an activity hypergraph. We verify validity with regard to the relationship between elements and state transitions, the flow of control, and non-execution elements. The method is described as follows:
• Relation between elements and state transitions: Initial, final, activity, decision, and merge are called state elements. Each state element connects a fixed number of former (incoming) and next (outgoing) elements, as listed in Table 1. Transitional elements connect the initial and final state elements.
• State transitions: 1) Specification of the initial position – the initial position of the control is examined; if the number of initial elements is one, the initial position of the control can be specified. 2) Divergence control – after a divergence element, the control moves to exactly one transition; which transition is taken depends on the guard conditions. For the guard conditions (g1, g2, g3, …, gn) of a given divergence, gi and gj (1 ≤ i < j ≤ n) must not hold at the same time, except when n = 1.
• Non-execution elements: it is determined whether any non-executable elements exist.
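The first criterion reduces to checking node in/out-degrees against Table 1. A compact Python sketch follows; the graph representation is ours, with the degree counts taken from the validation discussion in Sections 6.1 and 6.2.

```python
# Expected (former, next) element counts per node kind, per Table 1.
DEGREES = {"initial": (0, 1), "final": (1, 0), "activity": (1, 1),
           "decision": (1, 2), "merge": (2, 1)}

def validate_degrees(nodes, edges):
    """Check the 'relation between elements and state transitions'
    criterion: every node's in/out-degree must match Table 1.
    nodes -- dict node id -> kind; edges -- list of (src, dst) pairs."""
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    return [n for n, kind in nodes.items()
            if (indeg[n], outdeg[n]) != DEGREES[kind]]

nodes = {"i": "initial", "d": "decision", "a": "activity",
         "m": "merge", "f": "final"}
edges = [("i", "d"), ("d", "a"), ("d", "m"), ("a", "m"), ("m", "f")]
print(validate_degrees(nodes, edges))  # [] means all degree checks pass
```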
Fig. 6. Design of CnV by activity hypergraph
Fig. 7. Design of CmS by activity hypergraph
6.1 Validation of CnV

Figure 6 shows CnV as an activity hypergraph. Its validity is verified as follows.

• Relation between elements and state transitions: In Fig. 6, the initial node has one next element, so the initial state element requirement in Table 1 is satisfied. In step 1 of CnV(1), the decision node has one former element and two next elements; each activity node has one former element and one next element; and the merge node has two former elements and one next element, satisfying Table 1. The steps of CnV(2) through CnV(4) likewise satisfy Table 1. Since the final state element has one former element, all requirements in Table 1 are fulfilled.
• State transitions: 1) Specification of the initial position – because the number of initial elements is one, the requirement is satisfied. 2) Divergence control – in step 1 of CnV(1), the number of decision elements is one; because there are two guard conditions, gi and gj satisfy (1 ≤ i < j ≤ 2), so the control decision in step 1 of CnV(1) works. Likewise, step 2 of CnV(2) through step 4 of CnV(4) satisfy the requirements.
• Non-execution elements: because no non-executable element exists, the requirement is satisfied.
6.2 Validation of CmS

Figure 7 shows CmS as an activity hypergraph. Its validity is verified as follows.

• Relation between elements and state transitions: Since the decision element in step 1 of CmS(1) has one former element (the initial node), the initial state element requirement in Table 1 is satisfied. The activity element has one former element and one next element, and the merge element has two former elements and one next element. Likewise, step 2 of CmS(2) and step 3 of CmS(3) satisfy the requirements in Table 1. The final element has one former element. These conditions satisfy all the requirements in Table 1.
• State transitions: 1) Specification of the initial position – since the number of initial elements in the figure is one, the requirement is satisfied. 2) Divergence control – the number of decision elements in step 1 of CmS(1) is one; there are two guard conditions, so gi and gj satisfy (1 ≤ i < j ≤ 2). Therefore, the control decision in step 1 of CmS(1) works and the requirement is satisfied.
• Non-execution elements: because no non-executable element exists, the requirement is satisfied.
7 Conclusions

We described the CnV and CmS optimization schemes and their corresponding sub-schemes for sub-ontology extraction. We designed and described the workflow of each scheme using UML activity diagrams and converted them into activity hypergraphs. The workflow designs, including that of our proposed Distributed Ontology Framework in a Semantic Grid, were validated against three criteria: the relation between elements and state transitions, the state transitions themselves, and the absence of non-execution elements. Preliminary results confirm the validity of the design, which can serve as a reference in designing a VO (Virtual Organization) as part of a larger Semantic Grid environment. Future work includes a more detailed design with more resources, a thorough analysis and validation of the workflow model composition, and its implementation on a prototype system.
Acknowledgment

This work was supported in part by the Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research, 21500087.
References

1. Tangmunarunkit, H., Decker, S., Kesselman, C.: Ontology-based Resource Matching in the Grid – The Grid meets the Semantic Web. In: Proceedings of the Second International Semantic Web Conference, Sanibel-Captiva Islands, Florida, USA (October 2003)
2. Bhatt, M., Rahayu, W., Soni, S.P., Wouters, C.: Ontology Driven Requirement Profiling and Information Retrieval in the Medical Information Systems Domain. International Journal of Medical Informatics (preprint submitted March 15, 2007)
3. Flahive, A., Rahayu, W., Apduhan, B.O., Taniar, D.: Simulating the Distributed Ontology Framework in the Semantic Grid Environment with GridSim. In: Proceedings of PDPTA, pp. 717–723 (2006)
4. Kurdel, P., Sebestyénová, J.: Grid Workflows Specification and Verification. WSEAS Transactions on Computers (2008)
5. Chen, J., Yang, Y.: A Taxonomy of Grid Workflow Verification and Validation. Concurrency and Computation: Practice and Experience 20(4), 347–360 (2008)
6. Chen, J., Yang, Y.: Key Research Issues in Grid Workflow Verification and Validation. In: ACSW Frontiers 2006, pp. 97–104 (2006)
7. Tang, F., Li, M., Guo, M.: Transactional Grid Workflow Service for ShanghaiGrid. Int. J. Web and Grid Services 3(4) (2007)
8. Cao, H., Jin, H., Wu, S., Tao, Y.: PGWFT: A Petri Net Based Grid Workflow Verification and Optimization Toolkit. In: Wu, S., Yang, L.T., Xu, T.L. (eds.) GPC 2008. LNCS, vol. 5036, pp. 48–58. Springer, Heidelberg (2008)
9. Bhatt, M., Flahive, A., Wouters, C., Rahayu, J.W., Taniar, D.: MOVE: A Distributed Framework for Materialized Ontology View Extraction. Algorithmica 45(3), 457–481 (2006)
10. Bhatt, M., Rahayu, J.W., Soni, S.P., Wouters, C.: OntoMove: A Knowledge Based Framework for Semantic Requirement Profiling and Resource Acquisition. In: ASWEC 2007, pp. 137–146 (2007)
11. Yu, J., Buyya, R.: Workflow Scheduling Algorithms for Grid Computing. Technical Report GRIDS-TR-2007-10, Grid Computing and Distributed Systems Laboratory, The University of Melbourne, Australia, May 31 (2007)
12. Eshuis, H., Wieringa, R.J.: Verification Support for Workflow Design with UML Activity Graphs. In: 24th International Conference on Software Engineering (ICSE) (2002)
13. Eshuis, H., Wieringa, R.J.: A Formal Semantics for UML Activity Diagrams – Formalising Workflow Models. CTIT Technical Report Series (01-04)
Fusion Segmentation Algorithm for SAR Images Based on HMT in Contourlet Domain and D-S Theory of Evidence*

Yan Wu 1, Ming Li 2, Haitao Zong 1, and Xin Wang 1

1 School of Electronics Engineering, Xidian University, Xi'an 710071, China
[email protected]
2 National Key Lab. of Radar Signal Processing, Xidian University, Xi'an 710071, China
Abstract. Utilizing the Contourlet transform's advantages of multiscale analysis, localization, directionality and anisotropy, a new SAR image segmentation algorithm based on the hidden Markov tree (HMT) in the Contourlet domain and the Dempster-Shafer (D-S) theory of evidence is proposed in this paper. The algorithm extends the hidden Markov tree framework to the Contourlet domain and fuses the clustering and persistence properties of the Contourlet transform using the HMT model and D-S theory; we then derive the maximum a posteriori (MAP) segmentation equation for the new fusion model. The algorithm is used to segment real SAR images. Experimental results and analysis show that the proposed algorithm effectively reduces the influence of multiplicative speckle noise, improves segmentation accuracy and provides better visual quality for SAR images than the algorithms based on HMT-MRF in the Wavelet domain, and on HMT and MRF in the Contourlet domain, respectively.

Keywords: SAR image segmentation, Contourlet transform, hidden Markov tree (HMT), D-S theory of evidence.
1 Introduction

Over the past few decades, Wavelets have had a growing impact on signal and image processing [1–5], mainly due to their good non-linear approximation performance for piecewise smooth functions in one dimension. Unfortunately, this is not the case in two dimensions. In essence, Wavelets are good at catching point, or zero-dimensional, discontinuities, but two-dimensional piecewise smooth functions resembling images have one-dimensional discontinuities. Intuitively, 2-D Wavelets obtained by a tensor product of two one-dimensional Wavelet basis functions will be good at isolating the discontinuities at edge points, but will not capture the directional information and
* This work is supported by the National Natural Science Foundation of China (No. 60872137), the National Defence Foundation of China (No. 9140A01060408DZ0104), and the Aviation Science Foundation of China (20080181002).
intrinsic geometric structures such as smooth contours in natural images, which easily leads to blocking effects and Gibbs phenomena [2]. This indicates that more powerful representations are needed in higher dimensions. In 2002, M.N. Do and Martin Vetterli [3] pioneered a new "true" two-dimensional representation for images named the Contourlet transform, which provides a flexible multiresolution, local and directional expansion for images by combining the Laplacian pyramid with a directional filter bank. The Contourlet transform is designed to satisfy the anisotropy scaling relation for curves, and thus offers a fast and structured, Curvelet-like [7] decomposition of sampled signals. The Contourlet not only possesses the main features of the Wavelet, namely multiresolution and time-frequency localization, but also shows a high degree of directionality and anisotropy. Due to these advantages, many scholars have focused on image processing in the Contourlet domain [8,9].
The segmentation of synthetic aperture radar (SAR) images is a key step in the automatic analysis and interpretation of such data; it provides the overall structural information of the image and reveals the essence of SAR images. Image segmentation establishes the foundation for automatic target recognition (ATR) and promotes the wide application of SAR images. This domain has received wide attention in recent years. Real SAR images are corrupted by an inherent signal-dependent phenomenon named multiplicative speckle noise, which is grainy in appearance and due primarily to the phase fluctuations of the electromagnetic return signals. Classical segmentation techniques, which work successfully on natural images, therefore do not perform well on SAR images. In most SAR images, pixels of a particular cover type rarely exist in isolation; rather, they are often part of broader geographical regions that share common properties. In an agricultural region, for example, pixels exist as contiguous sets, making the likelihood of adjacent pixels being from the same class very high. This local spatial dependence property of SAR images means that whether a pixel belongs to a class is closely related to the classes of its neighboring pixels. Over the past few decades, several segmentation methods have been proposed for carrying spatial neighborhood information into classification [1,2,5]. In the statistical image processing domain, the Markov random field (MRF), which takes the local dependence between pixels into consideration, occupies an important position. However, the MRF model has a limited ability to describe large-scale behaviors [10]. For example, we may know that segmented regions are likely to be at least 50 pixels wide, but it is difficult to accurately incorporate this information by specifying the interactions of adjacent pixels. The model can be improved by using a larger neighborhood for each pixel, but this rapidly increases the number of interaction parameters and the complexity of the segmentation algorithms. The fundamental limitation of local models is that they do not allow behavior to be directly controlled at different spatial scales. This is of critical importance since scale variation occurs naturally in images and is important in quantifying image behavior. In order to resolve the problem stated above, multiscale Markov models came into being.
Venkatachalam and Choi proposed image segmentation algorithms based on the multiscale hidden Markov tree (HMT) model [11,12] in 1998 and 2001, respectively, and applied them to SAR imagery. These algorithms exploit the persistence property of Wavelet coefficients and use the segmentation result at the coarser scale to guide the finer scale, so as to fuse global information
into the segmentation results and reduce misclassification, which benefits the segmentation of SAR images. However, the poor angular resolution of the Wavelet basis means that Wavelets cannot capture the high-dimensional singularity information in SAR images, which causes directional edge blurring and singularity diffusion in the segmented images. The HMT model was therefore extended to the Contourlet domain [13], where it obtained better segmentation results than the HMT in the Wavelet domain. The D-S theory of evidence, which fuses information through the mechanism of the orthogonal sum, performs well in image segmentation [14], but it has the disadvantage that it considers only neighborhood information and does not take the global and detail information between the coarser and finer scales into consideration, so misclassification easily occurs.

In this paper we analyze the statistical properties of Contourlet coefficients and propose a new fusion segmentation algorithm for SAR images based on the HMT in the Contourlet domain and the D-S theory of evidence. The algorithm captures the persistence and clustering properties of the Contourlet transform, which are modeled by the HMT and the D-S theory of evidence, respectively. We apply the maximum a posteriori (MAP) approach to SAR image segmentation and use a relaxation approach to deduce the ultimate MAP segmentation formula. Numerical experimental results demonstrate the good performance of the proposed algorithm.
2 Contourlet Transform and Coefficient Statistics Analysis

The Contourlet transform is also called the pyramidal directional filter bank decomposition; it has elongated supports with flexible aspect ratios. It is a double filter bank structure for obtaining sparse expansions of typical images with smooth contours. In the double filter bank, a Laplacian pyramid (LP) is first used to capture point discontinuities, and a directional filter bank (DFB) then links point discontinuities into linear structures. The overall result is an image expansion using basic elements like contour segments, hence the name Contourlet. With the Contourlet transform, each scale can be decomposed into an arbitrary power-of-two number of directions, and different scales can be decomposed into different numbers of directions. These features make the Contourlet a unique transform that achieves a high level of flexibility in decomposition while being close to critically sampled [15] (up to 33% overcomplete, which comes from the Laplacian pyramid).

Fig. 2(a) plots the histogram of the finest subband of the image in Fig. 1(a). The distribution exhibits a sharp peak at zero amplitude and heavy tails on both sides of the peak. This implies that the Contourlet transform is sparse, as the majority of coefficients are close to zero. The kurtosis of the shown distribution is 27.445, much higher than the kurtosis of 3 for Gaussian distributions. Similar distributions are observed in all subbands of other test images. Thus, the subband marginal distributions of natural images in the Contourlet domain are highly non-Gaussian but conditionally Gaussian. In other words, the Contourlet coefficients of natural images can be accurately modeled by a mixture of Gaussian distributions, as shown in Fig. 2(b).
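As a quick illustration of these statistics, the following sketch (our own, not the authors' code; the synthetic coeffs array merely stands in for a real Contourlet subband) checks the peaked, heavy-tailed shape by computing the Pearson kurtosis of a two-state zero-mean Gaussian mixture:

import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)
n = 100_000
large = rng.random(n) < 0.1            # 10% "significant" coefficients
sigma = np.where(large, 2.0, 0.1)      # state-dependent standard deviations
coeffs = rng.normal(0.0, sigma)        # zero-mean conditional-Gaussian mixture
# Pearson kurtosis is 3 for a Gaussian; sharply peaked, heavy-tailed
# subband histograms give much larger values, as reported above.
print(kurtosis(coeffs, fisher=False))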
Fig. 1. Sketch of the Contourlet transform: (a) test image; (b) two-scale Contourlet transform
Fig. 2. Marginal probability distributions of Contourlet coefficients: (a) marginal probability distribution of a Contourlet decomposition subband; (b) mixture Gaussian probability density distribution
Fig. 3. Conditional distribution of Contourlet coefficients in one subband: (a) P(X|PX); (b) P(X|NX)
Marginal statistics describe only the individual behavior of transform coefficients, without accounting for their dependencies. Contourlet coefficients clearly depend on each other, since only Contourlet functions that overlap and directionally align with image edges lead to significant coefficients. Fig. 3 shows the conditional distributions of Contourlet coefficients, conditioned on their parents (PX) and neighbors (NX). First, we notice that all of these conditional distributions exhibit a "bow-tie" shape, in which the variance of the coefficients is related to the magnitude of the conditioning coefficient. Second, even though the coefficients are correlated due to the slight overcompleteness of the Contourlet transform, they are approximately
decorrelated, since the conditional expectations $E[X \mid \cdot] = 0$. We thus conclude that Contourlet coefficients have dependencies across neighborhoods and adjacent scales, namely clustering and persistence.
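The following hedged sketch (our own construction; the toy parent/child pair is synthetic, not real Contourlet data) illustrates the check behind Fig. 3: conditional means near zero (decorrelation) together with a conditional spread that grows with the conditioning magnitude, i.e., the bow-tie shape:

import numpy as np

def conditional_stats(x, cond, n_bins=10):
    """Mean and std of x conditioned on |cond|, per magnitude bin."""
    edges = np.quantile(np.abs(cond), np.linspace(0, 1, n_bins + 1))
    means, stds = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = (np.abs(cond) >= lo) & (np.abs(cond) < hi)
        means.append(x[sel].mean())
        stds.append(x[sel].std())
    return np.array(means), np.array(stds)

# Toy pair standing in for (child, parent) coefficients.
rng = np.random.default_rng(1)
parent = rng.normal(0, 1, 50_000)
child = rng.normal(0, 0.2 + np.abs(parent))    # spread tied to |parent|
m, s = conditional_stats(child, parent)
print("E[X | PX] per bin:", np.round(m, 3))    # roughly 0 everywhere
print("std[X | PX] per bin:", np.round(s, 3))  # increasing: bow-tie shape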
3 Fusion Segmentation Based on D-S Theory of Evidence in the Contourlet-Domain HMT

3.1 Contourlet-Domain HMT Model

The Contourlet transform has the properties of multiresolution, localization, directionality, and anisotropy [15]. Because of these properties, the original signal is decomposed into different scales and directions, and the number of coefficients decreases by powers of two as the scale of the Contourlet transform increases, so the Contourlet coefficients naturally organize into a tree structure. We model the joint probability of the Contourlet coefficients by a hidden Markov model (HMM) [13,15]. An N-state HMM associates each coefficient with a hidden state variable, randomly distributed over its N states. Conditioned on its state, each coefficient is modeled by a zero-mean Gaussian distribution with state-dependent parameters. Therefore, each coefficient is characterized by an N-dimensional state probability vector $p$ and an N-dimensional standard deviation vector $\delta$ (we assume the Contourlet coefficients have zero mean, since all Contourlet basis functions have zero sum). Here,
$$p = (p_1, p_2, \ldots, p_N)^T, \quad \delta = (\delta_1, \delta_2, \ldots, \delta_N)^T \quad (1)$$
where $1, 2, \ldots, N$ denote the states.
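A minimal sketch of this per-coefficient model may help; the two-state values of p and delta below are illustrative assumptions, not estimated parameters:

import numpy as np

def mixture_pdf(x, p, delta):
    """f(x) = sum_k p[k] * N(x; 0, delta[k]^2), the model of Eq. (1)."""
    p, delta = np.asarray(p), np.asarray(delta)
    gauss = np.exp(-0.5 * (x / delta) ** 2) / (np.sqrt(2 * np.pi) * delta)
    return float(np.dot(p, gauss))

# Two states (N = 2), "small" and "large", as used in the HMT model below.
p = [0.9, 0.1]        # state probability vector
delta = [0.1, 2.0]    # state standard deviations
print(mixture_pdf(0.05, p, delta))  # high density near zero: sharp peak
print(mixture_pdf(3.0, p, delta))   # heavy tail from the "large" state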
The intercoefficient dependencies in the HMT are established by links between the hidden states of dependent coefficients. Associated with each link between coefficients $m$ and $n$ is an $N \times N$ state transition probability matrix $A_{m,n}$, whose entry $(k, l)$ is the probability that coefficient $m$ is in state $k$ given that coefficient $n$ is in state $l$. To reduce the number of parameters, as in the Wavelet-domain HMM, we "tie" all Contourlet coefficients in the same subband so that they share the same set of model parameters.

The hidden Markov tree (HMT) model is an HMM with a quad-tree dependency structure: it establishes links between the hidden state of each coefficient and those of its four children, and it uses the tree structure to describe the interscale dependence of Contourlet coefficients. Concretely, it models a Markov chain over the hidden states corresponding to the Contourlet coefficients rather than over the coefficients themselves; hence it is named the HMT model in the Contourlet domain. In the HMT model, the dependencies across scales are captured by a probabilistic tree that connects the hidden state variable of each coefficient with the state variables of its children. Each subband is represented by its own quad-tree, and the quad-trees are assumed independent. The dependencies across scales (between each parent and its children) form the transition probabilities between the hidden states. The HMT model thus directly establishes links between parents and children, and indirectly links coefficients to their adjacent coefficients through their shared parents. For a Contourlet decomposition of $J$ scales with $m_j$ directional subbands at scale $j$ ($j = 1, \ldots, J$, from coarse to fine), a Contourlet HMT model contains the following parameters:

• $p_{1,k}$ ($k = 1, \ldots, m_1$): the root state probability vector of each directional subband at the coarsest scale;
• $A_{j,k}$ ($j = 1, \ldots, J$; $k = 1, \ldots, m_j$): the state transition probability matrix to directional subband $k$ at scale $j$ from its parent subband at scale $j-1$;
• $\delta_{j,k}$ ($j = 1, \ldots, J$; $k = 1, \ldots, m_j$): the Gaussian standard deviation vector of the subband at scale $j$ in direction $k$.
The state transition probability matrix $A_{j,k}$ describes the persistence of coefficient magnitudes across scales. For each parent-child pair of hidden states $\{S_{\rho(i)}, S_i\}$, the state transition probabilities $\varepsilon_{i,m}^{\rho(i),m'}$ with $m, m' \in \{S, L\}$ represent the probability that the child coefficient is small or large given that its parent coefficient is small or large. For each $i$ in directional subband $k$ at scale $j$, we thus have
$$A_{j,k} = \begin{pmatrix} \varepsilon_{i,S}^{\rho(i),S} & \varepsilon_{i,L}^{\rho(i),S} \\ \varepsilon_{i,S}^{\rho(i),L} & \varepsilon_{i,L}^{\rho(i),L} \end{pmatrix} = \begin{pmatrix} \varepsilon_{i,S}^{\rho(i),S} & 1 - \varepsilon_{i,S}^{\rho(i),S} \\ 1 - \varepsilon_{i,L}^{\rho(i),L} & \varepsilon_{i,L}^{\rho(i),L} \end{pmatrix} \quad (2)$$

Because of the persistence property, $\varepsilon_{i,S}^{\rho(i),S}$ and $\varepsilon_{i,L}^{\rho(i),L}$ are large. All these parameters can be estimated by the EM algorithm [16].

Compared with the Wavelet HMT model, the Contourlet HMT model has the major advantage that it can account for inter-direction dependencies, while the Wavelet HMT model cannot. Fig. 4 illustrates this difference. In the Wavelet HMT model, the parent-child links always stay within the same direction among the three Wavelet directions; as a result, the Wavelet HMT models the coefficients in each direction independently. In contrast, a Contourlet parent coefficient can have its four children spread over two separate directional subbands, whereas a Wavelet parent's children always lie in a single subband. Consequently, the dependence tree in the Contourlet HMT can span several adjacent directions at the finer scales, and inter-direction dependencies are modeled in the same way as inter-location
dependencies. In other words, the Contourlet HMT model effectively captures all dependencies across scales, space, and directions.
Fig. 4. The difference in quad-tree structure between the Wavelet and Contourlet domains (two scales; four and eight directions, respectively): (a) Wavelet-domain quad-tree structure; (b) Contourlet-domain quad-tree structure
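To make Eq. (2) concrete, the sketch below (with illustrative values of the persistence probabilities, not estimates from data) builds the 2x2 transition matrix and propagates a parent's hidden-state distribution to a child along a quad-tree link:

import numpy as np

def transition_matrix(eps_ss, eps_ll):
    """Rows: parent state (S, L); columns: child state (S, L); rows sum to 1."""
    return np.array([[eps_ss, 1.0 - eps_ss],
                     [1.0 - eps_ll, eps_ll]])

A = transition_matrix(eps_ss=0.9, eps_ll=0.8)  # large diagonals: persistence
p_parent = np.array([0.7, 0.3])                # P(parent=S), P(parent=L)
p_child = p_parent @ A                         # child state distribution
print(A)
print(p_child)  # children of likely-small parents are themselves likely small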
3.2 Fusion Segmentation of SAR Images Based on D-S Theory of Evidence

The segmentation algorithm based on the HMT model yields segmentation results at different scales, but it does not fuse the results of the coarser and finer scales when making the classification judgement, nor does it take the intrascale spatial neighborhood information into consideration, so it cannot fundamentally eliminate the effect of speckle noise. To this end, we propose a new method that employs the D-S theory of evidence to combine the segmentation results of the coarser scale with those of the finer scale when making the classification decision. The fused information includes two aspects: the class to which the parent at the coarser scale $j-1$ belongs, and the segmentation information of the neighborhood system at the finer scale $j$. In this way, we fuse the segmentation information of each scale into the final segmentation result. Our algorithm therefore considers not only the global and detail information but also the neighborhood information, which effectively reduces the effect of speckle noise.

The basis of the theory of evidence is to define a mass of evidence (or belief) associated with each labeling proposition for a pixel from a particular data source: $m(A)$ is the mass of evidence in favor of a pixel having the thematic label $A$. Some propositions can be combinations of label types, reflecting mixed pixels and other forms of ambiguity. Mass can also be allocated to the user's uncertainty about the effectiveness of the original mass distribution. The initial values of mass over the candidate labels could be generated from, say, the posterior probabilities found by a maximum likelihood classification. If there is a second source of data, the user can generate another distribution of mass or belief over the labeling propositions. The orthogonal sum is a key feature of the Dempster-Shafer approach; it allows the mass distributions from two or more data sources to be combined and, in so doing, reduces the uncertainty associated with the overall labeling. For two data sources, the orthogonal sum is defined as
$$m_{12}(z) = \frac{1}{1 - \beta} \sum_{x \cap y = z} m_1(x)\, m_2(y) = m_1(x) \oplus m_2(y), \quad \beta \neq 1 \quad (3)$$

where

$$\beta = \sum_{x \cap y = \phi} m_1(x)\, m_2(y) = 1 - \sum_{x \cap y \neq \phi} m_1(x)\, m_2(y) \quad (4)$$
in which $\phi$ is the null set and the operator $\oplus$ denotes the orthogonal sum.

Now consider a four-pixel neighborhood system, and suppose we have a mass distribution associated with each of the four neighbors of a pixel that expresses what that neighbor recommends for the central pixel itself. At scale $j$, for a neighbor $n$ labeled $w_j$, we can represent the mass of evidence in favor of pixel $g$ being allocated to class $w_c$ by

$$m(g \in w_c \mid n \in w_j) \quad (5)$$
The total evidence in favor of this labeling, however, must account for all labeling possibilities $w_j$ of the neighbor, so we rewrite (5) as

$$m(g \in w_c \mid n \in w_j, \forall j) \quad (6)$$
To incorporate the global property of the coarser scale, we directly take into consideration the class to which the parent belongs. For the central pixel $g$ at scale $j$, the parent is denoted $c_\rho(g)$; the mass of evidence that the central pixel $g$ belongs to class $w_c$, given that its parent $c_\rho(g)$ belongs to $w_j$, is defined as

$$m(g \in w_c \mid c_\rho(g) \in w_j, \forall j) \quad (7)$$
The distribution of evidential mass over all possible labels for the central pixel $g$ can then be expressed as

$$m_{g,n,c_\rho(g)}(w_c) = \{\, m(g \in w_1 \mid n \in w_j, \forall j), \ldots, m(g \in w_2 \mid c_\rho(g) \in w_j, \forall j), \theta_n \,\} \quad (8)$$

in which $\theta_n$ is a weighting factor expressing the importance of the neighbor $n$ in relation to the central pixel. The four neighbor recommendations can be combined through the orthogonal sum to form a joint neighborhood opinion:
$$m_g(w_c) = \bigoplus_{n,\, c_\rho(g)} m_{g,n,c_\rho(g)}(w_c) \quad (9)$$
As with other spatial consistency methods, the process must be iterated, since the mass distributions on all pixels are changed by each application of the orthogonal sum. Thus, at the $(k+1)$th iteration,

$$m^{k+1}(w_c) = m^k(w_c) \oplus m_g^k(w_c) = m^k(w_c) \oplus \bigoplus_n m_{g,n,c_\rho(g)}(w_c) \quad (10)$$

where $m^k(w_c)$ is the mass distribution on the central pixel at iteration $k$. To initialize the iteration of Eq. (10), we choose

$$m(g \in w_c \mid n \in w_j, \forall j) = (1 - \theta_n) \sum_j p_{gn}(w_c \mid w_j)\, p_n^k(w_j) \quad (11)$$
The value of $p_{gn}(w_c \mid w_j)$ can be obtained from the Gibbs distribution [17]:

$$p_{gn}(w_c \mid w_j) = \frac{1}{z} \exp\{-U(w_c)\} \quad (12)$$

$$U(w_c) = \sum_{\partial g} \beta\, [1 - \delta(w_c, w_{\partial g \rho})] \quad (13)$$

in which $w_{\partial g \rho}$ denotes the neighborhood nodes and the parent node of the central pixel $g$. From the orthogonal sum (3) it can be deduced that a completely uncertain source has no impact on the result, so we control the number of iterations by changing the value of $\theta_n$: we select an initial value for $\theta_n$ and increase it by a step $d$ during the iterative process; when $\theta_n$ reaches 1, the iteration stops. In this paper we set the initial value $\theta_n = 0.4$ and $d = 0.125$.
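A compact sketch of this combination-and-iteration step follows; the helper orthogonal_sum, the singleton-plus-uncertainty mass layout, and the example numbers are our assumptions for illustration, not the authors' implementation:

import numpy as np

def orthogonal_sum(m1, m2):
    """Dempster's rule (Eq. 3) for mass vectors over C exclusive labels,
    each with a trailing 'uncertainty' mass assigned to the full frame."""
    c = len(m1) - 1                  # number of singleton labels
    m1, m2 = np.asarray(m1), np.asarray(m2)
    out = np.empty(c + 1)
    for z in range(c):               # intersections yielding singleton z
        out[z] = m1[z] * m2[z] + m1[z] * m2[c] + m1[c] * m2[z]
    out[c] = m1[c] * m2[c]           # only frame-with-frame keeps the frame
    beta = 1.0 - out.sum()           # conflicting mass, as in Eq. (4)
    return out / (1.0 - beta)

theta, d = 0.4, 0.125
m = np.array([0.5, 0.1, 0.4])        # two labels + uncertainty mass
while theta < 1.0:                   # stop once the sources become vacuous
    neighbor = np.array([(1 - theta) * 0.7, (1 - theta) * 0.3, theta])
    m = orthogonal_sum(m, neighbor)
    theta += d
print(m)                             # fused mass distribution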
3.3 Multiscale Fusion Segmentation Formula Based on MAP

We employ the MAP (maximum a posteriori) segmentation approach in this paper. Let $H$ be the set of pixels, $X = \{x_1, x_2, \ldots, x_H\}$ the observed vectors, and $\Omega = \{w_{c1}, w_{c2}, \ldots, w_{cH}\}$ the corresponding true classes of the pixels. Suppose there are $C$ classes in the image; each $w_{ci}$ ($i = 1, \ldots, H$) can then be one of the $C$ available classes. The MAP method optimizes the estimation of the class labels
$\hat{\Omega}$ by maximizing the global posterior probability $p(\Omega \mid X)$. From Bayes' theorem [10], the MAP labeling $\hat{\Omega}$ is found to be

$$\hat{\Omega} = \arg\max_{\Omega} \{\, p(X \mid \Omega)\, p(\Omega) \,\} \quad (14)$$
The distribution $p(\Omega)$ is the prior probability of the full scene labeling. Unfortunately, Bayesian methods coupled with Markovian modeling usually result in a nonconvex objective function with many local maxima and minima, so global optimization algorithms, such as simulated annealing [18], should in principle be applied. Theoretically, if the cooling process is implemented infinitely slowly, simulated annealing reaches a global optimum in probability. However, both the calculation of the joint probability over the entire image and the random perturbation of each pixel value are extremely computationally demanding. In this work, the posterior probability is therefore maximized with the iterated conditional modes (ICM) algorithm, a deterministic optimization algorithm proposed in [19] that maximizes the local conditional probability iteratively. Using the ICM method, the optimal hidden state of the Contourlet coefficient at pixel $g$ can be iteratively updated by maximizing the local distribution $p(w_c \mid x_g, w_{H \setminus g})$, that is,
$$\hat{w}_c = \arg\max \{\, p(w_c \mid x_g, w_{H \setminus g}) \,\}, \quad \forall g \quad (15)$$

where $H \setminus g$ denotes the set of all pixels in the image except pixel $g$, and we have the following expression [18]:

$$p(w_c \mid x_g, w_{H \setminus g}) \propto p(x_g \mid w_c)\, p(w_c \mid w_{\partial g}) \quad (16)$$
where $w_{\partial g}$ is the labeling of the pixels in a neighborhood surrounding pixel $g$, and $p(x_g \mid w_c)$ denotes the probability of the central pixel $g$ given that it belongs to class $w_c$; we determine it as follows.

We now turn to solving (16). From Section 2 it is known that the distribution of Contourlet coefficients is non-Gaussian but conditionally Gaussian, so we adopt a mixture density of two normal distributions to model the Contourlet coefficients. To capture the insignificant/significant coefficient property, we define for each Contourlet coefficient a binary hidden state $s$, which takes the value 0 (insignificant coefficient) or 1 (significant coefficient). The configuration of $s$ over the entire Contourlet subband image forms a binary mask. The marginal pdf of the Contourlet coefficients is then defined as
$$f(x_g) = \sum_m p(s = m)\, p(x_g \mid s = m) \quad (17)$$
where $p(x_g \mid s = m) \sim N(0, \sigma_{xm}^2)$ stands for a Gaussian distribution with zero mean and variance $\sigma_{xm}^2$, and $p(s = m)$ is the probability mass function (pmf), with $p(s = 0) + p(s = 1) = 1$. Using this model, we can assume a prior probability function for $p(x_g \mid w_c)$: for an image of $C$ texture classes, the marginal pdf of the Contourlet coefficients can be viewed as a mixture density of $C$ Gaussian distributions, in which the class label $w_c$ is considered the hidden state. Thus, the measure of the pixel $g$ with respect to the class $w_c$ is
$$p(x_g \mid w_c) = \sum_m p(w_c = m)\, p(x_g \mid w_c = m) \quad (18)$$

The parameters $\sigma_{xm}$ and $p(s = m)$ of the coefficients are estimated with the EM algorithm, and a rough segmentation of the image is obtained from (18). $p(w_c \mid w_{\partial g})$ fuses the intrascale dependencies, and we employ the Gibbs distribution [19] to obtain it:
$$p(w_c \mid w_{\partial g}) = \frac{1}{z} \exp\{-U(w_c)\} \quad (19)$$
where $z$ is a normalizing factor and $U(w_c)$ is an energy function. According to the Ising model, the energy exponent is expressed as
$$U(w_c) = \sum_{\partial g} \beta\, [1 - \delta(w_c, w_{\partial g})] \quad (20)$$

where $\delta(w_c, w_{\partial g})$ is the Kronecker delta, which is unity if its arguments are equal and zero otherwise; the second argument implies that each member of the neighborhood is tested and the results are summed according to (20). $\beta > 0$ is a parameter whose value is fixed by the user when applying the MRF technique. Thus, the ultimate fusion segmentation formula based on the D-S theory of evidence is
$$w_{cH} = \arg\max_{w_c \in \{1, 2, \ldots, C\}} p(x_g \mid w_c)\, p(w_c \mid w_{\partial g \rho}) \quad (21)$$
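The decision rule (21) can be sketched as an ICM-style sweep. The code below is illustrative only: the Gaussian likelihood model, the beta value, and the helper names gibbs_prior and icm_step are our placeholders, and for brevity the prior uses a plain four-neighborhood without the parent term of $w_{\partial g \rho}$:

import numpy as np

def gibbs_prior(labels, i, j, c, beta=1.0):
    """exp(-U) up to a constant, with U from Eq. (20) over 4 neighbors."""
    h, w = labels.shape
    nbrs = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    u = sum(beta * (labels[a, b] != c)
            for a, b in nbrs if 0 <= a < h and 0 <= b < w)
    return np.exp(-u)

def icm_step(x, labels, sigmas):
    """One sweep: maximize p(x_g|w_c) * p(w_c|w_neighbors) per pixel."""
    h, w = x.shape
    new = labels.copy()
    for i in range(h):
        for j in range(w):
            scores = [np.exp(-0.5 * (x[i, j] / s) ** 2) / s
                      * gibbs_prior(labels, i, j, c)
                      for c, s in enumerate(sigmas)]
            new[i, j] = int(np.argmax(scores))
    return new

rng = np.random.default_rng(2)
x = rng.normal(0, 1, (32, 32))          # stand-in coefficient image
labels = rng.integers(0, 2, (32, 32))   # rough initial segmentation
for _ in range(5):                      # a few deterministic sweeps
    labels = icm_step(x, labels, sigmas=[0.5, 2.0])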
4 Experimental Results and Analysis

For the Contourlet transform, we use the '9-7' biorthogonal filters [20] for the multiscale decomposition stage and the 'PKVA' filter [21] for the directional decomposition stage. To test the validity and general applicability of the proposed algorithm, four images are chosen. The first two are synthetic texture images and their noisy versions, corrupted by multiplicative speckle noise of variance 0.005; the others are real SAR images from the Jet Propulsion Laboratory.
Fig. 5. The segmentation results of synthetic texture images: (a) synthetic texture image corrupted by multiplicative speckle noise of variance 0.005; (b) manual segmentation; (c) Wavelet-domain HMT-MRF segmentation; (d) proposed method
Fig. 6. The segmentation results of real SAR images: (a) original image; (b) Contourlet-domain MRF segmentation; (c) Wavelet-domain HMT-MRF segmentation; (d) Contourlet-domain HMT segmentation; (e) proposed method
We begin with the two synthetic texture images to illustrate the performance of our algorithm, as shown in Fig. 5. In Fig. 5, Case 1, we observe that both the Wavelet-based and Contourlet-based methods give good segmentation results. However, to
segment a circle, our method obtains a smoother contour, whereas the Wavelet-based method cannot. In Fig. 5, Case 2, the needle-like features (aiguilles) are cut off abruptly by the Wavelet method. This is due to the inability of the separable Wavelet, which is the extension of the one-dimensional Wavelet and cannot optimally represent two-dimensional singularities; our Contourlet-based method gives a better description of these features. Similar behavior is observed in the other texture image.

Table 1. Segmentation misclassification ratios of different algorithms for two texture images
Images                       Case1     Case2
HMT-MRF in Wavelet domain    0.0174    0.0300
The proposed method          0.0159    0.0181
We use the misclassification rate as the objective evaluation criterion, as shown in Table 1. The misclassification rate is defined as the ratio of the number of misclassified pixels to the total number of pixels in a class. The evaluation of the two synthetic texture segmentation results is given in Table 1.

The segmentation of synthetic texture images shows the effectiveness of the proposed algorithm but not its breadth of application, so we also test on real SAR images. Fig. 6, Case 1 is an L-band, HH-polarized airborne SAR image of size 128 × 128 with ENL = 1.4815. For comparison, the Contourlet-domain HMT method, the MRF method, and the Wavelet-domain HMT-MRF method are also presented. In Fig. 6, Case 1, the two textures are land and sea, respectively. From Fig. 6, Case 1(a), we can see that some regions of the land have the same statistical properties as the sea, so the Contourlet-domain MRF method, which considers only neighbors, and the HMT method, which captures only the persistence of Contourlet coefficients, cannot produce an accurate segmentation and make mistakes. Our method, which fuses the clustering and persistence of Contourlet coefficients using the HMT model and D-S theory, effectively suppresses the influence of speckle and achieves superior contour representation and excellent visual performance. Fig. 6, Case 2 is a Ku-band airborne SAR image of size 128 × 128 with ENL = 3.2203. Fig. 6, Case 3 is a real L-band airborne SAR image of size 512 × 512 with ENL = 2.8406. Similar behavior is observed in Fig. 6, Cases 2 and 3, which confirms the validity of our method.
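For reference, a small sketch of this evaluation criterion (our helper names; the toy prediction and ground truth are synthetic) computes the rate both overall and restricted to one ground-truth class, matching the per-class definition above:

import numpy as np

def misclassification_rate(pred, truth, cls=None):
    """Fraction of wrongly labeled pixels, optionally within one class."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    if cls is not None:                       # per-class rate
        sel = truth == cls
        return float(np.mean(pred[sel] != truth[sel]))
    return float(np.mean(pred != truth))      # overall rate

truth = np.zeros((8, 8), dtype=int); truth[:, 4:] = 1
pred = truth.copy(); pred[0, 0] = 1           # one wrongly labeled pixel
print(misclassification_rate(pred, truth))            # 1/64 = 0.015625
print(misclassification_rate(pred, truth, cls=0))     # 1/32 = 0.03125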
5 Conclusions

In this paper we proposed a new fusion segmentation algorithm for SAR images based on the HMT in the Contourlet domain and the D-S theory of evidence. The algorithm captures the persistence and clustering properties of Contourlet coefficients, which are modeled by the HMT and the D-S theory of evidence, respectively. We employ the MAP
approach for SAR image segmentation and use a relaxation approach to deduce the ultimate MAP fusion segmentation formula. We applied the proposed algorithm to synthetic texture images and real SAR images. Numerical experimental results show that the proposed algorithm effectively suppresses the effect of noise and provides superior contour representation and excellent visual performance.
References

1. Bovolo, F., Bruzzone, L.: A Detail-Preserving Scale-Driven Approach to Change Detection in Multitemporal SAR Images. IEEE Transactions on Geoscience and Remote Sensing 43(12), 2963–2972 (2005)
2. Pierce, L.E., Ulaby, F.T.: SAR Speckle Reduction using Wavelet Denoising and Markov Random Field Modeling. IEEE Transactions on Geoscience and Remote Sensing 40(10), 2196–2212 (2002)
3. Wu, Y., Wang, X., Liao, G.-S.: SAR Images Despeckling via Bayesian Fuzzy Shrinkage Based on Stationary Wavelet Transform. Applied and Numerical Harmonic Analysis 12, 407–418 (2006)
4. Wu, Y., Zhang, Q., Wang, X., Liao, G.-S.: SAR Images Despeckling Based on Hidden Markov Mixture Model in the Wavelet Domain. In: CIE International Conference on Radar, Shanghai, China, pp. 16–19 (2006)
5. Zhang, Q., Wu, Y.: Wavelet Markov Random Field Based on Context and Hidden Class Label for SAR Image Segmentation. Journal of Electronics & Information Technology 30(1), 211–215 (2008)
6. Do, M.N., Vetterli, M.: Contourlets: a new directional multiresolution image representation. In: Conference Record of the Thirty-Sixth Asilomar Conference on Signals, Systems and Computers, vol. 1, pp. 3–6 (2002)
7. Candès, E.J., Donoho, D.L.: Curvelets. Technical report, Stanford University (1999)
8. Ni, W., Guo, B., Yan, Y.: Speckle Suppression for SAR Images Based on Adaptive Shrinkage in Contourlet Domain. In: Proceedings of the 6th World Congress on Intelligent Control and Automation, pp. 10017–10027 (2006)
9. Miao, Q., Wang, B.: A Novel Image Fusion Method Using Contourlet Transform. IEEE Transactions on Signal Processing, 548–552 (2006)
10. Bouman, C.A., Shapiro, M.: A multiscale random field model for Bayesian image segmentation. IEEE Transactions on Image Processing 3(2), 162–177 (1994)
11. Venkatachalam, V., Choi, H.: Multiscale SAR image segmentation using wavelet-domain hidden Markov tree model. In: SPIE, vol. 3497, pp. 141–151 (1998)
12. Choi, H., Baraniuk, R.G.: Multiscale Image Segmentation Using Wavelet-Domain Hidden Markov Models. IEEE Transactions on Image Processing 10(9), 1309–1321 (2001)
13. Sha, Y.-H., Jiao, L.-C.: Unsupervised Image Segmentation Using Contourlet Domain Hidden Markov Tree Model. In: Kamel, M.S., Campilho, A.C. (eds.) ICIAR 2005. LNCS, vol. 3656, pp. 32–39. Springer, Heidelberg (2005)
14. Richards, J.A., Jia, X.-P.: A Dempster-Shafer Relaxation Approach to Context Classification. IEEE Transactions on Geoscience and Remote Sensing 45(5), 1422–1431 (2007)
15. Po, D.D.-Y., Do, M.N.: Directional multiscale modeling of images using the Contourlet transform. IEEE Transactions on Image Processing 15(6), 1610–1620 (2006)
16. Crouse, M.S., Nowak, R.D., Baraniuk, R.G.: Wavelet-Based Statistical Signal Processing Using Hidden Markov Models. IEEE Transactions on Signal Processing 46(4), 886–902 (1998)
17. Derin, H., Cole, W.: Segmentation of textured images using Gibbs random fields. CVGIP 35(1), 72–98 (1986)
18. Aarts, E.H.L.: Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing. Wiley, New York (1989)
19. Besag, J.: On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society B 48(3), 259–302 (1986)
20. Vetterli, M., Herley, C.: Wavelets and filter banks: Theory and design. IEEE Transactions on Signal Processing 40(9), 2207–2232 (1992)
21. Phoong, S.-M., Kim, C.W., Vaidyanathan, P.P., Ansari, R.: A new class of two-channel biorthogonal filter banks and wavelet bases. IEEE Transactions on Signal Processing 43(3), 649–665 (1995)