Communications in Computer and Information Science
87
Filip Zavoral Jakub Yaghob Pit Pichappan Eyas El-Qawasmeh (Eds.)
Networked Digital Technologies Second International Conference, NDT 2010 Prague, Czech Republic, July 7-9, 2010 Proceedings, Part I
Volume Editors
Filip Zavoral, Charles University, Prague, Czech Republic, E-mail: [email protected]
Jakub Yaghob, Charles University, Prague, Czech Republic, E-mail: [email protected]
Pit Pichappan, Al Imam University, Riyadh, Saudi Arabia, E-mail: [email protected]
Eyas El-Qawasmeh, Jordan University of Science and Technology, Irbid, Jordan, E-mail: [email protected]
Library of Congress Control Number: Applied for
CR Subject Classification (1998): H.4, C.2, H.3, I.2, D.2, H.5
ISSN 1865-0929
ISBN-10 3-642-14291-5 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-14291-8 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2010 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper 06/3180 543210
Message from the Chairs
The Second International Conference on ‘Networked Digital Technologies’ (NDT2010)––co-sponsored by Springer––was organized and hosted by the Charles University in Prague, Czech Republic, during July 7–9, 2010 in association with the Digital Information Research Foundation, India. NDT2010 was planned as a major event in the computer and information sciences and served as a forum for scientists and engineers to meet and present their latest research results, ideas, and papers in the diverse areas of Web and Internet technologies, computer science, and information technology. This scientific conference included guest lectures and the presentation of 85 research papers in the technical session. This meeting was a great opportunity to exchange knowledge and experience for all the participants who joined us from all over the world and to discuss new ideas in the area of Web applications. We are grateful to the Charles University in Prague for hosting this conference. We use this occasion to express our thanks to the Technical Committee and to all the external reviewers. We are grateful to Springer for co-sponsoring the event. Finally, we would like to thank all the participants and sponsors.
May 2010
Filip Zavoral Mark Wachowiak Jakub Yaghob Veli Hakkoymaz
Preface
On behalf of the NDT 2010 conference, the Program Committee and Charles University in Prague, Czech Republic, we welcome you to the proceedings of the Second International Conference on ‘Networked Digital Technologies’ (NDT 2010). The NDT 2010 conference explored new advances in digital and Web technology applications. It brought together researchers from various areas of computer and information sciences who addressed both theoretical and applied aspects of Web technology and Internet applications. We hope that the discussions and exchange of ideas that took place will contribute to advancements in the technology in the near future. The conference received 216 papers, out of which 85 were accepted, resulting in an acceptance rate of 39%. These accepted papers are authored by researchers from 34 countries covering many significant areas of Web applications. Each paper was evaluated by a minimum of two reviewers. Finally, we believe that the proceedings document the best research in the studied areas. We express our thanks to the Charles University in Prague, Springer, the authors and the organizers of the conference.
May 2010
Filip Zavoral Mark Wachowiak Jakub Yaghob Veli Hakkoymaz
Organization
General Chairs
Filip Zavoral          Charles University, Czech Republic
Mark Wachowiak         Nipissing University, Canada

Program Chairs
Jakub Yaghob           Charles University, Czech Republic
Veli Hakkoymaz         Fatih University, Turkey

Program Co-chairs
Noraziah Ahmad         University Malaysia Pahang, Malaysia
Yoshiro Imai           Kagawa University, Japan
Eyas El-Qawasmeh       Jordan University of Science and Technology, Jordan

Publicity Chair
Maytham Safar          Kuwait University, Kuwait

Proceedings Chair
Pit Pichappan          Al Imam University, Saudi Arabia
Table of Contents – Part I
Information and Data Management A New Approach for Fingerprint Matching Using Logic Synthesis . . . . . . Fatih Başçiftçi and Celal Karaca
1
Extracting Fuzzy Rules to Classify Motor Imagery Based on a Neural Network with Weighted Fuzzy Membership Functions . . . . . . . . . . . . . . . . . Sang-Hong Lee, Joon S. Lim, and Dong-Kun Shin
7
Distributed Data-Mining in the LISp-Miner System Using Techila Grid . . . . . . . . . . . . . . . . . . . . . . . . Milan Šimůnek and Teppo Tammisto
15
Non-negative Matrix Factorization on GPU . . . . . . . . . . . . . . . . . . . . . . . . . Jan Platoš, Petr Gajdoš, Pavel Krömer, and Václav Snášel
21
Chatbot Enhanced Algorithms: A Case Study on Implementation in Bahasa Malaysia Human Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Abbas Saliimi Lokman and Jasni Mohamad Zain
31
Handwritten Digits Recognition Based on Swarm Optimization Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Salima Nebti and Abdellah Boukerram
45
A Framework of Dashboard System for Higher Education Using Graph-Based Visualization Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wan Maseri Binti Wan Mohd, Abdullah Embong, and Jasni Mohd Zain An Efficient Indexing and Compressing Scheme for XML Query Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I-En Liao, Wen-Chiao Hsu, and Yu-Lin Chen
55
70
Development of a New Compression Scheme . . . . . . . . . . . . . . . . . . . . . . . . . Eyas El-Qawasmeh, Ahmed Mansour, and Mohammad Al-Towiq
85
Compression of Layered Documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bruno Carpentieri
91
Classifier Hypothesis Generation Using Visual Analysis Methods . . . . . . . Christin Seifert, Vedran Sabol, and Michael Granitzer
98
Exploiting Punctuations along with Sliding Windows to Optimize STREAM Data Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lokesh Tiwari and Hamid Shahnasser
112
A Framework for In-House Prediction Markets . . . . . . . . . . . . . . . . . . . . . . . Miguel Velacso and Nenad Jukic Road Region Extraction Based on Motion Information and Seeded Region Growing for Foreground Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . Hongwu Qin, Jasni Mohamad Zain, Xiuqin Ma, and Tao Hai Process Mining Approach to Promote Business Intelligence in Iranian Detectives’ Police . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mehdi Ghazanfari, Mohammad Fathian, Mostafa Jafari, and Saeed Rouhani Copyright Protection of Relational Database Systems . . . . . . . . . . . . . . . . . Ali Al-Haj, Ashraf Odeh, and Shadi Masadeh Resolving Semantic Interoperability Challenges in XML Schema Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chiw Yi Lee, Hamidah Ibrahim, Mohamed Othman, and Razali Yaakob Some Results in Bipolar-Valued Fuzzy BCK/BCI-Algebras . . . . . . . . . . . A. Borumand Saeid and M. Kuchaki Rafsanjani
120
128
135
143
151
163
Security The Effect of Attentiveness on Information Security . . . . . . . . . . . . . . . . . . Adeeb M. Alhomoud
169
A Secured Mobile Payment Model for Developing Markets . . . . . . . . . . . . Bossi Masamila, Fredrick Mtenzi, Jafari Said, and Rose Tinabo
175
Security Mapping to Enhance Matching Fine-Grained Security Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Monia Ben Brahim, Maher Ben Jemaa, and Mohamed Jmaiel
183
Implementation and Evaluation of Fast Parallel Packet Filters on a Cell Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yoshiyuki Yamashita and Masato Tsuru
197
On the Algebraic Expression of the AES S-Box Like S-Boxes . . . . . . . . . . M. Tolga Sakallı, Bora Aslan, Ercan Buluş, Andaç Şahin Mesut, Fatma Büyüksaraçoğlu, and Osman Karaahmetoğlu
213
Student’s Polls for Teaching Quality Evaluation as an Electronic Voting System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Marcin Kucharczyk
228
An Improved Estimation of the RSA Quantum Breaking Success Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Piotr Zawadzki
234
Mining Bluetooth Attacks in Smart Phones . . . . . . . . . . . . . . . . . . . . . . . . . Seyed Morteza Babamir, Reyhane Nowrouzi, and Hadi Naseri
241
Users’ Acceptance of Secure Biometrics Authentication System: Reliability and Validate of an Extended UTAUT Model . . . . . . . . . . . . . . . Fahad AL-Harby, Rami Qahwaji, and Mumtaz Kamala
254
Two Dimensional Labelled Security Model with Partially Trusted Subjects and Its Enforcement Using SELinux DTE Mechanism . . . . . . . . Jaroslav Janáček
259
A Roaming-Based Anonymous Authentication Scheme in Multi-domains Vehicular Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chih-Hung Wang and Po-Chin Lee
273
Human Authentication Using FingerIris Algorithm Based on Statistical Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ahmed B. Elmadani
288
Aerial Threat Perception Architecture Using Data Mining . . . . . . . . . . . . . M. Anwar-ul-Haq, Asad Waqar Malik, and Shoab A. Khan
297
Payload Encoding for Secure Extraction Process in Multiple Frequency Domain Steganography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Raoof Smko, Abdelsalam Almarimi, and K. Negrat
306
An Implementation of Digital Image Watermarking Based on Particle Swarm Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hai Tao, Jasni Mohamad Zain, Ahmed N. Abd Alla, and Qin Hongwu
314
Genetic Cryptanalysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Abdelwadood Mesleh, Bilal Zahran, Anwar Al-Abadi, Samer Hamed, Nawal Al-Zabin, Heba Bargouthi, and Iman Maharmeh
321
Multiple Layer Reversible Images Watermarking Using Enhancement of Difference Expansion Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shahidan M. Abdullah and Azizah A. Manaf
333
Modeling and Analysis of Reconfigurable Systems Using Flexible Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Laid Kahloul, Allaoua Chaoui, and Karim Djouani
343
Using Privilege Chain for Access Control and Trustiness of Resources in Cloud Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jong P. Yoon and Z. Chen
358
Social Networks Modeling of Trust to Provide Users Assisted Secure Actions in Online Communities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lenuta Alboaie and Mircea-F. Vaida
369
A Collaborative Social Decision Model for Digital Content Credibility Improvement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yuan-Chu Hwang
383
Improving Similarity-Based Methods for Information Propagation on Social Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Francesco Buccafurri and Gianluca Lax
391
Approaches to Privacy Protection in Location-Based Services . . . . . . . . . . Anna Rohunen and Jouni Markkula Social Media as Means for Company Communication and Service Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Elina Annanper¨ a and Jouni Markkula A Problem-Centered Collaborative Tutoring System for Teachers Lifelong Learning: Knowledge Sharing to Solve Practical Professional Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Thierry Condamines Bridging the Gap between Web 2.0 Technologies and Social Computing Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Giorgos Kormaris and Marco Spruit
402
410
420
430
Ontology Using Similarity Values for Ontology Matching in the Grid . . . . . . . . . . . . Axel Tenschert Rapid Creation and Deployment of Communities of Interest Using the CMap Ontology Editor and the KAoS Policy Services Framework . . . . . . Andrzej Uszok, Jeffrey M. Bradshaw, Tom Eskridge, and James Hanna
444
451
Incorporating Semantics into an Intelligent Clothes Search System Using Ontology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ching-I Cheng, Damon Shing-Min Liu, and Li-Ting Chen
467
SPPODL: Semantic Peer Profile Based on Ontology and Description Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Younes Djaghloul and Zizette Boufaida
473
Ontology Based Tracking and Propagation of Provenance Metadata . . . . Miroslav Vacura and Vojtěch Svátek
489
Real Time Biometric Solutions for Networked Society A Real-Time In-Air Signature Biometric Technique Using a Mobile Device Embedding an Accelerometer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ´ J. Guerra Casanova, C. S´ anchez Avila, A. de Santos Sierra, G. Bailador del Pozo, and V. Jara Vera On-Demand Biometric Authentication of Computer Users Using Brain Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Isao Nakanishi and Chisei Miyamoto Encrypting Fingerprint Minutiae Templates by Random Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bian Yang, Davrondzhon Gafurov, Christoph Busch, and Patrick Bours
497
504
515
Web Applications Method for Countering Social Bookmarking Pollution using User Similarities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Takahiro Hatanaka and Hiroyuki Hisamatsu
523
A Human Readable Platform Independent Domain Specific Language for WSDL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Balazs Simon and Balazs Goldschmidt
529
A Human Readable Platform Independent Domain Specific Language for BPEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Balazs Simon, Balazs Goldschmidt, and Karoly Kondorosi
537
Impact of the Multimedia Traffic Sources in a Network Node Using FIFO scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tatiana Annoni Pazeto, Renato Moraes Silva, and Shusaburo Motoyama
545
Assessing the LCC Websites Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Saleh Alwahaishi and Václav Snášel
556
Expediency Heuristic in University Conference Webpage . . . . . . . . . . . . . . Roslina Mohd Sidek, Noraziah Ahmad, Mohamad Fadel Jamil Klaib, and Mohd Helmy Abd Wahab
566
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
577
Table of Contents – Part II
Green Computing and Health Care Informatics Lot-Size Planning with Non-linear Cost Functions Supporting Environmental Sustainability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Markus Heck and Guenter Schmidt Electronic Health Record (Dossier M´edical Personnel) as a Major Tool to Improve Healthcare in France: An Approach through the Situational Semiotic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Christian Bourret Data Mining Technique for Medical Diagnosis Using a New Smooth Support Vector Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Santi Wulan Purnami, Jasni Mohamad Zain, and Abdullah Embong
1
7
15
Rule Induction as a Technique in Genomic Analysis for Cancer . . . . . . . . M. Adib, Md. Mosharrof Hossain Sarker, S. Syed Ahmed, Ezendu Ariwa, and Fuzail Siddiqui
28
Clustering Analysis for Vasculitic Diseases . . . . . . . . . . . . . . . . . . . . . . . . . . Pınar Yıldırım, Çınar Çeken, Kağan Çeken, and Mehmet R. Tolun
36
Analysis on the Characteristics of Electroencephalogram (EEG) and the Duration of Acupuncture Efficacy, Depending on the Stimulation at the Acupuncture Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jeong-Hoon Shin and Dae-Hyeon Park
46
Web Services Architecture, Modeling and Design Open Service Platform Based Context-Aware Services across Home . . . . . Jin-Neng Wu and Yu-Chang Chao
60
Web Services Testing Approaches: A Survey and a Classification . . . . . . . Mohamad I. Ladan
70
Benefits of Semantics on Web Service Composition from a Complex Network Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chantal Cherifi, Vincent Labatut, and Jean-François Santucci
80
Development Tool for End-to-End QoS Sensitive Frameworks and Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bogdan Iancu, Adrian Peculea, and Vasile Teodor Dadarlat
91
Learning-Based Call Admission Control Framework for QoS Management in Heterogeneous Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . Abul Bashar, Gerard Parr, Sally McClean, Bryan Scotney, and Detlef Nauck
99
A Multi-Objective Particle Swarm Optimization for Web Service Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hamed Rezaie, Naser NematBaksh, and Farhad Mardukhi
112
A Comparison between EJB and COM+ Business Components, Case Study: Response Time and Scalability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Abedulhaq Abu-Kamel, Raid Zaghal, and Osama Hamed
123
Integration of Similar Location Based Services Proposed by Several Providers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Roula Karam, Franck Favetta, Rima Kilany, and Robert Laurini
136
Distributed and Parallel Computing A Central Management for Reducing Volumes of Data Harvested from Distributed Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Min-hwan Ok and Duck-shin Park A Trial Evaluation of Distributed Campus Network Environment Based on Comparison of Theoretical and Real Performance of Packet Flow Amount Using Video Transmission System . . . . . . . . . . . . . . . . . . . . . . . . . . Yoshiro Imai, Yukio Hori, Kazuyoshi Kawauchi, Mayumi Kondo, Toshikazu Sone, Yoshitaka Seno, Nobue Kawada, Shinobu Tada, Miho Yokoyama, and Rieko Miki Locality Preserving Scheme of Text Databases Representative in Distributed Information Retrieval Systems . . . . . . . . . . . . . . . . . . . . . . . . . . Mohammad Hassan and Yaser Hasan
145
152
162
Neural Networks Solving the Problem of Flow Shop Scheduling by Neural Network Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Saeed Rouhani, Mohammad Fathian, Mostafa Jafari, and Peyman Akhavan Artificial Neural Network-Based Algorithm for ARMA Model Order Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Khaled E. Al-Qawasmi, Adnan M. Al-Smadi, and Alaa Al-Hamami
172
184
Efficient Substructure Preserving MOR Using Real-Time Temporal Supervised Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Othman M.K. Alsmadi, Zaer. S. Abo-Hammour, and Adnan M. Al-Smadi
193
E-Learning Dynamic Properties of Knowledge Networks and Student Profile in e-Learning Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Radoslav Fasuga, Libor Holub, and Michal Radecký
203
An Optimized Cost-Benefit Analysis for the Evaluation in E-Learning Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gianni Fenu and Massimiliano Picconi
215
Services Recommendation in Systems Based on Service Oriented Architecture by Applying Modified ROCK Algorithm . . . . . . . . . . . . . . . . . Agnieszka Prusiewicz and Maciej Zieba
226
Web Mining Mining Website Log to Improve Its Findability . . . . . . . . . . . . . . . . . . . . . . Jiann-Cherng Shieh
239
Mining Relations between Wikipedia Categories . . . . . . . . . . . . . . . . . . . . . Julian Szymański
248
Web Document Classification by Keywords Using Random Forests . . . . . Myungsook Klassen and Nikhila Paturi
256
Wireless Networks Minimizing the Effects of Multi-rate WLANs by Adapting Link Adaptation and Call Admission Interfaces . . . . . . . . . . . . . . . M. Fatih Tüysüz and Hacı A. Mantar Marmot: A Novel Low-Power Platform for WSNs . . . . . . . . . . . . . . . . . . . Péter Völgyesi, János Sallai, Sándor Szilvási, Prabal Dutta, and Ákos Lédeczi
262
274
281
289
A Simulation Discipline in OpenUP to Satisfy Wireless Sensor Networks Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gian Ricardo Berkenbrock and Celso Massaki Hirata
295
Architecture for Interoperability between Instant Messaging and Presence Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Patricia E. Figueroa and Jesús A. Pérez
306
An Approach towards Time Synchronization Based Secure Protocol for Wireless Sensor Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Arun Kumar Tripathi and Ajay Agarwal
321
Intelligent Agent Based Systems, Cognitive and Reactive AI Systems Agent Behavior Diagram for Intelligent Agents . . . . . . . . . . . . . . . . . . . . . . Michal Radecký, Petr Gajdoš, and Radoslav Fasuga Multi-agent System Environment Based on Repeated Local Effect Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kazuho Igoshi, Takao Miura, and Isamu Shioya
333
342
Hierarchical Model of Trust in Contexts . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jan Samek and Frantisek Zboril
356
Multi-Agent Linear Array Sensors Modeling . . . . . . . . . . . . . . . . . . . . . . . . . Benadda Belkacem and Fethi Tarik Bendimerad
366
A Framework for Intelligent Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Diana F. Adamatti
376
Agent-Based Digital Networking in Furniture Manufacturing Enterprises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Anthony Karageorgos, Dimitra Avramouli, Christos Tjortjis, and Georgios Ntalos Detecting Malwares in Honeynet Using a Multi-agent System . . . . . . . . . . Michal Szczepanik and Ireneusz Jóźwiak Reputation Model with Forgiveness Factor for Semi-competitive E-Business Agent Societies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Radu Burete, Amelia Bădică, and Costin Bădică RoadMic: Road Surface Monitoring Using Vehicular Sensor Networks with Microphones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Artis Mednis, Girts Strazdins, Martins Liepins, Andris Gordjusins, and Leo Selavo
381
396
402
417
Model Generated Interface for Modeling and Applying Decisional Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Thomas Tamisier, Yoann Didry, Olivier Parisot, Jérôme Wax, and Fernand Feltz
430
Information and Data Management Directed Graph Representation and Traversal in Relational Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mohammad Beydoun and Ramzi A. Haraty
443
Transferring Clinical Information between Heterogeneous Hospital Database Systems in P2P Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Meghdad Mirabi, Hamidah Ibrahim, and Leila Fathi
456
Lowest Data Replication Storage of Binary Vote Assignment Data Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Noraziah Ahmad, Ainul Azila Che Fauzi, Roslina Mohd. Sidek, Noriyani Mat Zin, and Abul Hashem Beg The Location Path to Hell Is Paved With Unoptimized Axes: XPath Implementation Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Martin Kruliˇs and Jakub Yaghob Neighbour Replica Transaction Failure Framework in Data Grid . . . . . . . Noraziah Ahmad, Noriyani Mat Zin, Roslina Mohd. Sidek, Mohammad Fadel Jamil Klaib, and Mohd. Helmy Abd Wahab Mobile Agent-Based Digital Rights Management Scheme Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bartlomiej Zi´ olkowski and Janusz Stoklosa A Toolkit for Application Deployment on the Grid . . . . . . . . . . . . . . . . . . . Jie Tao and Holger Marten A Parallel Tree Based Strategy for Test Data Generation and Cost Calculation for Pairwise Combinatorial Interaction Testing . . . . . . . . . . . . Mohammad Fadel Jamil Klaib, Sangeetha Muthuraman, Noraziah Ahmad, and Roslina Mohd Sidek Integrity Check for Printed Binary Document Images . . . . . . . . . . . . . . . . . Dave Elliman, Peter Blanchfield, and Ammar Albakaa FACE – A Knowledge-Intensive Case-Based Architecture for Context-Aware Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Monica Vladoiu, J¨ org Cassens, and Zoran Constantinescu
466
474
488
496
503
509
523
533
Application of Genetic Algorithm in Automatic Software Testing . . . . . . . Faezeh Sadat Babamir, Alireza Hatamizadeh, Seyed Mehrdad Babamir, Mehdi Dabbaghian, and Ali Norouzi Reliability Optimization of Complex Systems Using Genetic Algorithm under Criticality Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Samer Hamed, Belal Ayyoub, and Nawal Al-Zabin A Novel Technique for ARMA Modelling with Order and Parameter Estimation Using Genetic Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zaer. S. Abo-Hammour, Othman M.K. Alsmadi, and Adnan M. Al-Smadi
545
553
564
Networks Metadata Management in P2P over Mobile Ad Hoc Network . . . . . . . . . . Pekka Kaipio and Jouni Markkula
577
Prediction of Network Delay with Variable Standard Deviation, Skewness and Kurtosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Renads Safins
587
A New Computational Model to Evaluate the Quality of Perceptual Voice Using E-Model in VOIP Communications . . . . . . . . . . . . . . . . . . . . . . Meysam Alavi and Hooman Nikmehr
594
Modeling and Verification of RBAC Security Policies Using Colored Petri Nets and CPN-Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Laid Kahloul, Karim Djouani, Walid Tfaili, Allaoua Chaoui, and Yacine Amirat GSM-Based Notification System for Electronic Pigeon Hole . . . . . . . . . . . . Mohd Helmy Abd Wahab, Ahmad Al’ Hafiz Riman, Herdawatie Abdul Kadir, Rahmat Sanudin, Ayob Johari, Roslina Mohd Sidek, and Noraziah Ahmad An Efficient Alert Broadcasting Scheme Considering Various Densities in VANET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hyunsook Kim New Secure Communication Protocols for Mobile E-Health System . . . . . M. Aramudhan and K. Mohan Determination of IDS Agent Nodes Based on Three-Tiered Key Management Framework for MANET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Marjan Kuchaki Rafsanjani and Arsham Borumand Saeid On Wind Power Station Production Prediction . . . . . . . . . . . . . . . . . . . . . . Jiˇr´ı Dvorsk´ y, Stanislav Miˇsa ´k, Luk´ aˇs Prokop, and Tadeusz Sikora
604
619
631 639
648 656
Packet Count Based Routing Mechanism – A Load Balancing Approach in MANETS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bollam Nagarjun, L. Sathish, S. Santhosh Chaitanya, Md. Tanvir Ansari, and Shashikala Tapaswi A Comparative Study of Statistical Feature Reduction Methods for Arabic Text Categorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Fouzi Harrag, Eyas El-Qawasmeh, and Abdul Malik S. Al-Salman A Scalable Framework for Serializable XQuery . . . . . . . . . . . . . . . . . . . . . . . Sebastian B¨ achle and Theo H¨ arder Artificial Neural Network Based Technique Compare with “GA” for Web Page Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ali Alarabi and Kamta Nath Mishra
669
676 683
699
Generating XForms from an XML Schema . . . . . . . . . . . . . . . . . . . . . . . . . . Ján Kasarda, Martin Nečaský, and Tomáš Bartoš
706
Semantic Information Retrieval on Peer-to-Peer Networks . . . . . . . . . . . . . Mehmet Ali Ertürk, A. Halim Zaim, and Selim Akyokuş
715
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
727
A New Approach for Fingerprint Matching Using Logic Synthesis Fatih Başçiftçi and Celal Karaca Technical Education Faculty, Selcuk University, Konya, Turkey
[email protected],
[email protected]
Abstract. In this study, a new approach based on logic synthesis for matching feature points from fingerprint images is developed and introduced. For fingerprint matching we propose a logic minimization method that makes the matching process easy and fast. We believe that the logic minimization method can be used as a reliable technique in fingerprint recognition. Keywords: Biometric, fingerprint recognition, logic synthesis.
1 Introduction of Biometrics and Fingerprint
A biometric is a physical or psychological feature that can be measured and quantified. This quantified feature can be used to authenticate a person with a degree of certainty by comparing different measurements of the feature. Clearly, the degree of certainty depends on the type and quality of the biometric and on the authentication algorithm used. Fingerprint biometrics was one of the first biometrics used for identification and authentication purposes. It is still widely used in many areas, and people accept that fingerprints are unique and can be used for identification. Since it is widely used, it is crucial to have a secure fingerprint authentication system. Generally, macro and micro features are used to identify a fingerprint image [1]. Macro features can be seen with the naked eye, but a sensor device is necessary to see the micro features. The most common macro features are ridge patterns, as illustrated in Figure 1. On the other hand, common minutia points (i.e., micro features) are the ridge ending, ridge bifurcation and dot (or island), as illustrated in Figure 2. Some of the main macro and micro features are marked in Figure 3. In our study, we work on a fingerprint image whose feature points have been determined, in order to match feature points between two fingerprint images using logic minimization. Feature points play an important role in our matching process.
2 Logic Synthesis
Two-level logic minimization is a basic problem in logic synthesis [3, 4]. The minimization of Boolean functions (BFs) can lead to more effective computer programs and circuits. Minimizing functions is important because electrical circuits consist of individual components implemented for each term or literal of a given function; minimization therefore allows designers to use fewer components, reducing the cost of a particular system [5].
Fig. 1. Ridge Patterns (Arch, Loop, Whorl)
Fig. 2. Micro Features (Ridge Ending, Bifurcation, Dot)
Fig. 3. Macro and Micro Features
A wide variety of Boolean minimization techniques have been explained in [3-8], most of which work on a two-step principle: first, identifying the prime implicants (PIs) of a chosen On-minterm and, second, determining a set of essential prime implicants (EPIs). Since the number of PIs can be as large as 3^n/n for a function of n variables, the PI identification step can become computationally impractical as n increases [4, 6]. In our study, we used logic synthesis to minimize the fingerprint points obtained from the scanned fingerprint image. Using logic minimization, we do not have to match all fingerprint points in the image; we only look at the determined feature points of the fingerprint image.
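As a quick, self-contained illustration of two-level minimization (our own example; the paper itself relies on the exact direct-cover method of [7] described in Sect. 2.1), SymPy's SOPform returns a minimal sum-of-products cover of a small invented function:

# Illustrative only: two-level (sum-of-products) minimization with SymPy.
# The three-variable ON-set below is invented for the example.
from sympy import symbols
from sympy.logic import SOPform

x1, x2, x3 = symbols('x1 x2 x3')
minterms = [[0, 1, 1], [1, 0, 1], [1, 1, 0], [1, 1, 1]]  # rows where the function is 1

print(SOPform([x1, x2, x3], minterms))
# prints (x1 & x2) | (x1 & x3) | (x2 & x3), a minimal cover of the ON-set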
2.1 Minimization Method
In order to simplify a given Boolean function, the Exact Direct Cover Minimization Algorithm has been developed. This algorithm is explained in [7] and is given below. For brevity and formality, the following notation is used: the number of variables of a function: n; Status Word: SW; the set of On-cubes: SON; the set of Off-cubes: SOFF; the element of SON currently being handled: λ; the set of prime implicants: SPI; the set of essential prime implicants: SEPI; coordinate subtraction (sharp product): #; union: ∪; temporary variables: Q0, Q1, α, β.
1. Put SW = ∅.
2. Take out the first minterm from the SON set and mark it by λ.
3. Transform one by one all elements of SOFF. Mark the result by Q0.
4. Apply the absorption operation to Q0. Mark the result by Q1.
5. Coordinate-subtract the set Q1 from the n-dimensional full cube. Mark the result by SPI.
6. Apply the Greater-or-Less operation to the SPI set. Note that element α is greater than element β if the set SON # α has fewer elements than the set SON # β.
7. If the result is not a single element, then SW = SW ∪ λ and go to 2.
8. If the result is a single element, then mark it as an Essential Prime Implicant (EPI).
9. Put SON = SON # EPI, SW = SW # EPI, SEPI = SEPI ∪ EPI.
10. If SON ≠ ∅, then go to 2.
11. If SW = ∅, then END, else SON = SW.
12. If SON = ∅ and SW ≠ ∅, then CALL BS.
13. Go to 3.
BS
1. Take out the first minterm from the SON set and mark it by λ.
2. Transform one by one all elements of SOFF. Mark the result by Q0.
3. Apply the absorption operation to Q0. Mark the result by Q1.
4. Coordinate-subtract the set Q1 from the n-dimensional full cube.
5. Apply the Greater-or-Less operation to the elements of the SPI set.
6. If the result is a single element, mark it as EPI; otherwise select one of them and mark it as EPI.
7. Put SON = SON # EPI, SEPI = SEPI ∪ EPI.
8. RETURN.
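To make the cube notation above concrete, the sketch below (our addition, an assumed representation rather than code from [7]) encodes a cube as a string over '0', '1' and '-' and implements the two set operations the algorithm relies on, absorption and the coordinate subtraction (sharp product) #; the transform step and the Greater-or-Less ordering are not reproduced.

# A cube is a string over {'0', '1', '-'}; '-' means the variable is absent.
def intersects(a, b):
    """True if cubes a and b share at least one minterm."""
    return all(x == y or x == '-' or y == '-' for x, y in zip(a, b))

def covers(a, b):
    """True if cube a covers cube b (b is a sub-cube of a)."""
    return all(x == '-' or x == y for x, y in zip(a, b))

def absorb(cubes):
    """Absorption: drop every cube that is covered by another cube."""
    return [c for c in cubes
            if not any(c != d and covers(d, c) for d in cubes)]

def sharp(a, b):
    """Coordinate subtraction a # b: cubes covering the minterms of a outside b."""
    if not intersects(a, b):
        return [a]
    result = []
    for i, (x, y) in enumerate(zip(a, b)):
        if x == '-' and y != '-':
            # restrict coordinate i of a to the complement of b's literal
            result.append(a[:i] + ('0' if y == '1' else '1') + a[i + 1:])
    return result  # an empty list means a is completely covered by b

def sharp_set(cubes, b):
    """Subtract cube b from every cube of a set and absorb the result."""
    out = []
    for c in cubes:
        out.extend(sharp(c, b))
    return absorb(out)

# Example: subtracting the Off-cube '11-' from the 3-dimensional full cube '---'
print(sharp_set(['---'], '11-'))  # ['0--', '-0-']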
3 Proposed Method
A fingerprint is comprised of ridges and valleys. The ridges are the dark areas of the fingerprint and the valleys are the white areas that exist between the ridges. Many classifications are given to the patterns that can arise in the ridges. The processing of a scanned fingerprint image is shown in Fig. 4 (original scanned image), Fig. 5 (image after the cleaning process), Fig. 6 (thinned ridge lines) and Fig. 7 (determined feature points) [9]. Our work starts after this preprocessing (a sketch of such a pipeline is given after Fig. 7 below).
Fig. 4. Input image
Fig. 5. Cleaned image
Fig. 6. Lines thinned image
Fig. 7. Feature points
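This preprocessing comes from [9] and is not a contribution of the present paper. Purely for illustration, a minimal sketch of such a pipeline (binarization, thinning and crossing-number minutiae detection), with scikit-image as an assumed library choice, could look like this:

# Hedged sketch of fingerprint preprocessing: binarize, thin the ridges,
# then locate minutiae with the classical crossing-number test.
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import skeletonize

def extract_minutiae(gray):
    """gray: 2-D array with ridges darker than the background."""
    binary = gray < threshold_otsu(gray)          # ridge pixels -> True
    skel = skeletonize(binary).astype(np.uint8)   # one-pixel-wide ridge lines

    endings, bifurcations = [], []
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),   # 8 neighbours in circular order
            (1, 1), (1, 0), (1, -1), (0, -1)]
    for r in range(1, skel.shape[0] - 1):
        for c in range(1, skel.shape[1] - 1):
            if not skel[r, c]:
                continue
            nb = [int(skel[r + dr, c + dc]) for dr, dc in offs]
            # crossing number: half the number of 0/1 transitions around the pixel
            cn = sum(abs(nb[i] - nb[(i + 1) % 8]) for i in range(8)) // 2
            if cn == 1:
                endings.append((r, c))        # ridge ending
            elif cn == 3:
                bifurcations.append((r, c))   # ridge bifurcation
    return endings, bifurcations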
In this study we propose a new approach showing how logic synthesis can be used to match fingerprints whose feature points have been determined. We choose 10 different points on the scanned fingerprint; these points are used as the input values of a logic function. Input values of the logic function: x1, x2, x3, x4, x5, x6, x7, x8, x9, x10. Output value of the logic function: Y. We chose 6 of these input values as feature points of the fingerprint. If these 6 points match on the fingerprint, the output value Y is 1 (Y = 1 means that the fingerprint matches); otherwise Y is 0 (Y = 0 means that the fingerprint does not match). The input and output cases are shown in Table 1. After logic minimization we obtained the simplified input values for the function, shown in Table 2.

Table 1. Input and output values for the function

Input    Input Cases
x1       Determined feature point
x2       Feature point
x3       Determined feature point
x4       Feature point
x5       Determined feature point
x6       Feature point
x7       Determined feature point
x8       Feature point
x9       Determined feature point
x10      Determined feature point

Output   Output Cases
Y        Match / not match
Table 2. Simplified input values for the function
Output symbol: Y
Output case: Match
1111110000 1000011111 1000101111 1000111101 1000111110 1001001111 1001010111 1001011011 1001011101 1001011110 1001100111 1001101011 1001101101 1001101110 1001110011 1001110101 1001110110 1001111001 1001111010 1001111100
Sample Simplification Functions (x1,x2,x3,x4,x5,x6,x7,x8,x9,x10) 1010001111 1100100111 1101100101 1011010110 1100101011 1101100110 1011011001 1100101101 1101101001 1011011010 1100101110 1101101010 1011011100 1100110011 1101101100 1011100011 1100110101 1101110001 1011100101 1100111001 1101110010 1011100110 1100111010 1101110100 1011101001 1100111100 1101111000 1011101010 1101000111 1110000111 1011101100 1101001011 1110001011 1011110001 1101001101 1110001101 1011110010 1101001110 1110001110 1011110100 1101010011 1110010011 1011111000 1101010101 1110010101 1100001111 1101010110 1110010110 1100010111 1101011001 1110011001 1100011011 1101011010 1110011010 1100011101 1101011100 1110011100 1100011110 1101100011 1110101001
Looking at Table 2, a few examples of input values show how the logic function recognizes a fingerprint image. Successful case: 1001110101. For this input value x1=1, x2=0, x3=0, x4=1, x5=1, x6=1, x7=0, x8=1, x9=0, x10=1; six points take the value 1 (x1, x4, x5, x6, x8, x10) and the other four points take the value 0 (x2, x3, x7, x9). In our minimization method, a minimum of six points must take the value 1 for a match. So for this input value (1001110101) the output is 1, which shows that the fingerprint matched. Unsuccessful case: 0001110100. For this input value x1=0, x2=0, x3=0, x4=1, x5=1, x6=1, x7=0, x8=1, x9=0, x10=0; only four points take the value 1 (x4, x5, x6, x8) and the other six points take the value 0 (x1, x2, x3, x7, x9, x10). Since at least six points must take the value 1 for a match, this input value (0001110100) has only four matched points, which is not enough for recognition, so the output is 0 and the fingerprint does not match.
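As an illustration of the matching rule explained above (our interpretation, not code from the paper), the decision can be written either as a simple count of matched points or as a lookup in the minimized patterns of Table 2:

# Hedged sketch: decide a match from the 10 sampled feature points.
# The ">= 6" threshold follows the worked examples above; MINIMIZED_PATTERNS
# stands for the 198 simplified input values of Table 2 (only a few shown).
MINIMIZED_PATTERNS = {
    "1111110000", "1000011111", "1001110101",   # ... remaining entries omitted
}

def match_by_threshold(bits):
    """bits: a string such as '1001110101'; Y = 1 when at least 6 points match."""
    return 1 if bits.count("1") >= 6 else 0

def match_by_table(bits):
    """Y = 1 when the input equals one of the minimized patterns."""
    return 1 if bits in MINIMIZED_PATTERNS else 0

print(match_by_threshold("1001110101"))  # 1 -> fingerprint matched
print(match_by_threshold("0001110100"))  # 0 -> fingerprint not matched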
4 Conclusions
In this study we proposed a logic minimization method for fingerprint recognition. We used 10 different fingerprint points for matching a scanned fingerprint image, so the logic function has 2^10 = 1024 possible input values. After logic minimization we obtained 198 simplified input values. By this method, reduced functions for each input
have been obtained. By using logic synthesis we do not have to examine all points of a fingerprint image; we only examine the simplified input values obtained after minimization. In our example there were 1024 input values at the beginning, and after logic minimization we obtained 198 simplified input values, about five times fewer than at the beginning. In conclusion, using logic synthesis in fingerprint recognition systems reduces the time needed for the matching process. We believe that the logic minimization method can be used as a reliable technique in fingerprint recognition.
Acknowledgements. This work is supported by the Coordinatorship of Selcuk University's Scientific Research Projects.
References 1. Reid, P.: Biometrics for Network Security. Prentice-Hall, Englewood Cliffs (2004) 2. Uludag, U., Jain, A.: Securing fingerprint template: Fuzzy vault with helper data. In: Proc. IEEE Workshop on Privacy Research in Vision (PRIV), NY, USA (June 2005) 3. Sasao, T.: Worst and Best Irredundant Sum-of–Product Expressions. IEEE Transactions on Computers 50(9), 935–947 (2001) 4. Mishchenco, A., Sasao, T.: Large-Scale SOP Minimization Using Decomposition and Functional Properties. In: IEEE CNF, Design Automation Conference, Proceedings, June 2-6, pp. 149–154 (2003) 5. Kahramanli, S., Başçiftçi, F.: Boolean Functions Simplification Algorithm of O(N) Complexity. Journal of Mathematical and Computational Applications 8(4), 271–278 (2002) 6. Kahramanlı, Ş., Güneş, S., Şahan, S., Başçiftçi, F.: A New Method Based on Cube Algebra for The Simplification of Logic Functions. The Arabian Journal For Science and Engineering 32(1B), 1–14 (2007) 7. Başçiftçi, F., Kahramanlı, Ş.: An Off-Cubes Expanding Approach to the Problem of Separate Determination of the Essential Prime Implicants of the Single-Output Boolean Functions. In: EUROCON 2007, Warsaw-Poland, September 09-12, pp. 432–438 (2007) ISBN:1-4244 0813-X 8. Brayton, R.K., Hachtel, G.D., McMullen, C.T., Singiovanni-Vincentelli, A.: Logic Algorithms for VLSI Synthesis. Kluwer Academic, Boston (1984) 9. Özkaya, N., Sağıroğlu, Ş.: Minutiate Extraction Based on Artifical Neural Networks for Fingerprint Recognition Systems. Pamukkale University Engineering Faculty. Journal of Engineering Sciences 13(1), 91–101 (2007) 10. He, Y., Tian, J., Luo, X., Zhang, T.: Image enhancement and minutiae matching in fingerprint verification. Pattern Recognition Letters 24(9-10), 1349–1360 (2003) 11. Benhammadi, F., Amirouche, M.N., Hentous, H., Bey Beghdad, K., Aissani, M.: Fingerprint matching from minutiae texture maps. Pattern Recognition 40(1), 189–197 (2007) 12. Liu, E., Liang, J., Pang, L., Xie, M., Tian, J.: Minutiae and modified Biocode fusion for fingerprint-based key generation. Journal of Network and Computer Applications (Available online December 21, 2009) (in press) (corrected proof)
Extracting Fuzzy Rules to Classify Motor Imagery Based on a Neural Network with Weighted Fuzzy Membership Functions Sang-Hong Lee1, Joon S. Lim2, and Dong-Kun Shin3,∗ 1,2 College of IT, Kyungwon University, Korea {shleedosa,jslim}@kyungwon.ac.kr 3 Division of Computer, Sahmyook University, Korea
[email protected]
Abstract. This paper presents a methodology to classify motor imagery by extracting fuzzy rules based on the neural network with weighted fuzzy membership functions (NEWFM) and twenty-four input features extracted as wavelet-based features. The paper consists of three steps to classify motor imagery. In the first step, the wavelet transform is performed to filter noise from the signals. In the second step, twenty-four input features are extracted as wavelet-based features from the signals filtered by the wavelet transform. In the final step, NEWFM classifies motor imagery using the twenty-four input features extracted in the second step. In this paper, these twenty-four input features are selected for generating the fuzzy rules to classify motor imagery. NEWFM is tested on the Graz BCI datasets that were used in the BCI Competitions of 2003. The accuracy of NEWFM is 83.51%. Keywords: Fuzzy neural networks, brain-computer interface, wavelet transform, NEWFM, feature extraction.
1 Introduction
A brain-computer interface (BCI) is a new technique intended to help disabled people communicate with a computer using their brain's mental tasks. There have been many studies using fuzzy neural networks (FNN) in BCI systems. An FNN is a combination of a neural network and fuzzy set theory, and provides interpretation capability for the hidden layers using knowledge based on fuzzy set theory [7-9]. Xu used a fuzzy support vector machine (SVM) for classification of EEG signals [1]. Ting used a probabilistic neural network (PNN) as a classifier [2]. The reviews by Müller, Lotte, and Bashashati indicate that SVM is the most common and efficient classifier [3-5].
Corresponding author.
Xu and Ting used wavelet-based features to build the initial inputs after the signals were filtered by the wavelet transform to classify motor imagery. Wavelet-based features [6] are used in many studies, such as epileptic seizure classification [10-11] and signal-related classification [12]. This paper consists of three steps to classify motor imagery. In the first step, the wavelet transform is performed to filter noise from the signals. In the second step, twenty-four input features are extracted as wavelet-based features [6] from the filtered signals. In the final step, NEWFM classifies motor imagery using the twenty-four input features produced in the second step. In this paper, these twenty-four input features are selected for generating the fuzzy rules to classify motor imagery. NEWFM is tested on the Graz BCI datasets that were used in the BCI Competitions of 2003. The accuracy of NEWFM is 83.51%.
2 Overview of Motor Imagery Classification Model
Fig. 1 shows the EEG signal classification model proposed in this paper. In the first step, the wavelet transform is performed to filter noise from the signals. In the second step, twenty-four input features are extracted as wavelet-based features [6] from the signals filtered by the wavelet transform. In the final step, NEWFM classifies motor imagery using the twenty-four input features produced in the second step.
Fig. 1. Diagram of Motor Imagery Classification Model
2.1 Experimental Data
This paper uses the Graz BCI datasets that were used in the BCI Competitions of 2003. This dataset was recorded from a normal subject (female, 25 y) during a feedback session. The subject sat in a relaxing chair with armrests. The task was to control a feedback bar by means of imagined left or right hand movements. The order of left and right cues was random. The experiment consisted of 7 runs with 40 trials each, i.e., 280 trials in total. Of the available 280 trials, 140 labeled trials were used to train the classifiers, whereas the other 140 trials were used for testing the generalization performance of the trained classifiers. All runs were conducted on the same day with breaks of several minutes in between. Each of the 280 trials is 9 s long. As Fig. 2 shows, the first 2 s were quiet; at t = 2 s an acoustic stimulus indicated the beginning of the trial,
the trigger channel (#4) went from low to high, and a cross "+" was displayed for 1 s; then at t = 3 s, an arrow (left or right) was displayed as the cue. At the same time the subject was asked to move the bar in the direction of the cue. Three bipolar EEG channels (anterior '+', posterior '-') were measured over C3, Cz, and C4. The EEG was sampled at 128 Hz and filtered between 0.5 and 30 Hz.
Fig. 2. Electrode positions (left) and timing scheme (right; trigger, beep, and feedback period with cue)
2.2 Wavelet-Based Feature Extraction
The DWT is a transformation onto basis functions that are localized both in scale and in time. The DWT decomposes the original signal into a set of coefficients that describe the frequency content at given times [16]. As Fig. 3 shows, this paper computes wavelet coefficients from level 1 to level 3 using the Haar wavelet transform. Fig. 4 shows approximation and detail coefficients obtained by the Haar wavelet transform from left motor imagery and right motor imagery.
Fig. 3. Sub-band decomposition of DWT at level 3
The extracted wavelet coefficients provide a compact representation that shows the energy distribution of the signal in time and frequency. In order to further diminish the dimensionality of the extracted feature vectors, statistics over the set of the wavelet coefficients are used [6]. The following statistical features are used to represent the time-frequency distribution of EEG signals: (1) Mean of the absolute values of the coefficients in each sub-band. (2) Median of the values of the coefficients in each sub-band.
(3) Average power of the wavelet coefficients in each sub-band. (4) Standard deviation of the coefficients in each sub-band. Wavelet-based features 1, 2, and 3 represent the frequency distribution of the signal and the feature 4 the amount of changes in frequency distribution. These feature vectors, calculated for the frequency bands d2–d3 and a3, are used for classification of EEG signals.
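A minimal sketch of this feature extraction, with PyWavelets assumed for the Haar DWT (the library choice and the random example data are ours, not the paper's): it computes the four statistics for the a3, d3 and d2 sub-bands of one channel, i.e. 12 features per channel and 24 for C3 and C4 together.

# Hedged sketch of the wavelet-based feature extraction described above.
import numpy as np
import pywt

def subband_features(signal):
    """Return 12 features: 4 statistics for each of the a3, d3 and d2 sub-bands."""
    a3, d3, d2, _d1 = pywt.wavedec(signal, 'haar', level=3)
    feats = []
    for band in (a3, d3, d2):
        feats.append(np.mean(np.abs(band)))   # (1) mean of absolute values
        feats.append(np.median(band))         # (2) median of the values
        feats.append(np.mean(band ** 2))      # (3) average power
        feats.append(np.std(band))            # (4) standard deviation
    return np.array(feats)

def trial_features(c3, c4):
    """Concatenate the C3 and C4 features into the 24-dimensional input vector."""
    return np.concatenate([subband_features(c3), subband_features(c4)])

# Example with random data standing in for one 9 s trial sampled at 128 Hz.
rng = np.random.default_rng(0)
x = trial_features(rng.standard_normal(9 * 128), rng.standard_normal(9 * 128))
print(x.shape)  # (24,)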
Fig. 4. Examples of A3, D2, and D3 in sub-band decomposition of DWT
3 Neural Network with Weighted Fuzzy Membership Function (NEWFM)
The neural network with weighted fuzzy membership functions (NEWFM) is a supervised classification neuro-fuzzy system using the bounded sum of weighted fuzzy membership functions (BSWFM) [13-15][17]. The structure of NEWFM, illustrated in Fig. 5, comprises three layers, namely the input, hyperbox, and class layers. The input layer contains n input nodes for an n-featured input pattern. The hyperbox layer consists of m hyperbox nodes; each hyperbox node Bl connected to a class node contains n BSWFMs for the n input nodes. The output layer is composed of p class nodes, and each class node is connected to one or more hyperbox nodes. The h-th input pattern can be recorded as Ih = {Ah = (a1, a2, ..., an), class}, where class is the result of classification and Ah is the n features of the input pattern.
Fig. 5. Structure of NEWFM
The connection weight between a hyperbox node Bl and a class node Ci is represented by wli, which is initially set to 0. From the first input pattern Ih, wli is set to 1 by the winner hyperbox node Bl and the class i in Ih. Ci can have one or more connections to hyperbox nodes, whereas Bl is restricted to a single connection to its corresponding class node. Bl can be learned only when Bl is the winner for an input Ih with class i and wli = 1.
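Purely as a structural sketch of the three layers described above (the BSWFM learning rules themselves are given in [13-15] and are not reproduced here), one hyperbox node per class, as in the experiment of Sect. 4, and a winner-takes-all classification could be outlined as follows; the triangular membership function standing in for a trained BSWFM is our simplifying assumption.

# Hedged structural sketch of the NEWFM layers described above.
import numpy as np

class HyperboxNode:
    """One hyperbox node B_l: n membership functions and a single class link."""
    def __init__(self, centers, widths, class_index):
        self.centers = np.asarray(centers, dtype=float)  # one center per feature
        self.widths = np.asarray(widths, dtype=float)    # one width per feature
        self.class_index = class_index                   # the connected class node

    def membership(self, pattern):
        # Placeholder triangular membership per feature; a trained node would
        # evaluate its bounded sum of weighted fuzzy membership functions.
        m = 1.0 - np.abs(np.asarray(pattern, dtype=float) - self.centers) / self.widths
        return np.clip(m, 0.0, 1.0)

    def activation(self, pattern):
        # Aggregate the n per-feature memberships into one activation value.
        return float(np.sum(self.membership(pattern)))

def classify(hyperboxes, pattern):
    """Return the class of the winner hyperbox for a 24-feature input pattern."""
    winner = max(hyperboxes, key=lambda b: b.activation(pattern))
    return winner.class_index

# Two toy hyperboxes standing in for the 'left' (0) and 'right' (1) rules.
boxes = [HyperboxNode(np.zeros(24), np.ones(24), class_index=0),
         HyperboxNode(np.ones(24), np.ones(24), class_index=1)]
print(classify(boxes, np.full(24, 0.9)))  # 1 -> closer to the 'right' rule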
4 Experimental Results
The accuracy of NEWFM is evaluated on the same data sets as were used by Xu [1]. Table 1 and Table 2 display the comparison of accuracy between Xu's method and NEWFM. In this experiment, two hyperboxes are created for classification. While a hyperbox that
contains a set of lines (BSWFM) in Fig. 6 is a rule for class 1 (the left motor imagery), the other hyperbox that contains a set of lines (BSWFM) is another rule for class 2 (the right motor imagery). The graphs in Fig. 6 are obtained from the training process of the NEWFM program and graphically show the difference between the left motor imagery and right motor imagery for each input feature. Table 1. Accuracy for four kinds of wavelet transforms used in Xu
          Sym2      Bior3.1   Db4       Coif3
Accuracy  80.00%    77.86%    80.71%    80.00%
Table 2. Comparisons of accuracy for Xu with NEWFM
            NEWFM     Db4 in Xu
Accuracies  83.57%    80.71%
(a) BSWFM of C3
Fig. 6. Trained BSWFM of the twenty-four input features in C3 and C4
(b) BSWFM of C4 Fig. 6. (continued)
5 Concluding Remarks
This paper proposes a new classification model based on the neural network with weighted fuzzy membership functions (NEWFM). NEWFM is a neural network model that improves classification accuracy by using self-adaptive weighted fuzzy membership functions. The degree of classification intensity is obtained by the bounded sum of weighted fuzzy membership functions extracted by NEWFM. In this paper, wavelet-based features are used to extract twenty-four input features, and these twenty-four input features are then used by NEWFM to generate the fuzzy rules and classify motor imagery. NEWFM is tested on the Graz BCI datasets that were used in the BCI Competitions of 2003. The accuracy of NEWFM is 83.51%, outperforming Xu's classifier by 2.8% on the test sets.
References 1. Xu, Q., Zhou, H., Wang, Y., Huang, J.: Fuzzy support vector machine for classification of EEG signals using wavelet-based features. Medical Engineering & Physics 31, 858–865 (2009) 2. Ting, W., Guo-zheng, Y., Bang-hua, Y., Hong, S.: EEG feature extraction based on wavelet packet decomposition for brain computer interface. Measurement 41, 618–625 (2008) 3. Müller, K.R., Krauledat, M., Dornhege, G., Curio, G., Blankertz, B.: Machine learning techniques for brain–computer interfaces. Biomed. Eng. 49, 11–22 (2004) 4. Lotte, F., Congedo, M., Lećuyer, A., Lamarche, F., Arnaldi, B.: A review of classification algorithms for EEG-based brain–computer interfaces. J. Neural Eng. 4, 1–13 (2007) 5. Bashashati, A., Fatourechi, M., Ward, R.K., Birch, G.E.: A survey of signal processing algorithms in brain–computer interfaces based on electrical brain signals. J. Neural Eng. 4, 32–57 (2007) 6. Tzanetakis, G., Essl, G., Cook, P.: Audio analysis using the discrete wavelet transform. In: D’ Attellis, C.E., Kluev, V.V., Mastorakis, N. (eds.) Mathematics and Simlulation with Biological Economical and Musicoacoustical Applications, pp. 318–323. WSES Press, New York (2001) 7. Wang, J.S., Lee, C.S.G.: Self-Adaptive Neuro-Fuzzy Inference System for Classification Applications. IEEE Trans., Fuzzy Systems 10, 790–802 (2002) 8. Simpson, P.: Fuzzy min-max neural networks-Part 1: Classification. IEEE Trans., Neural Networks 3, 776–786 (1992) 9. Carpenter, G.A., Grossberg, S., Reynolds, J.: ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network. Neural Networks 4, 565–588 (1991) 10. Kemal Kiymik, M., Akin, M., Subasi, A.: Automatic recognition of alertness level by using wavelet transform and artificial neural network. Journal of Neuroscience Methods 139, 231–240 (2004) 11. Subasi, A.: EEG signal classification using wavelet feature extraction and a mixture of expert model. Expert Systems with Applications 32, 1084–1093 (2007) 12. Kandaswamy, A., Sathish Kumar, C., Ramanathan, R.P., Jayaraman, S., Malmurugan, N.: Neural classification of lung sounds using wavelet coefficients. Computers in Biology and Medicine 34, 523–537 (2004) 13. Lim, J.S., Wang, D., Kim, Y.-S., Gupta, S.: A neuro-fuzzy approach for diagnosis of antibody deficiency syndrome. Neurocomputing 69, 969–974 (2006) 14. Lim, J.S.: Finding Fuzzy Rules by Neural Network with Weighted Fuzzy Membership Function. International Journal of Fuzzy Logic and Intelligent Systems 4(2), 211–216 (2004) 15. Lim, J.S.: Finding Features for Real-Time Premature Ventricular Contraction Detection Using a Fuzzy Neural Network System. IEEE Transactions on Neural Networks 20, 522– 527 (2009) 16. Mallat, S.: Zero crossings of a wavelet transform. IEEE Trans. Inf. Theory 37, 1019–1033 (1991) 17. Shin, D.-K., Lee, S.-H., Lim, J.S.: Extracting Fuzzy Rules for Detecting Ventricular Arrhythmias Based on NEWFM. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (eds.) PAKDD 2009. LNCS, vol. 5476, pp. 723–730. Springer, Heidelberg (2009)
Distributed Data-Mining in the LISp-Miner System Using Techila Grid
Milan Šimůnek (1) and Teppo Tammisto (2)
(1) University of Economics, Prague, Czech Republic
(2) Techila Technologies, Tampere, Finland
[email protected], [email protected]
Abstract. Distributed data-mining opens new possibilities for answering the ever more complex analytical questions that the owners of databases want to ask. Even highly optimized data-mining algorithms backed by many decades of research have their limits, and the growing complexity of tasks exceeds the computing power of a single PC. It is becoming difficult to obtain results in a time acceptable for interactive work. Moreover, new research goals require many tasks to be run iteratively in order to automate the whole KDD process. There is therefore a need to speed up the solving of every single task as much as possible. This article describes a newly implemented algorithm that divides a whole task into sub-tasks solved in parallel on grid nodes. The first results and further possible improvements are presented. Keywords: Data-mining, grid, distributed computing, algorithm implementation.
1 Introduction
Current trends in the area of Knowledge Discovery in Databases (KDD) lead to ever more complex data-mining tasks that need to be solved. These trends are followed in the research and development of the academic system LISp-Miner [1]. The data-mining algorithms used are highly optimized and based on research and experience dating back to the 1960s. Even these optimizations have their limits, however, and solution times on a single PC cannot be improved significantly forever. One possible solution is to divide a task among many PCs/processors organized into some kind of grid. This paper describes changes to the original GUHA algorithm that enable distributed generation and verification of patterns. There are several possible ways to divide a whole task into sub-tasks that are then computed in parallel on grid nodes; the solution proposed here is the most promising one.
The paper* is organized as follows. The LISp-Miner system is introduced in the next section and some of its distinctive features are mentioned. A description of the distributed environment used is given in the third section. The fourth, main section of this paper explains the proposed solution for dividing a complex task into suitable sub-tasks and discusses the main challenges faced; the first experiences with distributed solving of data-mining tasks are also presented there. The last section proposes directions for further development.
* The work described here has been supported by project 201/08/0802 of the Czech Science Foundation.
2 LISp-Miner System
The LISp-Miner system (developed since 1996 at the University of Economics, Prague) is an academic system used mainly for data-mining research and teaching. It is used at several universities in the Czech Republic, Finland, France and the USA and for real data analyses (see e.g. [2]). It is freely available at http://lispminer.vse.cz. The system is based on many decades of related research on the GUHA method, whose theoretical foundations have been published in books and papers since the 1960s (see e.g. [3], [4], [5], [6]). LISp-Miner is one of several implementations of the GUHA method and now consists of seven data-mining procedures plus several modules supporting the Business understanding and Data preprocessing phases of the data-mining process (see e.g. [7]). We would like to stress that these procedures do not mine for simple association rules in the sense of shopping baskets (see e.g. [8]) but for more complex types of patterns – e.g. 4ft-association rules, action rules, set-difference rules, K×L frequency dependencies – with a really rich syntax of so-called derived Boolean attributes (briefly described below; for details see e.g. [9], [10]). The optimized algorithms used have so far made it possible to mine for these complex patterns in a time reasonable for interactive work. The above-mentioned rich syntax and growing user needs lead to a growing complexity of tasks and therefore to growing solution times. On the other hand, the rich syntax can be successfully exploited by the proposed distributed solution (see Sect. 4).
For simplicity, we discuss only the syntax of 4ft-association rules, which are mined by the 4ft-Miner procedure. A 4ft-association rule has the form ϕ ≈ ψ / χ. The pattern means that the derived Boolean attributes ϕ and ψ are associated in the way corresponding to the 4ft-quantifier ≈ if the condition χ (a derived Boolean attribute too) is satisfied. A brief description of Boolean attributes follows; for details see [9]. Basic Boolean attributes are based on user-defined many-categorial attributes. A basic Boolean attribute has the syntax Attribute(Category) and is assigned the logical value true for a row of the analyzed data matrix if the value in the column specified by Attribute belongs to Category, and false otherwise. Examples are:
• District(Prague) … true if the district is Prague, false otherwise;
• Quality(good) … true if the quality of the loan is good, false otherwise.
Derived Boolean attributes are automatically generated from basic Boolean attributes by the system using (possibly many) logical conjunctions, disjunctions and negations. Examples are:
• District(Prague) ∨ District(Plzen) – true if the district is Prague OR Plzen;
• Duration(1) ∨ Duration(2) ∨ Duration(3) – true if the duration of the loan is 1 to 3 years;
• District(Prague) ∧ Duration(1) – true if the district is Prague AND the duration is 1 year;
• District(Plzen) ∨ Duration(3) – true if the district is Plzen OR the duration is 3 years;
• ¬District(Prague) – true if the district is NOT Prague.
Even more complex derived Boolean attributes can be generated, for example:
• (District(Prague, Plzen) ∧ Duration(<=3)) ∧ (¬Age<20;30) ∨ Quality(good)) – true if the district is Prague or Plzen and the duration is 3 years or less, and simultaneously the age is not between 20 and 30 or the quality of the loan is good.
The user of the LISp-Miner system has many parameters available to define precisely what kinds of patterns he or she wants to be generated. From the point of view of gridification, the most important ones are those which restrict the minimal and maximal length of ϕ, ψ and χ (in terms of the basic Boolean attributes used). The user can restrict e.g. the maximal length of ψ to 3, so that at most three basic Boolean attributes may be combined into the derived Boolean attribute representing ψ. Analogously, the user can force the minimal length of ψ to be at least 2, so that ψ may consist of two or three basic Boolean attributes only. Similarly, the user can influence the number of categories in the disjunction for one attribute (the so-called length of the coefficient). These pairs of minimal–maximal lengths – one for each of ϕ, ψ and χ and one for the coefficient of every attribute used – offer a good way to divide complex tasks into sub-tasks suitable for distributed solving on grid nodes (see later).
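To make the Boolean attribute syntax above more concrete, the following small Python sketch evaluates a basic and a derived Boolean attribute on one row of a data matrix. The row representation, the helper function and the sample values are hypothetical illustrations only, not part of the LISp-Miner implementation.

```python
# Hypothetical illustration of basic and derived Boolean attributes.
# A data-matrix row is modelled as a dict: column name -> category value.

def basic(attribute, *categories):
    """Basic Boolean attribute: true if the row's value for `attribute`
    is one of the given categories (a coefficient of length len(categories))."""
    return lambda row: row[attribute] in categories

row = {"District": "Prague", "Duration": 2, "Age": 35, "Quality": "good"}

district_prague = basic("District", "Prague")
duration_1_3 = basic("Duration", 1, 2, 3)   # Duration(1) v Duration(2) v Duration(3)

def derived(row):
    # (District(Prague) AND (Duration(1) v Duration(2) v Duration(3))) OR NOT District(Plzen)
    return (district_prague(row) and duration_1_3(row)) or not basic("District", "Plzen")(row)

print(derived(row))   # True for the sample row
```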
3 Techila Grid
Techila Grid has been developed since 2006; it is already deployed at the majority of Finnish universities and several real-world problems have been solved with it (see e.g. [11]). It offers computer-grid features while utilizing the power of ordinary PC clients. This concept allows a grid to be set up nearly everywhere with near-to-zero hardware costs and computing power to be added as needed simply by registering more PC clients. The PC grid is able to provide low-latency results for gridified problems even in large-scale computation networks. Intelligent nodes allow an optimized distribution of computation jobs with a rather small amount of network traffic. They independently monitor their own state and make decisions based on it. The nodes are connected to the server all the time, allowing them to report changes in their status immediately and also allowing the server to give them orders with no delay. All communication is secured and data are compressed.
We chose this grid as the platform for the implementation of distributed computing in the LISp-Miner system due to our long-term cooperation with the Technical University of Tampere, Finland. There are currently more than 400 PCs in laboratories registered in the grid. This was a great advantage while implementing grid support in LISp-Miner because we had the possibility to test right from the start. An important decision factor (for the University of Economics, Prague, as an academic institution) was the possibility to install the grid server as a virtual server (e.g. in VMware) and to register existing PCs (in offices, laboratories and dormitories) as nodes. Any grid-related calculation is immediately suspended if user activity is noticed, so ordinary users are not restricted by registering their PCs in the grid. No new hardware had to be purchased, so the initial costs were significantly reduced. The total grid computing power can easily be increased by registering more PCs.
4 Modifying the Algorithm for Distributed Solving of Tasks
The main problem that had to be addressed while incorporating grid features into the LISp-Miner system was how to divide a data-mining task into sub-tasks that can be solved (in parallel) on particular grid nodes. The goal was to find a general solution that could be used in all the GUHA procedures implemented within the LISp-Miner system, although they mine for different patterns and use different data-mining algorithms. Caution had to be paid to the fact that the system is routinely used in teaching and in several commercial applications, and the algorithms used are thoroughly proven and well debugged after many years of deployment. There was an imminent danger that a significant upgrade could introduce hidden bugs and degrade many years of work. The ideal way to incorporate the grid was therefore to add a new layer above these data-mining core algorithms and to make as few changes to the core as possible.
4.1 Previous Non-grid Implementation of the Data-Mining Algorithm
The implementation of the data-mining algorithms (one for every GUHA procedure) is based on the original GUHA procedure DB-ASSOC [3]. Its main feature is to walk through all the possibly interesting patterns and to verify each of them in the analyzed data. Only successfully verified patterns are included in the results. The total number of possibly interesting patterns is really huge, so the algorithm was developed with effectiveness in mind and was equipped with several optimizations – namely, to identify parts of the state-space that cannot contain verifiable patterns and to skip such parts immediately. The result is a sequential depth-first walk-through-tree algorithm with massive skipping of branches based on properties of the analyzed data. This allows very fast processing, but problems emerge when trying to divide the processing onto grid nodes.
The first problem is the sequential nature of the algorithm, which consists of a single function PrepareNextVariant returning the pattern immediately following the current one. The massive skips of whole branches are the second problem. They lead to very complex changes in patterns during a single call to PrepareNextVariant that are highly influenced by the analyzed data itself. A tree branch could be skipped for one set of analyzed data but has to be carefully walked through for another. This feature greatly complicates the estimation of the total number of patterns that have to be verified and thus a suitable division of the whole task into sub-tasks.
4.2 Proposed Solution
The proposed solution takes the user-defined task parameters and checks the pairs of minimal–maximal length restrictions defined by the user for each derived Boolean attribute (separately for ϕ, ψ and χ). For every pair where minimal length < maximal length, it divides the task into (maximal length − minimal length + 1) variants, where each variant has its task parameters changed so that maximal length = minimal length = n for this particular pair, with n going from the minimal to the maximal length defined in the original task. The total number of sub-tasks is then the product of the numbers of
variants over all min–max pairs. Let us suppose we have a task where the minimal length for ϕ, ψ and χ is 1 in all three cases and the maximal length is 3 in all three cases. We can then divide this task into 27 sub-tasks (3 × 3 × 3). Solving all 27 sub-tasks gives the same results as solving the original task. These sub-tasks are computationally independent and can be solved in parallel on different grid nodes.
Things get a little more complicated once the min–max pairs for the length of coefficients and the hierarchical structure of derived attributes are included, but the overall concept remains the same, and we have already successfully incorporated the proposed solution into two procedures – see the next subsection. The most important feature of this solution is that no changes to the original optimized task-solving algorithm are needed, and the solution is applicable to all the GUHA procedures implemented so far in the LISp-Miner system. The most significant drawback is the uneven complexity of the resulting sub-tasks, although it is not necessarily the case that sub-task 27 (lengths 3-3-3) is the most difficult to solve (due to the optimizations used in the original algorithm). We would like to address this problem in our future work. Some duplicities in the found patterns among different sub-tasks are another drawback of this solution. It is not always possible to create non-overlapping sub-tasks, so duplicities have to be dropped while aggregating the results from sub-tasks.
The proposed algorithm was implemented initially in the 4ft-Miner procedure (for 4ft-association rules) and then in the Action 4ft-Miner procedure (for 4ft-action rules), where its potential is even higher due to the more complex pattern syntax. A real data analysis was undertaken using distributed grid verification and the results were compared to the solution times of the same tasks on a single computer. The first experiments were done using the grid installed at the Technical University of Tampere with just 30 clients dedicated to this experiment. Task parameters and analyzed data were sent each time from Prague (with the cache always cleared). Another, rather experimental, grid (with up to 10 grid nodes available) was established at the University of Economics, Prague, for development purposes and for a better understanding of grid behavior.
A significant improvement in solution times was observed right from the first tasks. The grid overhead – dividing a task into sub-tasks, uploading all the necessary data to the grid and downloading the results – is reasonably small, and its relative importance decreases with the growing complexity of the task, see Table 1.

Table 1. Local and grid solution times on TUT and UEP grids

Nr.  Task              Local Time   Grid Time    Grid  Active nodes
1    A (4ft-Miner)     0h 14m 14s   0h 9m 48s    TUT   30
2    B (4ft-Miner)     3h 48m       0h 21m 38s   TUT   30
3    B (4ft-Miner)     3h 48m       0h 53m 01s   UEP   4
4    B (4ft-Miner)     3h 48m       1h 5m 28s    UEP   5
5    C (Act4ft-Miner)  30h 23m      1h 06m 35s   TUT   24
6    C (Act4ft-Miner)  30h 23m      5h 44m 57s   UEP   5
We observed that a task taking more than about 1 minute to solve is generally better run on the grid than locally (when processing analyzed data of up to 10 MB). We also realized that the uneven complexity of sub-tasks, together with big differences in the hardware parameters of grid nodes, can lead to unexpected results (see task B in rows 3 and 4 of Table 1, where the solution time was better with fewer active nodes – when the slowest one was shut down). There is therefore good potential for improvements.
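As an illustration of the sub-task division described in Sect. 4.2, the following sketch enumerates one parameter variant for every combination of fixed lengths of ϕ, ψ and χ. It is a simplified sketch under the assumption of independent min–max pairs; coefficients and the hierarchical structure of derived attributes are ignored, and the function and parameter names are hypothetical, not those used in LISp-Miner.

```python
from itertools import product

def enumerate_subtasks(length_ranges):
    """length_ranges: dict mapping a rule part ('antecedent', 'succedent',
    'condition') to its (min_length, max_length) pair from the original task.
    Yields one sub-task parameter set per combination of fixed lengths."""
    parts = sorted(length_ranges)
    choices = [range(lo, hi + 1) for lo, hi in (length_ranges[p] for p in parts)]
    for combo in product(*choices):
        # In each sub-task the pair is collapsed: min_length = max_length = n.
        yield {part: (n, n) for part, n in zip(parts, combo)}

original_task = {"antecedent": (1, 3), "succedent": (1, 3), "condition": (1, 3)}
subtasks = list(enumerate_subtasks(original_task))
print(len(subtasks))   # 27 independent sub-tasks, as in the 3 x 3 x 3 example
```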
5 Further Steps
The most important problem that has to be solved is to improve the strategy for dividing a task into sub-tasks and to obtain sub-tasks that are better balanced with respect to their solution times. The efficiency of the distributed solving of tasks on the grid could be further improved by identifying the most complex sub-tasks in advance and providing this information to the grid so that they are assigned with priority to the most powerful nodes. We are already working on implementing a "snapshot feature" that will allow a sub-task to be re-allocated to another node even in the middle of a computation (if a node is shut down, or if a more powerful one is idle and could deliver the results faster).
References
1. Šimůnek, M.: Academic KDD Project LISp-Miner. In: Abraham, A., Franke, K., Köppen, M. (eds.) Advances in Soft Computing – Intelligent Systems Design and Applications, pp. 263–272. Springer, Heidelberg (2003)
2. Rauch, J., Tomečková, M.: System of Analytical Questions and Reports on Mining in Health Data – A Case Study. In: MCCSIS 2007, pp. 176–181. IADIS, Lisbon (2007)
3. Hájek, P., Havránek, T.: Mechanising Hypothesis Formation – Mathematical Foundations for a General Theory, p. 396. Springer, Heidelberg (1978)
4. Rauch, J.: Some Remarks on Computer Realisations of GUHA Procedures. International Journal of Man-Machine Studies 10, 23–28 (1978)
5. Rauch, J.: Main Problems and Further Possibilities of the Computer Realizations of GUHA Procedures. International Journal of Man-Machine Studies 15, 283–287 (1981)
6. Hájek, P., Holeňa, M., Rauch, J.: The GUHA method and its meaning for data mining. Journal of Computer and System Sciences 76, 34–48 (2010)
7. Rauch, J., Šimůnek, M.: Dealing with Background Knowledge in the SEWEBAR Project. In: Berendt, B., et al. (eds.) Knowledge Discovery Enhanced with Semantic and Social Information, pp. 89–106. Springer, Berlin (2009)
8. Agrawal, R., et al.: Fast Discovery of Association Rules. In: Fayyad, U.M., et al. (eds.) Advances in Knowledge Discovery and Data Mining. AAAI Press, Menlo Park (1996)
9. Rauch, J., Šimůnek, M.: Alternative Approach to Mining Association Rules. In: Lin, T.Y., Ohsuga, S., Liau, C.J., Tsumoto, S. (eds.) Data Mining: Foundations, Methods, and Applications. Springer, Heidelberg (2005)
10. Rauch, J., Šimůnek, M.: Action Rules and the GUHA Method: Preliminary Considerations and Results. In: Rauch, J., Raś, Z.W., Berka, P., Elomaa, T. (eds.) Foundations of Intelligent Systems. LNCS, vol. 5722, pp. 76–87. Springer, Heidelberg (2009)
11. Kanniainen, J., et al.: Use of Distributed Computing in Derivative Pricing. In: 5th Int. Conference on Computational Management Science, Imperial College London (2008)
Non-negative Matrix Factorization on GPU
Jan Platoš, Petr Gajdoš, Pavel Krömer, and Václav Snášel
Department of Computer Science, FEI, VSB – Technical University of Ostrava, 17. listopadu 15, 708 33 Ostrava-Poruba, Czech Republic
{jan.platos,petr.gajdos,pavel.kromer,vaclav.snasel}@vsb.cz
Abstract. Today, the need to process large data collections increases. Such data can have very large dimensions and hidden relationships. Analyzing this type of data directly leads to many errors and much noise; therefore, dimension reduction techniques are applied. Many reduction techniques have been developed, e.g. SVD, SDD, PCA, ICA and NMF. Non-negative matrix factorization (NMF) has the main advantage of processing non-negative values, which are easily interpretable as images, but applications can be found in other areas as well. Both data analysis and dimension reduction methods need a lot of computation power. Nowadays, many algorithms are being rewritten to utilize the GPU, because the GPU brings a massively parallel architecture and a very good ratio between performance and price. This paper introduces the computation of NMF on GPU using the CUDA technology. Keywords: NMF, parallelism, CUDA, GPU computing.
1 Introduction
Data analysis and data mining are techniques that are used in many areas. Data for analysis are usually in a raw format, and it is necessary to apply data preprocessing, data cleaning and other techniques. Another problem is that the data may have very large dimensions and hidden relationships between columns. Both problems are addressed by dimension reduction or feature extraction techniques. At present, there are several books that describe methods for dimension reduction. David Skillicorn in the book [25] described several methods based on matrix decomposition, such as Singular Value Decomposition (SVD) [4, 26, 27, 7, 13, 16], Semi-Discrete Decomposition (SDD), Principal Component Analysis (PCA) [9, 12], etc. Lars Eldén in the book [8] described methods for dimension reduction and their application to various problems such as text mining, classification of handwritten digits, etc. Methods for dimensionality reduction have already been applied successfully to various problems many times. The most significant techniques are Principal Component Analysis and Singular Value Decomposition; both are briefly described in the following paragraphs.
Principal Component Analysis (PCA) is the main linear technique for dimension reduction. This method performs a linear mapping of the given data to a lower-dimensional space in such a way that the variance of the data is maximized in the low-dimensional representation. We refer to [9, 12] for more details. In practice, the correlation matrix of the data is constructed and the eigenvectors of this matrix are computed. The eigenvectors corresponding to the largest eigenvalues (the principal components) can then be used to reconstruct a large fraction of the variance of the original data. The original space is reduced (with data loss, but hopefully retaining the most important variance) to the space spanned by the set of selected eigenvectors. This method is frequently used because it is a simple, non-parametric method for extracting relevant information from confusing data sets. I. K. Fodor [9] described several methods based on component analysis, such as Principal Component Analysis (PCA), Independent Component Analysis (ICA), etc. Kurucz et al. [12] extended the applicability of PCA to very large-scale social networks. PCA was also used for the mapping of geochemical data: a testing data matrix was prepared from the chemical and physical analyses of coals altered by thermal and oxidation effects [22]. For this problem, PCA based on Singular Value Decomposition was used.
The Singular Value Decomposition (SVD) [4] is an important technique for the factorization of a rectangular real or complex matrix. SVD is used for computing the pseudoinverse of a matrix, solving homogeneous linear equations, solving the total least squares minimization problem and finding approximation matrices. See [26, 7, 13, 16] to read more about SVD. SVD has been used as a very good solution for clustering search results [26]. It was also successfully used to solve the problem of nearest-neighbor search in high-dimensional spaces, which is critical for many applications, e.g. content-based retrieval from multimedia databases, similarity search of patterns in data mining, and nearest-neighbor classification [7]. V. Snášel et al. described the usage of dimensionality reduction methods to solve the problem of dimensional binary datasets [27]. Lahabar et al. [13] presented an implementation of Singular Value Decomposition (SVD) of a dense matrix on GPU using the CUDA programming model, which significantly accelerates the computation of the SVD of large matrices.
Singular Value Decomposition (SVD) breaks an n × m matrix A into three matrices U, Σ and V:

A = U Σ V^T    (1)

U is an (n × n) orthogonal matrix whose column vectors are called the left singular vectors of A, V is an (m × m) orthogonal matrix whose column vectors are termed the right singular vectors of A, and Σ is an (m × m) diagonal matrix containing the singular values of A ordered decreasingly. The columns of U form an orthogonal basis for the column space of A. Singular value decomposition is well known because of its application in information retrieval as LSI. SVD is especially suitable in its variant for sparse matrices [1].
Because the singular values usually fall quickly, we can take only the k greatest singular values with the corresponding singular vector coordinates and create a k-reduced singular decomposition of A. We can approximate the decomposition as

A_k = U_k Σ_k V_k^T    (2)

and we call this decomposition the k-reduced singular value decomposition. The rank-k SVD is the best rank-k approximation of the original matrix A. This means that any other decomposition will increase the approximation error, calculated as a sum of squares (the Frobenius norm) of the error matrix B = A − A_k.
This paper is focused on Non-negative Matrix Factorization, which was designed for processing non-negative numbers. NMF has been applied in many areas, e.g. image processing [14], where NMF was used for the recognition of parts of images, clustering of emails from the Enron corpus1 [5], and clustering of Wikipedia pages according to citations [17]. NMF may also be used in intrusion detection systems [21, 20, 28].
In Section 2, Non-negative Matrix Factorization is described. Section 3 contains a short introduction to GPU computing and Section 4 describes the implementation of NMF on GPU. Section 5 contains experimental results and a comparison of the CPU and GPU approaches on two collections. Finally, Section 6 contains a short summary of the paper and future work.
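As a small numerical illustration of the rank-k approximation in Eq. (2) (unrelated to the GPU implementation discussed later), the following NumPy sketch keeps only the k largest singular values and measures the Frobenius norm of the error matrix B = A − A_k; the matrix sizes are arbitrary:

```python
import numpy as np

def rank_k_svd(A, k):
    """Return the rank-k approximation A_k = U_k S_k V_k^T of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

A = np.random.rand(100, 41)
A2 = rank_k_svd(A, 2)
print(np.linalg.norm(A - A2, "fro"))   # Frobenius norm of the error matrix
```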
2 Non-negative Matrix Factorization
Non-negative matrix factorization [14, 15] is a class of decompositions whose members are not necessarily closely related to each other [8, 25]. It was designed for data sets in which the attribute values are not negative. A side effect of this feature is that the mixing of components in the decompositions can only be additive. A set of data S can be expressed as an m × n matrix A, where m is the number of attributes and n is the number of records in S. Each column Aj of A is an encoding of a particular record in S, and every entry aij of the vector Aj is the value of the i-th term with regard to the semantics of Aj, where i ranges across attributes. The NMF problem is defined as the search for an approximation of the matrix A with respect to some metric (e.g., a norm) by factoring A into the product W × H of two reduced matrices W and H. Each column of W is a basis vector which contains an encoding of a semantic space or concept from A, and each column of H contains an encoding of the linear combination of the basis vectors that approximates the corresponding column of A. The dimensions of W and H are m × k and k × n, where k is the reduced rank. Usually, k is much smaller than n. Finding an appropriate value of k depends on the application and is also influenced by the nature of the collection itself. Common approaches to NMF obtain an approximation of A by computing a (W, H) pair that minimizes the Frobenius norm of the difference A − WH. The matrices W and H are not unique. Usually H is initialized to zero and W to a randomly generated matrix where each Wij > 0, and these initial values are improved with the iterations of the algorithm.
1 http://www.cs.cmu.edu/~enron, 2010-02-14.
2.1 Computation Algorithms
NMF is computed in an iterative process. Minimization rules are applied in every iteration to minimize the difference between W × H and the original matrix A. The first approach to solving this problem was based on the multiplicative rules defined by Lee and Seung [15] and can be described by the following steps:
1. Initialize the matrices W and H with random numbers.
2. In each iteration compute
   (a) H = H ∗ (W^T A) / (W^T W H + ε)
   (b) W = W ∗ (A H^T) / (W H H^T + ε)
where the constant ε is set to 10^-9, W^T and H^T represent the transpositions of the matrices W and H, the symbol ∗ represents per-element matrix multiplication, and the division (the fraction) is per-element as well.
The second group of algorithms is based on the gradient descent method. The GD-CLS algorithm [24] is a typical representative of this group. We refer to [6] for more details of this algorithm and a description of others. There is a number of optimization tasks related to NMF [11]. Unconstrained NMF, which corresponds to the original NMF introduced above, is a non-convex problem with known algorithms that can compute its global optimum. Unfortunately, they are not able to deal with real-world-sized data, so algorithms finding a local optimum have to be employed. Although the NMF codes are rather sparse, in sparsity-constrained NMF the sparsity is controlled directly. The sparsity constraints control to what extent the basis functions are sparse and how much each basis function contributes to the reconstruction of only a subset of the original data matrix A.
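A minimal CPU-side sketch of the multiplicative update scheme above, written in NumPy notation; the random initialization of both W and H, the fixed number of iterations and the data sizes (here matching the smaller collection used later in the paper) are illustrative simplifications of the description, not the authors' exact code:

```python
import numpy as np

def nmf_multiplicative(A, k, iterations=100, eps=1e-9):
    """Factor A (m x n, non-negative) into W (m x k) and H (k x n)
    using the Lee-Seung multiplicative update rules."""
    m, n = A.shape
    W = np.random.rand(m, k)
    H = np.random.rand(k, n)
    for _ in range(iterations):
        H *= (W.T @ A) / (W.T @ W @ H + eps)   # per-element multiply and divide
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

A = np.random.rand(5092, 41)                   # non-negative sample data
W, H = nmf_multiplicative(A, k=12, iterations=10)
print(np.linalg.norm(A - W @ H, "fro"))        # Frobenius norm of the residual
```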
3 GPU Utilization
Modern graphics hardware plays an important role in the area of parallel computing. Graphics cards have been used to accelerate gaming and 3D graphics applications, but now they are also used to accelerate computations in relatively distant topics, e.g. remote sensing, environmental monitoring, business forecasting, medical applications or physical simulations. The architecture of GPUs (Graphics Processing Units) is suitable for vector and matrix algebra operations, which leads to the wide usage of GPUs in the areas of information retrieval, data mining, image processing, data compression, etc. Nowadays, one does not need to be an expert in graphics hardware because of the existence of various APIs (Application Programming Interfaces) that help programmers to implement their software faster. Nevertheless, it is still necessary to keep to the basic rules of GPU programming to write more effective code. There are two graphics hardware leaders, each preferring its own solution: ATI and nVIDIA. The former developed the technology called ATI Stream [2] and the latter presented nVIDIA CUDA [18].
A comparison of these two APIs is not a goal of this article; we refer to [18] for more information. Our implementation was realized with the nVIDIA CUDA technology because of hardware availability and our experience with this technology. CUDA is an acronym for Compute Unified Device Architecture. It is a general-purpose parallel computing architecture that leverages the parallel compute engine in nVIDIA graphics processing units. A GeForce GTX 280 was used in our experiments. The GPU is composed of 30 multiprocessors. Each SIMD (Single Instruction Multiple Data) multiprocessor drives eight arithmetic logic units (ALUs) which process the data; thus each ALU of a multiprocessor executes the same operations on different data lying in the registers. In contrast to standard CPUs, which can reschedule operations (out-of-order execution), the selected GPU is an in-order architecture. This drawback is overcome by using multiple threads, as described by Wellein et al. in [10]. Current general-purpose CPUs with clock rates of 3 GHz outrun a single ALU of the multiprocessors with its rather slow 1.3 GHz; the huge number of 240 parallel processors on a single chip compensates for this drawback. The processing is optimized for floating-point calculations, and an FMA (Fused Multiply Add) is four-step pipelined, so its latency is four clock cycles. Additional operations have different specifications and therefore require different numbers of clock cycles to complete. GPU producers support double precision in later GPUs because of their wide utilization in scientific computations; CUDA Capability 1.3 or above is necessary for double precision support.
Andrecut [3] in 2009 described CUDA-based computation of two variants of Principal Component Analysis (PCA). The usage of parallel computing on the GPU improved the efficiency of the algorithm more than 12 times in comparison with the CPU. Preis et al. [23] applied the GPU to methods of fluctuation analysis, which include the determination of the scaling behavior of a particular stochastic process and the equilibrium autocorrelation function in financial markets. The speedup was more than 80 times compared with the previous version running on the CPU. Patnaik et al. [19] used the GPU in the area of temporal data mining in neuroscience. They analyzed spike train data with the aid of a novel frequent episode discovery algorithm; a speedup of more than 430 times is described in the mentioned paper. We refer to [10] for more nVidia CUDA examples.
4 Computation of NMF on GPU
The utilization of the GPU for NMF is not so difficult, especially in the case of the basic version of NMF with multiplicative rules, because these rules are defined as a series of matrix multiplications and per-element divisions. Matrix multiplication represents the task for which GPUs have been optimized since their creation. Per-element operations are also often used in graphics rendering, because they are used for post-processing effects such as motion blur, bloom, heat shimmering, etc. The realization of the matrix multiplication was simplified because nVIDIA CUDA contains a specially optimized implementation of the BLAS library.
This library contains functions for vector–vector, vector–matrix and matrix–matrix operations. Our implementation uses only the Sgemm operation for matrix multiplication. This operation can not only multiply matrices but can also transpose the first, the second or both matrices. The other necessary functions, especially the per-element division and the Frobenius norm, were implemented manually in C for CUDA. Because CUDA is an extension of the standard C programming language, reimplementing these methods is simple. For comparison purposes, a CPU-based NMF computation was implemented as well; this implementation was written in C++ and optimized for maximal performance.
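The actual implementation described here uses the cuBLAS Sgemm routine for the matrix products and hand-written C for CUDA kernels for the per-element division and the Frobenius norm. Purely as a hedged analogue of the same mapping, the update loop can also be expressed with the CuPy library (assumed to be available on a CUDA-capable machine), whose matrix products are likewise backed by cuBLAS; this is an illustrative sketch, not the authors' code:

```python
import cupy as cp   # assumes a CUDA-capable GPU with the CuPy package installed

def nmf_gpu(A, k, iterations=100, eps=1e-9):
    """Multiplicative-update NMF with all matrices kept in GPU memory."""
    m, n = A.shape
    W = cp.random.rand(m, k, dtype=cp.float32)
    H = cp.random.rand(k, n, dtype=cp.float32)
    for _ in range(iterations):
        H *= (W.T @ A) / (W.T @ W @ H + eps)   # GEMMs plus per-element kernels
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

A = cp.random.rand(6892, 41, dtype=cp.float32)   # size of the larger collection
W, H = nmf_gpu(A, k=12, iterations=10)
print(float(cp.linalg.norm(A - W @ H)))          # Frobenius norm of the residual
```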
5 Experimental Results
Two collections were used in our experiments. Both contain samples of intrusion detection data [29] composed of five classes of attacks; learning and testing sets of events are available for every class, and each event is described by 41 features. The fourth class, representing user-to-root attacks, was used. The teaching collection contains 5092 events (records) and the testing collection contains 6892 records. Both were used in our experiments. The hardware configuration was: CPU Intel Core2Quad, 8 GB RAM, nVIDIA GeForce GTX 280 with 1024 MB GRAM and 240 CUDA cores.
5.1 Results on Smaller Collection
The first experiment was run on the smaller collection containing 5092 × 41 values. The set k = {12, 17, 25, 33, 41} was used and the numbers of iterations were 10, 50, 100, 500 and 1000. The selection of the k values is the same as in our previous work on designing an intrusion detection system using NMF [20, 21, 28]. The results are shown in Table 1 and the final speedup achieved by the GPU in comparison with the CPU is shown in Figure 1. The missing values, denoted by the "-" symbol, mean that these results were not computed due to their enormous running time.

Table 1. Computation time [ms], small collection (CPU/GPU for each number of basis vectors k)

Iterations   k=12 CPU/GPU    k=17 CPU/GPU    k=25 CPU/GPU    k=33 CPU/GPU    k=41 CPU/GPU
10           1146/399        1787/399        2993/419        4312/405        5999/399
50           5885/515        9313/519        16464/545       24285/567       34742/580
100          12308/628       19567/675       33958/730       53870/755       78742/775
500          81907/1715      162713/1905     199894/2080     289228/2311     481204/2317
1000         201626/3011     -/3416          -/3803          -/4268          -/4298
Fig. 1. Speedup on GPU, small collection (speed-up factor vs. number of iterations for k = 12, 17, 25, 33, 41)
As may be seen from the results, computation on the GPU is much faster than on the CPU. When k and the number of iterations are small, the speedup is small as well, because of the overhead of data copying between the system and the GPU. On the other hand, a larger k and a higher number of iterations lead to a speedup of more than 100 times, and for k = 41 and 500 iterations it is more than 200 times.
5.2 Results on Larger Collection
The second experiment was run on the collection containing 6892 × 41 values, i.e. this collection has 35% more records than the previous one. The set of k values and the numbers of iterations stayed unchanged. The results are shown in Table 2 and the final speedup achieved by the GPU in comparison with the CPU is shown in Figure 2. The missing values, denoted by the "-" symbol, mean that these results were not computed due to their enormous running time.

Table 2. Computation time [ms], large collection (CPU/GPU for each number of basis vectors k)

Iterations   k=12 CPU/GPU    k=17 CPU/GPU    k=25 CPU/GPU    k=33 CPU/GPU    k=41 CPU/GPU
10           1564/402        2399/410        3977/416        5927/420        8075/422
50           8092/563        12789/567       22314/599       35229/632       48800/637
100          17665/741       28361/774       48813/830       75122/880       108987/890
500          107700/2264     167262/2442     275312/2663     389674/2967     627278/3029
1000         277317/4292     -/4309          -/4778          -/5406          -/5556
For the large collection, the computation on the GPU leads to an even better speedup than for the small collection, as can be seen in the tables and figures. The efficiency of the GPU is much higher for tasks which may be parallelized.
Fig. 2. Speedup on GPU, large collection (speed-up factor vs. number of iterations for k = 12, 17, 25, 33, 41)
6 Conclusion
This paper described the possibility of computing Non-negative Matrix Factorization on the GPU. The GPU has many parallel execution units, which enables us to solve some problems faster than on a standard CPU. The final speedup was more than 200 times. In the future, we would like to test larger collections and improve the implementation of NMF on the GPU using further optimizations. Handling very large collections, which cannot be stored in main memory, is the main task of our near-future work.
Acknowledgement. This work was supported by the Ministry of Industry and Trade of the Czech Republic, under the grant no. FR-TI1/420.
References
1. Abdulla, H.D., Polovincak, M., Snasel, V.: Using a matrix decomposition for clustering data. In: International Conference on Computational Aspects of Social Networks, pp. 18–23 (2009)
2. AMD ATI: ATI Stream technology (February 2010), www.amd.com/stream
3. Andrecut, M.: Parallel GPU implementation of iterative PCA algorithms. Journal of Computational Biology 16(11), 1593–1599 (2009)
4. Berry, M., Dumais, S., Letsche, T.: Computational Methods for Intelligent Information Access. In: Proceedings of the 1995 ACM/IEEE Supercomputing Conference, San Diego, California, USA (1995)
5. Berry, M.W., Browne, M.: Email surveillance using non-negative matrix factorization. Comput. Math. Organ. Theory 11(3), 249–264 (2005)
6. Berry, M.W., Browne, M., Langville, A.N., Pauca, V.P., Plemmons, R.J.: Algorithms and applications for approximate nonnegative matrix factorization. Computational Statistics & Data Analysis 52(1), 155–173 (2007)
7. Castelli, V., Thomasian, A., Li, C.-S.: CSVD: Clustering and singular value decomposition for approximate similarity search in high-dimensional spaces. IEEE Transactions on Knowledge and Data Engineering 15, 671–685 (2003)
8. Eldén, L.: Matrix Methods in Data Mining and Pattern Recognition (Fundamentals of Algorithms). Society for Industrial and Applied Mathematics, Philadelphia (2007)
9. Fodor, I.K.: A survey of dimension reduction techniques. Technical report, Lawrence Livermore National Laboratory (2002)
10. Hager, G., Zeiser, T., Wellein, G.: Data access optimizations for highly threaded multi-core CPUs with multiple memory controllers. In: IPDPS, pp. 1–7. IEEE, Los Alamitos (2008)
11. Heiler, M., Schnörr, C.: Learning sparse representations by non-negative matrix factorization and sequential cone programming. J. Mach. Learn. Res. 7, 1385–1407 (2006)
12. Kurucz, M., Benczur, A., Pereszlenyi, A.: Large-scale principal component analysis on livejournal friends network. In: Proceedings of SNAKDD 2008 (2008)
13. Lahabar, S., Narayanan, P.J.: Singular value decomposition on GPU using CUDA. In: IPDPS '09: Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, Washington, DC, USA, pp. 1–10. IEEE Computer Society, Los Alamitos (2009)
14. Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401, 788–791 (1999)
15. Lee, D.D., Seung, H.S.: Algorithms for non-negative matrix factorization. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) Advances in Neural Information Processing Systems, pp. 556–562. MIT Press, Cambridge (2001)
16. Moravec, P., Gajdos, P., Snasel, V., Saeed, K.: Normalization impact on SVD-based iris recognition, pp. 60–64 (June 2009)
17. Nielsen, F.Å.: Clustering of scientific citations in Wikipedia. In: Wikimania 2008, Informatics and Mathematical Modelling, Technical University of Denmark, Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby (June 2008)
18. nVidia: CUDA programming guide 2.3 (February 2010), http://developer.nvidia.com/object/cuda_2_3_downloads.html
19. Patnaik, D., Ponce, S.P., Cao, Y., Ramakrishnan, N.: Accelerator-oriented algorithm transformation for temporal data mining. CoRR, abs/0905.2203 (2009)
20. Platoš, J., Snášel, V., Krömer, P., Abraham, A.: Detecting insider attacks using non-negative matrix factorization. In: IAS, pp. 693–696. IEEE Computer Society, Los Alamitos (2009)
21. Platoš, J., Snášel, V., Krömer, P., Abraham, A.: Designing Light Weight Intrusion Detection Systems: Non-negative Matrix Factorization Approach. In: Socioeconomic and Legal Implications of Electronic Intrusion, Information Science Reference, 1st edn., pp. 216–229 (April 2009)
22. Praus, P.: SVD-based principal component analysis of geochemical data. Central European Journal of Chemistry 3(4), 731–741 (2005)
23. Preis, T., Virnau, P., Paul, W., Schneider, J.J.: Accelerated fluctuation analysis by graphic cards and complex pattern formation in financial markets. New Journal of Physics 11(9), 093024, 21 (2009)
24. Shahnaz, F., Berry, M.W., Pauca, V.P., Plemmons, R.J.: Document clustering using nonnegative matrix factorization. Inf. Process. Manage. 42(2), 373–386 (2006)
25. Skillicorn, D.: Understanding Complex Datasets: Data Mining using Matrix Decompositions. Chapman & Hall/CRC (2007)
26. Snasel, V., Kromer, P., Platos, J.: Evolutionary approaches to linear ordering problem. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds.) DEXA 2008. LNCS, vol. 5181, pp. 566–570. Springer, Heidelberg (2008)
27. Snasel, V., Kromer, P., Platos, J., Husek, D.: On the implementation of boolean matrix factorization. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds.) DEXA 2008. LNCS, vol. 5181, pp. 554–558. Springer, Heidelberg (2008)
28. Snasel, V., Platos, J., Kromer, P., Abraham, A.: Matrix factorization approach for feature deduction and design of intrusion detection systems. In: Rak, M., Abraham, A., Casola, V. (eds.) Proceedings of IAS 2008, pp. 172–179 (2008)
29. The UCI KDD Archive: KDD Cup data (February 2010), http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
Chatbot Enhanced Algorithms: A Case Study on Implementation in Bahasa Malaysia Human Language
Abbas Saliimi Lokman and Jasni Mohamad Zain
Faculty of Computer Systems & Software Engineering, Universiti Malaysia Pahang, Lebuhraya Tun Razak, 26300 Kuantan, Pahang Darul Makmur
[email protected], [email protected]
http://www.ump.edu.my
Abstract. The chatbot is one of the technologies that try to address the question that entered the computer science field in 1950: "Can machines think?" [6]. Proposed by the mathematician Alan Turing, the question later became the pinnacle reference for researchers in the artificial intelligence discipline. Turing also introduced "The Imitation Game", now known as the "Turing Test", where the idea of the test is to examine whether a machine can fool a judge into thinking that they are having a conversation with an actual human. The technology back then was great, but with the rapid evolution of computer science it can become even better. Evolution in computer scripting languages, application design models, and so on clearly has its advantages for enabling more complex features when developing a computer program. In this paper, we propose enhanced algorithms for a chatbot that take advantage of the relational database model to design the whole chatbot architecture, enabling several features that cannot, or can only with difficulty, be implemented with previous programming techniques. We start with a literature review of previously developed chatbots, followed by a detailed description of each new enhanced algorithm, together with testing and results from the implementation of these new algorithms, which can be used in the development of a modern chatbot. These new algorithms enable features that extend chatbot capabilities in responding to a conversation. The algorithms were actually implemented in the design and development of a chatbot that specifically deals with the Bahasa Malaysia language, but taking into account that language in a chatbot is really about the data in the chatbot's knowledge base, the algorithms seem transferable to whatever other human language they fit. Keywords: Chatbot, sentence processing, pattern matching, knowledge-based, conversation path.
We would like to thank Universiti Malaysia Pahang (UMP) for supporting this research under grant no. GRS 070166.
1 Introduction
Through an Artificial Intelligence chatting robot, or chatbot, humans can interact with a computer using natural language, i.e. the language that humans use to interact with each other. First introduced by Joseph Weizenbaum (an MIT professor) in 1966 [9], ELIZA became the first chatbot, and it later became an inspiration for computer science and linguistics researchers to create computer applications that can understand and respond to human language. The huge breakthrough in chatbot technology came in 1995, when Dr. Richard Wallace, an ex-professor of Carnegie Mellon University, combined his background in computer science with his interest in the internet and natural language processing to produce A.L.I.C.E., the Artificial Linguistic Internet Computer Entity. A.L.I.C.E., later described as a modern ELIZA, is a three-time winner of the Loebner Prize, an annual instantiation of Turing's test for machine intelligence [10]. As computer science evolves, so does chatbot technology. For a chatbot, which needs to have a huge knowledge base (which some call the "chatbot's brain"), managing data is really critical. Reviewing the evolution of chatbot technology, which parallels the evolution of computer science: ELIZA stored its data by embedding the knowledge base right into the code; then came A.L.I.C.E., which uses AIML (Artificial Intelligence Markup Language), a derivative of the Extensible Markup Language, XML [1] [7]. Now, with the relational database model and Database Management System (DBMS) technology, chatbots that make use of them have appeared. One example is VPbot, an SQL-based chatbot for medical applications. Developed by Dr. Griffin Webber from Harvard University, VPbot takes advantage of the relational database model to store and manage its data, and it even uses SQL (a database language) to perform the chatbot's data processing.
2 Chatbots Literature
2.1 ELIZA
The fundamental technical problems with which ELIZA must be preoccupied are the following: (1) the identification of the "most important" keyword occurring in the input message; (2) the identification of some minimal context within which the chosen keyword appears – for example, if the keyword is "you", is it followed by the word "are" (in which case an assertion is probably being made); (3) the choice of an appropriate transformation rule and, of course, the making of the transformation itself; (4) the provision of a mechanism that will permit ELIZA to respond "intelligently" when the input text contains no keywords; and (5) the provision of machinery that facilitates editing, particularly extension, of the script on the script-writing level [9].
In ELIZA, input sentences are analyzed on the basis of decomposition rules that are triggered by keywords appearing in the input text. Responses are then generated by reassembly rules associated with the selected decomposition rules [9].
The basic principle of the decomposition rules in ELIZA is that the input sentence is analyzed from left to right. Each word is looked up in the database of keywords. If a word is identified as a keyword, the decomposition rules apply. Let us say the input sentence is "Yesterday, my chest hurt badly" and the keyword that ELIZA is looking for is "hurt". After analyzing each word, ELIZA finds the word "hurt" in the sentence, and the decomposition rule (0 hurt 0) applies, meaning:
1. all words in front of "hurt";
2. the word "hurt" itself;
3. all words following "hurt".
All three decomposition components are then used to generate the response.
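As a small illustration of the decomposition/reassembly idea (not Weizenbaum's original script format), the (0 hurt 0) rule can be mimicked in Python with a regular expression; the reassembly template below is an invented example:

```python
import re

# Hypothetical mini-script for one keyword.  The regular expression plays the
# role of the (0 hurt 0) decomposition rule: group 1 = words before "hurt",
# group 2 = words after it.
DECOMPOSITION = re.compile(r"^(.*?)\bhurt\b(.*)$", re.IGNORECASE)
REASSEMBLY = "Tell me more about how {before} hurt {after}."

def eliza_reply(sentence):
    match = DECOMPOSITION.match(sentence)
    if match is None:
        return "Please go on."                      # no keyword in the input
    before = match.group(1).strip(" ,.").lower()
    after = match.group(2).strip(" .")
    return REASSEMBLY.format(before=before, after=after)

print(eliza_reply("Yesterday, my chest hurt badly"))
# -> "Tell me more about how yesterday, my chest hurt badly."
```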
2.2 A.L.I.C.E.
A.L.I.C.E.'s knowledge about English conversation patterns is stored in AIML files. In order for AIML to work with A.L.I.C.E., the knowledge base in AIML is stored using a mechanism called the Graphmaster. The Graphmaster consists of a collection of nodes called Nodemappers. These Nodemappers map the branches from each node. The branches are either single words or wildcards. A convenient metaphor for AIML patterns is the file system stored in our computer, which is organized hierarchically (a tree structure). The file system has a root, such as "c:", and the root has some branches that are files and some that are folders. The folders, in turn, have branches that are both folders and files. The leaf nodes of the whole tree structure are files. Every file has a "path name" that spells out its exact position within the tree. The Graphmaster is organized in exactly the same way. A pattern stored in AIML such as "I LIKE TO *" is metaphorically "g:/I/LIKE/TO/star". All of the other patterns that begin with "I" also go into the "g:/I/" folder, and all of the patterns that begin with "I LIKE" go into the "g:/I/LIKE/" subfolder. So it is as if the folder "g:/I/LIKE/TO/star" has a single file called "template.txt" that contains the template [7].
Following the Graphmaster rules, the A.L.I.C.E. pattern matching process can be described as follows (let us say the first word of the input sentence is "yesterday" and the AIML is stored in a file system architecture with folders and files):
1. From the template file in the root folder, find a matching pattern. If no match is found, try:
2. Find the subfolder "_". If found, try matching all remaining suffixes of the input sentence following the first word "yesterday" (the whole input sentence). If no match is found, try:
3. Find the subfolder "yesterday". If found, try matching all remaining suffixes minus "yesterday". If no match is found, try:
4. Find the subfolder "*". If found, try matching all remaining suffixes of the input sentence following the first word "yesterday". If no match is found, change directory to the parent of this folder and put "yesterday" back at the head of the input.
These processes run recursively until the input is null (all words in the input sentence have been processed) or until a match is found in the template; then the process stops.
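The folder metaphor can be mimicked with a nested dictionary, and the recursive matcher below follows the same branch priority ("_", exact word, "*"). This is only an illustrative sketch of the matching order, not the actual A.L.I.C.E./Graphmaster implementation; the sample patterns and templates are invented.

```python
# Graphmaster-like trie: each node maps a word (or "_", "*") to a child node;
# a node that ends a pattern stores its response under the key "<template>".
GRAPH = {
    "I": {"LIKE": {"TO": {"*": {"<template>": "What else do you like to do?"}}}},
    "*": {"<template>": "Interesting. Tell me more."},
}

def match(node, words):
    if not words:
        return node.get("<template>")            # input consumed: need a template here
    first, rest = words[0], words[1:]
    for branch in ("_", first, "*"):             # AIML matching priority
        child = node.get(branch)
        if child is None:
            continue
        if branch in ("_", "*"):                 # a wildcard may absorb 1..n words
            for i in range(1, len(words) + 1):
                found = match(child, words[i:])
                if found:
                    return found
        else:
            found = match(child, rest)
            if found:
                return found
    return None

print(match(GRAPH, "I LIKE TO READ".split()))    # -> "What else do you like to do?"
```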
2.3 VPbot
VPbot is a modern chatbot that stores its language rules in a relational data model. It shares many of the same features as A.L.I.C.E., but it is often easier to define new language rules in VPbot than with AIML. Whereas A.L.I.C.E. is designed to be able to produce generic responses to a wide range of topics, VPbot is best suited for a targeted topic of conversation. The VPbot algorithm accepts three input variables: a vpid, the current topic, and a sentence. The vpid is a unique identifier for each VPbot instance. The topic is an optional variable, which can be used to handle pronouns. Although the topic is an input variable in the general VPbot algorithm, in the Virtual Patient implementation the student can neither see the topic nor change it; it is simply a variable that the Virtual Patient stores internally and returns to the chatbot with the next question. Note that while the VPbot topic serves a purpose similar to that of the AIML <topic> and <that> tags, it is used differently in VPbot [8]. The third input variable is a sentence entered by the user. The output of VPbot is a new sentence and a new topic. As with AIML, the output sentence can be dynamically constructed using parts of the input sentence; the database does not have to store every possible response. Although VPbot uses SQL rather than AIML, there are certain limits. As stated by Dr. Webber in his thesis, there are some limitations on what is possible with a single SQL statement, primarily because true recursion is not supported. However, in a restricted domain, such as a doctor–patient conversation, the full capabilities of AIML are not needed [8].
2.4 Recap
In summary, these three chatbots mostly use a word-by-word analysis technique (from left to right of an input sentence) that is quite similar to the human incremental parsing technique [5] for processing input data/sentences. Some of them also implement an algorithm concerning synonyms in order to reduce the number of keyword-matching possibilities. Another technique that shares the same objective is the use of a variable that holds the conversation topic (the topic and that tags in AIML and the topic variable in VPbot), with the intention of keeping the chatting conversation attached to the specified topic, or of producing a next response that is related to the previous one. Whether it is incremental parsing or holding the conversation topic, both techniques are related to the theory of human sentence processing, according to which the human brain has a working memory for sentence processing, whether specialized or not (part of general human memory) [2] [3] [4]. Nevertheless, beyond all these algorithms and techniques, we found several ways for a chatbot to enhance its capabilities. The next section (Section 3) describes several sets of new algorithms and techniques
for chatbots that will enhance their productivity, usability and also the accuracy of response selection. Although these algorithms were actually implemented in a Bahasa Malaysia human-language chatbot, the logic and techniques are rather transferable. The description covers the problem that each algorithm tries to solve, and then in a later section (Section 4) a technical description (testing and results) for each of them is discussed.
3 New Enhanced Algorithms
3.1 Extension and Prerequisite
Extension and Prerequisite are designed to make chatbots more capable of remembering the path within a conversation. This is an attempt to implement the working memory of human sentence processing within the brain of a chatbot. In general, chatbots are designed to respond to the user's input in a one-way input–response mechanism that does not actually hold the conversation path. It is like a search engine, where the user types an input and the engine responds specifically to that input without any relation to the previous input; when a new input is inserted, the process starts all over again, holding nothing related to the previous search. Although a technique exists that holds the "topic" of the conversation, it does not exactly draw the conversation path, because the "topic" mechanism captures what the conversation is about, not where the conversation began, went through and ended. In a specific-knowledge chatbot, a path can be useful for handling chatbot responses that are related to each other (whose meanings interact).
Extension and Prerequisite are designed to work with each other. In every response record, we add two variables named Extension and Prerequisite, whose default value is "0". The value changes if that particular response is linked to other responses (it can be linked to more than one response), which later draws a path. This path is used to hold the specific "issue" of the conversation in order to reduce the need for the chatbot to scan the whole database for keywords (scanning is focused on what is next in the path if the response has it defined in the Extension variable). This path is very practical when a set of responses points towards a final response that is only triggered if the conversation follows the path to its end – for example, the set of questions a doctor has to ask in order to diagnose a patient, where the first answer from the patient triggers a second question that depends on that first answer; an example question is "What is your gender?", which has two possible answers, "male" or "female", and these determine the next question. As for the Prerequisite variable, it holds the value of the previous response that triggered the new response. In short, Extension points to the next responses and Prerequisite to the previous responses. A visual explanation of Extension and Prerequisite is given in Fig. 1 (abbreviations: Response Id = rid; Extension = ext; Prerequisite = pre).
Fig. 1. Sample path for Extension and Prerequisite
To make the interactions between responses easy to manage, a UI (user interface) for the chatbot's knowledge base needs to be created. In this UI the Extension and Prerequisite values are visible, so that the chatbot's author can easily navigate the path: Extension lists the next responses and Prerequisite the previous responses related to the selected response.
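As a concrete illustration of the mechanism above, the sketch below models a small piece of the knowledge base as an in-memory structure. It is a minimal Python sketch, not the actual ViDi implementation; the field names (rid, ext, pre) simply follow the abbreviations introduced above, and the sample texts are invented placeholders.

```python
# Minimal sketch of responses carrying Extension and Prerequisite data.
# A value of 0 means "no link", as described in the text above.
responses = {
    10: {"text": "What is your gender?", "keywords": ["gender"], "ext": [11, 12], "pre": [0]},
    11: {"text": "Follow-up question for male patients.", "keywords": ["male"], "ext": [0], "pre": [10]},
    12: {"text": "Follow-up question for female patients.", "keywords": ["female"], "ext": [0], "pre": [10]},
}

def next_on_path(last_rid):
    """The conversation path: the Extension list of the last matched response."""
    if last_rid is None:
        return []
    return [rid for rid in responses[last_rid]["ext"] if rid != 0]

print(next_on_path(10))  # -> [11, 12]: response 10 draws two paths, one per answer
```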
3.2 General Words Percentage (GWP)
General Words Percentage (GWP) is used to determine which response is more appropriate when, after finding longer keywords, less-changed (synonym-replaced) keywords and applying all other techniques for finding the best-suited keywords, more than one possible response still remains (assuming the chatbot can choose only one most appropriate response to send to the user, which is usually the case). To make GWP work, the chatbot needs the following components: 1) a database of general words, which in the context of Bahasa Malaysia is the "penanda wacana", words or phrases used in a sentence to connect one point to another; such general words appear in almost every basic sentence and correspond in English to words such as "and", "or" and "therefore"; 2) a variable attached to each response in the knowledge base that holds its GWP value. When a response is inserted or edited in the knowledge base, the algorithm scans the new response for general words and calculates GWP as (total count of general words / total count of all words) * 100, so each response holds a GWP value that is the percentage of general words over all words in the response sentence. When the GWP values are analyzed to determine which response is more appropriate, the lowest value wins: a low percentage of general words means the response sentence contains more "meaning words", i.e. words that actually carry the meaning the sentence intends to convey, rather than words that merely connect the ideas of smaller phrases into a complete sentence.
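The following minimal sketch illustrates the GWP computation and the lowest-GWP selection rule described above. It is an illustration under stated assumptions: the general-word set is an English stand-in for the "penanda wacana" table, and the function names are not part of the actual chatbot.

```python
GENERAL_WORDS = {"and", "or", "therefore", "so", "then"}  # illustrative English stand-ins

def gwp(sentence):
    """Percentage of general words over all words in the sentence."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    general = sum(1 for w in words if w in GENERAL_WORDS)
    return 100.0 * general / len(words)

def pick_response(candidate_sentences):
    # The lowest GWP wins: that sentence carries the most "meaning words".
    return min(candidate_sentences, key=gwp)

print(pick_response(["Eat well and rest and then exercise",
                     "Diabetes is a chronic metabolic disease"]))
```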
3.3 Synonyms and Root-Words
Most previously developed chatbots have made use of synonym data, either to expand the possibilities of keyword matching, to remove the need to create redundant keywords that carry the same meaning in different words, or simply to simplify the input sentence. A synonym in a chatbot can be a single word, a phrase or even a complete sentence, because at the end of the process all synonyms are replaced with more relevant data suited to that particular chatbot. In developing a chatbot for the Bahasa Malaysia language, another linguistic issue besides synonyms has to be tackled: the "imbuhan". "Imbuhan" is the Bahasa Malaysia term for additional characters added to a word (mostly a verb) to make it fit the sentence (an active or passive sentence, for example). The appearance of an "imbuhan" in a word enlarges the pattern to be matched, because one word can carry different "imbuhan" at different times while still representing essentially the same meaning. To avoid creating redundant keywords with the same meaning (as with synonyms), we introduce the Root-words database, a database that includes all appropriate root words to which an "imbuhan" can be attached.
The database covers two types of root words: those that remain the same after the "imbuhan" is removed, and those that change slightly after the "imbuhan" is removed. The algorithm steps are simple: 1) scan the input sentence for the appearance of a root word using the incremental parsing technique; 2) if a root word is detected, replace the original word (from the input sentence) with the root word from the Root-words database; 3) if the Root-words database contains a "swap" entry (for the case where the root word changes because of the "imbuhan"), replace the original word with the swap word from the Root-words database.
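A minimal sketch of the three steps above is given below. The Root-words table is modelled here as a simple mapping from an inflected form to its root or swap word, which simplifies the scanning step described in the text; the sample entries are illustrative only.

```python
ROOT_WORDS = {
    "memakan": "makan",     # root unchanged once the "imbuhan" is removed
    "pemakanan": "makan",   # root slightly changed: the swap word is stored instead
}

def replace_root_words(sentence):
    out = []
    for word in sentence.lower().split():        # left to right, as in incremental parsing
        out.append(ROOT_WORDS.get(word, word))   # steps 2 and 3: replace with root/swap word
    return " ".join(out)
```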
4 Testing, Results and Discussions
This section discusses the implementation of all the newly proposed algorithms and techniques in a chatbot named ViDi (an acronym for Virtual Diabetes Physician), a Bahasa Malaysia chatting robot that functions as a virtual diabetes physician. ViDi answers users' questions about the diabetes disease with responses designed by actual diabetes physicians (the authors of ViDi's knowledge base). The response set is the result of multiple discussions among three medical doctors aimed at producing simplified definitions, descriptions and other information about the disease. ViDi's UI has two parts: 1) the ViDi chatting interface and 2) vBrain, the UI for managing ViDi's knowledge base. Both parts are developed using HTML, PHP, JavaScript, AJAX, CSS and MySQL. Fig. 2 and Fig. 3 give a rough preview of the ViDi and vBrain UIs (more detailed images of the UI follow later).
Fig. 2. 1) ViDi chatting interface. 2) vBrain: Managing ViDi’s responses.
Fig. 3. 1) vBrain: Managing ViDi’s keywords and Extension data for each response. 2) vBrain: Managing Synonyms, Root-words and General words.
The idea behind building vBrain for ViDi's knowledge-base management is to make it easy for an actual doctor, or for anyone without experience in programming languages (in the case of other chatbots), to author the chatbot's knowledge base. vBrain was also created as a front-end UI that helps authors manage response interactions when implementing the Extension and Prerequisite architecture and related matters.
4.1 Extension and Prerequisite
In the vBrain section where the author enters a new response (refer to Fig. 4), there is an input box for Extension data. If the response has more than one Extension, the author can separate the Response Ids with commas (,); if there is no Extension, the author simply leaves the default value of "0". After the author saves the new response, vBrain automatically updates the Prerequisite data of all responses referenced by the newly added Extension data. The same mechanism applies when a previous response is edited. In a nutshell, Extension is entered and edited manually by the author, while Prerequisite is adjusted automatically by vBrain whenever the response set changes, that is, whenever a new response is added or a previous response is edited.
Fig. 4. Add New Response UI
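The sketch below shows one way the automatic Prerequisite update could work when an author saves a response through a form like the one in Fig. 4: the comma-separated Extension field is parsed and the new response's id is appended to the Prerequisite list of every referenced response. This is an assumption about the mechanism, not the actual vBrain code, and the in-memory dictionary stands in for the real database.

```python
responses = {}   # rid -> {"text", "keywords", "ext", "pre"}, as in the earlier sketch

def save_response(rid, text, keywords, ext_field="0"):
    ext = [int(x) for x in ext_field.split(",") if x.strip() and x.strip() != "0"]
    responses[rid] = {"text": text, "keywords": keywords, "ext": ext or [0], "pre": [0]}
    for target in ext:                                  # automatic Prerequisite update
        pre = [p for p in responses[target]["pre"] if p != 0]
        if rid not in pre:
            pre.append(rid)
        responses[target]["pre"] = pre

# Example: save the two follow-up questions first, then the gender question.
save_response(11, "Follow-up question for male patients.", ["male"])
save_response(12, "Follow-up question for female patients.", ["female"])
save_response(10, "What is your gender?", ["gender"], ext_field="11,12")
print(responses[11]["pre"], responses[12]["pre"])   # -> [10] [10]
```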
The following figure (Fig. 5) shows an example of responses with Extension and Prerequisite data in the vBrain UI. This example is a simple one-to-one interaction between two responses that each have only one Extension and one Prerequisite entry (note that Extension and Prerequisite can hold more than one entry).
Fig. 5. 1) Response with Extension data. 2) Response with Prerequisite data.
Fig. 6. Sample conversation between ViDi and a user regarding the implementation of Extension and Prerequisite
In the example shown in Fig. 5, the two responses with Match Ids 10 and 11 are connected to each other by Extension (response 11 is an Extension of response 10) and Prerequisite (response 10 is a Prerequisite of response 11). In a chat, once response 10 is triggered, ViDi holds the Extension value 11, which is used in the keyword-matching process when the next input is entered. If none of the keywords of Extension response 11 matches the new input, ViDi then looks for other responses' keyword sets to match. Note that this happens only when none of the Extension response's keywords matches; in the keyword-matching process, precedence always goes to the responses listed in the Extension value. In addition, ViDi does not run the matching process for responses that have Prerequisite data unless that particular response appears in the Extension data of the previously matched response. A sample conversation for this relation between responses is shown in Fig. 6 above.
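The precedence rules just described can be summarised in a short sketch: keywords of the held Extension responses are tried first, and responses that declare a Prerequisite are skipped unless they lie on the current path. The keyword matcher and the sample data below are simplified stand-ins for ViDi's actual matching routine and knowledge base.

```python
responses = {
    10: {"keywords": ["gender"], "ext": [11, 12], "pre": [0]},
    11: {"keywords": ["male"],   "ext": [0],      "pre": [10]},
    12: {"keywords": ["female"], "ext": [0],      "pre": [10]},
}

def keyword_match(user_input, keywords):
    return any(k in user_input.lower() for k in keywords)   # stand-in matcher

def select_response(user_input, last_rid):
    path = [r for r in (responses[last_rid]["ext"] if last_rid else []) if r != 0]
    for rid in path:                                         # precedence: the held Extension
        if keyword_match(user_input, responses[rid]["keywords"]):
            return rid
    for rid, resp in responses.items():                      # fall back to the whole base
        if resp["pre"] != [0] and rid not in path:
            continue                                         # Prerequisite not on the path
        if keyword_match(user_input, resp["keywords"]):
            return rid
    return None

print(select_response("male", last_rid=10))    # -> 11
print(select_response("male", last_rid=None))  # -> None: response 11 needs its Prerequisite
```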
4.2 General Words Percentage (GWP)
Fig. 7 shows a sample database table of ViDi's response data, with GWP values stored as two-decimal floating-point numbers. As mentioned earlier, the GWP value is the total count of general words divided by the total count of all words in the sentence, multiplied by 100 to give the percentage of general words. Since the GWP value is calculated when a response is added or edited, ViDi only needs to retrieve the GWP value of each related response whenever it is needed. GWP values are needed when multiple final responses remain after the keyword-matching process. In that situation ViDi chooses the response with the lowest GWP value as the winner, on the grounds that the response with the lowest GWP holds the more meaningful sentence, i.e. it contains more "meaning words" than general words compared with the other responses (as described in Section 3.2). If more than one response shares the same lowest GWP value, ViDi has no choice but to pick one of them at random, on the assumption that they reveal the same amount of meaning in their respective sentences.
Fig. 7. A sample database table of ViDi's response data, with the GWP value in the last column. Note that the second column from the left holds the Match Id data.
To demonstrate how ViDi chooses its final winning response when more than one final response remains, we attached the one-match keyword "testing1" to three responses (Match Ids 25, 26 and 27) and "testing2" only to responses 25 and 26. Referring to Fig. 8 below, the keyword "testing1" triggers three final responses, and response 27, having the lowest GWP value (2.86 versus 3.51 and 5.71), becomes the winner. For the keyword "testing2", which triggers two final responses (25 and 26), the winner is response 25 with the lower GWP value of 3.51 versus 5.71.
4.3 Synonyms and Root-Words
Most previously developed chatbots have implemented synonym replacement, but none of them has really applied a root-word replacement mechanism. This is mainly because such a mechanism is not essential when developing a chatbot for English (the generally used language); when dealing with Bahasa Malaysia it becomes otherwise, because Bahasa Malaysia has its own linguistic component, the "imbuhan", which makes it possible for a word with the same meaning to have many variations in its spelling.
Fig. 8. Testing the selection of the winning response among several final responses after the keyword-matching process
If synonym replacement were carried out without limiting these word variations, the synonym database would become very large, full of redundant words with the same meaning, and authoring it would be very hard given the flexibility of the "imbuhan". An "imbuhan", in syntactic terms, is a set of additional characters added to a word (mostly a verb) to make it suit the sentence condition, for example the sentence type (passive or active). There are essentially two conditions in which an "imbuhan" affects the root word: 1) the "imbuhan" does not change the root word when added, and 2) the "imbuhan" slightly changes the root word to adapt to it. For the second condition, "slightly changed" means a minor change involving on average only one or two characters. Therefore, managing Root-words involves two parameters, word and swap: word is the root word contained within the inflected word, while swap is the word that replaces it when the root word falls under the second "imbuhan" condition. Synonyms use the basic parameters common to other chatbots: input (a word to be matched against a word from the user's input) and swap (the synonym that replaces input once it matches). In the input normalization process, precedence goes to Root-words replacement over Synonyms replacement, because the algorithm needs to eliminate word variations through Root-words replacement before normalizing the input through Synonyms replacement, which has no jurisdiction over words carrying an "imbuhan". Consequently, the swap data in the Synonyms database must not contain any word with an "imbuhan" attached, to avoid unnormalized words being passed to the next step of sentence processing after normalization. Fig. 9 below shows a sample of the vBrain UI for managing Synonyms and Root-words data, and Fig. 10 shows sample data before and after normalization with respect to the Synonyms and Root-words replacement processes.
Fig. 9. vBrain UI for managing Synonyms and Root-words
Fig. 10. Sample of input data before and after the normalization process regarding Synonyms and Root-words (note that this is not actually shown in the real conversation between the user and ViDi)
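A very small sketch of the normalization order follows: root-word replacement is applied before synonym replacement, so the synonym table never needs to hold words carrying an "imbuhan". Both tables here contain only illustrative stand-in entries.

```python
ROOT_WORDS = {"pemakanan": "makan"}   # illustrative Root-words entry (swap word stored)
SYNONYMS = {"sickness": "disease"}    # illustrative synonym entry (English stand-in)

def normalise(sentence):
    # Precedence: Root-words replacement first, then Synonyms replacement.
    words = [ROOT_WORDS.get(w, w) for w in sentence.lower().split()]
    return " ".join(SYNONYMS.get(w, w) for w in words)
```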
5 Conclusion
Although the case study of the newly proposed algorithms and techniques is limited to the Bahasa Malaysia language, the logic and techniques are largely transferable. Using a relational database model is a sound choice when dealing with the large amount of data needed in a chatbot's knowledge base. The relations within the knowledge-base data open up a whole new range of possibilities for interaction among the chatbot's data processes. These interactions make it possible to embed many more features in the chatbot's processing algorithm, from replacing the "topic" variable that connects responses to one another with Extension and Prerequisite, which are far more capable of defining and maintaining not just one but multiple paths across multiple responses, to focusing on a specific interaction rather than leaving matching wide open to interactions that have little to do with the conversation topic
or issue. Reducing the range of possibilities in keyword matching is one of the important concerns in chatbot development, because chatting happens in real time and every second is precious: leaving the user on hold for too long risks losing their interest in chatting with the chatbot. Any algorithm that shortens the search for a match is therefore critical, but it must not be so aggressive that it chooses a less relevant response over a much more relevant one; focusing on processing speed alone would neglect the essential purpose of a chatbot, which is conversation with a human and therefore requires relevant and applicable responses. That is why we take both performance and accuracy into account in the chatbot process, which is what defines our algorithms and techniques as enhancements of previously known chatbot algorithms.
References
1. Sawar, A., Atwell: Chatbots: are they really useful? LDV-Forum Band 22(1), 31–50 (2007)
2. Caplan, D., Waters, G.: Verbal working memory and sentence comprehension. Behavioral and Brain Sciences 22, 77–126 (1999)
3. Gibson, E.: Linguistic complexity: locality of syntactic dependencies. Cognition 68, 1–76 (1998)
4. King, J., Just, A.: Individual differences in syntactic processing: the role of working memory. Journal of Memory and Language 30, 580–602 (1991)
5. Crocker, M.W., Pickering, M., Clifton Jr., C. (eds.): Architectures and Mechanisms for Language Processing. Cambridge University Press, Cambridge (2000)
6. Turing, A.M.: Computing Machinery & Intelligence. Mind LIX(236) (1950)
7. Wallace, R.: AIML Pattern Matching Simplified, http://alicebot.org
8. Webber, G.M.: Data Representation and Algorithms For Biomedical Informatics Applications. PhD thesis, Harvard University (2005)
9. Weizenbaum, J.: ELIZA - A Computer Program For the Study of Natural Language Communication Between Man And Machine. Communications of the ACM 9(1) (1966)
10. Shah, H.: A.L.I.C.E.: an ACE in Digitaland. TripleC 4(2), 284–292 (2006)
Handwritten Digits Recognition Based on Swarm Optimization Methods Salima Nebti and Abdellah Boukerram Department of computer science, Ferhat Abbess University, Setif, 19000, Algeria [email protected], [email protected]
Abstract. In this paper, the problem of handwritten digit recognition is addressed using swarm-based optimization methods. These methods have been shown to be useful for a wide range of applications, such as functional optimization. The proposed work applies specific swarm-based optimization methods, namely the particle swarm optimizer and variations of bee colony optimization, to handwritten Arabic numeral recognition in order to improve the generalization ability of a recognition system through two alternatives. In the first, swarm-based methods are used as statistical classifiers, whereas in the second a combination of the well-known gradient-descent back-propagation learning method and the bees algorithm is proposed to achieve better accuracy and speed. A comparative study on a variety of handwritten digits shows that high recognition rates (99.82%) are obtained. Keywords: Handwritten digit recognition, particle swarm optimization, the bees algorithm, artificial bee colony optimization, multi-layer perceptron.
1 Introduction
Handwritten character recognition has gained significant attention and is still an active research area, as humans need to communicate with computers in the easiest manner to ease and speed up interaction as well as the exchange of information [5]. The objective of an optical character recognition (OCR) system is to interpret, in numerical format, alphabetic letters, numerals and other symbols such as punctuation without human intervention. Handwritten character recognition is an important task because it helps automate a great number of applications, such as the automatic reading of bank checks and postal addresses, and is especially relevant for tablet PC applications. Handwritten character recognition systems fall into two main categories: online and off-line recognition systems. The first treats the data in real time as the user writes the text, while the second treats data coming from a scanned document after text acquisition [11]. Character recognition is difficult due to the large variability of writing styles, and it is particularly hard at the level of text recognition, where recognition proceeds through several processes: segmenting the text into sentences, sentences into words, and
words into letters for their recognition. In all these processes, a lexicon is essential to deduce the correct hypothesis during recognition [10], [3]. The majority of character recognition systems are based on neural networks, thanks to their generalization capability, that is, their ability to handle examples distinct from the learned data [4]. Artificial neural networks (ANNs) have been applied successfully to handwritten character recognition. Evolutionary optimization provides new solutions for a wide range of neural network applications, such as neural network learning and the evolution of network architectures. The main reason for their use is their ability to explore large search spaces with minimal knowledge of the objective function to be optimized. The use of evolutionary algorithms such as genetic algorithms [2], genetic programming [15], simulated annealing [1] and particle swarm optimization [14], [16] has proved very effective for many applications. In this work we show that swarm-based algorithms can indeed be used effectively to build a recognition system capable of giving good results. Particle swarm optimization and bee colony optimization have been adapted for handwritten Arabic digit classification. We also show that the bees algorithm and the multi-layer perceptron (MLP) can be combined effectively to build a robust recognition system. Details of these algorithms, their adaptation and the obtained results are given. The rest of the paper is organized as follows. Section 2 gives a brief description of the MLP neural network. Sections 3 and 4 present the basic concepts underlying the bees algorithm and artificial bee colony optimization. Section 5 formulates the tackled problem. Section 6 presents some experimental results. Finally, conclusions and future work are drawn.
2 The Multi-Layer Perceptron
The multi-layer perceptron (MLP) is a supervised neural classifier consisting of an input layer whose size equals the input data size, one or more hidden layers whose sizes are determined by experiment, and an output layer whose size equals the number of target classes. In MLP networks, each neuron is connected to a number of inputs, which can be the input data or the outputs of the preceding layer. From each neuron, information is propagated to the next layer using an activation function (usually a sigmoid) applied to the scalar product of the inputs and the corresponding weights [7].
net_j = bias \cdot W_{bias} + \sum_{k} O_{pk} W_{jk}, \qquad O_{pj}(net_j) = \frac{1}{1 + e^{-\lambda\, net_j}} \qquad (1)

where O_{pj} is the output of neuron j for pattern p using a sigmoid activation function, j is the current neuron, p indicates the current pattern, W_{jk} is the synaptic weight from input k to neuron j, and net_j is the total weighted input of neuron j.
3 The Bees' Algorithm
The Bees Algorithm is a recent meta-heuristic that imitates the foraging behavior of honey bees searching for food sites. It performs a neighborhood search combined with a random search for combinatorial or functional optimization [13]. To find an optimal solution, a proportion of the population is designated as representative bees (sites) searching for the best patches, while the remaining bees scout randomly from one patch to another during exploration [12]. The bees algorithm in pseudo code is as follows [12], [13]:
Initialize a population of solutions at random.
Evaluate the fitness of each individual.
While (stopping criterion is not satisfied)
Select sites for neighborhood search.
Recruit bees for the selected sites; recruit more bees around the elite sites.
Evaluate the recruited bees' fitnesses.
Replace each site (elite site) by the fittest bee from its neighborhood.
Assign the remaining bees at random, then evaluate their fitnesses.
End while
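For illustration, a compact, runnable sketch of the bees algorithm for continuous minimisation is given below. The parameter names and the sphere test function are assumptions chosen for demonstration and do not reproduce the settings or code used in the experiments.

```python
import random

def bees_algorithm(objective, dim, bounds, n_bees=50, n_sites=10, n_elite=3,
                   recruits_site=10, recruits_elite=30, ngh=0.1, iters=100):
    def random_bee():
        return [random.uniform(*bounds) for _ in range(dim)]

    def neighbour(bee, radius):
        # Random point in the neighbourhood, clamped to the search bounds.
        return [min(max(x + random.uniform(-radius, radius), bounds[0]), bounds[1])
                for x in bee]

    population = [random_bee() for _ in range(n_bees)]   # random initialisation
    for _ in range(iters):
        population.sort(key=objective)                   # best (lowest) first
        new_population = []
        for rank, site in enumerate(population[:n_sites]):
            # Elite sites recruit more bees than ordinary selected sites.
            recruits = recruits_elite if rank < n_elite else recruits_site
            candidates = [neighbour(site, ngh) for _ in range(recruits)]
            new_population.append(min(candidates + [site], key=objective))
        # Remaining bees scout at random.
        new_population += [random_bee() for _ in range(n_bees - n_sites)]
        population = new_population
    return min(population, key=objective)

# Example: minimise the 5-dimensional sphere function.
best = bees_algorithm(lambda x: sum(v * v for v in x), dim=5, bounds=(-5.0, 5.0))
print(best)
```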
4 The Artificial Bee Colony Optimization (ABC)
The ABC algorithm is a recent population-based algorithm proposed by Karaboga [6], inspired by the intelligent behavior of honey bees searching for food sources. The population in the ABC algorithm is organized into three sub-groups: employed bees, onlookers and scouts. In the ABC algorithm, food sources are possible solutions to an optimization problem and their nectar amounts correspond to the fitnesses of the associated solutions [6]. In pseudo code:
Initialize the food sources at random (employed and onlooker bees)
For a number of iterations do
For each employed bee do
Determine its fitness
Produce a new solution in its neighborhood (its site)
If it is better, it replaces the current employed bee
End
Calculate the probability value of the sources preferred by the onlooker bees
For each onlooker bee do
Select a source based on the estimated probability
Produce a new solution in its neighborhood
If it is better, it replaces the current onlooker bee
End
Initialize the abandoned sources at random (not-improved solutions)
Memorize the best food source
End [8]
5 Problem Formulation
The described approach can be summarized in the following steps.
5.1 The Preprocessing Step
We perform the following preprocessing steps:
• Binarization using the Otsu method
• Cropping each character from its image by eliminating the blank spaces
• Skeletonization (thinning)
5.2 Features Extraction Step
In this paper we consider two well-known feature extractors: zoning and Hu moments.
Zoning: the character image is divided into 8 by 8 zones, and the normalized number of foreground pixels, i.e. the pixel density of each zone, is taken as a feature item, so the feature vector of each character image has 64 (8*8) features.
Hu moments: the first 7 Hu moments are used; these features were chosen because they are known to be invariant to translation, scaling and rotation.
5.3 The Classification Step
We developed two alternatives.
The first alternative. After the feature vectors are determined, they can be assigned to known character classes (here ten classes, 0 to 9). In this work each particle (bee) is a set of ten feature vectors of randomly chosen characters from the training database; these features are treated as cluster centers. A cluster is a group of characters belonging to the same class, and its center is the character that yields the minimum sum of squared error (SSE). The aim of PSO (or the bees algorithm) is to minimize the SSE between the training data and the set of handwritten characters being recognized, and then to keep the best characters, i.e. those giving the best minimized fitness (the sum of squared error).
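The zoning extractor of Section 5.2 can be sketched as follows; the input is assumed to be a binary array whose height and width are divisible by the grid size (for instance 32 x 32), which is an assumption made only for illustration.

```python
import numpy as np

def zoning_features(binary_image, grid=8):
    """Pixel density of each zone in a grid x grid partition of the image."""
    img = np.asarray(binary_image, dtype=float)
    h, w = img.shape
    zh, zw = h // grid, w // grid
    features = []
    for i in range(grid):
        for j in range(grid):
            zone = img[i * zh:(i + 1) * zh, j * zw:(j + 1) * zw]
            features.append(zone.sum() / zone.size)   # normalized foreground count
    return np.array(features)                         # 64 features for an 8x8 grid

print(zoning_features(np.random.randint(0, 2, (32, 32))).shape)   # -> (64,)
```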
The proposed approach uses PSO or bee colony optimization dynamics to search for a good character classification.
Algorithm: the particle swarm optimizer (bee colony optimizer) for character recognition
1. Transform the set of characters being recognized into feature vectors.
2. Initialize a population of particles (bees) with a number of character features (this number equals the number of characters being recognized).
For i = 1 to number of characters being recognized do
3. Depending on the fitness of the current particle (bee: a character) and using the chosen optimizer, generate a new particle (bee), i.e. a new character, in the current generation, and save the best character, i.e. the one with the minimum sum of squared error.
4. Check whether the specified number of iterations has been reached.
5. If not, go to step 3.
6. End
End for i
The second alternative. This alternative combines the well-known back-propagation algorithm with the BA to achieve high accuracy. We consider ten MLPs to recognize the ten digits (0..9); the output of each MLP is 1 for the considered digit class and 0 for all the other digits. For example, the output of the first MLP, which recognizes the digit 1, is:
1000000000
0000000000
0000000000
………………….
The output of the second MLP, which recognizes the digit 2, is:
0000000000
0100000000
0000000000
………………….
The output of the third MLP, which recognizes the digit 3, is:
0000000000
0000000000
0010000000
………………….
And so on. We adopt this scheme to find the unrecognized digits during the test step (an unrecognized character is a character not classified by any of the ten networks).
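A sketch of the fitness used in the first alternative above is shown below: a candidate solution is a set of ten class centres, its fitness is the sum of squared errors between the training feature vectors and the centre of their class, and classification assigns a sample to the nearest learned centre. All names and the random example data are illustrative assumptions.

```python
import numpy as np

def sse_fitness(candidate_centres, train_features, train_labels):
    """Sum of squared errors between training vectors and their class centres."""
    sse = 0.0
    for x, y in zip(train_features, train_labels):
        diff = x - candidate_centres[y]
        sse += float(np.dot(diff, diff))
    return sse

def classify(sample, best_centres):
    # Assign the sample to the class of the nearest learned centre.
    return int(np.argmin(np.linalg.norm(best_centres - sample, axis=1)))

# Tiny example with random stand-in data (10 centres, 64 zoning features).
rng = np.random.default_rng(0)
centres = rng.random((10, 64))
X, y = rng.random((20, 64)), rng.integers(0, 10, 20)
print(sse_fitness(centres, X, y), classify(X[0], centres))
```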
The following is a description of this method.
Algorithm: the bees algorithm with an MLP trained by back-propagation for digit recognition
1. Transform the set of characters being recognized into feature vectors.
2. Apply back-propagation to train the ten MLPs.
3. Test the networks and determine the characters that are not recognized.
4. Send these characters as inputs to the BA classifier described above.
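The routing logic of the second alternative can be sketched as follows: a digit accepted by one of the ten one-per-class MLPs keeps that label, while an unrecognised digit is passed to the BA classifier. The StubMLP class and the 0.5 threshold are placeholders indicating only the assumed interface, not the networks actually trained in the paper.

```python
class StubMLP:
    """Stand-in for a trained MLP; predict() returns a score in [0, 1]."""
    def __init__(self, digit):
        self.digit = digit
    def predict(self, features):
        return 1.0 if features.get("digit_hint") == self.digit else 0.0

def recognise(features, mlps, ba_classify, threshold=0.5):
    scores = [mlp.predict(features) for mlp in mlps]
    if max(scores) >= threshold:
        return scores.index(max(scores))   # accepted by one of the ten MLPs
    return ba_classify(features)           # unrecognised: fall back to the BA classifier

mlps = [StubMLP(d) for d in range(10)]
print(recognise({"digit_hint": 7}, mlps, ba_classify=lambda f: -1))     # -> 7
print(recognise({"digit_hint": None}, mlps, ba_classify=lambda f: -1))  # -> -1 (BA fallback)
```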
6 Results and Discussion
To assess the performance of the proposed algorithms, different kinds of handwritten digits were used, namely handwritten digits obtained from the MNIST database and scanned samples of handwritten digits. The initial parameter settings used during the experiments are summarized in the table below.

Table 1. Initial parameter settings

The first alternative:
- PSO-based classifier: number of particles: 200; inertia weight: decreasing from 0.78 to 0.1; cognitive and social factors: 1.49.
- BA-based classifier: number of bees: 200; number of sites: 40; number of elite sites: 20; number of bees around sites: 50; number of bees around elite sites: 80; size of neighborhood: 0.0234.
- ABC-based classifier: number of employed bees: 100; number of onlooker bees: 100; number of iterations for scouting: 20.

The second alternative:
- BA: number of bees: 200; number of sites: 40; number of elite sites: 20; number of bees around sites: 50; number of bees around elite sites: 80; size of neighborhood: 0.0234.
- Each MLP: number of hidden neurons: 32; number of output neurons: 10; number of epochs: 5000; number of training characters: 8000; number of testing characters: 2000.
For all the following results, we used 10,000 selected digits from the MNIST database: 8000 digits for training (i.e. 800 digits per class) and the other 2000 digits for testing (i.e. 200 digits per class). The ABC approach was adapted with some changes: we employed the notion of neighborhood used in the BA in place of the random selection of possible solutions as neighbors, and we used the sum of squared error between the characters being recognized and the training characters as the fitness, without any alteration as in the original ABC algorithm. The probability of a source being preferred by the onlooker bees is the fitness of the corresponding employed bee divided by the sum of all employed bees' fitnesses. With these changes, the algorithm is more stable when applied to handwritten digit recognition. Table 2 summarizes the results obtained after 10 runs of each algorithm.

Table 2. Comparative results

The first alternative (recognition rate with zoning features / Hu moments):
- PSO (100 iterations): 80.428% / 92.2%
- ABC (100 iterations): 80% / 84.49%
- BA (100 iterations): 92.857% / 99.8%
- K-nn: 90% / 98%

The second alternative (recognition rate):
- Back-propagation (50,000 iterations), without regularization or cross validation: 80.01%
- Combined back-propagation and BA (5000 iterations for each MLP with BP + 100 iterations for the BA): 99.82%
As these results show, in the first alternative the BA-based training approach gives the best results, and the PSO algorithm gives better results than ABC when dealing with handwritten digit recognition. It was also found that the Hu moment feature extractor contributes to a better classification than the zoning feature extractor. In the second alternative, the combination of back-propagation with the BA gives the best results compared with back-propagation alone, because the digits misclassified by the MLPs are subsequently classified correctly by a good classifier such as the bees algorithm. These results are very promising when compared to [9].
It is also worth noting that we found in the MNIST database a number of similar digits that belong to different classes (i.e. having the same feature vector but different labels, that is, different target vectors). For example, we found samples that look like the digit 6 but are labeled as 5, a sample that looks like the digit 4 but is labeled as 8, and a sample that looks like the digit 8 but is labeled as 3, among others. When these digits were eliminated, a recognition rate of 100% was obtained by the BA and by the combined back-propagation with the BA, using 7500 training digits and 2000 testing digits. The graphs below show the behavior of some of the classifiers, i.e. the minimization of the sum of squared error (SSE) on the training data over the iterations, when the Hu moment features are considered.
[Figures 1-4 plot the sum of squared error (SSE) on the training data against the number of iterations for each classifier; only the captions are reproduced here.]
Fig. 1. The ABC classifier behavior: Recognition Rate 80.9%
Fig. 2. The BA classifier behavior: Recognition Rate 99.85%
Fig. 3. The combined Bp with BA classifier behavior: Recognition Rate 99.87%
Fig. 4. The PSO classifier behavior: Recognition Rate 90.65%
7 Conclusion
Our work focuses on applying swarm-based optimization methods as training classifiers for handwritten digit recognition. We have shown how particle swarm optimization and bee colony optimization methods can be used to guide the search for better numerals, those closest to the numerals being recognized, thereby allowing good recognition accuracy while maintaining diversity. The experiments show very promising results in terms of accuracy. For handwritten digit recognition, the BA usually gives the best results compared with PSO, ABC and the well-known K nearest neighbor (K-nn) classifier. We also found that combining the BA with back-propagation for training an MLP gives the best results in terms of accuracy and speed, thanks to hybridization. The neural approach turned out to be the best: it takes a long time to train, but once training finishes, the best weights are saved and recognition is very quick, in contrast to the other presented approaches, which give good results but require a very long recognition time, making their use impractical in real applications. As future work, we can extend this approach to recognize handwritten Latin characters based on structural and geometrical features. We can also reduce the training time through digit selection, using these swarm optimization algorithms as clustering methods before the training step. Finally, we expect that using these meta-heuristics for neural network architecture evolution (i.e. to determine the best numbers of hidden layers and hidden neurons) will give good-quality results.
References
1. Arnaud, A.L., Adeodato, P.J.L., Vasconcelos, C.G.: MLP Neural Networks Optimization through Simulated Annealing in a Hybrid Approach for Time Series Prediction. In: SBC ENIA 2005 - V Encontro Nacional de Inteligência Artificial, pp. 1110–1113 (2005)
2. Branke, J.: Evolutionary algorithms for neural network design and training. Technical Report No. 322, University Karlsruhe, Institute AIFB (1995)
3. Bloomberg, D., et al.: Extraction of Text-Related Features for Condensing Image Documents. In: SPIE, vol. 2660, pp. 72–88 (1996)
4. Vesely, A.: Neural networks in data mining. AGRIC. ECON. – CZECH 49(9), 427–431 (2003)
5. Harnett, J.: Developments in OCR for automatic data entry. In: Picken, C. (ed.) Proceedings of a conference on Translating and the Computer, November 14-15, 1985, vol. 7, pp. 109–117. CBI Conference Centre, London (1986)
6. Karaboga, D.: An Idea Based On Honey Bee Swarm for Numerical Optimization. Techn. Rep. TR06, Erciyes Univ. Press, Erciyes (2005)
7. Rojas, R.: A Graph Labelling Proof of the Backpropagation Algorithm. ACM Commun. 39(12es), 207–215 (1996)
8. Karaboga, D., Akay, B.: Artificial Bee Colony (ABC), Harmony Search and Bees Algorithms on Numerical Optimization. In: IPROMS 2009, Innovative Production Machines and Systems Virtual Conference, Cardiff, UK (2009)
9. Gorgel, P., Oztas, O.: Handwritten Character Recognition System Using Artificial Neural Networks. Istanbul University – Journal of Electrical & Electronics Engineering 7(1), 309–313 (2007)
10. Makhoul, J., Starner, T., Schwartz, R.M., Chou, G.: On-Line Cursive Handwriting Recognition Using Hidden Markov Models and Statistical Grammars. In: Proceedings of the Human Language Technology Conference HLT 1994, pp. 432–436 (1994)
11. Park, J.: Hierarchical Character Recognition and its uses in handwritten word/phrase recognition. PhD thesis (1999)
12. Pham, D.T., Ghanbarzadeh, A., Koc, E., Otri, S., Rahim, S., Zaidi, M.: The Bees Algorithm. Technical Note, Manufacturing Engineering Centre, Cardiff University, UK (2005)
13. Pham, D.T., Ghanbarzadeh, A., Koç, E., Otri, S., Rahim, S., Zaidi, M.: The Bees Algorithm – A Novel Tool for Complex Optimisation Problems. In: Proceedings of IPROMS 2006 Conference, pp. 454–461 (2006)
14. Su, T., Jhang, J., Hou, C.: A hybrid artificial neural networks and particle swarm optimization for function approximation. International Journal of Innovative Computing, Information and Control 4(9), 2363–2374 (2008)
15. Tan, C.L., Chia, H.W.K.: Neural Logic Network Learning using Genetic Programming. In: Proc. 17th Int'l Joint Conf. Artificial Intelligence (IJCAI 2001), vol. 2, pp. 803–808 (2001)
16. Wu, P., Shieh, C.S., Kao, J.H.: The Development of Neural Network Models by Revised Particle Swarm Optimization. In: JCIS 2006, pp. 1951–6851 (2006), http://cui.unige.ch/AI-group/teaching/dmc/07-08/cours/dm06-mlp-handouts.pdf
A Framework of Dashboard System for Higher Education Using Graph-Based Visualization Technique Wan Maseri Binti Wan Mohd1, Abdullah Embong2, and Jasni Mohd Zain3 Faculty of Computer System & Software Engineering University Malaysia Pahang [email protected], [email protected], [email protected]
Abstract. We propose a novel approach to knowledge visualization that adopts a graph-based visualization technique and incorporates the dashboard concept for higher education institutions. Two aspects are emphasized: knowledge visualization and human-machine interaction. The knowledge visualization helps users analyze the comprehensive characteristics of students, lecturers and subjects after the clustering process, while the interaction enables domain-knowledge transfer and the use of human perceptual capabilities, thus increasing the intelligence of the system. The knowledge visualization is enhanced through the dashboard concept, which provides significant patterns of knowledge about real-world and theoretical modeling, which could be called wisdom. The framework, consisting of the dashboard model, system architecture and a system prototype for the higher education environment, is presented in this paper.
Keywords: Data Mining, Knowledge Discovery, Data Visualization, Knowledge Visualization.
1 Introduction
For many tasks of exploratory data analysis, visualization plays an important role. Data visualization is widely accepted by the medical, scientific, engineering, entertainment and business industries because of its ability to add value to existing data, but it has not been widely studied for applications in higher education data analysis. It is critical for higher educationists to identify and analyze the relationships among different entities, such as students, subjects, lecturers, environment and organizations, to ensure the effectiveness of their important processes. Furthermore, higher education will find larger and wider applications for data visualization than its counterpart in the business sector, because higher education institutions carry out three duties that are data mining intensive: scientific research, which relates to the creation of knowledge; teaching, which concerns the transmission of knowledge; and institutional research, which pertains to the use of knowledge for decision making [1]. The study investigating a new concept of a university dashboard was motivated by the findings from a previous study comparing the existing university reporting system and the new expected dashboard through a qualitative and quantitative study done
among the stakeholders of a Malaysian higher education institution [16]. Table 1 compares the current reporting system and the new expected dashboard required by the university.
Table 1. Comparison between the features of the current reporting system and the new expected dashboard
1. New University Dashboard: illustrates the overall summary of student performance at a glance, using the dashboard concept and incorporating key performance indicators. Current University Reporting System: the current report is just normal reporting without the key performance indicators, and it displays details without a summary for the university.
2. New: incorporates a data mining engine to cluster the students' results for knowledge exploration. Current: does not incorporate a data mining process; the results only go through normal information retrieval with predefined requirements.
3. New: incorporates a forecasting engine to forecast future student performance. Current: does not incorporate a forecasting engine; it only displays the current status and history.
4. New: incorporates opinions from education experts on student performance and suggestions for improvement, as part of the knowledge management mechanism. Current: the current report does not include expert opinion; opinions are recorded as part of meeting minutes without proper documentation and with no knowledge management mechanism.
5. New: incorporates related publications, such as articles, conference proceedings, paperwork and technical reports, that relate to student performance. Current: insufficient research is being done to improve student performance, and findings are not attached to the academic system for review or to help in decision making.
6. New: incorporates drill-down features to zoom into cluster behavior and into analysis by faculty, year and race. Current: drill-down is available only by faculty, year, race, etc., not down to detailed student behavior; student behavior is not captured in any university application.
7. New: incorporates knowledge management elements to process explicit knowledge such as the clustering of students' results, detailed student behavior and analysis of results. Current: very limited knowledge management for explicit knowledge; the current system just displays results without exploring the insight of the information.
8. New: incorporates knowledge management elements to process tacit knowledge such as expert opinion, skill and experience through publications, i.e. a knowledge bank of technical reports, articles, journals, paperwork, etc. Current: displays only the result of explicit knowledge, without tacit knowledge.
Based on the above findings, a new knowledge-based dashboard system has been designed to facilitate knowledge visualization as part of knowledge management activities, specifically to support knowledge sharing, usage and exploration in higher learning institutions. It takes a holistic approach, combining various techniques including intelligent software engines, a graph-based visualization technique, the integration of explicit and tacit knowledge, and trend analysis. This paper explains the overall concept and design of the knowledge-based dashboard framework, the dashboard features, the visualization technique used in the dashboard, and the components of the dashboard, illustrated through the dashboard prototype.
2 Related Works
2.1 Visual Data Mining
The main reason for data visualization is the limited ability of human beings to absorb large amounts of information. The volumes of data are overwhelming, and the human visual system and brain are not equipped to work with data in raw form [2]. In brief, data visualization is a method of presenting output so that the entire problem and its solution are clearly visible to domain experts [3]. Data visualization uses graphical and numerical tools to reveal the information contained in data [2]; it is a more effective approach to understanding and communication than common numerical tools alone, and graphical methods hold the key to visualization.
2.2 Data Visualization Techniques and Software
Visual representation of data has evolved rapidly in recent years. Ankerst (1999) gives a comprehensive overview of existing visualization techniques for large amounts of multidimensional data [4]. In general, visualization techniques are powerful tools that are frequently employed in knowledge discovery processes. Visualizations can make complex relationships easily understandable and stimulate visual thinking [5]. In particular, tools that visualize the cluster structure of data are valuable for exploring and understanding data; such tools include data histograms for one-dimensional data as well as algorithms that project high-dimensional data onto a two-dimensional visualization space while trying to preserve the topology [5]. Recently, several data visualization techniques have been introduced. Among the popular ones are independence diagrams [6], equi-width histograms, correlation coefficients and scatterplots, geometric techniques, icon-based techniques, pixel-oriented techniques, hierarchical techniques, graph-based techniques and dynamic techniques [7].
2.3 Graph-Based Visualization Technique
Multiple relationships can be represented within a single graph by displaying the name of each relationship as the label of an edge or arc, or by using different colors or
line styles to render the edges or arcs associated with each relationship [12]. Besides the graph structure itself, interactive graph-based interfaces have been discussed in various studies. Interactivity is desirable for browsing and editing data; however, as graphs increase in size, interactive interfaces risk information overload and low responsiveness. Bondy and Murty (1976) provide an excellent introduction to graph theory [9]. The basic data structures and algorithms for manipulating graphs are outlined in [10], and a general introduction to database theory can be obtained from [11]. Subsequently, Freire (2004) highlighted the challenge of preserving the mental map during graph navigation: performing incremental layouts after each navigational or editing step implies greater differences between successive views, and therefore greater care is required to preserve the mental map during interaction. Freire (2004) also identified a series of general factors that contribute to mental map preservation, namely predictability, degree of change and traceability [12].
2.4 Digital Dashboard
A digital dashboard is a visual, at-a-glance display of data pulled from disparate business systems to provide warnings, action notices, next steps and summaries of business conditions [14]. Today, digital dashboard technology is available "out-of-the-box" from many software providers. The concept of "information overload" has been the driving force behind the creation of digital dashboards: in a typical organization there are many different sources of pertinent information from which to make decisions, yet it can be difficult to bring that information into a format that is truly useful [15]. The term dashboard has transcended the automotive arena to enter the business world hand in hand with visualization. The main function of digital dashboard technology is to synchronize information in one visible and easy-to-read place. Hence a digital dashboard is a graphic representation containing a series of gauges and depictions that summarize the state of the company, whether financial, sales or, more generally, any indicator that lets you know the situation of your business, possibly in real time [15].
2.5 Data Visualization in Higher Education
Major applications of data visualization identified for higher education institutions are alumni, institutional effectiveness, marketing and enrollment management; for example, data visualization allows institutions to act before a student drops out, to plan resource allocation based on the number of students taking a particular course, to identify those who are most likely to donate or participate in alumni-related activities, and to identify how students learn best, which courses are often taken together and which learning experiences contribute most to overall learning outcomes [1]. Although several studies have examined data visualization in higher education, the effectiveness of such visualization systems is still an open area to be explored. This study proposes a holistic approach to visualizing data in the higher education environment.
The motivation for this paper is to develop a framework for a graph-based dashboard visualization system as a knowledge discovery tool for the higher education environment, integrating the dashboard and graph-based visualization concepts.
3 The Framework of the Dashboard System for the Higher Education Environment
As computer systems become more advanced, the demand for better data visualization increases. In this paper, several concepts and techniques are combined and integrated to improve data visualization and hence the understanding of the data. The proposed framework has the following characteristics: (1) a user-friendly interface; (2) an overview of the data dimensions using the dashboard concept, i.e. a single visual cue; and (3) a graph-based visualization interface using cluster and feature-vector views with a color scheme, to give insight into the data while preserving the mental map.
3.1 Framework Design
Fig. 1 illustrates the framework of the new visualization system for higher education. The system is designed in three structured layers to provide a systematic, natural and convenient visualization of the data [9]. In a layered graph, each layer corresponds to an abstraction level containing a view of a collection of data, starting from an overview and drilling down further into the detail to zoom into the data. The first layer represents the summary or overview of the data, which conveys the health of the institution in dashboard form; the second layer gives users insight into the data to help them understand its internal patterns through a cluster view in graph form; and the third layer shows the behavior of each cluster through a feature-vectors view for further action. In a higher education system, each level of the organization's hierarchy needs a specific view of the data, so the proposed framework is designed to fulfill each level's needs by providing a different view for each level of user in the institution. The dashboard layer is designed for the management members of the organization, to visualize the overall health of the main subject of interest. The second layer, which shows the clusters of an object such as students, is designed for implementers such as lecturers in the institution; the student clusters help lecturers understand the overall pattern and identify clusters that need further investigation and action. The last layer, the feature-vectors layer, explains the behavior of each cluster to give the implementer more information for further action. Furthermore, interaction with domain knowledge at the dashboard layer allows users to change the performance indicators, and the system then triggers the clustering process to rerun the clustering algorithm based on this input. In the cluster layer, interactivity improves the clustering process based on the domain expert's inputs, and at the third layer, the feature-vectors view, the domain expert helps the system define more attributes for each particular cluster. These interactions improve the intelligence and accuracy of the clustering process while the system facilitates knowledge forecasting.
[Figure content: the three framework layers, Dashboard View, Cluster View and Feature Vectors View, connected by layer interactions]
Fig. 1. Framework of Graph-based Dashboard Data Visualization System
3.1.1 Dashboard Layer
The first layer represents the dashboard of the organization, whose main function is to synchronize the information generated by the data mining process in one visible and easy-to-read place. The concept of summarizing data through aggregation and roll-ups is applied in this study [16]. Applying these concepts, the framework proposes a new aggregation and roll-up technique, illustrated in Fig. 2.
[Figure content: student records grouped into a cluster and rolled up into gauges showing the current and the targeted number of students in the associated cluster]
Fig. 2. The transformation process through the aggregation and roll-ups technique applied in the graph-based Dashboard System
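A minimal sketch of this aggregation step is shown below: cluster memberships are counted and paired with targets from the KPI system to drive the gauges. The data values are illustrative only and do not come from the prototype.

```python
from collections import Counter

# Illustrative data: which cluster each student was assigned to by the clustering engine.
cluster_of_student = {"s1": 1, "s2": 1, "s3": 2, "s4": 1, "s5": 3, "s6": 1}
kpi_target = {1: 10, 2: 8, 3: 5}   # targeted number of students per cluster (from the KPI system)

current = Counter(cluster_of_student.values())   # roll-up: current count per cluster
for cluster, target in kpi_target.items():
    print(f"cluster {cluster}: current {current[cluster]} / target {target}")
```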
3.1.2 Cluster Layer
The cluster layer comprises the clusters generated by a data mining process using a clustering technique, namely MaxD K-means clustering [16]. The MaxD clustering technique was identified as the most appropriate for this study because it does not require any parameter to be supplied to the clustering engine. In the higher education environment, the need to explore the unknown is high; for example, the unknown pattern of student clusters has to be determined without specifying the expected number of clusters. The generated clusters are then visualized in layer 2 using the graph-based technique, as shown in Fig. 3. The graph-based technique is proposed in this framework because it easily illustrates the size of each cluster, the distance between the items and the centroid, and the distances among the clusters. The clustering process generates several clusters of students, such as excellent, less excellent, medium and not performing. The clustering technique is more explorative than the normal reporting system: in normal reporting the categories of students are fixed and predefined, whereas in the clustering process the number of generated categories is derived naturally by the clustering itself. The visualization is further improved by using a different color for each cluster.
Fig. 3. Clustering process using MaxD Clustering engine
3.1.3 Feature Vectors Layer
To further understand the behavior or attributes of each cluster, feature vectors are incorporated in the third layer of the system. The idea is motivated by the study by Jonyer (2001), in which the data are presented as feature vectors and represented as a collection of small, star-like, connected graphs. In that study, SUBDUE (the substructure discovery system) yields better results using a more general representation that includes a placeholder node (the student in our example) serving as the center node of the star [17]. Fig. 4 below illustrates an example of the feature vectors of a student.
Fig. 4. Feature Vectors for Students in Higher Education (a star-like graph with Student as the center node, linked to attributes such as CGPA > 3.5, active in co-curriculum, having a notebook, staying in hostel, and an average of 5 subjects per semester)
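The star-shaped feature vector of Fig. 4 can be represented directly as a small graph with the student as the placeholder center node. The sketch below shows one possible data structure for it; the attribute names and values follow the figure, while the function itself is our own illustration.

def student_feature_star(student_id, attributes):
    """Represent a student's feature vector as a star-like graph:
    one center node plus one edge per (attribute, value) pair."""
    center = ("Student", student_id)
    edges = [(center, (name, value)) for name, value in attributes.items()]
    return {"center": center, "edges": edges}

if __name__ == "__main__":
    star = student_feature_star("S001", {
        "CGPA": "> 3.5",
        "Co-curriculum": "Active",
        "Having notebook": "Yes",
        "Stay in hostel": "Yes",
        "Avg. number of subjects per semester": 5,
    })
    for _, (attr, value) in star["edges"]:
        print("Student ->", attr, ":", value)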
4 System Architecture
The system starts with a clustering process to cluster the information in the existing organization's database. At the same time, input from the Key Performance Indicator (KPI) system generates the dashboard for the targeted KPI. For both of these processes, interaction with the domain experts shall improve the intelligence of the system. The outputs from these processes are then further processed by the Visualization Engine to produce the final output through the Interactive Dashboard Knowledge Visualization System. The architecture of the system is shown in Fig. 5.
Fig. 5. System Architecture of the Interactive Dashboard Knowledge Visualization System (the Key Performance Indicator system and the organization's database feed the dashboard, clustering (MaxD) and visualization engine processes, with domain experts interacting at both stages, to produce the visualization webpage)
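The data flow of Fig. 5 can be read as a small pipeline: cluster the records in the institutional database, take the targets from the KPI system, and hand both to the visualization engine. The sketch below only illustrates that wiring; the function and field names are hypothetical, and the clustering and rendering callables stand in for the engines described above.

def run_dashboard_pipeline(database_records, kpi_targets, cluster_fn, render_fn):
    """Wire the components of Fig. 5 together (illustrative only).

    database_records: records read from the organization's database.
    kpi_targets:      targets supplied by the KPI system.
    cluster_fn:       clustering engine (stand-in for the MaxD process).
    render_fn:        visualization engine producing the dashboard page.
    """
    clusters = cluster_fn(database_records)          # Clustering Process
    dashboard_model = {                              # Dashboard Process
        "kpi_targets": kpi_targets,
        "clusters": clusters,
    }
    return render_fn(dashboard_model)                # Visualization Engine

if __name__ == "__main__":
    records = [{"id": 1, "cgpa": 3.7}, {"id": 2, "cgpa": 2.4}]
    page = run_dashboard_pipeline(
        records,
        kpi_targets={"first_class_students": 10},
        cluster_fn=lambda rows: {"high": [r for r in rows if r["cgpa"] > 3.5],
                                 "other": [r for r in rows if r["cgpa"] <= 3.5]},
        render_fn=lambda model: "<dashboard clusters=%d>" % len(model["clusters"]),
    )
    print(page)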
5 System Prototype
The dashboard for higher education institutions is designed to facilitate the management of the university in decision making and strategic planning. The web-based prototype of the Dashboard was developed using the PHP programming language and the MySQL database system, while the clustering and forecasting engines were developed using Oracle Forms 6i and Oracle Database 11g. In this prototype, a Dashboard of student performance has been designed and developed to illustrate the usage of the dashboard and visualization concepts. Based on the above Dashboard features and visualization techniques, the components of the Dashboard are derived as follows: 1) Health of the Institution at a Glance, 2) Knowledge Exploration, 3) Trend Analysis, and 4) Tacit Knowledge of Experts. These components are demonstrated via the developed prototype as shown in Fig. 6(a) and 6(b). Each Dashboard component is explained in the following sections.
Fig. 6(a). Components of Dashboard (screenshot of the university dashboard case study, first class students, page 1: fully web-based, with an overall summary reflecting the health of the institution, data mining through MaxD K-means clustering, and drill-down capability to explore student behavior)
Fig. 6(b). Components of Dashboard (screenshot of the university dashboard case study, first class students, page 2: incorporating tacit knowledge, drill-down capability for further analysis, and a document bank for further research)
5.1 Health of the Institution at a Glance
The first component of the proposed Dashboard is the 'at-a-glance' dashboard. Using the dashboard shown in Fig. 7, the management of the institution can view at a glance the status of performance in the current semester, the previous semester and the next semester.
Fig. 7. Health of the Institution at a Glance
Fig. 8. Clusters Generated By MaxD K-Means Clustering Engine
5.2 Knowledge Exploration
Knowledge exploration is the second component of the proposed Dashboard. The knowledge of performance is explored by displaying the clusters of students based on their performance. The clusters have been generated by the MaxD K-means clustering engine. In this case study, the clusters of students are based on the range of CGPA
derived by the clustering engine. The clusters illustrate groups of students with the same performance. Each cluster can then be further explored via the cluster's behavior. The purpose of viewing the behavior is to help the management identify the characteristics of each cluster in order to make recommendations and take action. For example, by exploring the cluster with CGPA 3.5, the characteristics of the students in this cluster can be identified through the cluster's behavior and, as a recommendation, the same characteristics and behavior can be practiced by other students to improve their CGPA. The prototype of the clusters and the cluster's behavior is shown in Fig. 8 and Fig. 9. Fig. 9 shows examples of characteristics or behavior of first class students which describe their study styles, attitudes and engagement time. Thus, by exploring these characteristics, other students will be able to practice the same things in order to help them improve their performance.
Fig. 9. Behavior of First Class Student Cluster (factors include strategized learning, interest in the subject, creating own assignments, good SPM results, full class attendance, searching for extra material, well-educated parents, a consistent learning style, study groups, closeness to lecturers, well-defined targets, mixing with good friends, and positive attitudes)
5.3 Trend Analysis
The third component of the Dashboard is trend analysis. Trend analysis is basically the process of transforming data from information into knowledge to help the management in decision making. Trend analysis is done by analyzing knowledge in the form of trends or patterns over a certain duration of time. The purpose of trend analysis is to identify problems at a high level. If there is any problem at the high level, then detailed information will be furnished to investigate and solve the problem. Trends and patterns will be generated through an executive information system (EIS) from the institution's database. The trends are presented in graphical format for effective visualization. Fig. 10 shows the trend analysis for student performance in graphical form with a drill-down feature.
Fig. 10. Trend Analysis of Student Performance
Besides the trend analysis shown in Fig. 10, the study has identified other types of trend analysis for student performance based on an investigation of selected participants from Malaysian higher learning institutions. The types of trend analysis are shown in Fig. 11.
Fig. 11. Types of Trend Analysis for Student Performance (from the Executive Information System, trends are generated by faculty, subject, lecturer, year, class and semester; each drill-down path ends in a list of students and then in individual student profiles)
In general, trend analysis in higher education institutions should include all aspects of management, including trend analysis of teaching delivery performance, staff performance, management and supervision performance, facility performance, student affairs and services, financial performance, product development performance, and external service performance. All these analyses will be generated through the EIS.
5.4 Expert Opinion
The expert opinions captured in the Dashboard for Higher Education Institutions are mainly interpretations and opinions from the experts on the generated knowledge displayed on the Dashboard. In the current situation, this knowledge is captured during meetings and is not attached to or embedded with the generated knowledge. Fig. 12
shows the components of tacit knowledge captured from education experts to interpret the results of knowledge processing, including the results of clustering, forecasting and trend analysis. The knowledge proposed in this study comprises expert interpretation based on knowledge, experience and observation. Referred publications such as journals, conference papers, books and articles are also captured as supporting tacit knowledge from external experts.
Fig. 12. Components of Expert Opinion
Fig. 13 illustrates the prototype of the expert opinion captured from experts during the analysis of the knowledge displayed on the Dashboard. The expert opinion includes the interpretation of the knowledge, opinions on how to solve particular problems related to the displayed knowledge, and referred publications for the related subject matter. The collection of referred publications is also known as the knowledge bank.
Fig. 13. The Prototype of Expert Opinion in Dashboard
6 Conclusion
In this paper, we addressed problems with existing visualization systems, in which the data plotted is difficult to understand. We propose a holistic framework for a data visualization system which shall improve the visual impact of the system. Designing the system to be multi-layered shall improve the level of visualization and personalization. The framework of the system combines the dashboard and graph-based concepts to ensure that data is visualized in an appropriate view for each group of users. An example of the graph-based dashboard system design and prototype is illustrated for the higher education environment.
References
1. Luan, J.: Data Mining and Its Applications in Higher Education. In: Serban, A., Luan, J. (eds.) Knowledge Management: Building a Competitive Advantage for Higher Education. New Directions for Institutional Research, vol. 113. Jossey Bass, San Francisco (2002)
2. Anand, M., Bharath, B.N., Chaitra, Kiran Kumar, M.S., Vinay, C., Bharatheesh, T.L.: Visual Data Mining Application in Material, http://www.geocities.com/anand_palm/
3. Sarabjot, S.A., Bell, D.A., Hughes, J.G.: The Role of Domain Knowledge in Data Mining. In: Conference on Information and Knowledge Management (CIKM 1995), Proceedings of the fourth international conference on Information and knowledge management, Baltimore, MD, USA, pp. 37-43. ACM Press, New York (1995)
4. Ankerst, M., Elsen, C., Ester, M., Kriegel, H.-P.: Visual Classification: An Interactive Approach to Decision Tree Construction. In: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD 1999, San Diego, CA, USA, pp. 392-396. ACM Press, New York (1999)
5. Pampalk, E., Goebl, W., Widmer, G.: Visualizing Changes in the Structure of Data for Exploratory Feature Selection. In: SIGKDD 2003, Washington, DC, USA (2003)
6. Berchtold, S., Jagadish, H.V., Ross, K.A.: Independence Diagrams: A Technique for Visual Data Mining. In: Proc. KDD 1998, 4th Intl. Conf. on Knowledge Discovery and Data Mining, New York City, pp. 139-143 (1998)
7. Han, J., Cercone, N.: RuleViz: A Model for Visualizing Knowledge Discovery Process. In: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, Boston, MA, USA, pp. 244-253. ACM Press, New York
8. Emanuel, G.N.: Challenges in Graph-Based Relational Data Visualization. In: Proceedings of the 1992 conference of the Centre for Advanced Studies on Collaborative research, CASCON '92, IBM Centre for Advanced Studies Conference, Toronto, Canada, vol. 1, pp. 259-277 (1992)
9. Bondy, J.A., Murty, U.S.R.: Graph Theory with Applications. North Holland, Amsterdam (1976)
10. Tarjan, R.E.: Data Structures and Network Algorithms. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 44. SIAM, Philadelphia (1983)
11. Ullman, J.D.: Principles of Database and Knowledge-base Systems, vol. 1. Computer Science Press, Rockville (1988)
12. Freire, M., Rodríguez, P.: A graph-based interface to complex hypermedia structure visualization. In: Proceedings of the working conference on Advanced visual interfaces, ACM 2004, Gallipoli, Italy, pp. 163-166 (2004)
13. Eades, P., Sugiyama, K.: How to Draw a Directed Graph. Journal of Information Processing 13(4) (1990)
14. Abel, T.: Microsoft Office 2000: Create Dynamic Digital Dashboards Using Office, OLAP, and DHTML. MSDN Magazine (2000), http://msdn.microsoft.com/msdnmag/issues/0700/Dashboard/ (posted on 2006)
15. Marcus, A.: Dashboard in your Future. Interactions (January-February 2006), http://www.denniskennedy.com/blog/2006/06/is_there_a_digital_dashboard_in_your_future_a.html (posted on June 22, 2006)
16. Mohd, W.M.W., Zain, J.M., Embong, A.: MaxD K-means: Auto-generation of Initial Number of Cluster and Centroid based on Distance of Data Points. In: Industrial Conference on Data Mining, ICDM 2007 Symposium Poster (2007)
17. Jonyer, I., Cook, D.J., Holder, L.B.: Graph-Based Hierarchical Conceptual Clustering. Journal of Machine Learning Research 2, 19-43 (2001)
An Efficient Indexing and Compressing Scheme for XML Query Processing I-En Liao , Wen-Chiao Hsu, and Yu-Lin Chen National Chung Hsing University, Computer Science and Engineering, 250 Kuo Kuang Road, Taichung 402, Taiwan [email protected]
Abstract. Due to the wide-spread deployment of business-to-business (B2B) E-commerce, XML has become the standard format for data exchange over the Internet. How to process XML queries efficiently is an important research issue. Various indexing techniques have been proposed in the literature. However, they suffer from some of the following problems in various degrees. First, some indexing methods require huge size for index structures, which could be bigger than the original XML document in some cases. Second, some of them require long index construction time to minimize the size of index structures. Third, some of them can’t support complex queries efficiently. To overcome the aforementioned problems, we propose an indexing method called NCIM (Node Clustering Indexing Method). The experimental results show that NCIM can compress XML documents with high compression rate and low index construction time. It also supports complex queries efficiently. Keywords: XML Indexing, Structural Summary Index, XML Query, Node Clustering.
1 Introduction
XML (Extensible Markup Language) is a self-describing data representation format, and it is also a metalanguage specification for describing other languages. With hundreds of XML-based languages having been developed, XML has become the standard format for data exchange over the Internet. The proliferation of XML documents asks for efficient query processing techniques as a result. On the other hand, the XML representation is not efficient in terms of data storage compared to a database, since tag names are repeated throughout the XML document. Therefore, we can also expect that compressing schemes will be needed in the future when more and more handheld devices, such as PDAs and mobile phones, download many XML documents for query purposes. XPath [1] and XQuery [2] are the most widely used query languages for XML query specifications. These two query languages use path expressions to traverse the XML tree structure. They process XML queries by searching all paths in
the XML tree for matching patterns, and this process may be time-consuming because the matching patterns may be scattered at different locations in the document. Various indexing techniques have been proposed in the literature for improving the performance of query processing. They can be categorized into three main classes [3]: structural summary indices, structural join indices, and sequence-based indices. However, these indexing methods focused mainly on how to speed up the query processing of XML documents. They suffer from some of the following problems in various degrees. First, some indexing methods require a huge index structure, which could be bigger than the original XML document in some cases. Second, some of them require a long index construction time in order to minimize the size of the index structures. Third, some of them cannot support complex queries efficiently. To overcome the aforementioned problems, we propose a novel indexing method called NCIM (Node Clustering Indexing Method), which uses hash-based tables to build indices. The merits of the proposed method are twofold: 1. By clustering nodes with the same tag name, NCIM is not only an indexing method but also a compressing scheme for XML documents. The proposed method has very low index construction time as well as small index size. 2. The index structures of NCIM are stored in hash-based tables which support fast access to data and efficient processing of various queries, such as twig queries, partial matching queries, and content queries with constrained predicates. The rest of this paper is organized as follows. We review the related work in Section 2. In Section 3, we introduce the concept of NCIM and demonstrate how it works to process various queries. The experimental results are reported in Section 4, and the conclusion is given in Section 5.
2 Related Work
In this section, we briefly present some research results in the literature on indexing XML documents.
2.1 Structural Summary Indexing Methods
The basic idea of the structural summary indexing methods is to merge the same sub-structures in an XML document. Two common ways to summarize structure are summary by label paths [4,5,6,7] and summary by bisimulation [8,9,10,11]. DataGuides [6] summarize all label paths in the XML document (Fig. 1). A label path is a sequence of one or more labels from the root to a specific node. Each node in a DataGuide has an extent for the corresponding nodes in the original XML document. The advantage is to reduce the search space for simple regular expressions. However, a DataGuide is not feasible for twig queries, since the summarized structure is not the same as the original XML document.
Fig. 1. (a) The XML data graph and (b) its DataGuide
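As a concrete illustration of the label-path summary idea behind DataGuides [6], the sketch below collects every distinct root-to-node label path of a small XML document together with the set of nodes (the extent) reached by each path. It is a minimal sketch built on Python's standard library, not the DataGuide construction algorithm itself, and the example document is ours.

import xml.etree.ElementTree as ET
from collections import defaultdict

def label_path_summary(xml_text):
    """Map each distinct label path to the list of elements it reaches."""
    root = ET.fromstring(xml_text)
    extents = defaultdict(list)

    def walk(elem, path):
        path = path + "/" + elem.tag
        extents[path].append(elem)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return extents

if __name__ == "__main__":
    doc = "<dblp><article><title>A</title></article><article><title>B</title></article></dblp>"
    for path, nodes in label_path_summary(doc).items():
        print(path, len(nodes))
    # /dblp 1
    # /dblp/article 2
    # /dblp/article/title 2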
Fig. 2. The 1-index and the A(k)-index
The 1-index [11] groups together nodes that have the same set of incoming paths, based on the concept of bisimulation [12]. Two data nodes u and v are bisimilar iff 1. u and v have the same label; 2. if u has a parent u0, then v has a parent v0 such that v0 is bisimilar to u0, and symmetrically for v. The A(k)-index [10] groups nodes based on local similarity to reduce the size of the 1-index. The parameter k guarantees the similarity up to the kth ancestors. Fig. 2 represents the 1-index and the A(k)-index with different values of k. The maximal bisimulation of the A(k)-index, as in Fig. 2(d), is equal to the 1-index. If the length of a query is less than or equal to k, the evaluation result of the A(k)-index is accurate. Otherwise, referencing the XML document is indispensable. Unlike the 1-index and the A(k)-index, which set a fixed local similarity for all equivalence classes, the D(k)-index [8,9] adopts different local similarities according to the query load. In other words, each index node may have a different similarity at a given time. By varying k, the D(k)-index can be flexible and of a smaller size compared to the 1-index and the A(k)-index. However, keeping track of the query load is space-consuming and the adjustment of k is time-consuming. The PCIM (Path Clustering Indexing Method) [13] clusters paths with the same root-to-leaf nodes and reduces the space cost of the index. An XML document is indexed using two hash-based tables, the Structural Index and the Content Index, with tag names as hashing keys for efficient searching (Fig. 3).
Fig. 3. (a) The Structural Index and (b) the Content Index
Each node n is labelled with a quadruple (level, n_s, n_e, p), where "level" represents the depth of n with the root at level 1, "n_s" is the serial number of n derived from a depth-first traversal of the tree (the root node is also assigned 1), "n_e" is the largest serial number of n's descendants, and "p" is the position of the corresponding content in the Content Index. The PCIM reduces the index space with a high compression ratio and efficiently processes complex queries.
2.2 Structural Join Indexing Methods
Most of the structural join indices rely on encoding the XML document to capture the position and the structural relationship between document elements. The codes of elements are clustered by tag name. A given query is broken down into individual simple paths. The query processor determines the elements that match sub-query tree nodes and then joins the obtained sets together with a merge-join algorithm [3]. This approach produces a huge amount of intermediate results that cause unnecessarily high processing cost. Various techniques have been proposed to speed up merge-join processing. Chien et al. [14] proposed a structural join algorithm, based on the B+-tree, that improves the performance by inserting sibling pointers in the indexes, which allows descendants to be skipped if they do not match the required structural relationship. However, it is not effective in skipping ancestors. The XR-tree [15] extends the B+-tree and indexes element nodes on their region codes in pairs (start, end). The XR-tree effectively skips not only descendants but also ancestors. A problem of this approach is that it is inefficient in handling recursive ancestor elements. When a node happens to be the ancestor of two or more other nodes, it will be searched for and retrieved as many times as the number of its descendant nodes [16]. Catania et al. [17] proposed a new lazy approach to handle XML updates and structural join processing in an efficient way. This approach uses segments, labelled by the global position, local position and length, as the unit of updates. The traditional structural join algorithm is improved by the segment-based extended algorithm, which improves query performance. One shortcoming is that the update log must be maintained in order to manage the segments. The XCut [18] is a comprehensive holistic twig matching algorithm to expedite XML query processing. XCut takes advantage of double-ended queues to queue
Fig. 4. An example for TwigStack
two inner entries of index trees accessed by level-order traversal and an XB+-tree for fast evaluation of a twig, by calling the routine recursively from the twig root in a way similar to XRTwig. This approach permits early filtration at the inner level of index trees when processing a twig query. Joint with the leaf entries having full specifications of query elements, parent-child edges and selection predicates can be determined quickly at query tree nodes. The TwigStack [19], a stack-based algorithm, uses a chain of linked stacks to compactly represent partial results of root-to-leaf query paths, which are then composed to obtain matches for the twig pattern. However, before a new element comes, TwigStack has to check the element index to make sure all branches are satisfied. It also encounters obstacles when following-sibling relationships exist in the query. An example of TwigStack for a twig query is shown in Fig. 4. The TRACK [20], similar to the stacks in TwigStack, builds tree structures for leaf nodes and then traces leaf-to-root paths back to the root. The algorithm can avoid spending too much time on checking the element index and makes sure all the branches are satisfied before a new element comes. TRACK uses tree structures instead of the chain of linked stacks to encode final tree matches, which avoids the merging process and improves query processing.
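To make the join primitive that these methods build on more tangible, the sketch below pairs up region-coded element lists: given ancestors and descendants as (start, end) intervals from a depth-first numbering, it reports every ancestor-descendant pair. It shows only the basic structural join idea, not TwigStack, XCut or TRACK themselves, and the interval convention used is an assumption of this sketch.

def structural_join(ancestors, descendants):
    """Pair up region-coded elements: (a, d) such that a is an ancestor of d.

    Each element is a (start, end) interval from a depth-first numbering,
    so a = (s1, e1) is an ancestor of d = (s2, e2) iff s1 < s2 and e2 <= e1.
    Both lists are assumed to be sorted by start.
    """
    result = []
    active = []                      # candidate ancestors, pruned as we go
    i = 0
    for s2, e2 in descendants:
        # Activate ancestors that start before this descendant.
        while i < len(ancestors) and ancestors[i][0] < s2:
            active.append(ancestors[i])
            i += 1
        # Drop ancestors whose region closed before this descendant starts;
        # they cannot contain this or any later descendant.
        active = [(s1, e1) for (s1, e1) in active if e1 >= s2]
        # Every remaining ancestor with a large enough end contains d.
        result.extend(((s1, e1), (s2, e2)) for (s1, e1) in active if e2 <= e1)
    return result

if __name__ == "__main__":
    # e.g. //book (ancestors) joined with //title (descendants).
    books = [(2, 6), (7, 11)]
    titles = [(3, 4), (8, 9), (12, 13)]
    print(structural_join(books, titles))
    # [((2, 6), (3, 4)), ((7, 11), (8, 9))]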
2.3 Sequence-Based Indexing Methods
Sequence-based indexing methods aim at avoiding expensive join operations in query processing [21]. They represent XML documents and queries as sequences and sub-sequences so that a query can be answered through subsequence matching. Many algorithms proposed in the literature start with sequence representations of tree structures. For example, ViST [22] labels nodes in pre-order traversals. PRIX [23] transforms XML documents and twig queries into Prüfer sequences that are mapped to a virtual trie tree. A problem with ViST is that the index has quadratic size in the worst case if the trees indexed are very deep [24]. PRIX uses Prüfer sequences to solve this problem. Wang and Meng [21] used a more clever sequencing which is similar to ViST. The examples presented in Fig. 5 and Table 1 show the different labelling schemes among these approaches. Since structural information has been included in the sequences, extra join operations are not necessary. Besides, twig queries and partial matching queries are transformed to sequences for matching, and they can be performed efficiently. One challenge of the sequence-based methods is how to avoid false alarms and false dismissals. Table 1 illustrates such cases.
Fig. 5. An XML document and query examples
Table 1. False alarm and false dismissal in different methods
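As a small illustration of the sequencing idea, the sketch below computes the Prüfer sequence of a labelled tree, the encoding that PRIX [23] starts from (PRIX additionally maps such sequences into a virtual trie, which is not shown here). The node numbering and the example tree are ours.

import heapq

def prufer_sequence(n, edges):
    """Prüfer sequence of a labelled tree with nodes 1..n and n-1 edges.

    Repeatedly remove the smallest-numbered leaf and record its neighbour,
    until only two nodes remain; the recorded neighbours form the sequence.
    """
    adj = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    leaves = [v for v in adj if len(adj[v]) == 1]
    heapq.heapify(leaves)
    seq = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        neighbour = next(iter(adj[leaf]))
        seq.append(neighbour)
        adj[neighbour].discard(leaf)
        adj[leaf].clear()
        if len(adj[neighbour]) == 1:
            heapq.heappush(leaves, neighbour)
    return seq

if __name__ == "__main__":
    # The tree with edges 1-2, 1-3, 3-4, 3-5 has Prüfer sequence [1, 3, 3].
    print(prufer_sequence(5, [(1, 2), (1, 3), (3, 4), (3, 5)]))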
3 Overview of NCIM
In this section, we introduce the methodology of the proposed NCIM (Node Clustering Indexing Method) and demonstrate how it is used to process various queries efficiently.
3.1 The Methodology of NCIM
The construction of NCIM can be divided into two phases: document labelling and index construction. Practically, these two phases can be done in parallel by traversing the XML document once. We explain these steps using the XML data tree shown in Fig. 6, where ovals represent elements with their tag names inside, and rectangles represent plain texts. We treat attributes as elements and add the character "@" in front of the attribute name. The document labelling scheme of NCIM applies only to the element nodes (the text nodes are not included) of an XML data tree. An element node n of an XML data tree is labelled with a 3-tuple (level, n_s, n_e) for non-leaf nodes and a 2-tuple (level, n_s) for leaf nodes, where "level" is the depth of n, "n_s" (start number) is the serial number of n derived from a depth-first traversal of the data tree, and "n_e" (end number) is the serial number after visiting all child nodes of n. Leaf nodes omit n_e to save space. An attribute node has the same values of level and n_s as its parent, for identifying to whom it belongs, and it does not have n_e for the same reason as leaf nodes.
Fig. 6. An XML data tree
Fig. 7. A labelled XML data tree
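A minimal sketch of the labelling pass described above: one depth-first traversal that assigns (level, n_s, n_e) to non-leaf elements and (level, n_s) to leaves, with attributes emitted as "@"-prefixed nodes sharing the level and n_s of their owning element. The example document is only an illustration; the traversal follows the description in the text. With these labels, y is an ancestor of x exactly when y_s < x_s <= y_e, which is the containment test used below.

import xml.etree.ElementTree as ET

def label_tree(xml_text):
    """One depth-first pass that assigns NCIM-style labels.

    Non-leaf elements receive (level, n_s, n_e), leaf elements (level, n_s),
    and each attribute is emitted as an '@'-prefixed node sharing the level
    and n_s of the element that owns it. Returns {node_name: [label, ...]}.
    """
    labels = {}
    counter = 0

    def emit(name, label):
        labels.setdefault(name, []).append(label)

    def visit(elem, level):
        nonlocal counter
        counter += 1
        n_s = counter
        for attr in elem.attrib:
            emit("@" + attr, (level, n_s))
        children = list(elem)
        if not children:
            emit(elem.tag, (level, n_s))
            return
        for child in children:
            visit(child, level + 1)
        emit(elem.tag, (level, n_s, counter))   # counter now equals n_e

    visit(ET.fromstring(xml_text), 1)
    return labels

if __name__ == "__main__":
    doc = ('<dblp><mastersthesis year="2002">'
           '<author>Frank</author><title>T1</title></mastersthesis></dblp>')
    print(label_tree(doc))
    # {'@year': [(2, 2)], 'author': [(3, 3)], 'title': [(3, 4)],
    #  'mastersthesis': [(2, 2, 4)], 'dblp': [(1, 1, 4)]}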
The benefit of this labelling scheme is the easy determination of the structural relationships between nodes. Given two nodes x and y, if x_s is in (y_s, y_e], we know that x is a descendant of y. Moreover, if the difference in levels between x and y is equal to 1, then x and y have a parent-child relationship. On the other hand, when the difference is greater than 1, we know that x and y have an ancestor-descendant relationship. Fig. 7 shows the labelling results on the sample XML data tree. In the index construction phase, the NCIM contains four hash-based tables: the non-leaf node index, the leaf node index, the level index of non-leaf nodes, and the level index of leaf nodes. The non-leaf nodes labelled by (level, n_s, n_e) are clustered using the pair (tag name, level) and stored in the non-leaf node index. That means (tag name, level) is used as the hashing key, and each hash entry points to a linked list which holds n_s and n_e of the nodes with the same tag name and level. The structure of the leaf node index is similar to that of the non-leaf node index. However, since the leaf nodes are labelled by (level, n_s), the place in the linked list node that would hold n_e is used to store the text content. Fig. 8 illustrates the non-leaf node and leaf node indices, respectively, for Fig. 7. The advantage of using hash tables is to gain fast access to the needed data. Moreover, clustering the nodes with the same tag name and level reduces the index space and accelerates the processing of queries.
Fig. 8. (a) Non-leaf node index and (b) Leaf node index
Fig. 9. Level indices of (a) non-leaf nodes and (b) leaf nodes
A problem arises when a query contains the ancestor-descendant axis ("//"). It is hard to determine the level of the tag after "//". To solve this problem, we create the other two hash tables, namely, the level index of non-leaf nodes and the level index of leaf nodes, as shown in Fig. 9. The tag name is used as the hash key, and the levels with the same tag name are clustered together in a linked list. Most indexing methods store the indices in memory for fast access. In order to reduce the required space, we use an integer labelling scheme to represent structure information where possible. Besides, we cluster nodes with the same tag name and level to compress repetitive sub-structures in the XML documents.
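Continuing the labelling sketch above, the four hash-based tables can be populated directly from such labels: the non-leaf and leaf node indexes keyed by (tag, level), and the two level indexes keyed by tag. This is an illustrative reconstruction of the structures of Figs. 8 and 9, not the authors' implementation, and the input format is the one assumed in the sketch.

from collections import defaultdict

def build_indexes(labelled_nodes):
    """Build the four NCIM hash tables from labelled nodes.

    labelled_nodes: iterable of (tag, label[, text]) where a 3-tuple label
    (level, n_s, n_e) marks a non-leaf node and a 2-tuple label (level, n_s)
    marks a leaf (or attribute) node carrying optional text content.
    """
    non_leaf = defaultdict(list)        # (tag, level) -> [(n_s, n_e), ...]
    leaf = defaultdict(list)            # (tag, level) -> [(n_s, text), ...]
    non_leaf_levels = defaultdict(set)  # tag -> {level, ...}
    leaf_levels = defaultdict(set)      # tag -> {level, ...}

    for tag, label, *rest in labelled_nodes:
        text = rest[0] if rest else None
        if len(label) == 3:
            level, n_s, n_e = label
            non_leaf[(tag, level)].append((n_s, n_e))
            non_leaf_levels[tag].add(level)
        else:
            level, n_s = label
            leaf[(tag, level)].append((n_s, text))
            leaf_levels[tag].add(level)
    return non_leaf, leaf, non_leaf_levels, leaf_levels

if __name__ == "__main__":
    nodes = [
        ("dblp", (1, 1, 4)),
        ("mastersthesis", (2, 2, 4)),
        ("author", (3, 3), "Frank"),
        ("title", (3, 4), "T1"),
    ]
    non_leaf, leaf, nl_levels, l_levels = build_indexes(nodes)
    print(dict(non_leaf))   # {('dblp', 1): [(1, 4)], ('mastersthesis', 2): [(2, 4)]}
    print(dict(leaf))       # {('author', 3): [(3, 'Frank')], ('title', 3): [(4, 'T1')]}
    print(dict(l_levels))   # {'author': {3}, 'title': {3}}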
3.2 Query Processing in NCIM
A query expressed as an XPath expression can be represented as a query tree. There are several basic types of query. The single-path query is a simple query starting from the root, e.g., /dblp/article. The partial matching (or recursive) query contains "//", e.g., //article/author. The twig query contains "[ ]", e.g., /dblp/article[year]/title. The content query with constrained predicates is a query with predicates constraining the retrieved text, e.g., //article/year="1997". Any query is basically composed of one or more basic types of query. The pseudo-code of the query evaluation algorithm is presented in Fig. 10. Given a path expression Q, it is evaluated by calling a recursive function PathMatch (line 6). PathMatch retrieves the first path relation symbol, "/", "//" or "[", and the following tag name segment, the substring between the first and the second symbols in subQ (lines 11-12). If the symbol is "[", it encounters a twig structure and recursively calls the PathMatch function (lines 13-17). If the symbol is "/", the level of the node is set to pLevel + 1 (line 19). If the symbol is "//", use the tag name as a search key, get the levels from the level indices, and
 1  Input: A path expression, Q
 2  Output: A set of subtrees in Document that match Q, M
 3  Function QueryEvaluation (Q)
 4  { pNode = null;   //Parent node
 5    pLevel = 0;     //Level of the parent node
 6    Call PathMatch( pNode, pLevel, Q );
 7    Output the matched results M;
 8  }
 9  Function PathMatch (pNode, pLevel, subQ)
10  { if subQ is null, return;
11    R = the first symbol in subQ;
12    nNode = the first tag name segment in subQ;
13    if( R = "[" ) {
14      TwigQ = the substring between "[" and "]" in subQ;
15      Call PathMatch( pNode, pLevel, TwigQ );
16      subQ = the substring of subQ after TwigQ;
17      Call PathMatch( nNode, subQ ); }
18    else {
19      if( R = "/" ) nLevel is set to pLevel + 1;
20      else Get correct nNode's levels from the Level indices;
21      Call CheckMatch( pNode, R, nNode, nLevel );
22      subQ = the substring of subQ after the nNode segment;
23      Call PathMatch( nNode, nLevel, subQ ); }
24  }
25  Function CheckMatch (pNode, R, nNode, nLevel)
26  { if( nNode contains "=" )
27      Get the corresponding linked lists of nNode from the Leaf node index;
28    else
29      Get the corresponding linked lists of nNode from the
30      Non-leaf and Leaf node indices;
31    if( M is empty ) Add the nNode's labels to M;
32    else {
33      Compare the pNode's lists in M with nNode's lists;
34      if( matched pNodes are found for an nNode )
35        Add the nNode's labels to M;
36      if( no matched nNode for a pNode )
37        Delete the pNode's labels from M; }
38  }
Fig. 10. Query Evaluation Algorithm
filter out the levels that are less than pLevel+1 (line 20). Function CheckMatch is called (line 21) to retrieve the corresponding linked lists in the leaf or non-leaf node indices by using the tag name in nNode and the value of nLevel (lines 26-30). Finally, compare the start and end numbers in the linked lists and check whether they are suitable for the query (lines 31-37). The results, stored in M, are produced at the end of procedure QueryEvaluation (line 7). We demonstrate a complex query, //Mastersthesis[author="Frank"]/title, to show how NCIM processes the query. The segment "//Mastersthesis" is first retrieved. Because it starts with "//", we get "2" as the level of "Mastersthesis" from the level index of non-leaf nodes and then find the location of (Mastersthesis, 2) in the non-leaf node index. There are two nodes in the linked list. We keep them as candidates and proceed to the next step. Since we encounter a twig structure, [author="Frank"], we keep the parent node and deal with the twig structure containing "author". Because "author" contains a content predicate, i.e., author="Frank", only the level index of leaf nodes has to be checked. Then we search the leaf node index and get the positions of "author" using (author, 3) as the search key. There is only one node of "author" selected. Comparing the start numbers and the end numbers of "Mastersthesis" and "author", a matched result is produced, i.e., the second node of (Mastersthesis, 2) and the third node of (author, 3). The remaining sub-query, /title, is processed in the same way. The final matched structure is the second node of (Mastersthesis, 2), the third node of (author, 3), and the third node of (title, 3).
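The containment checks of this walk-through can be reproduced with a few dictionary lookups over tables of the kind sketched earlier. The fragment below hard-codes a tiny, hypothetical index and evaluates //Mastersthesis[author="Frank"]/title in the spirit of the algorithm of Fig. 10; it is a simplified illustration, not the full recursive PathMatch/CheckMatch procedure, and the label values are invented for the example.

# A tiny hand-built index fragment (hypothetical labels).
NON_LEAF = {("Mastersthesis", 2): [(5, 9), (14, 18)]}    # (n_s, n_e) pairs
LEAF = {
    ("author", 3): [(6, "Smith"), (15, "Frank")],        # (n_s, text)
    ("title", 3): [(7, "Thesis A"), (16, "Thesis B")],
}
NON_LEAF_LEVELS = {"Mastersthesis": [2]}

def contains(parent, n_s):
    """True if a node with start number n_s lies inside parent = (p_s, p_e)."""
    p_s, p_e = parent
    return p_s < n_s <= p_e

def evaluate_demo_query():
    """Evaluate //Mastersthesis[author="Frank"]/title against the fragment."""
    results = []
    for level in NON_LEAF_LEVELS["Mastersthesis"]:
        for parent in NON_LEAF[("Mastersthesis", level)]:
            # Predicate: an author child at level+1 whose text is "Frank".
            has_author = any(
                contains(parent, n_s) and text == "Frank"
                for n_s, text in LEAF.get(("author", level + 1), [])
            )
            if not has_author:
                continue
            # Result: the title nodes under the same parent.
            results.extend(
                text for n_s, text in LEAF.get(("title", level + 1), [])
                if contains(parent, n_s)
            )
    return results

if __name__ == "__main__":
    print(evaluate_demo_query())   # ['Thesis B']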
4 Experimental Results
In this section, we report the experimental results of the proposed indexing technique. We compare the performance of NCIM with the PCIM [13], Wang/Meng's method [21], and XQEngine [25] on three datasets: DBLP, Swiss-Prot, and XMark. All algorithms are implemented in Java, and the experiments were performed on a Windows XP system with a 1.86GHz Intel Core2 CPU, 2GB RAM, and 1132MB page size.
4.1 Datasets
We chose the DBLP [26], the XMark [27], and the Swiss-Prot [28] as the datasets for the experiments because they have different characteristics and are widely used in benchmarking XML indexing methods. The DBLP has many repetitive structures compared to the other two datasets. The XMark has the largest maximum depth among the three datasets. The total number of nodes in Swiss-Prot is the largest among the three datasets. The statistical data of the datasets are shown in Table 2, where the size of contents is equal to the size of strings in the text nodes, and the size of structure is defined to be the total size minus the size of contents.
4.2 Performance Analysis
We examined the performance with respect to the index construction time, required index size, and query processing time.
Table 2. The statistical data of different datasets

Datasets     Sizes (MB)                     Number of Nodes             Max Depth
             Structure  Content  Total      Elements     Attributes
DBLP             56        71     127       3,332,130      404,276          6
SwissProt        67        42     109       2,977,031    2,189,859          5
XMark            42        69     111       1,666,315      381,878         12
Fig. 11. The comparison of index construction time
The index construction times of the different methods are compared and shown in Fig. 11. Usually, the structural summary indexing methods spend more time than the other classes of indexing methods because summarizing structure is time-consuming. That is the reason that PCIM requires more time than the other three methods. Clearly, NCIM is very efficient in index building time. It is important to note that XQEngine is the only one among the four methods that builds an index only on the structural part of XML. As we know, the contents (plain texts) of XML are not compressed in the other three methods. We filter out the plain texts from each indexing method in order to measure the compression rate more accurately. Fig. 12 shows the size of the original XML structure vs. the required index size without text contents for the different methods.
Fig. 12. Size of XML structure vs. index sizes of different methods
The Wang/Meng’s method, a sequence-based indexing method, records the ancestor’s tag name from root for each node and requires larger index size. The XQEngine, an open source embedded Java component, requires the largest index space. The index size of the PCIM is the smallest one among four methods because it is a summarized structure. There is a modest increase in the index size of NCIM comparing to PCIM in each dataset. This is the trade-off between time and space. Obviously, it is worth because NCIM saves a lot of time in building index than PCIM. Using the compression ratio defined in Equation (1), the PCIM and the NCIM can compress an XML document up to 88% and 74%, respectively, that are much better than Wang/Meng’s method (7% at most) and XQEngine (-9% at most). The compression ratio charts are shown in Fig. 13. Comp.Ratio =
uncomp.size − comp.size × 100% uncomp.size
(1)
Fig. 13. Compression ratio of different methods
We compared the query processing time using the 20 queries listed in Table 3 on the different datasets. Each query is composed of one or more basic types of query. Fig. 14 shows the comparison of query processing time among Wang/Meng's method, XQEngine, PCIM, and NCIM. Both PCIM and NCIM store indices in hash tables and are more efficient than the other two methods. In most of the cases, NCIM outperforms PCIM for several reasons. First, PCIM stores text contents in other tables, whereas NCIM stores them in the leaf node index under the corresponding tag name, which reduces search time when processing queries with selection predicates. Second, integer comparisons are much faster than string comparisons. The PCIM uses strings to represent labels, but NCIM uses integers where possible. Third, in PCIM, the contents with the same path are combined and separated by two full-width characters, " | " and " ; ". It takes time to decompose them during query processing.
Table 3. The XPath queries used in the experiments

Query No.  Dataset    XPath Expression                                                   Number of Matching Patterns
Q1         DBLP       //article[@rating="SUPERB"]/title                                  27
Q2         DBLP       //article[@reviewid="RosenthalDataAdministration gpezda"]/title    1
Q3         DBLP       //inproceedings/author="Stephan Busemann"                          7
Q4         DBLP       //inproceedings/booktitle                                          212,273
Q6         DBLP       /dblp/mastersthesis[author][school]                                5
Q7         Swissprot  /root/Entry[@id="PRT ANTGR"]//Comment                              1
Q8         Swissprot  //Entry/Org="Homo"                                                 4,326
Q9         Swissprot  //Ref/Author="Chang Y.S"                                           25
Q10        Swissprot  //Entry/Ref//Cite                                                  92,546
Q11        Swissprot  /root/Entry/Ref/MedlineID                                          76,574
Q12        Swissprot  //Entry[AC][Species]                                               108,586
Q13        XMark      //item[@id="item8"]/location                                       1
Q14        XMark      //item[@id="item8108"]/location                                    1
Q15        XMark      //item[@id="item8"]/description//bold                              2
Q16        XMark      //person//city="Windhoek"                                          52
Q17        XMark      //open_auction/initial="17.72"                                     4
Q18        XMark      //item//mailbox//from                                              20,946
Q19        XMark      /site/regions/africa/item/location                                 550
Q20        XMark      //person[name][phone]                                              25,358
Fig. 14. The comparison of query processing time
5 Conclusions and Future Work
In this paper, we proposed a novel indexing method, called the NCIM (Node Clustering Indexing Method). The NCIM clusters the nodes with the same tag names
and stores them in hash-based tables. The experimental results show that the NCIM can compress XML documents effectively with an average compression ratio of 66.6%. The index construction time of NCIM is below 30 seconds on the three tested datasets, and it also supports complex queries efficiently. A limitation of the NCIM is that we assume the indices can fit into the main memory. Although the NCIM can compress XML documents effectively, it may not be suitable when the index size of an XML document exceeds the size of the main memory. There is some recent work [29,30] discussing storing the indices in secondary memory. Future work will be on solving that problem as well as supporting update operations.
References 1. Clark, J., DeRose, S.: XML path language (XPath) version 1.0. W3C Recommendation (1999), http://www.w3.org/TR/xpath 2. Boag, S., Chamberlin, D., Fernandez, M., Florescu, D., Robie, J., Simeon, J., Stefanescu, M.: XQuery 1.0: An XML query language. Working Draft (2001), http://www.w3.org/TR/2001/WD-xquery-20011220 3. Catania, B., Maddalena, A., Vakali, A.: XML document indexes: A classification. IEEE Internet Computing 9(5), 64–71 (2005) 4. Chung, C., Min, J., Shim, K.: APEX: an adaptive path index for XML data. In: 2002 ACM SIGMOD International Conference on Management of Data (SIGMOD ’02), pp. 121–132. ACM Press, New York (2002) 5. Cooper, B., Sample, N., Franklin, M.J., Hjaltason, G.R., Shadmon, M.: A fast index for semistructured data. In: 27th International Conference on Very Large Data Bases, pp. 341–350 (2001) 6. Goldman, R., Widom, J.: DataGuides: enabling query formulation and optimization in semistructured databases. In: 23rd International Conference on Very Large Data Bases, pp. 436–445 (1997) 7. Zhang, B., Geng, Z., Zhou, A.: SIMP: efficient XML structural index for multiple query processing. In: 9th International Conference on Web-Age information Management (WAIM), pp. 113–118. IEEE Computer Society, Washington (2008) 8. Chen, Q., Lim, A., Ong, K.W.: D(k)-index: an adaptive structural summary for graph-structured data. In: 2003 ACM SIGMOD international Conference on Management of Data (SIGMOD ’03), pp. 134–144. ACM Press, New York (2003) 9. Chen, Q., Lim, A., Ong, K.W.: Enabling structural summaries for efficient update and workload adaptation. Data Knowl. Eng. 64(3), 558–579 (2008) 10. Kaushik, R., Shenoy, D., Bohannon, P., Gudes, E.: Exploiting local similarity for indexing paths in graph-structured data. In: 18th International Conference on Data Engineering (ICDE), pp. 129–140. IEEE Computer Society, Washington (2002) 11. Milo, T., Suciu, D.: Index structures for path expressions. In: Beeri, C., Bruneman, P. (eds.) ICDT 1999. LNCS, vol. 1540, pp. 277–295. Springer, Heidelberg (1998) 12. Haw, S., Lee, C.: Evolution of structural path indexing techniques in XML databases: A survey and open discussion. Advanced Communication Technology 3, 2054–2059 (2008) 13. Hsu, W.-C., Liao, I.-E., Wu, S.-Y., Kao, K.-F.: An efficient XML indexing method based on path clustering. In: 20th IASTED International Conference on Modelling and Simulation (MS ’09), pp. 339–344 (2009)
14. Chien, S., Vagena, Z., Zhang, D., Tsotras, V.J., Zaniolo, C.: Efficient structural joins on indexed XML documents. In: 28th International Conference on Very Large Data Bases. VLDB Endowment, pp. 263–274 (2002) 15. Jiang, H., Lu, H., Wang, W., Ooi, B.C.: XR-Tree: indexing XML data for efficient structural joins. In: 19th International Conference on Data Engineering (ICDE), pp. 253–264. IEEE Computer Society, Washington (2003) 16. Moro, M.M., Vagena, Z., Tsotras, V.J.: Tree-pattern queries on a lightweight XML processor. In: 31st International Conference on Very Large Data Bases. VLDB Endowment, pp. 205–216 (2005) 17. Catania, B., Ooi, B.C., Wang, W., Wang, X.: Lazy XML updates: laziness as a virtue, of update and structural join efficiency. In: 2005 ACM SIGMOD international Conference on Management of Data (SIGMOD ’05), pp. 515–526. ACM Press, New York (2005) 18. Sheu, S., Wu, N.: XCut: Indexing XML data for efficient twig evaluation. In: 22nd International Conference on Data Engineering (ICDE), p. 127. IEEE Computer Society, Washington (2006) 19. Bruno, N., Koudas, N., Srivastava, D.: Holistic twig joins: optimal XML pattern matching. In: 2002 ACM SIGMOD international Conference on Management of Data (SIGMOD ’02), pp. 310–321. ACM Press, New York (2002) 20. Li, D., Li, C.: Track: a novel XML join algorithm for efficient processing twig queries. In: 19th Conference on Australasian Database, pp. 137–143 (2007) 21. Wang, H., Meng, X.: On the Sequencing of Tree Structures for XML Indexing. In: 21st International Conference on Data Engineering (ICDE), pp. 372–383. IEEE Computer Society, Washington (2005) 22. Wang, H., Park, S., Fan, W., Yu, P.S.: ViST: a dynamic index method for querying XML data by tree structures. In: 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD ’03), pp. 110–121. ACM Press, New York (2003) 23. Rao, P., Moon, B.: PRIX: Indexing and querying XML using Prufer sequences. In: 20th International Conference on Data Engineering (ICDE), p. 288. IEEE Computer Society, Washington (2004) 24. Grimsmo, N.: Faster path indexes for search in XML data. In: 19th Conference on Australasian Database, pp. 127–135 (2007) 25. Katz, H.: XQEngine: XML query engine (2005), http://xqengine.sourceforge.net/ 26. Ley, M.: DBLP database web site (2008), http://www.informatik.uni-trier.de/~ ley/db/ 27. XMARK, An XML benchmark project (2003), http://monetdb.cwi.nl/xml/ 28. UniProt Consortium, UniProtKB/Swiss-Prot, http://www.expasy.ch/sprot/sprot_details.html 29. Chen, Z., Gehrke, J., Korn, F., Koudas, N., Shanmugasundaram, J., Srivastava, D.: Index structures for matching XML twigs using relational query processors. Data Knowl. Eng. 60(2), 283–302 (2007) 30. Wu, X., Liu, G.: XML twig pattern matching using version tree. Data Knowl. Eng. 64(3), 580–599 (2008)
Development of a New Compression Scheme Eyas El-Qawasmeh, Ahmed Mansour, and Mohammad Al-Towiq Jordan University of Science and Technology, Jordan [email protected], [email protected], [email protected]
Abstract. Huffman coding is a simple lossless data compression technique that tries to take advantage of entropy by using a variable-length encoding to build a code table for encoding the symbols of a source file. In this paper, we have revisited Huffman coding and enhanced it by considering the second-order form of characters rather than the first-order form of characters. Results showed that using the second-order form improves the compression ratio by around 8% over the existing Huffman coding. Keywords: Compression, Decompression, Huffman Coding.
1 Introduction
Data compression aims to remove redundant data in order to reduce the size of a data file [Mathews, 1995] [Ziviani et al., 2000]. For example, an ASCII file is compressed into a new file, which contains the same information, but with a smaller size. The compression of a file into half of its original size increases the free memory that is available for use [Mandal, 2000]. The same idea applies to the transmission of messages through networks with limited bandwidth channels. Data compression has a wide range of applications, especially in data storage and data transmission [Ziviani et al., 2000] [Brisaboa et al., 2003]. Some of these applications include: 1) voice compression applications such as "satellite connectivity, international voice trunking, Wireless Local Loop and rural telephony" [Cutter Network site, http://www.bestdatasource.com/RAD/VMUX2100.htm, visited June 2010], 2) video compression, and 3) image compression [Servetto, 1999]. Other applications include virus-scanning software, archiving systems, and real-time interactive control of large-scale simulations at remote supercomputer sites [University of South Carolina (USC), http://ip.research.sc.edu/catalog/97154/sw.htm, visited June 2006]. Currently, communications through networks have resulted in a large amount of data being transferred daily, especially multimedia data, which slows the network due to the large sizes of its files. Therefore, the use of efficient compression techniques will reduce the time of data transmission and the cost of communications [Adler and Mitzenmacher, 2001] [Wu et al., 2001]. Compression increases the usability of the World Wide Web, since any improvement in compression increases the amount of data that can be transmitted over the Internet.
Currently, there are many lossless compression techniques and algorithms. In lossless techniques, the restored file is identical to the original file. These techniques can be classified into three main categories: substitution, statistical, and dictionary-based data compression [El-Qawasmeh et al., 2007]. The substitution category replaces a certain repetition of characters by a smaller one. Run Length Encoding (RLE) is one of these techniques, and it takes advantage of repetitive characters. The second category involves the generation of the shortest average code length based on an estimated probability of characters [Vitter, 1989]. An example of this category is Huffman coding [Huffman, 1952]. In Huffman coding, the most common symbols in the file are assigned the shortest binary codes, and the least common are assigned the longest codes. The last category is dictionary-based data compression schemes such as Lempel-Ziv-Welch (LZW). This category involves the substitution of sub-strings of text by indices or pointer codes, relative to a dictionary of the substrings [Mandal, 2000] [Nelson, 1989]. The proposed algorithm uses the ideas of Huffman coding. However, instead of considering the frequency of each letter alone in the text, it considers pairs of letters. This is called the second-order form of letters, where each pair of adjacent letters is considered (we call it a couple of letters). In other words, the proposed method considers the frequency of each two adjacent letters and applies Huffman coding to these couples. This allows some saving, since many couples (pairs) of letters do not exist in many languages. The conducted experiments verified our algorithm. The organization of this paper is as follows: Section 2 is the current related work. Section 3 is the suggested approach. Section 4 is the performance results. Section 5 is a discussion, and Section 6 is the conclusion.
2 Current Related Work
Huffman coding is one of the best entropy encoding algorithms used as a lossless data compression technique. The main idea in Huffman coding is to use a variable-length encoding to build a code table for encoding the symbols of a source file, such as the characters in a file. The Huffman coding algorithm takes an input file and the frequency of each letter, and then it applies the greedy method to it. It combines the two lowest frequencies to get a new node in a bottom-up fashion until the result is a tree with one node, marked as the root, with a frequency equal to one. In case the frequencies of the letters are not available, Huffman coding scans the file to be compressed and finds the frequency of each letter. The proposed method, which will be described in detail in the following section, reads the whole file and considers the frequency of each pair of adjacent letters. As a result, we gain some saving. The reason for the saving is that many adjacent letter pairs will not occur, and this reduces the number of bits that we need to represent these pairs. More details are listed in the following section.
3 Proposed Approach The suggested method uses the algorithm of Huffman. However, it introduces an enhancement to it so that we can consider every two letters. It starts by reading the
whole file, and while reading it, it builds a table that contains every unique couple of symbols and its corresponding frequency with respect to the whole file. As an example, let us assume that we have a file that contains only the following text: "Hello Holland". Then our proposed method will treat the couples "He", "ll", "o ", "Ho", "ll", "an" and "d.". It will build Table 1.
Table 1. Frequency table for a file that contains "Hello Holland"
Couple of letters    "He"    "ll"    "o "    "Ho"    "an"    "d."
Frequency             1/7     2/7     1/7     1/7     1/7     1/7
3.1 Compression Process
The algorithm of the compression process is listed below. It requires a setup step that builds the frequency table for the couples of letters. The number of entries in this table equals the number of distinct couples of letters, and all of them are assigned a frequency of zero at the beginning. For example, two adjacent letters such as "ab" are treated as one couple and are initially assigned a frequency of zero.

Improved_Huffman_Compression (Input: Any file)
Begin
- Scan the file and, for each couple of letters, increment the frequency value in the frequency table.
- Generate the Huffman tree from the frequency table.
- Generate the associated code table and find the associated code for each unique couple of letters.
- Replace each couple of letters with its associated code.
End

We improve the traditional Huffman code by taking each two characters as one symbol (a couple of letters) and performing the remaining steps as in traditional Huffman coding. The decompression process is identical to the decompression done by Huffman coding, except that in the proposed scheme each couple of letters represents one symbol. The details are listed below.

3.2 Decompression Process
On the decompression side, the algorithm works exactly as Huffman decoding.

Improved_Huffman_Decompression (Inputs: Compressed file, Huffman tree)
Begin
- Create an empty file to write the decompressed data.
- Scan the compressed file and exchange each associated code with its couple of letters.
- Store the couple of letters in the decompressed file.
End
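A compact way to see the whole scheme end to end is the sketch below: it splits the input into two-character couples, builds a standard Huffman code over the couple frequencies, and then encodes and decodes with it. It is an illustrative re-implementation of the idea, not the authors' code; padding an odd-length input with a blank is one possible convention and is an assumption of this sketch.

import heapq
from collections import Counter

def build_code(freqs):
    """Standard Huffman code construction over arbitrary symbols (here: couples)."""
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate single-symbol case
        return {sym: "0" for sym in freqs}
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def couples(text, pad=" "):
    """Split text into two-character couples, padding an odd-length input."""
    if len(text) % 2:
        text += pad                          # assumed padding convention
    return [text[i:i + 2] for i in range(0, len(text), 2)]

def compress(text):
    syms = couples(text)
    code = build_code(Counter(syms))
    return "".join(code[s] for s in syms), code

def decompress(bits, code):
    decode = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in decode:                    # prefix-free, so greedy match works
            out.append(decode[buf])
            buf = ""
    return "".join(out)

if __name__ == "__main__":
    # "Hello Holland." yields exactly the couples of Table 1.
    bits, code = compress("Hello Holland.")
    print(bits)
    print(decompress(bits, code))            # Hello Holland.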
4 Performance Results
In order to evaluate the performance of the suggested method, we carried out many experiments. All these experiments were carried out on a P4 with a CPU speed of 2.00GHz and 2.5 GB RAM, running the Windows 7 Ultimate OS. The first experiment conducted was to find the compression ratio and compare it with traditional Huffman coding. The compression measure that we used is the compression ratio = new file size / old file size. In this experiment, we compressed files with the following sizes: 15, 30, 45, 60, 75, 90, 105, 120, 135, and 150 KB. The results of these experiments are depicted in Figure 1.
Fig. 1. The compression ratio between different file sizes (compression ratio, in percent, of the proposed method and Huffman coding for file sizes from 0 KB to 150 KB)
The tested files in Figure 1 were generated in a systematic way. For example, for the 100 KB file, we generated 100 x 1024 x 8 bits, where each bit is either 0 or 1. The 0s and 1s were generated randomly, and each group of 8 bits was converted to its corresponding ASCII code. It is clear from Figure 1 that the suggested approach is better than regular Huffman coding by approximately 8%. The second experiment that we conducted concerns the compression time. In this experiment, we measured the time required for compression using our suggested approach and traditional Huffman coding. The results of this experiment are depicted in Figure 2.
Fig. 2. The CPU time (normalized, in seconds) of the proposed method and Huffman coding for file sizes from 0 KB to 150 KB
It is clear from Figure 2 that the suggested approach is better than regular Huffman coding in compression time, requiring roughly half the time of the original Huffman coding.
5 Discussion
We have seen that the suggested approach reduces the size of the files by around 8% compared to traditional Huffman coding. However, this enhancement does not only improve the compression ratio; it also reduces the running time of the compression process to half that of the original Huffman coding. This reduction comes from the reduction in the number of symbols compared to traditional Huffman coding. The suggested approach takes advantage of the idea of first-order forms. It uses the second-order form, where it merges each letter with the following one. This implies that the number of different permutations is at most 256*256 = 65,536. However, we get fewer than this number. The reason for this is that many unique couples of letters do not occur in most languages; they have no entry in the frequency table and are removed from the frequency table if their frequency is zero. It is also expected that if we use the third-order form, the compression ratio will be further improved. However, the size of the frequency table will be larger, and it is not worth investigating, although it is practically applicable. The authors have tried to reduce the compression ratio further using other approaches. However, they failed. The first failed approach that we tested was to divide the text into letters, take the ASCII number of each letter, and then check the length of the ASCII number; if it is less than three digits, we add zeros to the left of it to make the length equal to three (because the length of the ASCII number of each letter is one, two or three digits). After that, we establish a frequency table of ten positions, from zero to nine, and apply Huffman coding on it. The results were negative because we add digits
(zeros) to the ASCII numbers, so for each letter the number of digits in the ASCII representation is three, which means its length is 12 bits, and this is very costly. The second failed approach that we tested was to make an enhancement on the first failed approach by taking the ASCII numbers in hexadecimal; here there is no need to add zeros to the left of each ASCII number because its length is two for all ASCII numbers, and we then follow the same steps as in the first failed approach. The obtained results were better than those of the first failed approach, but they were still negative.
6 Conclusions This paper suggested a lossless compression method that takes advantage of the second-order form of letters. It uses Huffman coding; however, instead of finding the frequency of each single letter, it finds the frequency of each pair of letters. The compression ratio was improved by around 8%, and the running time of the compression was reduced to around 50% of the original time.
Compression of Layered Documents Bruno Carpentieri Dipartimento di Informatica ed Applicazioni “R.M. Capocelli”, Università di Salerno, Via Ponte Don Melillo, Fisciano (SA), Italy [email protected]
Abstract. Bitmap documents are often a superposition of graphic and textual content together with pictures. Simard et al. in [1] showed that the compression performance on these documents can be improved when images are separated from text and different compression algorithms are applied to these different types of data. In this paper we study layered image compression via a new software tool. Keywords: Layered documents, text compression, image compression, lossless compression.
1 Introduction Digital documents are nowadays widespread and the idea of a paperless office is becoming more realistic. Data compression is crucial for the transmission and storage of digital documents. Research and standardization in data compression is primarily targeted at specific types of digital information sources: there are separate compression standards and techniques for each type of data, such as images, text, audio, etc. As an example, image compression algorithms are designed with a specific image type with well-defined characteristics in mind (natural images, synthetic images, textual images, etc.), and for each image type there are specific compression algorithms. Many standard algorithms are therefore available to compress digital images (JPEG, JPEG2000 [4], JPEG-LS [5], etc.) and there are several compression formats for textual data (JBIG, JBIG2, CCITT G4 [6], etc.). Each of these standards and formats reflects a specific area of research on a restricted type of data, on which more algorithms are continually being developed to improve on existing methods. When we apply one of these algorithms to a different kind of image, it often does not perform well enough compared to other algorithms that are better tailored to that type of image. Bitmap documents are generally images that contain a superposition of graphic or textual content together with pictures. These documents do not compress well using standard image compression techniques, because the presence of text introduces sharp edges on top of the smooth surfaces typically found in natural images (see [1], [2]).
Fig. 1. Original image (left side) and two layers separation (right side)
The compression performance on these documents can be improved by separating images from text and by applying different compression algorithms on the different types of data that are in the document. Figure 1 shows a layered image and its decomposition into two separate layers. These layers can be compressed separately with specifically tailored compressors to enhance both compression ratio and decompressed image quality.
2 SLIm In [1], Simard et al. proposed to separate text and line drawings from the background image through their SLIm algorithm. The resulting layers can then be compressed separately with specifically tailored algorithms (the acronym SLIm stands for Segmented Layered Image). SLIm segments the image into three separate components: background, foreground and a binary mask, where the binary mask is a binary image in which every pixel is a flag that indicates whether the pixel belongs to the foreground or to the background. Text, annotations and drawings are captured through the binary mask and compressed by using an appropriate coder. The rest of the data (image) goes through an image coder. Each coder therefore deals only with the type of data for which it was designed. It can be shown that this separation significantly improves the compression of the whole document. After the separation, the pixels that do not belong to the current layer can be treated as “don’t care” pixels and they can be suitably set to enhance the compression of the current layer.
Fig. 2. SLIm diagram
As shown in Figure 2 (from [1]), SLIm has four steps: computation of the mask, segmentation of foreground/background, a codec for the image layer, and a codec for the mask. After the mask has been calculated, the segmenter uses the mask to separate the foreground (for example text and/or bitmaps) from the background (image). They are compressed separately by using off-the-shelf coders to encode the whole bitmap of the background (in which the foreground pixels are substituted by “don’t care” pixels) and the whole bitmap of the foreground (in which the background pixels are substituted by “don’t care” pixels). SLIm improves the compression of each layer by assigning to the “don’t care” pixels values that enhance the compression performance of the specific compressor. Finally, SLIm recombines the layers as specified by the mask.
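As a simplified illustration of this separation and recombination (a sketch under our own assumptions, not the SLIm code), the following Java fragment splits a greyscale image into background and foreground layers according to the binary mask, marks the pixels that do not belong to a layer with a placeholder value, and merges the decoded layers back together as specified by the mask.

// Simplified illustration (not the SLIm implementation): mask-based layer
// separation and recombination for a greyscale image. Pixels not belonging
// to a layer are marked with a placeholder; how they are actually set for
// compression is discussed in Section 3.
public final class LayerSeparation {

    public static final int DONT_CARE = -1;   // placeholder value, an assumption of this sketch

    /** mask[y][x] == true means the pixel belongs to the foreground. */
    public static int[][] extractLayer(int[][] image, boolean[][] mask, boolean foreground) {
        int h = image.length, w = image[0].length;
        int[][] layer = new int[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                layer[y][x] = (mask[y][x] == foreground) ? image[y][x] : DONT_CARE;
        return layer;
    }

    /** Recombines the two decoded layers as specified by the mask. */
    public static int[][] recombine(int[][] background, int[][] foreground, boolean[][] mask) {
        int h = mask.length, w = mask[0].length;
        int[][] image = new int[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                image[y][x] = mask[y][x] ? foreground[y][x] : background[y][x];
        return image;
    }
}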
3 Don’t Care Pixels Once the mask has been computed, SLIm separates the foreground from the background. The decoder will reconstruct the image according to the mask: a pixel that belongs to the foreground does not need to be encoded in the background, and a pixel that belongs to the background does not need to be encoded in the foreground. Therefore the pixels of the original image that, after the separation, do not belong to the layer that is currently compressed can be set to a value that improves the compression of the current layer: they will not be decoded by the decoder in the decompression of this layer. These pixels, in this framework, are called “don’t care” pixels (see [1], [3]). One of the critical steps is to decide the values that have to be assigned to the “don’t care” pixels. The problem, as stated in [1] and [3], is to interpolate the valid pixels into the masked locations in a way that produces a smooth image, without sharp edges. This improves compression. One of the solutions proposed by Simard et al. in [1] is to run two averaging filters, one that scans the image from left to right and from top to bottom and replaces each masked pixel by the average of the left and above pixels, and the other that runs in the opposite direction, from the bottom right of the image.
Fig. 3. The application diagram
After the two filters are executed, a linear combination of the results of both filters, weighted by the distance to the nearest non-masked pixel encountered by each filter, is substituted into the image for the “don’t care” pixels. Other simple solutions are to assign a constant value (for example 0 or 1) to the “don’t care” pixels. Ansalone and Carpentieri, in [3], propose other efficient solutions for setting the “don’t care” pixels based on the “perimeter” average, i.e. an average of the values of the pixels surrounding the text regions.
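A minimal Java sketch of this two-pass averaging strategy is given below; it is our own simplified rendering, and the final combination step is reduced to a plain mean of the two passes, whereas [1] weights each pass by the distance to the nearest non-masked pixel.

// Simplified sketch of the two-pass "don't care" filling (not the code of [1]).
// Assumes a greyscale image (0-255) and a boolean array where true marks a
// masked ("don't care") pixel of the current layer.
public final class DontCareFiller {

    public static int[][] fill(int[][] image, boolean[][] masked) {
        int h = image.length, w = image[0].length;
        int[][] forward = copy(image);
        // Forward pass: left-to-right, top-to-bottom, average of left and above neighbours.
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                if (masked[y][x])
                    forward[y][x] = avg(x > 0 ? forward[y][x - 1] : -1,
                                        y > 0 ? forward[y - 1][x] : -1);
        int[][] backward = copy(image);
        // Backward pass: right-to-left, bottom-to-top, average of right and below neighbours.
        for (int y = h - 1; y >= 0; y--)
            for (int x = w - 1; x >= 0; x--)
                if (masked[y][x])
                    backward[y][x] = avg(x < w - 1 ? backward[y][x + 1] : -1,
                                         y < h - 1 ? backward[y + 1][x] : -1);
        // Simplified combination: plain mean of the two passes (the original method
        // weights each pass by the distance to the nearest non-masked pixel).
        int[][] out = copy(image);
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                if (masked[y][x])
                    out[y][x] = (forward[y][x] + backward[y][x]) / 2;
        return out;
    }

    private static int avg(int a, int b) {
        if (a < 0 && b < 0) return 128;   // no valid neighbour yet: neutral grey
        if (a < 0) return b;
        if (b < 0) return a;
        return (a + b) / 2;
    }

    private static int[][] copy(int[][] src) {
        int[][] dst = new int[src.length][];
        for (int i = 0; i < src.length; i++) dst[i] = src[i].clone();
        return dst;
    }
}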
4 A Software Tool for Layered Image Compression To test the compression of layered documents, we have developed a Java software tool that applies the techniques described in [1]. Figure 3 illustrates the structure of this software tool. The Application module includes the Java class CTSoftware, which is the application root and implements the user interface. The Compressor package includes the classes that are used to compress a bitmap image. These are the MainCompress class, which organizes all the services offered by the compressor; MaskComputation, which computes the binary mask as in [1]; Segmenter, which separates the foreground from the background; ImageEncoder, which codes background and foreground as indicated by the user; BinaryEncoder, which codes the binary mask; and CombineBit, which creates the final compressed file.
Fig. 4. The test data set, from left to right, top to bottom: ABCD, Diabolik1, NAIF, Prova, Diabolik2
The Decompressor package includes the classes that are used to decompress a file coded by the Compressor. It includes the MainDecompress class, which is the root class that calls all the decompressor services; the BinaryDecoder class, which decodes the binary mask; the CreateImage class, which restores the original bitmap by merging together foreground and background; the Dscript class, which interfaces with the external applications to decompress the subfiles; and the ImageDecoder class, which decodes background and foreground. The Off-the-Shelf components are pre-existing, external programs that are used to compress and decompress from/to one of the existing standard formats such as JPEG, JPEG-LS, JPEG near-lossless, JPEG2000, JBIG, etc. Figure 4 shows five compound documents on which the software tool has been tested. Table 1 presents the results of our preliminary experiments, in which the five images have been losslessly compressed. Each row of the table shows the image name, its original size, the two methods used to compress the background and foreground layers, the strategy used to set the “don’t care” pixels, the resulting compressed size and the compression ratio.
Table 1. Preliminary experimental results
Image name | Original size (KB) | Background | Foreground | Don’t care | Compressed size (KB) | Compression ratio
ABCD       | 79,782  | JPEG-LS | JPEG-LS | BLACK     | 5,99    | 1:12.987466
ABCD       | 79,782  | JPEG-LS | JPEG-LS | WHITE     | 7,49    | 1:10.389634
ABCD       | 79,782  | JPEG-LS | JPEG-LS | AVERAGE   | 11,684  | 1:6.8283124
ABCD       | 79,782  | JPEG-LS | JPEG-LS | FILTERING | 27,412  | 1:2.9104772
Diabolik1  | 836,318 | JPEG-LS | JPEG-LS | BLACK     | 209,447 | 1:3.9929814
Diabolik1  | 836,318 | JPEG-LS | JPEG-LS | WHITE     | 212,464 | 1:3.936281
Diabolik1  | 836,318 | JPEG-LS | JPEG-LS | AVERAGE   | 280,645 | 1:2.9799855
Diabolik1  | 836,318 | JPEG-LS | JPEG-LS | FILTERING | 432,810 | 1:1.9322982
NAIF       | 476,062 | JPEG-LS | JPEG-LS | BLACK     | 16,532  | 1:28.796394
NAIF       | 476,062 | JPEG-LS | JPEG-LS | WHITE     | 16,101  | 1:29.567232
NAIF       | 476,062 | JPEG-LS | JBIG    | BLACK     | 9,213   | 1:73.62543
NAIF       | 476,062 | JPEG-LS | JBIG    | WHITE     | 8,776   | 1:54.2459
NAIF       | 476,062 | JPEG-LS | JBIG    | FILTERING | 21,788  | 1:21.849733
Prova      | 122,870 | JPEG-LS | JPEG-LS | BLACK     | 7,636   | 1:16.090885
Prova      | 122,870 | JPEG-LS | JPEG-LS | WHITE     | 10,252  | 1:11.984979
Prova      | 122,870 | JPEG-LS | JBIG    | BLACK     | 7,056   | 1:17.41355
Prova      | 122,870 | JPEG-LS | JBIG    | WHITE     | 7,618   | 1:16.128904
Prova      | 122,870 | JPEG-LS | JBIG    | FILTERING | 12,756  | 1:9.63233
Diabolik2  | 259,948 | JPEG-LS | JPEG-LS | BLACK     | 199,149 | 1:1.0227618
Diabolik2  | 259,948 | JPEG-LS | JPEG-LS | WHITE     | 198,944 | 1:1.0238158
Diabolik2  | 259,948 | JPEG-LS | JPEG-LS | AVERAGE   | 252,166 | 1:0.8077298
Diabolik2  | 259,948 | JPEG-LS | JPEG-LS | FILTERING | 259,948 | 1:0.783549
Diabolik2  | 259,948 | JPEG-LS | JBIG    | WHITE     | 140,635 | 1:1.4483024
Diabolik2  | 259,948 | JPEG-LS | JBIG    | BLACK     | 137,876 | 1:1.477284
These preliminary experimental results show the importance of the setting of the “don’t care” pixels: depending on the strategy used to set these pixels, the compression ratio varies considerably. To the best of our knowledge this software tool is unique; there are no other similar tools in the literature so far.
5 Conclusions Simard et al. in [1] showed that the compression performance on compound documents can be improved by separating images from text and by applying different compression algorithms to the different types of data that are in the document. In this paper we studied layered image compression via a new software tool. The preliminary experimental results obtained through this software tool show that the choice of the settings for the “don’t care” pixels is crucial. Future work will concentrate on the improvement of the technique described in [3] for setting the “don’t care” pixels.
Acknowledgments I would like to thank my students Federico Distante and Giovanni Longo who implemented a preliminary version of the software tool described in this paper.
References 1. Simard, P.Y., Malvar, H.S., Rinker, J., Renshaw, E.: A Foreground/Background Separation Algorithm for Image Compression. In: Proceedings of the Data Compression Conference (DCC’04). IEEE Press, Snowbird (2004) 2. de Queiroz, R.L.: Compression of Compound Documents. In: Proceedings of ICIP 1999, pp. 209–213 (1999) 3. Ansalone, A., Carpentieri, B.: How to set “don’t care” pixels when lossless compressing layered documents. WSEAS Transactions on Information Science and Applications 4(1) (2007) 4. Taubman, D.S., Marcellin, M.W.: JPEG2000 Image Compression, Fundamentals, Standards, and Practice. Kluwer, Boston (2002) 5. Weinberger, M.J., Seroussi, G., Sapiro, G.: The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS. IEEE Trans. Image Processing IP-9, 1309–1324 (2000) 6. Malvar, H.S.: Fast adaptive encoder for bi-level images. In: Proceedings of IEEE Data Compression Conference, pp. 253–262 (2001)
Classifier Hypothesis Generation Using Visual Analysis Methods Christin Seifert, Vedran Sabol, and Michael Granitzer Know-Center Graz, Austria {cseifert,vsabol,mgrani}@know-center.at http://www.know-center.at
Abstract. Classifiers can be used to automatically dispatch the abundance of newly created documents to recipients interested in particular topics. Identification of adequate training examples is essential for classification performance, but it may prove to be a challenging task in large document repositories. We propose a classifier hypothesis generation method relying on automated analysis and information visualisation. In our approach visualisations are used to explore the document sets and to inspect the results of machine learning methods, allowing the user to assess the classifier performance and adapt the classifier by gradually refining the training set. Keywords: Text Categorisation, Visual Analysis.
1
Introduction
In today’s information-driven world new documents, such as news, scientific publications, technical reports or patents, are produced at an astonishing rate. Frequently the need arises to supply recipients interested in particular topics with relevant new documents. This task can be automated by using a classifier trained to recognise documents which are relevant to a particular topic, i.e. to classify the documents into the corresponding topical category. Obviously, accurate automatic approaches for large data sets are highly desirable. As the performance of a classification model strongly depends on the training data, the classifier needs training data that is representative of the data set. All categories must be sufficiently covered with examples which preferably contain no contradictions. The situation is further aggravated in dynamic data sets, where the problem of keeping the training set up to date arises as new topics appear, vocabularies drift, or interests of the recipients gradually change. In this paper we present work in progress which attempts to address these issues by a visual analytics-based approach, where automated analysis is combined with information visualisation to unite the strengths of high-speed computer processing with the immense pattern recognition capabilities of the human visual apparatus. We propose a classifier hypothesis generation method combining unsupervised and supervised machine learning methods, complemented by human involvement via visual analysis GUI components. Users’ general knowledge and intuition are
decisive factors in steering the training set definition process and providing feedback to the system. Definition of the training set for a category usually begins by selecting candidate documents using a keyword search. In our approach the potentially large and diverse search result set is analysed by a clustering algorithm and presented to the user in an Information Landscape visualisation. The insights the user gains while exploring the topical structures in the landscape are used for specifying the training set of the classifier. After the classifier has been trained and new documents have been classified, classification results with high confidence values may be gradually added to the training set if deemed good by the user. A classifier visualisation is used to (periodically) assess the training set of the classifier and to identify classes and documents for which the confidence values produced by the classifier do not appear satisfactory. These documents can be analysed again using clustering and the Information Landscape with the goal of improving and tuning the training set assignments. Requirements for the proposed concept, which is currently in a prototype stage, are derived in cooperation with real-world users who, within their productive environments, need to supply relevant documents to recipients interested in particular topical categories. Obviously, correct assignments of documents to categories as well as precise category definitions are central to the task. The remainder of the paper is structured as follows: in Section 2 we briefly review related work, and Section 3 describes our approach and the required building blocks in detail. Section 4 is dedicated to a usage scenario on the Reuters-21578 text collection. We conclude and give an outlook on future work in Section 5.
2
Related Work
The interdisciplinary research field of visual analytics focuses on reasoning facilitated by interactive visual interfaces [22]. It is a combination of automated discovery and interactive visualisation [9] used for understanding patterns in large data sets and discovering previously unknown facts and relationships. A core challenge in visual analytics is the analysis of massive repositories of unstructured texts [22]. The Information Landscape is a visual representation used to visualise complex relationships in large data sets. It has been successfully applied to convey topical relatedness in text document data sets in systems such as IN-SPIRE [11]. Information Landscapes have been extended to accommodate hierarchically organised text repositories, for example in InfoSky [3] and [17]. Mayer et al. [13] propose a map-based interface to large text collections based on self-organising maps (SOMs). This SOM-based visualisation also provides access to hierarchical topical clusters and could be used in our combined approach interchangeably with the Information Landscape. Interactive machine learning offers a way to integrate background knowledge of domain experts into data mining models, provided the visualisations are designed appropriately [21]. For the task of classification, several visualisations have been proposed, most of them for specific classification models such as the Naive Bayes classifier [5]. Visualisations suitable for arbitrary classifiers are either restricted
to binary classification tasks, e.g. the visualisation based on self-organising maps in [16], or applicable to only a small number of classes, like the cobweb-based visualisation of the classifier decision quality [6]. A visualisation for arbitrary classifiers has been proposed in [19]; however, its applicability to massive text data sets has not been shown yet. Interactive approaches include interactive decision tree [12] and support vector machine construction [15]. However, these approaches are not targeted at laypersons. The Nora project [14] aims at constructing classifiers from large text repositories by letting the user label text documents with one of two category labels. The authors claim that their user interface design can easily be adapted to other text classification tasks where users create training sets for classifiers. We think that an explicit representation of the overall classifier and its quality is missing in the interface. Another interactive text classification application has been proposed in the field of intrusion detection [4], where the authors combined a Naive Bayes classifier with a colour-coded representation of text, and again let the user interactively label the incoming connection as benign or malicious. Here again, the user has no overview of the classifier and its quality, making it hard to assess the suitability of the classification model for the task. Besides visual approaches, automatic approaches exist for creating classifiers from large data sets, namely semi-supervised learning techniques [24] and active-learning techniques [20]. Semi-supervised models need to be carefully designed and adapted to the problem structure in order to improve classifier performance compared to purely supervised learning [24]. Thus, semi-supervised approaches are not suitable for our application domain, because we cannot make a-priori assumptions about the classification problem (e.g., estimated model complexity, data distribution). Active-learning techniques generate new training data by asking the user to label the data items for which the classifier is least confident [20]. Such techniques can be combined with our visual approach to increase the classifier’s performance once the categories are defined. For our application domain, the categories may evolve over time and thus a pure active learning approach is not applicable.
3
Combined Approach
Our approach to user-centred hypothesis generation for text categorisation is summarised in Figure 1(a). Automatic techniques (depicted by dark grey boxes) alternate with users’ analysis and actions (depicted by the light-grey boxes, tagged with a user symbol). As Figure 1(a) shows, the approach is characterised by an analysis-action loop which is terminated when the user is satisfied with the classification hypothesis. In detail, our approach consists roughly of the following steps: 1. Search (optional): Automatically finding potential documents of interest using keyword search. 2. Pattern analysis in the Information Landscape: Understand topical structures in the data set and select documents covering relevant topics. If
(a) process    (b) building blocks
Fig. 1. Overview of the combined user-centred clustering and classification approach
no search was performed before, the whole document set is chosen for the Information Landscape. 3. Building the classification hypothesis: – Training: Use the selected documents to train a new category or to modify (add/remove documents from) an existing one. – Classification: Classify the selected documents and, optionally, add documents with high confidence values to the training set of the assigned category. – Visualisation: Inspect the classifier and, if deemed necessary, select documents for further analysis and refinement in the Information Landscape (step 2). The building blocks necessary to implement the described approach are depicted in Figure 1(b) and described in detail in Sections 3.1–3.7. Note that the majority of the applied algorithms and visualisations are implemented within the KnowMiner knowledge discovery framework [10] and the VisTools visualisation library [17]. Applying clustering and classification methods in the document space results in a structure as depicted in Figure 2. This structure contains two, in general orthogonal, dimensions: the clustering tree and a forest of tree stumps imposed by the classification. The first structure, devised by the clustering algorithm, is visualised in the Information Landscape. Note that the tree-like structure imposes a non-overlapping division of the document set, i.e., a document belongs to only one of the next-to-bottom level clusters. The second structure, the decision tree stumps, is imposed by the classification algorithm. The categories (on the right-hand side of Figure 2) further divide the document set, independently of the cluster hierarchy. As we apply a multi-label classifier, this second structure
Fig. 2. Structure of the document space: hierarchical cluster tree superimposed by categories modelled by the classifier. Dark red documents are training documents for one class, light red are the documents classified by the classifier.
is in general not a partitioning of the document set, i.e., one document can belong to more than one category. The clustering hierarchy corresponds to topical structures implicitly present in the document space which are useful for gaining insight into the data, whereas the division imposed by the classifier corresponds to the structure explicitly defined by the application domain and users’ needs. 3.1
Document Preprocessing and Indexing
Before any of the analytic algorithms and visualisations can be applied to text documents, these documents need to be preprocessed and transformed into a term space representation. Each document is represented by a term vector whose components are the frequencies (occurrence counts) of terms in the document. To extract relevant terms from a document we apply a part-of-speech tagger to identify nouns, which are subsequently stemmed and stop-word-filtered. As vectorisation of text documents may be quite time consuming, raw vectors are stored so that they can be quickly retrieved when needed for processing by an algorithm. All documents which are imported and stored in their vectorised form are also indexed so that they can be quickly retrieved using full text search. 3.2
Hierarchical Clustering
Clustering is an unsupervised machine learning technique which partitions a given set of items, in our case text documents, into subsets (clusters) of related items. Documents assigned to the same cluster are similar to each other according to a similarity function. We apply the k-means algorithm recursively using the cosine similarity measure, which is known to perform well for text data [23].
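For reference, the following sketch (a generic illustration, not the KnowMiner implementation) shows the cosine similarity on sparse term vectors that such a clustering relies on, together with the centroid computation performed when k-means re-estimates a cluster representative.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Generic illustration (not the KnowMiner code): cosine similarity between
// sparse term vectors and the centroid computation used when k-means
// re-estimates a cluster representative.
public final class VectorOps {

    public static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double w = b.get(e.getKey());
            if (w != null) dot += e.getValue() * w;
        }
        return dot == 0.0 ? 0.0 : dot / (norm(a) * norm(b));
    }

    public static Map<String, Double> centroid(List<Map<String, Double>> docs) {
        Map<String, Double> c = new HashMap<>();
        for (Map<String, Double> doc : docs)
            doc.forEach((term, w) -> c.merge(term, w, Double::sum));
        c.replaceAll((term, w) -> w / docs.size());
        return c;
    }

    private static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) sum += w * w;
        return Math.sqrt(sum);
    }
}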
Recursive application of k-means creates a hierarchy of clusters and sub-clusters where the leaves of the created cluster tree correspond to single documents. Each cluster in the hierarchy is labelled with the highest weight terms (i.e. extracted keywords) of the underlying documents. A mechanism for splitting and merging of clusters attempts to guess the “optimal” number of child-clusters, whereby the number of children is limited for usability reasons (i.e. to avoid scanning of long lists). The split-and-merge strategy also prevents the degeneration of the hierarchy. The resulting cluster hierarchy is suitable for browsing of the document set and we also refer to it as a “virtual table of contents”. 3.3
Projection Algorithm
The projection algorithm performs a dimensionality reduction of the high dimensional term space. In the resulting 2D visualisation space high-dimensional relationships (i.e. topical similarities) are preserved as well as possible so that topically similar documents (and clusters) are placed close to each other while dissimilar ones are positioned far apart. The algorithm [17] is a combination of hierarchical clustering techniques, force-directed placement and spatial tessellation, which proceeds recursively along the cluster hierarchy: first, the top level clusters are placed inside a rectangular area using a simple force-directed placement algorithm. The similarity of the centroids is calculated as the cosine similarity in the vector space representation of the centroids. After the top-level centroids are placed, a Voronoi subdivision is calculated using the centroids as generator points for the Voronoi regions. The sub-clusters of a specific cluster are recursively projected inside the Voronoi region of this cluster. The leaves (documents) of the hierarchy are placed within the Voronoi area of their parent cluster using the same force-directed placement method. 3.4
Multi-label Text Classification
The purpose of text classification algorithms is to assign category labels to previously unseen documents. Classification is a supervised machine learning technique, meaning that the algorithm learns the categories from a training set consisting of document-category pairs. Especially in text classification tasks, single-label classification is not sufficient, since each document may belong to more than one predefined category. We apply an adapted K-Nearest Neighbour (KNN) algorithm [2,1] for multi-label classification. As similarity we use the cosine similarity on the TF-IDF weighted vector-space representation of the documents. The output of the classifier for each classified document is a list of categories accompanied by a confidence value for each category. The visual analytics application is in principle independent of the specific classifier, as long as multi-label classification is supported. We use a KNN implementation to accommodate the dynamic nature of document repositories, which are often growing at a fast rate. KNN training performance is suitable for frequently changing training sets where documents defining the training set of a category are added (or removed) fairly often, and where categories need to be reorganised from time to time.
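The following compact Java sketch (a generic illustration, not the adapted implementation of [2,1]) captures the multi-label KNN step: the k training documents most similar to the query vote for their labels with their cosine similarity, and the accumulated votes are normalised into per-category confidence values.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Generic multi-label KNN sketch (not the adapted implementation of [2,1]).
// Training documents are dense, already TF-IDF-weighted vectors; each may
// carry several category labels.
public final class MultiLabelKnn {

    public record TrainingDoc(double[] vector, Set<String> labels) {}

    /** Returns category -> confidence in [0,1] for the query document. */
    public static Map<String, Double> classify(double[] query, List<TrainingDoc> training, int k) {
        // Rank training documents by cosine similarity to the query (descending).
        List<TrainingDoc> ranked = new ArrayList<>(training);
        ranked.sort((a, b) -> Double.compare(cosine(query, b.vector()), cosine(query, a.vector())));

        // The k nearest neighbours vote for their labels with their similarity.
        Map<String, Double> votes = new HashMap<>();
        double total = 0.0;
        for (TrainingDoc doc : ranked.subList(0, Math.min(k, ranked.size()))) {
            double sim = cosine(query, doc.vector());
            total += sim;
            for (String label : doc.labels()) votes.merge(label, sim, Double::sum);
        }
        // Normalise the accumulated votes into confidence values.
        if (total > 0) {
            final double t = total;
            votes.replaceAll((label, v) -> v / t);
        }
        return votes;
    }

    private static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }
}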
3.5
Information Landscape
The Information Landscape is a visual representation based on the geographic map metaphor. It is used to visualise complex relationships in large data sets by conveying relatedness through spatial proximity. We use an Information Landscape to visualise the projection and clustering results provided by the algorithms described in Sections 3.2 and 3.3. The cluster hierarchy is represented by nested polygonal areas generated by the Voronoi area subdivision. A region of the landscape corresponding to a cluster is labelled by the highest weight terms of the cluster’s centroid. Documents, which are placed at the bottom of the hierarchy, are visualised as dots. Hills represent regions populated by a large number of topically related documents. They are separated by lower areas, or “sea”, which represent sparsely populated regions. The landscape (see Figure 3) is an interactive GUI component designed for explorative navigation in the visualised data set. It adheres to the well-known information visualisation mantra (“overview first, zoom and filter, details-on-demand”) by providing an overview of the whole data set and, when required, offering insight at finer levels of detail. Labels are useful both for orientation and navigation: clicking on a label will trigger a short animated “flight” to the corresponding cluster and reveal the areas and labels of its sub-clusters. At the finest level of detail, information on individual documents is displayed. This provides an adaptive level of detail which is always adjusted to the zoom level and the area currently explored by the user. Free zooming, panning, rotating and tilting are also available. The landscape offers several mechanisms for selecting documents: using a lasso selection tool, single selection through mouse clicks, selection depending on cluster membership, and selection using search. Selected documents are enlarged and/or displayed in a different colour. 3.6
Classifier Visualisation
The visualisation of the classifier provides an overview of the quality and the model of arbitrary classifiers. It is described in detail in [18] for single-label classification tasks. In our application data items may belong to more than one category (multi-label classification). For each classified document the classifier delivers category assignments consisting of pairs of the form (category label, confidence), where confidence is a real number between 0 (lowest) and 1 (highest). In the visualisation the categories are equally distributed on the circumference of a circle while the items are attracted to the categories according to their confidence values. This means that items placed near a category belong to this category with high confidence. In contrast, items placed in the centre of the circle are assigned to more than one category with high confidence. Thus, the visualisation gives an overview of the item distribution over categories. If most of the items are placed in the centre, this indicates a strongly multi-label classification model. On the contrary, if most of the items are placed near the categories, it is an indication of a predominantly single-label classification model. As mentioned in [18], the placement of the items in the classifier visualisation is ambiguous.
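The placement rule can be summarised by the short sketch below (an illustrative reconstruction, not the code of [18]): categories are spaced evenly on a unit circle and each item is drawn at the confidence-weighted average of the category positions. The sketch also makes the ambiguity visible, since different confidence vectors can map to the same point.

// Illustrative reconstruction of the placement rule (not the code of [18]):
// categories sit evenly spaced on a unit circle; an item is placed at the
// confidence-weighted average of the positions of its assigned categories.
public final class ClassifierVisPlacement {

    /** Position of category i out of n, on the unit circle. */
    public static double[] categoryAnchor(int i, int n) {
        double angle = 2.0 * Math.PI * i / n;
        return new double[] { Math.cos(angle), Math.sin(angle) };
    }

    /** confidences[i] is the classifier's confidence for category i (0..1). */
    public static double[] itemPosition(double[] confidences) {
        int n = confidences.length;
        double x = 0, y = 0, sum = 0;
        for (int i = 0; i < n; i++) {
            double[] anchor = categoryAnchor(i, n);
            x += confidences[i] * anchor[0];
            y += confidences[i] * anchor[1];
            sum += confidences[i];
        }
        // Items with one dominant category end up near that category's anchor;
        // items with several high confidences are pulled towards the centre.
        return sum == 0 ? new double[] { 0, 0 } : new double[] { x / sum, y / sum };
    }
}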
This ambiguity is resolved by user interaction: moving the mouse over a data item displays the corresponding assignments of the classifier, highlighting the category for which the classifier is most confident. Furthermore, on mouse-over the content of the data item is shown, allowing the user to assess the classifier’s decision. The classifier visualisation offers single selection on mouse click as well as a lasso selection to select several similarly classified items. 3.7
Interaction Mechanisms
Table 1 shows all tasks that can be performed using both interfaces. Some tasks, like “delete items” can be performed in both user interfaces. Other tasks can only be performed in one of the interfaces. For instance, “delete class” is only possible from the classification window, since a class is not visible as such in the Information Landscape.

Table 1. Overview of the tasks that can be performed from the interfaces Information Landscape (IL) and classifier window (CW)

task                 | results in
create category      | a new category is created from the selected documents and the classifier is retrained
delete from category | selected documents are deleted from the category and the classifier is retrained
delete category 1    | the selected category is deleted and the classifier is retrained
classify             | the selected documents are classified and the output of the classifier is presented to the user
inspect documents    | an Information Landscape is built from the selected documents
inspect category     | an Information Landscape is built from the training documents of the selected category
inspect classifier   | shows the classifier visualisation for the training data

4
Usage Scenario
We performed our experiments on the Reuters-21578 text collection. The hierarchical clustering results in 10 top-level clusters, as shown in Figure 3. This structure of the information space is purely unsupervised. The Information Landscape in Figure 3 gives an overview of the entire text collection, showing clusters of similar documents and associated labels. The user can investigate the cluster hierarchy and get an insight into the overall content of the collection. The user
1 Category deletion cannot be explicitly invoked from the Information Landscape, but a category is automatically deleted if all its training documents are deleted.
Fig. 3. Selection of documents in the Information Landscape
might be interested in other partitions of the data set which are not detected by the unsupervised methods. For example, the user might want to distinguish the categories “politics”, “computers”, “cars”, “sports” and “planes”. At first, these categories are not explicitly represented; they only exist as a mental model in the user’s mind. While investigating the Information Landscape the user might come across documents that belong to one of these categories. The user can then select these documents (as shown in Figure 3) and create a new category from the selected documents. In the background, the selected documents are added to the classifier as new training data for the specific category (if the category does not exist in the classifier yet, it will be created). After repeating the steps “investigation” and “adding training documents to the classifier”, the user might have found example documents for each of the categories of interest. He or she might then be interested in the currently available documents for each category and in the quality of the classifier that he or she has implicitly generated. This information is provided by the classification window. The training documents for each class are presented as a list to the user. If the user detects wrongly assigned documents for a category, he or she can simply remove them from the list and the classifier is retrained on the reduced training data set. For assessing the overall classifier quality the user can switch to the classification visualisation view as shown in Figure 5. In the figure, it can be seen that there are many documents belonging to more than one class (the central area). Only for the class “cars” are there documents belonging to no other class. Further, there are some documents belonging to exactly two classes; these are the documents lying on the imaginary line between the “cars” and the “sports” rectangles as well as on the imaginary
line between “cars” and “computers”. The user might investigate the content of the interesting documents by moving the mouse over the items and eventually discover misclassified items. If the user discovers that the classification model is not in line with his or her mental model of the categories, e.g., that the categories should be more distinct (i.e. fewer documents in the centre of the visualisation), the user can select the conspicuous documents and generate a new Information Landscape in order to investigate them further. Similarly, a new Information Landscape can be generated for all training documents of one category. The resulting landscape is shown in Figure 6. This might lead to further insights and actions, for instance finding and deleting wrongly assigned training documents. After cleaning up the classifier by consolidating the training sets, the user might be interested in whether there are more documents inside the collection that fit into these categories. Back in the Information Landscape, he or she then selects documents and gets them classified. The classification result for the documents selected in Figure 3 is shown in Figure 4. The classification results can then be investigated and, in case the classifier correctly classified the documents, they can be added to the training data set.
5
Discussion
Generating classifier hypotheses for large dynamic text data repositories is a challenging and time-consuming task. We described our work in progress, which combines automatic and visualisation-based approaches. The user is presented with an interactive visualisation of the text collection, the Information Landscape, which
Fig. 4. Classification results for selected documents
Fig. 5. Overview of the classifier’s training data set
Fig. 6. Information landscape generated by selected documents (training documents for class “plane”)
is useful for gaining insights into topical structures present in the data set. The newly discovered information is useful for defining the training set of the classifier. The resulting classifier can be evaluated by means of a classifier visualisation and refined further if necessary. We see clear advantages of our approach in the case when the categories are not pre-defined, but emerge during investigation of the document set. However, we also believe that the Information Landscape is useful for analysis and improvement of existing training sets, for example when the quality of the classifications is deemed unsatisfactory by the user.
6
Future Work
As our approach relies heavily on visualisation and user interaction, usability evaluation of selected components will be necessary to discover possible shortcomings. Some components, such as the Information Landscape, have already been evaluated in formal usability experiments [3,17]. Evaluation of the classifier visualisation component and its interaction with the Information Landscape appears to be a natural next step. Even more importantly, an evaluation of the overall effectiveness of our approach is necessary to assess its practical applicability. As the development of our method is being driven by real-world scenarios, we plan to test the effectiveness of our method with pilot users who in their daily work deal with assigning documents to topical categories. To obtain objective performance figures, we will compare the performance (expressed by quality and productivity indicators) of our classifier-based method to the solutions currently employed by the pilot users. These include either reading documents and manually assigning them to categories, or, when the amount of documents is prohibitively large, constructing complex Boolean search queries to narrow down the document set. Also, subjective user satisfaction should be evaluated through questionnaires and by collecting user remarks, which will help us identify the main problem sources and provide hints on how to deliver remedies. Further directions for development of the system primarily include incorporating active learning methods to improve the classifier’s performance once the categories are defined. We also plan to integrate other classification models, such as Support Vector Machines [8] and the Class-Feature-Centroid classifier [7]. Acknowledgments. The Know-Center is funded within the Austrian COMET Program - Competence Centers for Excellent Technologies - under the auspices of the Austrian Ministry of Transport, Innovation and Technology, the Austrian Ministry of Economics and Labor and by the State of Styria. COMET is managed by the Austrian Research Promotion Agency (FFG).
References 1. Aha, D.W.: Tolerating noisy, irrelevant and novel attributes in instance-based learning algorithms. Int. J. Man-Mach. Stud. 36(2), 267–287 (1992) 2. Aha, D.W., Kibler, D., Albert, M.K.: Instance-based learning algorithms. Mach. Learn. 6(1), 37–66 (1991)
3. Andrews, K., Kienreich, W., Sabol, V., Becker, J., Droschl, G., Kappe, F., Granitzer, M., Auer, P., Tochtermann, K.: The InfoSky Visual Explorer: Exploiting hierarchical structure and document similarities. Information Visualization 1(3-4), 166–181 (2002) 4. Axelsson, S.: Combining a bayesian classifier with visualisation: Understanding the IDS. In: Proceedings of the 2004 ACM workshop on Visualization and data mining for computer security, pp. 99–108. ACM Press, New York (2004) 5. Becker, B.G.: Research report: Visualizing decision table classifiers. In: Information Visualization. IEEE Computer Society Press, Los Alamitos (1998) 6. Diri, B., Albayrak, S.: Visualization and analysis of classifiers performance in multiclass medical data. Expert Systems with Applications 34(1), 628–634 (2008) 7. Guan, H., Zhou, J., Guo, M.: A class-feature-centroid classifier for text categorization. In: Proceedings of the 18th international conference on World Wide Web (WWW), pp. 201–210. ACM, New York (2009) 8. Joachims, T.: Text categorization with support vector machines: learning with many relevant features. In: Nédellec, C., Rouveirol, C. (eds.) ECML 1998. LNCS, vol. 1398, pp. 137–142. Springer, Heidelberg (1998) 9. Keim, D.A., Mansmann, F., Oelke, D., Ziegler, H.: Visual analytics: Combining automated discovery with interactive visualizations. In: Boulicaut, J.-F., Berthold, M.R., Horváth, T. (eds.) DS 2008. LNCS (LNAI), vol. 5255, pp. 2–14. Springer, Heidelberg (2008) 10. Klieber, W., Sabol, V., Muhr, M., Kern, R., Granitzer, M.: Knowledge Discovery using the KnowMiner framework. In: Proceedings of the IADIS International Conference on Information Systems, pp. 307–314 (2009) 11. Krishnan, M., Bohn, S., Cowley, W., Crow, V., Nieplocha, J.: Scalable visual analytics of massive textual datasets. In: IEEE International Parallel and Distributed Processing Symposium, IPDPS 2007, pp. 1–10 (March 2007) 12. May, T., Kohlhammer, J.: Towards closing the analysis gap: Visual generation of decision supporting schemes from raw data. In: Joint Eurographics and IEEE VGTC Symposium on Visualization (EuroVis), Computer Graphics Forum, vol. 27, pp. 911–918 (2008) 13. Mayer, R., Roiger, A., Rauber, A.: Map-based interfaces for information management in large text collections. Journal of Digital Information Management 6(4), 294–302 (2008) 14. Plaisant, C., Rose, J., Yu, B., Auvil, L., Kirschenbaum, M.G., Smith, M.N., Clement, T., Lord, G.: Exploring erotics in Emily Dickinson’s correspondence with text mining and visual interfaces. In: JCDL’06: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries, pp. 141–150. ACM, New York (2006) 15. Poulet, F.: Towards Effective Visual Data Mining with Cooperative Approaches, pp. 389–406. Springer, Heidelberg (2008) 16. Rheingans, P., des Jardins, M.: Visualizing high-dimensional predictive model quality. In: Proceedings of IEEE Visualization, pp. 493–496 (2000) 17. Sabol, V., Kienreich, W., Muhr, M., Klieber, W., Granitzer, M.: Visual knowledge discovery in dynamic enterprise text repositories. In: IV ’09: Proceedings of the 2009 13th International Conference Information Visualisation, pp. 361–368. IEEE Computer Society, Washington (2009) 18. Seifert, C., Lex, E.: A novel visualization approach for data-mining-related classification. In: Proceedings of the 13th International Conference on Information Visualisation (IV), July 2009, pp. 490–495. Wiley, Chichester (2009) 19. Seifert, C., Lex, E.: A visualization to investigate and give feedback to classifiers.
Poster and Demo at Eurovis 2009 (June 2009) (unpublished)
20. Settles, B.: Active learning literature survey. Computer Sciences Technical Report 1648, University of Wisconsin–Madison (2009) 21. Ware, M., Eibe, F., Holmes, G., Hall, M., Witten, I.H.: Interactive machine learning: letting users build classifiers. International Journal of Human-Computer Studies 55(3), 281–292 (2001) 22. Wong, P.C., Thomas, J.: Visual analytics. IEEE Computer Graphics and Applications 24, 20–21 (2004) 23. Zhao, Y., Karypis, G.: Evaluation of hierarchical clustering algorithms for document datasets. In: CIKM ’02: Proceedings of the eleventh international conference on Information and knowledge management, pp. 515–524. ACM Press, New York (2002) 24. Zhu, X.: Semi-supervised learning literature survey. Technical Report 1530, Computer Sciences, University of Wisconsin (2008)
Exploiting Punctuations along with Sliding Windows to Optimize STREAM Data Manager Lokesh Tiwari and Hamid Shahnasser* San Francisco State University San Francisco, CA, USA [email protected]
Abstract. Data stream processing has found enormous use in real-time applications such as financial tickers, network and sensor monitoring, manufacturing processes and others. STREAM is a Data Stream Management System which implements a sliding window query model in its architecture. We explore the existing architectural model in an effort to achieve faster results and improve the overall system functionality. In this paper, we analyze the STREAM data manager for possible optimization of its query processing. We discuss the potential inadequacies of the current query model and put forward the use of punctuations along with sliding windows to improve memory utilization and query processing. Keywords: Data Stream Management System, Punctuation, Sliding window, STREAM data manager.
1 Introduction Data Stream Management Systems (DSMS) provide a comprehensive tool for real-time stream monitoring. Data Stream Management Systems that provide online data management capabilities have been successfully used in systems related to financial applications, military purposes, sensor networks and others [1]-[4]. A data stream is a continuous sequence of tuples ordered by a timestamp. In other words, a stream is a sequence of timestamped tuples which is unbounded and includes the notion of time. As the input data stream of a DSMS is continuous and unbounded, the memory requirement of the DSMS correspondingly increases. The continuous nature of the real-time data stream makes it difficult to define start and end constraints on the input stream while processing the data. Thus, the foremost requisite of any DSMS is to bound the data stream to keep a check on the memory requirements and processing resources [5], [6]. Most of the applications using Data Stream Management Systems are real time, with no prior information about the data traffic stream [2]-[4]. The length of the data elements, the classification of incoming tuples with respect to the relative streams, the stream burst rate and the time variations of incoming tuples are
* This work has been partially supported by a UNCFSP NASA Science and Technology Institute grant.
unknown. Thus, queries cannot be optimized to use the available memory resources while avoiding the possibility of memory overload. Without knowing the size of the input stream, it is impossible to place a boundary on the memory requirements of the queries. An unlimited number of attributes would be required to fully cover the entire domain and anticipate every possible data tuple for future events. For real-time data management applications, the incoming stream needs to be combined with various other continuous streams in order to evaluate the required queries. Such join operations over continuous data streams need to store the incoming data to join it with corresponding future data. The operator’s throughput is negatively affected as the join state becomes bulky, thereby affecting the overall system performance [7]. Thus, it is not practical to compare an individual tuple of one unlimited stream with all the tuples of another infinite stream for query evaluation. The STREAM management system addresses this problem of infinite memory requirements with the use of sliding windows in its architecture [7], [8]. The architecture restricts the memory requirements of the join state by dropping the tuples that fall out of the active window in a timely fashion. 1.1 Organization of the Paper The rest of the paper is organized as follows. Related work, covered in Section 2, discusses previous work regarding data stream management systems. Section 3 presents an example of an online bidding system explaining the challenges involved with sliding windows as used in the STREAM data manager. Section 4 consists of the analysis with regard to optimization of the architecture of the STREAM data manager. Future work and conclusions are presented in Section 5.
2 Related Work Among the various well-known Data Stream Management Systems, TelegraphCQ uses an SQL-like query language which uses sliding windows to bound the continuously arriving data stream [9]. Gigascope uses GSQL, which is a specific subset of SQL [10], [11]. Aurora implements a declarative streaming query language for the processing system [12]. STREAM uses a CQL-based query system with sliding windows in its architecture [13]. Several join processing solutions have been implemented as continuous data stream management receives increasing attention. The primitive well-known pipelined join solutions are the symmetric hash join [14], XJoin [15] and ripple joins [16]. These joins face the problem of a potentially unbounded join state as the streaming data arrives continuously. Other related work emphasized memory consumption while keeping a check on the required resources by using constraints. These constraints can be time-based, as with the sliding windows approach [7], [8], [17], [18], or value-based, as with k-constraint algorithms [19]. Another approach, using dynamic constraints that help in query optimization, is the application of punctuations [20], [21]. The time-based window join algorithms exploit the time range of tuples as a boundary over the incoming data stream. Tuples that are outside the active window can be removed from the join operation [8], [18]. The K-constraint-exploiting algorithm
[19] exploits clustered data arrival patterns to detect and purge expired data in order to shrink the join state. The patterns are statically specified and hence only characterize restrictive cases of real-world data. This approach is limited in its usability, as the precision of the join result suffers when the actual data fails to obey static constraints such as ordering and clustering [19]. Punctuations [20], [21] are assertions about what can and cannot appear in the remainder of a data stream. Punctuations allow the blocking operators of a query to start processing the input when appropriate to obtain the most accurate results. Punctuations are essentially end-of-stream indicators which provide a bound on the input for a blocking operator.
3 Motivation The current STREAM architecture employs a sliding window approach to bound the data set and apply a constraint on memory usage [7], [8], [22]. An active window, which can be time-based (such as a range of 12 hours) or count-based (over 1000 tuples), is defined within the query manager. The query manager considers the data tuples that are present in the active window for processing. Once a data element moves out of the window range, the element can be dropped and is not considered for further query evaluation. The STREAM system has been successfully implemented for an online bidding website such as eBay [23]. We introduce a running example to explain the possible optimization of the STREAM data manager. In any online auction system, multiple auctions for individual items are open at any given time. For the items opened for bidding, multiple users keep bidding with increasing values. Thus, a continuous stream of bids for each item is received. Also, at any given time, new items are being submitted and new members are registered with the online auction system. Thus, multiple streams are available for real-time processing at any time. In the STREAM data manager, each auction is represented by a tuple in the Auction stream and each bid is represented by a tuple in the Bid stream. The tuples arrive in an order based on timestamp, as required by the system. A common query in such an environment is represented in the example below. This query asks for the number of bids for each unique auction within a period of 24 hours from its opening time. The query plan requires a join operator which joins the Auction stream and the Bid stream to evaluate the result. The group-by operator groups the tuples in the result stream by the item identifier value. The 24-hour window is applied onto the Auction stream using the join operator. However, the use of the sliding window causes concerns, as explained further below. Example of CQL Query for STREAM:
    Select    A.item, Count(*)
    From      Auction [Range 24 hours] A, Bid B
    Where     A.item = B.item
    Group By  A.item
Table 1. Auction Stream (A)

Item   Seller     Bid start price   Timestamp
X      Seller_1   25.00             10:05:00
Y      Seller_2   75.00             10:10:06
Z      Seller_3   50.00             10:12:52

Table 2. Bidding Stream (B)

Item           Bidder     Bidding price   Timestamp
X              Bidder_1   30.00           11:14:23
Y              Bidder_2   90.00           11:33:00
Z              Bidder_3   45.00           14:33:09
<X, *, *, *>   (punctuation)
As there are many auctions within the 24-hour period and the number of bids for each item within this period also increases at a high rate, the join state for both streams grows steadily. The growth of the join state of the streams eventually increases the memory and processing requirements. Each item has a unique item identifier associated with it, as shown in Table 1. If many auctions are opened during this range of 24 hours with a correspondingly large number of bids, the join state becomes very large. Each bid tuple corresponds to exactly one auction tuple, since each auction has a unique identifier. A bid tuple need not be maintained in the system after completion of the join operation, as it is not going to match any future auction tuples. In such a case, we can insert punctuations so that processing for that item is terminated once its bids have been joined to the corresponding Auction tuple. The sliding window also uses additional time and resources for Auction tuples that are open for a duration shorter than the window range of 24 hours. With respect to online auctions, there will be many auctions with a smaller active period (auctions that are open for less than 24 hours). Such auctions can be removed from the state before the end of the window range to save resources and produce faster partial results. A punctuation announcing the end of the join operation can be used to optimize the system in such a case. The use of punctuations allows the STREAM data management system to output results before the end of the window range. In Table 2, the incoming tuple <X, *, *, *> is a punctuation indicating that no further bids for item X will arrive; such punctuations can be inserted to signal the end of the bids for an auction item. The use of punctuations is helpful in the above-mentioned cases; however, the sliding window remains useful for auctions that are open for more than 24 hours. Join results are not produced for tuples that arrive after the window range; future bids for such auctions are simply dropped, as they do not contribute to the required query results. The isolated use of punctuations would overload the system, as there would be countless punctuations to be processed for each ongoing auction. The computational cost of the system would increase,
effectively reducing the system performance. In this paper, we propose the combined use of sliding window and punctuations for efficient use of the resources in such operations.
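To make the combined scheme concrete, the sketch below joins the Auction and Bid streams of the running example with both a 24-hour window and bid-side punctuations. It is a minimal illustration written by us, not the STREAM implementation; the class and method names are our own, and timestamps are plain seconds.

    from collections import defaultdict

    WINDOW = 24 * 3600  # Range 24 hours, in seconds

    class WindowPunctuationJoin:
        """Sketch of joining Auction [Range 24 hours] A with Bid B on A.item = B.item."""

        def __init__(self):
            self.auctions = {}                # item -> auction open timestamp (join state)
            self.bid_counts = defaultdict(int)

        def expire(self, now):
            # Sliding-window constraint: drop auctions older than 24 hours.
            for item, opened in list(self.auctions.items()):
                if now - opened > WINDOW:
                    self.emit(item)

        def on_auction(self, item, opened):
            self.expire(opened)
            self.auctions[item] = opened

        def on_bid(self, item, ts):
            self.expire(ts)
            if item in self.auctions:         # bids outside the window are simply dropped
                self.bid_counts[item] += 1

        def on_punctuation(self, item):
            # Punctuation <item, *, *, *>: no further bids for this item will arrive,
            # so the partial result can be emitted and the state purged early.
            if item in self.auctions:
                self.emit(item)

        def emit(self, item):
            print(item, self.bid_counts.pop(item, 0))
            self.auctions.pop(item, None)

Feeding the punctuation <X, *, *, *> through on_punctuation emits the count for item X and frees its state immediately, while items that never receive a punctuation are still bounded by the window.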
4 Analysis Exploiting punctuations alongside sliding windows in STREAM can optimize the system, as discussed in Section 3. Using punctuations in combination with sliding windows would allow the STREAM data manager to provide faster partial results: the results can be processed once the punctuation is received, so the operator need not wait for the end of the active window. Punctuations also help resolve the issue caused by blocking operators [21]. 4.1 Memory Resource Overhead Memory utilization is an important aspect of any Data Stream Management System, since the incoming data stream is unbounded and would eventually require unlimited memory for query processing. To cope with the high memory requirement, constraints need to be applied to the data stream. The STREAM data manager uses sliding windows to keep the memory requirements in check [7], [8]. With the use of sliding windows, a time-based constraint (defining a time range) or a count-based constraint (defining the number of tuples) is applied to the incoming real-time data stream. The memory requirements of the system are kept in check because the data tuples under consideration for the query are restricted to the active window. However, there are cases where the processing can be done before the end of the active window (as explained in the running example above). In such join operations, partial results can be generated before the end of the active window, freeing up memory resources. Punctuations can be used to produce the partial results by signaling the end of the related data tuples. The use of punctuations effectively reduces memory consumption, as data that no longer needs to be stored can be purged. Fine-grained punctuations may degrade the efficiency of the system and increase processing time. Ding et al. [20] observe that for a punctuation-to-tuple ratio of 15%, the punctuation processing overhead does not affect the overall system throughput. However, the STREAM data manager also needs to block any tuples violating a punctuation, which causes additional computational cost. 4.2 Blocking Operators Combining punctuations with the existing sliding-window approach in STREAM would also prove helpful with blocking operators. A blocking query operator is a query operator that is unable to produce the first tuple of its output until it has seen its entire input; common examples of blocking query operators are Sum, Sort, Count, and Average. Such blocking operators are an issue when used with streaming data, since the incoming streams can be unbounded. The current STREAM system can evaluate the blocking operators once the active window is completed and produce the required results. The inclusion of punctuations would enable the system to produce results
even before the sliding window completes. Punctuations allow the blocking operators of the query to start processing the input stream as soon as possible to obtain the most accurate results. This is useful for real-time applications such as online bidding and financial tickers [3], [23]. 4.3 Join Operators Real-time data stream systems, such as online bidding systems or sensor networks, adhere to semantic constraints that can be used to detect and purge data that is no longer needed in the join state [1], [3]. In streams where data is clustered by unique identifiers, the termination point for each join value is known in advance. This termination point can be used to confine the join operation by inserting punctuations into the data streams. Once a punctuation (signifying the end of a join value) is received in one stream, the corresponding state in the other streams involved in the join can be checked and joining with any future tuples can be prevented. 4.4 Overall Assessment In this paper, we propose the combined use of punctuations and sliding windows to optimize the overall performance of the system. As explained in earlier sections, we can exploit punctuations in the existing system to further enhance its functionality and increase the output processing rate. Using punctuations in a Data Stream Management System requires the system to understand the semantics of the punctuations included in the incoming stream and to take the appropriate actions. Exclusive use of punctuations for bounding the incoming real-time data stream increases the overhead of the system and results in higher computational cost [20], [21]. In addition, it creates a dependency on the input to carry the additional punctuation information, which may not be possible for all data streams. To increase the efficiency of the system and optimize memory utilization, a combination of sliding windows and punctuations would be ideal. The inadequacies of the sliding windows used by STREAM can be mitigated with the use of punctuations, while the computational cost of the combined system remains almost the same as that of the current sliding-window system when no punctuations are received within the active window range.
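As a companion illustration of the blocking-operator case of Section 4.2 (again our own sketch rather than anything taken from STREAM), the fragment below shows how a blocking Count aggregate grouped by a key can emit a group's result as soon as a punctuation closes that group, instead of waiting for the window to end.

    from collections import defaultdict

    class PunctuatedCount:
        """Count(*) grouped by key over an unbounded stream.

        Count is a blocking aggregate: on its own it could only report a group
        once the active window closes.  A punctuation stating that no more
        tuples with a given key will arrive lets it emit that group early.
        """

        def __init__(self):
            self.counts = defaultdict(int)

        def on_tuple(self, key):
            self.counts[key] += 1

        def on_punctuation(self, key):
            # End of the substream for `key`: this group's count is final.
            return (key, self.counts.pop(key)) if key in self.counts else None

        def on_window_close(self):
            # Groups that were never punctuated are reported when the window ends.
            results, self.counts = list(self.counts.items()), defaultdict(int)
            return results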
5 Future Work and Conclusion The STREAM data manager shows optimization possibilities in various aspects, as reviewed in the earlier sections. In this paper, we propose the combined use of punctuations in addition to the existing sliding-window scheme to achieve better system flexibility and increase query processing throughput. Additional research is required to design a scheme for incorporating punctuations into the STREAM data manager: a poorly designed punctuation architecture may reduce the existing efficiency of the STREAM data manager, since punctuations add computational cost.
References 1. Madden, S., Franklin, M.J.: Fjording the stream: an architecture for queries over streaming sensor data. In: 18th International Conference on Data Engineering, pp. 555–566 (2002) 2. Hammad, M.A., Aref, W.G., Elmagarmid, A.K.: Stream window join: tracking moving objects in sensor-network databases. In: 15th International Conference on Scientific and Statistical Database Management, pp. 75–84 (2003) 3. Zhu, Y., Shasha, D.: StatStream: statistical monitoring of thousands of data streams in real time. In: 28th International Conference on Very Large Data Bases, pp. 358–369 (2002) 4. Madden, S., Franklin, M.J., Hellerstein, J.M., Hong, W.: The design of an acquisitional query processor for sensor networks. In: ACM SIGMOD International Conference on Management of Data, pp. 491–502 (2003) 5. Golab, L., Ozsu, M.T.: Issues in data stream management. SIGMOD Rec. 32(2), 5–14 (2003) 6. Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and issues in data stream systems. In: Twenty-First ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 1–16 (2002) 7. Arasu, A., Babcock, B., Babu, S., McAlister, J., Widom, J.: Characterizing memory require-ments for queries over continuous data streams. ACM Transactions on Database Systems (TODS), 162–194 (2004) 8. Motwani, R., Widom, J., Arasu, A., Babcock, B., Babu, S., Datar, M., Manku, G., Olston, C., Rosenstein, J., Varma, R.: Query Processing, Resource Management, and Approximation in a Data Stream Management System. CIDR (2003) 9. Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M.J., Hellerstein, J.M., Hong, W., Krishnamurthy, S., Madden, S.R., Reiss, F., Shah, M.A.: TelegraphCQ: continuous data-flow processing. In: ACM SIGMOD international Conference on Management of Data, pp. 668–668 (2003) 10. Cranor, C., Johnson, T., Spataschek, O., Shkapenyuk, V.: Gigascope: a stream database for network applications. In: ACM SIGMOD International Conference on Management of Data, pp. 647–651 (2003) 11. Cranor, C., Gao, Y., Johnson, T., Spataschek, O., Shkapenyuk, V.: Gigascope: high performance network monitoring with an SQL interface. ACM SIGMOD International Conference on Management of Data, p. 623ff. (2003) 12. Abadi, D., Carney, D., Çetintemel, U., Cherniack, M., Convey, C., Lee, S., Stonebraker, M., Tatbul, N., Zdonik, S.: Aurora: a new model and architecture for data stream management. The VLDB Journal 12, 120–139 (2003) 13. Arasu, A., Babu, S., Widom, J.: The CQL continuous query language: semantic foundations and query execution. The VLDB Journal, 121–142 (2006) 14. Wilschut, N., Apers, P.M.: Dataflow query execution in a parallel main-memory environment. In: First international Conference on Parallel and Distributed information Systems, pp. 68–77 15. Urhan, T., Franklin, M.: XJoin: A reactively scheduled pipelined join operator. IEEE Data Engineering Bulletin, 27–33 (2000) 16. Haas, P.J., Hellerstein, J.M.: Ripple joins for online aggregation. In: ACM SIGMOD international Conference on Management of Data, pp. 287–298 (1999) 17. Golab, L., Özsu, M.T.: Processing sliding window multi-joins in continuous queries over data streams. In: 29th international Conference on Very Large Data Bases. VLDB Endowment, vol. 29, pp. 500–511 (2003)
18. Kang, J., Naughton, J.F., Viglas, S.D.: Evaluating window joins over unbounded streams. In: 19th International Conference on Data Engineering, pp. 341–352 (2003) 19. Babu, S., Srivastava, U., Widom, J.: Exploiting k-constraints to reduce memory overhead in continuous queries over data streams. ACM Trans. Database Syst. 29(3), 545–580 (2004) 20. Ding, L., Rundensteiner, E.A.: Evaluating window joins over punctuated streams. In: Thirteenth ACM International Conference on Information and Knowledge Management, pp. 98–107 (2004) 21. Li, H., Chen, S., Tatemura, J., Agrawal, D., Candan, K.S., Hsiung, W.: Safety guarantee of continuous join queries over punctuated data streams. In: 32nd International Conference on Very Large Data Bases, pp. 19–30 (2006) 22. Motwani, R., Widom, J., Arasu, A., Babcock, B., Babu, S., Srivastava, U., Wang, Z.: STREAM: the Stanford stream data manager, http://infolab.stanford.edu/stream 23. Motwani, R., Widom, J., Arasu, A., Babcock, B., Babu, S., Srivastava, U., Wang, Z.: STREAM: the Stanford stream data manager, http://infolab.stanford.edu/stream/sqr
A Framework for In-House Prediction Markets Miguel Velacso1 and Nenad Jukic2 1
Carlson School of Management, University of Minnesota, 321 Nineteenth Avenue South Minneapolis, MN 55455, USA [email protected] 2 School of Business Administration, Loyola University Chicago, 1 E Pearson Chicago, IL 60611, USA [email protected]
Abstract. In-house prediction markets are a new method for collecting and aggregating information dispersed throughout an organization. This method is capable of accessing and aggregating certain organizational information that has previously not been attainable via traditional methods such as surveys, polls, group meetings, or suggestion boxes. Such information is often of great tactical and/or strategic value. Existing in-house prediction markets, which are either open to all members of the organization or to pre-selected groups of experts within the organization, base a participant's power to influence the market strictly on the amount of his or her assets (usually in mock currency). We propose a more nuanced design approach that considers additional factors for determining a participant's influence on the market over the long term. The goal of this design approach is to improve the accuracy and decision-support viability of in-house prediction markets. Keywords: Prediction Markets, Organizational Information, Decision Support, Crowdsourcing.
1 Introduction Somebody visiting Google’s headquarters during the last four years may have noticed that quite a few employees seem to be sitting at their desks during lunch time trading stocks on their computers. Although e-trading has become quite ubiquitous, one would be jumping to conclusions by assuming that Google employees simply like to trade online. What the visitor is actually seeing is the result of a new method for collecting and aggregating dispersed information, usually called prediction markets. Instead of buying shares, Google employees are trading in stocks representing the “Number of customers that Gmail will have at the end of the fiscal year” or the “Probability of a new competitor in the field of video search appearing in the market this quarter”. Accessing dispersed information within a company to better inform decision making has always been an issue, all the more so since most managers nowadays realize that employees throughout the company’s hierarchy possess information that is hard to reach but may be relevant to decision making [1]. Traditional methods of accessing and aggregating this dispersed information, like surveys, polls, group meetings, suggestion
boxes, etc., have proven to be of value and are well documented by past and current management literature (for a condensed review of how these methods differ from prediction markets, see [10]). The first documented prediction markets [8] took place during the US presidential elections between 1868 and 1940, but prediction markets have only recently gained momentum amongst the managerial community. Pioneered by Siemens in 1997, more and more companies have introduced prediction markets into their corporate culture, Google’s being the largest implementation so far with almost 300 markets and over 80,000 trades since their inception in April 2005 [3]. This renewed interest in prediction markets has been fueled by the success, during the last decade, of experimental prediction markets that have emerged from very different environments. Academia has produced some, such as the Iowa Electronic Markets, operated by faculty at the University of Iowa Henry B. Tippie College of Business. Others, such as Intrade (www.intrade.com) or the Hollywood Stock Exchange (www.hsx.com), have been created by for-profit firms that act as a market platform and allow users to set up markets for trading in countless topics. As James Surowiecki stated in his book The Wisdom of Crowds [11]: ". . . the most mystifying thing about [prediction] markets is how little interest corporate America has shown in them... companies have remained, for the most part, indifferent to this source of potentially excellent information, and have been surprisingly unwilling to improve their decision making by tapping into the collective wisdom of their employees."
2 Brief Overview of Prediction Markets In this section we give a brief overview of prediction markets (for a more detailed description see [13]). In its simple form, a prediction market is just like a market for a single stock, where the price of the stock reflects the probability of an event happening. A common example is a stock that pays $100 if candidate A wins the presidential election and $0 if candidate A loses. A price of $60 reflects the market giving event “candidate A wins the election” a 60% probability of happening.
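As a small numerical illustration of this reading of prices (our own example, independent of any particular market platform):

    def implied_probability(price, payoff=100.0):
        """Market price of a winner-takes-all contract read as a probability."""
        return price / payoff

    def settlement_profit(price, outcome, payoff=100.0):
        """Profit per share at settlement (a True outcome pays `payoff`, False pays 0)."""
        return (payoff if outcome else 0.0) - price

    # A $60 price on a $100 contract corresponds to a 60% consensus probability;
    # buying at $60 yields +$40 if the event happens and -$60 otherwise.
    assert implied_probability(60.0) == 0.6
    assert settlement_profit(60.0, True) == 40.0
    assert settlement_profit(60.0, False) == -60.0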
Fig. 1. Prediction market example - US Presidential Elections 2008
The evolution of this stock in a real prediction market for the 2008 US presidential elections can be seen in Figure 1, candidate A being in this case the now President Obama. While the price of a stock is commonly used in prediction markets, this does not imply that these markets have to be traded in a real currency. Various instantiations of prediction markets use mock currency instead of a real one, and prizes are awarded to top performers. Figure 2 depicts a typical prediction market set up.
[Fig. 2 diagram: the real event and dispersed information motivate the agent's trades in the marketplace; the results of the trades accumulate into a portfolio value, which is used to trade further or to purchase rewards.]
Fig. 2. Typical prediction market set up
Prediction markets elicit and aggregate the dispersed information possessed by the different agents, who disclose this information via trades in the existing prediction markets, or marketplace. At the end of a particular market, the real event determines the outcome of the stock (in the previous example, the outcome of the election determined whether the “Obama” contract paid $100 or $0), which in turn determines the result of the trades of the agents. The accumulated results of the trades and the market value of the stocks an agent may hold at a particular point in time1 constitute the portfolio value that said agent has available for trading or for purchasing rewards. The motivation provided by these rewards constitutes the information-eliciting mechanism of prediction markets. In the 2008 US presidential elections, the predictions at Intrade over the outcomes of each state were 100% accurate, beating any other major prediction method (http://electoralmap.net/index.php). The Hollywood Stock Exchange’s track record for predicting the winners of the 8 top categories at the Oscar awards is also impressive: since its inception in 1999, it has correctly predicted 83% of said awards (http://www.hsx.com/about/press_rel.htm). Studies on the accuracy of prediction markets versus polls and surveys [2], [5] show that prediction markets are at least as accurate, if not clearly superior. It is important to note at this point that, while the information obtained from a prediction market can be used for making predictions, the price of a stock does not as such reflect the probability of an event happening, but the consensus about that probability among the agents trading in said market. The power of prediction markets lies in their information aggregation and eliciting capabilities2. There are elements in the design of prediction markets that make them aggregate information differently and more efficiently than other means. After all, how many employees spend their free time voluntarily and carefully answering polls or surveys?
1 The portfolio value is translated to currency at the end of a market, since the market stops and pays off the stock when the real event takes place.
3 A Framework for In-House Prediction Markets Prediction markets in which real currency is used face a potential problem: the amount of power a given agent has in the market is determined by the agent's currently available monetary wealth. If two agents have the exact same information to disclose to the market via a trade in a particular stock, the agent that has more available money will influence the final price more than the agent that has less. Everything else being equal, a difference in wealth does not seem to be a good reason for a difference in the potential influence each agent has on the price. Although “public” prediction markets have captured more academic and media attention than their “in-house” counterparts, we focus on internal prediction markets deployed by companies for decision-support purposes. While the design change we propose can easily be extended to a public prediction market, we argue that there is a difference in the final objective of the two types of prediction markets that makes our framework a better fit for in-house implementation. Public prediction markets are run by companies, but their objective is not to provide information for decision making but to benefit from the fees the agents pay when they trade. An in-house prediction market, on the other hand, is run by a company to improve decision making, and thus a mechanism directed at improving the aggregation of information is a more direct concern, since the company does not profit from the number and volume of trades, but from the information the agents disclose via those trades. This decision in turn determines that we deal exclusively with mock currency markets, since companies have found that legislation issues prevent them from utilizing real currency [11]. Moreover, a study on the issue of real vs. mock currency [9] showed no statistical difference between the predictions of a sports-based real-money market and a similar mock currency one. Mock currency markets also lend themselves better to the kind of design modification we propose, which tries to acknowledge the existence of experts3 amongst the agents participating in the market, while recognizing that an a priori identification of said experts is impractical, both because of pure implementation problems and because a categorization of expertise is likely to be tainted by whatever particular notions of the concept of expertise the market organizer has. Instead of an a priori classification, we propose a proxy classification of the expertise of the agents, based on previous performance. Prediction markets have a built-in mechanism for rewarding participating agents according to their contribution to the market: the profits/losses an agent obtains from his trades in the market. We propose to adapt this concept for the a posteriori identification of experts.
2 Perhaps a more accurate name for prediction markets would be “information markets”, by which they are also known, but the term has not stuck with the community so far.
3 For the purpose of this paper, we define an expert as an agent who has correct information and discloses it to the market by participating in trades.
4 Proposed Framework Principles A literature review of several real-world implementations of prediction markets in large companies [3], [4], [6], [7] shows that while some companies (e.g., Google) have open in-house markets, where all employees have access to the prediction markets, others (e.g., Intel, GE) restrict access to the market to a previously defined set of agents, deemed by the company to be experts on the issue traded. In our solution, an a priori identification of experts is substituted by a mechanism that assigns higher weight to agents who recurrently contribute good information to the system, so our design acts as an open in-house market. By giving higher weight to experts while maintaining an open market in which all potential agents can participate, our design is positioned to elicit more relevant information than a preselected-experts design and to aggregate information better than an open design in which every agent's opinion is considered to have the same weight (i.e., an average of the agents' opinions). At the end of a market we find the other potential point of improvement. In most real-world implementations, the amount of currency the participating agents have at the end of a determined period of time is taken into account for rewarding expert agents for their contribution, be it by allowing them to redeem their mock currency for prizes, awarding a prize to the agent with the most earnings, and/or displaying the agents' results for social recognition. The amount each agent has available to trade at the end of a market is then either reset, so all agents start again with the same potential capability of influencing the market, or agents are free to carry their historical performance over consecutive markets, thus gaining a capability of influencing the market that is directly proportional to the relative value of their portfolio. The amount of currency an agent has available for trading can be equated with the amount of power the agent has to influence the current market price, which we now formally term “market power”. In the documented implementations, this market power is the same as the value of the agent's portfolio at a given time: if an agent doubles his portfolio's value by a successful series of trades, his market power becomes twice what it was before. While this equal variation of power and portfolio value can be accepted from the point of view of making it easy for the agent to link his performance with the rewards, we propose to separate the concepts of market power and the amount available for purchasing rewards4 (which we now term “incentive power”). The incentive power consists of the benefits/losses the agent obtains from normal trading in the market.
4 Or social recognition by means of a ranking, or to determine the top agent(s) for a series of prizes.
Separating market power from incentive power serves two different functions. First, it allows us to implement short-term rewards while maintaining a market power based on long-term performance. By separating incentive power from market power, agents can still have access to rewards without having to “sacrifice” market power by exchanging it for prizes. Second, it allows us to set a different market power for each agent. We argue that while the variation of an agent's market power should be linked to the agent's performance, it should not be a direct 1-to-1 proportion (as is the case in current implementations of prediction markets that are linked to agents' performance). While a 1-to-1 proportion seems adequate for determining the agent's incentive power in a simple, intuitive way, we propose that there are factors that permit further qualification of an information contribution, depending on the particular circumstances in which the agent contributed to the system. The argument about the characteristics of actual prediction markets and our proposed solution is summarized in Table 1. We address the problem of whether to open the market to preselected experts or to all potential agents by giving higher weight to experts while maintaining an open market so all potential agents can participate.

Table 1. Comparison of the current in-house prediction markets and the proposed framework

Problem                         Solutions in the cases reviewed            Our solution
Agent selection                 - Preselected experts                      - Open to all in-house agents
                                - Open to all in-house agents                (but different weights)
Amount available for trading    - Equal to historical performance          - Linked to performance but
                                - Same initial amount for every market       moderated by other factors
Amount available for rewards    - Equal to performance                     - Equal to performance
Current prediction market designs either reset the agents' amount available for trading at the end of each market, so that all agents have the same amount of currency, or allow agents to carry their gains over to the next market in a 1-to-1 proportion. Our framework requires assigning higher weight to experts' opinions, and it does so by linking that weight to recurrent performance in previous prediction markets; a reset of the amount available for trading at the end of each market, which translates into the same initial weight for all agents in the upcoming markets, is therefore not a viable way to implement it. A marketplace in which the agent's market power is determined directly by the agent's historical performance is closer to our solution, although we argue that while market power should be linked to historical performance, it should be moderated and not in a 1-to-1 proportion. Our proposed model extension and refinement is expressed in Figure 3, which complements and modifies the typical design seen in Figure 2.
[Fig. 3 diagram: as in Fig. 2, the real event and dispersed information drive the agent's trades in the marketplace; the results of the trades now accumulate both into the incentive power (used to purchase rewards) and into historical market information, which a weight assignment mechanism uses to determine each agent's market power (used to trade).]
Fig. 3. Proposed prediction market set up
As we can see in Figure 2, currently the results of the trades of an agent are used to determine both the portfolio value (the amount available for trading, which we term market power) and the amount available for rewards (which we term incentive power). The portfolio value is either reset at the end of a determined period of time, or agents are allowed to carry the portfolio value over to new markets. Our solution takes a middle approach. In it, every agent starts with the same market power (e.g. 100 points). After a market or series of parallel markets has been run, we allow the agents to exchange their final points for rewards, but instead of having every agent start at the same market power level in the subsequent round of prediction markets, we assign a different market power to each agent based on the weight assignment mechanism that assigns more power to agents with higher level of expertise.
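The paper leaves the concrete weight assignment mechanism open; the sketch below is one hypothetical instantiation, shown only to make the separation of the two powers concrete. The damping factor, the floor on market power, and the field names are all our own assumptions, not part of the proposed framework itself.

    def update_powers(agents, damping=0.5, base_power=100.0):
        """Hypothetical weight-assignment step run between prediction markets.

        Each agent dict holds `profit` (this round's trading result, in mock
        currency), `incentive` (points redeemable for rewards) and `power`
        (amount available for trading in the next round).
        """
        for a in agents:
            # Incentive power mirrors performance 1-to-1, so rewards stay intuitive.
            a["incentive"] += a["profit"]
            # Market power is linked to performance but moderated: only a fraction
            # of the relative gain/loss carries over, and power never drops to zero.
            relative = a["profit"] / base_power
            a["power"] = max(base_power * 0.1, a["power"] * (1.0 + damping * relative))
            a["profit"] = 0.0
        return agents

    agents = [{"profit": 40.0, "incentive": 0.0, "power": 100.0},
              {"profit": -20.0, "incentive": 0.0, "power": 100.0}]
    update_powers(agents)   # first agent's power grows to 120, second shrinks to 90

Under such a rule an agent who earns 40 points in a round keeps those 40 points as incentive power but sees market power rise only to 120, so sustained good performance is needed before any single agent can dominate the market.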
5 Conclusions and Future Work Existing in-house prediction markets base a participant's power to influence the market strictly on the amount of his or her assets (usually in mock currency). We propose a more nuanced design approach that considers additional factors for determining a participant's influence on the market over the long term. The goal of this design approach is to improve the accuracy and decision-support viability of in-house prediction markets. In our future work we will develop a working prototype prediction market system based on the proposed framework. We are currently developing several different weight assignment mechanisms (using different algorithms) for testing in our prototype. The goal of the prototype is to demonstrate the proposed framework's ability to improve the quality of the input of a prediction market (the prediction of an agent moderated by his market power) by deriving the market power as a function of the agent's past performance and the market circumstances under which that performance was achieved.
References 1. Berg, J.E., Rietz, T.A.: Prediction Markets as Decision Support Systems. Information Systems Frontiers 5(1), 79–93 (2003) 2. Chen, Y., Chu, C., Mullen, T., Pennock, D.M.: Information Markets vs. Opinion Polls: An Empirical Comparison. In: Proceedings of the 6th ACM Conference on Electronic Commerce, EC ’05, Vancouver, BC, Canada, June 2005, pp. 58–67. ACM, New York (2005) 3. Cowgill, B., Wolfers, J., Zitzewitz, E.: Using Prediction Markets to Track Information Flows: Evidence from Google, Working Paper (2008), http://bocowgill.com/GooglePredictionMarketPaper.pdf 4. Dye, R.: The Promise of Prediction Markets: A Roundtable. The McKinsey Quarterly, pp. 83–93 (2008) 5. Gruca, T.S., Berg, J.E., Cipriano, M.: Consensus and Differences of Opinion in Electronic Prediction Markets. Electronic Markets 15(1), 13–22 (2005) 6. Hopman, J.W.: Using Forecasting Markets to Manage Demand Risk. Intel Technology Journal 11(2) (2007) 7. LaComb, C.A., Barnett, J.A., Pan, Q.: The Imagination Market. Information Systems Frontier 9(2-3), 245–256 (2007) 8. Rhode, P., Strumpf, K.: Historical Prediction Markets: Wagering on Presidential Elections, UNCC Hill, NUNCC Hill. Journal of Economic Perspectives 18(2) (2004) 9. Servan-Schreiber, Wolfers, Pennock, Galebach: Prediction Markets: Does Money Matter? Electronic Markets 14(3) (2004) 10. Sunstein, C.R.: Infotopia: How Many Minds Produce Knowledge. Oxford University Press, Oxford (2006) 11. Surowiecki, J.: The Wisdom of Crowds: Why The Many Are Smarter Than The Few and How Collective Wisdom Shapes Business, Economies, Societies, and Nations, 1st edn. Doubleday, New York (2004) 12. Surowiecki, J.: Wagers of Sin. The New Yorker (September 25, 2006) 13. Wolfers, J., Zitzewitz, E.: Prediction Markets. The Journal of Economic Perspectives 18(2), 107–126 (2004)
Road Region Extraction Based on Motion Information and Seeded Region Growing for Foreground Detection Hongwu Qin, Jasni Mohamad Zain, Xiuqin Ma, and Tao Hai Faculty of Computer Systems and Software Engineering University Malaysia Pahang Kuantan, 26300, Malaysia [email protected], [email protected], [email protected], [email protected]
Abstract. This paper proposes a road region extraction method based on the motion information of foreground objects and the seeded region growing (SRG) algorithm. By learning from a training set of a scene over a period of time, we obtain the trajectory of a moving object and then use the SRG algorithm, with the trajectory as the seed, to extract the road region. As a result, instead of detecting foreground objects in a conventional pixel-by-pixel manner, detection can be performed mainly on or near the pixels of the road region, which facilitates and accelerates foreground detection. In addition, the regions outside the road region do not need to be transmitted most of the time in visual communication. Experimental results demonstrate the accuracy and usefulness of the proposed method. Keywords: Foreground detection, seeded region growing, region extraction, visual surveillance.
1 Introduction Foreground detection in a video sequence is a fundamental and critical task in many vision applications, including automated visual surveillance, traffic monitoring and analysis, and human-machine interfaces. In a computer vision application, foreground detection is usually followed by several other tasks: object tracking, object classification, and activity recognition. An efficient foreground detection algorithm is therefore needed in order to run the entire vision application in real time. Many foreground detection methods have been proposed in the past decade. A widely used approach to foreground detection is background subtraction, where each video frame is compared against a background model. Pixels in the current frame that deviate significantly from the background model are considered to be foreground objects [1]. However, both the simplest background model of a temporally averaged image and the more complex nonparametric [2], [3] and parametric [4], [5] approaches are time-consuming, because every pixel of every frame has to be processed. Can we process only a part of the pixels in one frame? In fact, processing a part of the pixels is enough to accomplish the task in many vision applications. An example is shown in Fig. 1(a): an automated visual surveillance system is placed on an expressway for traffic analysis; the image captured by the camera includes sky, road, grass, and trees. Obviously, it is not necessary to process the pixels in the sky
region and most of the pixels in the grass and tree regions; we only need to extract the road region and process the pixels in or near it. From Fig. 1(b), we can estimate that this will save almost 2/3 of the processing time. Therefore, once we find the road areas, we can concentrate our processing on the road region and so improve the efficiency of foreground detection.
[Fig. 1 residue: (a) photograph of the example outdoor scene; (b) pie chart of the percentage of the image occupied by the road, sky, grass, and tree regions (the four regions occupy 44%, 38%, 10%, and 8% of the image).]
Fig. 1. (a) An example of outdoor scene in surveillance system; (b) Percentage of different region in the scene image
In this paper, we propose a method that integrates the motion information of foreground objects and the SRG algorithm to extract the road region in outdoor automated visual surveillance systems. By learning from a training set of a scene over a period of time, we obtain the trajectories of moving objects and then use the SRG algorithm, with the trajectories as seeds, to extract the road region. In our experiment, we use the motion trajectories of vehicles, because they are more reliable than those of humans for locating the road. Our work is based on two assumptions: the first is that outdoor surveillance systems are mainly used to monitor humans and vehicles, so the scenes contain roads with high probability; the other is that when a moving vehicle appears in the scene, it must be moving on a road. Both assumptions are reasonable for an outdoor surveillance system. The extraction of the road region has another advantage: in visual communication, the regions outside the road region do not need to be encoded and transmitted most of the time. The rest of the paper is organized as follows. Section 2 presents the method for capturing the motion trajectory. Section 3 describes the SRG algorithm. Section 4 provides experimental results and discussion. Conclusions are drawn in Section 5.
2 Motion Trajectory Extraction To obtain the motion trajectories of moving objects, we learn from a training set of a scene over a period of time. This procedure includes three steps: detection, classification, and tracking. It is possible that some moving background objects, like waving trees, exist in the scene. Therefore, we use a Gaussian mixture model (GMM) [4] as the background model for foreground detection, and then classify the objects into two types: humans and vehicles. After that, a Kalman filter is used to track the objects. Finally, we select part of the foreground pixels as the motion trajectory. Each pixel in the scene is modeled by a mixture of K Gaussian distributions. The probability of observing the current pixel value can be written as
P(X_t) = \sum_{i=1}^{K} \omega_{i,t} \, \eta(X_t; \mu_{i,t}, \Sigma_{i,t})     (1)

where \omega_{i,t} is the weight parameter, \mu_{i,t} is the mean value, and \Sigma_{i,t} = \sigma_i^2 I is the covariance matrix of the i-th Gaussian component in the mixture at time t, and where \eta is the Gaussian probability density function

\eta(X_t; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \, e^{-\frac{1}{2}(X_t - \mu_t)^T \Sigma^{-1} (X_t - \mu_t)}     (2)
First, the Gaussians are ordered by the value of \omega / \sigma, and then the first B distributions are chosen as the background model, where

B = \arg\min_{b} \Big( \sum_{i=1}^{b} \omega_i > T \Big)     (3)
The threshold T is the minimum prior probability that the background is in the scene. Background subtraction is performed by marking as a foreground pixel any pixel that is more than 2.5 standard deviations away from any of the B distributions [10]. After background subtraction and denoising, we obtain the masks of the moving objects. The most obvious types of targets of interest are humans and vehicles. For this reason, a dispersedness model [9] is used to classify humans and vehicles. The dispersedness D_k is defined as

D_k = \frac{P_k^2}{S_k}     (4)
where P_k is the perimeter and S_k is the area of the object. Clearly, a human, with its more complex shape, will have a larger dispersedness than a vehicle. When an object appears with D_k < \lambda, the object is classified as a vehicle, otherwise as a human; \lambda is a threshold obtained by training. Then, we match each vehicle frame by frame, i.e., we track it. Here we build a Kalman filter for every vehicle for tracking; details about the Kalman filter can be found in [6]. During tracking we select a part of the pixels of the vehicle as the motion trajectory. We cannot be sure that all pixels of the vehicle are inside the road region; however, the bottom-most pixels of the vehicle's mask must be inside the road area, so we select these pixels to form the trajectory. Finally, we obtain the trajectory of the vehicle by tracking it for a period of time. Fig. 3(d) illustrates the trajectory of a vehicle moving on the road.
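A rough sketch of this detection-and-classification step is given below. It approximates the paper's per-pixel Gaussian mixture (Eqs. (1)-(3)) with OpenCV's MOG2 background subtractor and applies the dispersedness rule of Eq. (4); the threshold values and helper names are illustrative, not those used in the experiments.

    import cv2

    # LAMBDA is the dispersedness threshold of Eq. (4); its value is illustrative.
    LAMBDA = 35.0

    # A Gaussian-mixture background model per pixel, close in spirit to Eqs. (1)-(3).
    subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

    def classify_objects(frame):
        """Return (label, contour) pairs, with label in {'vehicle', 'human'}."""
        mask = subtractor.apply(frame)
        mask = cv2.medianBlur(mask, 5)                      # simple denoising
        # OpenCV 4.x returns (contours, hierarchy)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        results = []
        for c in contours:
            area = cv2.contourArea(c)
            if area < 100:                                  # ignore tiny blobs
                continue
            perimeter = cv2.arcLength(c, True)
            dispersedness = perimeter ** 2 / area           # D_k = P_k^2 / S_k
            label = "vehicle" if dispersedness < LAMBDA else "human"
            results.append((label, c))
        return results

For a contour labeled as a vehicle, the bottom-most pixels of its mask would then be appended to the trajectory, as described above.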
3 Seeded Region Growing SRG [7] performs a segmentation of an image that starts with assigned seeds and grows regions by merging a pixel into its nearest neighboring seed region. The seeds can be selected either manually or automatically [8]. The algorithm is suitable for segmenting
images in which the individual objects or regions are characterized by connected pixels of similar value [7]. In general, the pixels in road areas have similar values; therefore, we use the SRG algorithm to extract the road areas. Instead of segmenting the image into many regions using a number of seeds, as in the conventional SRG algorithm, here we only need to segment one region, namely the road region, so one seed region is enough. We use the trajectory of the vehicle obtained previously as the seed region. The algorithm for implementing SRG is as follows:

    Put the seed points in set R.
    Calculate the mean of R.
    Calculate ξ of every seed point with respect to the mean.
    Put all seed points in List in increasing order of ξ.
    While List is not empty:
        Remove the first point y from List.
        Test the eight neighbors of y:
            If the neighbor point is not already in R:
                Calculate ξ of the neighbor point with respect to the mean of R.
                If ξ > M:
                    Put the neighbor point in set B.
                Else:
                    Put the neighbor point in R.
                    Put the point in List according to ξ.
                    Update the mean of the pixels in R.

Here R denotes the road region, B denotes the boundary, M is a threshold, and ξ is a measure of the distance between a pixel and the region it adjoins. The definition of ξ is

ξ = \sum_{i=1}^{N} ξ_i     (5)

ξ_i = | v_i − mean_{y \in R}[v_i(y)] |     (6)

where v_i is the color value of the pixel. If we implement SRG in the RGB color space, then i ∈ {R, G, B} and v_i denotes one of the R, G, and B components. List denotes a sequentially sorted list, similar to the SSL in [7]. The RGB color space is suitable for color display but is not good for color analysis because of the high correlation among the R, G, and B components. In the YUV [11] color space, however, the intensity and chromatic components can be controlled easily and independently. When different parts of the road have different illumination, the YUV model can eliminate the influence of the varying illumination. Fig. 2 illustrates the difference between RGB and YUV in measuring the similarity of pixels on roads. At first glance, it seems that all pixels on the road have the same color value. In fact there are differences among them; in particular, the area inside the red rectangle in Fig. 2(a) has
a distinctly different intensity from other parts of the road due to the occlusion of buildings. We select a pixel, marked by a blue point in Fig. 2(a), and try to find the pixels with values similar to it in both the RGB and YUV color spaces. Fig. 2(b) is the result in RGB space with the same threshold of 20 in the R, G, and B components. Fig. 2(c) is the result in YUV space with a threshold of 60 in the Y component and 6 in the U and V components. It is obvious that more similar pixels are included in YUV space, so the YUV model is more suitable for finding similar regions.

Y = 0.299 R + 0.587 G + 0.114 B     (7)
U = −0.14713 R − 0.28886 G + 0.436 B     (8)
V = 0.615 R − 0.51499 G − 0.10001 B     (9)
Fig. 2. The difference between RGB and YUV in measuring the similarity of pixels on roads. (a) the region enclosed by the red rectangle has different illumination from the rest of the road region; (b) red pixels are similar to the blue point in (a) in RGB space; (c) red pixels are similar to the blue point in YUV space.
4 Experiment Results In order to evaluate the accuracy and usefulness of our road extraction method, two types of experiments were performed: (Ⅰ) comparison with the result of manual segmentation; (Ⅱ) foreground detection performed pixel by pixel and foreground detection performed only in the road region, with a comparison of their performance. The experiments were run on a PC with an Intel Pentium D 3.0 GHz processor. The test images are from the PETS2001 dataset; the image resolution is 320×240 pixels. Fig. 3 illustrates the results of manual segmentation and of our method, respectively. To measure the accuracy of our method, we define the accuracy 0 ≤ a ≤ 1 as

a = \frac{P \cap Q}{P \cup Q}     (10)

where P denotes the segmentation result obtained by our method and Q denotes the manual segmentation result. The accuracy is 1 only if P is identical to Q. The accuracy on the test dataset is a = 95%. From the result, we can see that only a few small regions are not extracted. These regions can be filled by simple image processing techniques
such as dilation and opening operations, in which case almost the entire region can be extracted. This experimental result demonstrates the accuracy of the proposed method. We conducted foreground detection pixel by pixel and foreground detection only in the road region using three different algorithms: temporal difference (TD), a background model made adaptive with a simple infinite impulse response filter (BIIF), and the Gaussian background model (GB). Considering that part of the projection of an object moving on the road may fall outside the road region, the boundary of the road region should be extended outward by some pixels when the foreground detection algorithms are run; here we extended it by 25 pixels. Fig. 4 illustrates the ratio of the computation time of each algorithm performed in the two manners. From the result, we can see that the proposed scheme saves about 64% of the computation time on average. This experimental result demonstrates the usefulness of the proposed method.
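For completeness, the accuracy measure of Eq. (10) amounts to the intersection-over-union of the two binary road masks and can be computed directly; the function name below is our own.

    import numpy as np

    def region_accuracy(extracted, manual):
        """Eq. (10): a = (P ∩ Q) / (P ∪ Q) for two boolean road-region masks."""
        p, q = extracted.astype(bool), manual.astype(bool)
        union = np.logical_or(p, q).sum()
        return np.logical_and(p, q).sum() / union if union else 1.0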
Fig. 3. (a) the original scene without any foreground objects; (b) and (c) two frames of the training image sequence which include a car; in (d), the trajectory of the car is marked by a red line; (e) red pixels show the manual segmentation mask of the road region; (f) red pixels show the mask of the road region obtained by the proposed algorithm.
Fig. 4. Ratio of computation time of each algorithm performing in two manners
5 Conclusion In this paper we present a road region extraction method based on the vehicle motion trajectory and the seeded region growing algorithm. By learning a scene over a period of time, we obtain the motion trajectory of a vehicle and then run the SRG algorithm, which uses the pixels on the trajectory as seeds, to extract the road region. The method can be used as a preprocessing step in an automated visual surveillance system to facilitate and accelerate foreground object detection, tracking, and analysis. Experimental results demonstrate the accuracy and usefulness of the proposed method. Acknowledgments. Special thanks go to my supervisor, Prof. Dr. Jasni Mohamad Zain, who guided me through this research and inspired and motivated me. We thank University Malaysia Pahang for supporting this research.
References 1. Cheung, S.-C., Kamath, C.: Robust Techniques for Background Subtraction in Urban Traffic Video. In: Panchanathan, S., Vasudev, B. (eds.) Proc. Elect. Imaging: Visual Comm. Image Proce. 2004 (Part One) SPIE, vol. 5308, pp. 881–892 (2004) 2. Kim, K., Chalidabhongse, T.H., Harwood, D., Davis, L.: Background Modeling and Subtraction by Codebook Construction. In: Proc. IEEE Int’l Conf. Image Processing, vol. 5, pp. 3061–3064 (2004) 3. Sheikh, Y., Shah, M.: Bayesian object detection in dynamic scenes. In: CVPR, vol. (1), pp. 74–79 (2005) 4. Stauffer, C., Grimson, W.E.L.: Adaptive Background Mixture Models for Real-Time Tracking. In: Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 246–252 (1999) 5. Wren, C., Azarbaygaui, A., Darrell, T., Pentland, A.: Pfinder: realtime tracking of the human body. IEEE Trans. Pattern Anal. Machine Intell. 19, 780–785 (1997) 6. Koller, D., Weber, J., Huang, T., Malik, J., Ogasawara, G., Rao, B., Russel, S.: Towards robust automatic traffic scene analysis in real-time. In: Proc. of the International Conference on Pattern Recognition, Israel (November 1994) 7. Adams, R., Bischof, L.: Seeded region growing. IEEE Transactions on Pattern Analysis and Machine Intelligence 16(6), 641–647 (1994) 8. Fan, J., Yau, D.K.Y., Elmagarmid, A.K., Aref, W.G.: Automatic image segmentation by integrating color-edge extraction and seeded region growing. IEEE Transactions on Image Processing 10(10), 1454–1466 (2001) 9. Lipton, A.J., Fujiyoshi, H., Patil, R.S.: Moving target classification and tracking from real-time video. In: Proc. of IEEE Workshop on Applications of Computer Vision, pp. 8–14 (1998) 10. KaewTraKulPong, P., Bowden, R.: An Improved Adaptive Background Mixture Model for Real-Time Tracking with Shadow Detection. In: Proc. European Workshop Advanced Video Based Surveillance Systems (2001) 11. http://en.wikipedia.org/wiki/YUV
Process Mining Approach to Promote Business Intelligence in Iranian Detectives` Police Mehdi Ghazanfari, Mohammad Fathian, Mostafa Jafari, and Saeed Rouhani Department of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran [email protected]
Abstract. Most business processes leave their ‘‘footprints’’ in transactional information systems, i.e., business process events are recorded in so-called event logs by enterprise systems. These logs can be used as a lead in police investigations. The goal of process mining is to provide techniques and tools for discovering process, control, data, organizational, and social structures from event logs; its basic idea is to diagnose business processes, which can extend the knowledge of the Detectives` Police in computer crimes. In this paper we focus on the potential use of process mining techniques to enable the Iranian Detectives` Police to discover the grounds of a crime. This application offers a new view on police investigation. The paper explains how process mining can assist in monitoring enterprise systems. Keywords: Business Intelligence, Evaluation, Process mining, Detective police.
1 Introduction Nowadays many systems have some kind of event log, often referred to as ‘‘history’’, ‘‘audit trail’’, ‘‘transaction log’’, etc. [2–5]. The event log typically contains information about events referring to an activity and a case. The case (also named process instance) is the ‘‘request’’ which is being handled, e.g., a customer order, a job application, an insurance claim, a building permit, etc. Typically, events have a timestamp indicating the time of occurrence. Moreover, when people are involved, event logs will characteristically contain information about the person who executed or initiated the event, i.e., the performer. Besides the availability of event logs, there is an increasing interest in monitoring business processes, and meanwhile there is constant pressure to improve the performance and efficiency of business processes [6, 7]. This requires more fine-grained monitoring facilities like those illustrated by today’s buzzwords such as business activity monitoring (BAM), business operations management (BOM), and business process intelligence (BPI). Business process mining, or process mining for short, aims at the automatic construction of models explaining the behavior observed in the event log. For example, based on some event log, one can construct a process model expressed in terms of a Petri net. Over the last couple of years many tools and techniques for process mining have been developed [2, 3, 5, 8].
Therefore, it is important to consider new techniques, tools and algorithms in this field: transactional systems, i.e., ERP, SCM and CRM systems, will be implemented in all businesses, so process mining will be a key to knowledge discovery in police investigations. The remainder of this paper is organized as follows. Section 2 briefly introduces the Detectives` Police, Section 3 introduces the concept of business process mining, and Section 4 describes the proposed approach for the Iranian Detectives` Police. Section 5 presents the results of mining the process with the proposed approach on a sample case, and Section 6 concludes the paper.
2 Police Detectives Roles Police detectives are responsible for investigations and detective work. Detectives may be called Investigations Police, Judiciary/Judicial Police, and Criminal Police. Detectives typically make up roughly 15%-25% of a police service's personnel. In some cases, police are assigned to work "undercover", where they conceal their police identity to investigate crimes, such as organized crime or narcotics crime, that are unsolvable by other means. In some cases this type of policing shares aspects with espionage [11]. Detectives, by contrast, usually investigate crimes after they have occurred and after patrol officers have responded first to a situation. Investigations often take weeks or months to complete, during which time detectives spend much of their time away from the streets, in interviews and courtrooms, for example. Rotating officers also promotes cross-training in a wider variety of skills, and serves to prevent "cliques" that can contribute to corruption or other unethical behavior [12].
3 Process Mining: An Overview The goal of process mining is to extract information about processes from transaction logs [2]. We assume that it is possible to record events such that (i) each event refers to an activity (i.e., a well-defined step in the process), (ii) each event refers to a case (i.e., a process instance), (iii) each event can have a performer also referred to as originator (the person executing or initiating the activity), and (iv) events have a timestamp and are totally ordered. Table 1 shows an example of a log involving 19 events, five activities, and six originators. In addition to the information shown in this table, some event logs contain more information on the case itself, i.e., data elements referring to properties of the case. Event logs such as the one shown in Table 1 are used as the starting point for mining. We distinguish three different perspectives: (1) the process perspective (‘‘How?’’), (2) the organizational perspective (‘‘who?’’), and (3) the case perspective (‘‘what?’’) [1]. The process perspective focuses on the control flow, i.e., the ordering of activities. The goal of mining this perspective is to find a good characterization of all possible paths, e.g., expressed in terms of a Petri net [9] or event-driven process chain (EPC)
[10]. The organizational perspective focuses on the originator field, i.e., which performers are involved and how are they related. The goal is to either structure the organization by classifying people in terms of roles and organizational units or to show relations between individual performers. The case perspective focuses on properties of cases. Cases can be characterized by their path in the process or by the originators working on a case. However, cases can also be characterized by the values of the corresponding data elements. For example, if a case represents a replenishment order, it may be interesting to know the supplier or the number of products ordered. To illustrate the first two perspectives consider Fig. 1. The log shown in Table 1 contains information about five cases (i.e., process instances). The log shows that for four cases (1–4) the activities A–D have been executed. For the fifth case only three activities are executed: activities A, E, and D. each case starts with the execution of A and ends with the execution of D. If activity B is executed, then also activity C is executed. However, for some cases activity C is executed before activity B. Based on the information shown in Table 1 and by making some assumptions about the completeness of the log (i.e., assuming that the cases are representative and a sufficient large subset of possible behaviors is observed), we can deduce the process model shown in Fig. 1(a). The model is represented in terms of a Petri net [9]. The Petri net starts with activity A and finishes with activity D. These activities are represented by transitions. After executing A there is a choice between both executing B and C in parallel or just executing activity E. Note that for this example we assume that two activities are in parallel if they appear in any order. By distinguishing between start events and complete events for activities it is possible to explicitly detect true parallelism, i.e., concurrent execution of tasks. Fig. 1(a) does not show any information about the organization, i.e., it does not use any information on the people executing activities. However, Table 1 shows information about the performers. For example, we can deduce that activity A is executed by either Saeed or Reyhaneh, activity B is executed by Saeed, Reyhaneh, Ali, or Mehdi, C is executed by Saeed, Reyhaneh, Ali, or Mehdi, D is executed by Fatemeh or Amir, and E is executed by Amir. We could indicate this information in Fig. 1(a). The information could also be used to ‘‘guess’’ or ‘‘discover’’ organizational structures. For example, a guess could be that there are three roles: X, Y, and Z. For the execution of A role X is required and Saeed and Reyhaneh have this role. For the execution of B and C role Y is required and Saeed, Reyhaneh, Ali, and Mehdi have this role. For the execution of D and E role Z is required and Fatemeh and Amir have this role. For five cases these choices may seem arbitrary but for larger data sets such inferences capture the dominant roles in an organization. The resulting ‘‘activity–Role–performer diagram’’ is shown in Fig. 1(b). The three ‘‘discovered’’ roles link activities to performers. Fig. 1(c) shows another view on the organization based on the transfer of work from one individual to another, i.e., not focusing on the relation between the process and individuals but on relations among individuals (or groups of individuals). Consider, for example, Table 1. 
Although Mehdi and Ali can execute the same activities (B and C), Ali is always working with Saeed (cases 1 and 2) and Mehdi is always working with Reyhaneh (cases 3 and 4). Probably Mehdi and Ali have the same role but based on the small sample
shown in Table 1 it seems that Saeed is not working with Mehdi and Reyhaneh is not working with Ali. These examples show that the event log can be used to derive relations between performers of activities, thus resulting in a sociogram. For example, it is possible to generate a sociogram based on the transfers of work from one individual to another, as is shown in Fig. 1(c). Each node represents one of the six performers and each arc represents that there has been a transfer of work from one individual to another. There is a "transfer of work from A to B" if, for the same case, an activity executed by A is directly followed by an activity executed by B. For example, both in cases 1 and 2 there is a transfer from Saeed to Ali. Fig. 1(c) does not show frequencies. However, for analysis purposes these frequencies can be added. The arc from Saeed to Ali would then have weight 2. (Typically, we do not use absolute frequencies but weighted frequencies to get relative values between 0 and 1.) Fig. 1(c) shows that work is transferred to Fatemeh but not vice versa. Ali only interacts with Saeed and Mehdi only interacts with Reyhaneh. Amir is the only person transferring work to herself.
Besides the "How?" and "Who?" questions (i.e., the process and organization perspectives), there is the case perspective, which is concerned with the "What?" question. Fig. 1 does not address this. In fact, focusing on the case perspective is most interesting when data elements are also logged; however, these are not listed in Table 1.
Table 1. An event log
Case id   Activity id   Originator   Timestamp
Case 1    Activity A    Saeed        9-3-2007:15.01
Case 2    Activity A    Saeed        9-3-2007:15.12
Case 3    Activity A    Reyhaneh     9-3-2007:16.03
Case 3    Activity B    Mehdi        9-3-2007:16.07
Case 1    Activity B    Ali          9-3-2007:18.25
Case 1    Activity C    Saeed        10-3-2007:9.23
Case 2    Activity C    Ali          10-3-2007:10.34
Case 4    Activity A    Reyhaneh     10-3-2007:10.35
Case 2    Activity B    Saeed        10-3-2007:12.34
Case 2    Activity D    Fatemeh      10-3-2007:12.50
Case 5    Activity A    Reyhaneh     10-3-2007:13.05
Case 4    Activity C    Mehdi        11-3-2007:10.12
Case 1    Activity D    Fatemeh      11-3-2007:10.14
Case 3    Activity C    Reyhaneh     11-3-2007:10.44
Case 3    Activity D    Fatemeh      11-3-2007:11.03
Case 4    Activity B    Reyhaneh     14-3-2007:11.18
Case 5    Activity E    Amir         17-3-2007:12.22
Case 5    Activity D    Amir         18-3-2007:14.34
Case 4    Activity D    Fatemeh      19-3-2007:15.56
Fig. 1. Some mining results for the process perspective (a) and organizational (b and c) perspective
The case perspective looks at the case as a whole and tries to establish relations between the various properties of a case. Note that some of the properties may refer to the activities being executed, the performers working on the case, and the values of various data elements linked to the case. Using clustering algorithms it would, for example, be possible to show a positive correlation between the size of an order or its handling time and the involvement of specific people. Orthogonal to the three perspectives (process, organization, and case), the result of a mining effort may refer to logical issues and/or performance issues. For example, process mining can focus on the logical structure of the process model (e.g., the Petri net shown in Fig. 1(a)) or on performance issues such as flow time. For mining the organizational perspective, the emphasis can be on the roles or the social network (cf. Fig. 1(b) and (c)) or on the utilization of performers or execution frequencies.
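To make the "transfer of work" idea above concrete, the following is a minimal sketch (not the authors' implementation) that derives handover-of-work counts, the basis of a sociogram such as Fig. 1(c), from an event log laid out like Table 1. Only part of the log is typed in for brevity, and the data layout is an assumption.

```python
# Count handovers of work: originator x directly followed by originator y in the same case.
from collections import defaultdict

# (case, activity, originator) triples, already ordered by timestamp within each case
# (a subset of Table 1 for brevity).
log = [
    ("Case 1", "A", "Saeed"), ("Case 1", "B", "Ali"),
    ("Case 1", "C", "Saeed"), ("Case 1", "D", "Fatemeh"),
    ("Case 2", "A", "Saeed"), ("Case 2", "C", "Ali"),
    ("Case 2", "B", "Saeed"), ("Case 2", "D", "Fatemeh"),
    ("Case 5", "A", "Reyhaneh"), ("Case 5", "E", "Amir"), ("Case 5", "D", "Amir"),
]

def handover_counts(events):
    """Return absolute transfer-of-work counts; weighted (relative) frequencies can be derived from these."""
    by_case = defaultdict(list)
    for case, _activity, originator in events:
        by_case[case].append(originator)
    counts = defaultdict(int)
    for performers in by_case.values():
        for a, b in zip(performers, performers[1:]):
            counts[(a, b)] += 1
    return counts

if __name__ == "__main__":
    for (a, b), n in sorted(handover_counts(log).items()):
        print(f"{a} -> {b}: {n}")
    # Saeed -> Ali appears twice (cases 1 and 2), matching the weight-2 arc discussed above.
```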
4 Proposed Approach
To apply process mining techniques in the police investigation field, the event log data is converted into transactions T, which correspond to the cases of a process mining problem. Based on the above discussion, the proposed approach for finding interesting process models is schematically illustrated in the flow chart in Fig. 2. The proposed approach is described as follows (a sketch of the conversion in Step 1 is given after the list):
Step 1. Convert the suspected systems' event log to transaction data.
Step 2. Input the data for process mining.
Step 3. Select the best process model with regard to the criteria.
Step 4. Form activity chains and show the process model.
Step 5. Diagnose crime patterns (by detectives).
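As a rough illustration of Step 1, the sketch below converts an event log into one transaction per case with columns task1 to task7, the format assumed in the example of Section 5. The field and file names are hypothetical; the authors used Clementine rather than hand-written code.

```python
# Turn an event log into one transaction per case (columns task1..taskN),
# ordered chronologically, ready for an association-rule tool.
import csv
from collections import defaultdict

def log_to_transactions(events, max_tasks=7):
    """events: iterable of dicts with 'case', 'activity', 'timestamp' keys.
    Timestamps are assumed to be datetime objects or ISO strings so sorting is chronological."""
    by_case = defaultdict(list)
    for e in events:
        by_case[e["case"]].append((e["timestamp"], e["activity"]))
    rows = []
    for case, items in by_case.items():
        ordered = [activity for _, activity in sorted(items)]
        row = {"case": case}
        for i in range(max_tasks):
            row[f"task{i + 1}"] = ordered[i] if i < len(ordered) else ""
        rows.append(row)
    return rows

def write_transactions(rows, path="transactions.csv", max_tasks=7):
    fields = ["case"] + [f"task{i + 1}" for i in range(max_tasks)]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```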
Fig. 2. Flow chart of the proposed approach
5 Illustrative Example
An example of event log data from the order process of an official ERP system is used to illustrate the proposed approach presented in Section 4. The event log is first loaded in transactional format into Clementine (a data mining tool). This example log is illustrated in Fig. 3. For process mining, the Clementine stream shown in Fig. 4 was designed to derive, filter, and prepare the data from the source so that the algorithm can be applied to it.
Fig. 3. Log Example
In following the proposed approach, the algorithm is run six times because there are seven tasks; to find chains we need the relations between consecutive tasks. Using the algorithm, the association rules between task1 & task2, task2 & task3, task3 & task4, task4 & task5, task5 & task6, and task6 & task7 are mined with specified thresholds. After finding the rules that satisfy the criteria and forming chains, the qualifying rules are selected. The selected rules for the sample case are presented in Fig. 5.
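Since Clementine is a proprietary tool, the following tool-agnostic sketch only illustrates the idea of mining rules between consecutive task columns with support and confidence thresholds; the parameter names and threshold values are assumptions, not the authors' settings.

```python
# Mine simple association rules of the form task_i = a -> task_{i+1} = b.
from collections import Counter

def consecutive_task_rules(transactions, n_tasks=7, min_support=0.2, min_confidence=0.6):
    """transactions: list of dicts with keys 'task1'..'taskN' (empty string if absent)."""
    total = len(transactions)
    rules = []
    for i in range(1, n_tasks):
        pair_counts = Counter()
        left_counts = Counter()
        for t in transactions:
            a, b = t.get(f"task{i}", ""), t.get(f"task{i + 1}", "")
            if a:
                left_counts[a] += 1
            if a and b:
                pair_counts[(a, b)] += 1
        for (a, b), n in pair_counts.items():
            support = n / total
            confidence = n / left_counts[a]
            if support >= min_support and confidence >= min_confidence:
                rules.append((f"task{i}={a}", f"task{i + 1}={b}", support, confidence))
    return rules

# Rules that share antecedents/consequents across adjacent task positions can then
# be linked into the activity chains described in Step 4.
```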
Fig. 4. Design of the process mining stream in Clementine
In the final step of the approach, the association rules produced by the streams are sorted. In Fig. 5, rules that form one chain are colored alike. Finally, by drawing these chains, the process model is produced; the mined process model resulting from mining the example event log can then be illustrated. By applying the proposed approach, a data mining technique such as an association rule algorithm can be tested in this family of process mining research. The increasing use of intelligent tools in systems such as ERP, SCM, and CRM increases the need for process mining applications in detectives' police investigations, so designing new approaches that support knowledge discovery in transactional systems and the business area can be useful.
Fig. 5. Explored Association Rules
6 Conclusion
It is important to consider new techniques, tools, and algorithms when approaching the field of police investigation, because transactional systems, i.e., ERP, SCM, and CRM, are being implemented in organizations worldwide, so process mining will become a key knowledge discovery technique for police detectives. In this paper we focused on the potential use of process mining algorithms in the police investigation field. This application is formed as an approach that introduces new criteria into police investigation research. The major part of this approach is looking for the chains that will be formed. An example log was used to illustrate how process mining with multiple criteria can discover knowledge. Applying other process mining algorithms, comparing them, and designing and implementing tools based on this approach are recommended for future research.
References [1] Van der Aalst, W.M.P., Van Hee, K.M.: Workflow Management: Models, Methods, and Systems. MIT Press, Cambridge (2002) [2] Van der Aalst, W.M.P., Van Dongen, B.F., Herbst, J., Maruster, L., Schimm, G., Weijters, A.J.M.M.: Workflow mining: a survey of issues and approaches. Data Knowl. Eng. 47(2), 237–267 (2003) [3] Agrawal, R., Gunopulos, D., Leymann, F.: Mining process models from workflow logs. In: Sixth International Conference on Extending Database Technology, pp. 469–483 (1998) [4] Grigori, D., Casati, F., Dayal, U., Shan, M.C.: Improving business process quality through exception understanding, prediction, and prevention. In: Apers, P., Atzeni, P., Ceri, S., Paraboschi, S., Ramamohanarao, K., Snodgrass, R. (eds.) Proceedings of the 27th International Conference on Very Large Data Bases (VLDB’01), pp. 159–168. Morgan Kaufmann, Los Alamitos (2001) [5] Sayal, M., Casati, F., Dayal, U., Shan, M.C.: Business process cockpit. In: Proceedings of 28th International Conference on Very Large Data Bases (VLDB’02), pp. 880–883. Morgan Kaufmann, Los Alamitos (2002) [6] Ghazanfari, M., Rouhani, S., Jafari, M., Taghavifard, M.: ERP Requirements for supporting management decisions & Business Intelligence. The Icfai University Journal of Information Technology V(3) (2009) [7] Rouhani, S., Ghazanfari, M.: Ranking Explored Association Rules with ANP. In: The first national Data mining Conference (IDMC), Amirkabir University of technology, Tehran (2007) [8] Van der Aalst, W.M.P., Weijters, A.J.M.M., Maruster, L.: Workflow Mining: Which Processes can be Rediscovered? BETA Working Paper Series, WP, vol. 74. Eindhoven University of Technology, Eindhoven (2002) [9] Reisig, W., Rozenberg, G. (eds.): APN 1998. LNCS, vol. 1491. Springer, Heidelberg (1998) [10] Keller, G., Teufel, T.: SAP R/3 Process Oriented Implementation. Addison-Wesley, Reading (1998) [11] Ratcliffe, J.: Strategic Thinking in Criminal Intelligence. The Federation Press, Annadale (2007) [12] Sheptycki, J.: High Policing in the Security Control Society. Policing: a Journal of Policy and Practice 1(1), 70–79 (2007) doi:10.1093/police/pam005
Copyright Protection of Relational Database Systems
Ali Al-Haj 1, Ashraf Odeh 2, and Shadi Masadeh 3
1 Princess Sumaya University for Technology, Amman, Jordan, [email protected]
2 Royal Scientific Society, Amman, Jordan, [email protected]
3 Al-Isra Private University, Amman, Jordan, [email protected]
Abstract. Due to the increasing use of databases in many real-life applications, database watermarking has been suggested lately as a vital technique for copyright protection of databases. In this paper, we propose an efficient database watermarking algorithm based on inserting a binary image watermark in Arabic character attributes. Experimental results demonstrate the robustness of the proposed algorithm against common database attacks. Keywords: Relational databases, copyright protection, watermarking.
1 Introduction
A considerable amount of research has been done on watermarking multimedia data [1,2,3]; however, there has been relatively little research on watermarking database systems [4,5,6]. Database watermarking is different from database security, which focuses on issues such as access control techniques and data security rather than on securing proof of rights over relational data. This is why database watermarking has been suggested lately as a vital technique for copyright protection of databases. The increasing use of databases in many real-life applications is creating an ever increasing need for watermarking databases. The following are examples where database watermarking might be of crucial importance [7,8,9,10]:
• Protecting rights over outsourced relational databases is of ever increasing interest, especially in areas where sensitive, valuable data is to be outsourced. A good example is a data mining application, where data is sold in pieces to parties specialized in mining it. Given the nature of the data, it is hard to associate the rights of its originator with it. Watermarking can be used to solve this issue.
• There are companies specialized in compiling large numbers of semiconductor parts into databases. Such companies license these databases at high prices to design engineers, so they need a methodology to verify their ownership of their databases in cases where design engineers may manipulate the databases and claim their ownership.
• The internet is exerting tremendous pressure on those data providers to create web services that allow users to search and access databases remotely.
While this trend is a boon to end users, it is exposing the data providers to the threat of data theft. They are therefore demanding capabilities for identifying pirated copies of their data. Watermarking databases has unique requirements that differ from those required for watermarking digital audio or video products. Such requirements include: maintaining watermarked database usability, preserving database semantics, preserving database structure, watermarked database robustness, blind watermark extraction, and incremental updatability, among many other requirements. In this paper we describe a database watermarking algorithm for Arabic character attributes. The remainder of this paper is organized as follows. In section 2 the watermarking algorithm is described in detail. In section 3 the robustness of the proposed algorithm is evaluated against common database attacks. Finally, concluding remarks are given in section 4.
2 Proposed Time Attribute-Based Watermarking
Many Arabic characters are expandable, and thus watermark bits can be hidden in their extensions without sacrificing the readability and appearance of the character. For example, (one) can be written (واحد) or in the extended form (واحـــد). Both have the same meaning, but the second can carry binary information. In the algorithm, a binary image is used to watermark the relational database. The bits of the image are segmented into short binary strings that are encoded in non-numeric, multi-word attributes of selected tuples of the database. The embedding process of each short string is based on expanding the first character of a word whose location is determined by the decimal equivalent of the short string. Extraction of a short string is done by locating the word in which one of its characters was expanded. The image watermark is then constructed by converting the decimals into binary strings.
A major advantage of this extension-based watermarking is the large bit-capacity available for hiding the watermark. This facilitates embedding large watermarks or multiple small watermarks. This is in contrast to bit-based algorithms where watermark bits have limited potential locations that can be used to hide bits without being subjected to removal or destruction. The proposed algorithm has two procedures: a watermark embedding procedure and a watermark extraction procedure. The two procedures are described in the following sub-sections.
2.1 Watermark Embedding Procedure
The watermark embedding procedure consists of the following operational steps:
Step 1: Arrange the watermark image into m strings, each of n bits length.
Step 2: Divide the database logically into sub-sets, each having m tuples.
Step 3: Embed the m short strings of the watermark image into each m-tuple sub-set.
Step 4: Embed the n-bit binary string in the corresponding tuple of a sub-set as follows:
• Find the decimal equivalent of the string. Let the decimal equivalent be d.
• Embed the decimal number d in a pre-selected non-numeric, multi-word attribute by expanding the first expandable character of the d-th word of the attribute.
Step 5: Repeat step 4 for each tuple in the subset.
Step 6: Repeat steps 4 and 5 for each subset of the database under watermarking.
The watermark is a 3 x 3 binary image. Each of the three 3-bit binary strings is transformed into its decimal equivalent as shown in Figure 1(a), and embedded in the 3-tuple sub-set, as shown in Figure 1(b). The count of the word with the red-colored character-extension (-) indicates the decimal equivalent of the embedded short binary string. Extension is performed on the first expandable character of the word.
Fig. 1. (a). Binary image watermark, and (b). its decimal equivalent vector
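As a rough sketch of the embedding steps above (the helper names, the set of "expandable" characters, and the use of the Arabic tatweel character are assumptions, not the authors' code):

```python
# Embed a decimal value d by stretching the first expandable character of the
# d-th word of a multi-word attribute with a kashida (tatweel).
KASHIDA = "\u0640"  # Arabic tatweel, used here as the character "extension"
# Characters assumed expandable for this sketch; the real set depends on script/font rules.
EXPANDABLE = set("بتثجحخسشصضطظعغفقكلمنهي")

def embed_decimal(attribute_value: str, d: int) -> str:
    """Expand the first expandable character of the d-th word (1-based) of the attribute."""
    words = attribute_value.split()
    if d < 1 or d > len(words):
        return attribute_value  # attribute too short to carry this value
    word = words[d - 1]
    for i, ch in enumerate(word):
        if ch in EXPANDABLE:
            words[d - 1] = word[: i + 1] + KASHIDA + word[i + 1 :]
            break
    return " ".join(words)

def bits_to_decimals(bits: str, n: int):
    """Split the watermark bit string into n-bit chunks and return their decimal values."""
    return [int(bits[i : i + n], 2) for i in range(0, len(bits), n)]

# Example: a 3x3 binary image "101011001" -> bits_to_decimals(..., 3) == [5, 3, 1];
# each decimal is embedded into one tuple of a 3-tuple sub-set.
```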
An illustration of the embedding procedure is given in Figures 1 and 2. The binary image watermark is transformed into its decimal equivalent vector as shown in Figure 1, and a snapshot of the relational database after embedding the watermark throughout the database is shown in Fig. 2. The tuples in the figure constitute the records of the database, and the A's are the watermarked non-numeric, multi-word attributes of the tuples.
2.2 Watermark Extraction Procedure
The watermark extraction procedure is blind. It requires neither the knowledge of the original un-watermarked database nor the watermark itself. This property is critical as it allows the watermark to be detected in a copy of the database relation, irrespective of later updates to the original relation. The watermark extraction procedure is a direct reversal of the watermark embedding procedure, as described in the following steps:
Step 1: Locate the tuples of each sub-set in the database.
Step 2: Locate the non-numeric multi-word attribute of each tuple in the sub-set.
Step 3: In the selected attribute:
• Find the word which has one of its characters expanded.
• Count the position of that word starting from the beginning.
• Convert the decimal equivalent of the count into a binary string.
Step 4: Repeat steps 2 and 3 for all tuples of the sub-set.
Step 5: Construct the watermark by putting the extracted strings together into an m x n image.
Step 6: Repeat steps 1 through 5 to extract all copies of the embedded watermark.
Fig. 2. A snapshot of the watermarked database
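A matching sketch of the extraction side, under the same assumptions as the embedding sketch:

```python
# Recover the decimal from the marked word and convert it back to an n-bit string.
KASHIDA = "\u0640"

def extract_decimal(attribute_value: str) -> int:
    """Return the 1-based position of the word carrying a kashida, or 0 if none is found."""
    for pos, word in enumerate(attribute_value.split(), start=1):
        if KASHIDA in word:
            return pos
    return 0

def decimal_to_bits(d: int, n: int) -> str:
    return format(d, f"0{n}b")

# Example: if the 5th word was expanded, extract_decimal(...) == 5 and
# decimal_to_bits(5, 3) == "101", giving one row of the recovered watermark image.
```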
3 Performance Evaluation
The database watermarking algorithm described in this paper has been evaluated and tested on an experimental database that we constructed. The database consists of 1200 tuples and runs under the Oracle platform. We concentrated our performance evaluation on the robustness of the proposed algorithm, since database watermarking algorithms must be developed in such a way that it is difficult for an adversary to remove or alter the watermark beyond detection without destroying the value of the object. In particular, the database watermarking algorithm should make the watermarked database robust against the following types of attacks: subset deletion attack, subset addition attack, subset alteration attack, and finally subset selection attack.
(1). Subset Deletion Attack: In this type of attack, the attacker may delete a subset of the tuples of the watermarked database hoping that the watermark will be removed. The graph shown in Fig. 3 indicates that the watermark will be removed only if most of the database tuples are deleted; even the removal of 95% of the database will not remove the watermark. This is due to the fact that the proposed algorithm embeds the same watermark everywhere in the database, making this type of attack ineffective.
Fig. 3. Robustness results due to the 'subset deletion attack'
(2). Subset Addition Attack: In this type of attack, the attacker adds a set of tuples to the original database. This type of attack has little impact on the watermark embedded through our algorithm. The graph shown in Fig. 4 indicates that the watermark will never be removed even if the added tuples are as many as the original tuples. That is, the added tuples simply will not carry the watermark information.
Fig. 4. Robustness results due to the 'subset addition attack'
(3). Subset Alteration Attack: In this type of attack, the attacker alters the tuples of the database through operations such as linear transformations. The attacker hopes by doing so to remove the watermark from the database. The graph shown in
Fig. 5 indicates that the watermark will remain in the watermarked database even if 90 % of the tuples of the database were altered. This is due to the fact that the proposed algorithm embeds the same watermark everywhere in the database, making this type of attack ineffective.
Fig. 5. Robustness results due to the 'subset alteration attack'
(4). Subset Selection Attack: In this type of attack, the attacker randomly selects a subset of the original database that might still provide value for its intended purpose. The attacker hopes by doing so that the selected subset will not contain the watermark. However, since the proposed algorithm embeds the watermark in the whole database, this attack has little impact. The graph shown in Fig. 6 indicates that the watermark will remain in the watermarked database even if the attacker selects a subset as small as 10% of the original database. That is, no matter how small the subset the attacker selects, the watermark will remain in the selected subset and thus maintain the required copyright protection.
Fig. 6. Robustness results due to the 'subset selection attack'
Finally, we computed the embedding and extraction times of the Arabic-character attribute-based watermarking algorithm as a function of the size of the database (number of tuples used for watermarking). Furthermore, to show the time performance of the algorithm compared with other reported database watermarking algorithms, we applied the algorithms of [11,12] to the same experimental database. The results are shown in Fig. 7 for the embedding time and in Fig. 8 for the extraction time. As seen in the figures, our algorithm takes more time than the other two algorithms.
Fig. 7. Embedding time of the Arabic-character attribute WM algorithm compared with two other algorithms: the Statistical WM (Sion et al. 2003) and the Bit-based WM (Huang et al. 2004)
Fig. 8. Extraction time of the Arabic-character attribute WM algorithm compared with two other algorithms: the Statistical WM (Sion et al. 2003) and the Bit-based WM (Huang et al. 2004)
4 Conclusions
In this paper, we described a watermarking algorithm based on hiding watermark bits in the extensions of expandable Arabic characters of non-numeric, multi-word attributes of subsets of tuples. A major advantage of this approach is the large bit-capacity available to hide large watermarks. This is in contrast to other proposed algorithms, where watermark bits have limited potential bit-locations that can be used to hide them effectively without being subjected to removal or destruction. The robustness of the proposed algorithm was verified against a number of database attacks such as subset deletion, subset addition, subset alteration, and subset selection attacks.
References 1. Potdar, V., Han, S., Chang, E.: A Survey of Digital Image Watermarking Techniques. In: Proceedings of the IEEE International Conference on Industrial Informatics, pp. 709–716 (2005) 2. Langelaar, G., Setyawan, I.: Watermarking Digital Image and Video Data. IEEE Signal Processing Magazine 17, 20–43 (2000) 3. Arnold, M.: Audio Watermarking: Features, applications and Algorithms. In: Proc. of the 5th IEEE International Conference on Computer and Multimedia and Expo., pp. 1013– 1016 (2000) 4. Agrawal, R., Hass, P., Kiernan, J.: Watermarking relational data: framework, algorithms and analysis. The VLDB Journal The International Journal on Very Large Data Bases 12(3), 157–169 (2003) 5. Lee, Y., Swarup, V., Jajodia, S.: Fingerprinting Relational Databases: Schemes and Specialties. IEEE Trans. Dependable and secure Computing 2(1), 34–45 (2005) 6. Agrawal, R., Kiernan, J.: Watermarking Relational Databases. In: Proc. of the 28th International Conference on Very Large Databases, Hong Kong, August 2002, pp. 946–950 (2002) 7. Zhang, Z., Jin, X., Wang, J., Li, D.: Watermarking Relational Database Using Image. In: Proc. of the 2004 International Conference on Machine Learning and Cybernetics, China, August 2004, pp. 1739–1744 (2004) 8. Hildebrandt, E., Saake, G.: User Authentication in Multi-database systems. In: The 9th International Workshop on Database and Expert Systems Applications, Vienna, Austria (1998) 9. SIIA: Database Protection: Making the case for a new federal database protection law (2000), http://www.siia.net/sharedcontent/gove/issues/ip/dbbrief.html 10. Zhang, Z., Jin, X., Wang, J., Li, D.: A Robust Watermarking Scheme for Relational Data. In: Proc. of the 13th Workshop on Information Technology and Engineering, December 2003, pp. 195–200 (2003) 11. Huang, M., Cao, J., Peng, Z., Fang, Y.: A new watermark mechanism for relational data. In: Proc. of the 4th International Conference on Computer and Technology, China, September 2004, pp. 946–950 (2004) 12. Sion, R., Atallah, M., Prabhakar, S.: Rights Protection for Relational Data. IEEE Trans. Knowledge and Data Engineering 16(12), 1509–1525 (2004)
Resolving Semantic Interoperability Challenges in XML Schema Matching
Chiw Yi Lee, Hamidah Ibrahim, Mohamed Othman, and Razali Yaakob
Faculty of Computer Science, Universiti Putra Malaysia, 43300 Serdang, Selangor, Malaysia
[email protected], [email protected], [email protected], [email protected]
Abstract. XML (Extensible Markup Language) Schema is a text-based description of an XML document which provides the valid format of an XML dataset. Nowadays, XML has become the most popular method of encoding structured data for information exchange. The layering technology in XML allows it to be used independently of any standardized representation and to interoperate disparate Information Systems (ISs) in exchanging valuable information. However, these distributed ISs are often structured in multiple data formats and with different semantic meanings. The heterogeneous and distributed nature of ISs has given rise to the problem of semantic conflicts. This paper proposes rules, focusing on word similarity and word relatedness using WordNet, that not only reconcile semantic conflicts but also provide integrated access to XML data by mapping dispersed electronic data that is semantically matched. Although the proposed approach is rather generic, this paper focuses only on one domain, the medical domain.
Keywords: Semantic conflicts, Semantic interoperability, Information exchange.
1 Introduction
Obtaining high quality information has become in recent years a challenging task, as data should be gathered and filtered from a large, open and frequently changing network of distributed data sources, with blurred semantics and no central control over the data sources' structure and availability [2]. Thus, XML (Extensible Markup Language) meta data was developed to cater for this need. It is a simple but powerful markup language that consists of a set of rules for encoding information electronically. It also enables the use of XSL to build user interfaces to present and edit the underlying data [5]. XML Schema (www.w3.org/XML/schema.html) is a document type definition (DTD) successor that expresses shared vocabularies and provides a guide for characterizing an XML document's structure, content, and semantics [12]. In order to maintain the validity and well-formedness of the information being encoded, every XML document needs to reference a text-based schema description which contains
the structure of the content, the semantic meanings, and the constrained data type of each element inside the XML document. Besides, this description also includes identification information such as source, format, date, and user-defined annotations that are tailored precisely to describe their respective components, without subscribing to a standardized vocabulary of element names. This means every set of these self-describing XML meta data is domain specific but still human and machine readable.
Due to its independence from any standardized representation, XML has become the dominant standard for representing information over the World Wide Web. It allows information to be exchanged across distributed ISs, and this has recently been a key research field in many domains. The widespread adoption of electronic records all around the world has led to the exponential growth of digitized information. However, this distributed and independently developed information is stored using various proprietary data types, standards, terminologies, data architectures, communication methods, and semantics that may contain conflicting data. The distributed schema sets normally have their own data structures, constraints, and formats, and do not conform to any well-known rigid structure such as the relational model. These loosely coupled sets of schemas make schema matching an interesting but complicated problem.
Semantic conflict is a conflict that occurs (i) between two semantically different concepts, or (ii) through the use of different terms to describe the same concept in different databases. It is the main conflict that causes semantic interoperability failure in electronic information exchange nowadays. Data across constituent databases may be related but differ in terms of name, structure, and meaning. Identifying semantically related data in various databases and resolving the schematic differences among such data is a heavy burden that falls on every enterprise, which must manually convert between data formats, resolve conflicts, integrate data, and interpret results in order to make visible use of this information. Thus, resolving semantic interoperability challenges in electronic data provides a uniform method for query translation and heterogeneity resolution, reducing the burden on enterprises of being familiar with all contents and structures of the information sources. Manually constructing schema matchings across distributed XML schemas would be a tedious and time-consuming task. Therefore, this paper aims to resolve semantic interoperability challenges in the medical domain by automating patient XML schema matching to improve the efficiency of electronic patient data exchange across heterogeneous data sources over the internet. The proposed rules not only maintain the semantic interoperability of information exchange, but also improve the correctness and reliability of data exchange over the internet.
The paper is organized as follows: Section 2 provides background on previous approaches that have addressed the semantic issues in integrating patient data or maintaining the interoperability of patient information exchange. Section 3 presents the semantic interoperability challenges, which
are referred to as semantic conflicts, and briefly describes the types of semantic conflicts. Section 4 presents the rules to reconcile semantic conflicts and the attribute-match algorithm we propose. Section 5 shows the experimental results, and Section 6 discusses future work and concludes this research.
2 Previous Approaches Several papers had discussed and reviewed the semantic issues in integrating patient data in order to achieve data interoperability whenever information exchange is needed. There are plenty of approaches being proposed to resolve the incompatibles in electronic data exchange. Some approaches are being used specifically in healthcare domain while some are general in used. Some of these solutions use differing standards and data architectures to reconcile semantic conflicts that may prove to be the greatest obstacle to semantic operability [3]. In [13], a semantic approach which employed Object-Relationship-Attribute Model for Semi-Structured Data (ORA-SS) is proposed to resolve structural conflicts in XML schema integration process. The ORA-SS model distinguishes between objects, relationships and attributes and outputs an integrated schema. Meanwhile, [14] presented an XML schema integration framework that is capable in integrating simultaneously n schemas by detecting and resolving possible conflicts that occur between the XML schemas. The integration process is divided into 5 main steps: schema cleaning and union, normal form conversion, conflict resolution, schema merging, and reconstruction. It employed XDD as its underlying model to represent the common ontology, general rule, and resolution guidelines required by integration process. Archetypes are computable expression of a domain content model of medical records [7]. This paper defined data interoperability as the ability to transfer data to and use data in any conforming system such that the original semantics of the data are retained irrespective of its point of access and therefore, it claimed that standardization data is critical in accurate information exchange. It is proposed as a method for modeling clinical concepts which conform to the openEHR Reference Model (RM) and mainly focused on the issue of ambiguity of the intended meaning of data in both the data models and terminology systems. The expression is in the form of structured constraint statements, inherited from the RM and the intended purpose of archetypes is to empower clinicians to define the content, semantics, and data entry interfaces of systems independently from the information system [1]. In [8], a methodology used to perform data mapping function in data standardization and a Model Standardization using Terminology (MoST) system are developed to test the methodology using openRHR Archetype Models and SNOMED. Context and non-context methods with lexical and semantic techniques are employed to find matches and the most appropriate matches resulting from filtering process are presented to the modeler. In [9], papers related to patient data exchange and published between years of 1995-2005 are reviewed by examining the approaches used to integrate patient medical data from heterogeneous data sources. The result showed that HL7 is the most common used messaging standard. HL7 (http://www.hl7.org) is a not-for-profit
154
C.Y. Lee et al.
volunteer organization that develops specifications; the most widely used messaging standard that enables disparate healthcare applications to exchange key set of clinical and administrative data. It provides standards for interoperability that improve care delivery, optimize workflow, reduce ambiguity, and enhance knowledge transfer among healthcare providers, government agencies, the vendor community, fellow Standard Developing Organizations (SDOs), and patients. A semantic mediation of exchanging messages in improving the interoperability among healthcare information systems is proposed in [11]. Exchanged messages are transformed into OWL (Web Ontology Language) ontology instances and then mediated through an ontology mapping tool called OWLmt that uses OWL-QL engine to reason over the source ontology instances while generating the target ontology instances according to the mapping patterns defined through a GUI. In Japan, [4] designed a prototype model that utilized HL7 Clinical Document Architecture (CDA) with the Japanese local standard in order to develop a referral letter system for electronic exchange of clinical information. The model expressed the corresponding concepts as standard data items in HL7 CDA and found that most of the content defined in the referral module in Medical Markup Language (MML) could be represented in the HL7 CDA model in order to provide worldwide standard to meet local clinical needs. A composite approach for automated source-to-target mapping that contains both direct and indirect schema mappings is presented in [6]. Three basic techniques are applied to compare source and target schema elements: (1) terminological relationships (e.g., synonyms and hypernyms), (2) data-value characteristics (e.g., string length), and (3) domain-specific, regular-expression matches (i.e. the appearance of expected strings). However, one possible constraint in this approach is the need to construct comprehensive domain ontology for the application domain.
3 Semantic Interoperability Challenges: Semantic Conflicts
Figure 1 shows the diagram which classifies semantic conflict into: i) data level conflicts and ii) schema level conflicts. This paper focuses only on schema level conflicts and leaves data level conflicts as future work. Semantic conflict at the schema level refers only to the meta data of the XML schema, such as element name, element data type, identifier, and constraints on the data format or range of the data; the attached XML document data is not taken into consideration. There are 6 common types of schema level conflicts, which are: naming conflicts, entity identifier conflicts, schema isomorphism conflicts, generalization conflicts, aggregation conflicts, and schematic discrepancies.
Naming Conflicts
It is a conflict where two similar or identical concepts are represented by two totally different names, or perhaps by two almost alike names. It also occurs when two different concepts are named with almost alike or totally identical names.
SEMANTIC CONFLICT
Data Level Conflicts: Data Value Conflicts, Data Representation Conflicts, Data Unit Conflicts, Data Precision Conflicts, Known Data Value Reliability Conflicts, Spatial Domain Conflicts
Schema Level Conflicts: Naming Conflicts, Entity Identifier Conflicts, Schema Isomorphism Conflicts, Generalization Conflicts, Aggregation Conflicts, Schematic Discrepancies
Fig. 1. Diagram of the Types of Semantic Conflict
Example: Two identical concepts which have different name such as: sex and gender. However, sometimes two almost alike names would have different meaning such as: mail_address and email_address. Entity Identifier Conflicts It is a conflict where two sets of semantically identical XML schema are using different identifiers (primary key). Elements in XML schema are structured in text based and all of them are user-defined for human readable. Therefore, different XML schema might have different identifiers. Example: Assume that elements in XML Schema Book1 and Book2 are extracted and shown in the following format: Book1(ISDN, Title, Code) might have ISDN as its identifier while Book2(ISDN, Title, Code) might use Code as its identifier. Although both sets of elements are similar in every attribute, it is impossible to say that both identifiers: Book1.ISDN is identical to Book2.Code. Schema Isomorphism Conflicts It is a conflict where two sets of semantically identical XML schema consist of different number of elements. It is possible that concept A that only consists of two elements EA1 and EA2 is identical to concept B which consists of 4 elements EB1, EB2, EB3, and EB4. Example: Assume that elements in XML schema Address1 and Address2 are extracted and shown in the following format: Address1(street, town, city, country)
consists of 4 elements, while Address2(street, city, country) only consists of 3 elements. However, both schemas Address1 and Address2 are still considered semantically identical to each other. Generalization Conflicts It is a conflict where the first XML schema is being inherited by the second schema. It might have some elements that are being inherited by another schema which is referred to as subconcept. The inheritance relationship of sub- and superconcept gives rise to a hierarchy. Example: Assume that elements in XML schema student, male_student, and female_student are extracted and shown in the following format: student(id, name, sex), male_student(id, name), and female_student(id, name). It is obvious that schema student is the superconcept of both male_student and female_student. Aggregation Conflicts It is a conflict where the first XML schema is an inheritance concept of the second schema. It might inherit the same elements or behaviors of another schema which is referred to as superconcept. The inheritance relationship of sub- and superconcept also gives rise to a hierarchy as stated above. Example: Assume that elements in XML schema Address1, Address2, and street are extracted and shown in the following format: Address1(street, city, country), Address2(street, city, country), and street(street1, street2). It is obvious that both schemas Address1 and Address2 are semantically identical to each other even though schema Address2 is aggregated by schema street. Schematic Discrepancies It is a conflict where data in one database are corresponding to meta data in another. However, this conflict is not considered in this paper. Example: Assume that elements in XML schema Collection and Item are extracted and shown in the following format: Collection(cd, book, journal), and Item(ISBN, type). In this case, schema Item might contain a tuple of data with Item.type = “book” that is corresponding to the metadata (element) in schema Collection.
4 Resolving Semantic Conflicts
Due to their independence from any standardized representation, elements in an XML schema are written in a hierarchical view for users' convenience in adding new elements and deleting unneeded ones. Before dealing with the conflicting elements, it is wise to extract only validated elements from the XML schema. Figure 2(i) shows an example of a simple XML schema, while Figure 2(ii) depicts the elements extracted from it. Assume that there is a source XML schema s and a target XML schema t. Schema s consists of an element set S(attrs1, attrs2, …, attrsn), while schema t consists of an element set T(attrt1, attrt2, …, attrtm). Each of them consists of a set of elements that might conflict with each other. We propose five rules to resolve the semantic conflicts stated in Section 3; these are presented below.
<xsd:element name="PMS_Patient_DEMOGRAPHIC"> <xsd:complexType> <xsd:element name="PATIENT_NAME" type="xsd:string"/> <xsd:element name="ID_NO" type="xsd:string"/> <xsd:element name="CURR_CITY" type="xsd:string"/> <xsd:element name="PATIENT_CATEGORY" type="xsd:string"/> <xsd:element name="DATE_OF_BIRTH" type="xsd:date"/> <xsd:element name="GENDER" type="xsd:string"/> <xsd:element name="RACE" type="xsd:string"/> <xsd:element name="RELIGION" type="xsd:string"/> <xsd:element name="NATIONALITY" type="xsd:string"/> <xsd:element name="MARITAL_STATUS" type="xsd:string"/> <xsd:element name="MATRIC_NO" type="xsd:string"/> <xsd:element name="MODE_OF_PAYMENT" type="xsd:string"/>
2(i)
|---- PMS_Patient_DEMOGRAPHIC |---- PATIENT_NAME |---- ID_NO |---- CURR_CITY |---- PATIENT_CATEGORY |---- DATE_OF_BIRTH |---- GENDER |---- RACE |---- RELIGION |---- NATIONALITY |---- MARITAL_STATUS |---- MATRIC_NO |---- MODE_OF_PAYMENT
2(ii) Fig. 2(i). A Simple Example of XML Schema; Fig. 2(ii). Extracted Elements from XML Schema
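For illustration, here is a minimal sketch (not the authors' tool) of extracting the element names of a schema such as Fig. 2(i) into a flat list like Fig. 2(ii), using Python's standard library; the file name is hypothetical.

```python
# List the name attribute of every xsd:element in an XML Schema, in document order.
import xml.etree.ElementTree as ET

XSD_NS = "{http://www.w3.org/2001/XMLSchema}"

def extract_element_names(xsd_path: str):
    tree = ET.parse(xsd_path)
    return [el.get("name") for el in tree.iter(f"{XSD_NS}element") if el.get("name")]

# Example: extract_element_names("pms_patient.xsd") would yield
# ["PMS_Patient_DEMOGRAPHIC", "PATIENT_NAME", "ID_NO", ...]
```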
Rule 1 Matching a non-vague element name E1 with another non-vague element name E2.
If (attrsi ∈ words in WordNet) && (attrtj ∈ words in WordNet) where i = {1, 2, 3, …, n}, j = {1, 2, 3, …, m}, then calculate word similarity between attrsi and attrtj. The formulation for word similarity between attrsi and attrtj is shown below:
(NE1/NE2) * 100% .    (1)
where NE1 is the number of characters in element name E1 that is similar to a substring of element name E2, while NE2 is the total number of characters in element name E2. Equation (1) is only applied when both E1 and E2 return glosses (word definitions) from WordNet and NE1 < NE2.
Rule 2 Matching a vague element name E1 with a non-vague element name E2.
If (attrsi ∈ words in WordNet) && (attrtj ∉ words in WordNet) where i = {1, 2, 3, …, n}, j = {1, 2, 3, …, m}, then tokenize attrtj to o partitions until Fkattrtj ∈ words in WordNet where k = {1, 2, 3, …, o}, and calculate word similarity between attrsi and Fkattrtj;
or
If (attrsi ∉ words in WordNet) && (attrtj ∈ words in WordNet) where i = {1, 2, 3, …, n}, j = {1, 2, 3, …, m}, then tokenize attrsi to p partitions until Flattrsi ∈ words in WordNet where l = {1, 2, 3, …, p}, and calculate word similarity between Flattrsi and attrtj.
Obviously, E1 does not return any glosses from WordNet while E2 returns glosses from WordNet. In this case, E1 is tokenized into n+1 fragments based on the existence of n underscore; _, as the separator. The formula for word similarity between attrsi and Fkattrtj or Flattrsi and attrtj would be the number of fragments in E1 that is similar to E2 divided by the n+1 fragments, as shown by the equation below:
[FE1/(n+1)] * 100% .    (2)
where FE1 is the number of fragments in vague element E1 that is similar to non-vague element name E2.
Rule 3 Matching both vague element names E1 and E2.
If (attrsi ∉ words in WordNet) && (attrtj ∉ words in WordNet) where i = {1, 2, 3, …, n}, j = {1, 2, 3, …, m}, then tokenize attrsi to p partitions until Flattrsi ∈ words in WordNet where l = {1, 2, 3, …, p}, tokenize attrtj to o partitions until Fkattrtj ∈ words in WordNet where k = {1, 2, 3, …, o}, and calculate word similarity between Flattrsi and Fkattrtj.
Both vague element names are tokenized into n+1 fragments based on the existence of n underscore; _, as the separator. The equation for word similarity between Flattrsi and Fkattrtj is shown below:
[FE1/(NE1+1)] * [FE2/(NE2+1)] * 100% .    (3)
where FE1 is the number of fragments in element name E1 that is similar to fragments of element name E2, while NE1 is the total number of underscore; _, in E1. Similarly FE2 is the number of fragments in element name E2 that is similar to fragments of element name E1, while NE2 is the total number of underscore; _, in E2.
Rule 4 If equations (1), (2), & (3) show 0 (zero), and (attrsi ∈ words in WordNet) && (attrtj ∈ words in WordNet) where i = {1, 2, 3, …, n}, j = {1, 2, 3, …, m}, then calculate the overlapped phrase(s) score for both element names.
Overlapped phrase refers to shared words (overlaps) for both definitions (glosses) of two element names. The equation for obtaining the overlapped phrase score is shown below [10]:
(NOP1)² + (NOP2)² + (NOP3)² + … + (NOPM)² .    (4)
where NOP is the number of consecutive words in each overlapped phrase shared between the two element names, while M is the number of overlapped phrases.
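For instance (a purely hypothetical illustration, not an example from the paper), if the glosses of two element names share the phrases "of a person" (three consecutive words) and "address" (one word), equation (4) gives a score of 3² + 1² = 10; longer shared phrases therefore dominate the score.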
Rule 5 If equations (1), (2), (3) & (4) show 0 (zero), and (attrsi ∈ words in WordNet) && (attrtj ∈ words in WordNet) where i = {1, 2, 3, …, n}, j = {1, 2, 3, …, m}, then obtain the hypernym and hyponym lists for both element names from WordNet.
WordNet is a lexical database where each unique meaning of a word is represented by a synset (synonym set); synsets are connected to each other through explicit semantic relations defined in WordNet. If a noun synset A is connected to another noun synset B through the is-a-kind-of relation, then A is said to be a hypernym of synset B and B a hyponym of A [10]. A summary of the algorithm consisting of the rules stated above is depicted in Figure 3.
5 Experiment and Results
Figure 4 shows an extracted XML source schema patient and an extracted XML target schema outpatient, both shown in relational model form. These two schemas would be used in data exchange over the internet. The experiment is conducted using these two schemas, with patient as the source schema and outpatient as the target schema, applying the rules proposed above to reconcile the semantic conflicts during the process of electronic patient data exchange.
For each attribute in schema S and T
  Check whether attrsi ∈ words in WordNet && attrtj ∈ words in WordNet,
  where i = {1, 2, 3, …, n}, j = {1, 2, 3, …, m}
  If (attrsi ∈ words in WordNet) && (attrtj ∈ words in WordNet)
    Calculate the word similarity between attrsi and attrtj with equation (1)
  else if (attrsi ∈ words in WordNet) && (attrtj ∉ words in WordNet)
    Tokenize attrtj to o partitions until Fkattrtj ∈ words in WordNet where k = {1, 2, 3, …, o}
    Calculate the word similarity between attrsi and Fkattrtj with equation (2)
  else if (attrsi ∉ words in WordNet) && (attrtj ∈ words in WordNet)
    Tokenize attrsi to p partitions until Flattrsi ∈ words in WordNet where l = {1, 2, 3, …, p}
    Calculate the word similarity between Flattrsi and attrtj with equation (2)
  else if (attrsi ∉ words in WordNet) && (attrtj ∉ words in WordNet)
    Tokenize attrsi to p partitions until Flattrsi ∈ words in WordNet where l = {1, 2, 3, …, p}
    Tokenize attrtj to o partitions until Fkattrtj ∈ words in WordNet where k = {1, 2, 3, …, o}
    Calculate the word similarity between Flattrsi and Fkattrtj with equation (3)
  else if equations (1), (2), & (3) == 0 && (attrsi ∈ words in WordNet) && (attrtj ∈ words in WordNet)
    Calculate the overlapped phrase(s) score for both element names with equation (4)
  else
    Obtain the hypernym and hyponym lists for both element names from WordNet
Fig. 3. Attribute-match Algorithm
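To illustrate how Rules 1-3 of the attribute-match algorithm might be realized, here is a minimal Python sketch using NLTK's WordNet interface as a stand-in for the paper's WordNet lookups. It is an illustrative reading of the rules, not the authors' implementation; the fragment-similarity heuristics and the simplified Rule 3 are assumptions, and Rules 4-5 (gloss overlap and hypernym/hyponym lookup) are omitted.

```python
from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')

def in_wordnet(name: str) -> bool:
    """Rule precondition: the element name returns at least one gloss from WordNet."""
    return bool(wn.synsets(name))

def char_similarity(e1: str, e2: str) -> float:
    """Equation (1): characters of the shorter name found as a substring of the longer one."""
    short, long_ = sorted((e1.lower(), e2.lower()), key=len)
    return 100.0 * len(short) / len(long_) if short in long_ else 0.0

def fragment_similarity(vague: str, plain: str) -> float:
    """Equation (2): fragments of the vague (underscored) name that match the plain name."""
    fragments = vague.lower().split("_")
    hits = sum(1 for f in fragments
               if f and (f == plain.lower() or (in_wordnet(f) and f in plain.lower())))
    return 100.0 * hits / len(fragments)

def match_attributes(attr_s: str, attr_t: str) -> float:
    if in_wordnet(attr_s) and in_wordnet(attr_t):        # Rule 1
        return char_similarity(attr_s, attr_t)
    if in_wordnet(attr_s) and not in_wordnet(attr_t):    # Rule 2, first case
        return fragment_similarity(attr_t, attr_s)
    if not in_wordnet(attr_s) and in_wordnet(attr_t):    # Rule 2, second case
        return fragment_similarity(attr_s, attr_t)
    # Rule 3, simplified: compare the two fragment lists directly (equation (3)).
    fs, ft = attr_s.lower().split("_"), attr_t.lower().split("_")
    shared = sum(1 for f in fs if f in ft)
    return 100.0 * (shared / len(fs)) * (shared / len(ft))

# Example: match_attributes("p_name", "name") counts the fragment "name" (50% here),
# while match_attributes("sex", "gender") returns 0 under Rule 1, so in the full
# algorithm Rules 4-5 would then be applied.
```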
patient (id, p_name, sex, home_addr(street, town, country))
outpatient (passport_no, name, gender, address)
Fig. 4. patient and outpatient Schemas
Table 1 shows a brief view of conflicts being tackled for every element from two different XML schemas. Given the result generated by the proposed rules and attribute-match algorithm, the result shows that every semantic conflict has been resolved and matching elements are found.
Table 1. Result of Matching and Unmatched Elements

Source Element   Target Element   Result        Type of Conflicts
p_name           name             Matched       Naming Conflict
sex              gender           Matched       Naming Conflict
home_addr        address          Matched       Generalization/Aggregation/Schema Isomorphism Conflict
street           address          Matched       Generalization/Aggregation/Schema Isomorphism Conflict
postcode         address          Matched       Generalization/Aggregation/Schema Isomorphism Conflict
id               passport_no      Not Matched   Entity Identifier Conflict
The results shown in Table 1 are significant if compared with other works because they are based on a very simple algorithm that relies on 5 main rules for both element names from different sources.
6 Discussion and Conclusion
As stated in Section 2, standardization is the most popular method of electronic data exchange nowadays. Most developed countries apply this technique through HL7 and Archetypes. However, it remains an obstacle, especially in less developed countries, since most of them still use simple or even outdated schemas that are not compatible with the formats of HL7 and Archetypes. It takes great effort to convert all these schemas into the same format and standard as used in HL7 and Archetypes, and doing so is very time consuming. Therefore, the proposed rules and attribute-matching algorithm help in tackling this limitation by providing a generic approach to reconciling semantic conflicts in heterogeneous and non-standardized electronic data exchange, without changing the structure of the schemas, while still generating the same result as in HL7 and Archetypes. However, the patient data security issue is an improvement to this research that should be considered in the future.
Acknowledgement. Our thanks to the Universiti Putra Malaysia (UPM) for supporting this work under grant number 05/01/07/0162RU.
References 1. Beale, T., Hear, S.: Archetype Definition and Principles, Technical Report Rev 0.6, OpenEHR Foundation (2005) 2. Gal, A.: Semantic Interoperability in Information Services: Experiencing with CoopWARE. ACM SIGMOD Record 28(1), 68–75 (1999) 3. Ferranti, J.M., Musser, R.C., Kawamoto, K., Hammon, W.E.: The Clinical Document Architecture and the Continuity of Care Record: A Critical Analysis. Journal of the American Medical Informatics Association 13(3), 245–252 (2006)
4. Yong, H., Jinqiu, G., Ohta, Y.: A Prototype Model using Clinical Document Architecture (CDA) with a Japanese Local Standard: Designing and Implementing a Referral Letter System. Acta Medica Okayama 62(1), 15–20 (2008) 5. Topkara, U., Song, X.C., Woo, J., Park, S.P.: Connected in a Small World: Rapid Integration of Heterogeneous Biology Resources. In: Proceedings of the 2nd International Workshop on Grid Computing Environments /Supercomputing, Tampa, FL (2006) 6. Xu, L., Embley, D.W.: A Composite Approach to Automating Direct and Indirect Schema Mappings. Information Systems 31(8), 697–732 (2006), doi:10.1016/j.is.2005.01.003 7. Qamar, R., Rector, A.: Semantic Issues in Integrating Data from Different Models to Achieve Data Interoperability. In: Proceedings of the Medinfo Conference, Brisbane, Australia (2007) 8. Qamar, R., Rector, A.: Semantic Mapping of Clinical Model to Biomedical Terminologies to Facilitate Data Interoperability. In: Proceedings of the Healthcare Computing 2007 Conferences, Harrogate, UK (2007) 9. Cruz-Correia, R.J., Vieira-Marques, P.M., Ferreira, A.M., Almeida, F.C., Wyatt, J.C., Costa-Pereira, A.M.: Reviewing the Integration of Patient Data: How Systems are Evolving in Practice to Meet Patient Needs. Proceedings of BMC Medical Informatics and Decision Making (2007) 10. Banerjee, S., Pedersen, T.: Extended Gloss Overlaps as a Measure of Semantic Relatedness. In: Proceedings of the 18th International Conference on Artificial Intelligence (IJCAI’03), Acapulco, Mexico (2003) 11. Bicer, V., Laleci, G.B., Dogac, A., Kabak, Y.: Artemis Message Exchange Framework: Semantic Interoperability of Exchange Messages in the Healthcare Domain. ACM SIGMOD Record 34(3), 71–76 (2005), doi:10.1145/1084805.1084819 12. Vakali, A., Catania, B., Maddalena, A.: XML Data Stores: Emerging Practices. IEEE Internet Computing 9(2), 62–69 (2005) 13. Yang, X., Li Lee, M., Wang Ling, T.: Resolving Structural Conflicts in the Integration of XML Schemas: A Semantic Approach. In: Song, I.-Y., Liddle, S.W., Ling, T.-W., Scheuermann, P. (eds.) ER 2003. LNCS, vol. 2813, pp. 520–533. Springer, Heidelberg (2003) 14. Doan, D.D., Wuwongse, V.: XML Database Schema Integration using XDD. In: Dong, G., Tang, C., Wang, W. (eds.) WAIM 2003. LNCS, vol. 2762, pp. 1066–1075. Springer, Heidelberg (2003)
Some Results in Bipolar-Valued Fuzzy BCK/BCI-Algebras
A. Borumand Saeid and M. Kuchaki Rafsanjani
Faculty of Math and Computer, Shahid Bahonar University of Kerman, Kerman, Iran
[email protected], [email protected]
Abstract. In this note, by using the concept of bipolar-valued fuzzy set, the notion of bipolar-valued fuzzy BCK/BCI-algebra is introduced. Moreover, the notions of (strong) negative s-cut and (strong) positive t-cut are introduced, and the relationship between these notions and crisp subalgebras is studied.
1 Introduction
As is well known, BCK/BCI-algebras are two classes of algebras of logic. They were introduced by Imai and Iseki [6, 7]. BCI-algebras are generalizations of BCK-algebras. Most of the algebras related to t-norm based logic, such as MTL-algebras, BL-algebras [3, 4], hoops, MV-algebras, and Boolean algebras, are extensions of BCK-algebras. In 1965, Zadeh [12] introduced the notion of a fuzzy subset of a set. Since then it has become a vigorous area of research in different domains. There have been a number of generalizations of this fundamental concept, such as intuitionistic fuzzy sets, interval-valued fuzzy sets, vague sets, and soft sets. Lee [9] introduced the notion of bipolar-valued fuzzy sets. Bipolar-valued fuzzy sets are an extension of fuzzy sets whose membership degree range is enlarged from the interval [0, 1] to [−1, 1]. In a bipolar-valued fuzzy set, the membership degree 0 means that elements are irrelevant to the corresponding property, a membership degree in (0, 1] indicates that elements somewhat satisfy the property, and a membership degree in [−1, 0) indicates that elements somewhat satisfy the implicit counter-property. Bipolar-valued fuzzy sets and intuitionistic fuzzy sets look similar to each other; however, they are different from each other (see [9]). Now, in this note we use the notion of bipolar-valued fuzzy set to establish the notion of bipolar-valued fuzzy BCK/BCI-algebras; then we obtain some related results, as mentioned in the abstract.
2
Preliminaries
In this section, we present some preliminaries on the theory of bipolar-valued fuzzy sets. In his pioneering work [12], Zadeh proposed the theory of fuzzy
sets. Since then it has been applied in a wide variety of fields such as computer science, management science, medical sciences and engineering, to list only a few.

Definition 1 ([9]). Let G be a nonempty set. A bipolar-valued fuzzy set B in G is an object having the form B = {(x, μ+(x), ν−(x)) | x ∈ G}, where μ+ : G → [0, 1] and ν− : G → [−1, 0] are mappings. The positive membership degree μ+(x) denotes the satisfaction degree of an element x to the property corresponding to the bipolar-valued fuzzy set B, and the negative membership degree ν−(x) denotes the satisfaction degree of x to some implicit counter-property of B. If μ+(x) ≠ 0 and ν−(x) = 0, then x is regarded as having only positive satisfaction for B. If μ+(x) = 0 and ν−(x) ≠ 0, then x does not satisfy the property of B but somewhat satisfies the counter-property of B. It is possible for an element x to have μ+(x) ≠ 0 and ν−(x) ≠ 0 when the membership function of the property overlaps that of its counter-property over some portion of G. For the sake of simplicity, we shall use the symbol B = (μ+, ν−) for the bipolar-valued fuzzy set B = {(x, μ+(x), ν−(x)) | x ∈ G}.

Definition 2 ([6]). Let X be a non-empty set with a binary operation "∗" and a constant "0". Then (X, ∗, 0) is called a BCI-algebra if it satisfies the following conditions, for all x, y, z ∈ X:
(i) ((x ∗ y) ∗ (x ∗ z)) ∗ (z ∗ y) = 0,
(ii) (x ∗ (x ∗ y)) ∗ y = 0,
(iii) x ∗ x = 0,
(iv) x ∗ y = 0 and y ∗ x = 0 imply x = y.
We can define a partial ordering ≤ by x ≤ y if and only if x ∗ y = 0. If a BCI-algebra X satisfies 0 ∗ x = 0 for all x ∈ X, then we say that X is a BCK-algebra. A nonempty subset S of X is called a subalgebra of X if x ∗ y ∈ S for all x, y ∈ S. We refer the reader to the books [5, 11] for further information regarding BCK/BCI-algebras.

Definition 3 ([11]). Let μ be a fuzzy set in a BCK/BCI-algebra X. Then μ is called a fuzzy BCK/BCI-subalgebra of X if μ(x ∗ y) ≥ min{μ(x), μ(y)} for all x, y ∈ X.
3
Bipolar-Valued Fuzzy Subalgebras of BCK-Algebras
From now on, (X, ∗, 0) is a BCK/BCI-algebra, unless otherwise stated.

Definition 4. A bipolar-valued fuzzy set B = (μ+, ν−) is said to be a bipolar-valued fuzzy subalgebra of a BCK/BCI-algebra X if it satisfies the following conditions, for all x, y ∈ X:
(BF1) μ+(x ∗ y) ≥ min{μ+(x), μ+(y)},
(BF2) ν−(x ∗ y) ≤ max{ν−(x), ν−(y)}.

Example 1. Consider a BCI-algebra X = {0, a, b, c} with the following Cayley table:

∗ | 0 a b c
--+--------
0 | 0 a b c
a | a 0 c b
b | b c 0 a
c | c b a 0

Let B = (μ+, ν−) be a bipolar-valued fuzzy set in X with the mappings μ+ and ν− defined by

μ+(x) = 0.7 if x = 0 and μ+(x) = 0.3 if x ≠ 0,
ν−(x) = −0.4 if x = 0 and ν−(x) = −0.2 if x ≠ 0.

It is routine to verify that B is a bipolar-valued fuzzy subalgebra of X.

Lemma 1. If B is a bipolar-valued fuzzy subalgebra of X, then μ+(0) ≥ μ+(x) and ν−(0) ≤ ν−(x), for all x ∈ X.

Proposition 1. Let B be a bipolar-valued fuzzy subalgebra of X, and let n ∈ N. Then
(i) μ+(∏ⁿ x ∗ x) ≥ μ+(x) and ν−(∏ⁿ x ∗ x) ≤ ν−(x), for any odd number n,
(ii) μ+(∏ⁿ x ∗ x) = μ+(x) and ν−(∏ⁿ x ∗ x) = ν−(x), for any even number n,
where ∏ⁿ x ∗ x = x ∗ x ∗ · · · ∗ x (n times).
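The verification claimed in Example 1 can be made concrete with a small brute-force check. The sketch below (illustrative only, not part of the original paper) enumerates all pairs over the Cayley table above and tests conditions (BF1) and (BF2).

```python
# Brute-force check of Example 1: B = (mu_plus, nu_minus) over X = {0, a, b, c}.
star = {  # Cayley table of Example 1: star[x][y] = x * y
    '0': {'0': '0', 'a': 'a', 'b': 'b', 'c': 'c'},
    'a': {'0': 'a', 'a': '0', 'b': 'c', 'c': 'b'},
    'b': {'0': 'b', 'a': 'c', 'b': '0', 'c': 'a'},
    'c': {'0': 'c', 'a': 'b', 'b': 'a', 'c': '0'},
}

def mu_plus(x):            # positive membership degree of Example 1
    return 0.7 if x == '0' else 0.3

def nu_minus(x):           # negative membership degree of Example 1
    return -0.4 if x == '0' else -0.2

for x in star:
    for y in star:
        z = star[x][y]
        assert mu_plus(z) >= min(mu_plus(x), mu_plus(y))     # (BF1)
        assert nu_minus(z) <= max(nu_minus(x), nu_minus(y))  # (BF2)

print("B is a bipolar-valued fuzzy subalgebra of X")
```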
Theorem 1. Let B be a bipolar-valued fuzzy subalgebra of X. If there exists a sequence {xn} in X such that lim(n→∞) μ+(xn) = 1 and lim(n→∞) ν−(xn) = −1, then μ+(0) = 1 and ν−(0) = −1.
Theorem 2. The family of bipolar-valued fuzzy subalgebras of X forms a complete distributive lattice under the ordering of bipolar-valued fuzzy set inclusion ⊂.
A fuzzy set μ of X is called an anti fuzzy subalgebra of X if μ(x ∗ y) ≤ max{μ(x), μ(y)}, for all x, y ∈ X.

Proposition 2. A bipolar-valued fuzzy set B of X is a bipolar-valued fuzzy subalgebra of X if and only if μ+ is a fuzzy subalgebra and ν− is an anti fuzzy subalgebra of X.

Definition 5. Let B = (μ+, ν−) be a bipolar-valued fuzzy set and (s, t) ∈ [−1, 0] × [0, 1]. Define:
1) the sets Bt+ = {x ∈ X | μ+(x) ≥ t} and Bs− = {x ∈ X | ν−(x) ≤ s}, which are called the positive t-cut of B = (μ+, ν−) and the negative s-cut of B = (μ+, ν−), respectively;
2) the sets >Bt+ = {x ∈ X | μ+(x) > t} and <Bs− = {x ∈ X | ν−(x) < s}, which are called the strong positive t-cut of B = (μ+, ν−) and the strong negative s-cut of B = (μ+, ν−), respectively;
3) the set XB(t,s) = {x ∈ X | μ+(x) ≥ t, ν−(x) ≤ s}, which is called the (s, t)-level subset of B;
4) the set SXB(t,s) = {x ∈ X | μ+(x) > t, ν−(x) < s}, which is called the strong (s, t)-level subset of B;
5) the set of all (s, t) ∈ Im(μ+) × Im(ν−), which is called the image of B = (μ+, ν−).

Theorem 3. Let B be a bipolar-valued fuzzy subset of X such that the least upper bound t0 of Im(μ+) and the greatest lower bound s0 of Im(ν−) exist. Then the following conditions are equivalent:
(i) B is a bipolar-valued fuzzy subalgebra of X;
(ii) for all (s, t) ∈ Im(ν−) × Im(μ+), the nonempty level subset XB(t,s) of B is a (crisp) subalgebra of X;
(iii) for all (s, t) ∈ Im(ν−) × Im(μ+) \ {(s0, t0)}, the nonempty strong level subset SXB(t,s) of B is a (crisp) subalgebra of X;
(iv) for all (s, t) ∈ [−1, 0] × [0, 1], the nonempty strong level subset SXB(t,s) of B is a (crisp) subalgebra of X;
(v) for all (s, t) ∈ [−1, 0] × [0, 1], the nonempty level subset XB(t,s) of B is a (crisp) subalgebra of X.

Theorem 4. Each subalgebra of X is a level subalgebra of a bipolar-valued fuzzy subalgebra of X.

Theorem 5. Let S be a subset of X and let B be the bipolar-valued fuzzy subset of X given in the proof of Theorem 4. If B is a bipolar-valued fuzzy subalgebra of X, then S is a subalgebra of X.

We now generalize Theorem 4.

Theorem 6. Let X be a BCK-algebra. Then for any chain of subalgebras S0 ⊂ S1 ⊂ · · · ⊂ Sr = X there exists a bipolar-valued fuzzy subalgebra B of X whose level subalgebras are exactly the subalgebras of this chain.
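To illustrate Definition 5, the cuts of the bipolar-valued fuzzy set of Example 1 can be computed directly; the sketch below (illustrative only) reuses that example and checks that its nonempty cuts are crisp subalgebras, consistent with Theorem 3 (a positive t-cut is the level subset with s = 0, and a negative s-cut the one with t = 0).

```python
# Cuts of Definition 5 for the bipolar-valued fuzzy set of Example 1.
star = {
    '0': {'0': '0', 'a': 'a', 'b': 'b', 'c': 'c'},
    'a': {'0': 'a', 'a': '0', 'b': 'c', 'c': 'b'},
    'b': {'0': 'b', 'a': 'c', 'b': '0', 'c': 'a'},
    'c': {'0': 'c', 'a': 'b', 'b': 'a', 'c': '0'},
}
mu_plus = lambda x: 0.7 if x == '0' else 0.3
nu_minus = lambda x: -0.4 if x == '0' else -0.2

def positive_t_cut(t):      # {x in X | mu_plus(x) >= t}
    return {x for x in star if mu_plus(x) >= t}

def negative_s_cut(s):      # {x in X | nu_minus(x) <= s}
    return {x for x in star if nu_minus(x) <= s}

def is_subalgebra(S):       # nonempty and closed under *
    return bool(S) and all(star[x][y] in S for x in S for y in S)

print(positive_t_cut(0.5), is_subalgebra(positive_t_cut(0.5)))    # {'0'} True
print(negative_s_cut(-0.3), is_subalgebra(negative_s_cut(-0.3)))  # {'0'} True
print(positive_t_cut(0.2), is_subalgebra(positive_t_cut(0.2)))    # all of X, True
```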
Theorem 7. If B = (μ+, ν−) is a bipolar-valued fuzzy subalgebra of X, then the set XB = {x ∈ X | μ+(x) = μ+(0), ν−(x) = ν−(0)} is a subalgebra of X.

Theorem 8. Let M be a subset of X. Suppose that N is a bipolar-valued fuzzy set of X defined by

μ+N(x) = α if x ∈ M and μ+N(x) = β otherwise,
ν−N(x) = γ if x ∈ M and ν−N(x) = δ otherwise,

for all α, β ∈ [0, 1] and γ, δ ∈ [−1, 0] with α ≥ β and γ ≤ δ. Then N is a bipolar-valued fuzzy subalgebra if and only if M is a subalgebra of X. Moreover, in this case XN = M.

Proof. Let N be a bipolar-valued fuzzy subalgebra, and let x, y ∈ X be such that x, y ∈ M. Then

μ+N(x ∗ y) ≥ min{μ+N(x), μ+N(y)} = min{α, α} = α

and

ν−N(x ∗ y) ≤ max{ν−N(x), ν−N(y)} = max{γ, γ} = γ,

therefore x ∗ y ∈ M.
Conversely, suppose that M is a subalgebra of X, and let x, y ∈ X.
(i) If x, y ∈ M, then x ∗ y ∈ M, thus

μ+N(x ∗ y) = α = min{μ+N(x), μ+N(y)}

and

ν−N(x ∗ y) = γ = max{ν−N(x), ν−N(y)}.

(ii) If x ∉ M or y ∉ M, then

μ+N(x ∗ y) ≥ β = min{μ+N(x), μ+N(y)}

and

ν−N(x ∗ y) ≤ δ = max{ν−N(x), ν−N(y)}.

This shows that N is a bipolar-valued fuzzy subalgebra. Moreover, we have

XN := {x ∈ X | μ+N(x) = μ+N(0), ν−N(x) = ν−N(0)} = {x ∈ X | μ+N(x) = α, ν−N(x) = γ} = M.
4
Conclusion
Bipolar-valued fuzzy sets are a generalization of fuzzy sets. In the present paper, we have introduced the concept of bipolar-valued fuzzy subalgebras of BCK/BCI-algebras and investigated some of their useful properties. In our opinion, these definitions and main results can be similarly extended to other fuzzy algebraic systems such as groups, semigroups, rings, near-rings, semirings (hemirings), lattices and Lie algebras. It is our hope that this work provides a foundation for further study of the theory of BCK/BCI-algebras. The results obtained here can perhaps be applied in engineering, soft computing or even in medical diagnosis. In our future study of the fuzzy structure of BCK/BCI-algebras, the following topics may be considered: (1) to establish bipolar-valued fuzzy ideals of BCK/BCI-algebras; (2) to consider the structure of quotient BCK/BCI-algebras by using these bipolar-valued fuzzy ideals; (3) to obtain more results on bipolar-valued fuzzy BCK/BCI-algebras and their applications.
References
1. Akram, M., Dar, K.H., Jun, Y.B., Roh, E.H.: Fuzzy structures on K(G)-algebra. Southeast Asian Bulletin of Mathematics 31(4), 625–637 (2007)
2. Gau, W.L., Buehrer, D.J.: Vague sets. IEEE Transactions on Systems, Man and Cybernetics 23, 610–614 (1993)
3. Hajek, P.: Metamathematics of Fuzzy Logic. Kluwer Academic Publishers, Dordrecht (1998)
4. Haveshki, M., Borumand Saeid, A., Eslami, E.: Some types of filters in BL-algebras. Soft Computing 10, 657–664 (2006)
5. Huang, Y.S.: BCI-algebra. Science Press, China (2006)
6. Imai, Y., Iseki, K.: On axiom systems of propositional calculi XIV. Proc. Japan Academy 42, 19–22 (1966)
7. Iseki, K.: An algebra related with a propositional calculus. Proc. Japan Academy 42, 26–29 (1966)
8. Iseki, K., Tanaka, S.: An introduction to the theory of BCK-algebras. Math. Japonica 23(1), 1–26 (1978)
9. Lee, K.M.: Bipolar-valued fuzzy sets and their operations. In: Proc. Int. Conf. on Intelligent Technologies, Bangkok, Thailand, pp. 307–312 (2000)
10. Lee, K.M.: Comparison of interval-valued fuzzy sets, intuitionistic fuzzy sets, and bipolar-valued fuzzy sets. J. Fuzzy Logic Intelligent Systems 14(2), 125–129 (2004)
11. Meng, J., Jun, Y.B.: BCK-algebras. Kyungmoon Sa Co, Korea (1994)
12. Zadeh, L.A.: Fuzzy sets. Inform. and Control 8, 338–353 (1965)
The Effect of Attentiveness on Information Security Adeeb M. Alhomoud School of Computing, Informatics and Media, University of Bradford [email protected]
Abstract. This paper presents a brief overview of a larger study on the impact of attentiveness, in addition to other human-related factors, on information security in both private and government organizations in Saudi Arabia. The aim of the initial experiment was to sense the existence of attentiveness in relation to information security; the results were encouraging enough to go for a larger-scale experiment with additional human factors such as awareness, workload and passwords. We believe this to be the first study to investigate the effect of attentiveness, as a part of the Saudi culture context, on organizational information security. Keywords: Human Factors, Attentiveness, Information Security, Saudi Organizations, Saudi Culture.
1 Introduction
Nowadays any organization has to consider security as part of its own infrastructural design and provide all that is needed to strengthen defences against intrusions and attacks. The initial steps carried out in securing any network, whether it is a small network or an enterprise network, involve buying the technology needed and putting it into the infrastructural design of the network. As this technology is the first thing that comes to mind when planning to provide a secure environment in any organization, we need to keep in mind and focus on another important aspect, i.e. the human who is going to interact with this technology (as a technical member of staff or an end-user). Knowing that technology is not enough by itself to secure information and that information security involves both technology and humans [1], we are going to address the human part of security, as it is often described as the weakest link in a security system. Most organizations can afford to buy most, if not all, of the security-related equipment (e.g. intrusion detection/prevention systems, firewalls, monitoring systems) necessary to protect their network, but how much effort and money is spent on avoiding the human factors and errors that will eventually impact the organization? The objective of this paper is to shed light on the existence of attentiveness in the field of information security within Saudi culture. As this initial survey is part of a larger study, we link attentiveness to the human factor, along with other human factors, and measure the impact on both private and government sectors in Saudi Arabia.
This paper is organized as follows: Section 2 provides a literature review; Section 3 explains the meaning of attentiveness; Section 4 covers the initial experiment; this is followed by the discussion and conclusions in Section 5.
2 Literature Review
The human threats within an organization's information security are described as human errors or human factors. As human errors are interlinked with human factors, since one can lead to the other, in the following paragraphs we cover the major human errors in the information security field, based on previous studies. The "Act of Human Error or Failure" is at the top of Whitman's list of threats to information security [2]. These human errors can be non-deliberate (such as an accidental programming error that crashes the system or the unintended cutting of a cable during excavation) or deliberate, made by human choice, which usually results from poor management or operational practices. An example of human error is what is called a mistake or slip. A mistake can happen by applying the wrong security patch to the wrong piece of software, while a slip is forgetting the task of applying the patch to the related software. In 2000, the Western Union system was compromised and opened to an attack following which approximately 15,700 customer credit card numbers were stolen. The attack was accomplished when Western Union undertook regular maintenance and a file containing all credit card information was left unprotected when the system resumed operation. Another example of a mistake that can lead to a security breach is leaving network ports open on a firewall or simply misconfiguring the devices [3]. Human errors can be caused by workload. Based on interviews with security specialists and network administrators carried out by Kraemer and Carayon, the level of workload can play a major role in security. As one network administrator said: "It is pretty much on the backburner. It is like we don't have anybody … but it's like I don't have time to do it either". This illustrates how workload may affect the management of tasks, as many other examples of network administrators' mistakes are connected to how they manage tasks such as patch management and vulnerability monitoring [4]. A study on Dutch organizations [5] showed that the level of security is higher in organizations where system administrators do not have too high a workload. It has also been found that the delay in applying system patches can be due to two reasons: workload or a small chance that a vulnerability will be exploited. Therefore, we can conclude that not having the right number of human resources capable of fulfilling the job in the right time period will lead to a security risk, which we believe is related to poor management in the first place. The following two subsections cover the relationship between humans and passwords, and the impact of humans on security.
2.1 Passwords and the Human Element
Passwords play a major role in security. Losing a password means impairing security or damaging one of the security elements (availability, integrity and confidentiality). Passwords are considered to be the first line of defence against intrusion into computer systems [6]. Nowadays, end-users enter username–password pairs many times a day (e.g. employees logging into their company's network, checking emails, etc.). Network and website administrators usually set a rule that forces the user to use no fewer than 8 characters. The reason for this is, basically, that the longer the password, the longer it will take to be guessed or cracked. In some organizations it is normal practice for administrators to use automatic password generators to create complex passwords that will be hard to crack. Some enforce a stricter policy for passwords: using passwords not found in a dictionary, not names of friends and not made up of numbers only. The reason for this is to resist brute-force attacks that use dictionary words as passwords; a dictionary of 60,000 words was compiled as far back as the 1980s [6]. But this enforcement of password policy can backfire on administrators. A survey undertaken by Carstens and McCauley-Bell, addressing 257 university students and employees, showed that 27% of them wrote down their passwords on paper to refer to when inputting them [7]. Another type of human error involves using social security numbers, telephone numbers, dates of birth, etc.: the easier a password is to remember, the easier it is likely to be guessed or cracked [8]. In addition, staff or users who choose weak passwords or share their user IDs with others are considered to be contributing human errors which will lead to a weak security environment [3]. A survey study by Adams and Sasse on user behaviour towards passwords indicates that, from a total of 139 responses received, 50% of the respondents wrote down their passwords in one form or another, while the remaining 50% left these questions blank [9]. Users tend to write down their passwords when they are forced to change them every month. The more restrictions we enforce on passwords to make them more secure, the less memorable they are likely to be, which leads to writing them down [9].

2.2 The Impact of the Human
An earlier study states that 70% of security breaches at companies occur because of actions taken by employees, either indirectly or directly [10]. In fact, Gonzalez and Sawicka stated that "… human factors are implicated in 80-90% of organizational accidents" [1]. The CSI survey shows that insider abuse of net access increased from 42% to 59% in 2007 [11]. Fortunately, this percentage decreased in 2008 to 44% [12]. But sadly, 16.1% of respondents in the 2009 CSI survey estimated that nearly all of their losses were caused by insiders in terms of non-malicious attacks (such as careless insiders) [13].
3 The Meaning of Attentiveness
In our study we want to introduce a human factor that is related to culture, especially in Saudi Arabia, and we are trying to find an English term that describes the behaviour we are targeting. In the meantime we will use the term "attentiveness". According to the Oxford English Dictionary, the word "attentiveness" (noun) is derived from the word attentive, which means "... considerately attending to the comfort or wishes of others" [14]. It dates back to the 14th century and one of its meanings is "heedful of the comfort of others" [15]. To be more precise, we will try to explain what we really mean by the behaviour we are studying, which we believe introduces a security risk. In Saudi culture, most people are considered to be generous, and this generosity can affect information security. How is that so? In any organization (and this has been seen and personally encountered), an employee who has a certain level of access may need to escalate his limited access or gain access to a resource that he or she is not allowed to access. A common practice is for that employee to simply ask a friend or colleague to provide him with a username and password. Although the colleague may not want to give out his password, he will still do it. Some will refuse to give out their passwords, but will instead type the password in person. This raises a question: why do they refuse to give it out but type it in person? Is there any difference between the two?
4 The Initial Experiment
The survey was designed according to the guidelines and rules for conducting a survey and writing a questionnaire [16], [17], [18], [19]. We designed this experiment in the English language and translated it into Arabic to ensure answers as accurate as possible and to make sure that the questions were understandable. As mentioned before, this survey was meant for a larger study, but it addresses the attentiveness aspect and its impact on information security. We carried out this initial survey on 43 Saudi users within Saudi organizations. The respondents' demographic records show that most respondents were between 25 and 35 years old. There were 28 male and 15 female participants. Sixteen of the respondents held Masters degrees, three held diplomas, another sixteen held Bachelors degrees, and eight had high school certificates. We asked the 43 respondents if they had ever given any of their passwords to any one of their friends; 42% of the respondents said yes. When asked what they would do if a colleague at work or a place of study insisted on using their computer, 44% said that they would allow him or her with some feelings of discomfort. At this stage we believe that "attentiveness" is responsible for this type of action: why would users do something they are not comfortable doing, and what are the consequences for an organization's information security?
Fig. 1. Indication of attentiveness existence (responses: yes without any feelings of discomfort, 40%; yes with some feelings of discomfort, 44%; no, 16%)
When we asked our 43 respondents what they would do if a colleague or friend asked them to provide their password in order to access the internet, 13 respondents said that they would give out their password, while 27 said that they would refuse to give the password but would type it in for their colleague without him knowing it. Only 3 of the respondents would refuse to expose their password or to type it in.
Fig. 2. Respondents' percentages of typing in passwords (refuse to give out the password but type it in themselves, 63%; give out the password, 30%; refuse to give out the password, 7%)
The number of people willing to type in their password without revealing it was more than double the number of those who shared their password, and this gave us the motivation to pursue this issue and consider how this behaviour is affecting security.
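The percentages reported in Fig. 2 follow directly from the raw counts quoted above (13, 27 and 3 of the 43 respondents); the short sketch below simply recomputes them.

```python
# Recompute the Fig. 2 percentages from the counts reported in the text.
counts = {
    "give out the password": 13,
    "refuse to give it out but type it in themselves": 27,
    "refuse to give it out or type it in": 3,
}
total = sum(counts.values())   # 43 respondents
for answer, n in counts.items():
    print(f"{answer}: {n}/{total} = {100 * n / total:.0f}%")
# -> 30%, 63% and 7%, matching Fig. 2
```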
5 Discussions and Conclusions
To our knowledge, this is the first study that focuses on the influence of attentiveness as an important factor which can have great consequences on an organization's information security. The findings mentioned in the previous section encouraged us to go further with a larger experiment to investigate the effect of the attentiveness
factor, along with other human-related factors such as awareness, workload, etc., on the information security of Saudi organizations (private and government), with gender, age and education level considered as major moderating factors.
References 1. Gonzalez, J.J., Sawicka, A.: A Framework for Human Factors in Information Security. In: WSEAS International Conference on Information Security, Rio de Janeiro (2002) 2. Whitman, M.E.: Enemy at the gate: threats to information security. Communications of the ACM 46, 95 (2003) 3. Hinson, G.: Human factors in information security. IsecT Ltd. (2003) 4. Kraemer, S., Carayon, P.: Human errors and violations in computer and information security: The viewpoint of network administrators and security specialists. Applied Ergonomics 38, 143–154 (2007) 5. Caminada, M., van de Riet, R., van Zanten, A., van Doom, L.: Internet security incidents, a survey within Dutch organizations. Computers & Security 17, 417–433 (1998) 6. Gehringer, E.F.: Choosing passwords: security and human factors. ISTAS 2, 369–373 (2002) 7. Carstens, D.S., McCauley-Bell, P.R., Malone, L.C., DeMara, R.F.: Evaluation of the human impact of password authentication practices on information security. Informing Science: International Journal of an Emerging Transdiscipline 7, 67–85 (2004) 8. Schultz, E.E., Proctor, R.W., Lien, M.C., Salvendy, G.: Usability and security an appraisal of usability issues in information security methods. Computers & Security 20, 620–634 (2001) 9. Adams, A., Sasse, M.A.: Users are not the enemy. Communications of the ACM 42, 46 (1999) 10. Kaplan-Leiserson, E.: People and Plans: Training’s Role in Homeland and Workplace Security. T AND D 57, 66–74 (2003) 11. CSI: Computer Crime and Security Survey 2007. Computer Security Institute (2007) 12. CSI: Computer Crime and Security Survey 2008. Computer Security Institute (2008) 13. CSI: Computer Crime and Security Survey online podcast presentation (2009) 14. Compact Oxford English Dictionary, http://www.askoxford.com 15. Merriam-Webster Dictionary, http://www.m-w.com 16. Moser, C.A., Kalton, G.: Survey methods in social investigation (1972) 17. Fink, A.: The survey handbook. Sage Publications, Thousand Oaks (2003) 18. Fink, A.: How to conduct surveys: A step by step guide. Sage Publications, Thousand Oaks (2008) 19. Berdie, D.R., Anderson, J.F., Niebuhr, M.A.: Questionnaires: design and use. Scarecrow Press (1986)
A Secured Mobile Payment Model for Developing Markets Bossi Masamila, Fredrick Mtenzi, Jafari Said, and Rose Tinabo School of Computing, Dublin Institute of Technology, Kelvin Street, Dublin 8, Dublin, Ireland [email protected], [email protected], [email protected], [email protected]
Abstract. The evolution of mobile payments continues at great speed, with changing business models and technology schemes. With the omnipresent availability of mobile phones and other mobile devices, mobile payment presents an investment opportunity in developing markets. However, the success of mobile payment depends on the security of the underlying technology and the associated business model. In this paper, a mobile payment model for developing markets is discussed. This mobile payment model comprises features that can accommodate secured local payments in remote locations with limited network coverage.
1 Introduction
Mobile payment can be defined as any payment transaction which involves the use of mobile devices, such as a mobile phone [1]. It provides a convenient way of paying, in which a customer can conduct payments anytime and anywhere for services such as bill and retail payments, payroll services, and loan disbursement and repayment. Mobile payment presents strategic value in both developed and developing markets. However, the value of the transactions, customer experience demands and the payment model are not developed equally in each market. A mobile payment model defines the framework on which economic, social and other values can be created through mobile payment services. The choices of the business model include the mobile payment platform and various mobile bearer channels, suited to different market segments [2]. In most developing markets, banks and other financial institutions do not have a mobile payment model that allows them to deal with low saving balances, small transactions and large numbers of customers [3]. This prompts the need for a system that can be accessible in every neighborhood and that can handle a large volume of low-value daily saving transactions in developing markets. With the omnipresent availability of mobile phones and other mobile devices, mobile payment has a promising future, especially in developing markets. It has the potential to overcome the financial access problem in developing markets, and to extend beyond traditional banks' coverage by accommodating unbanked communities. It will
drive the economic system toward a cashless transaction environment by eliminating or minimizing physical cash handling, which has been a source of fraudulent or criminal activity. It will open the window for the value of money to be better utilized, as cash held outside the banking system is not available for investment [3, 4, 5]. However, the success of mobile payment depends on the security of the underlying technology. As such, security can be regarded as the enabling factor for the success of mobile payment applications [6]. Mobile payment systems involve the integration of different industries such as mobile network operators, banks, merchants, retailers, agents, and utility companies. These industries handle different information systems, vary in size, are exposed to different security threats, and have different security schemes. Interconnecting these industries brings some common advantages, but the costs of poor security are often distributed [7]. In this paper, we present a secured mobile payment model for developing markets. A properly designed mobile payment model can provide better security than the traditional means of payment [8]. This paper is structured as follows: Section 2 presents the proliferation of mobile payment services and discusses the mobile payment landscape. Section 3 discusses related work. Section 4 presents security concerns in mobile payment systems. Section 5 presents the mobile payment model for developing markets, and Section 6 presents concluding remarks.
2 Proliferation of Mobile Payment Services
2.1 Introduction
The spread of mobile phones and the increasing value of their applications across the world depend on a number of factors such as technology change, economic differences, technology adaptability, culture and regulations [9]. Equally, the spread and application of mobile payment across the world are not uniform. This section gives a brief discussion of the proliferation of mobile payment, starting with a few cases in developed countries, followed by cases from developing countries.
2.2 Mobile Payment in Developed Countries
In most of the developed economies people have wide and easy access to the banking system, with extensive usage of internet systems and credit cards. Most retail establishments have facilities that can accept both internet-based and credit card transactions in addition to cash, and such facilities can be easily accessed in local service areas such as taxi cabs and even parking meters [5]. For example, in Japan the giant mobile operator DoCoMo extended the functionality of the SIM card by including credit card services. By using contactless FeliCa technology, the account represented by the chip in the phone can be charged by waving the phone in close proximity to a FeliCa point-of-sale device. FeliCa technology is deployed in the mass transit system in Japan, through services such as Mobile Suica and the Osaifu-Keitai system. In South Korea, where the mobile
market access is almost saturated, consolidations between mobile carriers and banks have been formed for different business strategies. Subscribers use their mobile phones to shop in virtual malls and online boutiques and are able to use their handsets for almost any type of payment: credit card, cash, prepaid vouchers and post-paid subscriber billing [10].
2.3 Mobile Payment in Developing Markets
In developing markets, there is a large number of prepaid users who use their phones for text and voice as well as recharging handsets on a prepaid basis. These users are an ideal target for mobile payment services. They have no access to any bank, and they are not connected to the internet or credit card systems, but they can perform financial transactions, as evidenced by their ability to purchase and activate prepaid cards for additional credit and to transfer air-time credit to their friends and comrades [5, 9]. In the Philippines, a middle-income developing country, the major mobile service providers Globe Telecoms and Smart Communications have developed large-scale mobile banking services. Their products, such as G-Cash and Smart Money, are used to transfer money around the country, circumventing the banking system [10, 5]. Smart is associated with Banco D'Oro in offering SmartMoney and a Maestro debit card that enable SMART clients to use conventional ATM and POS devices and can be used for mobile payment services. Globe Telecoms provides the G-Cash product, which supports local and international remittance transfers and payments with the support of over 700,000 agents countrywide. M-Pesa is the mobile banking service offered by Safaricom in Kenya [9]. In Kenya, M-Pesa services are available to subscribers with or without a bank account, and allow subscribers to deposit cash, transfer money, withdraw money through M-Pesa agents or a participating ATM network, buy Safaricom airtime, pay bills and manage their M-PESA account [11]. In Tanzania, the Mobipawa account, supported by e-fulusi Africa, allows subscribers to transfer, receive, save and withdraw money as well as purchase goods and services through the use of mobile phones [12]. Most of these successful deployments have been through consolidations. Consolidation provides shared access that creates the opportunity to gain greater returns from all sorts of infrastructure investments [4]. For instance, Vodafone partnered with Safaricom in Kenya for the M-Pesa product and has extended the same version to Tanzania and Afghanistan [13]. Zain and Western Union are working together to deliver mobile money transfer services in Africa and the Middle East through Zain's Zap platform. In the Philippines, the Western Union service allows mobile subscribers to receive funds sent from selected Western Union Agent locations directly on their mobile phone. MoneyGram International and SMART Communications enable SMART subscribers to receive money transfers to their SMART Money account on their mobile phones [14]. Globe Telecoms has partnered with a number of financial institutions in the Philippines in order to mitigate the legal constraints of running a financial service while holding a telecommunications license [15]. In India, the Nokia Money initiative, based on Obopay's platform for developing markets, is designed to work in partnership with multiple network operators and banks, involving distributors and merchants in a dynamic open ecosystem that will seamlessly provide the new services to customers.
The growing mobile phone penetration rate and the availability of mobile banking platforms and open-source software will fuel the spread and use of mobile payment services in developing markets.
3 Related Work
Mobile payment technologies are presented in [16], with the state of the art in mobile payment solutions and mobile payment principles. However, that paper points out a few areas that need future consideration, such as security and standardization. The M-Cash platform presented in [15] is built on a PostgreSQL database and open-source WAP and SMS gateways running on a UNIX platform. This platform is intended for money transfer that can accommodate low-income communities with limited security capabilities. This work can be used for developing markets, which are characterized by low income, but it lacks the ability to support local payments. [20] presented Signet: low-cost auditable transactions using SIMs and mobile phones, which can securely enable in-person transactions in developing markets. It provides groundwork for developing mobile financial transactions in a trusted environment at close range, independent of the main network. However, this system does not guarantee the security of authentication credentials. Initiatives on developing common standards for financial services have been around for some time now. Security standards, forums and organizations have been formed through merged vendors, manufacturers and developers. These can be used for defining different secured environments for mobile transactions [21]. For instance, Mobile electronic Transaction (MeT) was formed by Ericsson, Motorola and Nokia in order to develop a common framework for mobile e-business by creating a personal trusted device that integrates security and transaction applications into a mobile terminal platform [22, 23]. The Payment Card Industry Data Security Standard (PCI DSS) was jointly created by Visa International, MasterCard, Discover Financial Services, JCB International and American Express to create common industry security requirements. The PCI DSS is a multifaceted security standard that establishes comprehensive requirements for enhancing payment account data security [24, 25]. Another initiative is the creation of Secure Electronic Transaction (SET), as detailed in [24], which was jointly supported by MasterCard, Visa, Microsoft and Netscape. SET was created to ensure the security of financial transactions on the Internet. SET seeks to preserve relationships between merchants and acquirers as well as between payers and their bank. It concentrates on securely communicating credit card numbers to the existing financial infrastructure [8]. Though some of these standards share common aims, they have significant differences in their contents and intended use, which reflects their implicit design goals. Some are intended to be applied in a very prescriptive manner to a limited range of information systems and available security technologies, while others are intended to be applicable to a very wide range of information systems. According to [26], the lack of standards has given rise to many local and fragmented versions of mobile payments offered by different stakeholders. This lack of a common standard and the unequal market segment distribution lead to a lack of end-to-end security in mobile financial transactions and motivate the need to address the mobile payment model for developing markets.
4 Security Concerns in Mobile Payment Systems
Security is the major concern in the adoption of mobile payment in developing markets. As such, the adoption and widespread application of mobile payment depend on the strength of security. The following are security concerns in mobile payment systems for developing markets:
• Larger networks that have emerged as the result of consolidation are prone to security implications. Applications for mobile payment solutions are complex in nature, with a mismatching set of possibilities caused by the involvement of multiple players [27, 28, 29]. The lines differentiating these players have become blurred with the crossover of the mobile phone. The benefits of consolidation and sharing infrastructure are apparent, but the costs of poor security are often distributed.
• The proliferation of mobile payment technologies has led to a lack of cohesive technology standards that can provide a universal mode of payment. This lack of a common standard creates local and fragmented versions of mobile payment offered by different stakeholders, which leads to a lack of end-to-end security [26].
• In developing markets, mobile payment service providers depend on agents for customer acquisition and for managing liquidity. These agents access customers' sensitive information such as the user name, mobile number and other credentials that are used for identification and authentication purposes. The agents are not well equipped to preserve customers' sensitive information, which can easily lead to information leakage. Any loss of control over protected or sensitive information by service providers is a serious threat to business operations as well as, potentially, customer security [6, 30].
• With current technology and the wide spread of mobile applications, the mobile devices that carry mobile payments, and their users, pose a major risk to the security of mobile payment. Mobile devices can easily be infected with viruses that could perform unauthorized payments or send user information such as PIN codes through close-range communication technologies such as Bluetooth, Radio Frequency Identification (RFID) and Near Field Communication (NFC) [31].
For wide application and usability, a secured mobile payment model for developing markets must address these security concerns.
5 Mobile Payment Model for Developing Markets
According to [26], mobile payment solutions can be classified according to the type of payment effected and the technology adopted. The combination of these classifications presents three basic payment models: a bank account based model, a credit card based model and a telecommunication company billing based model. In each model, the mobile number is linked with the payment solution. In this paper, we suggest a bank account payment model capable of local payment services. In this model, mobile network operators or mobile banking switch operators are allowed to run the payment and account management platforms on behalf of the banks (see Fig. 1).
Fig. 1. Mobile Payment Platform (Author, 2010)
The mobile payment platform remains in the realm of the banks under appropriate subcontracting agreements with mobile operators. In this model, the user could order products and services from one or more service providers or businesses, who will then contact the bank through the mobile network operator for verification of the user and the amount of the purchase. This scenario could be used for peer-to-peer, business-to-consumer and business-to-business mobile payment, supporting both micro-payments and macro-payments. With current technology advancements, close-range identification and authentication between customers, agents and merchants can be achieved through Bluetooth, Radio Frequency Identification (RFID) and Near Field Communication (NFC) [32]. In this model, Bluetooth technology is preferred, as handsets with Bluetooth capability are becoming commonplace in developing markets, whereas RFID and NFC are not yet well established there. Bluetooth, with solutions such as Signet [20], will extend mobile payment to remote parts of developing markets. This local communication should address identification, authentication and confidentiality issues between the users, agents and merchants, as any failure of security in these three services may lead to a serious business threat.
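To make the close-range step more concrete, the sketch below shows one possible way a handset could authenticate a local payment authorization to an agent before it is forwarded to the bank-side platform. The message fields, the pre-shared key and the identifiers are assumptions introduced purely for illustration; they are not part of the model described above, which is specified only at the architectural level.

```python
# Illustrative sketch (not from the paper): a handset builds a payment
# authorization, protects its integrity with an HMAC, and the agent or the
# bank-side platform verifies it. The shared key handling is an assumption.
import hmac, hashlib, json, time

def sign_payment(shared_key: bytes, payer: str, payee: str, amount: str) -> dict:
    """Build a payment authorization and authenticate it with an HMAC tag."""
    msg = {
        "payer": payer,                 # e.g. the customer's mobile number
        "payee": payee,                 # e.g. an agent or merchant identifier
        "amount": amount,
        "timestamp": int(time.time()),  # limits replay of old authorizations
    }
    payload = json.dumps(msg, sort_keys=True).encode()
    msg["tag"] = hmac.new(shared_key, payload, hashlib.sha256).hexdigest()
    return msg

def verify_payment(shared_key: bytes, msg: dict) -> bool:
    """Agent-side / platform-side check of the authorization tag."""
    payload = json.dumps({k: v for k, v in msg.items() if k != "tag"},
                         sort_keys=True).encode()
    expected = hmac.new(shared_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(msg["tag"], expected)

key = b"key-provisioned-on-the-SIM"   # assumption: pre-shared with the platform
auth = sign_payment(key, payer="+255700000001", payee="agent-042", amount="15.00")
print(verify_payment(key, auth))      # True
```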
6 Conclusion
Mobile payment in developing markets is yet to be adopted on a large scale, though handsets, network operators and mobile payment technologies are all in place. The penetration of mobile phones in developing markets and the wide application of mobile banking make mobile payment an attractive market. However, common global standards for interoperability of devices and security of payments are essential for the success of mobile payment systems. Local payments with close-range technologies such as Bluetooth will speed up the usage of mobile payment in developing markets.
References
1. Microsoft and M-Com: Mobile Payments-Delivering Compelling Customer and Shareholder Value through a Complete, Coherent Approach, http://www.mcom.co.nz/assets/sm/263/12/M-Comandmicrosoft-MobilePaymentWhitePaper.pfd
2. Richard, B., Alemayehu, M.: Developing E-banking Capabilities in a Ghanaian Bank: Preliminary Lessons. Journal of Internet Banking and Commerce (2006)
3. Christen, B., Mas, I.: It's time to address the microsavings challenge, scalably. Enterprise Development and Microfinance 20(4), 274–285 (2009)
4. Prahalad, C., Hammond, A.: Serving the world's poor, profitably. Harvard Business Review 80(9), 48–59 (2002)
5. Wishart, N.: Micro-payment systems and their application to mobile networks. An infoDev Report, The International Bank for Reconstruction and Development. The World Bank, Washington (2006)
6. Pickens, M., Porteous, D., Rotman, S.: Scenarios for Branchless Banking in 2020. Technical report, CGAP and DFID (2009)
7. Jenkins, B.: Developing Mobile Money Ecosystems. IFC and Harvard Kennedy School (2008)
8. Asokan, N., Janson, P., Steiner, M., Waidner, M.: The state of the art in electronic payment systems. IEEE Computer 30(9), 28–35 (1997)
9. Donner, J.: Research approaches to mobile use in the developing world: A review of the literature. The Information Society 24(3), 140–159 (2008)
10. Porteous, D.: The enabling environment for mobile banking in Africa. Bankable Frontiers Associates, Boston, USA (2006)
11. Ivatury, G., Mas, I.: The early experience with branchless banking. Consultative Group to Assist the Poor, CGAP (2008)
12. E-fulusi: Products and Services-Mobipawa (2007), http://www.mobipawa.co.tz/mobipawa
13. Camner, G., Sjoblom, E.: Can the success of M-PESA be repeated? (2009), http://www.valuablebits.com/?p=430
14. Mostafaa, T., Mohsen, A.K., Tom, J.P.: Micro-Payment Systems and their application to Mobile Networks (2006), http://www.infodev.org/en/Publication.43.html
15. Mirembe, D., Kizito, J., Tuheirwe, D., Muyingi, H.: Model for Electronic Money Transfer for Low Resourced Environments: M-Cash. In: Proc. Third International Conference on Broadband Communications, Information Technology & Biomedical Applications, pp. 389–393 (2008)
16. McKitterick, D., Dowling, J.: State of the art review of mobile payment technology. TCD Computer Science Technical Reports, Citeseer (2003)
17. El-Masri, S., Suleiman, B.: A Framework for Providing Mobile Web Services. In: The Second International Conference on Innovations in Information Technology (IIT'05), Citeseer (2005)
18. Vijay, S.A.: Mobile Application Security Framework for the Handheld Devices in Wireless Cellular Networks. In: Wireless World Research Forum (WWRF), Helsinki, Finland (2007)
19. Beach, A., Gartrell, M., Han, R.: Solutions to Security and Privacy Issues in Mobile Social Networking. In: IEEE International Conference on Computational Science and Engineering, vol. 4, pp. 1036–1042 (2009)
20. Paik, M., Subramanian, L.: Signet: low-cost auditable transactions using SIMs and mobile phones. SIGOPS Oper. Syst. Rev. 43(4), 73–78 (2009)
21. Antovski, L., Gusev, M.: M-payments. In: Proceedings of the 25th International Conference on Information Technology Interfaces, ITI 2003, pp. 95–100 (2003)
22. Susanna, F., Bengt, S.: Secure electronic transactions – The mobile phone evolution continues. Ericsson Review 4, 162–167 (2001)
23. Ding, M., Hampe, J.: Changing Technological and Business Landscapes for mPayment: Is Local Mobile Payment Emerging as the Winner. In: 8th International Workshop on Mobile Multimedia Communications, Munich, Germany (2003)
24. Rowlingson, R., Winsborrow, R.: A comparison of the Payment Card Industry data security standard with ISO17799. Computer Fraud & Security 2006(3), 16–19 (2006)
25. Christian, J.M.: PCI DSS and Incident Handling: What is required before, during and after an incident. Technical report, SANS Institute (2009)
26. Carr, M.: Mobile Payment Systems and Services: An Introduction. Mobile Payment Forum, Citeseer (2007)
27. Nie, J., Hu, X.: Mobile Banking Information Security and Protection Methods. In: International Conference on Computer Science and Software Engineering (2008)
28. Mallat, N., Rossi, M., Tuunainen, V.: Mobile banking services. Communications of the ACM 47(5), 42–46 (2004)
29. Varshney, U.: Supporting Group-Oriented Mobile Services Transaction, pp. 10–15. IEEE Computer Society (2007)
30. Krugel, G., Solin, M., Desai, S., Paul, L., White, A.: Mobile Money for the Unbanked; Annual Report 2009. Technical report, GSMA (2009)
31. Wang, P., Gonzalez, M., Hidalgo, C., Barabasi, A.: Understanding the spreading patterns of mobile phone viruses. Science 324(5930), 1071 (2009)
32. Dellutri, F., Me, G., Strangio, M.: Local authentication with bluetooth enabled mobile devices. In: Proceedings of the Joint International Conference on Autonomic and Autonomous Systems and International Conference on Networking and Services, p. 72. IEEE Computer Society, Los Alamitos (2005)
Security Mapping to Enhance Matching Fine-Grained Security Policies Monia Ben Brahim, Maher Ben Jemaa, and Mohamed Jmaiel ReDCAD Laboratory, University of Sfax, National School of Engineers of Sfax, BP 1173, 3038 Sfax, Tunisia [email protected], {maher.benjemaa,Mohamed.Jmaiel}@enis.rnu.tn http://redcad.org/
Abstract. In the heterogeneous environment of Web services, it is common that the data processed by a service consumer and a service provider present several syntactic heterogeneities, such as data name heterogeneity and data structure heterogeneity. However, current approaches to security policy (SP) matching do not consider such heterogeneities that may exist between the protection scopes of fine-grained SPs. In this paper, we show how this can lead to wrong matching results and propose a security mapping approach to enhance the correctness of matching results when dealing with fine-grained SPs. Keywords: Web service, message security, fine-grained security, security policy, policy matching, security mapping, mediation policy.
1 Introduction
Message security is one of the major concerns when using Web services. This is why several new specifications have appeared to enable security for Web service messages. The two OASIS standards WS-Security [1] and WS-SecurityPolicy [2] are the most important of these specifications. The WS-Security standard defines mechanisms that enhance SOAP to protect messages. The WS-SecurityPolicy language is built on top of the WS-Policy [3] framework and defines a set of policy assertions that consider the WS-Security message security model. Message protection mainly means the confidentiality and the integrity of data transmitted through the message. Confidentiality and integrity can be assured by applying security mechanisms such as encryption and digital signature, based on the XML Encryption [4] and XML Signature [5] specifications. An important characteristic of encryption and signature is their high flexibility. They can be applied to arbitrary parts of the SOAP message, and even different parts of a message can be encrypted (or signed) using different encryption (or signature) algorithms and keys. We call a security policy that describes such fine-grained security a fine-grained security policy. Along with its functional description, a WS-Security-enabled WS provides its security policy description as well. Similarly, a security aware service requestor (a client
application) has its own security policy that specifies its preferred policy for protecting the messages exchanged with the service. Therefore, when searching for a WS, the requestor, given a set of functionally equivalent service implementations, only selects a service whose security policy is compatible with its own policy. To automate the check of compatibility between two security policies, the WS-Policy framework defines a mechanism called policy intersection. Because it is a purely syntactic approach, WS-Policy intersection has several limitations that sometimes lead to wrong results when computing the compatibility between two policies [10]. To overcome the lack of semantics, some researchers have tried to augment WS-Policy (including WS-SecurityPolicy) with semantics, either using ontologies [15], [17], [19], or by introducing an entailment relation that reflects the semantics of assertions and policies [10]. These semantics-based approaches greatly improve the effectiveness and accuracy of the policy intersection mechanism. However, semantic policy matching cannot be directly applied when matching two fine-grained security policies. In fact, while the scopes of protection (the data to be protected) of two fine-grained SPs to be matched are semantically equivalent, many syntactic heterogeneities may exist between them. This is due to the heterogeneous environment of Web services, where it is common that the service consumer and the service provider process syntactically heterogeneous data. Particularly, in Web service compositions, it is expected that the data flowing between a Web service and the services that precede and follow it in the process flow present many syntactic heterogeneities. The semantics added to WS-SecurityPolicy concern the message security concepts and the semantic relations between them, such as generalization and specialization relations, and not the syntactic relations between the protection scopes in the involved security policies. Thus, applying the security policy matching mechanisms offered by the semantic approaches without considering the syntactic heterogeneities of data can also lead to wrong matching results. In this paper, we propose a security mapping process to resolve the syntactic heterogeneities between the protection scopes of two fine-grained SPs. This security mapping process generates a mediation policy that can be matched with one of the original policies, without any heterogeneity issue. In this way, we improve the correctness of policy matching when dealing with fine-grained SPs. Besides, by providing a security mediation technique, our approach supports the loose coupling offered by SOA and Web service technology. The paper is organized as follows. In the next section, we present the WS-SecurityPolicy standard and the current semantic approaches based on it. We also explain how the presented approaches allow the specification of fine-grained SPs and how the SPs can be attached to the service and request descriptions. In Section 3 we exhibit the importance of a security mapping step before matching fine-grained SPs. Section 4 introduces the security mapping approach in detail. An illustrative example is presented in Section 5, followed by the conclusion and future work.
2 Security Policies
Web service providers and requestors specify their message-related security requirements and capabilities using policies. WS-SecurityPolicy has recently become a
standard for specifying policies related to message security. Moreover, based on WS-Policy (including WS-SecurityPolicy), which has strong industry support, several approaches propose extending WS-Policy with the specification of semantics as well as mechanisms for matching the semantics-enriched policies.
2.1 WS-SecurityPolicy
The WS-SecurityPolicy standard defines a set of security policy assertions for use with the WS-Policy framework. These assertions describe how SOAP messages are to be secured according to the WS-Security protocol. Typically, the published policies are compliant with the WS-Policy normal form. In the normal form, an All operator represents a policy alternative and comprises the alternative's assertions. Policy alternatives are grouped into an ExactlyOne operator. WS-SecurityPolicy assertions mainly describe: the token types for security tokens, the cryptographic algorithms, and the scope of protection, which means the parts of the SOAP message that shall be encrypted or signed. WS-Policy also defines the policy intersection algorithm to match two policies written in normal form. According to policy intersection, matching two policies is reduced to finding two compatible alternatives. A policy alternative A1 is compatible with a policy alternative A2 if for each assertion in A1 there exists a compatible assertion in A2, and vice versa. The assertion matching is a purely syntactic comparison of the Qualified Names of the assertions, without considering domain-specific (such as security) semantics. A policy matching process that relies only on this domain-independent matching of assertions is prone to produce wrong results [10].
2.2 Semantic Approaches
Despite its lack of semantics, the WS-Policy framework is widely accepted in industry. In order not to lose this industry support and to overcome the lack of semantics, many approaches propose enhancing WS-Policy with domain knowledge. In [15], the authors add constructs to the WS-Policy language to express semantic information. A semantic approach is also investigated in [17], where the authors enhance the policies of WS-Policy with semantics by creating the policy assertions based on terms from an ontology. In [17], a semantic policy matching algorithm is also proposed. The approach proposed by Garcia et al. [19] is similar to the one proposed in [17], but it mainly deals with the integrity and confidentiality of the message exchange between a Web service and a service client. Garcia et al. create a message security ontology as a foundation to support WS-Policy with semantics. In [10], the author explains the weaknesses of WS-Policy intersection and presents a new approach for checking the compatibility of policies. His solution is based on an entailment relation that reflects the semantics of assertions and policies. Concerning message security, the above-mentioned approaches enable more expressive specification of security policies and more accurate policy matching than the WS-Policy framework does. However, as we will explain in the rest of this paper, when dealing with fine-grained SPs, a security mapping process is needed before applying the policy matching mechanisms proposed by these approaches.
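As an illustration of the domain-independent matching just described, the following sketch abstracts normal-form policies to lists of alternatives, each alternative being a set of assertion Qualified Names; the nested parameters that real assertions carry are deliberately ignored, and the policy contents are illustrative assumptions.

```python
# Naive WS-Policy intersection: two policies are compatible if some alternative
# of one is compatible with some alternative of the other, where assertions are
# compared only by their Qualified Names.
SP = "{http://docs.oasis-open.org/ws-sx/ws-securitypolicy/200702}"

def alternatives_compatible(a1: set, a2: set) -> bool:
    # every assertion in a1 has a same-QName assertion in a2 and vice versa
    return a1 == a2

def policies_intersect(p1: list, p2: list) -> bool:
    return any(alternatives_compatible(a1, a2) for a1 in p1 for a2 in p2)

requestor_policy = [{SP + "SignedElements", SP + "EncryptedElements"}]
provider_policy  = [{SP + "SignedElements", SP + "EncryptedElements"},
                    {SP + "EncryptedElements"}]
print(policies_intersect(requestor_policy, provider_policy))   # True
```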
2.3 Fine-Grained Security Policy

The WS-Security protocol [1] allows coarse-grained encryption and signing of the whole message body. Moreover, based on the high flexibility offered by XML Signature and XML Encryption, it provides support for fine-grained signatures and encryption. In fact, the data to be protected in a SOAP message can be an arbitrary XML element or XML element content. This allows protecting only the critical data in the message instead of protecting the whole body. In general, this allows distinguishing between different body parts with different sensitivities (i.e., at different security levels) and protecting them with different encryption and signature algorithms and keys. Any security policy language that considers the WS-Security message security model provides a fine-grained security description. For instance, the WS-SecurityPolicy standard defines the protection assertions EncryptedElements and SignedElements to specify arbitrary elements in the message that require confidentiality protection and integrity protection, respectively. The message nodes to be protected are identified using XPath [6] expressions. Below, we give two examples of a fine-grained SP. The one on the left is written in the WS-SecurityPolicy language, and the one on the right represents a fine-grained SP which is semantics-enriched in the way proposed by Garcia et al. [19].
2.4 Policy Attachment

In the following, we explain the attachment of security policies to service and request descriptions.

Service policy attachment. A service provider can attach its (WS-Policy based) security policies to the Web service description. Policy documents can be attached to the WSDL elements at the endpoint level, at the operation level, and at the message level. The WS-PolicyAttachment [7] specification indicates how to associate policy documents with WSDL elements. The effective security policy for a SOAP message results from the combination of the corresponding endpoint, operation, and message policies.

Client policy attachment. In security-aware service discovery, the service requestor specifies its security policy along with its functional request. Dynamic service discovery usually requires semantic descriptions of the service as well as of the service
request. If the requestor wants to apply fine-grained security to the data to be exchanged with the service, it should attach its fine-grained security policy to a detailed description of its data. For instance, a client can use WSDL with semantic annotations (SAWSDL) to describe the service it expects and attach the fine-grained security policy documents to the appropriate WSDL elements. In fact, the use of SAWSDL or SAWSDL-based languages to describe the service request has been proposed in many approaches ([12], [16], [9]) in the recent literature.
3 The Need for Security Mapping

Semantic Web service matching returns a set of services that satisfy the functional requirements of the service requestor. Among these candidate services, only a service whose SP is compatible with the client policy can be chosen. Thus, client and service policies must be matched to check their compatibility. Each of the fine-grained SPs has its own scope of protection. Although these protection scopes are semantically equivalent data, they may be syntactically different. The syntactic heterogeneities1 include data name heterogeneity, data structure heterogeneity, as well as Inputs/Outputs number heterogeneity2.

Suppose that a service request includes the following Input:

  Name: Card
  Type: Card (containing CType and CNumber)
  Description: the credit card information, which includes the card type and the card number
The service request also specifies a fine-grained SP with the WS-SecurityPolicy language. The protection assertion of this policy is:

  <sp:EncryptedElements>
    <sp:XPath>/.../Card</sp:XPath>
  </sp:EncryptedElements>
On the other hand, suppose that a candidate service description includes the following Inputs:

  Input1: Name: Card,    Type: String, Description: the credit card type
  Input2: Name: CNumber, Type: String, Description: the credit card number

1 We ignore operation name heterogeneity because the operation name is an atomic data item and the related heterogeneity can be resolved by semantic matching only.
2 We consider both WSDL 1.1 and WSDL 2.0.
Let us suppose that the protection assertion of the service policy is identical to the protection assertion of the client policy and that the other security assertions of the two policies are compatible with each other. According to WS-Policy intersection, the two protection assertions are also compatible. Thus the security policies of the requestor and the provider are compatible and the candidate service can be chosen. Of course, such a matching yields a false positive result, because the requestor SP requires that both the card number and the card type be encrypted, while the service SP indicates that only the card type is encrypted in the incoming SOAP message. In fact, WS-Policy intersection completely ignores the parameters of assertions (i.e., attributes and child elements). So, even if the protection assertion of the service policy were empty, the two protection assertions would remain compatible according to WS-Policy intersection.

Even if we were to use semantic approaches to represent and match the two protection assertions, this would also lead to a false positive result. In fact, when dealing with protection scope comparison, semantic policy matching just performs string matching on the XPath expressions. This can also yield false negative results when the protection scopes have the same content but different names or paths.

Let us examine another counter-example where semantic policy matching (and of course WS-Policy intersection as well) does not return a correct result. Suppose now that a service request includes the following Inputs:

  Input1: Name: CType,   Type: String, Description: the credit card type
  Input2: Name: CNumber, Type: String, Description: the credit card number
The requestor indicates, through a fine-grained SP, that it is willing to protect the CType Input with the TDES encryption algorithm and the CNumber Input with the AES-256 encryption algorithm. Furthermore, suppose that a candidate service description includes the following Input:

  Name: Card_Info
  Type: String
  Description: the credit card information, which is a concatenation of the card type and the card number

In its SP, the candidate service provider requires that the Card_Info Input be encrypted using the AES-256 cipher. Actually, since the AES-256 cipher is stronger than the TDES cipher [14], the service and client policies are compatible. However, semantic policy matching considers that the two policies are not compatible, because the matcher does not consider the fact that Card_Info is the concatenation of CType and CNumber. Indeed, semantic approaches rely on a description (e.g., a security ontology) of the semantic relationships between security concepts. But this security description does not include any information about the scopes of protection.
To sum up, we are confronted with the following limitations of WS-Policy intersection and semantic policy matching:

− WS-Policy intersection is a syntactic approach and completely ignores the parameters of protection assertions.
− Semantic policy matching is useful only when matching coarse-grained SPs or when the protection scopes of the fine-grained SPs are semantically and syntactically identical.

Therefore, to enhance WS-Policy intersection and to effectively benefit from the semantic reasoning capability provided by semantic policy matching, we must resolve the syntactic heterogeneities between the protection scopes of fine-grained security policies. In the following section, we present a security mapping process that resolves these heterogeneities.
4 The Security Mapping Approach

Generally, a candidate service (more precisely, a service operation) has two security policies: an inbound security policy (ServiceInbSP), which is attached to its incoming message, and an outbound security policy (ServiceOutbSP), which is attached to its outgoing message. Similarly, a service request specifies two security policies: the request inbound security policy (RequestInbSP), which is related to the incoming message of the expected service, and the request outbound security policy (RequestOutbSP), which is related to the outgoing message of the expected service. So, security-aware service selection involves matching, on the one hand, the RequestInbSP with the ServiceInbSP and, on the other hand, the RequestOutbSP with the ServiceOutbSP.

To resolve the heterogeneity issues discussed in the previous section, before matching two fine-grained SPs we project one of them onto the protection scope of the other policy. We call this projection process a security mapping. The security mapping generates an SP that we call a mediation policy. The mediation policy can then be matched with the other SP. The security mapping is based on the XML schema mapping between the two protection scopes.

For the service discovery phase, one of the two policies must be projected, regardless of whether it is the client policy or the service policy. For the service invocation phase, however, this differs. In fact, in the service invocation process, a trusted mediator is expected to transform, based on the XML schema mapping, the secure SOAP messages exchanged between the client and the service. For this reason, from the selection step on, we distinguish between two kinds of security mapping: the client-to-service security mapping and the service-to-client security mapping. In the former, the RequestInbSP is projected onto the Inputs of the candidate service to generate a mediation SP that can be matched with the ServiceInbSP. In the latter, the ServiceOutbSP is projected onto the Outputs specified in the service request to generate a mediation SP that can be matched with the RequestOutbSP. Fig. 1 illustrates the interaction between the client-to-service security mapping process and the policy matching process.
Fig. 1. Interaction between the client-to-service security mapping and the policy matching
4.1 Security Mapping Algorithm

The security mapping process is based on the mapping between the XML schemas of the protection scopes. Given one target schema and several source schemas, XML schema mapping produces a set of mapping links that describe the way instances of the target schema are derived from instances of the source schemas [18]. In the case of client-to-service security mapping, the source schemas are the schemas of the request Inputs and the target schema is the schema of a service Input, while in the case of service-to-client security mapping, the source schemas are the schemas of the service Outputs and the target schema is the schema of a request Output. It is efficient for the XML schema mapping to exploit the semantic matching between request and service Inputs and Outputs, since this semantic matching has already been done in the semantic service matching phase [13].

In the following, we detail the algorithm of the client-to-service security mapping; the service-to-client mapping can be realized analogously. A (non-optimized) version of the client-to-service security mapping is shown in Fig. 2. The generation of the mediation SP is made iteratively for each service Input (line 7). The schema mapping between the XML schema of the service Input as the target schema and the XML schemas of the request Inputs as source schemas produces a set of mapping links between each data element of the service Input and the data elements of the request Inputs (lines 9-10). A mapping link describes how a concrete data element of the service Input can be fulfilled by the data elements of the request Inputs. Based on this mapping link and on the RequestInbSP, the basicSecurityMapping algorithm called in line 11 generates the mediation SP on the involved data element of the service Input. In fact, the basic security mapping determines how the mediation SP on this data element can be obtained from the client SP on the related data elements of the request Inputs. The basic security mapping operations corresponding to several kinds of schema mapping links are detailed in the next section.
Fig. 2. Client-to-service security mapping algorithm
Then, the mediation SP on the service Input is obtained by merging all mediation SPs related to the data elements of this Input (line 14). Similarly, the mediation SP on the incoming message of the service is the result of merging all mediation SPs related to the service Inputs (line 16). The merge operation is a policy-processing operation defined in WS-Policy. It consists of combining sub-policies written in the normal form to construct a single policy. In addition, we define a new policy-processing operation that we call clean. This operation improves the written form of a policy resulting from a merge operation by eliminating all repetitions that may occur after the merge step. For instance, if two SPs, each related to a service Input, are identical, merging these two SPs leads to a resulting SP that contains repeated assertions. The role of the clean operation is to eliminate these repetitions.

Most XML schema mapping approaches (such as [13] and [8]) apply a top-down strategy to generate mapping links. At the top level, correspondences between complex elements are established. At the bottom level, finer-level matching is done to identify correspondences among the simple elements inside each pair of compatible
complex elements. A possible optimization of the security mapping algorithm is to avoid unnecessary finer-level schema mappings between matched complex elements whenever the SP on the source element is coarse-grained. For instance, given that the two complex schema elements shown in Fig. 3 are matched and that SPPerson (the security policy on the Person data element) is coarse-grained, we can deduce the mediation SP on the Employee data element directly, and it is not necessary to refine the mapping process to identify correspondences among the elements inside the Person and Employee schemas.
Fig. 3. Two matched complex schema elements
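To make the merge and clean policy-processing operations concrete, the following toy C sketch (a simplified illustration of our own; a real implementation would operate on WS-Policy normal forms rather than on plain strings) merges two identical per-Input sub-policies and then removes the repetition introduced by the merge.

#include <stdio.h>
#include <string.h>

#define MAX_ASSERTIONS 32

/* A sub-policy modeled as a flat list of assertion strings
 * (bounds checks omitted for brevity). */
typedef struct {
    const char *assertion[MAX_ASSERTIONS];
    int count;
} Policy;

/* merge: combine two sub-policies into a single policy. */
static Policy merge(const Policy *a, const Policy *b) {
    Policy out = { {0}, 0 };
    for (int i = 0; i < a->count; i++) out.assertion[out.count++] = a->assertion[i];
    for (int i = 0; i < b->count; i++) out.assertion[out.count++] = b->assertion[i];
    return out;
}

/* clean: drop assertions that exactly repeat earlier ones. */
static Policy clean(const Policy *p) {
    Policy out = { {0}, 0 };
    for (int i = 0; i < p->count; i++) {
        int seen = 0;
        for (int j = 0; j < out.count; j++)
            if (strcmp(p->assertion[i], out.assertion[j]) == 0) { seen = 1; break; }
        if (!seen) out.assertion[out.count++] = p->assertion[i];
    }
    return out;
}

int main(void) {
    /* Two identical per-Input mediation sub-policies: merging them
     * duplicates the assertion, and clean removes the duplicate. */
    Policy sp1 = { { "Encrypt /.../Card with AES-256" }, 1 };
    Policy sp2 = { { "Encrypt /.../Card with AES-256" }, 1 };
    Policy merged  = merge(&sp1, &sp2);
    Policy cleaned = clean(&merged);
    printf("after merge: %d assertions, after clean: %d assertion(s)\n",
           merged.count, cleaned.count);
    return 0;
}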
4.2 Basic Security Mapping Operations

The aim of this section is to define the basic security mapping operations corresponding to several kinds of XML schema mapping operations. According to [8], there exist various possible transformation operations between two XML schema elements or attributes. The most common are connect, rename, concatenate, split, union, and selection:

− The connect operation, defined as t = connect(s), generates a construction3 t which has the same content and label as s, without any modification. In this case, the security policy on the construction t is SPt = SPs.
− The rename operation, defined as t = rename(s), generates a construction that is the same as the construction s, but with a different name t. In this case, SPt = Keep(SPs). The Keep function preserves the same SP as on s, but takes into account the linguistic differences between s and t.
− The concatenate4 operation, defined as t = concatenate(s1, ..., si), generates a construction t whose value is obtained by concatenating the values of s1, ..., si. In this case, SPt = Merge(MaxConf(SPs1, ..., SPsi), MaxInteg(SPs1, ..., SPsi)). The MaxConf function returns the strongest confidentiality policy among SPs1, ..., SPsi, while the MaxInteg function returns the strongest integrity policy among SPs1, ..., SPsi. We rely on [14] to compare confidentiality mechanisms and integrity mechanisms.
− The split operation, defined as (t1, ..., ti) = split_criteria(s), generates the constructions t1, ..., ti by splitting the construction s according to a separation criterion. An example of a separation criterion is "white space" in the case of strings. In this case, SP(t1, ..., ti) = Keep(SPs).
− The union operation, defined as t = union(s1, ..., si), generates a construction t whose content is the union of the si values. In this case, SPt = Merge(SPs1, ..., SPsi). The merge operation combines the policies SPsi to form a single security policy.
− The selection operation, defined as t = selectP(s) (where P is a predicate), generates a construction whose content is the part of the content of s that satisfies the predicate P. In this case, SPt = Get(SPs, t_access_path). The Get operation finds, based on the access path of t in s, the SP defined on t within the larger SP defined on s.

When the source and target XML schemas are large and complex and the source data security policy is very fine-grained, many combinations of the above-mentioned basic operations will be necessary to realize a complete security mapping as detailed in the previous section.

3 Construction refers to schema elements or attributes.
4 Also called the merge operation; we use the concatenate appellation here to avoid any confusion with the merge policy operation.
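As a rough illustration of how the rule for the concatenate operation could be coded, the C sketch below (our own simplification, not the implementation of this paper) represents the confidentiality part of an SP only by the name of its encryption algorithm and lets MaxConf pick the strongest one; the strength ordering used (AES-256 over TDES over no requirement) follows the comparison relied upon in Section 3, and integrity would be handled analogously by MaxInteg.

#include <stdio.h>
#include <string.h>

/* Simplified model: the confidentiality policy on a construction is just the
 * name of its encryption algorithm ("" means no encryption requirement). */
static int strength(const char *cipher) {
    if (strcmp(cipher, "AES-256") == 0) return 2;   /* strongest in the text */
    if (strcmp(cipher, "TDES") == 0)    return 1;
    return 0;                                       /* no requirement        */
}

/* MaxConf: strongest confidentiality policy among the source constructions. */
static const char *max_conf(const char *sp[], int n) {
    const char *best = "";
    for (int i = 0; i < n; i++)
        if (strength(sp[i]) > strength(best)) best = sp[i];
    return best;
}

int main(void) {
    /* concatenate: t = concatenate(s1, s2) => SPt includes MaxConf(SPs1, SPs2).
     * Example from Section 3: CType encrypted with TDES and CNumber with
     * AES-256, concatenated into Card_Info. */
    const char *sources[] = { "TDES", "AES-256" };
    printf("SP on the concatenated construction: encrypt with %s\n",
           max_conf(sources, 2));
    /* connect / rename: SPt = SPs (rename additionally adjusts element names);
     * union: SPt = Merge(SPs1, ..., SPsi); split: each part keeps SPs;
     * selection: SPt = the part of SPs addressed by t's access path. */
    return 0;
}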
5 An Illustrative Example

In this section, we present an example to clarify how the mediation SP is generated by the security mapping process. We assume that a service requestor is searching for a pharmacy Web service. The service request specifies an Input which is an E-prescription. Fig. 4 shows the XML schema of the E-prescription in the form of a schema graph. The structure of the E-prescription is inspired by [11]. The RequestInbSP concerning the confidentiality and the integrity of the E-prescription is as follows5:
Fig. 4. The schema graph of the E-prescription Input described in the service request

5 For simplicity, we omit many details of a message security policy.
− Encrypt E-prescription/Prescriber/LName and E-prescription/Prescriber/Speciality with AES-256, AND
− Encrypt E-prescription/DrugList/Drug/Name with TDES, AND
− Sign E-prescription/DrugList, E-prescription/PatientInfo, and E-prescription/InsuranceInfo.

Moreover, we assume that a candidate service specifies an E-prescription Input whose schema graph is shown in Fig. 5. Due to space constraints, we detail only the security mapping corresponding to two XML schema mapping links.

The first schema mapping link is:

  DoctorInfo = union(concatenate(select[1](Prescriber), select[2](Prescriber)), select[3](Prescriber)).

So:

  SPDoctorInfo = Merge(Merge(MaxConf(Get(SPPrescriber, Prescriber/FName), Get(SPPrescriber, Prescriber/LName)), MaxInteg(Get(SPPrescriber, Prescriber/FName), Get(SPPrescriber, Prescriber/LName))), Get(SPPrescriber, Prescriber/Speciality)).

Since there are no integrity requirements specified by the service requestor on the FName and LName data elements, and given that the AES-256 cipher is stronger than the TDES cipher, we get:

  SPDoctorInfo = Merge(Get(SPPrescriber, Prescriber/LName), Get(SPPrescriber, Prescriber/Speciality)).

From this we can deduce the mediation SP on the DoctorInfo data element, which is:

− Encrypt E-prescription/Prescriber/DoctorInfo/FullName with AES-256 (Med1), AND
− Encrypt E-prescription/Prescriber/DoctorInfo/Speciality with AES-256 (Med2).

The second schema mapping link is:

  DrugList = connect(E-prescription/DrugList).

So SPDrugList = SPE-prescription/DrugList. This means that the mediation SP on the DrugList data element is:

− Sign E-prescription/DrugList (Med3).
The security mapping process continues in this way and, at the end, the clean of the merge of all the (Medi) security policies leads to the following mediation SP:

− Encrypt E-prescription/Prescriber/DoctorInfo/FullName and E-prescription/Prescriber/DoctorInfo/Speciality with AES-256, AND
− Encrypt E-prescription/DrugList/Drug/Description/Designation with TDES, AND
− Sign E-prescription/DrugList, E-prescription/PatientInfo, and E-prescription/InsuranceDetails.

The obtained mediation SP can then be matched with the ServiceInbSP.
Fig. 5. The schema graph of the E-prescription Input described by the candidate service
6 Conclusion and Future Work

Current security policy matching approaches do not return correct results when matching fine-grained SPs. This is due to syntactic heterogeneities of the protection scopes. To resolve this problem, we presented in this paper a security mapping approach that relies on the generation of a mediation SP. We distinguished between two kinds of security mapping: the client-to-service security mapping, which should be applied when the SPs concern the incoming messages of the service, and the service-to-client security mapping, which is suitable for SPs attached to the service's outgoing messages. Our approach is useful when a loosely coupled service requestor and service provider process heterogeneous data. It is also of particular interest in service compositions, where the heterogeneity of the data flow is inevitable. We also gave an idea of how to attach the consumer SP to its service request, and we expect that a security mapping-based matching of SPs can be easily integrated into semantic service discovery approaches, especially those doing the concrete data mapping as a part of the service matching process itself (like the work presented in [16]). The implementation of the security mapping process and its integration into security-aware service discovery, composition, and invocation will be the focus of our future work.
References

1. WS-Security, http://www.oasis-open.org/committees/wss
2. WS-SecurityPolicy, http://docs.oasis-open.org/ws-sx/ws-securitypolicy/
3. WS-Policy, http://www.w3.org/TR/ws-policy/
4. XML Encryption, http://www.w3.org/Encryption/2001/
5. XML Signature, http://www.w3.org/Signature/
6. XPath, http://www.w3.org/TR/xpath20/
7. WS-PolicyAttachment, http://www.w3.org/TR/ws-policy-attach
8. Boukottaya, A., Vanoirbeek, C.: Schema matching for transforming structured documents. In: DocEng '05: Proceedings of the 2005 ACM Symposium on Document Engineering, pp. 101–110. ACM, New York (2005)
9. Chabeb, Y., Tata, S., Ozanne, A.: YASA-M: A semantic Web service matchmaker. In: Proceedings of the IEEE International Conference on Advanced Information Networking and Applications. IEEE Computer Society, Los Alamitos (2010)
10. Hollunder, B.: Domain-specific processing of policies or: WS-Policy intersection revisited. In: ICWS 2009: Proceedings of the IEEE International Conference on Web Services, pp. 246–253. IEEE Computer Society, Los Alamitos (2009)
11. eHealth Initiative: Executive summary - electronic prescribing: Toward maximum value and rapid adoption, http://www.ehealthinitiative.org/initiatives/erx/document.asp
12. Klusch, M., Kapahnke, P., Zinnikus, I.: Hybrid adaptive Web service selection with SAWSDL-MX and WSDL-Analyzer. In: Aroyo, L., Traverso, P., Ciravegna, F., Cimiano, P., Heath, T., Hyvönen, E., Mizoguchi, R., Oren, E., Sabou, M., Simperl, E. (eds.) ESWC 2009. LNCS, vol. 5554, pp. 550–564. Springer, Heidelberg (2009)
13. Lécue, F., Salibi, S., Bron, P., Moreau, A.: Semantic and syntactic data flow in Web service composition. In: ICWS '08: Proceedings of the 2008 IEEE International Conference on Web Services, pp. 211–218. IEEE Computer Society, Los Alamitos (2008)
14. Ono, K., Nakamura, Y., Satoh, F., Tateishi, T.: Verifying the consistency of security policies by abstracting into security types. In: ICWS '07: Proceedings of the 2007 IEEE International Conference on Web Services, pp. 497–504. IEEE Computer Society, Los Alamitos (2007)
15. Tosic, V., Erradi, A., Maheshwari, P.: On extending WS-Policy with specification of XML Web service semantics. In: CEA '07: Proceedings of the 2007 Annual International Conference on Computer Engineering and Applications, pp. 407–412. World Scientific and Engineering Academy and Society (WSEAS), Singapore (2007)
16. Tran, V.X., Puntheeranurak, S., Tsuji, H.: A new service matching definition and algorithm with SAWSDL. In: Proceedings of the Third IEEE International Conference on Digital Ecosystems and Technologies, pp. 371–376. IEEE Computer Society, Los Alamitos (2009)
17. Verma, K., Akkiraju, R., Goodwin, R.: Semantic matching of Web service policies. In: Proceedings of the Second Workshop on Semantic and Dynamic Web Processes, pp. 79–90 (2005)
18. Xue, X.: Automatic Mapping Generation and Adaptation for XML Data Sources. PhD thesis, Université de Versailles (2006)
19. Zuquim Guimaraes Garcia, D., Beatriz Felgar de Toledo, M.: Web service security management using semantic Web techniques. In: SAC '08: Proceedings of the 2008 ACM Symposium on Applied Computing, pp. 2256–2260. ACM, New York (2008)
Implementation and Evaluation of Fast Parallel Packet Filters on a Cell Processor

Yoshiyuki Yamashita1 and Masato Tsuru2

1 Saga University, Honjyo 1, Saga, 840-8502 Japan, [email protected]
2 Kyushu Institute of Technology, Kawazu 680-4, Iizuka, 820-8502 Japan, [email protected]
Abstract. Packet filters are essential in most areas of modern networked information technologies. While high-end, expensive routers and firewalls are implemented in hardware, flexible and cost-effective packet filters are usually software-based solutions on general-purpose CPUs, at the price of lower performance. The authors have previously studied methods of applying code optimization techniques to packet filters executing on a single-core processor. In this paper, by utilizing the multi-core Cell Broadband Engine processor together with software pipelining, we construct a parallelized and SIMDed packet filter that is 40 times faster than the naive C program filter executed on a single core.
1 Introduction
Packet filters basically inspect the header and/or payload of each incoming packet and, accordingly, perform appropriate actions (pass, discard, logging, modification, etc.) on the packet based on a given filter rule (a set of filter patterns). Packet filters are essential for network traffic management and security management, and so are implemented in a variety of systems and devices, including not only IP routers and firewalls but also various types of networked information equipment. Software-based packet filters on general-purpose CPUs are cost-effective and flexible, but are generally relatively slow, whereas hardware-based packet filters (e.g., packet filters using ASIC or FPGA [7,13]) are fast, but expensive and less flexible. Recently, the rapid growth of network bandwidth has led to the requirement for high-speed packet filters. On the other hand, emerging applications of packet filters require much more scalability and flexibility in handling filter rules, which should be easily modifiable in response to changes in circumstances or requirements. In order to realize packet filters that enable both flexibility and high-speed operation in a cost-effective manner, it is of practical importance to make software-based packet filters fast enough that
This work was supported in part by Hitachi, Ltd. and National Institute of Information and Communications Technology.
the filters would be effective even for a large filter rule consisting of a number of filter patterns or under an intensive traffic load. This requires an effective combination of both higher-level optimization, related to algorithmic structures that are adaptable to the input packet sequence, and lower-level (machine code) optimization, related to acceleration techniques from compiler research. To address this problem, the authors have focused on code-level optimization and shown that the software pipelining technique [1], one of the aggressive loop optimization techniques, makes the filters more than four times faster than naive implementations in C [9,10,11,12]. In this paper, in addition to software pipelining, we further apply parallelization and SIMD techniques on the multi-core Cell Broadband Engine processor [4] (hereinafter, Cell) and show that the parallelized program executed on seven cores is more than 40 times faster than the naive C program executed on a single core.

In order to implement fast packet filters, we can consider the following two types of optimization (see [12] for details): type A optimization is applied to a packet-based loop to process intensive input traffic (i.e., a huge number of input packets), whereas type B optimization is applied to a pattern-based loop to handle a large filter rule (a large number of filter patterns in this paper). Two of the authors' previous works [9,10] discussed type A optimization, while the other two [11,12] discussed type B optimization. In this paper, the parallelization and SIMD techniques are applied as type A optimizations and software pipelining as the type B optimization. Thus the optimized program developed in this research is effective both for intensive input traffic and for a large filter rule.

Figure 1 illustrates our experimental research framework. We use a SONY PLAYSTATION 3 (hereinafter, PS3) as the packet filtering (receiver-side) computer. The parallelized packet filter programs introduced in Section 3 run on the Cell in the PS3. In the virtual network experiments discussed in Section 5, packets are stored on a hard disk and read by the PS3. In the real network experiments discussed in Section 6, we use an Apple Mac Pro as the packet generating (sender-side) computer, connected to the PS3 via 1 Gbps Ethernet. In this paper we mainly discuss the average execution time per packet on both virtual and real networks, but we also discuss the average latency time of processing each packet (the average time period between receiving a packet and obtaining the corresponding action for the packet) on a real network. The execution time per packet and the latency time of processing each packet indicate the throughput and latency aspects of our packet filter program, respectively.
2 Specification of Filter Rules
The syntax and semantics of the filter rules we discuss in this paper are based on the syntax and semantics of the common popular static IP filter rules, such as the Cisco IOS access list [2].
Fig. 1. Experimental framework for parallel packet filters on Cell Broadband Engine

ip filter 1 reject X.X.X.0/24  *          *           *  *
ip filter 2 pass   *           X.X.X.0/24 established *  *
ip filter 3 pass   X.X.X.X/29  X.X.X.X    tcp         *  smtp
ip filter 4 pass   X.X.X.0/24  X.X.X.X    tcp         *  5000-6000
ip filter 5 pass   *           *          udp         *  domain
ip filter 6 pass   X.X.X.X/29  X.X.X.X    tcp         *  pop3
...

Fig. 2. Example of a filter rule (each X.X.X.X is replaced with a concrete IP address)
We assume that a filter rule consists of one or more filter patterns, which are stored in a text-formatted file. Figure 2 is such an example, in which each line represents one filter pattern to check the IP addresses, protocol, and port numbers of every input packet. After being invoked, a packet filter program reads the rule file and translates the contents of the filter patterns into an inner binary representation stored in a memory array. The program then checks whether an input packet matches the conditions that each filter pattern represents, proceeding from the first filter pattern at the top line toward the bottom line. If the packet matches a filter pattern, the program performs the corresponding action of the pattern; otherwise, if the packet matches no filter pattern in the rule, the program drops the packet1. Hereinafter, we generally refer to the entire set of rule patterns simply as a rule and to each filter pattern simply as a pattern.

The following is the syntax of every filter pattern in this paper:

  ip filter n action sip dip proto spt dpt

The parameters n, action, sip, dip, proto, spt, and dpt are defined below. n is a pattern identification number (unsigned 16-bit integer); we assume that the numbers of the patterns are arranged in ascending order. action is the action taken when the pattern is chosen; usually, the action is pass, reject, or another special action.

1 This is referred to as a default rule. In some cases, the default rule may be to accept the packet if there is no pattern to match.
result = REJECT;
for(int i = 0; i < n_udp_patterns; i++){
    SIP = packet.sip & udp_pattern[i].sip_bit_mask;
    if(SIP == udp_pattern[i].sip){
        DIP = packet.dip & udp_pattern[i].dip_bit_mask;
        if(DIP == udp_pattern[i].dip){
            if(udp_pattern[i].proto == udp){
                if(packet.spt >= udp_pattern[i].spt_minimum_value){
                    if(packet.spt <= udp_pattern[i].spt_maximum_value){
                        if(packet.dpt >= udp_pattern[i].dpt_minimum_value){
                            if(packet.dpt <= udp_pattern[i].dpt_maximum_value){
                                FLAGS = packet.udp_flag_field
                                        & udp_pattern[i].udp_flag_field_bit_mask;
                                if(FLAGS == udp_pattern[i].udp_flag_field){
                                    result = udp_pattern[i].action;
                                    break;
                                }
                ...
            } else if(udp_pattern[i].proto == *){
                result = udp_pattern[i].action;
                break;
            }
            ...
}

Fig. 3. C program for filtering UDP packets
sip (or dip) is the source (destination) IP address of the input packet, which takes one of the following three forms: "*" is a wild card, indicating an arbitrary address; "x1.x2.x3.x4" is a concrete address, where each xi is an unsigned 8-bit integer value; and "x1.x2.x3.x4/m" is a concrete address with a mask bit-width m, where m is a non-negative integer such that 0 ≤ m ≤ 32. proto is a protocol identifier, which is one of the following four patterns: "*" is a wild card, indicating an arbitrary protocol; "tcp" indicates the TCP protocol; "udp" indicates the UDP protocol; and "established" indicates a TCP packet after the TCP connection is established2. spt (or dpt) is the source (destination) port number of the input packet, which takes one of the following four forms: "*" is a wild card, indicating an arbitrary port number (we specify an arbitrary port number if the protocol of the pattern is not tcp/udp); "p" is a concrete port number (unsigned 16-bit integer); "p1-p2" is a range of port numbers specified by two port numbers p1 ≤ p2; and "name" is a specific port name such as smtp, www, or domain.

It is easy to write a computer program to perform the packet filtering process explained above. Figure 3 shows the concrete but straightforward form of the loop program specialized for UDP packets, where the input packet is assumed to
2 The packet filter must check the ack and rst bits in the flag field of the TCP packet.
Fig. 4. Program architectures of C1, C6, S6-4, and SS6-4 (C1: naive C program, PPE uses no SPEs; C6: naive C program on the PPE and 6 SPEs; S6-4: SIMDed code processing 4 packets at a time; SS6-4: SIMDed, software-pipelined code)
be stored in the variable packet and the filter patterns in the array udp_pattern, whose entries are structured with data members such as sip, dip, and proto. Such a loop program, however, is very slow without code optimization techniques. The authors' studies [9,10,11,12] show that the software pipelining technique [1] makes the program more than four times faster than the naive C program. In this paper, we further investigate the effects of parallelization in a multi-core environment.
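The inner binary representation of a pattern is not spelled out in this paper; the following C sketch shows one plausible layout using the member names that appear in the Fig. 3 listing, together with a small helper that turns the mask bit-width m of the x1.x2.x3.x4/m notation into the sip_bit_mask/dip_bit_mask values. The field widths, their ordering, and the helper are assumptions of this sketch only.

#include <stdint.h>
#include <stdio.h>

/* One plausible in-memory layout for a filter pattern; the member names
 * follow the listing of Fig. 3, while the field widths and ordering are
 * assumptions of this sketch. */
typedef struct {
    uint32_t sip, sip_bit_mask;              /* source address and mask       */
    uint32_t dip, dip_bit_mask;              /* destination address and mask  */
    uint8_t  proto;                          /* tcp, udp, established, or '*' */
    uint16_t spt_minimum_value, spt_maximum_value;  /* source port range      */
    uint16_t dpt_minimum_value, dpt_maximum_value;  /* destination port range */
    uint8_t  action;                         /* pass or reject                */
    uint8_t  udp_flag_field, udp_flag_field_bit_mask; /* flag check (cf. Fig. 3) */
} udp_pattern_t;

/* Build the network mask from the /m bit-width of "x1.x2.x3.x4/m"
 * (0 <= m <= 32); a wildcard address corresponds to mask 0. */
static uint32_t bit_mask(int m) {
    if (m <= 0)  return 0u;
    if (m >= 32) return 0xffffffffu;
    return ~0u << (32 - m);
}

int main(void) {
    printf("/24 mask = 0x%08x, /29 mask = 0x%08x\n",
           (unsigned)bit_mask(24), (unsigned)bit_mask(29));
    return 0;
}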
3 Parallelized Packet Filters
The Cell consists of a general-purpose processor core named the PowerPC Processor Element (hereinafter, PPE) and eight calculation-specific processor cores named Synergistic Processor Elements (hereinafter, SPEs). We can use 6 of the 8 SPEs under the PLAYSTATION 3 Linux environment. Using these cores, we develop and investigate the following four kinds of filter programs, which for convenience we call "C1", "C6", "S6-4", and "SS6-4"; their program architectures are illustrated in Figure 4 for intuitive understanding.

C1 (a C program on 1 PPE) runs on the PPE only, receiving packets and filtering them.

C6 (a C program on 1 PPE and 6 SPEs) runs in parallel, where the parallelization is controlled as follows.
1. When the PPE receives a packet, it assigns one of the resting SPEs (for example, the third left-most SPE in Figure 4) to filter the packet, or suspends, if all the SPEs are busy, until at least one SPE returns to the resting state.
2. The assigned SPE transfers the header of the packet from the main memory to its own local memory3, filters the packet, sends the corresponding action (pass/reject) to the PPE, and moves into the resting state again.
3. The PPE waits for the next event and performs either of the following actions:
   (a) When the PPE notices the action sent from an SPE, it performs the action.
   (b) When the PPE receives a successive packet, it does the same as in step 1 above.

S6-4 (an assembly code program on 1 PPE and 6 SPEs for 4 packets) runs in the same way as C6 above. The difference is that the PPE packs four successive packets and each SPE executes SIMD instructions to process the packed packets simultaneously. The SPE code of this program is written in assembly by hand.

SS6-4 (a software-pipelined S6-4) runs in the same way as S6-4 above. The difference is that the SPE code is software-pipelined in terms of rule patterns; thus, the larger the rule size is, the faster the SPEs run.

The authors use the pcap library [6] for sending/receiving packets and the IBM Cell Broadband Engine Software Development Kit [3] for multi-core parallelization. Software pipelining is the key technology of the authors' previous studies [9,10,11,12] and is also one of the key technologies of this research. One difference is that the previous studies apply the special software pipelining technique called EMS [8] for loops with conditional branches, whereas in this research it is enough to apply the standard technique, which is applicable only to loops without conditional branches and can be found in many compiler textbooks, because all the conditional branches in the loop of Figure 3 are removed by using SIMD compare instructions [5].
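The following C sketch illustrates, in scalar form, how the conditional branches of the Fig. 3 loop can be replaced by mask computations: each comparison yields an all-ones or all-zeros mask and the masks are combined with bitwise AND, which is what the SPU SIMD compare instructions do for four packed packets at once. This is only a model of the idea; the actual S6-4/SS6-4 code is hand-written SPE assembly operating on 128-bit vectors, and the pattern values below are illustrative.

#include <stdint.h>
#include <stdio.h>

/* Scalar stand-ins for SIMD compare instructions: each comparison returns
 * an all-ones or all-zeros mask instead of taking a branch. */
static uint32_t mask_eq(uint32_t a, uint32_t b) { return (a == b) ? ~0u : 0u; }
static uint32_t mask_ge(uint32_t a, uint32_t b) { return (a >= b) ? ~0u : 0u; }
static uint32_t mask_le(uint32_t a, uint32_t b) { return (a <= b) ? ~0u : 0u; }

int main(void) {
    /* one example pattern: a /24 source prefix and destination ports
     * 5000-6000 (illustrative values only) */
    uint32_t pat_sip = 0xc0a80100, pat_sip_mask = 0xffffff00;
    uint32_t dpt_min = 5000, dpt_max = 6000;

    /* four packets handled "in parallel" (one lane per packet on the SPU) */
    uint32_t sip[4] = { 0xc0a80105, 0xc0a80205, 0xc0a80106, 0xc0a80107 };
    uint32_t dpt[4] = { 5500, 5500, 4999, 6000 };

    for (int k = 0; k < 4; k++) {
        /* all checks are evaluated and combined without any branch */
        uint32_t m = mask_eq(sip[k] & pat_sip_mask, pat_sip)
                   & mask_ge(dpt[k], dpt_min)
                   & mask_le(dpt[k], dpt_max);
        printf("packet %d: %s\n", k, m ? "matches the pattern" : "no match");
    }
    return 0;
}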
4 Experimental Environment
Figure 5 illustrates our experimental environment. As mentioned in Section 1, the packet generating (sender-side) computer is an Apple Mac Pro 8-core (2.8 GHz, Mac OS X ver. 10.5) and the packet filtering (receiver-side) computer is a SONY PLAYSTATION 3 (3.2 GHz, Fedora Core 5 Linux ver. 2.6.16). We treat only UDP packets.3
3 The PPE and SPEs have no shared memory among them, but each SPE has its own local memory, so we have to transfer data between the main memory and each SPE's local memory before and after the SPE filters a given packet. The parallelized programs C6, S6-4, and SS6-4 use DMA (direct memory access) to transfer data from the main memory to a local memory and use a mailbox queue to transfer 32-bit data from a local memory to the main memory.
Fig. 5. Experimental parameters (the Mac Pro sends UDP packets only, 5,000 pps to 200,000 pps, 128 bytes or 1428 bytes long, over 1 Gbps Ethernet to the PLAYSTATION 3 for the real network; 50,000 packets stored on the hard disk for the virtual network)
In the case of the virtual network, 50,000 packets sent from the packet generating computer are stored beforehand on a hard disk of the packet filtering computer and then read by the packet filtering computer. In the case of the real network experiment, the packet generating computer continuously sends test packets for three seconds at a given constant bit rate, but the packet filtering computer ignores the incoming test packets in the first and the last second; that is, we investigate the stable performance during only the middle one-second period. The sending rate is 5,000 pps (packets per second) to 200,000 pps. We do not use TCP packets in our experiments because, in the case of TCP packets, the operating system controls the sending rate dynamically, regardless of the user setting. In order to investigate the effect of packet length, we treat short and long packets of 128 bytes and 1428 bytes, respectively. Four kinds of filter rules, whose sizes (numbers of rule patterns) are 1, 1000, 2000, and 3000, are prepared. We expect that the longer the filter rule is, the higher the filtering cost is, because we set up the rules so that every packet in our experiments matches only the pattern at the last line of the rule.
5 Off-Line Experiments on a Virtual Network
In the off-line experiments, we measure the time period during which each of the four filter programs C1, C6, S6-4, and SS6-4 reads 50,000 packets from the hard disk and filters them (see Figure 5), and then analyze the performance of the programs from several viewpoints.

5.1 Total Execution Time per Packet
Table 1 shows the average total execution times per packet (the total times divided by 50,000), where each of them includes the packet receiving (reading) time from the virtual network and the packet filtering time. First of all, it is natural that the total execution times of C1 and C6 are linearly proportional to the rule size. Second, in contrast, those of S6-4 and SS6-4 are almost constant, independent of the rule size, because the
Table 1. Average total execution time per packet on a virtual network (including packet receiving time)

  rule size     C1     C6   S6-4  SS6-4
  1            1.2    5.4    2.5    2.5
  1000        17.2    8.4    2.6    2.6
  2000        33.2   15.8    2.6    2.5
  3000        49.4   23.3    2.7    2.6

  Time unit is μsec.

Table 2. Average effective filtering time

  rule size     C1     C6   S6-4  SS6-4
  1            0.1   ≤0.1   ≤0.1   ≤0.1
  1000        15.7    7.4    0.7    0.4
  2000        31.6   14.8    1.5    0.8
  3000        47.8   22.0    2.2    1.1

  Time unit is μsec.
execution time of the SPEs (that is, the time for filtering packets) is shorter than the execution time of the PPE (that is, the time for receiving packets and controlling the SPEs). In order to discuss this further, we next extract the packet filtering time from the total time.

5.2 Effective Filtering Time
Table 2 shows the effective execution times of only the packet filtering function, excluding the packet receiving time, for the programs C1, C6, S6-4, and SS6-4. Here, effective means taking into account the effects of parallelization and SIMD operations: the execution time of C1 equals the raw execution time, that is, the average time period (measured on the PPE) between starting and finishing the filter function; the execution times of C6 in the table are the values obtained by dividing the raw execution times (measured on the SPEs) by 6, because the filter function of C6 runs on 6 SPEs in parallel; and the execution times of S6-4 and SS6-4 are the values divided by 24 (= 6 × 4), because the filter functions of S6-4 and SS6-4 run on 6 SPEs in parallel and treat 4 packets simultaneously in a SIMD manner. From the table we can observe the following.

1. C6 is about two times faster than C1, because the parallelization degree is 6 but one SPE is three times or so slower than one PPE in the case of C programs4.
4 Because the instruction set of the SPE is specialized for a 16-byte memory boundary, SPE code conventionally built by the usual gcc is slower if there are many memory accesses not aligned to a 16-byte boundary.
Table 3. Comparison among several processors: the execution time per rule pattern

                             Itanium 2   Xeon        Cell
                             (0.9 GHz)   (2.8 GHz)   (3.2 GHz)
  Machine Cycle Time (MC)    3.9         4.7         1.3
  Real Time (nano sec)       4.3         1.7         0.4
2. S6-4 is about 10 times faster than C6 because of the SIMD acceleration and the hand-selected instructions.
3. SS6-4 is about two times faster than S6-4 because of software pipelining.

As a consequence, SS6-4 is about 40 times faster than C1.

Recall that, in Table 1, the total execution times of S6-4 and SS6-4 are constant at around 2.6 μsec and considerably longer than all the effective filtering times of S6-4 and SS6-4 in Table 2. This implies that the bottleneck of the total execution is the PPE, which receives input packets and manages the parallel execution of the SPEs.

5.3 Comparison to Other Processors
Referring from the authors’ existing works [11,12], we can compare the speed of our filter programs on Cell to those on Itanium 2 and Xeon processors. Table 3 summarizes the execution times on the three processors of matching a packet with each rules pattern. The fastest filter program SS6-4 on Cell is rather faster than those on Itanium 2 and Xeon with respect to both machine cycle and real time. This is not only because SS6-4 is parallelized but also because it is SIMDed and software-pipelined.
6 Experiments on a Real Network
In the authors’ previous works [9,10,11,12], the performances were investigated only on a virtual network because the behaviors of single-core processors are straightforward, and thus, it is expected that the similar performances can be archived on a real network. On the other hand, in this paper, we exploit the advantage of multi-core processors that is not so simple and may be affected by the time sequence of packet arriving. Therefore, the performance investigation on a real network is needed. In this section, we first measure the total execution times per packet and show that the experimental results on a real network is quit similar to the results on a virtual network (Table 1). Next, we discuss the latencies (the time periods between receiving a packet and obtaining the corresponding action to the packet) on a real network. The experiments show that the latencies of SS6-4 are longer than those of C1. So that we propose a solution to make the latencies of SS6-4 shorter.
Fig. 6. The assumed relation of the number N of sending packets to the number n of receiving packets (panel (a)) and the assumed relation of the time interval Δ of sending packets to the packet filtering ratio r (panel (b))
6.1 Total Execution Time per Packet
Consider the situation in which the packet generating computer sends N UDP packets per second with a constant packet interval (i.e., at a constant bit rate) and the packet filtering computer receives and filters n of the N packets successfully with a packet filter program P (cf. Figure 5). In such a situation, we assume that the relation of N to n can be illustrated as in the left-hand graph of Figure 6, because (1) the packet filtering computer can receive all the N packets (that is, n = N) if N is smaller than a certain threshold NP, and (2) the packet filtering computer can receive at most nP packets if N is bigger than NP. It is usually expected that NP = nP. The value of NP depends on the hardware performance of the NIC (network interface card), the behavior of the operating system, the speed of the packet filter program P executed on the packet filtering computer, and so on.

Next we define the packet filtering ratio r as r = n/N and the time interval Δ between sending each packet as Δ = 1/N. Thus, the assumed relation of Δ to r can be illustrated as in the right-hand graph of Figure 6 by replacing N and n in the left-hand graph with Δ and r. This graph shows that (1) r is 1 when Δ is longer than the threshold ΔP (= 1/NP), and (2) r is smaller than 1 if Δ is shorter than ΔP. Furthermore, r is exactly 1 if Δ = ΔP, and in this case the filter program P runs restlessly, hardly waiting to receive the next packet. We can therefore regard the interval ΔP as the average total execution time in which the packet filter program P reads a packet and filters it. As a consequence, we can obtain the average total execution time per packet ΔP from the value of nP, because ΔP = 1/NP = 1/nP.
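This estimation procedure can be summarized by the small C sketch below; the sending and receiving rates listed in it are placeholder values, except for the 109,893 pps point, which is the value measured for SS6-4 in the next paragraph.

#include <stdio.h>

/* Sweep the sending rate N, take nP as the largest rate at which every
 * packet is still received (r = n/N = 1), and report Delta_P = 1/nP as the
 * average total execution time per packet. */
int main(void) {
    const int    rates      = 4;
    const double sent[]     = {  50000.0, 100000.0, 109893.0, 120000.0 }; /* N (pps) */
    const double received[] = {  50000.0, 100000.0, 109893.0, 111000.0 }; /* n (pps) */

    double nP = 0.0;
    for (int i = 0; i < rates; i++) {
        double r = received[i] / sent[i];            /* packet filtering ratio */
        if (r >= 1.0 && sent[i] > nP) nP = sent[i];  /* still receiving all    */
    }
    if (nP > 0.0)
        printf("nP = %.0f pps, Delta_P = %.2f microseconds\n", nP, 1e6 / nP);
    return 0;
}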
Fig. 7. Example of the experimental relation of the number N of sending packets to the number n of receiving packets (panel (a), with nP = 109,893) and the experimental relation of the time interval Δ of sending packets to the packet filtering ratio r (panel (b), with ΔP = 9.1 μsec) (program: SS6-4, rule size: 1 line, packet size: 128 bytes)
The authors experimentally obtained the relation of N to n for each of the filter programs C1, C6, S6-4, and SS6-4 by varying N from 5,000 pps to 200,000 pps in steps of 1,000 pps. The left-hand graph of Figure 7 is an example of such a relation when the packet filter program is SS6-4, the rule size (the number of rule patterns) is one, and the UDP packets are 128 bytes long. The right-hand graph of Figure 7 is the corresponding relation of Δ to r. In this case, nP (and also NP) is 109,893 pps, so the average total execution time per packet can be calculated as 9.1 μsec (= 1/109,893).

In the graphs of Figure 7 there are anomalous performance losses when N ≥ 109,893 pps and Δ ≤ 9.1 μsec. This unexpected (irregular) behavior under over-loaded conditions might come from inefficient resource racing in the computer. However, since the over-loaded condition is not the main focus, we do not investigate it further in this paper. The authors observed similar losses in all the cases of our experiments, though they are not shown here due to page space limitations.

In the same way as on the virtual network, we test the four filter programs C1, C6, S6-4, and SS6-4 with the four filter rules of 1, 1000, 2000, and 3000 lines and the two UDP packet sizes of 128 bytes and 1428 bytes. Table 4 shows the experimental results. For easier comparison, Figure 8 illustrates them as bar graphs (middle and right) together with the bar graph (left) of the results on the virtual network. Similar tendencies are seen in these graphs regardless of virtual or real network and of packet size, which strongly suggests the validity of our method. In addition, the baseline execution time increases on a real network and with a larger packet size. This implies that the processing time to receive packets from a real network is more dominant, that is, the filtering processes of S6-4 and SS6-4 are fast enough in this experimental setting.
Table 4. Average total execution time per packet on a real network (including packet receiving time)

  packet size   rule size     C1     C6   S6-4  SS6-4
  128           1            9.2   14.4   10.0    9.8
                1000        25.2   13.6    9.6    9.6
                2000        42.8   17.5   10.2   10.4
                3000        60.1   25.7   10.2    9.9
  1428          1           11.6   14.6   12.7   12.8
                1000        26.7   14.5   12.8   12.8
                2000        44.3   16.1   12.8   12.8
                3000        61.4   22.8   12.8   12.8
Time unit is μsec.

Fig. 8. Comparison graphs of the execution time per packet on virtual and real networks (left: packet size 128 bytes on the virtual network; middle: packet size 128 bytes on the real network; right: packet size 1428 bytes on the real network)
6.2 Latency Time of Packet Filtering
Our last topic in this paper is to evaluate the latency time of processing each packet (the time period between receiving a packet and obtaining the corresponding action for the packet) under under-loaded conditions on a real network. In order to measure the latency time, for every packet we record both the time at which the filter program received the packet and the time at which the program determined the corresponding action, and we analyze them statistically after the program ends.

Graphs (a) to (d) in Figure 9 show examples of the relation of the time interval Δ of sending each 128-byte UDP packet to the latency time L of processing the packet by the four programs C1, C6, S6-4, and SS6-4, respectively, for the four rule sizes of 1, 1000, 2000, and 3000 lines. Note that the lines of graphs (a) and (b) are almost flat, independent of Δ, because the filter functions of C1 and C6 filter only one packet per function call and the execution times of the functions do not depend on the time interval Δ. In contrast, the lines of graphs (c) and (d) are proportional to Δ, because the filter functions of S6-4 and SS6-4 filter four packets simultaneously in a SIMD manner. To understand the reason, suppose that four successive packets are received at intervals of Δ. In this case, the first, second, third, and fourth packets must wait 3Δ, 2Δ, Δ, and 0, respectively,
Fig. 9. Examples of the relation of the time interval Δ of sending each packet to the latency time L of processing each packet ((a) C1, (b) C6, (c) S6-4, (d) SS6-4; curves for rule sizes 1, 1000, 2000, and 3000)
before the filter function gets invoked. Thus the average waiting time over the four packets is 1.5Δ (= (3 + 2 + 1 + 0)Δ/4), which equals the gradient of the lines in graphs (c) and (d).

In the following, to keep the discussion simple, we treat the average values over all Δ ∈ [20 μsec, 91 μsec] and all the rule sizes 1, 1000, 2000, and 3000. Table 5 shows these average latency times for the programs C1, C6, S6-4, and SS6-4. We can observe that the latency times of C6, S6-4, and SS6-4 are several times longer than that of C1. The latency of C6 is longer than that of C1 because of the difference in performance (recall Table 2, in which C6 is not six times faster than C1 in terms of effective filtering time). The latencies of S6-4 and SS6-4 are longer than that of C1 because S6-4 and SS6-4 are SIMDed, as we discussed above with reference to Figure 9.
Table 5. Average latency time of processing each packet on a real network

             C1     C6   S6-4  SS6-4
  latency  28.3   83.1   92.4   78.4

  Time unit is μsec.

Table 6. The relation of the time constant τ to the average latency time of processing each packet on a real network for the improved SS6-4

  τ        latency
  0.0       25.9
  0.025     25.6
  0.25      24.4
  2.5       26.7
  25.0      33.6
  250.0     72.3
  ∞         78.4

  Time unit is μsec.
6.3 Improving the SIMD Procedures
Let us focus here on the packet filter program SS6-4. To decrease the latency time of this program, we improve the SIMD procedures of SS6-4 as follows, by introducing a time constant τ and a dummy packet.

1. When the PPE receives a packet from the real network, it puts the packet into a packet buffer (this is exactly the same as the ordinary action of SS6-4).
2. When the PPE receives no packet within the time interval τ, it puts a dummy packet into the packet buffer (as if it had received the dummy packet).
3. If the packet buffer holds four packets and at least one of them is not a dummy, the PPE orders one of the resting SPEs to filter the packets in a SIMD manner.
4. When the PPE notices the four actions sent from an SPE, it performs the actions corresponding to real packets but ignores the actions corresponding to dummy packets.

Table 6 shows the latency times of the improved SS6-4 under the time constants5 τ = 0.0, 0.025, ..., 250.0 (μsec), where τ = 0.0 means that the filter function starts without waiting as soon as the program has received a real packet, that is, the program always executes SIMD operations with one real packet and three
5 Because the minimum time resolution of the time base register of the Cell in the PLAYSTATION 3 is 0.025 μsec, we can set the value of τ only to a multiple of 0.025 μsec.
Fig. 10. Example of the relation of the time interval Δ of sending each packet to the latency time L of processing each packet: case of the optimally improved SS6-4 (τ = 0.25 μsec; curves for rule sizes 1, 1000, 2000, and 3000)
dummy packets. The latency value of 78.4 μsec for τ = ∞ is taken from Table 5. From Table 6, we see that the latency is minimum at τ = 0.25 μsec and that this minimum latency is shorter than the latency of the program C1. In the same way as Figure 9, Figure 10 shows an example of the relation of the time interval Δ of sending each packet to the latency time L of processing the packet by the optimally improved SS6-4. Furthermore, the authors confirmed that the average execution times per packet of the improved program do not increase.
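The batching logic of steps 1 to 4 above can be modeled by the following self-contained C sketch; the arrival times are made-up example values, the value of τ is the optimum from Table 6, and all real DMA and mailbox handling is omitted. Dummy packets are only counted while a real packet is waiting, which is equivalent for dispatching purposes since a buffer containing only dummies is never dispatched.

#include <stdio.h>

#define BATCH 4

int main(void) {
    const double tau = 0.25;                  /* optimum from Table 6 (microseconds) */
    const double arrival[] = { 0.0, 3.0, 3.1, 3.2, 9.0 };  /* example trace          */
    const int n_arrivals = 5;

    int real_in_batch = 0, dummies_in_batch = 0;
    double last_event = 0.0;

    for (int i = 0; i < n_arrivals; i++) {
        /* step 2: fill the silent gap before this arrival with one dummy per tau */
        while (arrival[i] - last_event > tau && real_in_batch > 0) {
            last_event += tau;
            dummies_in_batch++;
            if (real_in_batch + dummies_in_batch == BATCH) {   /* step 3 */
                printf("t=%.2f: dispatch %d real + %d dummy\n",
                       last_event, real_in_batch, dummies_in_batch);
                real_in_batch = dummies_in_batch = 0;
            }
        }
        /* step 1: a real packet arrives and is buffered */
        last_event = arrival[i];
        real_in_batch++;
        if (real_in_batch + dummies_in_batch == BATCH) {       /* step 3 */
            printf("t=%.2f: dispatch %d real + %d dummy\n",
                   last_event, real_in_batch, dummies_in_batch);
            real_in_batch = dummies_in_batch = 0;
        }
    }
    printf("end of trace: %d real packet(s) still buffered\n", real_in_batch);
    return 0;
}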
7 Future Work
In the next step, we will verify the maximum filtering performance on a computer and NIC (network interface card) with higher data throughput, because in our experiments the bottleneck is the receiving process rather than the filtering process, as indicated in Table 4. To do so, we need a Cell-based server machine instead of a PS3. In addition, it is important to investigate the filtering performance for more realistic incoming packet flows with bursty arrivals, because we test only constant bit rate flows. Furthermore, we will try to apply the same techniques as in this research to Intel's widely deployed multi-core processors. As explained in Section 1, we used multi-core processor techniques to improve the type A optimization. As a future challenge, we will try to exploit these techniques to improve the type B optimization by separating and distributing a large filter rule.
References 1. Appel, A.W.: Modern Compiler Implementation in C. Cambridge University Press, Cambridge (1997) 2. Cisco: Configuring IP Access Lists, Document ID: 23602, http://www.cisco.com/warp/public/707/confaccesslists.html
212
Y. Yamashita and M. Tsuru
3. IBM: IBM Cell Broadband Engine Software Development Kit (2006), http://www.alphaworks.ibm.com/tech/cellsw 4. IBM: Cell Broadband Engine Programming Handbook, Cell Broadband Engine resource center (2007), http://www.ibm.com/developerworks/power/cell/documents.html 5. IBM: SPU Assembly Language Specification, Cell Broadband Engine resource center (2007), http://www.ibm.com/developerworks/power/cell/documents.html 6. Jacobson, V., et al.: tcpdump(1), bpf..., Unix Manual Page (1990) 7. Singh, S., Baboescu, F., Varghese, G., Wang, J.: Packet Classification Using Multidimensional Cutting. In: ACM SIGCOMM’03 (2003) 8. Warter, N.J., Haab, G.E., Bockhaus, J.W.: Enhanced Modulo Scheduling for Loops with Conditional Branches. IEEE MICRO-25 (1992) 9. Yamashita, Y., Tsuru, M.: Code Optimization for Packet Filters. In: SAINT 2007, Workshop on Internet Measurement Technology and its Applications to Building Next Generation Internet (2007) 10. Yamashita, Y., Tsuru, M.: Software Pipelining for Packet Filters. In: Perrott, R., Chapman, B.M., Subhlok, J., de Mello, R.F., Yang, L.T. (eds.) HPCC 2007. LNCS, vol. 4782, pp. 446–459. Springer, Heidelberg (2007) 11. Yamashita, Y., Tsuru, M.: Implementations of Fast Packet Filters and their Evaluations. IPSJ Transactions on Advanced Computing System (TACS) 1(1), 1–11 (2008) (in Japanese) 12. Yamashita, Y., Tsuru, M.: Implementing Fast Packet Filters by Software Pipelining on x86 Processors. In: Dou, Y., Gruber, R., Joller, J.M. (eds.) APPT 2009. LNCS, vol. 5737, pp. 420–435. Springer, Heidelberg (2009) 13. Yusuf, S., Luk, W.: Bitwise Optimised CAM for Network Intrusion Detection Systems. In: Int. Conf. Field Programmable Logic and Applications (2005)
On the Algebraic Expression of the AES S-Box Like S-Boxes M. Tolga Sakallı1 , Bora Aslan2 , Ercan Bulu¸s3 , Andac S ¸ ahin Mesut1 , 1 glu1 Fatma B¨ uy¨ uksara¸co˘glu , and Osman Karaahmeto˘ 1
2
Trakya University, Computer Engineering Dept., Edirne, Turkey Kırklareli University, Computer Programming Dept., L¨ uleburgaz-Kırklareli, Turkey 3 Namık Kemal University, Computer Engineering Dept., C ¸ orlu-Tekirda˘ g, Turkey {tolga,andacs,fbuyuksaracoglu}@trakya.edu.tr, [email protected], [email protected], [email protected]
Abstract. In the literature, there are several proposed block ciphers like AES, Square, Shark and Hierocrypt which use S-boxes that are based on inversion mapping over a finite field. Because of the simple algebraic structure of S-boxes generated in this way, these ciphers usually use a bitwise affine transformation after the inversion mapping. In some ciphers like Camellia, an additional affine transformation is used before the input of the S-box as well. In this paper, we study algebraic expressions of S-boxes based on power mappings with the aid of finite field theory and show that the number of terms in the algebraic expression of an S-box based on power mappings changes according to the place an affine transformation is added. Moreover, a new method is presented to resolve the algebraic expression of the AES S-box like S-boxes according to the given three probable cases. Keywords: S-boxes, Power Mappings, Algebraic Expression, Finite Fields.
1
Introduction
Boolean functions and vectorial boolean functions (S-boxes) are vital elements in the design of symmetric ciphers. S-boxes are only nonlinear and the most important component of a block cipher. Bijective S-boxes play an important role in designing secure block ciphers. To date, the techniques for the construction of S-boxes have included pseudo-random generation, finite field inversion, power mappings and heuristic techniques. From these techniques, the use of finite field inversion operation in the construction of an S-box yields linear approximation and difference distribution tables in which the entries are close to uniform. Therefore, this provides security against differential and linear attacks. Moreover, because of the fact that S-boxes generated using finite field inversion give good results from the point of cryptographic properties which are LAT (Linear Approximation Table), DDT (Difference Distribution Table), F. Zavoral et al. (Eds.): NDT 2010, Part I, CCIS 87, pp. 213–227, 2010. c Springer-Verlag Berlin Heidelberg 2010
214
M.T. Sakallı et al.
completeness, avalanche, strict avalanche, bit independence (Example the AES (Advanced Encryption Standard) S-box) [1] [2], these algebraic S-boxes have received significant attention from cryptographers. Rijndael designed with the aim of being resistant against differential [3] and linear cryptanalysis [4] is an iterated block cipher [1] and was adopted as the Advanced Encryption Standard (AES) in 2001. Because of the two important cryptographic criteria mentioned, the designers decided to choose an S-box based on inversion mapping which provides uniform distribution from the point of both DDT and LAT [5]. Rijndael S-box was chosen in terms of Nyberg’s suggestion [6] and is based on the inversion mapping over GF (2n ) (with n = 8) f (x) = x−1 , x ∈ GF (28 ), f (0) = 0. As shown above, this mapping has a simple algebraic expression that may enable some attacks such as the interpolation attacks [7] [8]. In order to overcome this problem, this mapping was modified in such a way that does not modify its resistance towards both linear and differential cryptanalysis while overall S-box description becomes complex in GF (28 ). This was achieved by adding a bitwise affine transformation after the inverse mapping [9] [10]. In some ciphers, on the other hand, an additional transformation is used before the input of the S-box as well. So, in order to make S-box description complex, we have three different choices according to the place affine transformation is added. These are: – to add a bitwise affine transformation after the output of an S-box (case1), – to add a bitwise affine transformation before the input of an S-box (case 2), – to add a bitwise affine transformation both after the output of an S-box and before the input of an S-box (case3). Some ciphers in the literature like Shark [11], Square [12], AES use an affine transformation after the output of the inversion mapping. In Camelia [13], an additional affine transformation is used before the input of the S-box as well. On the other hand, in [17], it is stated that the complexity of cryptanalytic attacks like interpolation attacks depends on the degree of the polynomial approximation or on the number of terms in the polynomial approximation expression. In addition, it is expressed that the cryptanalyst has to evaluate the algebraic description of the S-boxes or the round function using the Lagrange interpolation formula. However, algebraic expression of the AES S-box using Lagrange interpolation over GF (28 ), where GF (28 ) is defined by irreducible polynomial x8 + x4 + x3 + x + 1, is given in [9] as S(x) = 63 + 05 x254 + 09 x253 + f 9 x251 + 25 x247 + f 4 x239 + 01 x223 + b5 x191 + 8f x127 . Although this expression has enough security from the point of algebraic degree, the number of terms which is 9 in the algebraic expression of the AES S-box
On the Algebraic Expression of the AES S-Box Like S-Boxes
215
is not so good as the algebraic degree. Lui Jing-mei etc., in [14], presented a new scheme to resolve the algebraic expression of the AES S-box and showed the improvement of the AES S-box which increases the number of terms in the algebraic expression from 9 to 255. In fact, their improvement fits the second case with a small difference mentioned before. The difference in their improvement was to add affine transformation matrix before the input of the S-box and XOR the affine constant after the inversion mapping. In our study, we propose a new method to determine the algebraic expression of power mapping based S-boxes designed by using case 1, case 2 and case3. This method does not only show the reason of being so sparse of the algebraic expression of the AES S-box, but it is also very effective method for computing the algebraic expression of an S-box designed by using case1, case2 or case3. After giving theoretical preliminaries for the description of proposed method and for determining the algebraic expression of an S-box based on power mappings, we give possible maximum number of terms in the algebraic expression of power mapping based S-boxes (with n = 8) designed by using the defined cases. Moreover, we give an example of S-box, shown in Appendix, designed by using case 3 and power mapping x254 . Our method can be considered as an alternative method of resolving algebraic expression of the AES S-box like S-boxes versus Lagrange interpolation which is rough and slow computation method when compared with the proposed method.
2
Mathematical Background and Definitions
In this section, we present a background for the algebraic preliminaries required throughout this paper. The reader is referred to [15] [16] for the theory of finite fields. In this paper, we will use the hexadecimal notation to denote the field elements when needed. Therefore, the finite field element, bn−1 αn−1 + bn−2 αn−2 + · · · + b0 , bi ∈ 0, 1 where α denote the primitive element used to construct the finite field is represented by the hexadecimal number consisting of bits (bn−1 bn−2 · · · b0 ). Let F = GF (2), K = GF (2n ) and λ ∈ K. Then the trace of λ relative to subfield F is 2 n−1 T rFK (λ) = λ + λ2 + λ2 + · · · + λ2 . Let {α0 · · · αn−1 } be any basis of GF (2n ) over GF (2) and let {β0 · · · βn−1 } be the corresponding dual basis. Let f (x0 , · · · , xn−1 ) = (f0 (x), · · · , fn−1 (x)) be a permutation over GF (2)n , then g(x) =
n−1
αi fi (x0 , · · · , xn−1 )
i=0
is also a bijective mapping over GF (2n ). Each output coordinate of f (x) can be expressed as fi (x) = T r(g(x)βi )
216
M.T. Sakallı et al.
n−1 where x = i=0 xi αi . Moreover, the dual basis {β0 · · · βn−1 } is given by McEliece [15] as n−1 bkj αk βj = k=0 −1
where B = [bij ] = A with the elements bij for 0 ≤ i, j ≤ n − 1 and A is n × n matrix over GF (p) (GF (2) for our case) with the elements aij given by aij = T r(αi αj ), 0 ≤ i, j ≤ n − 1. Thus, we have
⎡
⎤ T r(α0 α0 ) T r(α0 α1 ) · · · T r(α0 αn−1 ) ⎢ T r(α1 α0 ) T r(α1 α1 ) · · · T r(α1 αn−1 ) ⎥ ⎢ ⎥ A=⎢ ⎥. .. ⎣ ⎦ . T r(αn−1 α0 ) T r(αn−1 α1 ) · · · T r(αn−1 αn−1 )
Definition 1. A polynomial having the special form with coefficients βi from GF (2n ) is called a linearized polynomial over GF (2n ). L(x) =
t
i
βi x2
(1)
i=0
Definition 2. A cyclotomic coset mod N that contains an integer s is the set
Cs = s, sq, · · · , sq m−1 (modN ) (2) where m is the smallest positive integer such that sq m ≡ s(modN ). Lemma 1. Let A be a linear mapping over GF (2n ), then A(x), x ∈ GF (2n ) can be expressed in terms of a linearized polynomial over GF (2n ). I.e., we can express A(x) as A(x) =
n−1
i
βi x2
(3)
i=0
Example 1. Let n = 8 and let GF (28 ) be defined by the irreducible polynomial p(x) = x8 +x4 +x3 +x+1 as in the AES specifications. In addition, let β = α+1, where α is a root of p(x) and β is primitive element in the defined GF (28 ). Then, using the mathematical background given in Section 2, we can give the coordinate functions of the input bits as x0 = T r(β 228 x), x4 = T r(β 73 x), x1 = T r(β 204 x), x5 = T r(β 48 x), x2 = T r(β 179 x), x6 = T r(β 23 x), x3 = T r(β 2 x),
x7 = T r(β 253 x).
On the Algebraic Expression of the AES S-Box Like S-Boxes
217
Hence the coordinate functions of the output of affine transformation of the AES S-box can be expressed as f0 = T r(β 166 x) + 1, f4 = T r(β 72 x), f1 = T r(β 53 x) + 1,
f5 = T r(β 76 x) + 1,
f2 = T r(β 36 x),
f6 = T r(β 51 x) + 1,
f3 = T r(β 11 x),
f7 = T r(β 26 x).
Using the obtained output coordinates, the algebraic expression of the affine transformation of the AES S-box can be written as A(x) =
n−1
fi αi = f0 + αf1 + α2 f2 + α3 f3 + · · · + α7 f7 .
i=0
Since we can write α, α2 , α3 , α4 , α5 , α6 , α7 as β 25 , β 50 , β 75 , β 100 , β 125 , β 150 , β 175 respectively, A(x), the output of affine transformation can be rewritten as A(x) = (β 166 x + (β 166 )2 x2 + (β 166 )4 x4 + · · · + (β 166 )128 x128 ) +β 25 (β 53 x + (β 53 )2 x2 + (β 53 )4 x4 + · · · + (β 53 )128 x128 ) +β 50 (β 36 x + (β 36 )2 x2 + (β 36 )4 x4 + · · · + (β 36 )128 x128 ) +β 75 (β 11 x + (β 11 )2 x2 + (β 11 )4 x4 + · · · + (β 11 )128 x128 ) +β 100 (β 72 x + (β 72 )2 x2 + (β 72 )4 x4 + · · · + (β 72 )128 x128 ) +β 125 (β 76 x + (β 76 )2 x2 + (β 76 )4 x4 + · · · + (β 76 )128 x128 ) +β 150 (β 51 x + (β 51 )2 x2 + (β 51 )4 x4 + · · · + (β 51 )128 x128 ) +β 175 (β 26 x + (β 26 )2 x2 + (β 26 )4 x4 + · · · + (β 26 )128 x128 ) + 63 . If we want to find the coefficient, A0 , for x term in the A(x), we can express it as the sum of β 166 , β (53+25)mod255 , β (50+36)mod255 , β (75+11)mod255 , β (100+72)mod255 , β (125+76)mod255 , β (150+51)mod255 and β (175+26)mod255 . By noting that for x ∈ n−1 GF (2n ), xa = xamod 2 , then A0 can be obtained as
218
M.T. Sakallı et al.
A0 = β 166 + β 78 + β 86 + β 86 + β 172 + β 201 + β 201 + β 201 , A0 = 2A + 78 + DC + DC + 7A + 2D + 2D + 2D , A0 = 05 . After calculating other coefficients for other terms, the resultant algebraic expression of the affine transformation of the AES S-box is A(x) = 63 + 05 x + 09 x2 + f 9 x4 + 25 x8 + f 4 x16 + 01 x32 + b5 x64 + 8f x128 . Note that + operation used in the equations above represents modulo 2 addition or XOR operation between finite field elements and the hexadecimal values in the vertical quote marks represent the field elements in GF (28 ).
3
Algebraic Expression of Power Mapping Based S-Boxes
In this section, we present some definitions and theorems to express how algebraic expression of an S-box changes according to the defined cases in Section 1 and explain the reason why only nine terms are involved in the expression of the AES S-box. Moreover, we give the definitions of two important criteria, DDT (Difference Distribution Table) also called XOR Table and LAT (Linear Approximation Table) for S-boxes. For 8-bit to 8-bit S-boxes based on power mappings, we give maximum LAT and DDT values in Table 1. Definition 3. Let S: GF (2n ) → GF (2n ) be an S-box having n-bit inputs and n-bit outputs. For any given a, b ∈ GF (2n ), the DDT can be constructed using XOR(a, b) = # {x ∈ GF (2n ) : S(x) + S(x + a) = b}
(4)
where a, b are called the input difference and output difference respectively. Also, ∇f = max {XOR(a, b) : a, b ∈ GF (2n ), a = 0} is called differential uniformity and we say that an S-box is nonlinear if ∇f smaller than 2n . Moreover, the DDT of an S-box gives information about the security of the block cipher against differential cryptanalysis. If the differential uniformity is large, this is an indication of an insecure block cipher from the point of differential cryptanalysis. Definition 4. Let S : GF (2n ) → GF (2n ) be an S-box having n-bit inputs and n-bit outputs. For any given Γa , Γb ∈ GF (2n ), the LAT can be constructed using LAT (Γa , Γb ) = # {x ∈ GF (2n ) : Γa • x = Γb • S(x)} − 2n−1
(5)
where x • y denotes the parity (0 or 1) of bitwise product of x and y. Also, Γa , Γb are called input mask and output mask respectively. LAT is important tool to
On the Algebraic Expression of the AES S-Box Like S-Boxes
219
measure the security of the S-boxes against linear cryptanalysis. Large absolute LAT values are not desired since they indicate high probability of linear relations between the input and the output. Definition 5. Nonlinearity measure of an n×n S-box related with the maximum entry of LAT value can be given as N LMS = 2n−1 − max |LATS (Γa , Γb )|
(6)
Definition 6. Let f(x) = xd be a function. If ∇f = 2 for this function, then this function is called APN (Almost Perfect Nonlinear) function. Definition 7. We say that two functions f and g are equivalent if the lists of values XOR(a, b) of these functions with a, b ∈ GF (pn ) are equal [18]. n
Proposition 1. Inversion mapping, f (x) = x2 (2n ) for n even is differentially 4 uniform [6].
−2
with x ∈ GF (2n ) over GF
Proposition 2. f (x) = xd , x ∈ GF (2n ) over GF (2n ) for n even, where d = 2n − 2i − 1 for i = 1, 2, .., n − 1, is differentially 4 uniform. n
Proof. Since d = x2 −2 with x ∈ GF (2n ) for inversion mapping over GF (2n ), n 2i mod(2n −1) the function x2 −2 , according to the proposition 1, is differentially 4 uniform. Therefore, 2i mod(2n −1) n 2i mod(2n −1) n = x2 −1−1 x2 −2 n i mod(2 −1) = x−2
(7)
means that f (x) = xd with x ∈ GF (2n ) where d = 2n −2i −1 for i = 1, 2, . . . , n−1 is differentially 4 uniform. Theorem 1. Let F (x) = xd be a function of GF (2n ) which corresponds to the (n) Boolean mapping f (x1 , . . . , xn ) = (f1 (x), . . . , fn (x)) over F2 . Then the function G(x) corresponding to the Boolean mapping obtained by applying a linear transformation to the output coordinates of f (x1 , . . . , xn ) can be expressed as n / Cd and Cd is the cyclotomic coset (mod G(x) = 2i=0−1 bi xi , where bi = 0 ∀i ∈ 2n − 1) [17]. Theorem 2. Let F (x) = xd be a function of GF (2n ) which corresponds to the (n) Boolean mapping f (x1 , . . . , xn ) = (f1 (x), . . . , fn (x)) over F2 . Let the function G(x) corresponding to the Boolean mapping obtained by applying a linear transformation to the input coordinates of x1 , . . . , xn while fixing f (x1 , . . . , xn ). Then 2n −1 G(x) can be expressed as G(x) = i=0 bi xi , bi = 0 for wt(i) > wt(d), where wt(d) denotes the Hamming weight of d [17].
220
M.T. Sakallı et al.
Note that, in Theorem 1 and Theorem 2, bi values are field elements in GF (2n ). Also, the reader is referred to [17] for the proofs of the two theorems. Using stated definitions and theorems, we can classify power functions according to the DDT and LAT distributions and this classification can be made according to the cyclotomic coset of d for the power function f (x) = xd . The classification for the power functions in GF (28 ) is shown in Table 1. Theorem 1 illustrates the effect of applying a linear transformation, which defines case 1, on the terms of algebraic expression and it explains that all elements of a class, which is cylotomic coset of d, will appear as the terms in the algebraic expression where the algebraic degree will be the biggest value among the class elements. On the other hand, Theorem 2 defines case 2 and illustrates the effect of applying a linear transformation to the input coordinates of a power function or adding affine transformation before the input of a power function. It explains that Hamming weight of d, wt(d), indicates the number of terms in the algebraic expression and the terms having the powers with from wt(1) to wt(d) will appear in the algebraic expression. Moreover, as in case 1, the algebraic degree will be the biggest value among the class elements. So, a formula for the calculation of the number of terms in the algebraic expression for an S-box designed by using case 2 can be given as 1 + C(n, 1) + C(n, 2) + . . . + C(n, r)
(8)
where r is the Hamming weight of the power function.
case 1
case 2
case 3
xd
LA1
LA2
LA1
xd
xd
LA1
Fig. 1. Description of possible cases in designing AES S-box like S-boxes
Let LA1 (x) = a x + b x2 + . . . + h x128 be algebraic expression of the affine transformation shown in Figure 1 and let a , b ,. . ., h be finite field elements in GF (28 ). Then, algebraic expression of an S-box designed by case 2 is S(x) = (LA1 (x))d .
On the Algebraic Expression of the AES S-Box Like S-Boxes
221
On the other hand, algebraic expression of an S-box designed by case 3 is 0
S(x) = a (LA2 (x))2
d
1
+ b (LA2 (x))2
d
n−1
+ . . . + h (LA2 (x))2
d
.
With n=8, the equation becomes S(x) = a (LA2 (x))d + b (LA2 (x))2d + . . . + h (LA2 (x))128d where LA2 (x) is algebraic expression of the affine transformation LA2 shown in Figure 1. If d is chosen from any class elements from 127 given in Table 1, the expressions LA2 (x)127 , LA2 (x)254 ,. . . , LA2 (x)191 separately bring 255 terms and the resultant algebraic expression includes 255 terms with the algebraic degree 254 as it is given in the Equation (8) which can also be used to find the number of terms in the algebraic expression of an S-box for other power functions designed with case 2 or case 3. In addition, the powers of each expression have the same Hamming weight. This because they are in the same cyclotomic coset. For example, if x7 power mapping (or any power from class 7) is used with case 2 or case 3 to design an S-box, then the algebraic expression of the S-box will include 93 terms with the algebraic degree 224. Moreover, the computation needs more time in case 3 than in case 2. The method for resolving an S-box designed by case 1 can be given as – compute LA1 (x), the algebraic expression of the affine transformation, for the affine transformation LA1 using the given theory, – put xd instead of x in LA1 (x). The method for resolving an S-box designed by case 2 can be given as – compute LA1 (x), the algebraic expression of the affine transformation, for the affine transformation LA1 using the given theory, – put LA1 (x) instead of x in xd , – compute the algebraic expression of the S-box using the expression (LA1 (x))d . The method for resolving an S-box designed by case 3 can be given as – compute LA1 (x), LA2 (x) the algebraic expression of the affine transformations, for the affine transformations LA1 and LA2 using the given theory, – put LA2 (x) instead of x in LA1 (xd ), – compute the algebraic expression of the S-box using the expression LA1 (LA2 (x)d ). Example 2. Let n = 8 and let GF (28 ) be defined by the irreducible polynomial p(x) = x8 + x4 + x3 + x + 1 as in the AES specifications. In addition, let LA1 (x) = 63 + 05 x + 09 x2 + f 9 x4 + 25 x8 + f 4 x16 + 01 x32 + b5 x64 + 8f x128 , LA2 (x) = 36 + 52 x + 77 x2 + 13 x4 + e0 x8 + f e x16 + 9e x32 + 96 x64 + 27 x128 and let x7 power function be chosen together with LA1 , LA2 to design an S-box.
222
M.T. Sakallı et al.
Table 1. Classification of power functions in GF (28 ) according to the maximum DDT value ∇S , N LMS and maximum number of terms in the algebraic expression(MNOT) Class(d)
Elements of Classes
3 9 39 5 21 95 111 127 7 25 37 63 11 29 13 55 59 15 45 17 19 23 31 47 53 61 91 119 27 43 87 51 85 1
(3 6 12 24 48 96 192 129) (9 18 36 72 144 33 66 132) (39 78 156 57 114 228 201 147) (5 10 20 40 80 160 65 130) (21 42 84 168 81 162 69 138) (95 190 125 150 245 235 215 175) (111 222 189 123 246 237 219 183) (127 254 253 251 247 239 223 191) (7 14 28 56 112 224 193 131) (25 50 100 200 145 35 70 140) (37 74 148 41 82 164 73 146) (63 126 252 249 243 231 207 159) (11 22 44 88 176 97 194 133) (29 58 116 232 209 163 71 142) (13 26 52 104 208 161 67 134) (55 110 220 185 115 230 205 155) (59 118 236 217 179 103 206 157) (15 30 60 120 240 225 195 135) (45 90 180 105 210 165 75 150) (17 34 68 136) (19 38 76 152 49 98 196 137) (23 46 92 184 113 226 197 139) (31 62 124 248 241 227 199 143) (47 94 188 121 242 229 203 151) (53 106 212 169 83 166 77 154) (61 122 244 223 211 167 79 158) (91 182 109 218 181 107 214 173) (119 238 221 187) (27 54 108 216 177 99 198 141) (43 86 172 89 178 101 202 149) (87 174 93 186 117 234 213 171) (51 102 204 153) (85 170) (1 2 4 8 16 32 64 128)
∇S N LMS 2 2 2 4 4 4 4 4 6 6 6 6 10 10 12 12 12 14 14 16 16 16 16 16 16 16 16 22 26 30 30 50 84 256
112 112 112 96 112 112 112 112 96 96 96 104 96 96 96 96 96 116 116 120 104 96 112 104 96 96 112 112 80 80 80 116 118 0
MNOT case1 case2 case3 9 37 37 9 37 37 9 163 163 9 37 37 9 93 93 9 247 247 9 247 247 9 255 255 9 93 93 9 93 93 9 93 93 9 247 247 9 93 93 9 163 163 9 93 93 9 219 219 9 219 219 9 163 163 9 163 163 5 37 37 9 93 93 9 163 163 9 219 219 9 219 219 9 163 163 9 219 219 9 219 219 5 247 247 9 163 163 9 163 163 9 219 219 5 163 163 3 163 163 9 9 9
For case 2, the algebraic expression of the S-box can be resolved using S(x) = (LA1 (x))7 , S(x) = ( 63 + 05 x + 09 x2 + f 9 x4 + 25 x8 + f 4 x16 + 01 x32 + b5 x64(9) + 8f x128 )7 .
On the Algebraic Expression of the AES S-Box Like S-Boxes
223
For case 3, the algebraic expression of the S-box can be resolved using S(x) = 63 + 05 (LA2 (x))7 + 09 (LA2 (x))14 + f 9 (LA2 (x))28 + 25 (LA2 (x))56 + f 4 (LA2 (x))112 + 01 (LA2 (x))224 + b5 (LA2 (x))193 + 8f (LA2 (x))131 , S(x) = 05 ( 36 + 52 x + 77 x2 + · · · + 96 x64 + 27 x128 )7 + 09 ( 36 + 52 x + 77 x2 + · · · + 96 x64 + 27 x128 )14 + f 9 ( 36 + 52 x + 77 x2 + · · · + 96 x64 + 27 x128 )28 .. . + 8f ( 36 + 52 x + 77 x2 + · · · + 96 x64 + 27 x128 )131 + 63 .
4
On the Complexity of the Proposed Method
In Section 3, we have given technical background for the new method to determine the algebraic expression of power mapping based S-boxes designed by using the defined cases. In [17] Lagrange interpolation is defined for given 2n elements as f (x) =
n i=1
yi
i≤j≤n,j=i
x − xj . xi − xj
where x1 , . . . , xn , y1 , . . . , yn ∈ R and R is a field. In fact, for the finite field GF (28 ), Lagrange interpolation formula can be implemented with two different algoritms where we ignore the addition operations (XOR operations). First one is related with applying the formula directly and it needs 254 × 254 polynomial multiplications for 8-bit to 8-bit S-boxes thus for given 512 elements and we ignore the finite field multiplications here. On the other hand, second one takes the benefit of finite field theory and needs 255 polynomial divisions where the numerator is (x256 + x) and the denominator is (x + xi ). Again, we ignore the finite field multiplications. If the proposed method is considered then we can take the benefit of the fast exponentiation (square and multiply) algorithm. For example, If we are supposed to compute (LA(x))254 which means to calculate (LA(x))(11111110) then we can compute (LA(x))2 , (LA(x))4 , (LA(x))8 , (LA(x))16 , (LA(x))32 , (LA(x))64 , (LA(x))128 by taking the square of LA(x) subsequently and obtain the resultant algebraic expression by multiplying previous obtained expressions. Thus, this decreases the number of polynomial multiplications from 64516 to 13 and this scheme denotes the case 2 for the power function x254 . For case 3, the number of polynomial multiplications can be calculated using the same algorithm and it can be found that 18 polynomial multiplications are enough to determine algebraic expression of an S-box designed by using case 3. Moreover, for any power mapping based S-box with the defined cases, the fast exponentiation algorithm can be used to determine the algebraic expression of an S-box.
224
5
M.T. Sakallı et al.
Conclusions
In this paper, we have studied how to obtain algebraic expression of the AES S-box like S-boxes and explained the reason why the algebraic expression of the AES S-box designed by using case 1 is so sparse from the point of the number of terms in the algebraic expression. Moreover, we give a method according to the given cases for resolving of the algebraic expression of these S-boxes. This method does not only show theoretically the reason of being so sparce of the AES S-box algebraic expression, but it is also very fast resolving method when comparing with Lagrange interpolation. For example, the S-box designed by case 3 and x254 power mapping, shown in Appendix A, includes 255 terms with algebraic degre 254. It is resolved by using our method in approximately 40 miliseconds with a computer having 2 GHz processor. If any element of the class 127 is used in the S-box design according to the case 1, 2 and 3, then the number of terms of the algebraic expression will be 9, 255 and 255 respectively with algebraic degree invariable. On the other hand, if any element of the class 7 is used in the S-box design according to the case 1, 2 and 3, then the number of terms of the algebraic expression of these S-boxes will be 9, 93 and 93 respectively. Moreover, the algebraic degree of the algebraic expression will be 224. A formula for the calculation of the number of terms in the algebraic expression for an S-box designed by using case 2 and 3 can be given as 1 + C(n, 1) + C(n, 2) + . . . + C(n, r) where r is the Hamming weight of the power function. In addition, the algebraic degree of the algebraic expression of the S-box will be the biggest value among the used class elements. If case 1 is concerned in the design of an S-box, then all elements of the class used in the S-box design will appear in the algebraic expression and again, algebraic degree will be the biggest value among the used class elements. So, an improvement of the AES S-box like S-boxes may be considered from the point of the number of terms in the algebraic expression by using case 2 or case 3.
References 1. Federal Information Processing Standards Publication (FIPS 197), Advanced Encryption Standard (AES) (November 26, 2001) 2. Kavut, S., Yucel, M.D.: On Some Cryptographic Properties of Rijndael. In: Gorodetski, V.I., Skormin, V.A., Popyack, L.J. (eds.) MMM-ACNS 2001. LNCS, vol. 2052, pp. 300–311. Springer, Heidelberg (2001) 3. Biham, E., Shamir, A.: Differential Cryptanalysis of DES-like cryptosystems. J. Cryptology (1991) 4. Matsui, M.: Linear cryptanalysis method for DES Cipher. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 386–397. Springer, Heidelberg (1994) 5. Chun, S., Kim, S., Lee, S., Sung, S.H., Yoon, S.: Differential and Linear Cryptanalysis for 2-round SPNs. In: Information Processing Letters. Elsevier, Amsterdam (2002)
On the Algebraic Expression of the AES S-Box Like S-Boxes
225
6. Nyberg, K.: Differentially uniform mappings for cryptography. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 55–64. Springer, Heidelberg (1994) 7. Jakobsen, T.: Cryptanalysis of block ciphers with probabilistic nonlinear relations of low degree. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 213–222. Springer, Heidelberg (1998) 8. Jakobsen, T., Knudsen, L.: The interpolation attack on block ciphers. In: Biham, E. (ed.) FSE 1997. LNCS, vol. 1267, pp. 28–40. Springer, Heidelberg (1997) 9. Youssef, A.M., Tavares, S.E., Gong, G.: On Some probabilistic approximations for AES-like s-boxes. Discrete Mathematics 306(16), 2016–2020 (2006) 10. Youssef, A.M., Tavares, S.E.: Affine equivalence in the AES round function. Discrete Applied Mathematics 148(2), 161–170 (2005) 11. Rijmen, V., Daemen, J., Preneel, B., Bosselaers, A., De Win, E.: The Cipher Shark. In: Gollmann, D. (ed.) FSE 1996. LNCS, vol. 1039, pp. 99–111. Springer, Heidelberg (1996) 12. Daemen, J., Knudsen, L.R., Rijmen, V.: The block cipher Square. In: Biham, E. (ed.) FSE 1997. LNCS, vol. 1267, pp. 149–165. Springer, Heidelberg (1997) 13. Aoki, K., Ichikawa, T., Kanda, M., Matsui, M., Moriai, S., Nakajima, J., Tokita, T.: Camellia: a 128-bit block cipher suitable for multiple platforms-design and analysis. In: Stinson, D.R., Tavares, S. (eds.) SAC 2000. LNCS, vol. 2012, pp. 39–56. Springer, Heidelberg (2001) 14. Jing-mei, L., Bao-dian, W., Xiang-guo, C., Xin-mei, W.: Cryptanalysis of Rijndael S-box and improvement. Applied Mathematics and Computation (2005) 15. McEliece, R.J.: Finite Fields for Computer Scientists and Engineers. Kluwer Academic Publishers, Dordrecht (1987) 16. Lidl, R., Niederreiter, H.: Introduction to finite fields and their applications, Revised edn. (1994) 17. Youssef, A.M., Gong, G.: On the Interpolation Attacks on Block Ciphers. In: Schneier, B. (ed.) FSE 2000. LNCS, vol. 1978, pp. 109–120. Springer, Heidelberg (2001) 18. Aslan, B., Sakallı, M.T., Bulu¸s, E.: Classifying 8-bit to 8-bit S-boxes Based on Power Mappings from the Point of DDT and LAT Distributions. In: von zur Gathen, J., Ima˜ na, J.L., Ko¸c, C ¸ .K. (eds.) WAIFI 2008. LNCS, vol. 5130, pp. 123–133. Springer, Heidelberg (2008)
Appendix: A The S-Box based on x254 power mapping in GF (28 ), where GF (28 ) is defined by the irreducible polynomial x8 + x4 + x3 + x + 1 and the two affine transformations used to construct the S-box are given below. The S-box has been obtained applying affine transformation LA2 to the input bits and applying LA1 after the power mapping x254 as given in case 3.
226
M.T. Sakallı et al.
0 1 2 3 4 5 6 7 8 9 A B C D E F 0 C3 18 27 80 15 34 FD F7 2B FE 6B 77 F0 CA D4 72 1 1A 1B E3 D6 CF 6A D1 B1 21 10 9D 40 85 D0 F9 9F 2 66 48 C1 57 8A E8 78 B4 E9 CE D9 98 68 8C 99 BB 3 0A 49 95 AC 08 6C C8 4E 14 DE 2A 4F 17 CD A7 19 4 89 E6 B0 0F 28 1E E1 94 74 BD 1C 2E F6 3E 61 9E 5 13 97 64 3D 0B EE 60 88 F4 7A 8D 6D 24 32 C2 79 6 C9 59 9C AF AB 01 63 C5 E5 D8 36 26 05 C7 07 75 7 AA 4D 50 7F F3 B6 51 F5 BE 4C 20 ED 5A 83 52 84 8 E7 A9 AE 56 91 62 3A 06 C4 73 44 0C 22 DC B8 5E 9 BA C6 8B DD 86 B9 B5 03 41 16 42 A1 69 11 87 55 A 53 5B 58 CB 29 B3 2C 6E 45 A8 33 EF 92 8F DA FF B B7 CC 31 A5 EB E2 23 96 AD C0 47 82 F2 7B 67 D7 C A3 38 D2 BC 3C 02 FB 43 3B 2F A0 09 FC 00 39 4A D 7C 6F 76 30 A4 A2 7D FA 12 B2 9A 04 3F 93 F1 71 E 81 90 DB 46 5D 7E EC 5F D3 E4 5C E0 D5 37 EA 65 F F8 8E DF 9B 54 2D 0D BF 35 1D 0E 70 A6 25 1F 4B The two affine transformations to construct the S-box are ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ 10001111 x0 1 f0 ⎢ f1 ⎥ ⎢ 1 1 0 0 0 1 1 1 ⎥ ⎢ x1 ⎥ ⎢ 1 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎢ f2 ⎥ ⎢ 1 1 1 0 0 0 1 1 ⎥ ⎢ x2 ⎥ ⎢ 0 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎢ f3 ⎥ ⎢ 1 1 1 1 0 0 0 1 ⎥ ⎢ x3 ⎥ ⎢ 0 ⎥ ⎥=⎢ ⎥ ⎢ ⎥ ⎢ ⎥ LA1 = ⎢ ⎢ f4 ⎥ ⎢ 1 1 1 1 1 0 0 0 ⎥ . ⎢ x4 ⎥ + ⎢ 0 ⎥, ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎢ f5 ⎥ ⎢ 0 1 1 1 1 1 0 0 ⎥ ⎢ x5 ⎥ ⎢ 1 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎣ f6 ⎦ ⎣ 0 0 1 1 1 1 1 0 ⎦ ⎣ x6 ⎦ ⎣ 1 ⎦ 00011111 0 f7 x7 ⎤ ⎡ ⎤ ⎡ ⎤ ⎤ ⎡ x0 10000011 1 f0 ⎢ f1 ⎥ ⎢ 1 1 0 0 0 0 0 1 ⎥ ⎢ x1 ⎥ ⎢ 1 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎢ f2 ⎥ ⎢ 1 1 1 0 0 0 0 0 ⎥ ⎢ x2 ⎥ ⎢ 0 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎢ f3 ⎥ ⎢ 0 1 1 1 0 0 0 0 ⎥ ⎢ x3 ⎥ ⎢ 0 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ ⎥=⎢ LA2 = ⎢ ⎢ f4 ⎥ ⎢ 0 0 1 1 1 0 0 0 ⎥ . ⎢ x4 ⎥ + ⎢ 1 ⎥. ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎢ f5 ⎥ ⎢ 0 0 0 1 1 1 0 0 ⎥ ⎢ x5 ⎥ ⎢ 1 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎣ f6 ⎦ ⎣ 0 0 0 0 1 1 1 0 ⎦ ⎣ x6 ⎦ ⎣ 0 ⎦ 00000111 0 f7 x7 ⎡
Because case 3 is used to construct the S-box, the expression S(x) below can be used to resolve the algebraic expression of the S-box. S(x) = 63 + 05 (LA2 (x))254 + 09 (LA2 (x))253 + f 9 (LA2 (x))251 + 25 (LA2 (x))247 + f 4 (LA2 (x))239 + 01 (LA2 (x))223 + b5 (LA2 (x))191 + 8f (LA2 (x))127 , S(x) = 05 ( 33 + 52 x + 77 x2 + · · · + 96 x64 + 27 x128 )254 + 09 ( 33 + 52 x + 77 x2 + · · · + 96 x64 + 27 x128 )253
On the Algebraic Expression of the AES S-Box Like S-Boxes
227
+ f 9 ( 33 + 52 x + 77 x2 + · · · + 96 x64 + 27 x128 )251 .. . + 8f ( 33 + 52 x + 77 x2 + · · · + 96 x64 + 27 x128 )127 + 63 .
Hence the algebraic expression of the given S-box is S(x) = 1C x254 + 1E x253 + 16 x252 + 98 x251 + 07 x250 + 58 x249 + 86 x248 + E2 x247 + B5 x246 + 11 x245 + 06 x244 + 8E x243 + BA x242 + 9E x241 + 3F x240 + A4 x239 + 22 x238 + 3C x237 + E4 x236 + 1A x235 + 9A x234 + 18 x233 + DD x232 + 99 x231 + 82 x230 + 4C x229 + 98 x228 + DE x227 + 25 x226 + F 8 x225 + 75 x224 + BB x223 + 81 x222 + F D x221 + D0 x220 + C9 x219 + 04 x218 + 74 x217 + F 6 x216 + B2 x215 + 39 x214 + 49 x213 + 0A x212 + F 9 x211 + 49 x210 + 3B x209 + 6C x208 + A7 x207 + 66 x206 + E3 x205 + 90 x204 + 42 x203 + B7 x202 + 5D x201 + 4F x200 + 8D x199 + DB x198 + 38 x197 + 9A x196 + 68 x195 + E5 x194 + 82 x193 + 50 x192 + 73 x191 + BD x190 + 06 x189 + A7 x188 + F 3 x187 + 1D x186 + 28 x185 + 46 x184 + 8C x183 + 04 x182 + CF x181 + 8C x180 + C8 x179 + 6E x178 + 59 x177 + 32 x176 + 51 x175 + DF x174 + A8 x173 + 91 x172 + A5 x171 + E7 x170 + 63 x169 + D5 x168 + A0 x167 + 1B x166 + 96 x165 + D3 x164 + 85 x163 + 58 x162 + AF x161 + C9 x160 + 88 x159 + 5E x158 + 2F x157 + A6 x156 + 9A x155 + 27 x154 + 84 x153 + 59 x152 + 91 x151 + C0 x150 + 83 x149 + 2B x148 + 1B x147 + BC x146 + 19 x145 + 30 x144 + 93 x143 + 96 x142 + 52 x141 + 2E x140 + 11 x139 + 3E x138 + 28 x137 + E3 x136 + E0 x135 + 95 x134 + 2C x133 + 0F x132 + 26 x131 + 99 x130 + F B x129 + 63 x128 + 7E x127 + 88 x126 + 14 x125 + A3 x124 + DD x123 + 94 x122 + 20 x121 + B4 x120 + 70 x119 + 7E x118 + B1 x117 + F 6 x116 + 0D x115 + 92 x114 + 1F x113 + 0B x112 + 62 x111 + 0D x110 + 3E x109 + 16 x108 + D6 x107 + F 8 x106 + E7 x105 + 47 x104 + 30 x103 + 42 x102 + CB x101 + 26 x100 + 05 x99 + 3B x98 + 26 x97 + 8C x96 + A8 x95 + 75 x94 + A1 x93 + 09 x92 + D9 x91 + 6A x90 + D1 x89 + 5A x88 + 45 x87 + 29 x86 + D1 x85 + C8 x84 + 5E x83 + 97 x82 + 28 x81 + 79 x80 + 59 x79 + C3 x78 + 48 x77 + 6F x76 + E8 x75 + 79 x74 + 3B x73 + DE x72 + A5 x71 + B5 x70 + EB x69 + 9C x68 + C3 x67 + DE x66 + 0D x65 + 23 x64 + F 9 x63 + 8A x62 + F 5 x61 + 5D x60 + B1 x59 + 7C x58 + 46 x57 + 5A x56 + F 9 x55 + 10 x54 + EE x53 + 55 x52 + 9D x51 + 8F x50 + C8 x49 + E6 x48 + 9D x47 + C2 x46 + F E x45 + 59 x44 + 3B x43 + 1F x42 + 1F x41 + BC x40 + 02 x39 + 20 x38 + E6 x37 + E6 x36 + 8B x35 + 7C x34 + B9 x33 + 81 x32 + 56 x31 + 95 x30 + 09 x29 + 02 x28 + 4D x27 + 6D x26 + 34 x25 + 5A x24 + 1D x23 + 02 x22 + 3E x21 + F B x20 + 41 x19 + 51 x18 + E6 x17 + EF x16 + 5D x15 + C7 x14 + B1 x13 + 78 x12 + BF x11 + F C x10 + D2 x9 + 51 x8 + F A x7 + BC x6 + A5 x5 + F 6 x4 + 15 x3 + 87 x2 + E7 x + C3 .
Student’s Polls for Teaching Quality Evaluation as an Electronic Voting System Marcin Kucharczyk Silesian University of Technology, Faculty of Automatic Control, Electronics and Computer Science, Institute of Electronics, ul. Akademicka 16, 44-100 Gliwice, Poland [email protected]
Abstract. The problems of electronic voting (e-voting) systems are commonly discussed in case of general election. The main problems of e-voting are related with the system security and the user’s anonymity. System security is the problem of cryptographic security, user’s authorization, limited access and protection against frauds. The anonymity is another important issue, because a guarantee that the voters are anonymous is reflected in the reliability of casted votes. Authorization and anonymity seems to be contradictory but it possible to separate both procedures. The problems of polls for teaching quality evaluation are similar. The polls need to be available only for authorized students but they also need to be filled in anonymously. Three solutions of such a problem in remote voting system are discussed in the document. Keywords: Electronic voting, system authorization, rights management, user anonymity, blind signatures.
1 Introduction The poll to test the academic staff related to the evaluation of education quality is an obligation imposed on the academic authorities by the legislature (in Poland). The university’s authority defines a scope of questions the respondents answer to and regulates access permission to the results of polls. Irrespective of the scope of questions and availability of results, the requirements for the polls are similar to those of fair and free elections. The rules can be written as follows [1, 2]: ─ A teacher should be estimated only by students who attended his course and only they should participate in the poll (in the general election vote rights are limited to the citizens living in the defined area). ─ The number of questionnaires filled in is limited for each student. He/she may cast only one poll for a chosen professor and his lectures, but he/she can evaluate many professors and their lectures at the same time (in the general election each voter casts the same amount of votes, usually one vote). ─ The questionnaires should be filled in anonymously (anonymity in the general election is guaranteed by appropriate provisions in acts and Constitution). The seemingly contradictory conditions, authorization and anonymity, can be fulfilled in the e-voting system, even in the one with the Internet access. All the rules mentioned F. Zavoral et al. (Eds.): NDT 2010, Part I, CCIS 87, pp. 228–233, 2010. © Springer-Verlag Berlin Heidelberg 2010
Student’s Polls for Teaching Quality Evaluation as an Electronic Voting System
229
before need to be met but depending on voting model used and its implementation the problems of a votes sale or anonymity break possibility can be observed. The results of student’s polls for teaching quality evaluation are not directly connected with the power like the results of a general election. Because of the nonexistent relation the votes trade problem is less important than anonymity of users so the research was directed on anonymity.
2 Traditional Polls Authorization of user during a standard voting consists on identification with the use of a document, that confirms the identity in front of the member of electoral committee in the polling station. The committee’s task is to check if the person who presents a document is its holder and, on the basis of the data from the document, to check in the base whether he or she is authorized to vote in this polling station. There is a double authorization in this case. The liability of the identity document and inspection of details which confirm its authenticity are subject to verification. The identity of the person who presents the document is also checked against the photograph on the document. After successful identification the voter gets the ballot paper, fills in the one and casts the vote to the ballot box. Next the votes are counted. The traditional student’s polls for teaching quality evaluation are similar. The questionnaires are printed and distributed only to eligible voters: students of the university. The student fills in the questionnaires, cast the ones in ballot box and then the results are draw up and published. The poll need to be carried out once in each semester. The questions in the questionnaire sometimes need some thought before the answer is provided and some time is required also to draw up the results. The teacher’s evaluation is obligatory so whole the procedure takes a time at least twice a year on each faculty at the university. The decision of moving polling to the Internet was taken to simplify the process of polling an make the system remotely available. The similar requirements of general election and the student’s polls directed researches to known e-voting systems which need to be adopted to the unique requirements defined for the student’s polls.
3 Authorization Separated from Identification The priority of the polling system of teachers assessment is the anonymity of voters, in order to assure the reliability of the filled in questionnaires. The access to the system requires identifier which is not related with the student’s data [3]. The random string of letters and numbers, called token, was proposed. Following confirmation of the student’s membership in a group of eligible voters (proving the student card when asked), the student draws a token which authorizes him/her to enter into the polling system. Number of the token, on the basis of the entry in the data base, limits the scope of voting, i.e. a list of available lecturers and the courses on questionnaires. The model above is similar to the system, used in some countries, i.e. the United States, Brazil, India, where voting is realized using electronic devices in regular polling stations. Person comes to polling station, confirms their authorization by showing at the polling station the identity card and casts a vote using electronic voting device
230
M. Kucharczyk
which is made available to the voter [1]. The voting act and the authorization of the voter are independent like in the above model of student’s polls. Because the token which gives authorization for voting does not depend on the personal data of the student who is using it, voting result is recorded together with the token. It does not violate the anonymity of the voter and enables editing of the questionnaire filled in. When the polling is closed, the voter can check if the questionnaire is still in the system and if it was not changed. The significant disadvantage for voters is that they need to go to the polling station to obtain the identifier (token) for authorization in the voting system.
4 System with Remote Authorization The base for the solution of system with remote access only was Estonian election system used for Internet voting since 2005 [4]. The voter is authorized as himself and the electronic votes are kept in the system until voting in regular polling stations is finished. Then the votes are deprived of the signatures and decoded by the polling station computers. It enables in defined time a change or removal of the vote by the voter. System user is not voting anonymously, but the creators of the voting system assure that the information about the voter’s identity will not be used to identify a person for each vote in the system. This assurance is also a guarantee that no-one will check what was the content of the vote of particular voter. When the voting is closed it is also impossible to check if the vote was not changed or removed. The inconvenience of the system is lack of vote verification by a voter [5]. It was eliminated in the proposed modification of the polls system [2]. A student need to identify himself in the electronic system not to cast a vote but to get the access entitlements. The local LDAP authorization database connected with the student’s course services called SOTS (http://sotsinfo.polsl.pl/) is used to authorize the student. Using the remote connection the voter is authorized and places electronic signature on the election list. After successful authorization he/she receives anonymous token which give him/her rights to use a polling system and fill in the questionnaires. The above solution base on the trust of the system user that the voting procedure is realized in accordance with the system creators. Violation of anonymity by a slight change of the system software is easy. The advantage of proposed model for student’s polls is possibility of vote check because it is casted anonymously. The fear that program code will be modified and the personal data of the voter will be saved with the vote can be dispelled by creating open source system [6]. Additional protection can be provided with code signing and using trusted platform module [7].
5 Anonymity with Blind Signature Protocol The proposed idea of removing anonymity break possibility was use of the blind signature protocol which was implemented in the polls system. Blind signatures have been designed for the first time by Chaum [8]. The protocol’s idea is as follows: 1. A voting person prepares a message m and than encodes it using random value (blinding factor) r. It results in encoded message m'.
Student’s Polls for Teaching Quality Evaluation as an Electronic Voting System
231
2. Message m' is delivered to the voting institution for the electronic signature. Person who requests signature is a subject to authorization and only after the successful one, the authorization center is signing the message, which results in signed message s' (the data which make impossible to retrieve the signature are collected). 3. The voter removes from the received message s' the secret value r and the result is a message s, which is a digitally signed original message m. 4. The voter sends message m and its signed version s to the voting system, which checking its own signature can confirm the voter’s rights for vote. For mathematical example, using RSA algorithm, the procedure is as follows: 1. 2. 3. 4.
m' = (m·re) mod n, where (e, n) is a public key of the authorization center; s' = (m')d mod n, where (d, n) is a private key of the authorization center; s = s'·r–1 mod n; s ≡ md mod n, because in RSA algorithm red ≡ r, so s'·r–1 = (m')d· r–1 = (m · re)d· r–1 = md r·r–1 = md, which is an original message signed by the authorization center.
The most often electronic voting system uses blind signatures for encoding and signing of the election votes [9]. The voter casts a vote anonymously using one system. The vote is locally processed (encoded and encrypted by a defined hash function) and the signature request on the blinded vote is send to other system, which is authorizing the voter. Proper authorization of the voter is a condition for obtaining signature. At the same time e-voting system records, that the voter used his rights and blocks the possibility to vote more than once. After the removal of the secret component (blinding factor) of the message from the signed request the voter uses the voting system once again, where he sends the signed version of his vote. Poll system allows the student to cast just one vote for a chosen professor and his lecture, but at the same time the student casts many votes concerning a dozen of professors and their lectures or classes. The right for multiple voting is the main difference between student’s polls and general elections. If the blind signatures protocol will be implemented in the polls system as described above then each vote need to be blinded and signed before it is cast so the voter would have to carry out the signing procedure many times. The inconvenience can avoided using blind signatures for the approval of the identifier required for authorization in the voting system [3]. The blind signature protocol using RSA algorithm was implemented in the student’s poll system for education quality evaluation. The offline application, with available source code, is used for blinding the token and unblinding the signed one. The anonymity of the voter is assured by the random factor r and blinding algorithm implemented. The procedures of obtaining the identifier, signing it and authorization in the voting system are independent. As a result of this independence, each stage may be realized on different computer and at different time. It makes it more difficult to trace the user’s identity on the basis of the voting time or the location. The cost of the anonymity is more complex, so less convenient procedure of voting.
6 Implementation Details and Conclusion Based on known electronic voting solutions the polls system for teaching quality evaluation directed to voter’s anonymity was implemented. The unique requirements
232
M. Kucharczyk
of the student’s polls caused some modifications of voting models described before. The authorization procedure for vote was separated from the voter’s personal data so the opinion given can be checked and changed. system with printed tokens
Authirization with ID Card
system with remote access only - tokens are assigned after LDAP authorization
Blind Signature Protocol
Direct LDAP Authirization voter can choose procedure
ID Card
computer checks
man checks
personal data
OK
OK
computer checks
access is granted
voter draws the access key
?
inactive access key is blinded
OK
hidden key is signed
blinding factor is removed
anonymous voter presents the key
voter access with anonymous signed key
access for professors LDAP authorization, view own results only
poll results (the votes)
administration access professors and lectures database
system authorization, view results and databases administration
Fig. 1. Different access possibility to the polls system for teaching quality evaluation
Polls data can be accessed by the voters, evaluated professors and administrators (public access is currently limited by regulations) – see fig. 1. The administrator chooses the type of token distribution model when the new poll is started. The option of printing the tokens and receiving the ones with remote authorization are mutually exclusive. But if the remote access only model was activated then the voter can chose direct authorization or the use of blind signature protocol. Voter’s access is limited by the token number to the defined poll only. The professors have read only access to the results of polls which evaluate his own lectures and classes. The administration access includes: results view, professors and lectures lists edit and poll creation. Administrators can’t change or delete the questionnaires filled in. Overall results of polls are presented in the tables with detailed data available for chosen professor and lectures. It is easy to change/add a questionnaire pattern in the system and the way of results presentation depends on the one used. A graphical results presentation can also be added on account of users requirements. The PHP language was used for the main system implementation. Authorization data of system users and the results of polls are stored in MySQL database. External Active Directory service through LDAP protocol is used for students and professors
Student’s Polls for Teaching Quality Evaluation as an Electronic Voting System
233
personal authorization. Administration access uses web server authentication with additional limits implemented. An arbitrary precision arithmetic calculations used in the RSA algorithm and in the blind signature protocol was made using OpenSSL and GNU MP libraries. All mentioned tools are widely available open source software. The purpose of creation an electronic system for student’s polls was accelerate voting and counting votes, facilitate participation in the election and increase the turnout. The turnout for electronic polls was at most 20% in some periods and it was greater than for traditional polls. A slightly better result was achieved using the printed tokens then the tokens obtained after authorization in remote system. Only about 10% of the second ones was obtained using blind signature protocol. The results of using electronic system shows that anonymity is important for users but the most important is that the system is easy to use and to understand for voters.
References 1. Epstein, J.: Electronic Voting. Computer 40(8), 92–95 (2007) 2. Kucharczyk, M.: Assessment of Teachers and Classes – Measurement of Teaching Quality by the Internet. In: iCEER’07 Proceedings, The 2007 International Conference on Engineering Education and Research, Australia (2007) 3. Cetinkaya, O., Doganaksoy, A.: Pseudo-Voter Identity (PVID) Scheme for e-Voting Protocols. In: The Second International Conference on Availability, Reliability and Security (ARES 2007), Austria, pp. 1190–1196 (2007) 4. Estonian National Electoral Committee: Internet Voting in Estonia, http://www.vvk.ee/index.php?id=11178 5. Ansari, N., Sakarindr, P., Haghani, E., Zhang, C., Jain, A.K., Shi, Y.Q.: Evaluating Electronic Voting Systems Equipped with Voter-Verified Paper Records. IEEE Security & Privacy 6(3), 30–39 (2008) 6. Open Voting Consortium: Open Source Voting: Accurate, Accountable, http://www.openvotingconsortium.org/ 7. Paul, N., Tanenbaum, A.S.: Trustworthy Voting: From Machine to System. Computer 42(5), 23–29 (2009) 8. Chaum, D.: Blind Signatures for Untraceable Payments. In: Crypto’82, pp. 199–203. Plenum Press, New York (1983) 9. Ibrahim, S., Kamat, M., Salleh, M., Aziz, S.R.A.: Secure E-voting with Blind Signature. In: NCTT 2003 Proceedings, 4th National Conference on Telecommunication Technology, Malaysia (2003)
An Improved Estimation of the RSA Quantum Breaking Success Rate Piotr Zawadzki Institute of Electronics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland [email protected]
Abstract. The security of RSA cryptosystem is based on the assumption that factorization is a difficult problem from the number theoretic point of view. But that statement does not hold with regard to quantum computers where massive parallelization of computations leads to qualitative speedup. The Shor’s quantum factorization algorithm is one the most famous algorithms ever proposed. That algorithm has linear time complexity but is of probabilistic nature. It succeeds only when some random parameter fed at algorithm input has desired properties. It is well known that such parameters are found with probability not less than 1/2. However, the described in the paper numerical simulations prove that probability of such event exhibits grouping at some discrete levels above that limit. Thus, one may conclude that usage of the common bound leads to underestimation of the successful factorization probability. Empirical formulas on expected success probability introduced in the paper give rise to the more profound analysis of the Shor’s algorithm classic part behaviour. The observed grouping still awaits for explanations based on number theory. Keywords: Quantum computation, factorization.
1
Introduction
Theoretical study of quantum systems serving as computational devices has achieved tremendous progress in the last several years. There exist strong theoretical evidence that quantum computers are able to break all presently used asymmetrical algorithms whose security is based on computational complexity. Qualitative progress results from the massive parallel computation realized as the quantum system controlled evolution. The efficient function period finding seems to be one of the most stimulating development in the field, as it provides efficient solution for factorization problem, which seems to be the Holy Grail of the classic algebra. Presently, interest in the factoring problem is especially great for composite integers being a product of two large prime numbers – the ability to factor such integers is equivalent to the ability to read information encoded via the RSA cryptographic system [3]. Thus, quantum computers, if built, pose F. Zavoral et al. (Eds.): NDT 2010, Part I, CCIS 87, pp. 234–240, 2010. c Springer-Verlag Berlin Heidelberg 2010
a serious challenge to the security of today's asymmetric cryptographic systems. However, quantum factorization is a probabilistic process and a profound analysis of its efficiency is required. The algorithm's random behaviour comes both from the inherent nature of quantum measurement and from the specific construction of the supporting classical calculations. Researchers in the field are mainly interested in the success probability of the quantum part and have proposed many modifications improving that aspect of the original Shor's version [4,5]. It is reported that quantum devices are extremely reliable if multiple measurements and sophisticated post-processing are employed [1]. Much less attention has been devoted to the classical part, although it has a large impact on the efficiency of the overall factorization procedure. It was proved in [2] that the lower bound on the probability of finding a parameter suitable for factorization is equal to 1/2. A failure of the classical parameter selection causes repetition of the entire algorithm, which is very undesirable as repetitive runs of the quantum device are undoubtedly costly both in terms of time and money. Fine-grained formulas for the above-mentioned probability are therefore vital to the overall cost estimation of the factorization procedure. The aim of this paper is to provide fine-grained formulas for the success rate of the classical part, compared to the commonly used and relatively crude estimation introduced in [2]. The influence of this step on the overall algorithm performance can be found in [10]. The details of the quantum period-finding algorithm are described in many textbooks [6] and, due to lack of space, are omitted here. The next section describes the classical part of the quantum factorization and its relation to the quantum one. The methodology of simulating the quantum factorization on a classical computer is introduced in Section 3. Section 4 presents the results and novel analytical expressions for the expected success probability.
2 Shor's Algorithm
In the analysis of the quantum factorization it will be assumed that it is possible to build a quantum device that recovers the period of a function in polynomial time. Finding the period of a function may not seem a very exciting application of quantum computers. However, the security of the RSA [7] and ElGamal [7] cryptosystems is in fact based on the difficulty of finding the period of the modular exponential function f(x) = a^x mod N. In number theory the period of this function is regarded as a property of a and is called its order. Knowledge of the function period allows for efficient modulus factorization or discrete logarithm calculation. In effect, a polynomial-time period-finding algorithm opens a way to breaking those asymmetric cryptosystems. The assumption that factorization of a composite number formed as the product of two large primes cannot be performed in polynomial time lies at the core of RSA security. Presently, the best classical algorithm performs factorization in time proportional to e^{n^{1/3} (log n)^{2/3}}, where n is the size of the composite number expressed in bits. However, the assumption of factorization ineffectiveness is purely empirical – there is no theoretical proof that this property holds even for classical computers.
The algorithm proposed in [8] completely changed the view on factorization complexity. It exploits a reduction of factorization to order finding based on the following observation. Let N be the composite number and let a < N be coprime to N. The order of a is the smallest number r such that a^r mod N = 1; thus r is the period of the exponential function. If the order of a is even, then one may write

(a^{r/2} − 1)(a^{r/2} + 1) mod N = 0    (1)

and

p = gcd(a^{r/2} − 1, N)    (2)

is a nontrivial factor of N provided that

a^{r/2} mod N ≠ −1.    (3)
Thus factorization of the composite number N is reduced to finding the order of some number a and, in fact, to finding the period of the modular exponential function. Order finding with a classical computer gives no advantage over other factorization algorithms, as its complexity is also exponential. However, the order of a may be determined in polynomial time with the help of a quantum computer [6].
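A minimal Python sketch of this classical reduction is given below. It is illustrative only: the order-finding routine is brute force, standing in for the quantum subroutine, and is therefore practical only for very small N.

```python
from math import gcd

def order(a, N):
    """Brute-force order finding: smallest r with a^r = 1 (mod N).
    A quantum computer would perform this step in polynomial time;
    classically it takes exponential time."""
    r, x = 1, a % N
    while x != 1:
        x = (x * a) % N
        r += 1
    return r

def factor_from_order(a, N):
    """Classical post-processing: return a nontrivial factor of N,
    or None if a is not a 'lucky' choice (conditions of Eqs. (1)-(3))."""
    if gcd(a, N) != 1:
        return gcd(a, N)            # a already shares a factor with N
    r = order(a, N)
    if r % 2 != 0:
        return None                 # odd order: retry with another a
    if pow(a, r // 2, N) == N - 1:
        return None                 # a^(r/2) = -1 (mod N): condition (3) fails
    p = gcd(pow(a, r // 2, N) - 1, N)
    return p if 1 < p < N else None

print(factor_from_order(2, 15))     # 3, since ord(2) = 4 and gcd(2^2 - 1, 15) = 3
```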
3 Numerical Simulation
The following steps summarize Shor's algorithm for quantum factorization of a composite number N:
1. Select a random number a coprime to N (otherwise gcd(a, N) is already a factor of N). Only some a are good candidates, as the order of a determined in the next step has to be even and condition (3) must be fulfilled.
2. Find the order of a with the quantum computer. The order is successfully recovered only for some subset of valid quantum measurements.
3. Calculate the divisor p from equation (2) and return to step 1 with N = N/p.
It is clear that the nature of the above algorithm is probabilistic. The sources of uncertainty are twofold: the randomness of the quantum period finding and the random selection of a number a with the desired properties (3). The success probability of the quantum order recovery is well known [6,2]. It is related both to the inherent uncertainty of quantum measurement and to properties of the classical continued fraction expansion algorithm used in post-processing of the measurement result. The quantum measurement failure rate may be reduced to an arbitrary level by a proper expansion of the quantum representation of the modulus N. The second part is successful only when the measurement output is coprime to the order to be recovered [10]. On the other hand, the influence of the proper selection of a on the algorithm's effectiveness requires further investigation. The literature on the
subject provides only a lower bound on the probability that a possesses all required properties [2]:

p(a) ≥ 1 − 2^{−(k−1)},    (4)

where k is the number of prime factors of N. That lower bound attains its maximal value of 1/2 when the composite number is a product of only two primes, which in fact represents the most interesting situation in the context of RSA security considerations. Relatively simple code is required for the numerical calculation of p(a) (see Table 1). The classical order-finding algorithm has exponential complexity and very quickly becomes a daunting task for typical PC architectures. Moreover, exponentiation of a quickly leads to overflow when the standard integer representation is used. To overcome this problem one has to use a library providing arbitrary-precision arithmetic, which in turn additionally slows down the program execution.

Table 1. Pseudocode for Shor's factorization success probability calculation
success(N) { coprime=1 ; lucky=0 ; a=2 ; while (a
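A runnable Python version of the calculation sketched in Table 1 could look as follows. This is an interpretation of the description in the text; in particular, reading the coprime counter initialisation as accounting for a = 1 (which always has odd order 1 and is therefore never lucky) is an assumption.

```python
from math import gcd

def multiplicative_order(a, N):
    # Smallest r with a^r = 1 (mod N); exponential-time, fine for small N.
    r, x = 1, a % N
    while x != 1:
        x = (x * a) % N
        r += 1
    return r

def success_probability(N):
    """Fraction of residues coprime to N that are 'lucky': their order is
    even and condition (3) holds, i.e. a^(r/2) != -1 (mod N)."""
    coprime, lucky = 1, 0          # the initial 1 counts a = 1, never lucky
    for a in range(2, N):
        if gcd(a, N) != 1:
            continue
        coprime += 1
        r = multiplicative_order(a, N)
        if r % 2 == 0 and pow(a, r // 2, N) != N - 1:
            lucky += 1
    return lucky / coprime

print(success_probability(15))   # 0.75 -> parity levels (1, 2)
print(success_probability(21))   # 0.5  -> parity levels (1, 1)
```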
4 Results
The pseudocode from Table 1 was implemented in C++ with the help of the Class Library for Numbers, which provides a transparent interface for modular arithmetic. The probability of selecting a parameter a suitable for factorization was computed for composite numbers of the form N = pq, where the factors were taken from the list of the first 500 primes. The calculated probabilities are shown in Fig. 1; for clarity, only composites not exceeding 5000 are presented in the plot. The observed probabilities are always greater than 1/2, as predicted by the bound (4). However, the presented simulation results exhibit a deeper structure. The grouping of points around some discrete levels is evident
and indicates the existence of a class of composites less resistant to quantum factorization than others. The observed behaviour comes from properties of the factors p and q, but there is no satisfactory theoretical explanation of the obtained results in the literature known to the author.
[Figure: p(a) plotted against the composite N (N ≤ 5000); the discrete levels are labelled by the parity-level pairs α:β, from 1:1 at 0.5 up to 1:4 near the top of the plot.]
Fig. 1. Classic factor selection success probability
The introduction of empirical formulas for the observed probability levels is the main contribution of this paper. The factors are prime numbers, so they have to be odd (the trivial case of the factor 2 is excluded) and may be expressed as

p = 2^α μ + 1,    (5)
q = 2^β ν + 1,    (6)

with μ and ν odd. Let us call α and β the parity levels of the factors. The probability of a "lucky" parameter a selection for the composite N = pq is then given by the expression

p(a) = f(α, β) = 1 − (1 + Σ_{δ=1}^{min{α,β}} 4^{δ−1}) / 2^{α+β}.    (7)
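Assuming the reading of (7) given above, the levels it predicts can be tabulated with a few lines of Python; the pairs listed here are the ones labelled in Fig. 1.

```python
def f(alpha, beta):
    # Predicted "lucky parameter" probability level for N = p*q with
    # parity levels alpha and beta, following Eq. (7).
    s = 1 + sum(4 ** (d - 1) for d in range(1, min(alpha, beta) + 1))
    return 1 - s / 2 ** (alpha + beta)

for a, b in [(1, 1), (2, 2), (1, 2), (2, 3), (1, 3), (1, 4)]:
    print(f"{a}:{b} -> {f(a, b):.4f}")
# 1:1 -> 0.5000, 2:2 -> 0.6250, 1:2 -> 0.7500,
# 2:3 -> 0.8125, 1:3 -> 0.8750, 1:4 -> 0.9375
```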
The levels predicted by (7) are shown in Fig. 1 as solid lines, with the respective values of α and β labelled on the right. The agreement between the simulation results and the theoretical predictions is clearly visible. The points not lying on the marked levels come from less probable combinations of parity levels that are not drawn in the figure for clarity. The lack of correlation between bits in the representation of a prime number is one of its most useful cryptographic properties. It is also very helpful in counting prime numbers with a given parity level. The least significant bit of a prime's binary representation is always "1", as a prime has to be odd, so α ≥ 1. Primes with parity level α ≥ 2 have the second least significant bit set to "0", primes with α ≥ 3 have the second and third least significant bits set to "0", and so on. Thus the probability that a randomly selected prime number has parity level not less than α equals
to 2^{−(α−1)}. The probability that a randomly selected prime number has parity level exactly equal to α is therefore

P(α) = 2^{−(α−1)} − 2^{−α} = 2^{−α}.    (8)
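The parity level of a prime is simply the exact power of two dividing p − 1, which makes it easy to compute; a small illustrative helper:

```python
def parity_level(p):
    # alpha such that p = 2^alpha * mu + 1 with mu odd, i.e. the exact
    # power of two dividing p - 1.
    alpha, m = 0, p - 1
    while m % 2 == 0:
        alpha, m = alpha + 1, m // 2
    return alpha

for p in [3, 5, 7, 11, 13, 17, 97]:
    print(p, parity_level(p))   # 3->1, 5->2, 7->1, 11->1, 13->2, 17->4, 97->5
```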
The above considerations have been verified experimentally. The parity level probability density function P(α) was computed for the primes less than 10^7. The comparison of the numerical experiment with the theoretical prediction (8) is presented in Fig. 2.
[Figure: log2(P(α)) plotted against α (0–20), comparing the numeric results with the 2^−α prediction.]
Fig. 2. Parity level probability density function P(α)
Some interesting conclusions follow from (7) and (8). The lower bound p(a) = 1/2 is reached only when α = β = 1. Only P(α = 1)P(β = 1) = 25% of composite numbers of the RSA form fulfil this condition. Thus, for 75% of the factorization cases one may expect faster algorithm convergence than the one estimated from the commonly used bound (4). On average, the selection probability of the lucky parameter is equal to

Σ_{α=1}^{∞} Σ_{β=1}^{∞} f(α, β) P(α) P(β) = 0.736.    (9)

Also, from (7) and (8) one can numerically estimate the percentage of composite numbers for which the lucky parameter selection probability is above some threshold; for instance, more than 20% of composites have p(a) > 0.9:

Σ_{α,β: f(α,β)>0.9} P(α) P(β) = 20.5%.    (10)
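Sums of this kind can be evaluated directly by truncating the infinite ranges, as sketched below; the exact figures obtained depend on the truncation and on whether the idealized weights 2^{−α} or the empirically measured prime distribution are used, so this is only a sketch of the computation, not a reproduction of the quoted values.

```python
def f(alpha, beta):
    # Eq. (7): predicted lucky-parameter probability for parity levels alpha, beta.
    s = 1 + sum(4 ** (d - 1) for d in range(1, min(alpha, beta) + 1))
    return 1 - s / 2 ** (alpha + beta)

def P(alpha):
    # Eq. (8): probability that a random prime has parity level alpha.
    return 2 ** -alpha

LIMIT = 60   # truncation of the infinite sums; the tail contribution is negligible
pairs = [(a, b) for a in range(1, LIMIT) for b in range(1, LIMIT)]

average = sum(f(a, b) * P(a) * P(b) for a, b in pairs)          # cf. Eq. (9)
above_09 = sum(P(a) * P(b) for a, b in pairs if f(a, b) > 0.9)  # cf. Eq. (10)
print(average, above_09)
```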
5 Conclusion
The quantum factorization algorithm represents a breakthrough in complexity theory and modern cryptography. Shor's algorithm owes its fame to the polynomial-time breaking of virtually all presently used public key algorithms. Unfortunately, practical breaking is out of reach for now, because factorization of
the number 15 is still one of the most complicated quantum computations performed [9]. However, very rapid progress in the field is observed, so it is difficult to estimate the time horizon when practical computations will be within scientists' reach. Quantum factorization has been analysed many times, and modifications of the original algorithm improving its speed and efficiency have been proposed. However, researchers have so far concentrated on the probabilistic aspect of the quantum measurement. The randomness introduced by the classical parts of the algorithm still requires further investigation. The computer simulation results presented herein show that the success rate of the algorithm is usually underestimated. Theoretical considerations are required to validate the empirical formulas introduced in the paper.
References 1. Bourdon, P.S., Williams, H.T.: Probability estimates for Shors algorithm. Quant. Inf. Comput. 7(5&6), 522–550 (2007) 2. Ekert, A., Jozsa, R.: Quantum computation and Shor’s factoring algorithm. Rev. Mod. Phys. 68(3), 733–753 (1996) 3. Gerjuoy, E.: Shor’s factoring algorithm and modern cryptography. An illustration of the capabilities inherent in quantum computers. Am. J. Phys. 73(6), 521– 540 (2005), http://arxiv.org/pdf/quant-ph/0411184 4. Knill, E.: On Shor’s quantum factor finding algorithm: Increasing the probability of success and tradeoffs involving the Fourier Transform modulus. Tech. Rep. LAUR95-3350, Los Alamos National Laboratory (1995), http://www.eskimo.com/~ knill/cv/reprints/knill:qc1995c.ps 5. McAnally, D.: A refinement of Shor’s algorithm (2001), http://xxx.lanl.gov/pdf/quant-ph/0112055 6. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge (2000) 7. Schneier, B.: Applied Cryptography. John Wiley & Sons, Chichester (1996) 8. Shor, P.W.: Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Sci. Statist. Comput. 26, 1484–1509 (1997), http://www.citebase.org/abstract?id=oai:arXiv.org:quant-ph/9508027 9. Vandersypen, L.M.K., Steffen, M., Breyta, G., Yannoni, C.S., Sherwood, M.H., Chuang, I.L.: Experimental realization of Shor’s quantum factoring algorithm using nuclear magnetic resonance. Nature 414, 883–887 (2001) 10. Zawadzki, P.: A numerical simulation of quantum factorization success probability. In: Tkacz, E., Kapczyski, A. (eds.) Internet– Technical Developments and Applications. Advances in Intelligent and Soft Computing, vol. 64, pp. 223–231. Springer, Heidelberg (2009)
Mining Bluetooth Attacks in Smart Phones
Seyed Morteza Babamir1, Reyhane Nowrouzi2, and Hadi Naseri3
1 University of Kashan, Department of Computer, Kashan, Iran
2 Islamic Azad University of Fars, Science and Research branch, Shiraz, Iran
3 Islamic Azad University of Estahban, Estahban, Iran
[email protected], [email protected], [email protected]
Abstract. The Bluetooth port of smart phones is subject to the threat of Bluesnarfing, Bluejacking and Bluebugging attacks. In this paper, we aim to mine these attacks. Having explained the properties of the three attack types, we state three typical cases of them: SMS manipulation, phone book manipulation, and phone call initiation. According to the characteristics of each attack type, we model the attacks using Colored Petri-Nets and then mine the models. To show the correctness of our models, we verify their liveness, fairness, and boundedness properties. Finally, we mine the models to analyze the attacks. Keywords: Bluetooth Attack, Process Mining, Security.
1 Introduction
Bluetooth, which provides simple and easy communication among users, is a convenient way of transferring programs and information between smart phones. However, such a suitable technology is also exposed to viruses and malicious software used for intrusion. In 2007, McAfee Inc., the well-known anti-virus vendor, reported that 83% of mobile operators stated they had suffered infections on their mobile devices [1]. A common categorization of attacks on smart phones via the Bluetooth port is: Bluebugging, Bluejacking and Bluesnarfing [2,3,4], which cause many problems for smart phone users. The Bluebugging attack type was discovered and successfully demonstrated by Herfurt and Laurie on 50 phones during CeBIT 2004 [4]. Typical kinds of these attacks are: malicious command execution on a remote user's smart phone, phone call initiation, reading and writing of SMS, access to the victim's private data and contacts, eavesdropping, and Internet connection while the phone holder is neither ready for nor aware of it. Once connected, the attacker sends a couple of commands for initiating calls or SMSs, updating the smart phone's phonebook, or forwarding a call [4]. Thereafter, without the awareness of the attacked phone's owner, the attacker can eavesdrop on its calls and SMSs. The Bluejacking attack type does not change data on the remote user's smart phone, but it sends unsolicited data via Bluetooth, so that the Bluejacked user may think that their phone is malfunctioning. Bluejacking does not usually have a malicious purpose, but
repeating messages can be annoying. Since certain Bluetooth applications, such as the Object Exchange Push Profile, require no authentication, an attacker can simply send unsolicited data, including text, images, video, or sound, to a smart phone. Therefore, this attack type can be used to distribute viruses or trojan horses. Bluejacking originated with a user named ajack, who developed the first Bluejacking software, called SMan, for the Symbian operating system of smart phones. The Bluesnarfing attack type concerns the theft of information, such as copying the entire contact book, calendar or anything else stored in the phone's memory, through a Bluetooth connection. Bluesnarfing was identified by Marcel Holtmann and independently by Adam Laurie in 2003 [4]. Laurie discovered the vulnerability that enables Bluesnarfing when he was testing the security of Bluetooth devices. In this paper, with regard to the access level to the smart phone and the kind of intrusive action, we first propose three common attacks: (1) SMS manipulation, (2) phone call initiation, and (3) phone book manipulation. In the second step, we consider the properties of the Bluebugging, Bluejacking and Bluesnarfing attack types and decide on associating the common threats with the attack types. In the third step, we use Colored Petri-Nets [5] to model the three common attacks based on their properties. Then, by executing the models, we construct logs to show the model properties, consisting of boundedness, liveness, and fairness. In the fourth step, with the aim of recognizing attacks, we analyze the models through mining them; analyzing the models is one of the basic and key processes of our approach.
2 Preliminaries
A Petri-Net is a visual modeling formalism that makes it possible to study systems. We can obtain useful information about the dynamic behavior and the structure of the modeled system by analyzing its Petri-Net. A Petri-Net consists of a collection of places (P), possibly containing tokens (marks), a collection of transitions (T), and a collection of functions (F). The functions connect transitions and places and determine how tokens are removed from the input places of a transition and how tokens are added to the output places of the transition. A Colored Petri-Net is a Petri-Net in which tokens are colored, so that each color indicates a token type such as Integer. Petri-Nets, through visual modeling, provide a basis for modeling concurrent, synchronous and distributed systems. To model the Blue-attacks, we use Colored Petri-Nets to identify and distinguish smart phone users, in order to make decisions and apply special policies for each user. By assigning a colored token to a user, we identify him/her in each step and implement his/her related policy. For constructing the Colored Petri-Nets, we use CPNTools [6], a powerful industrial visual modeling tool.
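To make the place/transition/coloured-token terminology concrete, the following is a minimal, hand-rolled sketch in Python (not CPNTools/CPN ML syntax) of firing a transition in a net whose tokens carry a colour identifying the user; the place and transition names are illustrative.

```python
# A minimal coloured token game: places hold (colour, value) tokens and a
# transition fires by consuming a token from its input places and producing
# one in its output places.
from collections import defaultdict

marking = defaultdict(list)                 # place name -> list of coloured tokens
marking["hack"] = [("alpha", 1), ("beta", 1)]

def fire(transition, inputs, outputs, colour):
    """Fire `transition` for the token of the given colour, moving it from
    every input place to every output place (1`i-style single-token arcs)."""
    for p in inputs:
        tok = next(t for t in marking[p] if t[0] == colour)
        marking[p].remove(tok)
    for p in outputs:
        marking[p].append((colour, 1))
    print(f"{transition} fired for {colour}: {dict(marking)}")

fire("connect", inputs=["hack"], outputs=["phone"], colour="alpha")
fire("phone book", inputs=["phone"], outputs=["openpb"], colour="alpha")
```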
3 Bluetooth Attacks
Bluetooth technology provides short-range, low-cost wireless connectivity. The maximum Bluetooth range is about 10 meters. The technology operates in the licence-free 2.4 GHz band. The maximum
bandwidth of a smart phone Bluetooth connection is 3 megabits per second. Security is an important subject in Bluetooth. Four security goals for making connections via Bluetooth safe are introduced: confidentiality, authorization, authentication and integrity. To achieve these goals, three modes are defined: no security, service-level security and link-level security. In the no-security mode, Bluetooth does not use any security strategy. In the service-level security mode, two Bluetooth-enabled mobiles may use a link without confidentiality or integrity protection at the service step. In the link-level security mode, the mobile applies security measures before the connection channel is established, so the two mobiles use a secure channel for communicating. These security measures include authentication, authorization and optional encryption. Authentication establishes the identity of the Bluetooth-enabled mobile. Authorization is the process of accepting or rejecting access. Encryption is the translation of the exchanged data into a secure code. In spite of all these security mechanisms, there are dangers that threaten Bluetooth. Bluetooth attacks are classified into three groups [6].

3.1 Modeling the Attacks
To model the attacks – SMS manipulation, phone call initiation, and phone book manipulation – we first classify them (Table 1) based on the characteristics introduced in Section 1, and then we model them as Colored Petri-Nets.

Phone call initiation attack. This attack, which is a sort of Bluebugging, causes a phone call to be initiated. In this sort of attack, the hacker can enter the contact list and initiate a phone call using the data stored in the contact list or in the received, missed, and dialed calls. The hacker can also enter a number on the hacked phone using its keyboard and then make the hacked phone call that number; the dialed party therefore thinks that the hacked phone's owner called him/her. Figure 1 shows the Petri-Net model of this attack along with its state space. The "1`i" notation denotes taking the i-th token away from an input place and putting it into an output place. In this model, having connected to the hacked phone, the hacker is in the "phone" place. Afterwards, the hacker can perform one of the following:
- He/She enters the phone book, goes to the "openpb" place, and then, by executing the "contact1" transition, makes a call and goes to the "connect" place;
- He/She enters the call log, goes to the "openlog" place, and from there opens one of the three other places – missed calls, received calls, and dialed calls. Having reached one of these places, he/she can choose a number and then make a connection;
- He/She enters a number via the keyboard, goes to the "recon" place, makes a phone call, and goes to the "connect" place.

Table 1. Classifying attacks
  Attack                      Type
1 Phone call initiation       Bluebugging
2 Phone book manipulation     Bluebugging, Bluejacking
3 SMS manipulation            Bluebugging, Bluejacking
Fig. 1. Phone call initiation model and its boundedness
Fig. 2. Phone book manipulation model and its boundedness
Phone book manipulation attack. In this attack, the hacker can edit, delete, or store data located in the phone book, or copy it back into his/her own phone to edit it. Copying data back into the hacker's phone is a Bluesnarfing attack type, whereas manipulation, deletion and editing of data are Bluebugging ones. Figure 2 shows the manipulation of the phone book along with its state space. In this model, having connected to the hacked
phone, the hacker goes to the "phone" place, opens the phone book, and goes to the "openph" place. In this mode, the hacker can perform one of the following:
- He/She closes the phone book ("close") and goes back to the "phone" place;
- He/She chooses a number, goes to the "contact" place, and makes a phone call; finally, by pressing the end button, the conversation ends and he/she returns to the "openph" place;
- He/She chooses the new button, goes to the "insert" place, creates a new contact, stores it in the target phone, and finally returns to the "openph" place;
- He/She opens a number and goes to the "change" place; in this mode he/she can change some data and restore the changed data into the hacked phone;
- He/She deletes a phone number and then returns to the "openph" place.

SMS manipulation attack. This attack, a Bluebugging attack type, is the hacker's access to the SMSs of the hacked phone in order to manipulate them. In this attack, the hacker penetrates the hacked phone's inbox and reads its messages. The hacker is also able to put messages into the hacked phone's inbox and to send messages to the numbers stored in the hacked phone's contacts. Figure 3 shows this model along with its state space. In this model, having connected to the hacked phone and opened the SMS inbox, the hacker goes to the "phone" place and can perform one of the following:
- He/She goes to the "read" place in order to read some SMS and finally returns via the "close" place;
- He/She goes to the "new message" place and starts composing a new message (the "write" place); he/she then enters the target phone's number, which puts the SMS into the "ready for send" place, and finally sends the SMS by pressing the "send" button.
Fig. 3. SMS manipulation model and its boundedness
4 Verifying Models
We verify the models for the boundedness, liveness and fairness properties. We show the boundedness of a model by the absence of infinity in its state space, and we show the liveness and fairness properties via a log file.

Verifying boundedness. To verify the boundedness of a model visualized as a Petri-Net, we generate its state space. If there is no infinity in the state space, the model is bounded and is analyzable. We generate the state space of a model through the reachability graph produced by CPNTools; to analyze a model we therefore have to simulate it. In Figures 1 to 3, the reachability graph of each model, indicating the absence of infinity in its state space, is shown.
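The boundedness check amounts to exploring the occurrence (reachability) graph and confirming that it is finite. A minimal sketch of such an exploration for a plain place/transition net is given below (colours are ignored, and a cap is used as a practical safeguard); the toy net and the cap are illustrative assumptions, not the paper's models.

```python
from collections import deque

def reachable_markings(initial, transitions, cap=10000):
    """Breadth-first exploration of the reachability graph. `transitions`
    maps a name to (consume, produce) dictionaries of place -> token count.
    Returns the set of reachable markings, or None if more than `cap`
    markings are found (a practical stand-in for an unbounded state space)."""
    def fire(m, consume, produce):
        if any(m.get(p, 0) < n for p, n in consume.items()):
            return None
        m2 = dict(m)
        for p, n in consume.items():
            m2[p] -= n
        for p, n in produce.items():
            m2[p] = m2.get(p, 0) + n
        return m2

    seen, queue = {tuple(sorted(initial.items()))}, deque([initial])
    while queue:
        m = queue.popleft()
        for consume, produce in transitions.values():
            m2 = fire(m, consume, produce)
            if m2 is None:
                continue
            key = tuple(sorted((p, n) for p, n in m2.items() if n))
            if key not in seen:
                if len(seen) >= cap:
                    return None
                seen.add(key)
                queue.append(m2)
    return seen

# Toy net: a token cycles phone -> openpb -> phone; the state space is finite.
net = {"open":  ({"phone": 1}, {"openpb": 1}),
       "close": ({"openpb": 1}, {"phone": 1})}
print(len(reachable_markings({"phone": 1}, net)))   # 2 markings
```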
Fig. 4. Fairness and liveness in model of phone call initiation attack
Fig. 5. Fairness and liveness in model of phone book manipulation attack
Verifying fairness and liveness. Liveness is a property concerning the firing of transitions: if all the transitions of a Petri-Net are fired during the execution of the net, the net is live. Liveness of a Petri-Net thus shows the absence of deadlock. Figures 4, 5 and 6 show the log files generated using CPNTools. As these files show, there is no dead transition, so the models are live. And since all transitions have fired, the models are fair.
5 Mining Attacks
Having verified the models of the attacks, we mine them in order to analyze them. To do this, we need to know the number of events and processes, where an event indicates the firing of a transition of a model and a process indicates a request transaction consisting of some events. Each process is represented by a colored token (mark), so that n processes are distinguished from each other by n different colors. Using colors we can follow the behavior of a distinct process and decide whether it is an attack process or a fair one. By mining the behavior of each colored token, representing a distinct process, we are able to analyze the behavior of the processes. The behavior of a process is determined
Fig. 6. Fairness and liveness in model of SMS manipulation attack
by its sequence of events. For example, consider Figure 1 and assume that the "hack" place contains two colored tokens named α and β; this means there are two distinct processes in the place. Assume that both processes intend to make contact1, consisting of the path "connect" → "phone book" → "contact1" → "end1". Processes α and β can travel this path in two ways, sequentially or concurrently. In a sequential traversal, the events of the second process happen after those of the first one, while in a concurrent traversal the sequences of events of the two processes are interleaved with each other.

5.1 Mining the Phone Call Initiation Attack
To mine the model of the phone call initiation attack (Figure 1), we compute the number of events (transition firings) for each process (transaction). In fact, each event represents a step of a process, so a process consists of many events. Connecting to a phone, opening its phone book, and making a connection, for instance, are three events of a process that intends to make a contact. When the number of events of a process exceeds a threshold during a short time interval, it indicates that the process plays the role of an attack. By mining the model we want to find that threshold. Table 2 shows the mining of the phone call initiation attack (Figure 1, comprising 21 transitions), in which "#Runs" indicates the number of executions of the model by CPNTools. Initially, we placed 100 distinct processes (colored tokens) in the "hack"
place and then followed the path each token travels. We found that each process on average created 11 events and that each transition was fired on average 6 times per token. Each number in Table 2 indicates a percentage of events (transition firings). The numbers are compared with each other in pairs, where each pair appears in the form "Contactn, Endn". For example, the second column of Table 2 shows that, out of 1100 events (transition firings), 7.27% of the events were firings of "Contact1" and 6.55% were firings of "End1". A comparison of these two percentages shows that, of the 7.27% of tokens that went into the place "connect1" at time τ, 6.55% went out at time τ+1 and the rest went out at time τ+n (n>1). For each of the pairs we find such a small discrepancy. Therefore, we conclude that if the majority of the tokens entering the place "connectn" leave it at the next time step, the process is an attack, because a non-attack process is expected to stay in the place for a while. To obtain the threshold for the discrepancy between the number of "Contactn" events and "Endn" events, we calculate the average of the discrepancies. Now we analyze the model. (1) We show the thresholds for each of the pairs "Contactn", "Endn" (1 ≤ n ≤ 5) in Table 2; each threshold shows the minimum discrepancy in the case of an attack. This means: if, during run-time monitoring of the Bluetooth port, we observe that more than 4% of the tokens entering the place "connect1" at time τ stayed in the place at time τ+1, the process would not be an attack. (2) Figure 7, produced by the ProM tool, shows the analysis of the phone call initiation attack. On the left side of the figure, "cases: 100", "events: 1100", and "event classes: 21" indicate the number of (1) processes (tokens), (2) runs of the model shown in Figure 1, and (3) transitions of the model, respectively. On the right side of Figure 7, the upper chart shows the minimum, average, and maximum number of runs of the model for each transition. The number 11 therefore shows that each transition of the model has been fired on average by 11 processes. This means: if, during run-time monitoring of the Bluetooth port, 11 or more distinct tokens on average pass through a transition, it would indicate an attack.

Table 2. Mining the phone call initiation attack
#Runs →       1100    900     700     400     200     Threshold for discovery
Transition ↓
Contact1      7.27%   7.12%   6.83%   4.66%   3.96%   4%
End1          6.55%   6.5%    6.66%   4.33%   3.46%
Contact2      2.27%   3.12%   3.5%    2.67%   4.45%   1%
End2          2.27%   3.12%   3%      2.67%   4.45%
Contact3      1.27%   2.62%   2.383%  2.67%   3.46%   0.28%
End3          1.18%   2.12%   2.83%   2.67%   3.46%
Contact4      3.18%   2.25%   1.16%   2.67%   1.98%   0.11%
End4          3.18%   3%      1.16%   2.65%   1.7%
Contact5      7.45%   7.12%   8%      8%      6.93%   0.83%
End5          7%      6.75%   6.83%   7.33%   5.45%
(3) On the right side of Figure 7, the lower chart shows the minimum, average, and maximum number of firings of each transition of the model for each process (token). The number 6 therefore shows that each process has fired the same transition on average 6 times. This means: if, during run-time monitoring of the Bluetooth port, a token passes through the same transition more than 6 times, it would indicate an attack. Similarly, Table 3 and Table 4 show the mining of the phone book manipulation and SMS manipulation attacks, respectively. In a way similar to the determination of the thresholds for the phone call initiation attack, we can determine thresholds for the two other models. Also, similarly to Figure 7, Figure 8 shows the analysis of the SMS manipulation attack.
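A sketch of the two run-time checks suggested by these mining results is shown below: a case (token) is flagged if it generates more events than the mined average (11) or fires the same transition more often than the mined average (6). The thresholds and the log format used here are illustrative assumptions.

```python
from collections import Counter

def suspicious(case_events, max_events=11, max_same_transition=6):
    """case_events: list of transition names fired by one token, in order."""
    if len(case_events) > max_events:
        return True
    per_transition = Counter(case_events)
    return any(n > max_same_transition for n in per_transition.values())

trace = ["connect", "phone book", "contact1", "end1"]
print(suspicious(trace))                      # False: a short, normal case
print(suspicious(["contact1"] * 8))           # True: same transition fired 8 times
```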
Fig. 7. Analysis curve of phone call initiation attack
Table 3. Mining the phone book manipulation attack
#Runs →       1100    900    700     400     200
Transition ↓
Open one      9%      9%     8.2%    9.3%    10%
New           8.2%    8.5%   9%      6.3%    7%
Copy          8%      8%     8.6%    7.6%    5%

Table 4. Mining the SMS manipulation attack
#Runs →       1100    900    700     400     200
Transition ↓
Inbox         12.8%   12%    13.5%   10%     11%
New message   12.3%   11%    11.5%   13.3%   15%
400
200
Fig. 8. Analysis curve of SMS manipulation attack
6 Related Works There are three common approaches to face the Bluetooth attacks. In the first approach, implemented in the C language [7], attacks are analyzed and recognized by the dynamic data are collected at runtime. The disadvantage is low performance and high overloading. In the second approach, security policies against attacks are presented in an automaton [8] and [9], and then behavior of the client attacks are monitored by a runtime monitor. The third approach uses a modular policy language (named MPL) for explaining security policies. Then these policies will be changed into some aspect codes. At runtime these codes will monitor application behavior and realize and announce unsuitable conditions [10]. Some special attempts have been already introduced for intrusion detection techniques on Bluetooth-based devices. [11] developed a host-based algorithm for Bluetooth battery depletion attacks that detects attacks by monitoring power levels. However, this network-based method can be employed in only a limited number of
252
S.M. Babamir, R. Nowrouzi, and H. Naseri
Bluetooth devices. Another special attempts use statistical approach to identify anomalous behavior on the Bluetooth. In this approach, a training period determines behavior of users and establishes a baseline behavior and then the current behavior of smart phone is compared with it. [12] implements a network intrusion detection algorithm for discovering malicious Bluetooth traffic. It establishes a graphical user interface of the Bluetooth intrusion detection system for monitoring Bluetooth statistics to analyze attack behavior. Most notably, the Trifinite Group has implemented and released details of several Bluetooth attacks [13]. The above-mentioned approaches are algorithmic and based on coding; however, in this paper, we follow a model-based approach in which we address modeling and analyzing common attacks. This is a contribution to the security of smart phones in which we use software engineering’s basics and processes. It means that designing and analyzing process is considered in this view. The modeling causes: (1) decreasing errors at implementation level, (2) simple and comprehensive understanding of the approach, and (3) preparing the analysis of the approach. Using Petri-Nets for modeling sorts of Bluetooth attacks, we can use abilities, capabilities and advantages of the Nets: (1) to verify properties of the models and (2) to analyze models.
7 Conclusion In this paper, to mine common attacks on Bluetooth port of smart phone, we first organized the attacks and then dealt with mining and analyzing them by the Petri-Net tools. By mining models we succeed to obtain thresholds by which one is able to determine monitor behavior of processes that connected to the Bluetooth port at run-time. By analyzing models we showed how to can use of the thresholds to recover the common attacks. As a result, to prevent and confront these attacks we could find some ways. Using Petri-Nets for modeling attacks could prepare us for possibility of showing the process of attack activities. As we observed, in this paper we obtained important information and awareness in relation with control flow of our models using analyzing model. Also, we considered our models for the liveness and deadlock properties. The advantage of using and analyzing these different methods in this paper is that we can discover problem prior to implement them. As a future work, we started extending and improving current work in two ways: (1) analysis of models to discover attacker’s interests, i.e. the paths of the models can be traveled more than other ones and (2) use of timed models instead of un-timed ones by Timed Petri-Nets. By this way, we exploit time instead of statistics in mining models and can present more accurate mining. Moreover, we can deal with performance evaluation of our models for their efficiency of run-time monitoring.
References 1. Brenner, B.: Mobile Carriers Admit to Malware Attacks (2007), http://searchsecurity.techtarget.com/news/article/ 0,289142,sid14_gci1243513,00.html 2. Scarfone, K., Padgette, J.: Guide to Bluetooth Security, NIST Special Publication 800-121, National Institute of Standards and Technology (2008)
Mining Bluetooth Attacks in Smart Phones
253
3. Loo, A.: Security Threats of Smart Phones and Bluetooth. Communications of ACM 52(3), 150–152 (2009) 4. Herfurt, M.: Bluesnarfing @ CeBIT 2004 - Detecting and attacking Bluetooth-enabled cell phones at the Hannover Fairground, pp. 1–12 (2004), http://trifinite.org/Downloads/BlueSnarf_CeBIT2004.pdf 5. Jensen, K.: Coloured Petri Nets: Basic Concepts, Analysis Methods and Practical Use, 2nd edn. Springer, Heidelberg (2003) 6. http://wiki.daimi.au.dk/cpntools/cpntools.wiki 7. Moffie, M., Kaeli, D.: Application Security Monitor, Special Issue on the workshop on binary instrumentation and application. ACM SIGARCH Computer Architecture News 33(5), 21–26 (2005) 8. Vanoverberghe, D., Piessens, F.: Supporting security monitor-aware development. In: Proceedings of the 3rd International Workshop on Software Engineering for Security Systems (SESS’07), USA, pp. 1–7. IEEE Computer Society, Los Alamitos (2007) 9. Babamir, S.M., Jalili, S.: A Logical Based Approach to Detection of Intrusions against Programs. In: Proceedings of the 2nd International Conference on Global E-Security, London, pp. 72–79 (2006) 10. Tuohimaa, S., Leppanen, V.: A Compact Aspect-Based Security Monitor for J2ME applications. In: Proceedings of International Conference on Computer System and Technologies-CompSysTech’07, USA. ACM International Conference Proceeding Series, vol. 285, pp. VI.1.1–VI.1.6 (2007) 11. Buennemeyer, T., Nelson, T., Gora, M., Marchany, R., Trong, J.: Battery Polling and Trace Determination for Bluetooth Attack Detection in Mobile Devices. In: Proceedings of Information Assurance and Security Workshop, pp. 135–142. IEEE Press, Los Alamitos (2007) 12. OConnor, T., Reeves, T.: Bluetooth Network-Based Misuse Detection. In: Proceedings of Annual Computer Security Applications Conference, ACSAC 2008, USA, pp. 377–391. IEEE Press, Los Alamitos (2008), doi:10.1109/ACSAC.2008.39 13. http://trifinite.org
Users’ Acceptance of Secure Biometrics Authentication System: Reliability and Validate of an Extended UTAUT Model Fahad AL-Harby, Rami Qahwaji, and Mumtaz Kamala School of Computing, Informatics and Media, University of Bradford, Bradford, UK {fmalharb,R.S.R.Qahwaji,m.a.kamala}@bradford.ac.uk
Abstract. This paper presents current findings from cross-cultural studies investigating the adoption of a new secure technology, based on a fingerprint authentication system to be applied to e-commerce websites, within the perspective of Saudi culture. The aim of the study was to explore the factors affecting users' acceptance of biometric authentication systems. A large-scale laboratory experiment with 306 Saudis was performed using a fingerprint login system to observe whether Saudis are practically and culturally willing to accept this technology. The findings were then examined to measure reliability and validity, applying a proposed conceptual framework based on the Unified Theory of Acceptance and Use of Technology (UTAUT) with three moderating variables: age, gender and education level. The study model was adapted to meet the objectives of this study by adding other intrinsic factors such as self-efficacy and biometric system characteristics. Keywords: Adoption, Authentication, Biometrics, E-commerce, Fingerprint, Saudi Arabia, UTAUT.
1 Introduction
Biometric authentication is the automatic identification, or identity verification, of an individual using either a biological feature they possess (a physiological characteristic, like a fingerprint) or something they do (a behavioural characteristic, like a signature) [1]. Nowadays, there is growing interest in using biometric technology: the International Biometric Group (IBG) predicts the market for biometric technology to increase from $2.1 billion in 2006 to $5.7 billion in 2010 (see footnote 1), motivated by large-scale government programmes and dynamic private-sector initiatives. Nevertheless, general awareness of biometric technologies is low because of their limited applications [2]. The choice of fingerprint devices over other solutions would overcome some cultural obstacles, such as the prohibition of women's facial recognition in some Muslim countries such as Saudi Arabia. AL-Harby et al. [3] stated that the majority of Saudis would prefer to use fingerprint recognition methods.
1
http://www.biometricgroup.com/
Several benefits can be achieved from using biometric authentication in e-commerce systems, such as ease of use (no data input, such as an ID or password, is required from the user) and reduced data vulnerability. A further benefit is enhanced security and a decreased risk of viruses, since the browser is built in such a way that there is no need to type the Uniform Resource Locator (URL). As a result, risks such as phishing would be reduced owing to the lack of data input by the user [4]. E-banking is a potential application for biometric authentication technology [5]. Many financial institutions anticipate that the use of biometric technologies might lessen the amount of money spent on fraud problems [6]. Biometric technology can help to prevent illegal e-transactions and identity theft [7,8]. Currently, many financial institutions use biometric technology to secure their operations, such as the United Banker's Bank (UBB), the California Commerce Bank, the Dutch bank ING, Banco Azteca in Mexico, and Affinity Plus Federal Credit Union. Adoption and user acceptance of this technology is limited; nevertheless, it is increasing at a rapid rate [9]. Many organisations, such as financial institutions, government organizations, and retail marketing, are paying attention to the adoption of biometric technology. The rest of the paper is organised as follows: Section 2 covers the research model development, followed by the experimental and measurement development in Section 3, while the reliability and validation of the measurement scale is explained in Section 4. The discussion and conclusions are drawn in Section 5.
2 Research Model Development
The Unified Theory of Acceptance and Use of Technology (UTAUT) was introduced by Venkatesh et al. [10]. Several studies have been performed to investigate and validate the UTAUT in different backgrounds. UTAUT appears to be the theory best able to provide a sound instrument for assessing the likelihood of acceptance of any new technology [10]. The current study builds a research framework based on the UTAUT with two variables added, computer self-efficacy and biometric system characteristics; experience was replaced with education level and voluntariness was discarded to suit the requirements of the perspective being investigated. This is justified by the fact that this research focuses on users in underdeveloped cultures, where the concern of education level is of great significance in both cases [11]. The model was then used to investigate the factors affecting the intention to use biometric technology within e-commerce. For this work, the model was tailored to meet the objectives of the study by adding other intrinsic factors; the research model is shown in Figure 1.
3 Experimental and Measurement Development
To explore the influence of using a biometric system on e-commerce websites, we designed an experiment which incorporated the development of a fingerprint authentication system. The experiment also involved a survey instrument to evaluate the probability of adopting this technology. Each participant was asked to authenticate his or her identity with a fingerprint sensor and register with the biometric system. After completing the experiment, all participants were invited to complete a questionnaire to assess their intention to use this system. The target populations of
this research were clients, providers, and regulators of Saudi Arabian financial services. This population consisted of the users of e-commerce services and the managerial, administrative and technical personnel of e-commerce providers. As stated earlier, 306 participants took part in this experiment; 171 (55.9%) were male and 135 (44.1%) were female. Ages ranged from 18 to 55 years. More than half of the participants (178) were at the undergraduate level, with just 7.8% at the postgraduate level. The measurement items used in this investigation were adapted from prior validated measures [10-13] or were developed on the basis of the theoretical background and literature review. A five-point Likert scale ranging from (1) 'strongly agree' to (5) 'strongly disagree' was used to assess responses.
Fig. 1. The research model
4 Reliability and Validation of the Measurement Scale
The software used to perform the assessment of the study model was SPSS version 16. As suggested by Anderson et al. [14], a two-phase process was used to study the data in order to evaluate the reliability and validity of the measures. The first phase consists of the analysis of the measurement model, while the second phase is concerned with the assessment of the structural relationships among the latent constructs. The assessment of the measurement model includes the estimation of internal consistency reliability, in addition to the convergent and discriminant validity of the model
instruments, which indicates the strength of the measures used to assess the study model [15]. For reliability, all scales used exceeded the minimum recommended level of 0.70, as shown in Table 1, which indicates adequate internal consistency [16]. Additionally, the constructs exhibited satisfactory convergent and discriminant validity. Fornell and Larcker [17] recommend that convergent validity is satisfactory when constructs have an average variance extracted (AVE) of at least 0.5. To establish convergent validity, every item loading has to be above 0.5 [18].

Table 1. Psychometric properties of the constructs
Variable constructs                        Composite reliability   Cronbach's alpha   Average variance extracted
Performance expectancy (PE)                0.954                   0.887              0.512
Effort expectancy (EE)                     0.937                   0.758              0.543
Social influence (SI)                      0.892                   0.760              0.689
Behavioural intention to use (BI)          0.921                   0.885              0.596
Self-efficacy (SE)                         0.887                   0.844              0.614
Biometric system characteristics (BSC)     0.865                   0.741              0.633
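For readers less familiar with these psychometric quantities, the two standard formulas behind the reliability and convergent-validity columns can be computed as sketched below; the response matrix and the loadings shown are hypothetical illustrations, not the study's data.

```python
import numpy as np

def cronbach_alpha(items):
    """items: respondents x items matrix for one construct.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized factor loadings of a construct."""
    loadings = np.asarray(loadings, dtype=float)
    return float(np.mean(loadings ** 2))

# Hypothetical 5-point Likert responses for a three-item construct.
responses = [[1, 2, 1], [2, 2, 3], [4, 5, 4], [3, 3, 2], [5, 4, 5]]
print(cronbach_alpha(responses))                     # internal consistency
print(average_variance_extracted([0.7, 0.8, 0.75]))  # convergent validity check
```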
5 Discussion and Conclusion
Many financial institutions and government organizations have introduced new secure systems on the web, such as biometric authentication systems, to enhance efficiency, to reduce threats and costs, and to improve security. As more and more financial institutions and government organizations apply biometric technology, it is important for them to identify the factors that influence their clients' behavioural intention to use such systems. This paper identifies a number of main factors influencing clients' behavioural intention towards using a biometric system, introducing new factors such as biometric system characteristics and self-efficacy. Organizations, managers and developers can raise the intention to use a biometric system through self-efficacy [19]. Based on this study, the measurement scale for the extended UTAUT model shows that the reliability and validity can be accepted, as suggested by many scholars [14-18]. The amount of research on the adoption of new technologies among users is vast. A significant aim of many organisations is to evaluate the factors that increase the adoption and acceptance rate of new technology and to decide which technologies are worthy of implementation. Nevertheless, there is a lack of literature on biometric technology adoption and acceptance, mainly in Middle Eastern countries. From this point of view, this study attempted to extend the consideration of users' differences by applying the UTAUT, as a representative and powerful technology acceptance model, to Saudi users, and to understand the adoption process in relation to biometric technology. This paper is part of a larger research project. The next stage will be analysing the collected data using path regression and SEM analysis to give us an indication of how far the participants believed in the benefits and the potential of this technology.
References 1. Wayman, J., Alyea, L.: Picking the best biometric for your applications. National biometric test center collected works, pp. 269–275. National Biometric Test Center, San Jose (2000) 2. Furnell, S., Evangelatos, K.: Public awareness and perceptions of biometrics. In: Computer Fraud and Security 2007, vol. 1, pp. 8–13 (2007) 3. AL-Harby, F., Qahwaji, R., Kamala, M.: The feasibility of biometrics authentication in ecommerce: User acceptance. In: IADIS International Conference WWW/Internet, Freiburg, Germany (2008) 4. Al-Harby, F., Qahwaji, R., Kamala, M.: Towards an Understanding of the Intention to Use Biometrics Authentication Systems in E-Commerce: Using an Extension of the Technology Acceptance Model. Accepted for publication in the International Journal of EBusiness Research, IJEBR (2009) 5. Liu, S., Silverman, M.: A practical guide to biometric security technology. IT Professional Archive 3(1), 27–32 (2001) 6. James, T., et al.: Determining the itention to use biometric devices: An application and extension of the Technology Acceptance Model. Journal of Organizational and End-User Computing 18(3), 1–24 (2006) 7. Jain, A., Hong, L., Pankanti, S.: Biometric identification. Communications of the ACM 43(2), 91–98 (2000) 8. Herman, A.: Major bank signs up for digital signature verification technology. In: Biometric Technology Today (2002) 9. Uzoka, F.-M.E., Ndzinge, T.: Empirical analysis of biometric technology adoption and acceptance in Botswana. Journal of Systems and Software 82(9), 1550–1564 (2009) 10. Venkatesh, V., et al.: User acceptance of information technology: Toward a unified view. MIS Quarterly 27(3), 425–478 (2003) 11. Oshlyansky, L., Cairns, P., Thimbleby, H.: Validating the Unified Theory of Acceptance and Use of Technology (UTAUT) tool cross-culturally. In: HCI 2007 The 21st British HCI Group Annual Conference, University of Lancaster, UK (2007) 12. AlAwadhi, S., Morris, A.: The use of the UTAUT model in the adoption of e-government services in Kuwait. In: 41st Hawaii International Conference on System Sciences, Hawaii (2008) 13. Al-Gahtani, S., Hubona, G., Wang, J.: Information technology (IT) in Saudi Arabia: Culture and the acceptance and use of IT. Information & Management 44, 681–691 (2007) 14. Anderson, J., Gerbing, D.: Structural equation modeling in practice: areview and recommended two-step approach. Psychological Bulletin 103, 423–441 (1988) 15. Fronell, C.: A second generation of multivariate analysis: classification of methods and implications for marketing research. In: Houston, M.J.E. (ed.) Review of Marketing. American Marketing Association, Chicago (1987) 16. Nunnally, J.: Psychometric theory, 2nd edn. McGraw-Hill, New York (1978) 17. Fornell, C., Larcker, D.: Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research 18, 39–50 (1981) 18. Hair, J., et al.: Multi-variate Data Analysis with Readings. Macmillan, New York (1992) 19. Al-Harby, F., Qahwaji, R., Kamala, M.: The Effects of Gender Differences in the Acceptance of Biometrics Authentication Systems within Online Transaction. In: IEEE 2009 International Conference on CyberWorlds, pp. 203–210. IEEE Computer Society, Bradford (2009)
Two Dimensional Labelled Security Model with Partially Trusted Subjects and Its Enforcement Using SELinux DTE Mechanism Jaroslav Jan´ aˇcek Department of Computer Science, Faculty of Mathematics, Physics and Informatics, Comenius University, Mlynsk´ a dolina, 842 48 Bratislava, Slovakia [email protected]
Abstract. Personal computers are often used in small office and home environments for a wide range of purposes – from general web browsing and e-mail processing to processing data that are sensitive regarding their confidentiality and/or integrity. The discretionary access control mechanism implemented in common general purpose operating systems is insufficient to protect the confidentiality and/or the integrity of data against malicious or misbehaving applications running on behalf of a user authorized to access the data. We present a security model, based on the Bell-La Padula and Biba models, that provides for both confidentiality and integrity protection, and that uses a notion of partially trusted subjects to limit the level of trust to be given to the processes that need to pass information in the normally forbidden direction. We discuss a way to enforce the model's policy using the SELinux mechanism present in current Linux kernels. Keywords: Security model, partially trusted subjects, SELinux.
1 Introduction
Personal computers are used for a wide range of purposes – from general web browsing and e-mail processing to processing data that are sensitive regarding their confidentiality and/or integrity and authenticity. The sensitive data are often processed using general applications, such as web browsers, word processors, spreadsheet calculators, etc., in addition to dedicated special applications. Applications often contain design and/or programming errors that can be abused to perform malicious activities (such as executing an arbitrary piece of code) when such an application processes a specially crafted piece of input. Therefore, applications that process data originating from untrusted sources cannot be trusted not to perform some malicious activities. The range of activities such an application can perform depends on what the underlying operating system allows the application to do. While larger organizations can dedicate some computers to the sensitive data processing and prevent them from communicating with untrusted external systems, this can hardly be expected in the small office or home environment. In these
environments, a single computer with a single operating system is usually used for the whole range of purposes, which makes the protection provided by the operating system even more important. Commonly used desktop operating systems, such as Microsoft Windows 2000/ XP/Vista or various distributions of Linux, implement discretionary access control mechanism that is unable to prevent malicious or misbehaving application from abusing access to the data of the user it runs on behalf of. Every filesystem object (such as a file or a directory) is owned by a user or a group of users – its owner, every process (subject ) runs on behalf of a user, and the owner of an object can specify a discretionary access control list for the object. Each entry in the access control list specifies the permitted operations for processes running on behalf of an identified user (or a group of users). If a user can perform an operation on a file containing a sensitive piece of information, any process running on behalf of the user can perform the operation on the file. If, for example, a user runs an e-mail reading application that can be tricked to execute a piece of code included in a specially crafted message, an attacker can gain access to all files that the user has access to – including files containing sensitive information. The insufficiency of the discretionary access control in the common operating systems has motivated the work on integrating other controls. In the Linux world, the most significant projects are Linux Security Modules (LSM)[6] and SELinux[7]. LSM provides a framework in the Linux kernel to integrate a security module with the kernel. The kernel then calls the security module’s functions whenever it needs to make an access control decision. SELinux is a security module (using LSM) implementing Domain and Type Enforcement (DTE) [5,8] for Linux. DTE associates a domain with each subject and a type with each object, and a policy specifies which operations can be performed by a subject in the domain D on an object of type T . The policy also specifies how a subject may transition between domains and what type will be associated with a new object based on the domain the the creating process and the type of another associated object (such as the directory where a file is created). SELinux (DTE in general) can be used to enforce a wide range of security policies but the rules may become rather complex. While it is a great tool for securing systems with well-defined security requirements (such as servers), it is difficult to prepare a general policy for desktops where the users’ requirements differ significantly [10]. In the Windows world, Windows Vista comes with two significant security enhancements – User Account Control (UAC)[11] and Mandatory Integrity Control (MIC)[12]. UAC deals with the fact that, while it is a bad security practice, many (or most?) users use accounts with the administrator’s privileges. Any malicious code executed on behalf of such user has the full privileges on the system. When UAC is enabled, Windows Vista prompts the user using a pop-up window when a program actually needs to use the administrator’s privileges. MIC allows subjects and objects to be assigned an integrity level and prevents subjects from modifying objects with a higher integrity level than that of the
subject. In fact, it implements one of the rules of the Biba model [4]. It may also be configured to prevent subjects from reading from objects with a higher level, i.e. it optionally implements one of the rules of the Bell-LaPadula model [3]. We have identified the typical classes of applications used on personal computers in the small office and home environment and the security requirements of the data they process [1]. We have designed a two-dimensional data classification scheme with independent confidentiality and integrity levels, and we have presented examples of the typical data classification. We have presented a security model for the confidentiality and integrity protection of data classified according to our scheme, and we have shown how it can be used to protect the data used by the typical classes of applications [2].
2 Security Model
In this section, we present our security model in short. For more details, see [1,2]. The model contains two types of entities – subjects and objects. Objects are passive entities of the model – they represent information sources and destinations. Some typical examples of objects in an operating system are files, directories, and communication objects (such as pipes, sockets, ...). Subjects are active entities of the model – they perform operations on objects. Typical subjects of an operating system are processes. For the purpose of this paper, we will consider only two operations that a subject may perform on an object: read and write. The read operation allows the subject to receive information from the object, and the write operation allows the subject to send information to the object. Each object O and each subject S has several security attributes associated with it, and the model's information flow policy specifies whether a subject S is allowed to perform a given operation on an object O based on the security attributes of the subject and the security attributes of the object. Our data classification scheme assumes that each object is assigned a confidentiality level and an integrity level. Each object is also assigned an identifier of a user that is the object's owner. Our model assumes there are at least three confidentiality and three integrity levels. Because the complexity of enforcement of our model's policy using SELinux depends on the number of levels, we will restrict them to 3 in this paper. As far as confidentiality is concerned, we classify the data into three basic categories: public data (level 0), normal data – C-normal (level 1), and data that are sensitive regarding their confidentiality – C-sensitive (level 2). The C-sensitive data are the data that their owner (a user) wishes to remain unreadable to others regardless of the software the user uses, and even if the user makes some mistakes (such as setting wrong access rights for discretionary access control). As far as the integrity (or trustworthiness) of data is concerned, we also classify the data into three basic categories: potentially malicious data (level 0), normal data – I-normal (level 1), and data that are sensitive regarding their integrity – I-sensitive (level 2).
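As an illustration of the two-dimensional classification, the following Python fragment sketches the object attributes; the constant names and the example objects are our own assumptions, not part of the model's definition.

# Illustrative sketch only: level constants and example classifications are ours.
from dataclasses import dataclass

PUBLIC, C_NORMAL, C_SENSITIVE = 0, 1, 2      # confidentiality levels C = {0, 1, 2}
MALICIOUS, I_NORMAL, I_SENSITIVE = 0, 1, 2   # integrity levels I = {0, 1, 2}

@dataclass
class Obj:
    C: int    # confidentiality level C_O
    I: int    # integrity level I_O
    L: int    # label L_O (0 = default)
    U: str    # owner U_O

# hypothetical examples of the classification described above
downloaded_file = Obj(C=PUBLIC, I=MALICIOUS, L=0, U="alice")
private_key = Obj(C=C_SENSITIVE, I=I_SENSITIVE, L=0, U="alice")
shared_library = Obj(C=PUBLIC, I=I_SENSITIVE, L=0, U="root")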
The requirements of the integrity protection of data are tightly coupled to the trustworthiness of the data. The trustworthiness of data can be thought of as a metric of how reliable the data are. If some data can be modified by anyone, they cannot be trusted not to contain wrong or malicious information. If some data are to be relied on, their integrity has to be protected. The potentially malicious data require no integrity protection, and can neither be trusted to contain valid information, nor be trusted not to contain malicious content. The I-sensitive data are the data that their owner wishes to remain unmodified by others regardless of the software the user uses, and even if the user makes some mistakes. The I-sensitive data are to be modifiable only under special conditions upon their owner's request. A special category of I-sensitive data is the category of the shared system files such as the programs, the libraries, various system-wide configuration files, the user database, etc. Some of these files may be modifiable by the designated system administrator; some of them should be even more restricted. The basic idea of our model is to prevent unintended information flow from an object with a higher confidentiality/lower integrity level to an object with a lower confidentiality/higher integrity level. Classical multi-level security models, such as Bell-La Padula [3] or Biba [4], distinguish between untrusted and trusted subjects. Trusted subjects are allowed to violate the basic idea stated above. It turns out that in a typical small office or home desktop operating system too many subjects would have to be considered trusted in order to achieve an acceptable behaviour [2]. To overcome this problem, we divide subjects into three categories:
– untrusted subjects,
– partially trusted subjects, and
– trusted subjects.
A trusted subject is a subject that is trusted to enforce the information flow policy with intended exceptions by itself. An untrusted subject is a subject that is not trusted to enforce the information flow policy. It is assumed to perform any operations on any objects unless it is prevented from doing so by the operating system. A partially trusted subject is
– trusted not to transfer information from a defined set of objects (designated inputs) at a higher confidentiality level to a defined set of objects (designated outputs) at a lower confidentiality level in a way other than the intended one, and
– trusted not to transfer information from a defined set of objects (designated inputs) at a lower integrity level to a defined set of objects (designated outputs) at a higher integrity level in a way other than the intended one, but
– not trusted not to transfer information between any other objects.
The sets of designated inputs and outputs regarding confidentiality are distinct from the sets regarding integrity. Any of the sets may be empty. A partially trusted subject, like a trusted one, can be used to implement an exception to the basic policy, because it can violate the policy (and it is trusted to do it only in an intended way). The most important difference between trusted and partially trusted subjects is in the level of trust. While trusted subjects are completely trusted to behave correctly, partially trusted subjects are only trusted not to abuse the possibility of the information flow violating the policy between a defined set of input objects and a defined set of output objects.
2.1 Information Flow Policy
Let C = {0, 1, 2} be the set of confidentiality levels, I = {0, 1, 2} be the set of integrity levels, and L be a finite set of possible labels for objects, 0 ∈ L being the default label used for objects without an explicitly assigned label. The labels will be used to define the sets of designated inputs and outputs for partially trusted subjects. Let U be a finite set of user identifiers. Let TU be the set of trusted system users who are trusted to maintain correct confidentiality and integrity levels on the objects they own. Let C and I be ordered so that 0 is the least sensitive level and 2 is the most sensitive level.
Let each object O have the following attributes: C_O ∈ C – the confidentiality level of the object, I_O ∈ I – the integrity level of the object, L_O ∈ L – the label assigned to the object, and U_O ∈ U – the user identifier of the object's owner.
Let each subject S have the following attributes: CR_S, CW_S ∈ C – the highest/lowest confidentiality level the subject can normally read from/write to; CRL_S, CWL_S ∈ C – the highest/lowest confidentiality level of a specially labelled object that the subject can read from/write to; CRLS_S, CWLS_S ⊆ L – the sets of labels of the objects that belong to the designated input/output set for a partially trusted subject regarding confidentiality; IR_S, IW_S ∈ I – the lowest/highest integrity level the subject can normally read from/write to; IRL_S, IWL_S ∈ I – the lowest/highest integrity level of a specially labelled object that the subject can read from/write to; IRLS_S, IWLS_S ⊆ L – the sets of labels of the objects that belong to the designated input/output set for a partially trusted subject regarding integrity; and U_S ∈ U – the user identifier of the user the subject is acting on behalf of.
A subject S can read from an object O if read(S, O) is true, where
read(S, O) ⇐⇒ [CR_S ≥ C_O ∨ (CRL_S ≥ C_O ∧ L_O ∈ CRLS_S)]
             ∧ [IR_S ≤ I_O ∨ (IRL_S ≤ I_O ∧ L_O ∈ IRLS_S)]
             ∧ [U_S = U_O ∨ C_O ≤ 1]
             ∧ [U_S = U_O ∨ I_O ≤ 1 ∨ U_O ∈ TU]
A subject S may write to an object O if write(S, O) is true, where
write(S, O) ⇐⇒ [CW_S ≤ C_O ∨ (CWL_S ≤ C_O ∧ L_O ∈ CWLS_S)]
              ∧ [IW_S ≥ I_O ∨ (IWL_S ≥ I_O ∧ L_O ∈ IWLS_S)]
              ∧ [U_S = U_O ∨ I_O ≤ 1]
              ∧ [U_S = U_O ∨ C_O ≤ 1 ∨ U_O ∈ TU]
Each untrusted subject S must satisfy the following conditions:
CW_S = CWL_S ≥ CR_S = CRL_S
IW_S = IWL_S ≤ IR_S = IRL_S
CWLS_S = CRLS_S = IWLS_S = IRLS_S = ∅
Each partially trusted subject S must satisfy the following conditions:
CW_S ≥ CR_S      CW_S ≥ CRL_S      CWL_S ≥ CR_S
IW_S ≤ IR_S      IW_S ≤ IRL_S      IWL_S ≤ IR_S
Trusted subjects are given upper boundaries on the confidentiality level for reading and on the integrity level for writing, and lower boundaries on the confidentiality level for writing and on the integrity level for reading. Partially trusted subjects are allowed to pass information from their designated inputs to their designated outputs, but are very limited otherwise.
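For illustration, the two decision functions can be written down directly. The following Python sketch assumes the attribute names introduced above; it is only an illustration under our own naming assumptions, not an implementation belonging to the model.

# Minimal sketch of the read/write decision functions defined above.
from dataclasses import dataclass, field

@dataclass
class Obj:
    C: int          # confidentiality level C_O
    I: int          # integrity level I_O
    L: int          # label L_O
    U: str          # owner U_O

@dataclass
class Subj:
    U: str          # user U_S the subject acts on behalf of
    CR: int; CW: int; CRL: int; CWL: int
    IR: int; IW: int; IRL: int; IWL: int
    CRLS: set = field(default_factory=set)
    CWLS: set = field(default_factory=set)
    IRLS: set = field(default_factory=set)
    IWLS: set = field(default_factory=set)

TRUSTED_USERS = {"root"}   # TU, assumed set of trusted system users

def read(s: Subj, o: Obj) -> bool:
    return ((s.CR >= o.C or (s.CRL >= o.C and o.L in s.CRLS))
            and (s.IR <= o.I or (s.IRL <= o.I and o.L in s.IRLS))
            and (s.U == o.U or o.C <= 1)
            and (s.U == o.U or o.I <= 1 or o.U in TRUSTED_USERS))

def write(s: Subj, o: Obj) -> bool:
    return ((s.CW <= o.C or (s.CWL <= o.C and o.L in s.CWLS))
            and (s.IW >= o.I or (s.IWL >= o.I and o.L in s.IWLS))
            and (s.U == o.U or o.I <= 1)
            and (s.U == o.U or o.C <= 1 or o.U in TRUSTED_USERS))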
2.2 Security Properties of the Information Flow Policy
We will state some security properties of the information flow policy defined above in a formal way. The theorems can be proven, although the proofs are out of the scope of this paper (due to their length). They are based on deriving contradictions if the theorems did not hold. First, we will formally define the flow of information within the system.
Definition 1. Let S be a set of subjects and O be a set of objects. Let O_0, O_out ∈ O be any two objects. We say that a policy allows an information flow from O_0 to O_out within the system (S, O) and denote it as flow(S, O, O_0, O_out) if there exists a finite sequence of pairs (S_1, O_1), (S_2, O_2), ..., (S_n, O_n), where ∀i ∈ {1, ..., n} : O_i ∈ O ∧ S_i ∈ S, such that ∀i ∈ {1, ..., n} : read(S_i, O_{i-1}) ∧ write(S_i, O_i) and O_n = O_out, where read(S, O) and write(S, O) are the functions of the policy determining whether the subject S can read from, or write to, the object O.
Using the formal definition of flow, we can precisely define what we mean by an information leak – a violation of the confidentiality protection requirements.
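The flow relation can be viewed as reachability over objects. The following Python sketch illustrates this view; it assumes decision functions with the semantics of read(S, O) and write(S, O) above and is only an illustration, not part of the model.

# Sketch of flow(S, O, O0, Oout) as reachability over the set of objects.
# read(s, o) and write(s, o) are assumed decision functions as sketched earlier.
from collections import deque

def flow(subjects, objects, o0, o_out, read, write) -> bool:
    """True if information may flow from o0 to o_out via some chain of subjects."""
    reachable, queue = {id(o0)}, deque([o0])
    while queue:
        src = queue.popleft()
        for s in subjects:
            if not read(s, src):
                continue
            for dst in objects:
                if id(dst) not in reachable and write(s, dst):
                    reachable.add(id(dst))
                    queue.append(dst)
    return id(o_out) in reachable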
Definition 2. Let S be a set of subjects and O be a set of objects. We say that a policy allows an information leak within the system (S, O) and denote it as leak(S, O) if ∃ O_a, O_b ∈ O : C_{O_a} > C_{O_b} ∧ flow(S, O, O_a, O_b).
We can also define the precise meaning of a violation of the integrity protection requirements – information spoiling.
Definition 3. Let S be a set of subjects and O be a set of objects. We say that a policy allows information spoiling within the system (S, O) and denote it as spoil(S, O) if ∃ O_a, O_b ∈ O : I_{O_a} < I_{O_b} ∧ flow(S, O, O_a, O_b).
Having the definitions, we can state the basic security properties using the following theorems. The first two theorems deal with the case when there are only untrusted subjects. In such a case, the information flow policy guarantees that no information from a more confidential object can end up in a less confidential object, and that no information from an object with a lower integrity level can influence any object with a higher integrity level.
Theorem 1. If S is a set of untrusted subjects and O is a set of objects, our policy does not allow any information leak within the system (S, O).
Theorem 2. If S is a set of untrusted subjects and O is a set of objects, our policy does not allow any information spoiling within the system (S, O).
Another two theorems deal with the case when there may be some partially trusted subjects as well. The first of these theorems says that in order to pass information from an object with a higher confidentiality level to an object with a lower confidentiality level using only untrusted and partially trusted subjects, each subject that passes information from an object with a higher confidentiality level to an object with a lower confidentiality level must be a partially trusted subject and it must be passing the information from its specially labelled input to its specially labelled output. Assuming that no partially trusted subject passes information from its special inputs to its special outputs in an unintended way, any information leak allowed by the policy within a system without trusted subjects is intended.
Theorem 3. Let S be a set of untrusted and/or partially trusted subjects and let O be a set of objects. Let O_0, O_out ∈ O be two objects such that C_{O_0} > C_{O_out} and flow(S, O, O_0, O_out). For every finite sequence of pairs (S_1, O_1), ..., (S_n, O_n) such that ∀i ∈ {1, ..., n} : S_i ∈ S ∧ O_i ∈ O ∧ read(S_i, O_{i-1}) ∧ write(S_i, O_i) ∧ O_n = O_out, for each pair (S_j, O_j) such that C_{O_{j-1}} > C_{O_j}: S_j is a partially trusted subject and L_{O_{j-1}} ∈ CRLS_{S_j} and C_{O_{j-1}} ≤ CRL_{S_j} and L_{O_j} ∈ CWLS_{S_j} and C_{O_j} ≥ CWL_{S_j}.
The last theorem says that in order to pass information from an object with a lower integrity level to an object with a higher integrity level using only untrusted and partially trusted subjects, each subject that passes information from an object with a lower integrity level to an object with a higher integrity level must be a partially trusted subject and it must be passing the information from its specially labelled input to its specially labelled output. Assuming that no partially trusted subject passes information from its special inputs to its special outputs in an unintended way, any information spoiling allowed by the policy within a system without trusted subjects is intended.
Theorem 4. Let S be a set of untrusted and/or partially trusted subjects and let O be a set of objects. Let O_0, O_out ∈ O be two objects such that I_{O_0} < I_{O_out} and flow(S, O, O_0, O_out). For every finite sequence of pairs (S_1, O_1), ..., (S_n, O_n) such that ∀i ∈ {1, ..., n} : S_i ∈ S ∧ O_i ∈ O ∧ read(S_i, O_{i-1}) ∧ write(S_i, O_i) ∧ O_n = O_out, for each pair (S_j, O_j) such that I_{O_{j-1}} < I_{O_j}: S_j is a partially trusted subject and L_{O_{j-1}} ∈ IRLS_{S_j} and I_{O_{j-1}} ≥ IRL_{S_j} and L_{O_j} ∈ IWLS_{S_j} and I_{O_j} ≤ IWL_{S_j}.
2.3 Comparison to Other Projects
In this section, we will compare our security model to the mandatory integrity control (MIC) implemented in Windows Vista, and to SELinux.
Our Model vs. MIC in Windows Vista. We have already mentioned MIC in the introduction. It can be used to prevent a potentially malicious application running at a low integrity level from modifying data at a higher integrity level. Unlike our model, however, it does not prevent an untrusted application running at a higher integrity level from reading (potentially malicious) data at a lower level. If the application contains a flaw that can be exploited by processing malicious data, the user can, e.g. by accident, use it to read and process malicious data stored at the low integrity level, and thus turn the application into a malicious one that can spoil data at its (higher) integrity level. MIC also allows confidentiality protection to be turned on. Unlike in our model, the confidentiality and integrity levels in MIC are not independent; the same level is used for both. It can be used to prevent a potentially malicious application running at a low level from reading (and also from modifying) data at a higher level. Unlike our model, it does not prevent an application running at a higher level (and thus capable of reading data at that level) from writing to objects at a lower level. Any application is thus able to export any data that it can read to external untrusted systems, or to store them to a low-level file.
MIC is a useful feature that allows a careful user to use a web browser or to test a potentially malicious application without the risk that they will modify (and optionally also read) any data classified at a higher level. The user must be careful enough, however, not to open any files classified at the low level in another application running at a higher level unless the application is trusted not to misbehave upon reading the data. The integrity and confidentiality protection provided by our security model is definitely stronger than that of MIC, and it has provable security properties similar to those of the Bell-LaPadula and Biba models.
Our Model vs. SELinux. SELinux, by implementing a flavour of domain and type enforcement, provides a very flexible security mechanism that can be used to enforce a wide range of security policies. We will show in the next section that it can be used to implement our model as well. The commonly used SELinux policies [9,10] define unique types and domains for many typical system services, server applications, and their data, and strictly restrict the set of operations the processes running within these domains are allowed to perform. The strict version of the policy is suitable for servers but causes problems on desktop systems because the restrictions are too strict to be accepted by users. The targeted version of the policy, which restricts many system services but leaves the applications started by the user in a single, unconfined domain, is suitable for desktops. It prevents a flawed system service program from accessing data it does not have to be able to access. It does not, however, prevent user-started applications from accessing the user's data, perhaps except for some specially designated sensitive data (such as private keys) accessible only to certain applications. The targeted policy could be combined with our model to provide the combined benefits of both. The targeted policy provides more rigorous restrictions for specific services while our model can be used to protect the ordinary users' data.
3 Using SELinux to Enforce Our Policy
Current Linux kernels include the SELinux security module that implements a very flexible security mechanism – domain and type enforcement [5,8,7]. We will show in this section how this mechanism can be used to enforce our security model's information flow policy. We will not deal with all details, such as all the fine-grained permissions and object classes SELinux supports, but we will focus on the basic ideas and constructions that are sufficient to understand how to build a fully functional SELinux configuration.
3.1 Some SELinux Basics
SELinux does not distinguish formally between domains and types. Every SELinux controlled object and every subject (process) is assigned a security label consisting of three parts – a user, a role, and a type. SELinux user identities are not to
be confused with the standard Linux user identities – they are independent attributes. They may be mapped in a 1:1 manner, but they do not have to be. SELinux roles are assigned to subjects, and the SELinux policy specifies the subject types (domains) that a role is authorized for, i.e. the set of types a subject with a given role may be assigned. The policy also specifies the set of roles a SELinux user is authorized to assume. We will not use specific roles for our model's policy, so we will assume that every user is authorized for a role that is authorized for all domains unless another security policy is in force, in which case the set of domains and roles is determined by the other policy. Types are declared in the SELinux policy configuration. A type may be labelled by several attributes. An attribute name may be used in the policy configuration to represent the set of types that are labelled with the attribute. Access control rules are specified using the following syntax:
allow source_types target_types : classes permissions;
For access to objects, source_types is a set of domains, target_types is a set of object types, and permissions is the set of operations that a subject of a type from the set of domains may perform on an object of a type from the set of object types. The classes is a set of classes of objects that the rule applies to. Classes are used in SELinux to distinguish between different sorts of objects, such as file, directory, socket, process, etc. If more than one rule matches a subject and an object, the resulting set of permissions is the union of the sets specified in the matching rules. The SELinux policy also specifies type transition rules that specify the type of a new object or subject based on the current type (domain) of the creating subject and the type of a related object (such as a directory when creating a new file, or an executable file when creating a new process). We will not deal with these rules in this paper, but they would be used in a full policy configuration. Another important concept of SELinux policy configuration is the concept of constraints. A constraint is specified for a set of classes and a set of permissions, and it consists of a boolean expression. If the expression evaluates to false, no operation from the set may be performed on an object of the specified class. The expression may compare SELinux user identities, roles and types of the subject and the object with each other, or with a set of values. The set of types may also be specified using the type attributes as stated above. For simplicity, we will only deal with access control rules and constraints in this paper. We will use only two sets of permissions – read_perms representing the permissions needed for the read operation of our model, and write_perms representing the permissions needed for the write operation of our model. We will deal only with one set of object classes – object_classes representing all relevant object classes.
3.2 Configuring SELinux for Our Policy
First we will assume that we can use SELinux exclusively to enforce the policy of our model. Later we will discuss how our policy can be combined with another SELinux policy.
Our model uses four security attributes for objects – the confidentiality and integrity levels, the owner user identifier and the label used to describe the sets of designated inputs and outputs for partially trusted subjects. We will use the SELinux user field in object labels to represent object owners. We will use the object type to indicate the other three values. First, we will define a type attribute for each confidentiality level (C0, C1, C2), a type attribute for each integrity level (I0, I1, I2), and a type attribute for each used label (Lx for label x). Then we need to define types for objects. We will define a type for every possible pair of confidentiality and integrity levels, and tag it with the two type attributes as follows: for every c, i ∈ {0, 1, 2} add a type tagged with the attributes Cc and Ii. We will then add new object types for every needed combination of the confidentiality and integrity levels and labels and tag them with the three appropriate type attributes – Cc, Ii, Ll, where c and i are the confidentiality and integrity levels of the object and l is its label. We will then need types (domains) for subjects. We will first need several type attributes that will be used to identify the types representing subjects with a specific value of our model's subject attributes: CRx, CWx, CRLx, CWLx, IRx, IWx, IRLx, IWLx for x ∈ {0, 1, 2}, and CRLSx, CWLSx, IRLSx, IWLSx for x ∈ L, where L is the set of needed labels. We can then define the types for subjects. The full set of needed types for untrusted and trusted subjects contains 3^4 = 81 types. The type for an untrusted or trusted subject S is to be tagged with the type attributes CR⟨CR_S⟩, CW⟨CW_S⟩, IR⟨IR_S⟩, IW⟨IW_S⟩, where ⟨x⟩ denotes the value of x. In practice, it may be sufficient to predefine the 9 basic types for untrusted subjects (CR_S = CW_S ∧ IR_S = IW_S) and add the types for other untrusted and for all trusted subjects on an as-needed basis. As far as partially trusted subjects are concerned, we add a new type with a unique name for every used combination of the partially trusted subject attributes, and tag it with the following type attributes: CR⟨CR_S⟩, CW⟨CW_S⟩, CRL⟨CRL_S⟩, CWL⟨CWL_S⟩, IR⟨IR_S⟩, IW⟨IW_S⟩, IRL⟨IRL_S⟩, IWL⟨IWL_S⟩, where ⟨x⟩ denotes the value of x, and CRLSy for y ∈ CRLS_S, CWLSy for y ∈ CWLS_S, IRLSy for y ∈ IRLS_S, and IWLSy for y ∈ IWLS_S. We have chosen the type attributes so that it is easy to determine whether a particular subject or object has a given value of our model's attributes. We can use this to create the SELinux constraints corresponding to the rules of our model's information flow policy (see Fig. 1). The terms T1, T2, U1, U2 represent the type of the subject (T1) and the object (T2) and the SELinux user identity of the subject (U1) and the object (U2) in the constraint definition grammar, and TU is to be replaced by the set of trusted users. We have expressed the decision making functions of our model's information flow policy exclusively using the constraints. If there is no other policy to enforce, we can add a general "allow all" access control rule, such as allow domain object : object_classes { read_perms write_perms };, where domain is a type attribute to be assigned to every type for subjects, and object is a type attribute to be assigned to every type for objects.
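As an illustration of the configuration described above, the following Python sketch generates the object-type declarations for all pairs of levels (and, optionally, labels). The type naming convention used here is our own assumption, not part of the proposed policy.

# Illustrative generator for the object types described above: one type per
# (confidentiality, integrity) pair, tagged with the attributes Cc and Ii.
# The emitted statements mirror basic SELinux policy syntax; the naming
# convention obj_c<c>_i<i>_t is our own.
def object_type_declarations(levels=(0, 1, 2), labels=()):
    lines = [f"attribute C{c};" for c in levels] + [f"attribute I{i};" for i in levels]
    lines += [f"attribute L{l};" for l in labels]
    for c in levels:
        for i in levels:
            lines.append(f"type obj_c{c}_i{i}_t, object, C{c}, I{i};")
            for l in labels:
                lines.append(f"type obj_c{c}_i{i}_l{l}_t, object, C{c}, I{i}, L{l};")
    return "\n".join(lines)

if __name__ == "__main__":
    print(object_type_declarations(labels=(1,)))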
constrain object_classes read_perms (
  (T2 == C2 and T1 == CR2) or
  (T2 == C1 and T1 == {CR1 CR2}) or
  (T2 == C0) or
  (T2 == Lx and T1 == CRLSx and (
     (T2 == C2 and T1 == CRL2) or
     (T2 == C1 and T1 == {CRL1 CRL2})))
  [ repeat the last subexpression for every label x ]
);
constrain object_classes read_perms (
  (T2 == I0 and T1 == IR0) or
  (T2 == I1 and T1 == {IR1 IR0}) or
  (T2 == I2) or
  (T2 == Lx and T1 == IRLSx and (
     (T2 == I0 and T1 == IRL0) or
     (T2 == I1 and T1 == {IRL1 IRL0})))
  [ repeat the last subexpression for every label x ]
);
constrain object_classes read_perms (
  (U1 == U2 or T2 == {C0 C1}) and
  (U1 == U2 or T2 == {I0 I1} or U2 == TU)
);
constrain object_classes write_perms (
  (T2 == C0 and T1 == CW0) or
  (T2 == C1 and T1 == {CW1 CW0}) or
  (T2 == C2) or
  (T2 == Lx and T1 == CWLSx and (
     (T2 == C0 and T1 == CWL0) or
     (T2 == C1 and T1 == {CWL1 CWL0})))
  [ repeat the last subexpression for every label x ]
);
constrain object_classes write_perms (
  (T2 == I2 and T1 == IW2) or
  (T2 == I1 and T1 == {IW1 IW2}) or
  (T2 == I0) or
  (T2 == Lx and T1 == IWLSx and (
     (T2 == I2 and T1 == IWL2) or
     (T2 == I1 and T1 == {IWL1 IWL2})))
  [ repeat the last subexpression for every label x ]
);
constrain object_classes write_perms (
  (U1 == U2 or T2 == {I0 I1}) and
  (U1 == U2 or T2 == {C0 C1} or U2 == TU)
);
Fig. 1. SELinux constraints enforcing our policy
If there is another security policy described in terms of a SELinux configuration, we can combine our model’s policy with it by adding the type attributes, adding the constraints and by tagging the existing subject and object types with our attributes as appropriate. If more than one combination of our model’s security attributes is needed for a given existing type, it may be necessary to duplicate the type and tag its individual copies with our appropriate type attributes. If the duplicated type is used directly in the rules of the existing policy, the rules have to be modified to include all of the copies of the original type. If the rules are defined exclusively using type attributes, no changes may be necessary.
4 Conclusions
We have presented our security model for independent multi-level confidentiality and integrity protection with partially trusted subjects, and we have shown a way to implement it in current Linux operating systems without the need to modify any part of the kernel. In order to build a complete, functional SELinux configuration, the model will have to be extended to cover other operations, such as creation, deletion, and renaming of objects, operations on subjects, etc. The SELinux policy will also need type transition rules, which are out of the scope of this paper (due to the paper length limits). We believe, however, that we have shown this way of implementation to be feasible. We also plan to simplify the resulting SELinux policy by directly utilizing SELinux support for multi-level confidentiality protection and, perhaps, by using some of the multi-level confidentiality protection fields in a non-standard way to implement multi-level integrity protection as well. Acknowledgements. This paper was supported by the grant VEGA 1/0266/09.
References
1. Janáček, J.: A Security Model for an Operating System for Security-Critical Applications in Small Office and Home Environment. Communications (Scientific Letters of the University of Žilina) 11, 5–10 (2009)
2. Janáček, J.: Mandatory Access Control for Small Office and Home Environment. In: Vojtáš, P. (ed.) Informačné Technológie – Aplikácie a Teória. PONT s.r.o., Seňa, pp. 27–34 (2009)
3. Bell, D.E., La Padula, L.J.: Secure Computer System: Unified Exposition and Multics Interpretation. Technical report (1976)
4. Tipton, H.F., Krause, M. (eds.): Information Security Management Handbook, 5th edn. CRC Press, LLC (2004)
5. Badger, L., Sterne, D.F., Sherman, D.L., Walker, K.M., Haghighat, S.A.: Practical Domain and Type Enforcement for UNIX. In: Proceedings of the 1995 IEEE Symposium on Security and Privacy. IEEE Computer Society, Washington (1995)
6. Linux Security Modules, http://lsm.immunix.org/
7. Security Enhanced Linux, http://www.nsa.gov/selinux/
8. Domain and Type Enforcement, http://www.cs.wm.edu/~hallyn/dte/
9. Coker, F., Coker, R.: Taking advantage of SELinux in Red Hat Enterprise Linux. Red Hat Magazine (6) (April 2005), http://www.redhat.com/magazine/006apr05/features/selinux/
10. Fedora SELinux Project – Discussion of Policies, http://fedoraproject.org/wiki/SELinux/Policies
11. Russinovich, M.: Inside Windows Vista User Account Control, http://technet.microsoft.com/en-us/magazine/2007.06.uac.aspx
12. Riley, S.: Mandatory integrity control in Windows Vista, http://blogs.technet.com/steriley/archive/2006/07/21/442870.aspx
A Roaming-Based Anonymous Authentication Scheme in Multi-domains Vehicular Networks Chih-Hung Wang and Po-Chin Lee Department of Computer Science and Information Engineering National Chiayi University, Chiayi, Taiwan {Wangch,s0970397}@mail.ncyu.edu.tw
Abstract. In vehicular networks, a vehicular user can communicate with peer vehicles or connect to the Internet. A vehicular user could move across multiple access points belonging either to their home network domain or to foreign network domains. In some cases, the real identity or location privacy may be disclosed or traced by malicious attackers. This poses privacy challenges to current vehicular networks. On the other hand, when a vehicular user drives into a foreign network, the foreign server has no way to authenticate the vehicular user, who is not a member of the foreign server's domain. A roaming concept can be used to solve this problem. In this paper, we propose a privacy preserving authentication scheme in multi-domains vehicular networks. Our scheme considers three kinds of situations in which the vehicular users communicate with the home network or foreign networks and gives approaches for them by combining a polynomial-pool based key distribution scheme with roaming technology. The proposed authentication protocols are designed to preserve the vehicular user's real identity and location privacy. A security analysis and comparisons with previous works are also given.
Keywords: Vehicular Networks, Wireless Security, Privacy, Authentication, Pseudonym.
1 Introduction
In recent decades, vehicular networks (VANETs) have become a popular research area in computer networks. A vehicular user can communicate with peer vehicles or connect to the Internet when he drives a car. Therefore, vehicles are becoming "computers on wheels", or rather "computer networks on wheels" [7,8,11]. In VANETs, user privacy and location privacy are two popularly discussed issues. A vehicular user could drive across multiple access points (APs) either belonging to their home network domain or to network domains owned by different authorities during a long-distance trip. The real identity or location privacy may be disclosed or traced by malicious attackers. This poses privacy challenges to current vehicular networks [6,7,8]. On the other hand, in current vehicular network research, most of the network models are constructed from a single domain environment and even ignore the handover procedure between different road side units [2,4,9]. Since a vehicular user could move
across multiple domains, privacy preservation should be considered not only in the home network domain but also in the foreign network domains. Chuang and Lee [1] in 2008 considered an authentication mechanism for a user moving between two or more network domains in vehicular networks. The network model in Chuang and Lee's scheme combines network mobility (NEMO) with the authentication, authorization and accounting (AAA) model. The purposes of Chuang and Lee's scheme are to resolve the high computational cost of public key infrastructure (PKI) and reduce the round trip time (RTT) of the authentication procedure. However, Chuang and Lee's scheme did not consider the privacy issues. The authentication protocol for a single domain in VANETs has been well designed in the previous literature. Nevertheless, there are a great many vehicular users in the real world and they can move quickly and even reach far away regions. Thus how to design an authentication protocol that can support multiple domains in an efficient way is a real challenge for VANETs. The purpose of this paper is to present a novel scheme using only low-cost symmetric encryption to accomplish multi-domain authentication with privacy preservation, which existing systems cannot properly deal with. In this paper, we explore an improved polynomial-pool based key distribution scheme using pseudonyms to design an anonymous authentication protocol in the home network. This approach can not only authenticate legitimate users but also achieve user privacy. We also design a crossing authentication protocol for when a vehicular user roams to foreign networks. In this protocol, a handover procedure is considered to reduce communication overheads. The roaming vehicular user may be in the following two situations. One is that the vehicular user stays in the foreign network for a short term, such as for business or an excursion. This situation can be processed by a temporary joining-new-domain procedure. The other is that the vehicular user stays in the foreign network for a long period of time, such as when changing dwelling place or staying in a new location for a business trip over one week. This situation can be processed by a transferring domain procedure. Roaming technology is used in the proposed protocol.
2 Preliminary
2.1 Security Requirements
The security system in vehicular networks should satisfy the following requirements [6,7,8]:
Authentication: Vehicular users or road side units (RSUs) should be authenticated as legitimate users in vehicular networks. Every entity must perform mutual authentication before it communicates with others.
Confidentiality and Integrity: The transmitted data should be encrypted to prevent attackers from gaining information. The integrity of all messages should be protected to prevent attackers from altering them.
Availability: The system should be protected against some attacks (such as DoS attacks). So, availability should also be supported by alternative means.
Privacy: When the vehicular user drives on the road and communicates with RSUs, he does not want his real identity exposed. Moreover, the vehicular user also dislikes his location being traced by someone. Therefore, the privacy of vehicular users should be guaranteed.
Real-time: At the high speeds typical in vehicular networks, real-time response of data transmission should be considered to guarantee the communication quality.
2.2 Network Model
A typical vehicular network, as shown in Fig. 1, has three participants and two communication modes in its architecture. The first participant is the authentication, authorization and accounting server (AAA server), also called the key distribution center (KDC). All vehicle registrations, key management, authentication and authorization are handled by the AAA server. The second one is the road side unit; it is a wireless access point beside roads. The last one is the vehicular user, also called the vehicle. These three participants communicate with each other through two kinds of communication modes. One is wired communication, which involves the AAA server and RSUs; the other is wireless communication, which involves the vehicular users and RSUs.
Fig. 1. A typical architecture in vehicular networks
In our network model, the whole network is divided into two or more sub-network domains. One is the home network (HN) and the others are foreign networks (FNs). The vehicular user can access network services at FNs. When the vehicular user wants to access resources at FNs, he must be authenticated and obtain authorization provided by the FNs. Thus, the home network and the foreign networks have a roaming agreement between them. We assume that there exist secure channels among the home network, the foreign networks and the RSUs, which means that the wired channels are secure and the attacker can only launch attacks on the wireless channels.
2.3 Attack in Vehicular Networks
The following attacks could be launched by attackers in vehicular networks [7,8]:
Bogus information: Attackers broadcast wrong information in order to affect driver's behavior in vehicular networks.
Cheating with positioning information: In order to escape liability, attackers use this attack to alter their position, speed and direction in the case of an accident.
ID disclosure of other vehicles in order to track their location: Attackers in this case may collect all the broadcasting messages in order to trace some vehicular users. The vehicular user's location and identity could be known to the attacker.
Denial of Service: The attacker may want to bring down the network or cause channel jamming. All the network services will be affected by this attack.
Masquerade: The attacker impersonates legal users to launch some malicious attacks (such as Sybil attacks).
3 The Proposed Schemes
3.1 Polynomial-Pool Based Key Distribution
The polynomial-pool based key distribution, which can be used in our proposed schemes as a building block, was proposed in 2003 by Liu and Ning [5]. To distribute the pair-wise keys, the key server randomly generates a bivariate t-degree polynomial f(x, y) = Σ_{i,j=0}^{t} a_{ij} x^i y^j over a finite field F_q, where q is a prime number and the property f(x, y) = f(y, x) holds. Assuming that each user has a unique identity, user i can get a polynomial share f(i, y) from the key server. If users i and j want to establish a common key to communicate with each other, user i can compute the common key f(i, j) by evaluating f(i, y) at point j, and user j can compute the common key f(j, i) by evaluating f(j, y) at point i. The details of polynomial-pool based key distribution can be found in [5].
3.2 Our Schemes
A vehicular user is considered to be able to move anywhere he wants, but his access right is fixed in a limited range controlled by the home network manager. If the vehicular user goes out of the range, he can only be authenticated by the foreign network manager in order to obtain the network access right. For example, if a vehicular user registers at city A, he cannot access network resources when he moves to city B because he did not register at city B. Hence, an authentication protocol with a roaming concept is proposed to deal with authentication in foreign networks. Our scheme considers three different situations. The first one is that the vehicular user communicates within the range of the home network. The second one is that the vehicular user communicates in the
foreign networks in a short term, and the last one is that the vehicular user communicates in the foreign networks for a long period of time.
Table 1. Notations for the proposed scheme
RSU_j – The identity of RSU_j
PID – The pseudonym set for the vehicular user
TPID_{i,j} – The temporary pseudonym of the vehicular user i given by the foreign RSU_j
CA_{i,j} – The certificate-chain for authenticating the vehicular user i by RSU_j (j = 0 being the domain server)
SK_{i,j} – The secret key shared between the home RSU_j and the vehicular user i
TSK_{i,j} – The temporary session key shared between the foreign RSU_j and the vehicular user i
MSK_i – The master secret key shared between a foreign server and the vehicular user i
MSKTR_i – The master secret key shared between the foreign server and the vehicular user i to be used for transferring domain
TAK_{i,j} – The temporary authentication key for RSU_j to authenticate the vehicular user i
K_xy – The secret key shared between RSU_x and RSU_y
h(·) – A one-way hash function
3.2.1 The Anonymous Authentication Protocol in the Home Network
We propose a protocol using polynomial-pool based key distribution in the case of the home network. In order to achieve the requirement of privacy, we improve the original polynomial-pool based key distribution in [5]. In the scheme of [5], each vehicular user must employ the real identity to compute common keys when they want to communicate with each other. However, in VANETs the vehicular users need to hide the real identity for the purpose of user privacy. Thus our protocol replaces the real identities with pseudonyms. In the Setup phase, the home server generates a set of pseudonyms PID = {PID_1, PID_2, ..., PID_m} and several bivariate polynomials of degree t. Each valid polynomial has a unique index number (e.g. f_1(x, y), f_2(x, y), ..., f_n(x, y)). Each RSU_j will obtain all polynomial shares with RSU_j's real identity from the home server (e.g. f_1(RSU_j, y), f_2(RSU_j, y), ..., f_n(RSU_j, y)). When a vehicular user registers at the home server by using his real identity, he can get polynomial shares with pseudonyms computed by the home server (e.g. the vehicular user ID_i can obtain the polynomial shares f_1(PID_1, y), f_3(PID_10, y), ..., f_n(PID_m, y)). The polynomial shares and pseudonyms for each vehicular user will be stored in the home server's database, as shown in Table 2.
Table 2. The polynomial shares and pseudonyms stored in home server's database
Pseudonym | Polynomial share | Owner
PID_1 | f_1(PID_1, y) | ID_i
PID_1 | f_2(PID_1, y) | ID_x
PID_1 | f_3(PID_1, y) | ID_z
... | ... | ...
PID_10 | f_3(PID_10, y) | ID_i
PID_10 | f_4(PID_10, y) | ID_z
... | ... | ...
PID_m | f_{n-1}(PID_m, y) | ID_x
PID_m | f_n(PID_m, y) | ID_i
When the vehicular user and the RSUs want to communicate with each other, they first compute the same authentication key by using their common polynomial share. For example, if both the vehicular user and the RSU hold shares of the same polynomial f_1(x, y), they can compute the same authentication key by using the RSU's real identity and the vehicular user's pseudonym. The vehicular user and the RSU can then establish a session key after performing the authentication protocol. The whole protocol is shown in Fig. 2 and the detailed steps are described in the following.
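For illustration, the following Python sketch shows this key agreement in miniature. The toy polynomial, the field modulus and the use of SHA-256 for h(·) are our own assumptions, and for brevity both parties evaluate the full polynomial here, whereas in the scheme each party only holds its univariate share.

# Illustrative sketch (not the authors' code) of the home-network key agreement:
# a symmetric bivariate polynomial yields a common authentication key, which
# then protects the r/s exchange deriving the session key SK = h(r || s).
import hashlib, secrets

Q = 2**61 - 1  # a prime modulus, chosen here only for illustration

def f(x: int, y: int, coeffs) -> int:
    """Evaluate the symmetric bivariate polynomial sum of a_ij * x^i * y^j mod Q."""
    return sum(a * pow(x, i, Q) * pow(y, j, Q)
               for (i, j), a in coeffs.items()) % Q

# a toy degree-1 symmetric polynomial: a_ij must equal a_ji
coeffs = {(0, 0): 17, (0, 1): 5, (1, 0): 5, (1, 1): 9}

PID_beta, RSU_j = 1001, 42                 # pseudonym and RSU identity, as integers
au_key_user = f(PID_beta, RSU_j, coeffs)   # vehicle side: f_alpha(PID_beta, RSU_j)
au_key_rsu = f(RSU_j, PID_beta, coeffs)    # RSU side: f_alpha(RSU_j, PID_beta)
assert au_key_user == au_key_rsu           # same authentication key on both sides

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"||".join(parts)).digest()

r, s = secrets.token_bytes(16), secrets.token_bytes(16)  # exchanged under au_key
SK = h(r, s)   # session key SK_{i,j} = h(r || s)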
Fig. 2. The anonymous authentication protocol in the home network
Step 1: RSU_j periodically broadcasts its real identity.
Step 2: Upon receiving the message sent by RSU_j, the vehicular user i randomly selects a polynomial share from all his polynomial shares (e.g. f_α(PID_β, y)). The vehicular user then calculates the authentication key au_key_{i,j} = f_α(PID_β, RSU_j). Note that the vehicular user can share au_key_{i,0} = f_α(PID_β, ID_HN) with his home server.
After that, the vehicular user chooses a random number r and sends the polynomial identity number α, the pseudonym PID_β and the random number r encrypted by the authentication key to RSU_j.
Step 3: When RSU_j receives the response message, it checks the polynomial identity number. RSU_j then computes the same authentication key au_key_{i,j} = f_α(RSU_j, PID_β) and decrypts the message. After getting the random number r, RSU_j computes the session key SK_{i,j} = h(r || s), where s is a random number selected by RSU_j. At the end of the step, RSU_j sends the random number s and the session key encrypted by the authentication key to the vehicular user.
Step 4: After receiving the message from RSU_j, the vehicular user decrypts it and
gets the random number s. Then the vehicular user checks whether the session key is valid or not by verifying h(r || s). If it is, the vehicular user sends s encrypted by the session key back to RSU_j. If RSU_j accepts the final message, the two parties have thus completed the anonymous authentication protocol and obtained a session key which can be used in the follow-up communications.
3.2.2 The Anonymous Authentication Protocol in the Foreign Network
If the vehicular user leaves the home network, he cannot access network resources in the foreign networks. In this case, a roaming-based anonymous authentication scheme is proposed. When the vehicular user goes into a foreign network, he must be authenticated in order to obtain the network access right. To achieve the requirement of privacy, the vehicular user does not want his real identity to be disclosed in the foreign networks. This section discusses the case where the vehicular user stays in the foreign network for a short time, e.g. for business or an excursion. Assume that the vehicular user has registered at the home network and obtained the polynomial shares provided by the home server. The detailed steps of the authentication in the foreign network, as shown in Fig. 3, are described in the following.
Step 1: The vehicular user i sends a temporary join request message including the identity of the home server, the identity of the foreign server, the pseudonym, the polynomial identity number α, the encrypted real identity and a random number x to the foreign server. Note that the transmitted messages must be forwarded through the foreign RSU1, which is the first RSU that the vehicular user meets in the foreign network. The foreign server then forwards this message to the home server through a secure channel for checking the vehicular user. Upon receiving the message, the home server decrypts the message by using the authentication key and verifies the identity of the vehicular user.
Step 2: If the vehicular user is valid, the home server generates a certificate CA_{i,0} by computing CA_{i,0} = h(ID_i || x) in order to verify the vehicular user at the foreign server. The home server then sends the certificate to the foreign server through a secure channel. Upon receiving the certificate, the foreign server selects a random number y and
computes a master secret key MSK_i = h(CA_{i,0} || y). The foreign server then sends the random number y encrypted by CA_{i,0} to the vehicular user through RSU1.
Step 3: After the vehicular user decrypts the message and obtains y, he computes the master secret key MSK_i = h(h(ID_i || x) || y). Then, the vehicular user sends y encrypted by the master secret key MSK_i back to the foreign server. Finally, if y, delivered from the vehicular user, is correct, the foreign server sends the temporary session key TSK_{i,1} = h(v_0 || CA_{i,1}) to the vehicular user, where v_0 is a random number selected by the foreign server. The temporary session key will be used when the vehicular user communicates with the foreign RSUs. The foreign server also sends the random number v_0 and the certificate CA_{i,1} to RSU1 for the next handover procedure, where CA_{i,1} = h(CA_{i,0}).
Fig. 3. The anonymous authentication protocol in the foreign network
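The key material derived during the temporary join can be summarized by the following illustrative Python sketch; the byte encodings and the use of SHA-256 for h(·) are our own assumptions, not part of the proposed scheme.

# Sketch of the values derived in the temporary join procedure (Fig. 3).
import hashlib, secrets

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"||".join(parts)).digest()

ID_i = b"real-identity-of-vehicle-i"
x = secrets.token_bytes(16)    # random number chosen by the vehicular user
y = secrets.token_bytes(16)    # random number chosen by the foreign server
v0 = secrets.token_bytes(16)   # random number chosen by the foreign server

CA_i0 = h(ID_i, x)             # certificate CA_{i,0} computed by the home server
MSK_i = h(CA_i0, y)            # master secret key, foreign server <-> vehicle
CA_i1 = h(CA_i0)               # next element of the certificate chain
TSK_i1 = h(v0, CA_i1)          # temporary session key for RSU1

# The vehicle derives the same MSK_i from its own identity and x:
assert MSK_i == h(h(ID_i, x), y)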
When the vehicular user communicates with the foreign RSUs, he needs to perform the authentication procedure with all of the RSUs he meets, and these RSUs also need to connect to the server to check the legitimacy of the user, which causes a lot of communication overhead. For this reason, we design a novel handover procedure to reduce the communication costs. When the vehicular user arrives at the second or a later RSU of the foreign network, he can be authenticated by the RSU through a credential obtained from the previously visited RSU. Hence, except for the first RSU the vehicular user meets, other RSUs need not connect to the server when performing the authentication. The detailed protocol, as shown in Fig. 4, is described below.
Step 1: Before handover, RSU1 computes the temporary session key TSK_{i,1} = h(CA_{i,1} || v_0) to communicate with the vehicular user i and generates three
random numbers v_1, θ_1 and ω_1. The credential is HMAC_{k12}(θ_1). RSU1 then computes a new temporary pseudonym TPID_{i,1} = h(CA_{i,2} || ω_1) and a new temporary session key TSK_{i,2} = h(CA_{i,2} || v_1) and sends an encrypted message including the credential, θ_1, the temporary pseudonym and the temporary session key to the vehicular user for authentication, where CA_{i,2} = h(CA_{i,1}). RSU1 also sends the random numbers v_1, ω_1 and the certificate CA_{i,2} to RSU2 for verifying the credential.
Step 2: After arriving at RSU2, the vehicular user i sends his temporary pseudonym TPID_{i,1} and the credential encrypted by the temporary session key TSK_{i,2} to RSU2 for authentication.
Step 3: Upon receiving the message, RSU2 computes the temporary pseudonym TPID_{i,1} = h(CA_{i,2} || ω_1) and the corresponding temporary session key TSK_{i,2} = h(CA_{i,2} || v_1). Then, RSU2 decrypts and verifies the credential HMAC_{k12}(θ_1). If the verification passes, RSU2 generates a new temporary pseudonym TPID_{i,2} = h(CA_{i,3} || ω_2), a new temporary session key TSK_{i,3} and a new credential HMAC_{k23}(θ_2), and sends them to the vehicular user for the next handover. RSU2 also needs to send the random numbers v_2, ω_2 and the certificate CA_{i,3} to RSU3. The vehicular user and the other RSUs run a similar procedure for authentication until the vehicular user reaches the destination or arrives in a new foreign network.
Fig. 4. Handover procedure in the foreign network
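The following Python sketch illustrates the handover values computed by RSU1; the use of HMAC-SHA256 and the byte encodings are our own assumptions, with k12 standing for the key K_12 shared by RSU1 and RSU2.

# Sketch of the handover derivations performed by RSU1 (Fig. 4).
import hashlib, hmac, secrets

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"||".join(parts)).digest()

CA_i1 = secrets.token_bytes(32)   # certificate element held after the temporary join
CA_i2 = h(CA_i1)                  # next element of the certificate chain
v1, theta1, omega1 = (secrets.token_bytes(16) for _ in range(3))
k12 = secrets.token_bytes(32)     # secret key shared between RSU1 and RSU2

credential = hmac.new(k12, theta1, hashlib.sha256).digest()   # HMAC_{k12}(theta_1)
TPID_i1 = h(CA_i2, omega1)        # temporary pseudonym for use at RSU2
TSK_i2 = h(CA_i2, v1)             # temporary session key for RSU2

# RSU2, given (CA_i2, v1, omega1) from RSU1, recomputes TPID_i1 and TSK_i2
# and then checks the credential presented by the vehicle:
assert hmac.compare_digest(credential, hmac.new(k12, theta1, hashlib.sha256).digest())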
3.2.3 The Transferring Domain Protocol in the Foreign Networks
We consider the case that the vehicular user stays in the foreign network for a long period of time, such as when the vehicular user changes dwelling place or stays in a new location for a business trip of over one week. In this case, as the vehicular user uses the foreign network resources for a long time, the communication overhead rapidly increases with the large number of authentication or handover processes if the protocol of the previous section is used. Therefore, we design a transferring domain authentication protocol to solve this problem. The detailed steps of the protocol, as shown in Fig. 5, are described in the following. Note that the messages transmitted between the
vehicular user and the foreign server must be forwarded through the RSU. For simplicity we omit the RSU in the protocol.
Step 1: The vehicular user sends a domain transfer request message including the identity of the home server, the identity of the foreign server, the pseudonym, the polynomial identity number α, the encrypted real identity and a random number x to the foreign server. The foreign server then forwards this message to the home server through a secure channel. Upon receiving the message, the home server decrypts the message by using the authentication key and verifies the identity of the vehicular user. The home server then revokes all polynomial shares and pseudonyms of the requester.
Step 2: The home server delivers the real identity and x to the foreign server through a secure channel. Upon receiving the real identity of the vehicular user and x, the foreign server starts a transferring domain procedure. First, the foreign server generates a random number y and a master secret key MSKTR_i = h(ID_i || x || y) and then sends y encrypted by x to the vehicular user. When the vehicular user decrypts the transmitted message and gets y, he can compute the master secret key.
Step 3: Finally, the vehicular user sends his real identity and y encrypted by the master secret key back to the foreign server for authentication. If the verification passes, the foreign server selects some of the pseudonyms together with the corresponding polynomial shares and sends them encrypted by MSKTR_i to the vehicular user. After performing the transferring domain procedure, the vehicular user becomes a new member of the foreign network. Then, the vehicular user can run the same anonymous authentication protocol as in the home network, as mentioned before, to communicate with RSUs.
Fig. 5. The proposed transferring domain protocol
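A brief illustrative sketch of the transferring-domain key derivation follows; the hash instantiation and the byte encodings are assumptions on our part.

# Sketch of the transferring-domain master key (Fig. 5): x comes from the
# vehicle's request, y from the foreign server.
import hashlib, secrets

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"||".join(parts)).digest()

ID_i = b"real-identity-of-vehicle-i"
x = secrets.token_bytes(16)   # random number sent in the domain transfer request
y = secrets.token_bytes(16)   # random number chosen by the foreign server

MSKTR_i = h(ID_i, x, y)       # protects delivery of the new pseudonyms and shares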
4 Discussion
4.1 Security of the Proposed Schemes
Authentication: In our protocols, only legitimate users can access the network resources. All vehicular users must be authenticated by the home server or the foreign server through RSUs. Also, the vehicular users must authenticate the server to check whether it is legitimate. While the vehicular user stays in the home network, the authentication key is used to confirm the validity of the users and RSUs. To be authenticated by RSUs, the vehicular user must be able to generate a correct authentication key. According to our design, except for the vehicular user and RSU_j, no one can generate the session key SK_{i,j} = h(r || s) without obtaining the authentication key, because the random values r and s are protected by the authentication key during transmission. Therefore, the vehicular user and RSU_j can achieve mutual authentication after carrying out the anonymous authentication protocol. While the vehicular user stays in the foreign network, the foreign server and RSUs cannot directly authenticate the vehicular user. However, this situation can be treated as roaming. In our protocol, the foreign server sends the authentication message of the vehicular user to the home server. Except for the vehicular user and the foreign server, no one can generate the master secret key MSK_i = h(CA_{i,0} || y) used in the anonymous authentication protocol in the foreign network, because the certificate CA_{i,0} = h(ID_i || x) is delivered through a secure channel and unknown to malicious outsiders. However, if the foreign server is unauthorized, it cannot get the certificate from the home server, because it is assumed that there exists a roaming agreement between the home network and the foreign network. When the vehicular user wants to transfer to another domain, except for the vehicular user and the foreign server, no one can generate the master secret key MSKTR_i = h(ID_i || x || y) used in the proposed transferring domain protocol, because malicious outsiders cannot know the real identity and the random values x and y. If the foreign server is unauthorized, it cannot get the real identity and the random number selected by the vehicular user from the home server. In the handover procedure, RSU_{j-1} can generate the temporary session key, the credential and the temporary pseudonym for the next stage. The vehicular user can obtain the above messages from RSU_{j-1}. Also, RSU_j can compute the temporary session key by using the temporary pseudonym and random number received from RSU_{j-1}. Therefore, the vehicular user and RSU_j can mutually authenticate each other. Similar to the previous works, our scheme does not consider RSU compromise, which could raise impersonation or eavesdropping attacks.
Privacy: In our schemes, the pseudonym approach is used to achieve the privacy property. Furthermore, the un-traceability property is also considered.
In the home network, the vehicular user obtains a set of polynomial shares with pseudonyms. When the vehicular user arrives at a new home RSU, he randomly selects a new pseudonym and a corresponding polynomial share to use for this RSU. Therefore, the vehicular user can conceal his real identity by frequently changing pseudonyms. If a malicious attacker wants to perform a long-term trace of a vehicular user, he cannot succeed, because the vehicular user changes his pseudonym upon arriving at each new home RSU. In the foreign network, only the vehicular user and the home server can generate the certificate. The foreign server gets the certificate from the home server; however, the foreign server has no idea of the real identity of the anonymous user. Moreover, in the handover procedure, the temporary pseudonym is generated by RSU_{j-1}. A malicious user cannot trace a vehicular user because the temporary pseudonym is changed frequently. The foreign RSU_j only knows the two temporary pseudonyms used for RSU_{j-1} and RSU_{j+1}. Therefore, except for the home server and the vehicular user, no one can know the real identity of the vehicular user in the foreign network. Since an RSU has only a few neighbors, it can directly transmit the vehicular user's credential and pseudonym to all of them, but the RSU has no idea where the vehicular user will go. Hence, untraceability can be realized.

Real-time. We use a symmetric cryptosystem to reduce the computation and communication overheads. Further, the handover procedure was designed to reduce the communication overhead in the foreign network. When the vehicular user roams to the foreign network, the foreign server only needs to authenticate the vehicular user once per session. The credential can then be used by the vehicular user in the following communications with foreign RSUs. Our schemes use symmetric cryptosystems instead of asymmetric cryptosystems, which is particularly suitable for secure communications in a system with high-speed movement.

4.2 Comparisons
Compared with many previous schemes, our schemes have several advantages, e.g., anonymity, location privacy, mutual authentication, and efficiency through the use of symmetric cryptosystems. Chuang and Lee [1] proposed a lightweight mutual authentication mechanism for network mobility (NEMO) in vehicular networks. The mechanism, called LMAM, combines NEMO with the AAA model in vehicular networks and offers low computation cost and local authentication. Kim et al. [3] proposed a privacy-preserving authentication protocol in vehicular networks; the protocol offers traceability with privacy protection by using pseudonyms and a MAC (Message Authentication Code) chain. Zhang et al. [10] also proposed a location privacy preserving authentication scheme in vehicular networks; their scheme is based on a blind signature over elliptic curve arithmetic. The computational cost and communication cost of several schemes are shown in Table 3, where C_h denotes the cost of executing the hash function, C_sym denotes the
cost of executing a symmetric encryption or decryption, C_poly denotes the cost of computing a polynomial result, C_asym denotes the cost of executing an asymmetric encryption or decryption, C_pairing denotes the cost of executing a pairing operation, RSU_AAA denotes the number of transmissions between an RSU and the AAA server, V_RSU denotes the number of transmissions between the vehicular user and an RSU, RSU_RSU denotes the number of transmissions between two different RSUs, AAA_AAA denotes the number of transmissions between two different AAA servers, and n denotes the number of RSUs the vehicular user meets in the handover procedure. In terms of computational cost, our method is slightly worse than Chuang and Lee's scheme but better than Kim et al.'s and Zhang et al.'s schemes for foreign network authentication. Further, in the handover procedure, our method is more efficient than the other schemes when a vehicular user moves across multiple RSUs. Hence, the efficiency gained by using symmetric cryptosystems makes our method suitable for high-speed movement in VANETs.

Table 3. Comparison of several schemes
Scheme                              Chuang and Lee [1]      Kim et al. [3]                    Zhang et al. [10]                    Ours
Computation cost
  Foreign network authentication    7 C_h + 4 C_sym         10 C_h + 7 C_sym + 4 C_asym       6 C_sym + 4 C_pairing                11 C_h + 10 C_sym + 2 C_poly
  Handover procedure                (5 C_h + 4 C_sym)*n     (6 C_h + 4 C_sym + 2 C_asym)*n    (3 C_h + 4 C_sym + 3 C_pairing)*n    (6 C_h + 2 C_sym)*n
Communication cost
  Foreign network authentication    2 RSU_AAA, 2 V_RSU      2 RSU_AAA, 2 V_RSU                5 RSU_AAA, 6 V_RSU                   2 AAA_AAA, 5 RSU_AAA, 5 V_RSU
  Handover procedure                3 V_RSU*n               (2 V_RSU + RSU_RSU)*n             (5 V_RSU + 2 RSU_RSU)*n              (2 V_RSU + RSU_RSU)*n
Mutual authentication               Yes                     Yes                               Yes                                  Yes
Location/user privacy               No                      Yes                               Yes                                  Yes
Consideration of multi-domains      Yes                     No                                No                                   Yes
In the comparison of communication cost, our approach is clearly not the best for foreign network authentication: it incurs additional cost for the foreign server to check an anonymous user coming from another domain, and therefore looks slightly worse than the other schemes in this phase. However, the difference is not large, and our method additionally provides a multi-domain architecture, while Kim et al.'s and Zhang et al.'s schemes only consider a single-domain environment, which places a heavy traffic load on the server if they are applied directly to multiple domains. The multi-domain
architecture is particularly suitable for large-scale VANETs. Although Chuang and Lee's scheme also considers the multi-domain architecture, it fails to provide location/user privacy.
5 Conclusions

We proposed a privacy-preserving authentication scheme for vehicular networks. The proposed scheme considers three different kinds of situations in vehicular networks. Our scheme can not only authenticate legitimate users but also achieve user privacy. The proposed protocol in the home network anonymously authenticates the vehicular users by using a polynomial-pool-based scheme. In foreign networks, the vehicular users can be anonymously authenticated by the foreign server through roaming. Also, the proposed handover protocol can effectively reduce the communication overhead when the vehicular users stay in foreign networks. Further, the transferring domain protocol provides a flexible way for the vehicular users to transfer to another domain when they need to stay in the foreign network for a long period of time.

Acknowledgments. This work was supported in part by the National Science Council, Taiwan, R.O.C., under grant NSC97-2221-E-415-002-MY2.
References 1. Chuang, M.C., Lee, J.F.: LMAM: A Lightweight Mutual Authentication Mechanism for Network Mobility in Vehicular Networks. In: 2008 IEEE Asia-Pacific Services Computing Conference, pp. 1611–1616 (2008) 2. Fonseca, E., Festag, A., Baldessari, R., Aguiar, R.L.: Support of Anonymity in VANETs– Putting Pseudonymity into Practice. In: IEEE Wireless Communications and Networking Conference, pp. 11–15 (2007) 3. Kim, S.H., Kim, B.H., Kim, Y.K., Lee, D.H.: Auditable and Privacy-Preserving Authentication in Vehicular Networks. In: The Second International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies, pp. 19–24 (2008) 4. Li, C.-T., Hwang, M.-S., Chu, Y.-P.: A Secure and Efficient Communication Scheme with Authenticated Key Establishment and Privacy Preserving for Vehicular Ad Hoc Networks. Computer Communications 31(12), 2803–2814 (2008) 5. Liu, D., Neng, P.: Establishing Pairwise Keys in Distributed Sensor Networks. In: The 10th ACM Conference on Computer and Communications Security, pp. 52–61 (2003) 6. Plobl, K., Nowey, T., Mletzko, C.: Towards a Security Architecture for Vehicular Ad Hoc Networks. In: The First International Conference on Availability, Reliability and Security (2006) 7. Raya, M., Hubaux, J.P.: The Security of Vehicular Ad Hoc Networks. In: The 3rd ACM Workshop on Security of Ad hoc and Sensor Networks, pp. 11–21 (2005) 8. Raya, M., Hubaux, J.P.: Securing Vehicular Ad Hoc Networks. Journal of Computer Security 15(1), 39–68 (2007)
9. Xi, Y., Sha, K., Shi, W., Schwiebert, L., Zhang, T.: Enforcing Privacy Using Symmetric Random Key-set in Vehicular Networks. In: International Symposium on Autonomous Decentralized Systems, pp. 344–351 (2007) 10. Zhang, C., Liu, R.-X., Ho, P.H., Chen, A.: A Location Privacy Preserving Authentication Scheme in Vehicular Networks. In: IEEE Wireless Communications and Networking Conference, pp. 2543–2548 (2008) 11. Now: Network on wheels, http://www.network-on-wheels.de
Human Authentication Using FingerIris Algorithm Based on Statistical Approach

Ahmed B. Elmadani

Department of Computer Science, Faculty of Science, University of Sebha, Sebha, Libya
[email protected]
Abstract. Biometrics has become a strong tool for authenticating persons, because it is able to prove a true identity. Research shows that different applications are used in verification; fingerprint, face, and iris recognition are some examples. However, most of them suffer from FRR and FAR, so more research and new algorithms are needed to solve this problem. This paper presents a system and algorithm that use a pair of biometric prints (fingerprint and iris) to gain access to personal resources, based on a statistical approach. Features are extracted and used to authenticate persons. The paper shows that the developed system solves the mentioned problem and accelerates the matching process.

Keywords: Fingerprint, Iris, statistical feature, feature matching.
1 Introduction

This paper investigates some alternative techniques to user passwords. It focuses on biometrics (fingerprint and iris) as an alternative means of user authentication for gaining access to system resources.

1.1 Background

To access a system, an authorized person needs to type his login name together with his password. If it matches the entry in the password file (or is verified directly), the user is granted access to the system resources. But there are numerous problems with the use of password-based systems:

• In order to remember their passwords, users tend to write them down, especially when they use many different passwords. There is no solution to this problem except user education.
• Users tend to pick obvious passwords, such as names of friends, loved ones, or pets.
• Passwords are transmitted from the user to the system in clear text on the communication lines.
• Password files tend to store passwords in encrypted (or hashed) form, but often there is no restriction on reading the password file. This makes them very amenable to dictionary-based attacks [1].
In summary, passwords do not provide an adequate mechanism for authentication, because they may be forgotten, stolen, or left behind, and this causes system security failure [3]. A password-based system does not prove the presence of the right user, because someone else, such as a friend or an attacker, can impersonate the actual user. Biometrics is an alternative method for authenticating users.

1.2 Biometrics as an Alternative to Passwords

Biometrics (face, voice, fingerprint, iris), where an individual's identity is verified by a unique physical or behavioral characteristic, is fast becoming a standard solution for a variety of applications, including physical access control, computer welfare payments, and healthcare data security [2]. While it has traditionally been used for applications that require the highest security, it is now gaining acceptance in mainstream consumer applications worldwide [3].

1.3 Authentication Using Fingerprint and Iris Images

Fingerprints are a geometric human characteristic. They are formed even before birth, during the development of the hand. Science has proved that each person has a unique fingerprint; even identical twins have different fingerprints. It has been estimated that the chance of two people, including twins, having the same fingerprint is less than one in a billion [4]. The iris is the elastic, pigmented, connective tissue that controls the pupil. The iris is formed in early life in a process called morphogenesis; once fully formed, its texture is stable throughout life [11]. The iris of the eye has a unique pattern, from eye to eye and person to person [9]. An iris scan analyzes over 200 points of the iris, such as rings, furrows, freckles, and the corona, and compares them with a previously recorded template. Glasses, contact lenses, and even eye surgery do not change the characteristics of the iris [9,11]. In automated security, fingerprints and irises are more secure than passwords because of the fine differentiation between seemingly identical samples [8]. The iris structure also differs from one person to another. Fingerprint information is provided by the structure of the friction ridges, while the iris has an extraordinary structure and provides abundant texture information. The spatial patterns that are apparent in the iris are unique to each individual [12]; individual differences that exist in the development of anatomical structures in the body result in this uniqueness [13]. A fingerprint or iris cannot be forgotten or stolen, and the iris is as distinct as a fingerprint when compared with other biometrics (such as face, voiceprints, etc.); therefore, they can be used to authenticate persons [11]. The methods of constructing information from a fingerprint or an iris image are quite the same [9]. Iris scanning systems vary the light and check that the pupil dilates or contracts, whereas fingerprint scanning systems check the friction ridges [14].
1.4 Fingerprint and Iris Image Processing

In a closed search, where the target fingerprint image is already known, the problem is greatly simplified [4]. The system simply retrieves the existing record and makes a one-to-one comparison with the subject fingerprint image. When the number of images is small, the comparison produces a result in reasonable time, but when the one-to-one comparison deals with a large number of images, say several million, this technique is time consuming [2]; in some systems it may last for days [3]. Similarly, most modern iris detection algorithms produce what is known as an iris mask, which represents the portion of the iris obstructed by the eyelid or eyelashes [8]. The mask is ignored when comparing iris codes. So the idea is the same, and it is time consuming [14]. Fingerprint and iris images always need enhancement to bring out their features clearly, because of the low quality of images captured by scanner devices [10]. Once an image is captured and resized, it is filtered using one of the known filtering methods, such as linear, Wiener, median, or Gaussian filters [5]. The image is filtered several times, using one or more filtering algorithms, until it becomes clear. Then binarization, using a threshold, is applied to the image. At this point the fingerprint image is ready for detecting its edges or minutiae points. Using the same methods, an iris image can be prepared and its information constructed [9]. The constructed information is stored in a template file for future comparison [1]. Multimodal biometric systems are usually considered to be the combination of two or more complete unimodal biometric systems. Conventional multimodal biometric identification systems tend to have a larger memory footprint, slower processing speeds, and higher implementation and operational costs [15]. They improve performance by increasing accuracy and decreasing the FAR [16]. For example, Jain et al. [17] provide a fingerprint, face, and speech based system which uses minutiae and requires large images. Wang et al. [18] present a face and iris system that uses a weighted-sum comparison. Lumini and Nanni [19], in their fingerprint and iris system, use multiple detection algorithms from the FVC2004 competition and Gabor filters [15]. The traditional multimodal biometric approach in most cases requires either the installation of multiple sensors, or multiple algorithms, or both, which results in higher installation and operational cost and a large memory footprint [15].

1.5 Matching Techniques

Three techniques are used in image matching for pattern recognition: template, statistical, and structural. The choice depends on the noise and the number of invariant patterns in the image [9]. If there is a small number of invariant patterns, then template matching is probably most appropriate [10]. But if there is little apparent structure in the patterns, or if they are in the presence of high noise, then statistical pattern recognition may be most effective. If, however, the patterns have an obvious structure and a certain amount of variability, the structural pattern recognition technique is usually most appropriate. A combination of more than one technique can be used [3]. A statistical approach for image pattern recognition uses different factors to test the equality of two images; among those factors are the Hough transform and the Euler number [2,6].
The Hough transform (HT) is a technique used to isolate features of a particular shape within an image. The desired features must be specified in some parametric form. The classical HT is most commonly used for the detection of regular curves such as lines, circles, and ellipses. The main advantage of the HT is that it is tolerant of gaps in feature boundary descriptions and relatively unaffected by image noise [6,7]. The Euler number (EN) is another approach to identifying the similarity of two images. The EN is a measure of the topology of an image, defined as the total number of objects in the image minus the number of holes in those objects; either 4- or 8-connected neighborhoods can be used. Note that two identical images will have the same EN [3]. The correlation coefficient (CC) is a simple descriptive statistic that measures the strength of the linear relationship between two interval- or ratio-scale variables. It measures the co-variation in the magnitudes of two images [7]. The CC indicates the extent to which the pairs of values of these two variables, for given images, lie on a straight line. Correlation is used to measure the relationship between two variables (like two images). The values of the CC can range from -1 to +1. If there is no relationship between the two variables, the value of the coefficient is 0; if there is a perfect positive relationship, the value is +1; and if there is a perfect negative relationship, the value is -1. The CC of two images is equal to one, or very close to one, only if the two images come from the same source [3].
2 Methodology and Discussion

Compared with existing systems, the proposed system uses fewer memory resources and reduces processing time; as a result it becomes faster, because it uses numbers in its calculations and comparisons instead of images. Note: in our new fingerIris matching algorithm, we assume that the image scale is 1:1.

2.1 FingerIris Preprocessing

Due to the low quality of the images obtained from the fingerprint reader and from the camera, the scanned image has to be filtered several times to bring out its features. The steps are as follows:

1. In order to reduce the impact of noise, we perform image equalization on all images, sized 120x120 pixels, then filter them using a standard median filter and Gaussian filtering on 3-by-3 pixel neighborhoods. Filtering reduces the graininess of the images, so that fewer false edges are detected in the next step of segmentation. The result of this operation is a clear image; see figure 1, in which an image is equalized and then filtered using median filtering.
2. The two images form part of a human record in the fingerIris system, with a total image size of 240x240 pixels, as shown in figure 1.
3. The Canny method was chosen to detect edges in the filtered image. The Canny method gives better results than other methods and helps in finding image features. The threshold was fixed to a predefined significant value. In figure 2, the upper part shows the edged fingerIris image.
4. The detected edges may contain noise, i.e., multiple edge fragments corresponding to a single whole feature. Furthermore, the output of an edge detector defines only where features are in an image. For that reason a HT algorithm was used to obtain the best results and to be tolerant of fingerprint and iris translation and rotation; such tolerance is usually reached by using a Hough transform based algorithm. The HT is used to determine both what the features are and how many of them exist in the image.
5. Using the detected edge image X[n, n], we form an accumulator array A[b, c] which contains pixel values ranging from 0 to 255, arranged in terms of X and Y. The HT(P, Ф) values are computed using equation (1), which is a straight-line equation:

P = X cos Ф + Y sin Ф                                  (1)

where P is the normal distance of the line from the origin and Ф is the angle of the line's normal with respect to the X-axis, as shown in figure 2.
Fig. 1. Iris and fingerprint image preprocessing: equalization and median filtering
6. Since we deal with images that must be tolerant to translation and rotation, an angle range of ±10° in steps of 0.2° was chosen to compute the HT(P, Ф) values for the fingerprint image, while for the iris image an angle range of ±35° in steps of 0.2° was chosen. Figure 2 shows the calculated Hough transform for the fingerIris images.
7. Using the result obtained from the HT, we locate the three highest peaks and connect them to form a pattern (PR), first for the fingerprint, as shown in figure 2; similarly we form a PR for the iris. Then we calculate the area (AR) under the curve using equation (2), which is the area of the PR; figure 3 shows the pattern formed for a fingerprint (see the sketch after Fig. 2):

AR = Σ_{i=a}^{b} f(P_i) · (b − a)/n                                  (2)

where f(P_i) = P_i = y_i cos φ + x_i sin φ, with a = −10 and b = 10 for the fingerprint, a = −35 and b = 35 for the iris, and n = 100.
Calculate the EN for each PR, using the Matlab function EN = bweuler(PR, 8). The calculated values AR and EN are stored for future use. Table 1 shows, for each image, two values representing the areas of two different portions. For image no. 1, the values (0.6975, 0.3765) are for the fingerprint and the values (0.3734, 0.2046) are for the corresponding iris image. Having different values helps in distinguishing images.
Fig. 2. Binarized fingerprint and iris images and the resulting peaks calculated using the HT
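The following sketch illustrates steps 5-7 under stated assumptions: it builds the restricted-angle Hough accumulator of equation (1) and approximates the area AR of equation (2) as a Riemann sum. Locating and connecting the three highest peaks is omitted, and the array sizes and the quantization of P are illustrative choices rather than details taken from the paper.

public class HoughFeatures {

    // accumulator[k][p] counts edge pixels whose distance P falls in bin p at angle index k
    static int[][] houghAccumulator(boolean[][] edges, double aDeg, double bDeg, double stepDeg) {
        int h = edges.length, w = edges[0].length;
        int angles = (int) Math.round((bDeg - aDeg) / stepDeg) + 1;
        int maxP = (int) Math.ceil(Math.sqrt((double) h * h + (double) w * w));
        int[][] acc = new int[angles][2 * maxP + 1];        // P can be negative, so offset by maxP
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                if (edges[y][x])
                    for (int k = 0; k < angles; k++) {
                        double phi = Math.toRadians(aDeg + k * stepDeg);
                        int p = (int) Math.round(x * Math.cos(phi) + y * Math.sin(phi)); // equation (1)
                        acc[k][p + maxP]++;
                    }
        return acc;
    }

    // AR of equation (2): sum the sampled f(P_i) values and scale by (b - a)/n, with n = 100 in the paper
    static double patternArea(double[] f, double a, double b) {
        int n = f.length;
        double sum = 0.0;
        for (double fi : f) sum += fi;
        return sum * (b - a) / n;
    }
}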
2.2 Matching Procedure

Values of the EN are used to narrow the search list by finding the minimum squared error (MSE) using formula (3). Decisions are made for the minimum value of the MSE: for those images whose MSE values are equal to zero or close to zero, the test is passed to the next stage, as described in the next paragraph.

MSE = (EN_1 − EN_2)^2                                  (3)

Fig. 3. Fingerprint pattern formed using the three highest peaks
Table 1. Calculated values of EN and the AR pair for the fingerprint and the AR pair for the iris image

No.   PRFP   ARFP               ENFP    PRIR   ARIR               ENIR
1     FP1    0.6975, 0.3765     -283    IR1    0.0644, 0.2046     -40
2     FP2    0.7691, 0.4734     -320    IR2    0.3754, 0.2238     -21
3     FP3    0.7351, 0.439      -275    IR3    0.3663, 0.1928     -13
4     FP4    0.7654, 0.49       -280    IR4    0.3861, 0.1396     -23
5     FP5    0.6891, 0.4709     -291    IR5    0.3777, 0.1606     -63
6     FP6    0.5730, 0.3442     -326    IR6    0.3965, 0.1627     -42
7     FP7    0.0702, 0.1133     -54     IR7    0.3157, 0.1239     -5
8     FP8    0.0201, 0.0211     -3      IR8    0.1915, 0.1901     -4
We calculate the HT of the live image, find the three highest peaks, and connect them to form a pattern as in step 6; we then match its AR with the stored ARs by calculating the CC using formula (4). If no match is found, we move to the next three highest peaks until a matching AR is found. The process can start with either the fingerprint or the iris image. Thus a user can secure his access to the system resources.

CC = Σ_{i=0}^{m} Σ_{j=0}^{n} (AR_FP1,ij − AR̄_FP1)(AR_FP2,ij − AR̄_FP2) / sqrt( Σ_{i=0}^{m} Σ_{j=0}^{n} (AR_FP1,ij − AR̄_FP1)^2 · Σ_{i=0}^{m} Σ_{j=0}^{n} (AR_FP2,ij − AR̄_FP2)^2 )                                  (4)

where AR̄_FP1, AR̄_FP2 and AR_FP1,ij, AR_FP2,ij are the areas of the live pattern and the stored one for the fingerprint; the iris pattern areas are AR̄_IR1, AR̄_IR2 and AR_IR1,ij, AR_IR2,ij, as illustrated in Table 1.
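A small sketch of the matching computations, assuming the AR values are arranged in matrices as suggested by the double sums of equation (4); the class and method names are illustrative.

public class PatternMatch {

    static double mse(double en1, double en2) {             // equation (3)
        return (en1 - en2) * (en1 - en2);
    }

    static double correlation(double[][] live, double[][] stored) {   // equation (4)
        double meanL = mean(live), meanS = mean(stored);
        double num = 0, dl = 0, ds = 0;
        for (int i = 0; i < live.length; i++)
            for (int j = 0; j < live[0].length; j++) {
                double a = live[i][j] - meanL, b = stored[i][j] - meanS;
                num += a * b;
                dl  += a * a;
                ds  += b * b;
            }
        return num / Math.sqrt(dl * ds);                    // ranges from -1 to +1
    }

    static double mean(double[][] m) {
        double s = 0;
        for (double[] row : m) for (double v : row) s += v;
        return s / (m.length * m[0].length);
    }
}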
Fig. 4. Resulting false acceptance and false rejection rates
In Table 2 we show the values of the CC for the stored images against an unknown image, for the first person in our illustration. The values in Table 2 show that the searched image corresponds to the person's information in column 1.

2.3 Testing Our System

A digital camera and a fingerprint reader were used to capture images of 200 persons; the results of the testing are shown in Table 2. Our system rejects 3 false users and accepts 2 valid users at the low security level; it accepts one false user and rejects 4 valid users at the medium security level; and at the high security level no false user is accepted but 5 valid users are rejected, because of the low quality of the images used. Figure 4 shows the curve produced by the fingerIris system.

Table 2. Resulting CC of image 1 against the other stored images
Image no   1        2                3                5                4
CCFP       1, 1     0.9057, 0.0017   0.8173, 0.002    0.0103, 0.0129   0.0117, 0.0081
CCIR       1, 1     0.1287, 0.1302   0.1241, 0.1239   0.1391, 0.1358   0.107, 0.1066
Decision   ok       No               No               No               no
3 Conclusion

A method of fingerIris matching was presented, in which a fingerIris image is smoothed and its true edges are detected using a HT algorithm. The fingerIris pattern features, based on the area under the curve, were then found. We showed how patterns can be formed from the peaks resulting from the HT process, which is tolerant to fingerIris translation and rotation. This method, if applied properly, will enhance security based on biometric authentication.
References 1. Elmadani, A.B., Prakash, V., Ramli, A.R.: Smart Card and Secure Coprocessor Enhance Internet Security. Suranaree Journal of Science and Technology 9(2) (2002) 2. Elmadani, A.B., Prakash, V., Ali, B.M., Ramli, A.R., Jumar, K.: New Technique in Fingerprint Matching based on An-AVL tree Search Algorithm. Journal of Technology and commerce Brunei Darussalam (2006) 3. Elmadani, A.B., Prakash, V., Ali, B.M., Ramli, A.R., Jumari, K.: Fingerprint Access Control with Anti-spoofing Protection. Brunei Darussalam Journal of Technology and Commerce (2005) 4. Pankanti, S., Prabhakar, S., Jain, A.K.: On the individuality of fingerprints. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(8), 1010–1025 (2002)
5. Wang, S.D., Lee, C.J.: Fingerprint recognition using directional micro pattern histo-grams and LVQ networks. In: Proceedings of International Conference on Information Intelligence and Systems, USA, pp. 300–303 (1999) 6. Zorski, W., Foxon, B., Blackledge, J., Turner, M.: Application of the Hough Transform: Iris and Fingerprint Identification. In: Horwood, E. (ed.) Third IMA Conference in Image Processing Mathematical Methods, Algorithms and Applications Processing III, pp. 69–81 (2001) 7. Toft, P.: The Radon Transform - Theory and Implementation, Ph.D. thesis. Department of Mathematical Modelling, Technical University of Denmark (June 1996) 8. Ganeshan, B., Theckedath, D., Young, R.C.D., Chatwin, C.R.: Biometric iris recognition system using a fast and robust iris localization and alignment procedure. Optics and Lasers in Engineering 44, 1–24 (2006) 9. Roy, K., Bhattacharya, P.: Iris Recognition: A Machine Learning Approach. VDM Verlag Saarbrücken, Germany (2008) 10. Hollingsworth, K., Peters, T., Bowyer, K.W., Flynn, P.J.: Iris Recognition Using SignalLevel Fusion of Frames From Video. IEEE Transactions on Information Forensics and Security 4(4), 837–848 (2009) 11. Massimo, T., Stan, L., Chellappa, Z.: Handbook of Remote Biometrics for Surveillance and Security Series. In: Rama (ed.) Advances in Pattern Recognition (2009) 12. Yingzi, D., Lves, R.W., Etter, D.M., Welch, T.B.: Biometric signal processing laboratory. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP apos;04), vol. 5(1), pp. 1025–1028 (2004) 13. Ma, L., Wang, Y., Tan, T.: Iris recognition using circular symmetric filters. In: Proceedings of 16th International Conference on Pattern Recognition, vol. 2, pp. 414–417 (2002) 14. Lipinski, B.: Iris Recognition: Detecting the Iris. Journal of Medical and Biological Sciences, Scientific Journal International (2004) 15. Baig, A., Bouridane, A., Kurugollu, F., Qu, G.: Fingerprint-Iris Fusion based Identification System using a single Hamming Distance Matcher. International Journal of Bio-Science and Bio-Technology 1(1) (2009) 16. Hong, L., Jain, A., Pankanti, S.: Can Multibiometrics Improve Performance? In: Proceedings AutoID’99, Summit, NJ, pp. 59–64 (1999) 17. Jain, A.K., Hong, L., Kulkarni, Y.: A Multimodal Biometric System using Fingerprint, Face and Speech. In: 2nd International Conference on Audio and Video-Based Biometric Person Authentication, Washington, D.C., pp. 182–187 (1999) 18. Wang, Y., Tan, T., Jain, A.K.: Combining Face and Iris Biometrics for Identity Verification. Springer, Heidelberg (2003) 19. Lumini, A., Nanni, L.: When Fingerprints are Combined with Iris – Case Study: FVC 2004 and CASIA. International Journal of Network Security 4(1), 27–34 (2007)
Aerial Threat Perception Architecture Using Data Mining

M. Anwar-ul-Haq, Asad Waqar Malik, and Shoab A. Khan

Department of Computer Engineering, College of E&ME, National University of Science and Technology, Islamabad, Pakistan
[email protected], [email protected], [email protected]
Abstract. This paper presents a design framework based on a centralized, scalable architecture for effective simulated aerial threat perception. Data mining and pattern classification techniques are incorporated into this framework. The paper focuses on effective prediction by relying on the knowledge base and finding patterns for building decision trees. The framework is designed flexibly to integrate seamlessly with other applications. The results show the effectiveness of the selected algorithms and suggest that the more parameters are incorporated into the decision making for aerial threats, the better the confidence level in the results. Accurate target prediction requires decisions based on multiple factors, and multiple techniques used together help in finding an accurate threat classification and give better confidence in the results.

Keywords: Aerial Threat Architecture, Aerial Threat Data Mining, Centralized Architecture, Simulated threat architecture, Threat perception Architecture.
1 Introduction

Many aircraft around the world use techniques that help them avoid radar, ideally becoming invisible to it. Some use radar-absorbent materials that return very little electromagnetic energy. Returning very little energy to the radar makes the aircraft look like noise or birds in the air, as the cross section shown is greatly reduced in size [1]. Most fighter jets use special paint on their thick edges, where the likelihood of radar waves returning is highest [2]. Some techniques use the shape of the plane to reduce the cross section by tilting the dihedral or trihedral angles. To counter these measures there are techniques known as low-frequency radars, multiple transmitters, sonic boom, and infrared [3, 4]. Stealth planes reduce their cross section by using these techniques to look like a bird or noise in the air; the B-2 Spirit and the F-117 Nighthawk are two very well known stealth planes [5]. Accurate threat detection is based on a signature of the aircraft, which holds unique characteristics or behavior of each plane that can help to identify it. Planes are broadly classified based on experience, trends, and a personal knowledge base; these classifications are fighters, helicopters, transporters, and bombers. Threat perception is a need of every country, which is why a lot of work is being carried out in this field, so that countries can be self-sufficient and strengthen their defence against any potential threats.
Some traditional techniques use aerial images and statistical algorithms for detection via image processing [6]. Other techniques predict the radar signature from the aircraft engines [2]. Work has also been done to predict the radar cross section of fighter planes accurately using Rayleigh distribution fluctuations [7]. Our approach is different: it builds an architecture that makes decisions from the knowledge base via data mining algorithms.
2 Proposed Model

Our proposed model is based on a scalable, distributed architecture whose major entities are the Threat Simulator, the Interfacing Module (IM), the Centralized Processing Module (CPM), and the Data Centric Layer (DCL). The model takes special care of the decision time and returns results in real time, with minimal storing and accessing of files and databases. For our data mining decision making we use the WEKA API [8], developed in Java, while our threat simulator was already working in C#, generating the flight patterns. From WEKA we can use Instances, filters, classifiers, etc. in our application context and visually model the results. To communicate between modules written in different programming languages, we incorporated an Interfacing Module (IM) into our model; this module runs on the simulator machine. There were interoperability issues with data types when communicating between modules in different programming languages through sockets, so our model adopts standard XML parsing as the communication solution. Other options for communicating between two different modules include writing wrapper classes at both ends [9] or using third-party applications to create objects from added dynamic link libraries (DLLs) or JAR files [10]. Figure 1 shows the model we propose using XML string objects. We already had a working threat simulator that generates an XML file containing many simulated flight parameters. The interfacing module (IM) running on the simulator machine is used for interoperability between the Threat Simulator and the Centralized Processing Module. It creates an XML string object from the simulated flight patterns, converts it to bytes, and sends it over a UDP port to the Centralized Processing Module (CPM) server. The CPM converts the bytes back to a string and uses the XML parser to process the strings. The CPM uses a JDBC driver for connectivity with the Data Centric Layer, processing the XML tags and fetching the results. We have a database of more than 70 planes with more than 40 attributes per plane for decision making, which will grow with time. The Centralized Processing Module fetches the processed string results, converts them to bytes, and sends them back over the network to the UDP client interfacing module. In this whole procedure there is no writing to an XML file, to keep all the processing in real time; for quick results, XML string objects are created, except for the first step, where data is initially fetched from the XML file. The Java desktop application also uses the Data Centric Layer to get data patterns from the knowledge base. It uses the ARFF (Attribute-Relation File Format) format, where data is kept as attributes, a relation, and instances. Based on the data instances, a decision is classified into a group. We use the C4.5 data mining algorithm (the J4.8 implementation in WEKA) for our classification. A classification tree is generated based on the provided data instances.
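A minimal sketch of the IM-to-CPM exchange described above: the flight-pattern XML is sent as a UTF-8 byte array in a UDP datagram and decoded back into a string on the server. The host name, port number, and XML tag names are placeholders, not values taken from the paper.

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class ImClient {
    public static void main(String[] args) throws Exception {
        String xml = "<Track><Speed>2405</Speed><Ceiling>18000</Ceiling></Track>"; // illustrative tags
        byte[] data = xml.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            // IM side: convert the XML string object to bytes and send it over the UDP port
            socket.send(new DatagramPacket(data, data.length,
                    InetAddress.getByName("cpm-host"), 5000));
        }
    }
}

class CpmServer {
    public static void main(String[] args) throws Exception {
        try (DatagramSocket socket = new DatagramSocket(5000)) {    // CPM server waits for IM requests
            byte[] buf = new byte[4096];
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            socket.receive(packet);
            String xml = new String(packet.getData(), 0, packet.getLength(), StandardCharsets.UTF_8);
            System.out.println("Received: " + xml);                 // hand the string to the XML parser
        }
    }
}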
Fig. 1. Threat Perception Architecture
3 Methodology

For automatic target recognition, the radar cross section is one major factor: it determines the size of the blip, which is the return of electromagnetic waves from the surface of the target. The radar cross section gives us the size of the cross section, the material of the target, the absolute and relative size, and the incident angle [11]. The radar cross section depends on the geometric cross section, that is, the area presented by the target to the radar at the instant the radar waves hit it. It is one of the major factors that can be used to broadly classify the type of threat, e.g., helicopter, fighter, transporter, passenger, drone, missile, etc. Finding the radar cross section accurately is itself very difficult, since it depends on the geometric cross section of the plane and changes a lot for maneuverable planes like fighters. That is why a range is used for each type of plane classification. Some planes also use radar-absorbent materials to return a very small cross section, equivalent to the size of birds or to noise.
Another factor that can be used for planes is the heat signature of each plane. Just as sonar noise signatures are used for submarines, heat signatures can be used for planes, based on the temperature of the heat they release. For this factor, electromagnetic radars do not work and we need infrared radars. Thermal images can be used to obtain a visual picture of the plane and to detect the emissivity of its different parts; the apparent temperature difference at a point can be detected for heat signatures. The maximum speed of a plane is also one of the direct measurements that helps a lot in plane classification: speed can be measured from the distance covered between time intervals. There is also a limit to the ceiling each plane can attain; some planes fly very high while others stay close to the ground to avoid radar, so the plane ceiling can be incorporated into the decision making as well. Combining all these factors in the decision making can produce a very accurate classification. At present our research incorporates the speed and the ceiling available from our simulator, and all decisions are made on them. We use a confidence factor to quantify the accuracy of identification and the level of trust in our prediction. Using these two measurements, we define the confidence level to be at most 50%, because more factors need to be incorporated for 100% accurate target identification. Our confidence factor is made up of flight plans (5%) + ceiling (10%) + heat signature (10%) + thermal images (10%) + cross section (25%) + speed (40%). All these factors are further graded based on the offset: e.g., if the speed (40%) is found to match that of the simulator exactly, the full 40% is returned for the speed factor; if it is within +200 km/hr, the confidence in that factor's decision is reduced to 35%, and the same applies to all the factors. For the decision making, and to find accurate patterns in our knowledge base, we maintain a database of more than 70 planes and incorporate as many factors as we can for all types of planes. Some of the factors include: height, speed, RCS, reflectivity, directivity, geometric cross section, temperature, emissivity, background, crew, length, maximum takeoff weight, empty weight, loaded weight, maximum payload, wing span, fuel, zero-fuel weight, ferry range, combat range, rate of climb, wing load, thrust/weight ratio, material, angle of climb, wing area, ceiling, plane variants, year introduced, manufacturer, role, primary users, mission types, power plant, and armament. The data mining algorithm has to be selected based on whether the response variables are continuous or categorical; we have to take care of the dependent variable and make the decision based on it. Some of the best-known algorithms include classification (CHAID, CART, QUEST, ID3, C4.5/5.0), regression (logistic regression, discriminant analysis, linear models), and neural networks (back propagation, conjugate gradient, quasi-Newton, and genetic algorithms) [12]. Classification algorithms build a decision tree (both a visualization tree and a text-based tree in WEKA). QUEST and CART build only binary classification trees, while CHAID and C4.5 build non-binary trees containing multiple branches. We use the C4.5 algorithm on the database values to find the predictions and make the decisions.
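A small sketch of the confidence-factor idea described above, restricted to the two factors currently available (speed, weighted 40%, and ceiling, weighted 10%); the 5-percentage-point step per offset band is an illustrative reading of the "+200 km/hr reduces 40% to 35%" rule, not a formula given in the paper.

public class ConfidenceFactor {

    // weight awarded for one factor: full weight on an exact match, reduced as the offset grows
    static double factorScore(double observed, double reference, double band, double fullWeight) {
        double offset = Math.abs(observed - reference);
        int bandsAway = (int) Math.ceil(offset / band);     // 0 means an exact match
        return Math.max(0.0, fullWeight - 5.0 * bandsAway);
    }

    public static void main(String[] args) {
        double speedScore   = factorScore(2405, 2450, 200, 40.0);    // speed weight 40%, band 200 km/h
        double ceilingScore = factorScore(18000, 18300, 1000, 10.0); // ceiling weight 10%, band 1000 m
        double confidence = speedScore + ceilingScore;               // at most 50% with only these two factors
        System.out.printf("confidence = %.1f%%%n", confidence);
    }
}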
4 Results

First, the XML file is generated by the threat simulator, where a single track is written with the ceiling and speed tags. A sample of the XML file being used is given below:

<Speed>2405
18000

A continuous update is written to this file by the Threat Simulator based on the flight parameters. Second, the Interfacing Module reads the XML shown above, makes an XML string object, converts it into bytes, and sends it over the UDP port for the Centralized Processing Module server to process. The UDP Central Processing Module server is always on and ready to receive client requests for processing.
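A sketch of how the speed and ceiling values could be pulled out of such a track with the standard Java DOM parser; the <Track> and <Ceiling> tag names are assumptions, since only the speed tag appears in the excerpt above.

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class TrackParser {
    public static void main(String[] args) throws Exception {
        String xml = "<Track><Speed>2405</Speed><Ceiling>18000</Ceiling></Track>"; // assumed full form
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        // read the two tag values the decision making currently relies on
        double speed   = Double.parseDouble(doc.getElementsByTagName("Speed").item(0).getTextContent());
        double ceiling = Double.parseDouble(doc.getElementsByTagName("Ceiling").item(0).getTextContent());
        System.out.println("speed = " + speed + " km/h, ceiling = " + ceiling + " m");
    }
}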
Fig. 2. Java application results
Third, the CPM server receives the bytes at the UDP port and converts them back to a string. It then parses the string for the ceiling and speed values and queries the Data Centric Layer (figure 2 above). The results show the types of planes that match the tag values, or are closest to them, along with the confidence factor for these values. These values are fetched directly after comparing the parsed values with the Data Centric Layer. The planes shown in figure 2 are those whose speed and ceiling values lie within offsets of +200 km/hr and +1000 meters of the tagged values, respectively. Because of this offset, our confidence level is less than 50%. These values are fetched based on both the speed and the ceiling values being satisfied.
Fig. 3. Arff file creation from Data Centric Layer
Once the tags are processed, an XML string object is again created, converted to bytes, and sent back to the IM client. The IM client follows the same procedure as the Centralized Processing Module for parsing the tags. The Java desktop application takes its input from the Data Centric Layer and populates the threat.arff file, which is read through the WEKA API in the application. The ARFF file is written after fetching the parameters from the database. Figure 3 above shows the relation, attributes, and instances generated by the desktop application in the ARFF file format for processing by the WEKA API. All the instance attributes in this application are currently speed and ceiling, which will be extended in future to more parameters.
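A minimal sketch of building the C4.5 tree from the generated ARFF file through the WEKA API (J48 is WEKA's C4.5 implementation); the file name and the position of the class attribute are assumptions based on the description above.

import java.io.BufferedReader;
import java.io.FileReader;
import weka.classifiers.trees.J48;
import weka.core.Instances;

public class ThreatTree {
    public static void main(String[] args) throws Exception {
        Instances data = new Instances(new BufferedReader(new FileReader("threat.arff")));
        data.setClassIndex(data.numAttributes() - 1);   // class (plane type) assumed to be the last attribute
        J48 tree = new J48();                           // C4.5 / J4.8 classifier
        tree.buildClassifier(data);
        System.out.println(tree);                       // prints the text-based decision tree
    }
}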
Fig. 4. Speeds and classes for the input planes
Based on our inputs to the WEKA API, the generated visualization graph shows the minimum and maximum speeds and ceilings and their averages, and broadly classifies them into the input classes, as shown in figure 4 above. Of the 54 planes in this input, more than 36 have speeds below 1000 km/hr (figure 4), which broadly corresponds to the helicopters, transporters, and passenger planes; the rest are mostly fighters. The second graph shows that more than 35 planes have a ceiling below 10,000 meters. Another graph shows the total number of input classes (4) and the number of planes belonging to each class (54 divided into 18, 17, 1, and 18). The purpose of the Java desktop application is to get the patterns from the WEKA classifiers; we use J4.8 to extract the patterns from our input file. The patterns found for the tree from our database of planes are shown below. Figure 5 shows the text-based tree: speeds below the threshold of 355 km/hr give helicopters, while higher speeds are further classified on the basis of ceiling. For higher speeds with ceilings less than 13,716 meters we have transport planes, and for higher speeds with ceilings greater than 13,716 meters we have fighters. The number of leaves in the tree is 3 and the total size of the tree, counting all nodes, is 5. Some other statistics for the data are shown, including the confusion matrix, etc.
Fig. 5. Data Mining patterns results
Figure 6 below shows the same text-based tree as a more appealing visual diagram, clearly classifying the different planes into specific groups based on the numeric values of speed and ceiling.
Fig. 6. Visualization graph for the planes
5 Conclusion

The purpose of this research is to provide an architecture for our current needs. We built a cross-platform, distributed, and scalable architecture that returns results quickly, in real time, minimizing the role of saving to files and databases. It is thus an efficient solution that accommodates the future need to add as many parameters as required for the decision making. Currently the planes are classified very broadly, which will be improved to reach a more accurate classification. Future work on this project will quantify and improve the CPM time and the overall decision-making time, and will incorporate more and more features into the database to improve the decision making and hence the proposed confidence level. A good addition to this work would be to devise a logit function for the measurement parameters and add it to our confidence criteria. The next level for this project could be direct weapon assignment by the control systems, assigning weapons based on the identified target.
References 1. Lederer, P.G.: An introduction to Radar Absorbent Materials (RAM), Great Britain, Royal Signals and Radar Establishment (1986) 2. Zdunek, A., Rachowicz, W.: Cavity Radar Cross Section Prediction. IEEE Transaction on Antennas and propagation 56(6) (2008) 3. Przemieniecki, J.S.: Critical technologies for national defense, US. American Institute of Aeronautics and Astronautics (1991) 4. Youssef, N.Y.: Radar Cross Section of Complex Targets. Proceedings of IEEE 77(5) (1989) 5. Jarrett, P.: Faster, further, higher: leading-edge aviation technology since 1945, United Kingdom, Putnam Aeronautical Books (2002) 6. Stephen, W.Y., Marshall: A Novel approach for Automatic Aircraft detection. In: Proceedings of UK European Signal Processing Conference, Tampere, Finland, September 4-8 (2000) 7. Vasserot, T.P.: The Jet fighter Radar Cross Section. Proceedings of IEEE Transaction on Aerospace and Electronic Systems (1975) 8. Witten, I.H., Frank, E.: Data Mining: Practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann Publishers, San Francisco (2005) 9. Bishop, J., Horspool, R.N., Worrall, B.: Experience in integrating Java with C# and.NET. Concurrency and Computation: Practice and Experience 17, 663–680 (2005) 10. Jaimez, C., Lucas, S.: Web Objects in XML. Efficient and easy XML serialization of Java and C# objects, http://woxserializer.sourceforge.net/index.html 11. Dybdal, R.B.: Radar Cross Section Measurements, Aerospace Corp electronics research Lab (1986) 12. Han, J., Kumber, M.: Data Mining: Concepts and techniques. Morgan Kaufmann Publishers, Urbana-Champaign US (2006)
Payload Encoding for Secure Extraction Process in Multiple Frequency Domain Steganography

Raoof Smko, Abdelsalam Almarimi, and K. Negrat

Higher Institute of Electronics Engineering, Baniwalid, Libya
[email protected], [email protected], [email protected]
Abstract. In this paper we use a new technique for encoding the payload of text hidden in a bitmap image. The technique is based on using indexes into a dictionary representing the characters of the secret message instead of the characters themselves. It uses multiple frequency domains for embedding these indexes in an arbitrarily chosen bitmap image, by using the discrete cosine transform (DCT), the discrete wavelet transform (DWT), and a combination of both. We tested the technique in a software package designed specially for it and obtained very good results in terms of hiding capacity and imperceptibility, which are the two most important properties of steganography, as well as the time needed to hide the text and the security, especially through a new approach to payload encoding that gives the technique powerful behavior on both the encoding and the extraction sides. Imperceptibility is improved by increasing the PSNR (between 106.58 and 122.28 dB); security is improved by using an encrypted stego-key and a secret message encrypted by the dictionary; capacity is improved by a factor of 1.3, by encoding the secret message characters with only 6 bits and embedding the secret message in all three color components (RGB); and efficiency is improved by having an embedding/extraction time measured in milliseconds (for example, for a 256x256 Lena image it is 638 ms in the worst case).

Keywords: Steganography, DCT, DWT, PSNR.
1 Introduction

Steganography is the art and science of invisible communication. This is accomplished by hiding information inside other information, thus hiding the existence of the communicated information. The word steganography is derived from the Greek words "stegos", meaning "cover", and "grafia", meaning "writing" [1], defining it as "covered writing". In image steganography the information is hidden exclusively in images. Steganography differs from cryptography in that cryptography focuses on keeping the contents of a message secret, whereas steganography focuses on keeping the existence of the message secret [2]. Steganography and cryptography are both ways to protect information from unwanted parties, but neither technology alone is perfect, and both can be compromised. Once the presence of hidden information is revealed, or even suspected, the purpose of steganography is partly defeated [3]. The strength of steganography can thus be amplified by combining it with cryptography.
Two other technologies that are closely related to steganography are watermarking and fingerprinting [4]. These technologies are mainly concerned with the protection of intellectual property, so their algorithms have different requirements than steganography. In watermarking, all of the instances of an object are "marked" in the same way. The kind of information hidden in objects when using watermarking is usually a signature to signify origin or ownership for the purpose of copyright protection [2]. With fingerprinting, on the other hand, different, unique marks are embedded in the distinct copies of the carrier object that are supplied to different customers. This enables the intellectual property owner to identify customers who break their licensing agreement by supplying the property to third parties [3]. This paper offers a new approach for designing the payload to be loaded into a bitmap image in image steganography, using multiple frequency domains with dictionary-based text embedding [5].
2 Related Studies

In 1992, Kurak and McHugh [11] suggested a new steganographic system called image downgrading, which is a special case of the substitution methods and in which images act as both secret and cover. Given a secret image and a cover image of the same dimensions, the sender replaces the four least significant bits of the cover image with the four most significant bits of the secret image. The receiver extracts the four least significant bits of the stego cover, thereby gaining access to the four most significant bits of the secret image. Zhao and Koch [12] present another steganographic scheme, used for binary (black and white) images or digitized fax data, in which they divide the binary image into rectangular blocks and calculate the percentage of zeros and ones in each block. Another simple technique, presented by Matsui and Tanaka [13], is used for binary fax images, because fax images are encoded using run-length and Huffman coding. They make use of the run-length technique, in which pixels appear in runs: a run of white pixels is followed by a run of black pixels, and an odd number of such runs represents a 1 while an even number represents a 0. In the transform domain there are some other techniques that can be listed as related work. Zhao and Koch [12] used the DCT in steganography: during the embedding process the sender splits the cover image into 8x8-pixel blocks, and each block encodes one secret bit. Before communication starts, the sender and receiver agree in advance on the locations of two DCT coefficients that will be used in the embedding process. A comparison is made between these two values: to embed the secret bit 1, the first value should be greater than the second, and if it is not, their positions are swapped; likewise, to embed a 0, the second value should be greater than the first, and if it is not, they are swapped. A development of the previous DCT method by Zhao and Koch [12, 14] uses three points instead of two for the comparison: to hide a secret 1, p1 > p2 and p1 > p3, while to embed a 0, p1 < p2 and p1 < p3.
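The two-coefficient comparison attributed to Zhao and Koch [12] above can be sketched as follows; the block is assumed to already hold its 8x8 DCT coefficients, and the coefficient positions are whatever the sender and receiver agreed on beforehand.

public class TwoCoefficientEmbed {
    // (u1,v1) and (u2,v2) are the coefficient positions agreed on by sender and receiver
    static void embedBit(double[][] dct, int bit, int u1, int v1, int u2, int v2) {
        boolean oneEncoded = dct[u1][v1] > dct[u2][v2];     // the current ordering encodes a 1
        if ((bit == 1) != oneEncoded) {                     // ordering disagrees with the secret bit
            double tmp = dct[u1][v1];                       // swap the two coefficients
            dct[u1][v1] = dct[u2][v2];
            dct[u2][v2] = tmp;
        }
    }

    static int extractBit(double[][] dct, int u1, int v1, int u2, int v2) {
        return dct[u1][v1] > dct[u2][v2] ? 1 : 0;           // the receiver only compares the two positions
    }
}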
3 Research Methodology

Rather than merely reviewing, comparing, and analyzing previous steganography techniques, the proposed stego-system focuses on three approaches: first, it mixes two different frequency domains (DCT and DWT) to make it more imperceptible, robust, and secure; second, it uses dictionary-based data embedding (a special dictionary encrypts the data before embedding) to make data recovery impossible for a third party; and third, it makes data hiding completely recoverable and secure by encoding a proper payload that is used in the recovery process.
4 The New Approach

In this paper we use a technique based on working with bitmap (BMP) images [6]. We use various transform techniques to be more efficient and obtain more valuable results. The contributions of this technique lie in the security viewpoint, especially in the design of the stego key: such a key is used in the embedding stage as well as in the extraction stage, giving more robustness to the steganography process and making it more difficult for third parties to attack and recover the secret message. The proposed stego system is a dictionary-based image steganography system, which means it depends on a system-specific dictionary for encoding the characters of the secret message into binary values, rather than using well-known character codes like Unicode or ASCII; this adds data encryption to the stego system. The system works on colored bitmap images rather than gray-scale images, which provides more capacity by hiding the secret message in the three color layers of the representing matrix (RGB) [7]. Frequency domains are good platforms for image steganography; they provide a wider space to work in than LSB insertion in the spatial domain. Instead of using a single frequency domain, the proposed system provides multiple frequency domains, such as using DWT with DCT before the embedding; for the DWT, two types are provided, Haar and Daubechies-4 [8].

4.1 Architecture of the Proposed System

The system is made of two major parts: the Embedding and the Extraction process. The Embedding process hides the secret message inside the cover image, and the result is the stego image. The other process, Extraction, extracts the embedded secret message from the stego image, taking some security issues into account. Embedding the secret message goes through several steps. It begins with loading the image (because the system uses a colored bitmap image, it loads the image into a 3D matrix with the Red, Green, and Blue values obtained by decomposing the pixel values) and normalizing its dimensions based on the selected technique (DCT or DWT+DCT). This step is done to make the image dimensions (width and height) fit the algorithm that transforms the image from the spatial domain to the frequency domain, by taking the largest suitable part of the image and clipping the few improper extra pixels [9].
After normalizing the image dimensions, depending on the selected technique, the image goes to the DCT block, if DCT is chosen, which takes the discrete cosine transform of the normalized image pixel matrix, or to the DWT block, if DWT+DCT is chosen, to take the discrete wavelet transform (Haar or Daubechies-4, depending on the selection) and then to the DCT block; in this way two frequency domains are used to increase the imperceptibility and security level [10]. Now the image is in the transform (or frequency) domain and it is passed to the Embedding block. Before explaining the Embedding block, there are two other inputs to the proposed stego system, which are the Secret Message (the text message to be hidden inside the cover image) and the Stego Key (the key used for security; it is required on the receiving side to extract the message, and without it the second party cannot recover the hidden secret message).

4.2 The Security in the New Approach

From the security point of view, we use what we call a Stego Key. We add the facility of having a Stego Key (or password), which makes the secret message password protected; to extract the secret message from the stego image the correct password should be entered, otherwise no message is recovered. We also encrypt the data before hiding it in the cover image: a special dictionary is used for encoding the secret message characters, by taking the address of each character in the dictionary rather than its well-known character code, and it is used in both the embedding and the extraction process, which makes extracting the secret message through steganalysis impossible even if the embedding process is known. Other techniques are used for efficiency, such as data compression: before the encoding process, the secret message goes through a white-space remover (for example, removing extra contiguous repeated spaces, tabs, and line endings) and also through a filter that transforms all characters to lower case (this is also useful for saving space in the dictionary, by having only lower-case characters there). For payload encoding, to make our extraction process exact and the retrieval of the secret message reliable, the payload header is 99 bits long and holds the information shown in figure (1).
[Figure 1 shows the payload layout: a 99-bit header containing the Password, Message Length, Used Quarter and Technique fields (field widths of 42, 36, 16, 3 and 2 bits appear in the figure), followed by the Message field of message length x 6 bits.]
Fig. 1. The format of Payload Encoding
Each character of the secret message is encoded with 6 bits instead of 8 in order to improve the capacity, because steganography in the frequency domain suffers from low capacity, especially the hybrid method (DWT+DCT).
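A minimal sketch of this message pre-processing and 6-bit dictionary encoding follows (the dictionary contents shown are only an illustrative assumption; the paper merely requires a system-specific table of at most 64 lower-case symbols):

    import re

    # Illustrative dictionary: lower-case letters, digits, space and punctuation (<= 64 symbols)
    DICTIONARY = "abcdefghijklmnopqrstuvwxyz0123456789 .,;:!?'\"()-+/="

    def preprocess(message):
        message = message.lower()                      # lower-case filter
        return re.sub(r"\s+", " ", message).strip()    # white-space remover

    def encode_6bit(message):
        bits = []
        for ch in preprocess(message):
            index = DICTIONARY.index(ch)               # address of the character
            bits.extend(int(b) for b in format(index, "06b"))
        return bits

    def decode_6bit(bits):
        chars = [DICTIONARY[int("".join(map(str, bits[i:i + 6])), 2)]
                 for i in range(0, len(bits), 6)]
        return "".join(chars)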
5 Experimental Results

The experiments were done on the Lena image, a colored bitmap of 512x512 pixels and 768 KB. The results are summarized in the comparison shown in Table 1. The comparison covers the different techniques used in the experiments in terms of the quarter used in the transformation, the maximum capacity (measured in characters), the secret message size (measured in characters), the time taken to embed the message in the cover image (the time depends on the specification of the computer used to run the program), and finally the number of bits that changed between the cover image (before embedding) and the stego image (after embedding).

Table 1. Comparison Table for the Used Techniques

Used Technique   Used Quarter   Maximum Capacity (characters)   Time (ms)    Bits changed
DCT              -              2031                            719-750      0.064
DWT (Haar)       HH             495                             780-920      0.025
DWT (Haar)       LH             495                             780-920      0.063
DWT (Haar)       HL             495                             780-920      0.062
DWT (Haar)       HH+LH          1007                            803-1031     0.085
DWT (Haar)       HH+HL          1007                            803-1031     0.083
DWT (Haar)       LH+HL          1007                            803-1031     0.094
DWT (Haar)       HH+HL+LH       1519                            920-1115     0.103
DWT (Daub4)      HH             495                             312-500      0.018
DWT (Daub4)      LH             495                             312-500      0.018
DWT (Daub4)      HL             495                             312-500      0.020
DWT (Daub4)      HH+LH          1007                            469-500      0.044
DWT (Daub4)      HH+HL          1007                            469-500      0.044
DWT (Daub4)      LH+HL          1007                            469-500      0.048
DWT (Daub4)      HH+HL+LH       1519                            590-850      0.055
The experiments were carried out using a special software package developed specifically for the new steganographic approach. The software was written in Microsoft J# 2005 Express Edition, which offered good flexibility for designing the GUI of the application and is fast enough, even though image processing programs usually need a long execution time. All experiments were run on a PC with a 2.0 GHz CPU, 1 GB of RAM, a 120 GB hard disk and a 512 MB graphics card, so all results that involve timing are relative to this configuration. Figure 2 shows the maximum capacity (measured in characters) for the techniques used in the approach. Frequency-domain steganography
suffers from low capacity; for example, with DCT we can embed only one bit per 64 pixels (an 8x8 block). Our approach improves on this by using 6 bits per character instead of 8 and by embedding the message in all three layers (RGB).
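The reported DCT capacity for the 512x512 Lena cover can in fact be reproduced from these figures (one bit per 8x8 block, three colour layers, a 99-bit header and 6 bits per character):

    blocks_per_layer = (512 // 8) * (512 // 8)      # 4096 blocks, one bit each
    total_bits = 3 * blocks_per_layer               # 12288 bits across R, G and B
    payload_bits = total_bits - 99                  # subtract the payload header
    print(payload_bits // 6)                        # 2031 characters, as in Table 1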
[Figure 2 is a bar chart of the maximum capacity: 2031 characters for DCT, and 495, 1007 and 1519 characters for DWT+DCT when one, two and three quarters are used, respectively.]
Fig. 2. The maximum capacity in characters for the techniques used
The stego images show high imperceptibility, with no noticeable distortion at all; accordingly the measured results are MSE = 0 and PSNR = infinity dB, and the file size remains the same. Another way of showing the difference between the stego image and the cover image is the bit difference, i.e. the number of bits changed by the embedding process. Figure 3 compares the bit difference between the cover and stego images after embedding the secret message for the different techniques used.
[Figure 3 is a bar chart of the bit difference, computed over a quarter and over the full image, for DCT, Haar+DCT (HH), Haar+DCT (HH+HL+LH), Daub4+DCT (HH) and Daub4+DCT (HH+HL+LH); the plotted values range from about 0.004 to 0.103.]
Fig. 3. The bit difference between cover and stego image
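For reference, the distortion measures discussed here can be computed as follows (a sketch under the assumption that the "bit difference" is the fraction of differing bits between the two 8-bit images):

    import numpy as np

    def mse_psnr(cover, stego):
        err = np.mean((cover.astype(np.float64) - stego.astype(np.float64)) ** 2)
        psnr = float("inf") if err == 0 else 10 * np.log10(255.0 ** 2 / err)
        return err, psnr

    def bit_difference(cover, stego):
        # fraction of bits that differ between the cover and the stego image
        diff = np.bitwise_xor(cover, stego)
        changed = sum(np.count_nonzero((diff >> k) & 1) for k in range(8))
        return changed / (diff.size * 8)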
The execution of steganography programs usually takes a long time, as practitioners working in steganography know well, because of the large number of operations these programs perform; our approach includes some improvements in this respect, and the following results were obtained. Figure 4 shows the time required by the various techniques on the hardware environment mentioned above.
[Figure 4 is a bar chart of the embedding time, over a quarter and over the full image, for DCT, Haar+DCT (HH), Haar+DCT (HH+HL+LH), Daub4+DCT (HH) and Daub4+DCT (HH+HL+LH); the plotted times range from about 312 ms to 1115 ms.]
Fig. 4. The time required for various techniques used
6 Conclusion and Future Works

• In all techniques and quarter selections the MSE reaches a very small value and the PSNR lies between 80 and 120 dB, which shows that the distortion of the image is very small.
• There is no noticeable change in the luminosity histogram between the cover and the stego image, which protects the stego image against attackers who inspect the luminosity histogram for changes.
• For imperceptibility (bit difference): all techniques are imperceptible enough, having a high PSNR value, but we measured the bit difference as an extra parameter. Over many experiments the best technique was Daubechies-4 with DCT, then Haar with DCT, and then DCT alone; the best quarter for embedding in DWT+DCT is HH.
• For embedding time: the best technique is Daubechies-4 with DCT, then DCT, and then Haar with DCT.
• The cover and stego images have the same file format (BMP), file size (KB) and dimensions; no change is made in these respects.
• The embedded secret message is 100% recoverable without error (BER = 0), which is a good result and a real challenge in the frequency domain, where values are subject to losses from the transform and from rounding during both embedding and extraction.
• No secret message can be recovered from the stego image without supplying the correct password.
• The capacity is improved by a factor of 1.333 by encoding each character of the secret message with 6 bits, and by a further factor of 3 by embedding the message in all three layers (RGB).
• Since the Haar wavelet relies on differencing and averaging for its high-pass and low-pass filters, it is simpler to work with than Daubechies-4, but it loses more in the transform process; Daubechies-4 is more complex and difficult, but it has almost no loss in the transform process.
As ideas for future work, we can use the YCbCr color space instead of RGB and embed the secret message only in the chrominance (Cb, Cr) components, without changing
the luminance, in order to increase the imperceptibility, because the human eye is more sensitive to the luminosity of a color. We can also select three points instead of two for embedding the secret message bits, so that 2 bits are embedded in each 8x8 block of DCT coefficients, doubling the capacity of the cover image. Another idea is to use variable-length codes for the dictionary character codes to reduce the number of bits required to encode the secret message; this can be done with Huffman coding, since the general probabilities of the characters are known, and with a finite-state automaton in the recovery process to decode the bit stream recovered from the stego image. The same idea can be applied to other losslessly compressed image files and to video. The work can also be extended by turning the proposed system into a watermarking system with a small change: selecting the two chosen points from the medium frequencies of the 8x8 block of DCT coefficients to increase the robustness, which is the most important property of any watermarking system. Finally, the DWT can be applied with more than one stage, e.g. a 2- or 3-stage DWT, to increase the capacity.
References [1] Moerland, T.: Steganography and Steganalysis. Leiden Institute of Advanced Computing Science, http://www.liacs.nl/home/tmoerl/privtech.pdf [2] Marvel, L.M., Boncelet Jr., C.G., Retter, C.: Spread Spectrum Steganography. IEEE Transactions on image processing 8, 8 (1999) [3] Wang, H., Wang, S.: Cyber warfare: Steganography vs. Steganalysis. Communications of the ACM 47, 10 (2004) [4] Anderson, R.J., Petitcolas, F.A.P.: On the limits of steganography. IEEE Journal of selected Areasin Communications (May 1998) [5] Johnson, N.F., Jajodia, S.: Exploring Steganography: Seeing the Unseen. Computer Journal (February 1998) [6] Tariq, A., Smko, R., Salem, O.: Dictionary Based Steganography Using Multiple Frequency Domain. Gulf University Journal 1(1) (2009) [7] Owens, M.: A discussion of covert channels and steganography. SANS Institute (2002) [8] Dunbar, B.: Steganographic techniques and their use in an Open-Systems environment. SANS Institute (January 2002) [9] Silman, J.: Steganography and Steganalysis: An Overview. SANS Institute (2001) [10] Lee, Y.K., Chen, L.H.: High capacity image steganographic model. Visual Image Signal Processing 147, 3 (2000) [11] Cachin, C.: An Information-Theoretic Model for Steganography. In: Aucsmith, D. (ed.) IH 1998. LNCS, vol. 1525, p. 306. Springer, Heidelberg (1998) [12] Chen, P.-C.: On the Study of Watermarking Application in WWW-Modeling, Performance, Analysis, and Applications of Digital Image Watermarking Systems. Master Thesis, National Tsing Hua University (May 1999) [13] Kurak, C., McHughes, J.: A Cautionary Note On Image Downgrading. In: Proceedings of IEEE Computer Security Application Conference 1992. IEEE Press, Los Alamitos (1992) [14] Zhao, J., Koch, E.: Embedding Robust Labels into Image for Copyright Protection. In: Proceedings of the International Conference of Intellectual Property Rights for Information, Knowledge and New Techniques. Oldenbourge Verlag, Munchen (1995)
An Implementation of Digital Image Watermarking Based on Particle Swarm Optimization Hai Tao1, Jasni Mohamad Zain1, Ahmed N. Abd Alla2, and Qin Hongwu1 1 2
Faculty of Computer Systems and Software Eng., University Malaysia Pahang, Malaysia Faculty of Electrical and Electronic Engineering, University Malaysia Pahang, Malaysia [email protected], [email protected], [email protected], [email protected]
Abstract. The trade-off between imperceptibility and robustness is one of the main challenges in digital watermarking systems. To address it, a digital image watermarking implementation of an evolutionary algorithm in the discrete wavelet domain is presented. In the proposed scheme, the watermark is embedded in the vertical subband (HL) coefficients in the wavelet domain. Furthermore, the proposed algorithm uses Particle Swarm Optimization to train the scaling factors so as to maximize the watermark strength while keeping the visual distortion low. The experimental results demonstrate the robustness and the superiority of the proposed hybrid scheme.
1 Introduction

With the popularization and development of computer networks and multimedia technologies, multimedia products have proliferated and are vulnerable to illegal possession, duplication and dissemination. Digital watermarking is the process of embedding or hiding digital information, called a watermark, into a multimedia product so that the embedded data can later be extracted from the watermarked product, for the protection of intellectual property rights and the reduction of counterfeiting; the watermark should be indiscernible and hard to remove by unauthorized persons [1]. In recent years the discrete wavelet transform (DWT) has remained one of the most effective and easily implemented techniques for image watermarking [2-4]. In [3], a distance measure between the distorted and undistorted images/video is introduced to determine the distortion, but the algorithm is non-blind. Lee et al. [4] present a genetic-algorithm-based watermarking algorithm in the discrete wavelet transform domain, consisting of wavelet-domain low-frequency watermark insertion and genetic-algorithm-based watermark extraction. However, because most of the energy is concentrated in the lowest-frequency component of the DWT domain, modifying the coefficients of the approximation subband causes serious visual distortion, and the GA suffers from expensive computational cost and low convergence speed.

In this paper, a novel robust blind watermark extraction scheme using PSO in the DWT domain is proposed. The watermark insertion is implemented in the vertical
subband (HL) component of the DWT domain, and particle swarm optimization (PSO) searches for and extracts the watermark automatically. The scheme can also simultaneously optimize multiple scaling factors when embedding the watermark image, to obtain the highest possible robustness without losing transparency.
2 Preliminaries

2.1 Discrete Wavelet Transform (DWT)

The DWT separates an image into a lower-resolution approximation and detail sub-images. At each octave the sub-images are labelled LL (the approximation), which captures the coarse overall shape and the low-frequency component containing most of the energy of the image, and LH (horizontal details), HL (vertical details) and HH (diagonal details), which contain the finer-scale wavelet coefficients of the higher-frequency detail, according to the filters used to generate the sub-images. The next coarse overall shape is obtained by iterating the decomposition on LL1, so that the details (LL1, LH1, HL1 and HH1) at each succeeding octave are one-fourth the size of the previous one. The process is repeated until the desired final scale is reached. In the DWT most of the energy is concentrated in the lowest-frequency component, where an embedded watermark is robust against various attacks but the fidelity of the host image is degraded.

2.2 Particle Swarm Optimization (PSO)

The basic idea of the classical particle swarm optimization (PSO) algorithm is the exchange of information about the global and local best values. Assume that the goal is to maximize an objective function f. Potential solutions, called particles, correspond to individuals, and each particle is also assigned a randomized velocity. Each particle flies through the D-dimensional problem space with a velocity that is dynamically adjusted according to the flying experience of the particle itself and of its colleagues. The location of particle i is Xi = (xi1, xi2, ..., xiD), where each xid lies within the lower and upper bounds ld and ud of dimension d. The best previous position of particle i (the one giving its best fitness value) is recorded as Pi = (pi1, pi2, ..., piD) and is called pbest; the index of the best particle among all particles in the population is denoted g, and its location is called gbest. The velocity of particle i is Vi = (vi1, vi2, ..., viD) and is clamped to a maximum velocity Vmax specified by the user. At each time step, PSO regulates the velocity and location of each particle toward its pbest and gbest locations according to (1) and (2), respectively.
vid(t+1) = w * vid(t) + c1 * r1 * (pid - xid(t)) + c2 * r2 * (pgd - xid(t))    (1)

xid(t+1) = xid(t) + vid(t+1)    (2)

where w is the inertia weight; c1 and c2 are two positive constants, called the cognitive and social parameters respectively; i = 1, 2, ..., m and d = 1, 2, ..., D, with m the size of the swarm; r1 and r2 are two random sequences uniformly distributed in [0, 1]; and t = 1, 2, ..., tmax denotes the iteration number, with tmax the maximum allowable number of iterations.
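A compact sketch of the update rules (1) and (2) is given below (our own rendering; the inertia weight, velocity clamp and iteration budget shown are illustrative values, not those of the paper):

    import numpy as np

    def pso(objective, dim, bounds, m=30, w=0.7, c1=1.0, c2=1.0, v_max=0.2, iters=100):
        """Minimal particle swarm optimizer (maximization) following (1) and (2)."""
        lo, hi = bounds
        x = np.random.uniform(lo, hi, (m, dim))          # particle positions
        v = np.zeros((m, dim))                           # particle velocities
        pbest = x.copy()
        pbest_val = np.array([objective(p) for p in x])
        g = np.argmax(pbest_val)                         # index of the global best
        for _ in range(iters):
            r1, r2 = np.random.rand(m, dim), np.random.rand(m, dim)
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (pbest[g] - x)   # eq. (1)
            v = np.clip(v, -v_max, v_max)
            x = np.clip(x + v, lo, hi)                                     # eq. (2)
            vals = np.array([objective(p) for p in x])
            improved = vals > pbest_val
            pbest[improved], pbest_val[improved] = x[improved], vals[improved]
            g = np.argmax(pbest_val)
        return pbest[g], pbest_val[g]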
3 The Proposed Scheme

It is well known that embedding the watermark information into the lowest-frequency sub-band of the DWT domain seriously affects the transparency of the watermarked image, despite its robustness against various signal processing operations. This section gives a brief overview of the proposed watermark embedding and extraction processes in the DWT domain, which aim at good visual quality and reasonable resistance to various attacks, and describes the PSO optimization of the scheme.

3.1 Watermark Embedding

Suppose the original image is a gray-level image and the watermark is a binary image. The original image is decomposed into an m-level wavelet representation, giving a multi-resolution presentation of detail and approximation subbands as shown in Fig. 1. In view of the visual quality and the robustness, the proposed algorithm embeds the binary watermark into the HL component as follows:

1. In the embedding process, the original image is decomposed into m-level sub-bands, producing a series of multi-resolution fine sub-bands LHi, HLi and HHi (i = 1, 2, 3) and the coarse overall shape LLm. The HL component is divided into non-overlapping blocks of size 2 x 2.

2. The watermark information is pretreated in order to remove the correlation between watermark image pixels and to enhance the robustness and security of the system. Because it has lower computational complexity and an easily obtained inverse compared with the Arnold transform, the watermark image is scrambled with an affine transform (equation (3)).
To enhance the statistical imperceptibility of the embedded watermark, the 0/1 values of the scrambled watermark image are mapped to +/-1 and modulated by a binary pseudo-random sequence to generate the new watermark.

3. In the HL component, one watermark bit is embedded into each 2 x 2 block. In each block, the maximum and the minimum of the coefficients I(i,j), I(i+1,j), I(i,j+1), I(i+1,j+1) are calculated, and the sub-band coefficients are then modified according to equation (4), in which the scaling factor weights each embedded watermark bit.

4. After the watermark bits have been embedded into the original image, the m-level inverse wavelet transform of the sub-images is performed; this yields the watermarked image.

3.2 Watermark Extraction

The watermark extraction is the reverse procedure of the watermark embedding. It can be summarized as follows:

1. In the extracting process, the watermarked image is decomposed with a 3-level DWT to obtain a series of high-frequency subbands and one high-energy subband.

2. The HL component is divided into non-overlapping 2 x 2 blocks. In each block, x = max{I(i,j), I(i+1,j), I(i,j+1), I(i+1,j+1)} and y = min{I(i,j), I(i+1,j), I(i,j+1), I(i+1,j+1)} are calculated, and Average(i,j) = 0.5 (x + y) is defined. The extracted watermark bit is

w'(i,j) = 1 if I(i,j) >= Average(i,j), and w'(i,j) = 0 if I(i,j) < Average(i,j).    (5)

3. A complete watermark sequence w' is obtained and the inverse affine transform is applied to it, yielding the extracted binary watermark image.

4. After extracting the watermark, a normalized correlation coefficient is used to quantify the correlation between the original watermark and the extracted one. The normalized correlation (NC) between w and w' is defined as

NC = sum_{i,j} w(i,j) w'(i,j) / sqrt( (sum_{i,j} w(i,j)^2) (sum_{i,j} w'(i,j)^2) ),    (6)

where w and w' denote the original watermark and the extracted one, respectively.
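As a small illustration of the extraction rule (5) and the correlation measure (6), the sketch below assumes that the coefficient compared against the block average is the top-left one of each 2 x 2 block; the paper does not spell out which coefficient is used, so that choice is ours.

    import numpy as np

    def extract_bit(block):
        # block is a 2x2 array of HL coefficients; rule of equation (5)
        x, y = block.max(), block.min()
        average = 0.5 * (x + y)
        return 1 if block[0, 0] >= average else 0      # compared coefficient: our assumption

    def normalized_correlation(w, w_prime):
        # equation (6)
        w, w_prime = np.asarray(w, float), np.asarray(w_prime, float)
        denom = np.sqrt((w ** 2).sum() * (w_prime ** 2).sum())
        return (w * w_prime).sum() / denom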
Fig. 1. Three-level wavelet transform
Fig. 2. Diagram for proposed scheme
3.3 Proposed Optimization Process

To achieve the optimal performance of the digital image watermarking algorithm, the developed technique employs the PSO algorithm to search for the optimal parameters. In the optimization process the parameters are the scaling factors, which are tuned for optimal watermarking with respect to both transparency and robustness. In every swarm, each particle represents a possible solution to the problem and is therefore composed of a set of scaling factors. To start the optimization, PSO uses randomly produced initial solutions generated by a random number generator between 0 and 1. For the multi-parameter optimization, the scaling factor in Eq. (4) is the weight of each watermarked bit embedded into the HL sub-band component by the DWT transform, so these weights together form the multiple scaling factors. After the sub-band coefficients of the decomposed host image have been modified using the scaling factors, the watermarked images of the current generation are computed according to the watermark embedding procedure of Section 3.1. To evaluate the quality of the extracted watermark, both a universal quality index (UQI) [5] and NC values are used as performance indices in the objective function: UQI serves as the imperceptibility measure of the output image, while NC serves as the watermark detection (robustness) measure. The maximum objective value V is computed according to equation (7).

The attacks utilized in the objective function evaluation were median filtering, Gaussian noise and rotation, and the optimal scaling factors are obtained by computing the corresponding UQI and NC values. In the PSO simulations, a set of parameter values is fixed: the initial swarm size is 30 and c1 = c2 = 1. The PSO process is repeated until the optimal scaling factors are found. The optimization diagram for digital image watermarking using PSO is shown in Fig. 2.
4 Experimental Results

To evaluate the performance of the proposed watermarking scheme, it was tested on the grayscale 8-bit image "Lena" of size 512 x 512, using a 3-level wavelet decomposition with the Daubechies 9/7 filter coefficients. A 32 x 32 binary image "UMP" is used as the watermark W; to remove the correlation between watermark image pixels and to enhance the robustness and security of the system, it is scrambled with the affine transform. The original image, the watermark and the watermarked image are shown in Fig. 3 (a), (b) and (c), respectively.

Fig. 3. (a) Original image, (b) watermark and (c) watermarked image
A good watermarking scheme should be robust against different kinds of attacks. To illustrate the robustness of our watermarking scheme, robustness tests were run for various signal processing operations: Gaussian filtering (0, 0.003) (GF), median filtering (3 x 3) (MF), sharpening (SP), translation by 30 pixels (TR), rotation by 30 degrees (RT) and cropping of 30% (CP). Table 1 presents the UQI and NC values of the detailed experimental results.

Table 1. The experimental results under different attacks

Attack   GF       TR       MF       RT       SP       CP
UQI      0.9884   0.9903   0.9879   0.9872   0.9845   0.9851
NC       0.8751   0.9102   0.7858   0.7824   0.8928   0.8962
ue of watermark embedding at JPEG compression is eevaIn addition, the NC valu luated over various compresssion factors. By comparison with several existing schem mes, it is evident that the proposeed scheme has better performance than [2] and [6] as shoown in Table 2.
Table 2. The comparison results with existing schemes

JPEG Quality   [2] scheme   [6] scheme   Proposed scheme
90%            0.9624       1            1
75%            0.7953       0.8614       0.9273
60%            0.6572       0.7925       0.8461
40%            0.5872       0.6937       0.8103
5 Conclusions

Obtaining the highest possible robustness without losing transparency is still one of the most challenging issues in digital watermarking. This paper presents an optimal robust image watermarking technique based on the DWT. In this scheme, the watermark is first embedded into the vertical subband (HL) coefficients in the wavelet domain, and the scaling factors, which represent the intensity of the embedded watermark, are then trained by PSO instead of being set heuristically. The experimental results demonstrate that the proposed optimal watermarking scheme is strongly robust to a variety of signal processing operations and distortions, and that the novel scheme is more effective than existing schemes.
References 1. Cox, I.J., Matthew, L.M., Jeffrey, A.B., et al.: Digital Watermarking and Steganography, 2nd edn. Morgan Kaufmann Publishers (Elsevier), Burlington (2007) 2. Reddy, A.A., Chatterji, B.N.: A new wavelet based logo-watermarking scheme. Pattern Recognition Letters 26, 1019–1027 (2005) 3. Kang, X., Huang, J., Shi, Y.Q.: An image watermarking algorithm robust to geometric distortion. In: Petitcolas, F.A.P., Kim, H.-J. (eds.) IWDW 2002. LNCS, vol. 2613, pp. 212–223. Springer, Heidelberg (2003) 4. Lee, D., Kim, T., Lee, S., Paik, J.: Genetic algorithm-based watermarking in discrete wavelet transform domain. LNCS. Springer, Heidelberg (2006) 5. Wang, Z., Bovik, A.C.: A Universal Image Quality Index. IEEE Signal Processing Letters 9, 81–84 (2002) 6. Hsu, C.S., Tu, S.F.: An Imperceptible Watermarking Scheme Using Variation and Modular Operations. International Journal of Hybrid Information Technology 1(4), 9–16 (2008)
Genetic Cryptanalysis Abdelwadood Mesleh, Bilal Zahran, Anwar Al-Abadi, Samer Hamed, Nawal Al-Zabin, Heba Bargouthi, and Iman Maharmeh Computer Engineering Department, Faculty of Engineering Technology, Al-Balqa‘ Applied University, Amman, Jordan [email protected]
Abstract. In this work, Elitism Genetic Algorithm cryptanalysis for the basic substitution permutation network is implemented. The GA cryptanalysis algorithm gets the entire key bits. Results show the robustness of the proposed GA cryptanalysis algorithm. Keywords: Cryptanalysis, Elitism GA.
1 Introduction

Security [1] is always a concern of organizations whose assets are managed by computer systems. Because of the popularity of the Data Encryption Standard (DES) and the Advanced Encryption Standard (AES) as encryption schemes in security systems, there is continuing interest in finding cryptanalytic attacks on DES and AES. A typical cipher [1] takes the plaintext and some known key as its input and produces the ciphertext. Cryptanalysis is the process of recovering the plaintext and/or the key from a cipher. A cryptographic system has a finite key space and is therefore vulnerable to an exhaustive key-search attack, yet a typical cipher remains secure because the size of the key space is such that the time and resources for such a search are not available. A random search through a finite but large key space is not usually an acceptable cryptanalyst's tool. Differential cryptanalysis [2] and linear cryptanalysis [3] are the two most powerful cryptanalysis approaches.

In this work, cryptanalysis is viewed as the process of recovering the key from a cipher. In machine learning, this process can be studied as a search problem: a space of keys is searched to find the right key. An exhaustive attack may try every possible key on a piece of ciphertext until an intelligible translation into plaintext is obtained; because the key space is extremely large, such a key search is expensive (time consuming) and exhaustive search methods are impractical. On the other hand, GA [4] is an optimization algorithm that has proven its robustness, and it offers powerful, domain-independent search capabilities that can be used in many learning tasks such as the key search process. For these reasons, the machine learning community has been motivated to study the key search (attack) process as a search problem, i.e. a GA optimization problem [5, 6, 7, 8, 9, 10, 11]. In this work, an elitism GA is used to search the key space.
The rest of this paper is organized as follows. Section 2 presents the SPN. Section 3 describes the GA Cryptanalysis. Experimental results and conclusion are discussed in sections 4 and 5 respectively.
2 SPN

Substitution-permutation networks (SPNs) [12] evolved from the work of Claude Shannon [13], who proposed the well-known principles of "confusion" and "diffusion". Confusion is the obscuring of the relationship between elements of the plaintext and elements of the ciphertext, while diffusion is the spreading of the influence of plaintext elements over the ciphertext. These two principles are achieved through a "mixing transformation" that involves a number of rounds, each consisting of a substitution operation followed by an invertible linear transformation. Feistel [14] realized Shannon's mixing transformation through substitution-permutation networks (SPNs). As a result, SPNs form the foundation of many modern private-key block cryptosystems such as DES and AES; such cryptosystems obtain their cryptographic strength by iterating a cryptographic operation several times. The basic SPN consists of a number of rounds of nonlinear substitutions (S-boxes) connected by bit-position permutations. The substitutions are performed by dividing the block of bits into small sub-blocks and applying a mapping stored as a table lookup, referred to as an S-box. The basic SPN structure is used to construct ciphers that possess good cryptographic properties such as completeness [15] and resistance to differential and linear cryptanalysis [16, 17].

Fig. 1 shows a 16-bit SPN that consists of 4 rounds of 4 x 4 S-boxes. The 16-bit plaintext block (p1, ..., p16) is divided into four 4-bit sub-blocks, which form the inputs to the S-boxes. Each S-box is a 4-bit bijective mapping S: P -> C, where P = [p1, ..., p4] and C = [c1, ..., c4]. Before each round of substitution, a 4-bit key K = [k1, ..., k4] is XORed with the 4-bit input sub-block X = [x1, ..., x4]. Substitution is implemented with a lookup table (see Table 1), and the mappings are chosen from the DES S-boxes (see Table 2). In all rounds except the last, a permutation of the bits (see Table 3) then follows; in the last round, the data is instead XORed with a final round key. The result is the ciphertext C = [c1, ..., c16]. Decryption is performed in the reverse order through the SPN.

For input P and output C, the input difference is defined as ΔP = P' ⊕ P'' and the corresponding output difference as ΔC = C' ⊕ C''; the difference between two ciphertexts is a function of the difference between the corresponding plaintexts. Each entry in Table 4 is the number of pairs of S-box inputs whose input difference corresponds to the row and whose output difference corresponds to the column. By joining differential characteristics together, one obtains a differential (ΔP, ΔC) for the rounds in which they feature. The characteristics must be chosen and linked so that, given the input difference ΔP, the output difference ΔC occurs significantly more often than 1/2^y of the time for the y bits concerned. From the difference distribution table (Table 4), an input difference of B (1011) leads to an output difference of 2 (0010) with probability 8/16. Similarly, 4 (0100) leads to 6 (0110) with probability 6/16.
Fig. 1. 16-bit SPN that consists of 4 rounds of 4 x 4 S-Boxes
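For reference, the cipher of Fig. 1 can be modelled in a few lines. This Python rendering is ours, but the S-box, permutation and round structure follow the description above (four rounds of key mixing and substitution, with the bit permutation omitted in the last round and a final key XOR, so that five 16-bit sub-keys are used in total):

    SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
            0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]           # Table 1
    PERM = [1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15, 4, 8, 12, 16]  # Table 3, bit 1 = MSB

    def substitute(state):
        out = 0
        for i in range(4):                          # four 4-bit sub-blocks
            nibble = (state >> (12 - 4 * i)) & 0xF
            out |= SBOX[nibble] << (12 - 4 * i)
        return out

    def permute(state):
        out = 0
        for src, dst in enumerate(PERM, start=1):   # bit at position src moves to dst
            bit = (state >> (16 - src)) & 1
            out |= bit << (16 - dst)
        return out

    def encrypt(plaintext, round_keys):             # round_keys: five 16-bit sub-keys
        state = plaintext
        for k in round_keys[:3]:
            state = permute(substitute(state ^ k))
        state = substitute(state ^ round_keys[3])   # last round: no permutation
        return state ^ round_keys[4]                # final key XOR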
Table 1. S-Box hexadecimal representation

Input    0 1 2 3 4 5 6 7 8 9 A B C D E F
Output   E 4 D 1 2 F B 8 3 A 6 C 5 9 0 7
Table 2. S-Boxes used in the SPN

Input  0 1 2 3 4 5 6 7 8 9 A B C D E F
S11    E 4 D 1 2 F B 8 3 A 6 C 5 9 0 7
S12    0 F 7 4 E 2 D 1 A 6 C B 9 5 3 8
S13    4 1 E 8 D 6 2 B F C 9 7 3 A 5 0
S14    F C 8 2 4 9 1 7 5 B 3 E A 0 6 D
S21    F 1 8 E 6 B 3 4 9 7 2 D C 0 5 A
S22    3 D 4 7 F 2 8 E C 0 1 A 6 9 B 5
S23    0 E 7 B A 4 D 1 5 8 C 6 9 3 2 F
S24    D 8 A 1 3 F 4 2 B 6 7 C 0 5 E 9
S31    A 0 9 E 6 3 F 5 1 D C 7 B 4 2 8
S32    D 7 0 9 3 4 6 A 2 8 5 E C B F 1
S33    D 6 4 9 8 F 3 0 B 1 2 C 5 A E 7
S34    1 A D 0 6 9 8 7 4 F E 3 B 5 2 C
S41    7 D E 3 0 6 9 A 1 2 8 5 B C 4 F
S42    D 8 B 5 6 F 0 3 4 7 2 C 1 A E 9
S43    A 6 9 0 C B 7 D F 1 3 E 5 2 8 4
S44    3 F 0 6 A 1 D 8 9 4 5 B C 7 2 E
Table 3. Permutation

Input    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Output   1 5 9 13 2 6 10 14 3 7 11 15 4 8 12 16
As a result, the total probability so far is 8/16 * 6/16 = 3/16. Finally, 2 (0010) leads to 5 (0101) with probability 6/16 in each of two further S-boxes, so the total probability becomes 3/16 * 6/16 * 6/16 = 27/1024. The attacker can therefore partially guess the target key.
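The entries of Table 4 can be reproduced mechanically from the S-box of Table 1. The following sketch (ours) counts, for every input difference, how many of the sixteen inputs produce each output difference, and checks the three values used in the text:

    SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
            0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

    def difference_distribution_table(sbox):
        ddt = [[0] * 16 for _ in range(16)]
        for dx in range(16):                 # input difference
            for x in range(16):
                dy = sbox[x] ^ sbox[x ^ dx]  # resulting output difference
                ddt[dx][dy] += 1
        return ddt

    ddt = difference_distribution_table(SBOX)
    assert ddt[0xB][0x2] == 8 and ddt[0x4][0x6] == 6 and ddt[0x2][0x5] == 6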
3 GA Cryptanalysis GA [4] is an optimization algorithm based on both stochastic and probabilistic measures; it inspects the solution space (key space – candidate keys) for an optimal solution (target key).
Table 4. Difference distribution table for the 4 x 4 S-Box (rows: input difference, columns: output difference)

      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0    16  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
1     0  0  0  2  0  0  0  2  0  2  4  0  4  2  0  0
2     0  0  0  2  0  6  2  2  0  2  0  0  0  0  2  0
3     0  0  2  0  2  0  0  0  0  4  2  0  2  0  0  4
4     0  0  0  2  0  0  6  0  0  2  0  4  2  0  0  0
5     0  4  0  0  0  2  2  0  0  0  4  0  2  0  0  2
6     0  0  0  4  0  4  0  0  0  0  0  0  2  2  2  2
7     0  0  2  2  2  0  2  0  0  2  2  0  0  0  0  4
8     0  0  0  0  0  0  2  2  0  0  0  4  0  4  2  2
9     0  2  0  0  2  0  0  4  2  0  2  2  2  0  0  0
A     0  2  2  0  0  0  0  0  6  0  0  2  0  0  4  0
B     0  0  8  0  0  2  0  2  0  0  0  0  0  2  0  2
C     0  2  0  0  2  2  2  0  0  0  0  2  0  6  0  0
D     0  4  0  0  0  0  0  4  2  0  2  0  2  0  2  0
E     0  0  2  4  2  0  0  0  6  0  0  0  0  0  2  0
F     0  2  0  0  6  0  0  0  0  4  0  2  0  0  2  0
GA (see Fig. 2) starts with a randomly selected “population” of possible solutions for the target problem (key search) and lets them “evolve” over multiple generations to find better solutions (better keys). The GA algorithm is mainly based on the “survival of the fittest” principle (organisms that best “fit” their environment have the best chance of survival). While executing GA, new population individuals are “born” while others “die”. In GA, a stochastic process, Crossover, takes two (or more) parent nodes that may generate offspring nodes (possible better solutions – better keys) by exchanging “chromosomes” between parent nodes. As a result, new individuals (better solutions – better keys) may be created in the next population. Mutation is another stochastic transformation of individuals that may modify their genotypes. Individuals are selected for “crossover” based on their fitness values (goodness of selected subset of features). Better goodness features are more likely able to reproduce (survive).
[Figure 2 depicts the GA loop: candidate keys are selected according to key goodness, undergo crossover and mutation, and the resulting new solution members re-enter the population until the target key is found.]
Fig. 2. Genetic Algorithm main steps: Candidate key selection, Crossover and Mutation
The basic steps of GA are given below; they are general enough to cover many flavors of GA implementation [4]:

Initialization step: randomly generate an initial population of N chromosomes and evaluate the fitness of each of them.

Parent selection step: if the elitism GA flavor is not used, set M = 0; else set M to the fraction (0 < M < 1) of the fittest individuals that survive into the next generation.
Genetic drift is used to measure the stochastic changes in gene frequency through random sampling of the GA population. Some genes of chromosomes may become more important to the final solution than others. The genetic drift can be stalled when the chromosomes representing decision variables that have a reduced “salience” to the final solution do not experience sufficient selection pressure.
Binary strings are used to represent the chromosomes and hence the key. Crossover and mutation are applied to the candidate keys to generate new ones, and differential characteristics are used as the fitness measure. Following Albassal and Wahdan [5], the fitness function is expressed as a count F = d/L, where d is the count for each partial sub-key value and L is the total number of plaintext pairs used in the evaluation; the count is incremented whenever the input difference at the last round corresponds to the value expected from the differential characteristic. The maximum number of generations is chosen so that the total number of keys over all generations, S, does not exceed half of all possible sub-key values, S < (1/2) * 2^n; this serves as the GA stopping criterion.

In this work the GA inspects the solution space (key space) for an optimal solution (the target key). Its main steps as a key search method can be summarized as follows:

Step 1: A random key population is generated; the population consists of chromosomes of a predefined key length (number of bits), each individual solution (candidate key) is represented as a vector, and the vector length is the number of bits in the target key.

Step 2: The fitness of each solution (candidate key) is computed; the fitness uses a specific evaluation function to measure the goodness of each candidate key.

Step 3: Repeat until a certain stopping criterion is met:
• The solution (key) with the best fitness is stored.
• A new key population is then generated:
• Select two parents from the key population.
• Crossover the parents.
• Mutate the new children ("offspring").
• Place the newly generated offspring into the population and randomly select a new population for a further run of the GA.
Step 4: Return the best solution (the target key).
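A hedged sketch of this elitism-GA key-search loop is given below. The crossover and mutation operators shown are generic single-point and bit-flip operators, and the abstract fitness callable stands in for the differential-characteristic count F = d/L described above; the parameter defaults mirror the values reported in Section 4 but the code itself is ours, not the authors' implementation.

    import random

    def ga_key_search(fitness, key_bits=16, pop_size=1024, generations=16,
                      survive=0.5, pc=0.20, pm=0.01):
        pop = [random.getrandbits(key_bits) for _ in range(pop_size)]
        best = max(pop, key=fitness)
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)
            elite = pop[:int(survive * pop_size)]       # elitism: keep the fittest half
            children = []
            while len(elite) + len(children) < pop_size:
                a, b = random.sample(elite, 2)
                child = a
                if random.random() < pc:                # single-point crossover
                    point = random.randrange(1, key_bits)
                    mask = (1 << point) - 1
                    child = (a & mask) | (b & ~mask)
                for bit in range(key_bits):             # bit-flip mutation
                    if random.random() < pm:
                        child ^= 1 << bit
                children.append(child)
            pop = elite + children
            best = max(best, max(pop, key=fitness), key=fitness)
        return best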
4 Experimental Results To implement the GA cryptanalysis attack on the SPN, 5000 pairs of plaintext with a specific difference (00F0) are generated, then the most probable difference is calculated through the SPN (The differential probability of the SPN with input difference 00F0 is 3/1024). The GA cryptanalysis attack is implemented by generating independent 5 sub-keys (16-bit sub-key) to represent the target key. The attack starts by generating a set of random keys (GA first generation), for each generated candidate key, the fitness is set to zero, then all the test pairs are used and the output difference is compared to the expected differential probability. If the correct difference is found, the fitness value is incremented and finally normalized to the total number of pairs.
The GA then evolves to produce a new generation: the survival (selection) percentage is 50% (M is set to 0.5), and the new individuals are produced by crossover and mutation. The GA parameters were initially set as follows: key size = 16 bits, maximum number of generations = 16, crossover probability = 20%, mutation probability = 1% and population size = 1024; these values were set empirically. For a randomly chosen target key (2CD9), we conducted several experiments with different crossover probabilities (0.15, 0.20, 0.25, 0.30 and 0.35) to determine the best crossover probability (Pc), and concluded that the best value is 20%. Fig. 3 shows that the GA converges at the fourth generation when Pc = 0.20, i.e. the algorithm recovers all the key bits at the fourth generation (for simulation purposes the algorithm keeps searching until the sixteenth iteration). It is worth mentioning that the mutation probability (Pm) is set to 0.01; other Pm values do not significantly affect the results.
Fig. 3. Fitness values against generation for some randomly chosen key (2CD9) using different crossover probabilities
The attack was tested several times on different keys, and the GA cryptanalysis algorithm was able to find the target key in every experiment. Tables 5, 6, 7 and 8 show the 16 GA generations for four target keys (5DC3, 75E7, 2C19 and 2CD9); each table lists the best solution (best candidate key so far) and the corresponding normalized GA fitness. In all the experiments we conducted, the elitism GA recovered the entire key, and the majority of the keys (as shown in Tables 5, 6, 7 and 8) were obtained at the fourth generation.
Table 5. Best solutions (candidate keys) and the corresponding normalized fitness values for the target key 5DC3

Generation   Best Candidate Key   Normalized Fitness
1            3A2D                 16.67%
2            5D83                 50%
3            5D83                 50%
4            5DC3                 100%
5            5DC3                 100%
6            5DC3                 100%
7            5DC3                 100%
8            5DC3                 100%
9            5DC3                 100%
10           5DC3                 100%
11           5DC3                 100%
12           5DC3                 100%
13           5DC3                 100%
14           5DC3                 100%
15           5DC3                 100%
16           5DC3                 100%
Table 6. Best solutions (candidate keys) and the corresponding normalized fitness values for the target key 75E7

Generation   Best Candidate Key   Normalized Fitness
1            75EF                 33.34%
2            75EF                 33.34%
3            75E7                 100%
4            75E7                 100%
5            75E7                 100%
6            75E7                 100%
7            75E7                 100%
8            75E7                 100%
9            75E7                 100%
10           75E7                 100%
11           75E7                 100%
12           75E7                 100%
13           75E7                 100%
14           75E7                 100%
15           75E7                 100%
16           75E7                 100%
Table 7. Best solutions (candidate keys) and the corresponding normalized fitness values for the target key 2C19

Generation   Best Candidate Key   Normalized Fitness
1            2C49                 41.17%
2            2C49                 41.17%
3            2C19                 100%
4            2C19                 100%
5            2C19                 100%
6            2C19                 100%
7            2C19                 100%
8            2C19                 100%
9            2C19                 100%
10           2C19                 100%
11           2C19                 100%
12           2C19                 100%
13           2C19                 100%
14           2C19                 100%
15           2C19                 100%
16           2C19                 100%
Table 8. Best solutions (candidate keys) and the corresponding normalized fitness values for the target key 2CD9

Generation   Best Candidate Key   Normalized Fitness
1            2C49                 22.23%
2            2B09                 33.34%
3            2C19                 72.23%
4            2CD9                 100%
5            2CD9                 100%
6            2CD9                 100%
7            2CD9                 100%
8            2CD9                 100%
9            2CD9                 100%
10           2CD9                 100%
11           2CD9                 100%
12           2CD9                 100%
13           2CD9                 100%
14           2CD9                 100%
15           2CD9                 100%
16           2CD9                 100%
5 Conclusions

In this paper, an elitism GA was used for SPN cryptanalysis, with the GA fitness derived from the differential characteristic of the cipher. In the worst case the GA cryptanalysis recovered the entire key at the tenth generation, and the majority of keys were recovered at the fourth generation. The results show the robustness of the proposed elitism GA cryptanalysis. A good reduction in time was achieved: by the tenth generation the GA cryptanalysis algorithm has not yet tried half of the possible sub-key values. Using an elitism GA in cryptanalysis therefore appears to be consistently useful. However, applying GA cryptanalysis to other encryption algorithms and comparing the elitism GA with other GA flavors remain open for future work.
References 1. Stallings, W.: Cryptography and Network Security Principles and Practices, 4th edn. Prentice Hall, Englewood Cliffs (2005) 2. Biham, E., Shamir, A.: Differential cryptanalysis of DES-like cryptosystems. Journal of cryptography 4(1), 3–72 (1991) 3. Matsui, M.: Linear cryptanalysis method for DES cipher. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 386–397. Springer, Heidelberg (1994) 4. Goldberg, D.: Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Publishing Company Inc., Reading (1989) 5. Albassal, A., Wahdan, A.: Genetic algorithm cryptanalysis of the basic substitution permutation network. In: Proceedings of the 2003 IEEE Midwest International Symposium on Circuits and Systems (MWSCAS ’03), December 2003. IEEE, Los Alamitos (2003) 6. Hernández-Castro, J., Isasi, P.: New results on the genetic cryptanalysis of TEA and reduced round versions of XTEA. In: Proceedings of the 2004 IEEE Congress on Evolutionary Computation (CEC 2004), June 2004, vol. 2(2), pp. 2124–2129 (2004) 7. Dozier, G., Garrett, A., Hamilton, J.: A comparison of genetic algorithm techniques for the cryptanalysis of TEA. International Journal of Intelligent Control and Systems (IJICS) 12(4), 325–330 (2007) 8. Toemeh, R., Arumugam, S.: Applying Genetic Algorithms for Searching Key-Space of Polyalphabetic Substitution Ciphers. The International Arab Journal of Information Technology 5(1) (January 2008) 9. Bergmann, K., Jacob, C., Scheidler, R.: Cryptanalysis using genetic algorithms. In: Keijzer, M. (ed.) Proceedings of the 2008 Genetic and Evolutionary Computation Conference (GECCO 2008), July 2008, pp. 1099–1100 (2008) 10. Gorodilov, A., Morozenko, V.: Genetic Algorithm for finding the keys length and cryptanalysis of the permutation cipher. International Journal Information Theories & Applications 15, 94–99 (2008) 11. Husei, H., Bayoumi, B.: A Genetic Algorithm for Cryptanalysis with Application to DESlike Systems. International Journal of Network Security 8(2), 177–186 (2009) 12. Heys, H.: A Tutorial on Linear and Differential Cryptanalysis, Technical Report CORR 2001-17, Centre for Applied Cryptographic Research, Department of Combinatorics and Optimization, University of Waterloo (March 2001); Also appears in Cryptologia, vol. XXVI(3), pp. 189–221 (2002)
13. Shannon, C.: Communication theory of secrecy systems. Bell System Technical Journal 28, 656–715 (1949) 14. Feistel, H.: Cryptography and computer privacy. Scientific American 228(5), 15–23 (1973) 15. Kam, J., Davida, G.: A structured design of substitution-permutation encryption networks. IEEE Transactions on Computers 28(10), 747–753 (1979) 16. O’Connor, L.: On the distribution of characteristics in bijective mappings. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 360–370. Springer, Heidelberg (1994) 17. Heys, H., Tavares, S.: The design of product ciphers resistant to differential and linear cryptanalysis. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773. Springer, Heidelberg (1994) 18. Reed, P., Minsker, B., Goldberg, D.: The practitioner’s role in competent search and optimization using genetic algorithms. Presented at the World Water and Environmental Resources Congress, Washington, DC (2001) 19. Rogers, A., Bennett, A.: Genetic drift in genetic algorithm selection schemes. IEEE Transaction Evolutionary Computation 3, 298–303 (1999)
Multiple Layer Reversible Images Watermarking Using Enhancement of Difference Expansion Techniques Shahidan M. Abdullah and Azizah A. Manaf University Technology Malaysia [email protected], [email protected]
Abstract. This paper proposes a high-capacity reversible image watermarking scheme based on an enhancement of the difference-expansion method. Reversible watermarking enables useful information to be embedded in a host signal without any loss of host information. We propose an enhancement of the Difference Expansion technique in which the payload is embedded recursively into multiple layers, for grey-scale as well as RGB colour images, thereby increasing the capacity considerably. The proposed technique improves the distortion performance at low embedding capacities and mitigates the capacity control problem. We also propose a reversible data-embedding technique with blind detection of the watermark. The new technique exploits the selection of an optimum block size to implement the algorithm. The experimental results for many standard test images show that multilevel embedding increases the capacity compared with normal difference expansion. There is also a significant improvement in the quality (PSNR) of the watermarked image, especially at moderate embedding capacities.

Keywords: Multilayer embedding, difference expansion, blind detection, reversible watermarking.
1 Introduction

Digital watermarking is the process of embedding valuable information into another digital medium, called the host, for purposes such as copy control, authentication, copyright protection and distribution tracking. Inserting the watermark into the host introduces some distortion, under the constraint that the host and the watermarked work remain perceptually equivalent. Watermarking valuable and sensitive images, such as medical, military and artwork images, is a big challenge for most watermarking methods: the watermarking process normally introduces small but irreversible changes in the host image, and this degradation may destroy significant details in military and medical images or reduce the aesthetic and monetary value of artwork.

Reversible watermarking is one type of fragile watermarking, in which the watermark is sensitive to any intentional or unintentional forging of the watermark bits. Content
authentication [24] is one of the applications that use fragile watermarking techniques. Reversible watermarking has the intriguing property that, once the watermarked bits have been authenticated, the decoder can not only extract the watermark but also perfectly reconstruct the original host, i.e. the un-watermarked content [1]-[23]. Recovering the exact original content is highly important in sensitive imaging applications such as artwork, law enforcement, medical imaging, astrophysics research and military applications, where having the original image available during analysis and diagnosis is critical for making the right decision. Traditional watermarking techniques cannot provide adequate security and integrity for content authentication because of their irreversible nature. Reversible watermarking, which is also called erasable, invertible or lossless watermarking, has been widely studied in the current watermarking literature [1]-[14],[23].

Several techniques have been introduced for reversible watermarking [1]-[22]. The first was introduced by Mintzer et al. [10], where the embedded watermarks were visible and could be extracted and removed because they were embedded in a reversible way. Fridrich et al. [7][8] compress an extracted vector that represents a group of pixels and embed the payload by adding it to the compressed vector. Tian's method [4] is based on difference expansion and achieves high capacity and quality: it operates on pairs of pixels, determining their average and difference, and updates the pixel values of each selected pair during the embedding process. Alattar [2] introduced a high-capacity algorithm for color images in which the payload is hidden in the difference expansion of vectors of adjacent pixels. Celik et al. [3] use an LSB-substitution technique and achieve high capacity by means of a prediction-based conditional entropy coder.

Reversible watermarking techniques must be able to reconstruct the exact original host data after the extraction process. The system therefore has three main steps. First, a watermark bit w is embedded into the digital host h, giving h' = h + w. Second, the digital content is authenticated by extracting the watermark signal from the watermarked image h'. If it has not been tampered with, i.e. it is authentic, the last step reconstructs the original image exactly as it was before the first step.

In this paper we describe a high-capacity, high-quality reversible watermarking algorithm for digital images. Our method can be applied to both grey-scale and color images. The process starts by scanning for the embeddable pairs of pixels, i.e. those that will not overflow after embedding. The selected pairs are sorted in ascending order of their difference values, so that the pair with the smallest difference is the first to receive a watermark bit. After embedding the watermark, only one pixel of each selected pair has its value changed. We embed the hidden data into the host image recursively until the quality of the image (PSNR, peak signal-to-noise ratio) falls below 30 [24], and then we stop.
2 Enhanced Multilevel Difference Expansion (DE) Technique

We use a modified version of Difference Expansion (DE) [1-4]. We embed multiple (n) layers of payload into a host image h and obtain a watermarked image h'. Before it reaches the content authenticator, h' might or might not have been tampered with by some intentional or unintentional attack. If the authenticator finds that no tampering occurred in h', i.e. h' is authentic, then it can remove all the payloads w from h' and restore the original image as a new image h''.

Our approach to reversible watermarking first selects the embeddable pairs of pixels in a block and registers their positions in a lookup table (LT). This LT is essential during embedding, extraction and restoration of the original image. The enhanced DE technique removes the need for lossless compression of the host image and of the watermark image, and it allows smooth areas, i.e. pairs whose difference value is zero or one, to carry one bit of payload.

Let (xi, yj) be a pair of pixels in an image h, with 0 <= xi, yj <= 255 and yj > xi. Let k = 1, ..., p index the pixel pairs, where p is the total number of pairs in h, and define dk as the difference of the k-th pair:

dk = yj - xi .    (1)

Let b be a message bit (0 or 1) and dk' the new difference after adding the message bit:

dk' = dk + b .    (2)

Let yj' be the new value of the larger pixel after adding dk':

yj' = yj + dk' .    (3)

The new value of the k-th pixel pair is (xi, yj'). The inverse transform of (3), which recovers the hidden bit and the original pixel, is

b = (yj' - xi) mod 2 ,   yj = xi + floor((yj' - xi) / 2) ,    (4)

where floor(.) denotes the floor function, i.e. the greatest integer less than or equal to its argument. As pixel values are bounded to the range [0, 255], an embeddable pair must satisfy the condition

yj <= (xi + 254) / 2 ,    (5)

so that the new larger value yj' = 2 yj - xi + b never exceeds 255.
Thus, to prevent overflow, the larger pixel value of the pair must satisfy condition (5) before embedding. In our approach, because the pixel xi is left unchanged, an underflow situation does
not arise. The next step is to create the lookup table of all expanded pixel pairs: a 1 marks an expanded pair of pixels and a 0 an un-expanded pair. Let hs be the number of pixels of the host image h; the size of the lookup table LT in bits is then

S = 16 n + hs / 2 ,    (6)

i.e. 16 bits per layer for the watermark dimensions plus one bit per pixel pair. The side information held in the LT is the number of layers n, the size of the watermark for each of the n layers, and the embeddable status of every pair of pixels. In our approach the number of layers is allowed to increase until the PSNR value drops below 30 [24].
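A minimal sketch of the per-pair arithmetic of equations (1)-(5), under the reconstruction given above, is shown below (the function names are ours, and the multilayer loop, lookup-table bookkeeping and PSNR check are omitted):

    def embeddable(x, y):
        # overflow test of equation (5): the new larger value 2y - x + b must stay <= 255
        return 0 <= x < y and 2 * y - x + 1 <= 255

    def embed_pair(x, y, b):
        d = y - x              # equation (1)
        d_new = d + b          # equation (2)
        return x, y + d_new    # equation (3): the smaller pixel x is unchanged

    def extract_pair(x, y_marked):
        diff = y_marked - x
        b = diff % 2           # hidden bit, equation (4)
        y = x + diff // 2      # restored original larger pixel
        return b, y

    # round trip: embed a bit and recover both the bit and the original pair
    x, y, bit = 100, 103, 1
    _, y_marked = embed_pair(x, y, bit)
    assert extract_pair(x, y_marked) == (bit, y)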
3 Multiple Layer Reversible Watermarking Scheme

We describe our scheme using the flowcharts and procedures given below.

3.1 Principle

The principle of the scheme, shown in Figure 1 and Figure 2, consists of:
i. scanning
ii. embedding
iii. extracting

[Figure 1 flowchart: start, then scanning and embedding; while the PSNR remains above 30 the layer counter is incremented and the scanning/embedding pass is repeated, otherwise the process stops.]

Fig. 1. Reversible Watermarking Scheme: Embedding
[Figure 2 flowchart: start with n = layer; extract the watermark and reconstruct layer n-1; while (n-1) > 1 repeat for the next lower layer, otherwise reconstruct the original image and stop.]

Fig. 2. Reversible Watermarking Scheme: Extracting and Reconstructing
3.2 Lookup Table

The lookup table L is a binary bitmap that indicates three things:
i. the size of the watermark image in each layer;
ii. the number of embedded layers in the watermarked image;
iii. for each pixel pair (xi, yj), whether it was selected for watermarking (1) or not (0).
It is the key required by the decoder to retrieve the payload starting from the last layer and to rebuild the original image after the first layer. One bit is assigned to each pair and, for an image with 8 bits per pixel and n layers of watermark, the LT size S is given by equation (6); the constant 16 comes from the two pixels used to store the watermark size (rows and columns), and n is the number of layers. Consider the simple example of the 2x4 block of pixels shown in Figure 3 below.
Fig. 3. Example of pixel of (2x4) image
We trace the pixel pairs of the first layer by a vertical scan from the first pixel to the last, following the sequence a-e, b-f, c-g, d-h. For the second layer we reverse direction and trace horizontally from the last pixel back to the first, through h-g, f-e, d-c, b-a. The tracing continues by repeating this process for the following layers. This approach is taken because it pairs different pixel values in different layers and hence increases the number of embeddable pairs.

3.3 Scanning and Embedding

The detailed procedure for identifying embeddable pairs is described below. Let (xi, yj) be a pair of pixels in an image h, with 0 <= xi, yj <= 255, 1 <= i <= row and 1 <= j <= column, and let Sij denote the block in the i-th row and j-th column. The scanning process involves the following steps:

1. Partition the entire image h into blocks Sij of equal size made up of pixel pairs.
2. Determine the status of every pixel pair (overflow or not) using equation (5).
3. Set 1 for each embeddable pair and 0 for each un-embeddable pair in the lookup table.
4. End.
The embedding step follows once the above process is complete:

1. Read the lookup table and the host image h.
2. Sort the blocks Sij according to dk.
3. For all blocks Sij:
4. If (xi, yj) is embeddable, execute equations (1), (2) and (3); end if; end for.
5. End.
The recursive step continues scanning and embedding as long as the PSNR value stays above 30 dB; with n layers of embedding we can embed n watermark images.

3.4 Extraction and Recovery

Extraction is conducted for the purpose of authentication: first, the hidden payload is extracted from the watermarked image starting from the last layer; second, the original image is reconstructed once the top layer is reached. The process involves the following steps:
1. Read the LT and the host image h;
2. Read the maximum layer n from the LT;
3. For all blocks Sij:
4. If (xi, yj) is embeddable, determine the hidden bit b and execute equation (4);
5. End if;
6. End for;
7. Set n = n - 1; if n > 1, go to step 3; end if;
8. Construct h and end.
A sketch of this layer-by-layer loop is given below.
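Equation (4) is the inverse of the embedding transform and is not reproduced in this part of the paper. The sketch below assumes the standard Tian recovery, in which the hidden bit is the least significant bit of the expanded difference; the helper names and the per-layer re-tracing are assumptions of this sketch.

```python
def recover_pair(x_w, y_w):
    """Invert the difference expansion: return the original pair and the hidden bit."""
    low = (x_w + y_w) // 2
    exp_diff = x_w - y_w
    bit = exp_diff & 1            # hidden bit = LSB of the expanded difference
    diff = exp_diff >> 1          # original difference (floor division by 2)
    return (low + (diff + 1) // 2, low - diff // 2), bit

def extract_layer(pairs, lt_bitmap):
    """Extract one layer: return the recovered payload bits and the restored pairs."""
    bits, restored = [], []
    for flag, (x, y) in zip(lt_bitmap, pairs):
        if flag:                              # this pair was expanded in this layer
            (x, y), b = recover_pair(x, y)
            bits.append(b)
        restored.append((x, y))
    return bits, restored

# Full recovery calls extract_layer from layer n down to layer 1, re-tracing the
# pixel pairs in the same order that was used when that layer was embedded.
```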
4 Experimental Results and Comparisons

We discuss the results, analyze them, and compare them with other difference-expansion (DE) algorithms.

4.1 Experimental Results

We implemented and tested the algorithm detailed in Sections 3.3 and 3.4. For each image we compute the payload, the bit rate and the PSNR, and we apply the algorithm recursively to 20 different RGB and grayscale images. Tables 1 to 3 below present the results for three of these images. The maximum number of layers n is set to 8, because the size of the lookup table grows as n increases. For each layer of the embedding phase we list the image name, the payload, the bit rate of that layer, the cumulative (total) bit rate and the PSNR (a short calculation sketch follows Table 3). The first layer carries a text file; the following layers carry images.

4.2 Analysis

The results clearly show that the PSNR value of the first layer is very high. This is because its payload size is smaller than the total number of cases in which dk equals zero or one. The tables also indicate that the achievable embedding capacity depends on the nature of the image itself: some images accept more payload bits with lower distortion (in terms of PSNR) than others. For example, Lena and Car produce more expandable pairs with lower distortion than images with more edge areas, such as Baboon, and hence the former can carry more watermark bits at a higher PSNR.

Table 1. Image Lena.jpg
Layer   Payload (bit)   Total bit rate   PSNR
1       1519            0.006            73.52
2       8000            0.036            63.33
3       30360           0.152            51.40
4       80000           0.457            41.20
5       30360           0.573            39.91
6       96000           0.939            32.43
7       30360           1.06             31.95
Table 2. Image baboon.jpg
Layer   Payload (bit)   Total bit rate   PSNR
1       1519            0.023            61.19
2       8000            0.145            44.25
3       8000            0.267            40.43
4       8000            0.389            35.94
5       8000            0.512            33.48
6       8000            0.634            31.02
Table 3. Image car.bmp
Layer   Payload (bit)   Total bit rate   PSNR
1       1540            0.003            76.80
2       8000            0.017            66.64
3       8000            0.031            63.85
4       96000           0.202            47.65
5       102400          0.384            39.91
6       80000           0.526            35.52
7       96800           0.698            31.65
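The "Total bit rate" column appears to be the cumulative embedded payload expressed in bits per pixel (bpp). Assuming the Lena host image is 512x512 pixels, as stated for the comparison in Section 4.3, the column of Table 1 can be reproduced up to rounding as follows.

```python
payloads_lena = [1519, 8000, 30360, 80000, 30360, 96000, 30360]   # bits per layer
num_pixels = 512 * 512                                            # Lena, 512 x 512

total = 0
for layer, bits in enumerate(payloads_lena, start=1):
    total += bits
    print(layer, round(total / num_pixels, 3))   # 0.006, 0.036, 0.152, 0.457, ...
```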
4.3 Comparisons with Other Algorithms

Table 4 below compares Tian's [4] and Lee's [20] algorithms with the proposed algorithm in terms of bit rate (bpp) and PSNR (dB) on the image "Lena" (512 by 512).

Table 4. Bit rate and PSNR of Tian's, Lee's and the proposed technique
Bit rate (bpp)   PSNR Tian's algorithm   PSNR Lee's algorithm   PSNR proposed algorithm
0.45             37.77                   41.00                  41.20
0.57             36.15                   37.00                  39.91
0.93             29.43                   31.00                  32.43
1.06             -                       -                      31.95
At a bit rate of 0.45 bpp, Tian's technique, Lee's technique and the proposed technique yield broadly comparable PSNR results. For higher payloads the proposed technique outperforms the other two, and it is the only one that reaches bit rates greater than one bpp.
5 Conclusion

In this paper, a very high-capacity reversible watermarking algorithm with low image distortion, based on multi-layer difference-expansion embedding, is implemented. Test results show that the amount of data one can embed into an image depends strongly on the nature of the image. They also indicate that the proposed algorithm performs best on images with smooth areas.
References
1. Yaqub, M.K.: Reversible Watermarking Using Modified Difference Expansion. International Journal of Computing & Information Sciences 4(3), 134–142 (2006)
2. Alattar, A.M.: Reversible watermark using the difference expansion of a generalized integer transform. IEEE Trans. Image Process. 13(8), 1147–1156 (2004)
3. Celik, M.U., Sharma, G., Tekalp, A.M., Saber, E.: Lossless generalized-LSB data embedding. IEEE Trans. Image Process. 14(2), 253–266 (2005)
4. Tian, J.: Reversible watermarking using a difference expansion. IEEE Trans. Circuits Syst. Video Technol. 13(8), 890–896 (2003)
5. De Vleeschouwer, C., Delaigle, J.E., Macq, B.: Circular interpretation of bijective transformations in lossless watermarking for media asset management. IEEE Trans. Multimedia 5(1), 97–105 (2003)
6. Dittmann, J., Benedens, O.: Invertible authentication for 3D-meshes. In: Proc. SPIE, vol. 5020, pp. 653–664 (2003)
7. Fridrich, J., Goljan, M., Du, R.: Lossless data embedding—New paradigm in digital watermarking. EURASIP J. Appl. Signal Process. 2, 185–196 (2003)
8. Fridrich, J., Du, R.: Lossless authentication of MPEG-2 video. In: Proc. IEEE Conf. Image Processing, September 2002, vol. 2, pp. 893–896 (2002)
9. Honsinger, C.W., Jones, P., Rabbani, M., Stoffel, J.C.: Lossless recovery of an original image containing embedded data, U.S. Patent 6 278 791 (2001)
10. Mintzer, F., Lotspiech, J.: Safeguarding digital library contents and users, Digital watermarking. D-Lib. Mag. (December 1997)
11. Kalker, A.A.C.M., Willems, F.M.J.: Capacity bounds and constructions for reversible data-hiding. In: Proc. 14th Int. Conf. Digital Signal Processing, July 2002, vol. 1, pp. 71–76 (2002)
12. Kalker, A.A.C.M., Willems, F.M.J.: Reversible embedding methods. Presented at the 40th Annu. Allerton Conf. Communication and Control (2002)
13. Kamstra, L., Heijmans, H.: Reversible data embedding into images using wavelet techniques and sorting. IEEE Trans. Image Process. 14(12), 2082–2090 (2005)
14. Kamstra, L., Heijmans, H.: Wavelet techniques for reversible data embedding into images. Centrum voor Wiskunde en Informatica Rep. (2004)
15. Maas, D., Kalker, T., Willems, F.M.J.: A code construction for recursive reversible data-hiding. In: Proc. ACM Workshop Multimedia, Juan-les-Pins, France, December 2002, pp. 15–18 (2002)
16. Ni, Z., Shi, Y., Ansari, N., Wei, S.: Reversible data hiding. In: Proc. IEEE Int. Symp. Circuits and Systems, May 2003, vol. 2, pp. 912–915 (2003)
17. Thodi, D.M., Rodriquez, J.J.: Prediction-error-based reversible watermarking. In: Proc. IEEE Conf. Image Processing, October 2004, pp. 1549–1552 (2004)
18. Thodi, D.M., Rodriquez, J.J.: Reversible watermarking by prediction-error expansion. In: Proc. IEEE Southwest Symp. Image Analysis and Interpretation, pp. 28–30 (2004)
19. Van Leest, A., Van der Veen, M., Bruekers, F.: Reversible image watermarking. In: Proc. IEEE Conf. Image Processing, September 2003, vol. 3, pp. 731–734 (2003)
20. Lee, S., Chang, D.: Reversible Image watermarking based on integer to integer wavelet transform. IEEE, Los Alamitos (May 2007)
21. Xuan, G., Zhu, J., Chen, J., Shi, Y.Q., Ni, Z., Su, W.: Distortionless data hiding based on integer wavelet transform. IEE Electron. Lett. 38(25), 1646–1648 (2002)
22. Kurshid Jinna, S., Ganesan, L.: Lossless Image Watermarking using Lifting Wavelet Transform. International Journal of Recent Trends in Engineering 2(1) (November 2009)
23. Zhao, Y.: Dual Domain Semi-Fragile Watermarking for Image Authentication. Master Thesis, University of Toronto (2003)
24. Zeki, A.M., Manaf, A.A.: Digital Watermarking Based on ISB. PhD Thesis, University Technology Malaysia (2008)
Modeling and Analysis of Reconfigurable Systems Using Flexible Nets

Laid Kahloul 1,2, Allaoua Chaoui 3, and Karim Djouani 2,4

1 LISSI Laboratory, Paris Est University, Paris, France
2 Computer Science Department, Biskra University, Algeria
[email protected]
3 Computer Science Department, Constantine University, Algeria
[email protected]
4 F'SATI at TUT, Pretoria, South Africa
[email protected]
Abstract. Reconfigurable systems, whose structure is dynamic at runtime, are widely used in mobile code systems and mobile networks. Developing such systems and ensuring their correctness can require formal tools. Petri Nets have been widely used for modeling and verifying these systems, and High-Level Petri Nets have been proposed to model some of their aspects. In this paper, we propose a new formalism, "Flexible Nets", with a high degree of dynamicity that allows easy and intuitive modeling of reconfigurable systems. Its expressive power comes from the fact that all constituents of the net's structure can be added or deleted during the execution of the net. The paper presents the formal definition of the formalism, gives an example on mobile code systems, and discusses some analysis issues. Keywords: Reconfigurable Systems, Mobile Code Systems, Colored Petri Nets, Flexible Nets.
1 Introduction Reconfigurable systems are systems where the structure is not fixed. During their execution, the structure of these systems can change. These systems cover numerous domains: mobile code systems [12], wireless and mobile networks [25], mobile robots [26], etc …These systems are used in many applications. In mobile code systems architectures, a code can migrate from one host to another. In [12], three major design approaches for mobile code systems are presented: (i) Remote evaluation systems: where some execution unit send other units towards a remote host (where some resources reside) to perform a computation, (ii) Code on demand: where an execution unit downloads a remote code (with a new know-how) to its local host, and (iii) Mobile agent systems: which are the smartest systems, where code is autonomous to decide the time and the destination of migration. Most current systems are based on these three major ideas. In [13], MundoCore, a communication middleware for pervasive computing, is built on the concept of dynamic reconfigurable architecture F. Zavoral et al. (Eds.): NDT 2010, Part I, CCIS 87, pp. 343–357, 2010. © Springer-Verlag Berlin Heidelberg 2010
with “loadable modular services”. This middleware allows user’s applications to install/replace services at runtime to ensure efficiency to the mobile user. By services we mean components with well-defined interfaces that typically consist of multiple objects that implement some functionality. In [14], Mobile Agent paradigm is exploited to build a middleware that transparently avoids multimedia services interruptions by managing handoffs in wireless networks. A mobile agent (called session proxy client or SP) is associated to every mobile user. When the mobile user changes its wireless Internet cell, the associate SP manages the handoff and grants transparency to the user. In [15], mobile agents are used to provide services to mobile users. In this last system, mobile agents are used to build VHE (virtual home environment) that provides users with services irrespective of network access and devices capabilities. In this case, mobile agents are used to represent the users in the network, and to bring services and preferences profiles during roaming. This system will be presented with more details as a case study in this paper. However, many of these systems are developed in an ad-hoc approach, which have produced incompatible platforms, security problems, etc… All these problems can limit the advantages of such technologies. Formal approaches provide techniques to ensure the correction and reliability of these kinds of systems. Research on formal tool for mobile systems, has produced many formalisms and modeling language. The most relevant propositions are inspired from π-calculus [1] and from Petri nets. To cover the expressive weakness of classical Petri nets, researchers have proposed some extensions to deal with mobility aspect: Mobile Petri nets [5], Dynamic Petri Nets [5], Elementary Object Nets [6], Reconfigurable Nets [7], Nested Petri Nets [8], Hyper Petri Nets [9], Mobile Synchronous Petri Net [10], MTLA (Spatial Temporal Logics) [11], … etc. In mobile code systems, type of resources and their bindings play a central role in the migration process. Resources decide also the success or failure of the process. Formal proposed approaches do not deal with these aspects and their problems. In our previous works, we have proposed “Labeled Reconfigurable Nets” [16] extended to “Colored Reconfigurable Nets” in [17]. Our objective was to propose a formal and graphical tool to model mobile code systems in an easy and intuitive way. In all these works, we were interested to provide formalisms that model mobility explicitly. The mobility is modeled through the reconfiguration of the net’s structure when some transitions are fired. When trying to offer this quality in a model, we have to deal with the problem of interpreting this reconfiguration formally. In the two works [16] and [17], we have introduced specific transitions “reconfigure transitions”, which reconfigure the net when they are fired. The first inconvenience of this solution is that we must provide a specific treatment of these transitions when the model is analysed. The second inconvenience is the use of rigid labels associated to transitions makes the model proposed in [16] not a parameterized model. The objective of the present paper is to deal with the reconfiguration of the net with another point of view. For this reason, we introduce specific sorts in the model. These sorts will contain signed objects (places, transitions, and arcs). An internal operation will be defined in these sorts. 
This operation adds or deletes objects according to their signs. Types in the model can be constructed from these sorts and from other known types. Reconfiguration of the net's structure is then interpreted as an operation that manipulates the structure through its components, which are signed objects.
The presence of a positive object (resp. negative object) in some place can be a reason to add (resp. delete) this object to (resp. from) the structure of this net. The formalism proposed is called Flexible Nets and reflects the idea that the model has a dynamic structure. This structure can be expanded, shrunken, or destroyed. By this work, we attempt to provide a formalism for reconfigurable systems (more precisely mobile code systems) that helps the developer of this kind of systems in the process of modeling and offers to him a method to validate his models. The rest of this paper is organized as follows: In section 2, we present the formalism “Flexible Nets”. In section 3, we show the use of Flexible Nets to model a mobile middleware (where mobility is both physical and logical). Section 4 discusses some ideas that can be used to verify the models, and section 5 presents related works. Finally, section 6 concludes this paper.
2 Flexible Nets

Flexible Nets (FN) is an extension of Colored Petri Nets [18]. In FN, places, transitions and arcs are objects that can appear as the marking of places. These objects can be signed, positive or negative. We introduce three sorts: P (for places), T (for transitions), and A (for arcs); these three sorts contain negative and positive objects. We abandon the use of reconfigure transitions: any transition can change the structure of the net, depending on the expressions labeling its input arcs. By using signed objects, we offer the possibility of adding nodes to (or deleting nodes from) the net when a transition is fired. The reconfiguration of the net by adding or deleting nodes is interpreted as an internal operation defined in the sorts P, T and A; the next paragraphs present this internal operation and how it reconfigures a net. The initial marking of an added place, the guard associated with an added transition, and the expression labeling an added arc must be present when the transition that adds these nodes is fired. These three pieces of information can be modeled as data present in the input places of the transition. Finally, we consider that transitions can be timed.

2.1 Formal Definition

A Flexible Net N is a 9-tuple (Σ, P, T, A, C, G, E, I, τ), where:
• Σ: a set of types (colors). We denote by Σ* the set of all multi-sets over Σ;
• P: a set of places;
• T: a set of transitions;
• A: a set of arcs, A ⊆ (T×P) ∪ (P×T);
• C: a color function associated with each place, C: P → Σ. For each place p, C associates a unique color C(p);
• G: a guard function associated with each transition, G: T → Exp, where Exp is the set of all Boolean expressions that can be constructed using constants and variables defined in the types of Σ;
• E: an expression function that associates with each arc a in A an expression E(a);
• I: an initial state of the net, I = <M0, S0>, where M0 is the initial marking of the places P, M0: P → Σ*, and S0 is the initial structure of the net. We take S0 = P∪T∪A.
• τ: is a mapping which associates to each transition t from T a time τ(t). We consider that τ takes values in the set of positive real numbers. In section 2 and in section 3 of this paper, variables are always written as italic letters and constants as regular letters. Expressions are always written between angle brackets. The expression <> denotes the empty expression. The sets P, T and A are types in Σ. By P, T and A, we denote the names of these types and the sets that contain the elements (places, transitions and arcs) that constitute N. These types contain names of places, transition and arcs. These names can be signed. For example, we accept p as an unsigned place, +p, -p as a signed place. We define the internal operation ⊕ in sets P, T, and A. We denote by ∅ the neutral element for ⊕. We adopt the following proprieties for this operation, let p, p’, p’’ be elements in one of the three above sets (P, T, or A): • p⊕p’= p’⊕p; (p⊕p’)⊕p’’= p’⊕(p⊕p’’); p=+p; p⊕-p=-p⊕p=∅; p⊕∅=p; If P’={p1, p2, p3} is a subset of P. We allow that P’ can also be written as an expression P’term=p1⊕p2⊕p3. In this semantics, the expression p⊕p’ denotes the set {p, p’}, and the expression p⊕-p denotes the set {p}/{p’} which is briefly the set {p}. The neutral element ∅ denotes the empty set {}. In the set of types Σ={γ1, γ2, …}, we can have complex types. If γ is a complex type in Σ, then every element x in γ can be written as a tuple <x1, x2, x3, …, xn>, such as n is the arity of x. In this case, we denote by xi the sub-element of order i in the element x, and we denote by Type(xi) the type of this sub-element. In a tuple <x1, x2, x3, …, xn> which appears in a marking of a place p, we suppose that: • if Type(xi)=P (which means that xi is a place), then xi+1 and xi+2 must be the two expressions that represent respectively, the initial marking of xi and the type of xi. If xi is negative xi+1 and xi+2 can be empty expressions; • if Type(xi)=T (which means that xi is a transition) then xi+1 must be the expression that represents the guard of xi. If xi is negative xi+1 can be an empty expression; • if Type(xi)=A (which means that xi is an arc), then xi+1 must be the expression labeling the arc xi. If xi is negative xi+1 can be an empty expression; 2.2 Firing Rules Let N be a Flexible Net, and t a transition in T. As in CPN [18], we denote by °t the set of input places of the transition t, and by t° the set of output places of the transition t. Let I0=<M0, S0> be the current state of N. Firing t changes I0 towards I1=<M1, S1>. We denote this as : <M0, S0> t<M1, S1>. Preconditions to fire t. t can be fired iff: the time associated to t which is τ(t) is expired and there is a unification υ such that M0(p)≥E(p, t)[υ], for each p in °t. Post-conditions of firing t. after firing t, N will transit from its current state I0 to another state I1=<M1, S1>. For each p in °t we will have: M1(p)=M0(p)-E(p,t)[υ]. For each p in t°, we will have: M1(p)=M0(p)+E(t,p)[υ]. S0= P0∪T0∪A0 will be updated to
S1=P1∪T1∪A1, with their terms P1term , T1term , A1term . For each p in °t, if E(p, t) is the tuple <e1, e2, …, en>, then for each ei in e1, e2, …, en: • if Type(ei)=P then P1term= P0term⊕ei[υ], C(ei[υ])=ei+1[υ], and M1(ei[υ])=ei+2[υ]; • if Type(ei)=T then T1term = T0 term ⊕ ei[υ] and G(ei[υ])=ei+1[υ]; • if Type(ei)=A then A1term = A0term ⊕ ei[υ] and E(ei[υ])=ei+1[υ].
The following three examples are presented to further clarify the semantics of this formalism. We start with the example of Fig. 1.
Fig. 1. Flexible Nets first example (the net before and after firing t1)
In Fig. 1, firing the transition t1 will change the structure of the net. In the new structure, a new place p2 and a new arc (t1, p2) are added. To ensure this behaviour, we must have at the initial state of the net: • M0(p1)=<+p2, <>, “integer”, +(t1,p2), <1>>. This marking will allow the creation of a new place p2 which type is integer with an empty initial marking and a new arc (t1,p2) which is labelled <1>. • (p1,t1) is labeled . Where: p2, is a variable of type P, a1, is a variable of type A, mark_p2, Type, exp_a1 are variables of type expression. From this initial state, it is clear that t1 can be fired with respect to the substitution: υ=[p1+p2, mark_p1<>, Type”integer”, a1+(t1,p2), exp_a1<1>] System of Fig. 2 has a more complex behavior. In this case, the system starts by firing t1 which will delete the arc (p0, t0) and will add: the place p2, the transition t0, and the two arcs {(t0, p2), (p2, t1)}; then, from this state, t1 can be fired an which restitutes the first structure (by deleting the new added nodes and restituting the arc (p0, t0)), and finally the t0 destroys the net. To facilitate this presentation, types of places will not appear in the following expressions. Initially, we must have: • M0(p0)=<-(p0,t0), p2, <-t1, -p2, (p0,t0), >, t1, <>, ∅, <>, (t0, p2), <∅, ∅, ∅, <>>, (p2,t1), , (t1, p0), <∅, -p0, <>, -t0, <>, -p1, <>, ∅, <>, ∅, <>, ∅, <>>>; • E(p0, t0)=; • E(t0, p1)=<1> and M0(p1)=0.
Fig. 2. Flexible Nets second example (the successive structures obtained by firing t0, then t1, then t0)
From this initial state, the transition t0 is enabled with the substitution: υ=[a1(p0,t0), p1 p2, mark_p1<-t1, -p2, (p0,t0), >, t1 t1, grd_t1<>, p2∅, mark_p2<>, a2(t0, p2), exp_a2<∅, ∅, ∅, ∅, <>>, a3(p2,t1), exp_a3< t1, grd_t1, p1, mark_p1, a1, exp_a1>]. Once t0 is fired: • (p0, t0) will be deleted, • p2 will be added with M1(p2)=<-t1, <>, -p2, <>, (p0,t0), >, • t1 will be added, • (t0, p2) labeled <∅, ∅, ∅, ∅, <>> will be added, • (p2, t1) labelled will be added, • (t1, p0) labeled <∅, -p0, <>, -t0, <>, -p1, <>, ∅, <>, ∅, <>> will be added, • M1(p1)=<1>. From this second state, t1 is enabled with the substitution: υ’=[t1-t1, grd_t1<>, p1-p2, mark_p1<>, a1(p0, t0), exp_a1]. When t1 is fired: t1 and p2 will be deleted, (p0, t0) is restituted, and M2(p0)=< ∅, -p0, <>, -t0, <>, -p1, <>, ∅, <>, ∅, <>> From this third state, t0 is enabled with the substitution: υ’’= [a1∅, p1-p0, mark_p1<>, t1-t0, grd_t1<>, p2-p1, mark_p2<>, a2∅, exp_a2<>, a3∅, exp_a3<>]. When t0 is fired, the net will be destroyed.
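To make the reconfiguration mechanics concrete, the following Python sketch represents a Flexible Net state as a structure of signed nodes plus a marking, and applies the structural part of the firing rule: positive nodes found in the consumed tokens are added to the structure and negative nodes are removed (the ⊕ operation of Section 2.1). The data layout and helper names are assumptions of this sketch, not part of the formal definition.

```python
from dataclasses import dataclass, field

@dataclass
class FlexibleNetState:
    places: set = field(default_factory=set)
    transitions: set = field(default_factory=set)
    arcs: set = field(default_factory=set)
    marking: dict = field(default_factory=dict)   # place name -> list of tokens

    def combine(self, kind, signed_name):
        """The internal operation: '+p' (or 'p') adds node p, '-p' removes it."""
        target = {"P": self.places, "T": self.transitions, "A": self.arcs}[kind]
        if signed_name.startswith("-"):
            target.discard(signed_name[1:])
        else:
            target.add(signed_name.lstrip("+"))

    def fire(self, transition, consumed, produced, structure_updates):
        """Apply the post-conditions of firing: update the marking, then the structure.

        structure_updates is a list of (kind, signed_name) pairs taken from the
        tuples consumed on the input arcs of the transition.
        """
        assert transition in self.transitions
        for place, tokens in consumed.items():
            for tok in tokens:
                self.marking[place].remove(tok)
        for place, tokens in produced.items():
            self.marking.setdefault(place, []).extend(tokens)
        for kind, signed_name in structure_updates:
            self.combine(kind, signed_name)

# The first example (Fig. 1): firing t1 adds place p2 and arc (t1, p2).
net = FlexibleNetState(places={"p1"}, transitions={"t1"}, arcs={"(p1,t1)"},
                       marking={"p1": [("+p2", "(t1,p2)")]})
net.fire("t1", consumed={"p1": [("+p2", "(t1,p2)")]}, produced={},
         structure_updates=[("P", "+p2"), ("A", "+(t1,p2)")])
print(net.places, net.arcs)   # contains p1, p2 and the arcs (p1,t1), (t1,p2)
```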
3 A Case Study To show the application of the formalism on a large system, we have chosen as a case study a middleware VHE (Virtual Home Environment). A VHE middleware provides requested services for mobile users irrespective of their locations, their networks and their devices. The system modeled has been presented and developed in [15]. The system is developed for UMTS architectures (Universal Mobile Telecommunications System). In these architectures, fixed networks are connected to wireless networks to
allow service’s access for mobile users. Users can access services through a variety of devices: PCs, laptops, or mobile phones. The VHE system is realized through a set of stationary and mobile agents. In [15], used agents and some scenarios are described. As a usage of Flexible Nets, we will show models that can be proposed for some agents of the VHE system. A brief description of these agents and used scenarios are firstly presented. 3.1 Used Agents in the VHE The Stationary agents used are: (1) The NSA (Networks Supervisor Agents): supervises all the networks in the UMTS infrastructure, (2) The NA (Network Agent): supervises one network, authenticates connected users, routes messages between networks, registers and supervises the RNCAs (Agents responsible on Radio Network Connection), (3) The RNCAs: Agents responsible on Radio Network Connection, (4) The SAs (Service Agents): represent services in the net, (5) The PA (Provider Agent): creates and manages the SAs, dispatches the SAs towards nodes where the user (who required these services) is located, maintains an overview about all available services, and creates VHE-agents (which contain user service profile), (6) The NBA (Node-B Agent): manages a node B, routes messages from (resp. to) user to (resp. from) RNCA, (7). The mobile agents used in this system are: (1) The VHEA (Virtual Home Environment Agent): it contains user’s service-profile (personal data, information about subscribed services, the provider of each service, etc …), (2) The MUA (Mobile User Agent): an agent that represents the user in the entire network. 3.2 The Initialization Scenario Initially, the NSA agent is created to supervise the set of networks in the infrastructure of the UMTS system. After the creation of the NSA and for each network, an NA is created. Each created NA starts by informing the NSA that it is created and receives from the NSA the set of available nets. After this, the NA creates the PA (Provider Agent). When the PA is created, it starts by requiring (from the NSA) the list of available services in the set of networks supervised by the NSA. In case of failure of some service, the PA requires an update from NSA. For each RNC (Radio Network Connection), an RNCA (RNC-Agent) is created. The RNCA starts by informing the NA that it is created, and so the NA starts a heart-beat protocol to check continuously if the RNCA is operational or not. In case when some RNCA is failed, the NA updates the data base of available RNCAs. The same heart-beat protocol is applied between the RNCA and NBA (responsible on a radio node B) when the latter is created. In the Flexible Nets, transitions which reconfigure the net will be used to model wireless transitional communications and mobility behaviors. Once fired, these transitions will add or delete some nodes (places, transitions, and arcs). These nodes model a communication protocol between two agents or they can model a whole mobile agent. Types used in the specification can be complex (called structures). The type blackToken can contain only one constant denoted bt. These black tokens are used only for synchronization purposes. The sets P, T and A are three types that can
be used in the specification. Variables of types P, T, and A are written in lower-case, italic, and indexed letters (p1, t1, a1). Constants are written as regular, lower or uppercase indexed letters. In the following description, arcs for which we don’t specify a label are labeled bt. We define a type name that contains names (strings or identifier). Constants in this type are always preceded by n, and variables in this type are denoted: x, y, z, etc… We define three functions: Np, Nt, and Na. Np: name P. It associates to each name n a node. For example, if np is a name then Np(np) is the place p. Respectively, Nt and Na do the same thing for transitions and arcs. We define a function Name: P∪T∪A name. It associates a name to a node. Places and transitions created dynamically to model transitional communications are always denoted by identifiers that start with the letter c like: cp, and ct. If a place p (resp. a transition t) appears in the model of an agent A, we denote p as A-p (resp. t as A-t). In inter-communication between agents, we use some interface places. Identifiers of these places are always post-fixed by in (for an input interface place) our out (for an output interface place). For space reason, we show only the models of the agents NA, (Fig. 3). NA Pin2 P0
Fig. 3. Flexible Net Model for NA
Fig. 3 shows NA model and Table I and Table II show the most important labels. The transition T1 models the emission of “I am here” to the NSA. This emission is done through a wireless communication. The transition T1 will create the dynamic configuration required in this communication between NA and NSA (it is considered dynamic because it will disappear once the communication is achieved). This configuration is a set of nodes added once T1 is fired. These nodes are: cp1, (T1, cp1),
ct1, (cp1, ct1), (ct1, NA-pin1), ct2, (NA-pout1, ct2), (ct2, pin1). The transition T2 models the reception from NSA of the list of networks supervised by this NSA. When T2 is fired, it deletes (finishes the protocol) the nodes created by T1. The transition T3 creates the PA. T4 models the reception of “I am here” from an RNCA, it updates the DB of RNCAs, and it deletes the configuration created previously by RNCA. Variables x, y in the label of (pin2, T4) will be substituted by the two places used by RNCA to communicate with this NA. The transition T5 models the emission of “heart-beat” to some RNCA chosen from the RNCADB (RNCA Data Base). T5 creates the configuration required to the communication between NA and RNCA. When T5 is fired, it will add nodes: cp1, (T5, cp1), ct1, (cp1, ct1), (ct1, RNCA-pin1), ct2, (RNCApout1, ct2), (ct2, pin3). The transition T6 models the receptions of the “current state” from the RNCA required, the updating of the data base of RNCAs, and the deleting of the configuration created by T5. The transition T7 is a timed transition. T7 will be fired if the waiting time of the RNCA response is expired. Firing T7 updates the data base of RNCAs with the information that the RNCA required is no more operational. Table 1. Label for NA Model (1) Node p0 (p0, t0) (t0, p1) (p1, T1) (T1, p3) (p3, T2) (pin2, T4) (T4, BDRNCA) (RNCADB, T5) (T5, p9) (p9, T6) (T6, RNCADB) (T5, p11) (p11, T7)
Marking or label <-cp1, -(T1, cp1), -ct1, -(cp1, ct1), -(ct1, NSAp1), -ct2, -(NSAp2, ct2), -(ct2, pin1)> . <-p1, -a1, -t1, -a2, -a3, -t2, -a4, -a5> <-p1, -a1, -t1, -a2, -a3, -t2, -a4, -a5> <-p1, -a1, -t1, -a2, -a3, -t2, -a4, -a5>
3.3 Connection Scenario for a Mobile Device When a user wants to connect to the network using a mobile device, he starts by opening his device and entering his identifier code <X>. A request authentication is sent to the NBA (Node B Agent), where the device is located. The NBA will transfer this request towards the RNCA (Radio Network Agent). The latter forwards the request to the NA. When the NA receives this request it will authenticate the user and responses are forwarded agent by agent to NBA. Once NBA receives this notification, it creates the MUA (Mobile User Agent) which will represent the user in all the net, and a TA (Terminal Agent) is created on the mobile device. The TA will transfer device capabilities to the MUA. In parallel with this scenario, the NA requires the user profile from the PA (Provider Agent). If the PA doesn’t contain the user profile (the case of roaming users), the NA requires the profile from another NA in the user home net. When the PA receives the user profile, it creates the VHEA (Virtual Home
Agent) which will contain the user profile. The PA sends the VHEA to the RNCA, and the latter forwards it to the NBA. Once the VHEA is in the NBA’s environment, it communicates the user’s profile information to the MUA. The MUA uses these information and the device’s capabilities to decide which services will be offered to the user. In the specification (Fig. 5, Table II), we use the types TCode and TProfile. An element of TCode denotes the code of a user and an element of TProfile denotes the profile of a user. Table 2. Label for NA Model (2) Node (pin4, T8) P21 (p21, T8), (T8, p21), (p21, t3), (t3, p21) (T8, p14) (p14, T9) (T9, p16) (p16, T10) (pin5, T10) (Pp, T8) (Pp, t4) (Pp, t16) (t8, p20) (p20, T11) (T11, p17) (p17, T12) (T12, p18) (p18, t5) (t3, p14) (Pp,T16) (t16, pout1)
Marking or label <X> , ct1, (cp1, ct1), <x, why>, (ct1, PA-pin4), <x, why>, ct2, (PA-pout2, ct2), <x, why, profile, exists>, (ct2, pin5), <X, why, profile, exists>, X, AUT> <-p1, -a1, -t1, -a2, -a3, -t2, -a4, -a5> <X, why, profile, exist> <X, AUT, profile, 0> <X, AUT, profile, 1> <X, ROA, profile, 1> , ct1, (cp1, ct1), <X>, (ct1, NA’-pin6), <X>, ct2, (NA’-pout1, ct2), <X, profile>, (ct2, pin7), <X, profile>, X> <-p1, -a1, -t1, -a2, -a3, -t2, -a4, -a5> , ct1, (cp1, ct1), , (ct1, PA-pin3), > , ct1, (cp1, ct1), <x, y>, (ct1, PA-pin4), <x, y>, ct2, (PA-pout2, ct2), (ct2, pin5), X, ROA> <X, ROA, profile, 1> <X, profile>
In Fig. 3 (see also Table II), the transition T8 models the reception of “Authenticate user X?” from the RNCA. The transition T9 models the emission of a “Get profile of user X?” to the local PA. Once T9 is fired, it will create the configuration required to the communication between the NA and the PA. This configuration is composed of the nodes: cp1, (T9, cp1), ct1, (cp1, ct1), (ct1, PA-pin4), ct2, (PA-pout2, ct2), (ct2, pin5). The transition T10 models the reception of a response about the “Profile of user X?” from the PA. When T10 is fired, it deletes the nodes added by the transition T9. The place Pp will contain tokens which are structural data : < X, why, profile, exist>. The parameter why distinguishes the two cases where a profile is required. The first case is for an authentication, and in this case the value of the parameter why is equal to (AUT). The second case is the roaming, and in this case the value of why is (ROA). If
the profile doesn’t exist the value of exist is 0. The transition T11 models the emission of a “User Profile Request” to another NA located at the Home Network of a roaming user. If the profile is in the local PA or after it has been downloaded from the home net of the user, the NA sends response for authentication through t6 or sends this profile to another NA’ through t16. The transition T11 creates the configuration required to the communication between: NA and NA’. The transition T11 will add nodes: cp5, (T11, cp5), ct6, (cp5, ct6), (ct6, NA’-pin6), ct7, (NA’-pout1, ct7), (ct7, pin7). The transition T12 models the reception of a response to the sent request: “Get profile of user X?”. This response comes from the NA’ (located in the Home Network of the roaming user). When the transition T12 is fired, it will delete the configuration created by T11. The transition t5 models the action of updating the Data Base used in the PAs (Provider Agents). t5 communicates to PA the profile obtained from NA’. The transition t6 models the emission (to RNCA) of the response of the request “Authenticate user X?”. The transition T15 models the reception of a “Get profile of user X?” from another agent NA’. Finally, the transition T16 models the emission of the “Profile of user X” to NA’.
4 Analysis Issues The use of formal tools finds its motivation in the analysis and verification techniques that can be used with these tools. The proprieties that can be verified depend on the kind of the tool and the modeled system. In case of Petri Nets, some known proprieties are defined: boundedness, safety, reachability …etc. The most of proposed extensions of Petri Nets have some automatic tools to achieve the verification of some proprieties. In the current work, we consider that the same proprieties defined for classical Petri Nets can be studied and extended to Flexible Nets. To analyze these proprieties, we proposed two ways in the current time. The first way is based on an extended reachability tree that can be generated automatically. In classical Petri Nets, the reachability tree has as root the initial marking, and as nodes all the reachable marking. Arcs between nodes (marking) are labeled with the transition that transforms some marking (first node) to another marking (second node). In the FN, nodes of this reachability tree are not simple marking but they are states. Each state is a couple <S, M>, where S is the current structure of the net, and M is the marking associated to this structure. The root is the initial state <S0, M0>. The nodes are the set of reachable states. Each two states are linked by an arc labeled with the transition that transforms one state (the first node) to the other state (second node). As in classical Petri Nets, if this tree is finite then many proprieties can decided. In the case of an infinite reachability tree, the analysis will not be complete. In this work, we have realized a small prototype, that can be used to compute reachability tree for some given net, and for some number of level in case of infinite tree. The user must enter the net as a specification in a text to the program. The second analysis way is to unfold the flexible nets to another low level net for which some analysis tools existed. In our case, we have chosen the unfolding of Flexible Nets toward Dynamic Nets (DN) [5]. The dynamic Nets are high level nets, where new transitions can be added to the original nets when some existing transitions are fired in this net. Although the power expressiveness of Dynamic Nets, they
impose some constraints on the structure of the net. The high dynamicity of Flexible Nets makes the unfolding complex but possible. We have proposed a transformation technique that translates an FN into a DN. The objective of this transformation is to profit from the fact that a DN can in turn be unfolded into a CPN (Colored Petri Net), which can be analyzed with existing tools. The transformation from FN into DN is a formal transformation, which makes it possible to automate the unfolding; it has been proved but is not yet automated.
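The first analysis route described above, building the extended reachability tree whose nodes are states <S, M>, can be sketched as a breadth-first exploration bounded by a maximum depth, as in the prototype that limits the number of levels for infinite trees. The state representation and the enabled/fire helpers below are assumptions of this sketch.

```python
from collections import deque

def reachability_tree(initial_state, enabled_transitions, fire, max_depth=10):
    """Breadth-first construction of the reachability tree of a Flexible Net.

    initial_state must be hashable (e.g., frozen structure + frozen marking),
    enabled_transitions(state) yields the transitions enabled in a state, and
    fire(state, t) returns the successor state <S', M'>.
    """
    edges = []                                   # (state, transition, successor)
    seen = {initial_state}
    frontier = deque([(initial_state, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if depth >= max_depth:
            continue                             # cut off potentially infinite trees
        for t in enabled_transitions(state):
            succ = fire(state, t)
            edges.append((state, t, succ))
            if succ not in seen:                 # fold repeated states together
                seen.add(succ)
                frontier.append((succ, depth + 1))
    return edges
```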
5 Related Works Research on the use of Petri nets to model systems with dynamic structure has provided some remarkable results. The most important propositions are dedicated to mobile systems and mobile agents. In PrN (Predicate/Transition nets) [19], mobile agents are modeled through tokens. These agents are transferred by transition firing from an environment to another. In this work, the structure of the net does not change. The agents are represented as token, so this abstraction does not allow representing some complex behavior of this kind of agents. In [10], authors proposed MSPN (Mobile synchronous Petri net) as formalism to model mobile systems and security aspects. They have introduced the notions of nets (an entity) and disjoint locations to explicit mobility. A system is composed of set of localities that can contain nets. To explicit mobility, specific transitions are introduced. Two kinds of specific transitions were proposed: new and go. Firing a go transition moves the net from its locality towards another locality. The destination locality is given through a token in an input place of the go transition. In this work, mobility is not also explicit. Mobility is implicitly modeled by the activation of some nets and the deactivation of other nets, using tokens. Migration of an agent is modeled by the deactivation of the net modeling this agent in a locality and the activation of the net that represents this same agent in the destination locality. So, this is a kind of simulation of mobility. In nest nets [20], tokens can be Petri nets themselves. This model allows some transition when they are fired to create new nets in the output places. Nest nets are hierarchic nets where we have different levels of details. Places can contain nets, and these nets can contain also nets as tokens in their places et cetera. So all nets created when a transition is fired are contained in places. So the created nets are not in the same level with the first net. This formalism is proposed to adaptive workflow systems. In “reconfigurable net” [3], the structure of the net is not explicitly changed. No places or transitions are added in runtime. The key difference with colored Petri nets is that firing transition can change names of output places. Names of places can figure as weight of output arcs. This formalism is proposed to model nets with fixed components but where connectivity can be changed over time. In [21] PEPA nets are proposed, where mobile code is modeled by expressions of the stochastic process algebra PEPA which play the role of tokens in (stochastic) Petri nets. The Petri net of a PEPA net models the architecture of the net, which is a static one. Mobile Petri nets (MPN) [5] extend colored Petri nets to model mobility. MPN is inspired from joincalculus [4]. The output places of transition are dynamic. The input expression of a transition defines the set of its output places. In all these formalism, the structure of the net is not changed and mobility is modeled implicitly through the net’s dynamic.
In this model, an important work is required from the modeler to model mobility implicitly. MPN are extended to Dynamic Petri Net (DPN) [5]. Mobility in DPN is modeled explicitly, by adding subnets when transitions are fired. The DN formalism implies some constraints, (i) No transition without input places, (ii) Added nets, to the original net, must not modify the input of an existing transition in the original net, (iii) We can’t add a connection between two disconnected existing nodes, and (iv) We can’t delete nodes (place, transition or connection). Flexible Nets doesn’t imply these constraints, so it is more flexible and more expressive. We consider that Flexible Nets can be used by reconfigurable systems developers with more flexibility than other formalisms. This is due to the feature that it models mobility explicitly through mobility of nodes in the Flexible Net. Developers can encode mobile aspects of their system directly and explicitly in the FN formalism. The power of Petri nets resides in its verification methods. To ensure verification of high level Petri nets, some works were proposed. In [24], author proved the equivalence between Reconfigurable nets and the join calculus. Reconfigurable nets can be interpreted in join calculus and so can be verified. In [22], P/Tω nets are translated into linear logic programming. Author of [23], encoded Synchronous mobile nets in rewriting logic; they can use Maude to verify specifications. In this paper, we have first way through simulating the net or drawing automatically its reachability tree. The second way that requires more development in future papers, consists of the unfolding of the FN into the Dynamic Nets. These last one can be transformed into CPN.
6 Conclusion Petri nets are an elegant model for concurrency. With its graphical representation and its formal background, it was used to specify and verify concurrent multi-processes systems. The classical model has not the power of expressiveness to deal with current aspects such as mobility. To take benefits from the power of the model in mobility domains, several works have been proposed. These works try to extend Petri nets with same ability to specify mobility. We can distinguish between extensions that model mobility in an implicit way (no modification in the structure of the net), or in an explicit way (the net reconfiguration models components mobility). Proposed models are complex and classical verification tools of Petri nets could not be applied. To deal with this problem, some research works try to interpret high level models in basic specification language: linear logic, rewriting logic or CCS calculus [2]. Some other works proposed operational and denotational semantics for these models. These semantics can be used to prove some proprieties like: bisimulation, isomorphism with other languages, implantation validation, etc … In this paper, we have proposed “Flexible Nets”, a formalism to specify systems with dynamic structure. We have shown the expressiveness of this formalism through the modeling of some agents in a VHE system. To analyze models, we have proposed tow plausible ways: (i) through a simulator tool that simulates the firing of transitions and that depict the reachability tree, (ii) through the unfolding of Flexible Nets specifications into Dynamic Nets models. The dynamic nets models can be encoded into CPN [5]. The use of Flexible Nets facilitates the tasks of the developer that want to realize formal specification of his system.
The model presented in Section 3 is based on the informal presentation of the VHE system in [15], where some protocols are described using UML sequence diagrams. We did not apply any systematic rules to generate the FN models; in future work we plan to propose rules that support the passage from UML to FN, which seems feasible given the existing work on transforming UML diagrams into Petri Nets. The present work can also be improved at the analysis level. We are completing the automatic tool that computes reachability trees for FN models and, where possible, verifies the required properties; it will be used to analyze behavioral properties of reconfigurable systems specified with FN. Another tool can also be developed to automate the unfolding of FN into DN.
References 1. Milner, R., Parrow, J., Walker, D.: A calculus of mobile processes. Information and Computation 100(1), 1–40 (1992) 2. Milner, R.: A Calculus of Communication Systems. LNCS, vol. 92. Springer, Heidelberg (1980) 3. Badouel, E., Javier, O.: Reconfigurable Nets, a Class of High Level Petri Nets Supporting Dynamic Changes within Workflow Systems. Research report INRIA (1998) ISSN 0249-6399 4. Fournet, C., Gonthier, G.: The Join Calculus: A Language for Distributed Mobile Programming. In: Barthe, G., Dybjer, P., Pinto, L., Saraiva, J. (eds.) APPSEM 2000. LNCS, vol. 2395, pp. 268–332. Springer, Heidelberg (2002) 5. Asperti, A., Busi, N.: Mobile Petri Nets. Mathematical Structures in Computer Science 19(6), 1265–1278 (2009) 6. Valk, R.: Petri Nets as Token Objects: An Introduction to Elementary Object Nets. In: Desel, J., Silva, M. (eds.) ICATPN 1998. LNCS, vol. 1420, pp. 1–25. Springer, Heidelberg (1998) 7. Buscemi, M., Sassone, V.: High-Level Petri Nets as Type Theories in the Join Calculus. In: Honsell, F., Miculan, M. (eds.) FOSSACS 2001. LNCS, vol. 2030, pp. 104–120. Springer, Heidelberg (2001) 8. Lomazova, I.A.: Nested Petri Nets; Multi-level and Recursive Systems. Fundamenta Informaticae 47, 283–293 9. Bednarczyk, M.A., Bernardinello, L., Pawlowski, W., Pomello, L.: Modelling Mobility with Petri Hypernets. In: Fiadeiro, J.L., Mosses, P.D., Orejas, F. (eds.) WADT 2004. LNCS, vol. 3423, pp. 28–44. Springer, Heidelberg (2005) 10. Rosa-Velardo, F., Marroqn Alonso, O., Frutos Escrig, D.: Mobile Synchronizing Petri Nets: a choreographic approach for coordination in Ubiquitous Systems. In: 1st Int. Workshop on Methods and Tools for Coordinating Concurrent, Distributed and Mobile Systems, MTCoord’05. ENTCS, vol. 150 (2005) 11. Knapp, A., Merz, S., Wirsing, M., Zappe, J.: Specification and refinement of mobile systems in MTLA and mobile UML. Theoretical Computer Science 351, 184–202 (2006) 12. Fuggetta, A., Picco, G.P., Vigna, G.: Understanding Code Mobility. IEEE transactions on software engineering 24(5) (May 1998) 13. Aitenbichler, E., Kangasharju, J., Muhlhauser, M.: MundoCoreA: Light-weight Infrastructure for Pervasive Computing. Pervasive and Mobile Computing 3(4), 332–361 (2008)
14. Bellavista, P., Corradi, A., Foschini, L.: Context-Aware Handoff Middleware for Transparent Service Continuity in Wireless Networks. Pervasive and Mobile Computing 3(4), 439–466 (2008) 15. Baousis, V., Kyriakakos, M., Hadjiefthymiades, S., Merakos, L.: Performance evaluation of a mobile agent-based platform for ubiquitous service provision. Pervasive and Mobile Computing 4, 755–774 (2008) 16. Kahloul, L., Chaoui, A.: Code Mobility Modeling.: A Temporal Labeled Reconfigurable Nets. In: The 1st International Conference on MOBILe Wireless MiddleWARE, Operating Systems, and Applications, Innsbruck, Austria, February 14 (2008) 17. Kahloul, L., Chaoui, A.: Coloured reconfigurable nets for code mobility modeling. In: Int. J. of Computers, Communications & Control, Proceedings of ICCCC 2008, vol. III(suppl.), pp. 358–363 (2008) ISSN 1841-9836, E-ISSN 1841-9844 18. Jensen, K.: Coloured Petri Nets. Basic Concepts, Analysis Methods and Practical Use: Basic Concepts. In: Monographs in Theoretical Computer Science, 2nd corrected printing 1997, vol. 1. Springer, Heidelberg (1997) ISBN: 3-540-60943-1 19. Xu, D., Deng, Y.: Modeling Mobile Agent Systems with High Level Petri Nets. In: IEEE International Conference on Systems, Man, and Cybernetics, vol. 5, pp. 3177–3182 (2000) 20. van Hee, K.M., Lomazova, I.A., Oanea, O., Serebrenik, A., Sidorova, N., Voorhoeve, M.: Nested Nets for Adaptive Systems. In: Donatelli, S., Thiagarajan, P.S. (eds.) ICATPN 2006. LNCS, vol. 4024, pp. 241–260. Springer, Heidelberg (2006) 21. Gilmore, S., Hillston, J., Kloul, L., Ribaudo, M.: PEPA nets: A structured performance modelling formalism. Performance Evaluation 54(2), 79–104 (2003) 22. Cervesato, I.: Petri Nets and Linear Logic: A case study for logic programming. In: The Joint Conference on Declarative Programming, Italy, September 11-14, pp. 313–318 (1995) 23. Rosa-Velardo, F.: Coding Mobile Synchronizing Petri Nets into Rewriting Logic. Electronic Notes in Theoretical Computer science 174(1), 83–98 (2007) 24. Buscemi, M.G., Sassone, V.: High-Level Petri Nets as Type Theories in the Join Calculus. In: Honsell, F., Miculan, M. (eds.) FOSSACS 2001. LNCS, vol. 2030, p. 104. Springer, Heidelberg (2001) 25. Agrawal, D.P., Zeng, Q.A.: Introduction to Wireless and Mobile Systems. Brooks/Cole, Monterey (2003) 26. Siegwart, R., Nourbakhsh, I.R.: Introduction to Autonomous Mobile Robots. Bradford Book (2004)
Using Privilege Chain for Access Control and Trustiness of Resources in Cloud Computing Jong P. Yoon and Z. Chen Mercy College 555 Broadway, Dobbs Ferry, NY 10522 {jyoon,zchen}@mercy.edu
Abstract. Cloud computing is emerging as a virtual model in support of "everything-as-a-service" (XaaS). There are numerous providers, such as feeders, owners and creators, who are unlikely to be the same actor, and multiple platforms, possibly with different security control mechanisms. Consequently, cloud resources cannot be securely managed by traditional access control models. In this paper, we propose a new security technique that enables multifactor access control and copes with various deployment models in which a user's network and system sessions may vary. Using the metadata of resources and access policies, the technique builds privilege chains. The contribution of this paper includes a mechanism of privilege chains that can be used to verify the trustiness of cloud resources and to protect the resources from unauthorized access. Keywords: Cloud Computing, Access Control, Privilege Chains, System-Context Information, JPEG Metadata.
1 Introduction Cloud computing models consist of subjects and objects, the objects that can be created by or provided for the subjects. Subjects, as actor, can be a service provider (SP) or a service user (SU), where SPs provide objects to a cloud and SUs request objects from a cloud. The services provided by SPs can be everything, from the infrastructure, platform or software resources. Each such service is respectively called Infrastructure as a Service (IaaS), Platform as a Service (PaaS), or Software as a Service (SaaS). For example, Google Apps Engine (http://www.google.com/apps) or Microsoft Azure platform (http://www.microsoft.com/windowsazure/) is a PaaS, while Google Docs (http://docs.google.com) is a SaaS, and DropBox (http://www.dropbox.com) is an IaaS. We know that SPs provide SUs with resources such as a JPEG image file in {I|P|S}aaS. There are numerous SPs such as feeders, owners and creators. As illustrated in Figure 1, a resource is created by a creator, who may then grant the ownership to a new owner (1 in Figure 1). A resource owner may delegate the feedership to a cloud server and further to a resource feeder (234 in Figure 1). While a resource is available in a cloud server, a user may request for a usership of a resource from a server (5 in Figure 1). Of course, it is also possible that a resource in one virtual F. Zavoral et al. (Eds.): NDT 2010, Part I, CCIS 87, pp. 358–368, 2010. © Springer-Verlag Berlin Heidelberg 2010
machine (VM) is deployed to another VM (3 and 4 in Figure 1) within the same cloud. Numerous actors with different roles, such as creatorship, ownership, feedership and usership, are involved in handling the same resource, which is available to be accessed [4].

Fig. 1. Cloud Resource Lifecycle

In terms of the actors mentioned above, from a creator to a requester, there is a chain of privileges that the actors are granted. Such a chain of privileges is very significant information that we use for security management in cloud computing. The privilege chain can be used 1) to ensure the trustiness of resources, and 2) to protect the cloud resources from unauthorized access.
• While in clouds each user has unique access to its individual virtualized environment [17], a resource may be posted and deployed in one or more VMs, and multiple access policies for the same resource may be available in zero or more virtual machine memories (VMMs). In this practical environment, typical models [6,18] do not take into account the metadata of data, such as the owner and service provider of JPEG files, or the session information of users, such as the session identification, user certificate, or IP address. Such models are often inadequate for new types of computing environment that provision computing infrastructure.
• Resources are not trusted if there is a broken chain of privileges. By a broken privilege chain we mean that a subject is missing from the chain that runs from creator to user; it follows that such resources are associated with actors who are not trusted (a minimal check of this property is sketched below).
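As a minimal illustration of the broken-chain criterion, the following Python sketch checks that every role in the expected chain, from creator to user, is bound to some actor; any missing link marks the resource as untrusted. The role names and the data layout are assumptions of this sketch.

```python
EXPECTED_CHAIN = ["creator", "owner", "feeder", "user"]

def is_chain_intact(grants):
    """grants maps a role name to the actor holding that privilege.

    The chain is intact only if every expected role is bound to an actor."""
    return all(grants.get(role) for role in EXPECTED_CHAIN)

grants = {"creator": "Chris", "owner": "Owen", "feeder": "Steve", "user": "Uma"}
broken = {"creator": "Chris", "owner": None, "feeder": "Steve", "user": "Uma"}
print(is_chain_intact(grants))   # True  -> resource can be trusted
print(is_chain_intact(broken))   # False -> broken chain, resource untrusted
```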
Since, in support of everything-as-a-service (XaaS), there are various operating systems, such as Unix, Linux, or Windows, and software packages, such as DBMS, SAP, ERP, or CMS, cloud resources are available on one or more of those platforms. Each such platform has different mechanisms of authentication and authorization, from typical password-based or LDAP-based authentication [2] to multifactor authorization techniques and RBAC [7,18]. In this variety of cloud infrastructures, software packages and platforms, a cloud resource previously accessed on one platform cannot be accessed by the same user on another platform, and vice versa. This paper contributes to the security management of cloud resources in general and to access control models for cloud services more specifically. The contributions include:
• Multifactor access control for JPEG files, in which both the JPEG context and access policies are used.
• Chains of privileges constructed for access policies in the VMM and for JPEG files of cloud computing resources. The chain of privileges of JPEG
files can be used for trustiness and security management of cloud computing resources.

This paper is organized as follows: Section 2 describes the background knowledge and the formats of the metadata of file resources available in a cloud. Section 3 introduces the basic elements of the cloud memory, which can hold the user login information and the metadata of resources. Section 4 proposes the access control model with an extended format of access policies and a new method to verify the trustiness of cloud resources and to make authorization decisions based on the privilege chain. Section 5 concludes our work.
2 JPEG Metadata Photographic images can be compressed by the JPEG (Joint Photographic Expert Group, http://www.jpeg.org) compression algorithm. JPEG compression is used in a number of image file formats, such as JPEG/Exif (Exchangeable Image File Format) and JPEG/JFIF (JPEG File Interchange Format), and widely sued for storing and transmitting images on the Internet. JPEG files contain metadata [3,9,15], which consists of the data contained in marker segments in a JPEG file. The image metadata object, in the marker segments between the SOI (Start of Image, #FFD8 in hex) marker and the EOI (End of Image) marker for the image, contains information about make and model of digital camera, time and date the picture was taken, distance the camera was focused at, location information (GPS) where the picture was taken, small preview image (thumbnail) of the picture, firmware version, serial numbers, name and version of the image manipulation program, name of the owner, etc. Such metadata can be added by a user, software, or a digital camera. Some of the software that can edit in the metadata segment includes EXIFcare, EXIF Writer, EXIF Tool, MetaDataMiner, etc. For example, in Figure 2 illustrates metadata information displayed in EXIF Viewer, where there is the column for “Owner Name,” which is of our interest. It is also possible that any attributes can be edited by using software. For example, using EXIF Tool command, exiftool –P –overwrite_oritinal –creator=’Chris’ wedding1.jpg
(1)
the original file creation date, ownership, permissions, type, creator, icon and resource fork can be overwritten. For security reasons, this overwriting can be performed only once (write once, read many). However, the addition of new tags is always possible. Therefore, if the file wedding1.jpg is granted to a feeder, say Steve, then the following command is used:
exiftool -P -overwrite_original -feeder='Steve' wedding1.jpg
(2)
Consequently, the file wedding1.jpg has metadata containing the following:
Creator: Chris
DateTimeCreated: 2009:06:28 10:03:42
Owner: Owen
DateTimeOwned: 2009:06:30 13:12:32
Feeder: Steve
DateTimeDeployed1: 2010:01:28 09:10:11
As illustrated above, the metadata contains a set of (attribute, value) pairs. The picture in the file wedding1.jpg was taken by Chris on June 28th, 2009, owned by Owen on June 30th, then granted to the feeder, Steve, on January 28th, 2010, and thereafter the picture is up and available for user access. Each such agent is an important factor that this paper uses for access control in cloud computing. Note that the metadata can be obtained and held in a VMM.
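As a rough sketch of how such (attribute, value) pairs could be pulled out of a JPEG file (not the authors' implementation), the following Python snippet delegates to the ExifTool command-line program used in the commands above; the custom Feeder tag is the hypothetical example tag from this section, and the file name is assumed.

# Sketch: read JPEG metadata as (attribute, value) pairs via exiftool's JSON output.
# Assumes exiftool is installed; "Feeder" is the custom example tag from this section.
import json
import subprocess

def read_jpeg_metadata(path):
    """Return the JPEG metadata of `path` as a dict of (attribute, value) pairs."""
    # -j asks exiftool for JSON output; it returns a list with one dict per file.
    out = subprocess.run(["exiftool", "-j", path],
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)[0]

if __name__ == "__main__":
    meta = read_jpeg_metadata("wedding1.jpg")   # hypothetical file from the example
    for attr in ("Creator", "Owner", "Feeder"):
        print(attr, "->", meta.get(attr, "<missing>"))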
Fig. 2. Metadata of JPEG
3 System-Context Information in Cloud Memories
In a cloud services architecture, the service provider (SP) implements the service logic and presents it to clients over the Internet (cloud). The service logic itself is typically composed of multiple components. The SP uses some virtualization abstraction, e.g., a virtual machine, for service deployment. Several of these VMs, belonging to various independent SPs, can then be deployed on the infrastructure. We make the simplifying assumption that an SP deploys all resources on a single cloud infrastructure. An example is shown in Figure 1, where the SP provides a VM for customers or service users (SUs). An SU may access the VM from another PC or possibly from a dumb terminal. The SP adds value by allowing remote access to the infrastructure and possibly by providing centralized management. Several architectures with various infrastructures are possible [5,16]; however, we consider a cloud with the memory components of VMs [12,13,19]. As illustrated in Figure 3, we consider the memory for cloud computing in a two-tiered architecture. Each SP provides its Virtual Machine Memory (VMM), which may serve one or more available cloud resources to SUs. Each such VMM is addressed and monitored for namespace bookkeeping by the Cloud Global Memory (CGM) of the cloud infrastructure, as shown in Figure 3. This paper describes the VMM and CGM and their relationships with respect to the authorization of user accesses to JPEG images.

Fig. 3. Virtual Machine Memories and Cloud Global Memory
As an SP places a JPEG file in a VM for service, the metadata of the JPEG file that is related to the authorization process is loaded into the CGM. Therefore, whichever VM of a cloud environment makes a JPEG file available, since there is only one CGM, the CGM can hold the authorization-related metadata of the JPEG file, while all other information is loaded into a VMM. (Note that JPEG metadata can be extracted from files by, for example, the com.drew.metadata.exif Java package.) In the same manner, as an SU signs in to a VM, the authorization-related information is loaded into the CGM, while all other user and session information is loaded into a VMM. The authorization-related information depends on the authorization methods and algorithms. However, the common data needed for authorization includes the user's current session information and the JPEG's user role information. For example, the user's current session information includes the session id as logged by the SU, the IP number and host name from which the SU is logged in, the resource name requested by the SU, the current VMM id, the network protocol, etc. The JPEG's user role information includes the names recorded for creatorship, ownership and feedership, together with the time stamps of such privilege granting or delegation dates. The (variable, value) pairs are associated with the user environment or the file environment. Such pairs are available in the system context of the VMM and/or CGM. As a user logs in to a cloud server, the login, network and session information can be held by the system context of the cloud and made available in the CGM. For example, the session name "John" is associated with the user environment "userEnv", and the information can be retrieved by the CGM context calls
cgm_context('userEnv', 'session_name')
cgm_context('userEnv', 'network_user')
(3)
and "John" can be returned upon request. As a resource is posted in a VM, the metadata of the resource is held by the system context of the VM and made available in a VMM. Similarly, for a JPEG file, the VMM context calls
vmm_context('fileEnv', 'creator')
vmm_context('fileEnv', 'feeder')
(4)
bind with the creator "Chris" and the feeder "Steve", respectively.
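The context calls in (3) and (4) can be thought of as lookups into per-environment (variable, value) tables. The following Python sketch is our own illustration of that idea, not the paper's implementation.

# Illustration only: model cgm_context / vmm_context as reads from per-environment
# (variable, value) tables held by the CGM and by each VMM.
class ContextStore:
    def __init__(self):
        self._envs = {}                          # environment name -> {variable: value}

    def put(self, env, variable, value):
        self._envs.setdefault(env, {})[variable] = value

    def context(self, env, variable):
        return self._envs.get(env, {}).get(variable)

# The CGM holds authorization-related user/session data.
cgm = ContextStore()
cgm.put("userEnv", "session_name", "John")
cgm.put("userEnv", "network_user", "John")

# A VMM holds the file's metadata extracted when the resource was posted.
vmm = ContextStore()
vmm.put("fileEnv", "creator", "Chris")
vmm.put("fileEnv", "feeder", "Steve")

print(cgm.context("userEnv", "session_name"))    # -> John, as in (3)
print(vmm.context("fileEnv", "feeder"))          # -> Steve, as in (4)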
4 Access Control Models
Why is the CGM needed in a cloud computing environment when multiple VMMs are available? The reason is not the provisioning of insufficient memory space, but rather to avoid the propagation of subjects and privileges. As Figure 3 suggests, with VMMs but no CGM, the metadata about JPEG files and user information would have to be propagated from one VMM (where the user is logged in or a JPEG file is provided) to another VMM, where the user accesses the JPEG file. Such propagation may allow the metadata to be modified or tampered with. Having introduced the CGM in cloud computing, we describe the access control model, the delegation model of the deployment methods provided in a cloud computing environment, and authorization policies.
Recall the SP (Service Provider) and SU (Service User) introduced in Section 1. In general, a resource is created by a creator, owned by an owner (who may be the same as the creator), granted its serviceability to another party by the owner, and finally placed for service in a cloud. These are all SPs. A cloud resource is then used by an SU who accesses the cloud. Both SPs and SUs interact through a VMM. An SP posts a cloud resource in a VMM. As a cloud resource is posted, the metadata of the resource is extracted in the VMM and copied to the CGM. When an SU requests a cloud resource from a VMM, the credential of the SU is extracted and held by the system context of the CGM, as illustrated in Figure 3. Here, we assume that access policies are available in the CGM. With the metadata of a resource and the system-context information about users, an appropriate access policy will be enforced.

4.1 Policy Management
The policy manager creates and manages the policies that can be used to make access decisions. Typical access control policies are defined over three elements, (subject, object, signed action), meaning that subject is allowed to perform action on object. Depending on the sign of the action, subject is permitted to perform the action if the sign is plus, or denied otherwise. We denote the format of such policies as (s, o, ±a), where s, o and ±a respectively are subject, object and signed action [7,8,11]. The signed action specified in typical access control policies is a privilege that can be applied to an object. We call this type of privilege an object privilege. In addition, we propose to use another type of privilege, which can be applied to the system. We call this type a system privilege. Examples of system privileges are "grant" and "admin." Putting these together, policies are specified over four elements: (s, o, ±a, m)
(5)
where m denotes a system privilege: "grant" means that the subject s is permitted to grant the privileges (the same object and system privileges) further to other subjects, while "none" implies no further granting but only the given object privilege ±a. For example, the policy ("jyoon", "wedding1.jpg", +r, "grant") implies that a request from jyoon is permitted to read the wedding1.jpg file and also permitted to grant the privileges to other subjects. On the other hand, the policy ("@cysecure.org", "wedding1.jpg", +r, "none") means that requests from the host "@cysecure.org" have no privilege of further granting to others but are permitted to read the file only. We now consider the delegation mechanism in access control. The delegation mechanism has been used to support decentralized administration of access policies [1,10]. It allows an authority (delegator) to delegate all or part of its own authority, or someone else's authority, to another user (delegatee) without any need to modify the root policy. In this context, there are two types of subjects (or agents) of actions: the subject as an actor and as a target. An actor subject (sa) permits a target subject (st) to perform an action (a) on an object (o), which is the same context as delegation. We modify the policy format in (5) to the following:
(sa, st, o, ±a, m)
(6)
where
• sa denotes the delegator (or actor subject). The creator (the first agent) of a file is typically sa. This actor subject may appear in the metadata of JPEG files.
• st denotes the delegatee (or target subject). This target subject may appear in the metadata of JPEG files. The end user of a file, i.e., the user who requests access, is typically st.
• o denotes the direct object of the action, e.g., a JPEG file.
• ±a denotes a signed action, such as "read", "write", "download", etc., that sa and st may request.
• m denotes the system privilege mode, as in (5).
For example, the policy ("Steve", "jyoon", "wedding1.jpg", +r, "grant") implies that Steve grants jyoon the right to read wedding1.jpg along with the grant system privilege.

4.2 Privilege Chains
Recall the metadata discussed in Section 2, such as creator, owner, feeder, requester, and user. From the creation of a resource (for example a JPEG file) to its service in a cloud, one or more actor subjects (SPs) are involved. In the case of the resource wedding1.jpg, creator Chris delegates (sells) it to owner Owen. Along the sequence of SPs in the metadata of a cloud resource, there is a chain from the creator to the feeder. We call such a chain a "privilege chain for metadata", denoted by PCm. A chain of privileges can be constructed as a (linked) list. Each node of PCm consists of a (variable, value) pair, as discussed in (3) and (4) of Section 3. A node of the linked list represents an SP, i.e., any actor from creator to feeder. The variables of PCm follow creator → owner → feeder. The head of the linked list is usually the creator of a resource, and the tail node is the feeder to the cloud. In the example of the JPEG file above, the head node is Chris and the tail is Steve, and the privilege chain is
PCm: Chris → Owen → Steve
(7)
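As an illustration only (the paper does not prescribe a concrete data structure), the extended policy of (6) and the metadata chain PCm can be modeled as a simple record and an ordered list; the Python sketch below uses the wedding1.jpg roles from Section 2.

# Illustration only: the extended policy (sa, st, o, ±a, m) as a named record,
# and PCm built from the roles recorded in the JPEG metadata.
from collections import namedtuple

Policy = namedtuple("Policy", ["sa", "st", "obj", "action", "mode"])

ROLE_ORDER = ["Creator", "Owner", "Feeder"]     # creator -> owner -> feeder

def build_pcm(metadata):
    """Build the privilege chain for metadata (PCm) as an ordered list of subjects."""
    return [metadata[role] for role in ROLE_ORDER if role in metadata]

meta = {"Creator": "Chris", "Owner": "Owen", "Feeder": "Steve"}   # wedding1.jpg example
print(" -> ".join(build_pcm(meta)))    # Chris -> Owen -> Steve, i.e. PCm of (7)

p = Policy("Steve", "jyoon", "wedding1.jpg", "+r", "grant")
print(p)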
For privilege chains for metadata, we define some specific nodes:
Definition 1. (Creator Head and Feeder Tail) In a PCm, the head node is called the creator head node if the node has a creator (name) in it. The tail node is called the feeder tail node if the node has a feeder.
Corollary 1. (Complete or Broken PCm) A PCm is complete if it contains both the creator head node and the feeder tail node. Otherwise, it is broken.
The PCm in Figure 4(a), (b) and (c) is complete, while those in 4(d), (e), (f) and (g) are broken. Privilege chains are broken when one or more nodes are missing between the creator head node and the feeder tail node. Similarly, consider access policies. An actor subject (sa) in an access policy, (sa, st, o, ±a, m), is represented in a head node, while the target subject (st) is in a tail node. That is,
sa → st. Furthermore, consider two access policies pi and pj, (sai, sti, oi, ±ai, mi) and (saj, stj, oj, ±aj, mj), respectively. If saj = sti, then the node of pi is linked to the node of pj. That is, the privilege chain is sai → sti (or saj) → stj, where the node of pi is the head node and the node of pj is the tail node. We call such a chain a "privilege chain for access policy", denoted by PCp. For example, consider the following access policies:
("Chris", "Owen", "wedding1.jpg", +{r,w,x}, "grant")
(8)
(“Owen”, “Steve”, “wedding1.jpg”, +{r,w,x}, “grant”)
(9)
(“Steve”, “John”, “wedding1.jpg”, +r, “none”)
(10)
The privilege chain for the above policies is:
PCp: Chris → Owen → Steve → John
(11)
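A minimal sketch (our illustration, not the authors' code) of how PCp can be built by linking policies whenever saj = sti, starting from a given head subject:

# Sketch: chain access policies into PCp by following sa -> st links.
# Assumes, for simplicity, at most one outgoing policy per subject in this example.
def build_pcp(policies, head):
    """policies: list of (sa, st, obj, action, mode) tuples; returns the subject chain."""
    chain = [head]
    current = head
    while True:
        nxt = next((p for p in policies if p[0] == current), None)
        if nxt is None:
            break
        chain.append(nxt[1])      # follow sa -> st
        current = nxt[1]
    return chain

policies = [
    ("Chris", "Owen",  "wedding1.jpg", "+rwx", "grant"),   # (8)
    ("Owen",  "Steve", "wedding1.jpg", "+rwx", "grant"),   # (9)
    ("Steve", "John",  "wedding1.jpg", "+r",   "none"),    # (10)
]
print(" -> ".join(build_pcp(policies, "Chris")))   # Chris -> Owen -> Steve -> John, as in (11)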
For PCm and PCp, the following is defined:
Definition 2. (Feeder Head and User Tail) The head node of a PCp is called the feeder head node if it is the same as the feeder node of a PCm. The tail node of a PCp is called the user tail node if it is a user (or SU).
Corollary 2. (Complete, Old-Granter or Unknown-Granter PCp) A PCp is complete if it contains both the feeder head node and the user tail node. A PCp is an old-granter chain if it contains the user tail node but no feeder head node. A PCp is an unknown-granter chain if it contains the user tail node but the head node is unknown.
For example, the PCp in Figure 4(a), (b) and (f) are complete since they have both the feeder head node and the user tail node. Figure 4(c) and (d) illustrate PCp of the old-granter type, because they have the user tail node but the head node is other than the feeder head node. By old-granter, we mean that the head node is not a feeder but an actor that appears earlier than the feeder in PCm. Figure 4(e) shows a PCp of the unknown-granter type. By unknown-granter, we mean a head node that is not found in any privilege chain for metadata, PCm. The difference between an old-granter chain and an unknown-granter chain is that both have the user tail node, but the latter has a head node which is not found in PCm, while the former has a head node which is available in PCm. Using both PCm and PCp, the following subsections discuss how to verify the trustiness of cloud resources and how to determine authorization.

4.3 Trustiness of Cloud Resources
As described in Section 4.2, a privilege chain for metadata can be complete, broken or missing. A privilege chain for access policies can be complete, old-granter or unknown-granter. With this in mind, we define the trustiness of cloud resources as follows:
Definition 3. (Traceability of Cloud Resources) A resource is traceable if either of the following holds: 1) PCm is not broken, or 2) PCp has a node that is identical to some node (node value) of PCm.
For example, the resources in Figure 4(a), (b) and (c) are traceable because their PCm are not broken. The resources in Figure 4(d) and (f) are traceable because the PCp of (d) has a head node identical to the value of the node (Owner: Owen), and the PCp of (f) has Steve, who is also in PCm.
Definition 4. (Trustiness of Cloud Resources) If a resource is traceable, then it is trusted with respect to PCm and PCp.
For example, the resources in Figure 4(a), (b), (c), (d) and (f) are trusted since they are traceable, while (e) and (g) are not.

4.4 Authorization Decision
With both PCm and PCp available, having (1) ensured the trustiness of resources in Section 4.3, we now want to (2) make the authorization decision. We observe six example cases in which authorization can be determined more efficiently.
• Case (a) in Figure 4 has both subject chains. Since both chains are available (we do not discuss how they are obtained in this paper), the subjects listed in one chain can be cross-checked against those in the other. This is a cross-validation. The two linked lists are back-tracked as far as they are identical. If they are not identical, the two nodes at which they diverge are the points for further legitimacy verification.
• Case (b) illustrates a JPEG file that has a full list of agents but an inadequate list of access policies in the cloud. In this case, the authorization decision is made as stated in the access policy that is available in the VMM.
• Case (c) is similar to (b), except that an access policy grants permission to st by an sa which is not the same as the tail node of the chain for JPEG metadata. If there exists any node in the privileged subject chain of the metadata that is identical to the sa of the access policy, then the authorization decision should be further negotiated [14] to see whether the policy can be used.
• Case (d) illustrates an access policy that is available in the VMM but with no appropriate privileged subject in the metadata. This means that the JPEG file in the VM is not trusted, as discussed in Section 4.3. In this case, the cloud server should confirm the quality, legitimacy and trustiness levels of the resource.
• Case (e) is similar to (c), except that there is no adequate chain of privileged subjects in the JPEG metadata, although an access policy is available in the VMM. This means there is no historical record of delegation or granting, only the current serviceability. In this case, additional negotiation may be needed to decide whether the available access policy can be used for the authorization decision or whether the owner of the JPEG needs to be identified.
• Case (f) of Figure 4 illustrates both untrusted services and resources. Therefore, the request is denied.
As discussed above, using both chains of privileged subjects, from JPEG files and from the VMM, supports not only authorization decisions but also control of the legitimacy and quality of resources. If a resource is posted in one or more VMs, and multiple access policies for the same resource are available in zero or more VMMs, then conflicts among the policies may exist. Such conflicts can be resolved in the CGM, but the details are not discussed in this paper due to the size limit.
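A minimal sketch of the traceability test of Definition 3, under our reading of it (not the authors' code); PCm and PCp are plain lists of subject names, and the creator and feeder names are assumed to be known from the metadata:

# Sketch of Definition 3: a resource is traceable if PCm is complete (creator head,
# feeder tail) or if PCp shares at least one subject with PCm.
def is_complete_pcm(pcm, creator, feeder):
    return bool(pcm) and pcm[0] == creator and pcm[-1] == feeder

def is_traceable(pcm, pcp, creator, feeder):
    return is_complete_pcm(pcm, creator, feeder) or bool(set(pcp) & set(pcm))

pcm = ["Chris", "Owen", "Steve"]
pcp = ["Steve", "John"]
print(is_traceable(pcm, pcp, creator="Chris", feeder="Steve"))   # True -> trusted (Def. 4)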
Fig. 4. Privilege Chains
5 Conclusion
To address the growing concern over security issues in cloud computing, we have investigated a new approach to metadata- and privilege-based access control. By using JPEG metadata and system-context user information, we gain numerous benefits, including two-factor access control for authorization decisions and maintenance of cloud resources for trust services. Although this paper takes JPEG files as the example for discussing metadata, the concept of using JPEG metadata can be extended to any type of resource in cloud computing, as long as such resources have a header that can contain metadata.
References
[1] Ben Ghorbel-Talbi, M., Cuppens, F., Cuppens-Boulahia, N., Bouhoula, A.: Managing Delegation in Access Control Models. IEEE ADCOM (2007)
[2] Blezard, D., Marceau, J.: One user, one password: integrating Unix accounts and active directory. In: ACM Conf. on SIGUCCS (2002)
[3] Cudre-Mauroux, P., Budura, A., Hauswirth, M., Aberer, K.: PicShark: mitigating metadata scarcity through large-scale P2P collaboration. VLDB Journal 17 (2008)
[4] Security Guidance for Critical Areas of Focus in Cloud Computing, v.2.1, Cloud Security Alliance (2009), http://www.cloudsecurityalliance.org/guidance/csaguide.v2.1.pdf
[5] Christodorescu, M., Sailer, R., Schales, D.L., Sgandurra, D., Zamboni, D.: Cloud security is not (just) virtualization security. In: ACM Cloud Computing Security Workshop (2009)
[6] Damiani, M., Martin, H., Saygin, Y., Spada, M., Ulmer, C.: Spatio-temporal Access Control: Challenges and Applications. In: ACM SACMAT (2009)
[7] Ferraiolo, D., Atluri, V.: A meta model for access control: why is it needed and is it even possible to achieve? In: ACM SACMAT (2008)
[8] Ferraiolo, D., Kuhn, D., Sandhu, R.: RBAC Standard rationale: comments on a critique of the ANSI standard on Role-Based Access Control. IEEE Security & Privacy 5 (2007)
[9] Haslhofer, B., Klas, W.: A survey of techniques for achieving metadata interoperability. ACM Computing Surveys 42 (2010)
[10] Joshi, J., Bertino, E.: Fine-grained role-based delegation in presence of the hybrid role hierarchy. In: ACM SACMAT (2006)
[11] Kulkarni, D., Tripathi, A.: Context-aware role-based access control in pervasive computing systems. In: ACM SACMAT (2008)
[12] Hao, F., Lakshman, T., Mukherjee, S., Song, H.: Enhancing dynamic cloud-based services using network virtualization. ACM SIGCOMM Computer Communication Review 40 (2010)
[13] Lenk, A., Klems, M., Nimis, J., Tai, S., Sandholm, T.: What's Inside the Cloud? An Architectural Map of the Cloud Landscape. In: IEEE Conf. on CLOUD (2009)
[14] Lee, A., Winslett, M., Basney, J., Welch, V.: Traust: a trust negotiation-based authorization service for open systems. In: ACM SACMAT (2006)
[15] Pereira, F.: MPEG multimedia standards: evolution and future developments. In: ACM MULTIMEDIA (2007)
[16] Raj, H., Nathuji, R., Singh, A., England, P.: Resource management for isolation enhanced Cloud services. In: ACM CCSW (2009)
[17] Vaquero, L., Rodero-Merino, L., Caceres, J., Lindner, M.: A break in the clouds: towards a cloud definition. ACM SIGCOMM Computer Communication Review 39(1) (2008)
[18] Wang, H., Osborn, S.: Discretionary access control with the administrative role graph model. In: ACM SACMAT (2007)
[19] Zeng, W., Zhao, Y., Ou, K., Song, W.: Research on Cloud storage architecture and key technologies. In: ICIS (2009)
Modeling of Trust to Provide Users Assisted Secure Actions in Online Communities

Lenuta Alboaie¹ and Mircea-F. Vaida²

¹ Alexandru Ioan Cuza University of Iasi, Romania, Faculty of Computer Science, Berthelot, 16, Iasi, Romania
[email protected]
² Technical University of Cluj-Napoca, Communication Department, Gh. Baritiu Street, 26-28, Cluj-Napoca, Romania
[email protected]
Abstract. Nowadays, the main problem is not the lack of high quality resources, but their retrieval, organization and maintenance. This is a challenge for users in making the best and right decisions. The paper presents a model of trust able to be used in various online communities. The model provides each user of the community with a personalized secure context to manage significant resources. The model therefore assures every user that his preferences are important and, according to them, the system furnishes resources efficiently.
Keywords: Social trust, local trust metric, online community, secure recommendations.
1 Introduction
The World Wide Web hosts many online communities that store a great amount of data, which is continuously increasing. Anyone can publish any kind of resource: a diary published within a blog, a track that a user wants to make public, etc. In a context in which the information-overload phenomenon has a major impact, this paper comes up with a solution to improve the performance of the resource management found within online communities. The discussion within this paper is based on the concept of social trust. The definition of trust accepted by the great majority of authors is presented in [1]: "Trust is the subjective probability by which an individual, A, expects that another individual, B, performs a given action on which its welfare depends." In the social web, this action is an evaluation, an opinion that someone expresses regarding someone else. Trust and reputation systems are very useful in communities in which users have to interact with other users about whom they have no previous information. In this case a trust and reputation system ensures that users' experience (resulting from previous interactions) is used as the basis for establishing user-user and user-resource evaluation levels. In the literature there are two categories of algorithms for calculating trust: local and global (also called local trust metrics and global trust metrics, respectively), presented in detail in [2]. At this moment, most approaches in the literature are based on the development of algorithms for obtaining a user's global importance, and there are very few approaches to the development of local trust metrics [3-6]. In [7] an initial general model for calculating local trust was proposed. In this paper we show how our new model, based on new relations for computing user-user and user-resource ratings, allows users to perform efficient actions.
2 The Engine Ratings of Our Trust System
The purpose of the proposed model is to build a flexible way to manage resources in a personalized manner and to filter out resources that are irrelevant to users. In this way, a user who is a member of a community based on our model (called StarWorth) will dynamically see the information that he/she is most interested in. Meanwhile, the user can choose to interact with users whose interests are similar to his. Our system is designed to be used in an online community; therefore, we briefly introduce the vocabulary used:
• Users – members of an online community.
• Resources – defined according to the definition given by (T. Berners-Lee, 1998).
• Worth – this parameter is a measure of a trust statement that a user associates with another user. It signifies a rating given by one user to another user. The worth can also be obtained (quantified) indirectly, as we will see in the following paragraphs.
In our system we have five evaluation levels. Users can give ratings to other users from these intervals.

Table 1. Levels of evaluation and their significance: (0, 1] means low trust; (4, 5] high trust

Level1 (0,1]   Level2 (1,2]   Level3 (2,3]   Level4 (3,4]   Level5 (4,5]
We denote the upper limit by MaxWorth, where MaxWorth = 5 in our experiments. We consider a set of constructions with the following associated semantics. Mathematically, these constructions can be considered functions or, from the implementation point of view, associative tables:
• Explicit worth of a user: WE_UU(user, user) – the explicit worth, representing the rating for a user, given manually by one user to another user.
• Implicit (deduced) worth of a user: WI_UU(user, user) – measures how close the preferences of the two users are. (A preference can be, for instance, the degree of acceptance of a point of view or the degree of appreciation of a piece of art.)
• We consider the function WU(user, user) for each pair (user, user) from a Web community:
WU(U_x, U_y) = \begin{cases} WE\_UU(U_x, U_y), & \text{if } U_x \text{ evaluates } U_y \text{ explicitly} \\ WI\_UU(U_x, U_y), & \text{otherwise} \end{cases}    (1)
We now define the manner of computation of the implicit values introduced above. Let us consider two users U_i, U_j from the Web community. The value of WI_UU(U_i, U_j) indicates the deduced worth based on the explicit evaluations made by users about each other. Let the users for whom we have ratings from user U_i be {U_i^1, ..., U_i^k}. Also, we consider that there are explicit ratings from U_i^l to U_j, with l ≤ k, so that WE_UU(U_i^l, U_j) is defined (see Fig. 1).
Fig. 1. Implicit user-user evaluation computation
In order to compute the WI_UU values we must first compute the weight corresponding to the explicit ratings. We denote this weight by PE(U_i^l, U_j); it represents an explicit rating weight (in our case, the weight derived from the rating that U_i gave to U_i^l), computed as:

PE(U_i^l, U_j) = \frac{WU(U_i, U_i^l)}{MaxWorth}    (2)
We compute the implicit rating that user U_i provides to U_j as:

WI\_UU(U_i, U_j) = \frac{1}{k} \sum_{l=1}^{k} PE(U_i^l, U_j) \cdot WE(U_i^l, U_j)    (3)

where 1 ≤ l ≤ k and k is the number of users that were explicitly evaluated by U_i.
From (2) and (3) we obtain the final implicit reputation computing formula:

WI\_UU(U_i, U_j) = \frac{\sum_{l=1}^{k} WU(U_i, U_i^l) \cdot WE(U_i^l, U_j)}{k \cdot MaxWorth}    (4)
We have WU(U_i, U_i^l) = WE_UU(U_i, U_i^l) if there exists an explicit evaluation from U_i to U_i^l; otherwise we consider an implicit evaluation from U_i to U_i^l. In order to present our local trust metric we consider the following constructions:
- sourceUser – the user for whom the vision over the community is calculated;
- WU – contains the evaluations from the trust network of the community at a given time;
- WE – contains the explicit evaluations from WU;
- sinkUsers – the users that received evaluations from sourceUser.

Input: sourceUser, WE, WU
Output: WU, sinkUsers

Step 1. Add to sinkUsers all nodes accessible from sourceUser
Step 2. do savedWU = currentTrustNetwork;
Step 3. For each U in sinkUsers
Step 4. Find {U_i^1, ..., U_i^k} satisfying the following conditions:
        - there is an edge between sourceUser and each U_i^k in WU
        - there is an edge between each U_i^k and U in WE
Step 5. Calculate the implicit trust value between sourceUser and U using:
WU(sourceUser, U) = \frac{\sum_{l=1}^{k} savedWU(sourceUser, U_i^l) \cdot WE(U_i^l, U)}{k \cdot MaxWorth}    (5)
        /* Update or insert an edge between sourceUser and U, with the capacity computed at Step 5, in WU */
while (savedWU != WU)

Algorithm: StarWorth – local trust metric pseudocode

The stop condition (savedWU != WU) is materialized in the implementation through the choice of an ε value, such that the two matrices savedWU and WU are considered different if there exist indices i and j such that (WU[i, j] − savedWU[i, j]) > ε.
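The following Python sketch illustrates the pseudocode above; it is our own reading, not the authors' implementation. The sink-user selection is simplified, and MaxWorth and ε follow the definitions earlier in this section.

# Rough sketch of the StarWorth local trust metric. WU and WE are n x n matrices
# of ratings; a value of 0 means "no edge". Step 1 is simplified to "all other users".
import copy

MAX_WORTH = 5.0

def starworth(source, WE, WU, epsilon=1e-3):
    n = len(WU)
    sink_users = [u for u in range(n) if u != source]     # Step 1 (simplified)
    while True:
        saved = copy.deepcopy(WU)                         # Step 2
        for U in sink_users:                              # Step 3
            # Step 4: intermediaries rated by source (in WU) that explicitly rated U (in WE)
            inter = [v for v in range(n) if saved[source][v] > 0 and WE[v][U] > 0]
            if not inter:
                continue
            k = len(inter)
            # Step 5: formula (5)
            WU[source][U] = sum(saved[source][v] * WE[v][U] for v in inter) / (k * MAX_WORTH)
        if all(abs(WU[i][j] - saved[i][j]) <= epsilon      # stop condition
               for i in range(n) for j in range(n)):
            return WU, sink_users

# Tiny example: user 0 explicitly rates user 1, who explicitly rates user 2.
WE = [[0, 5, 0], [0, 0, 4], [0, 0, 0]]
WU = [row[:] for row in WE]
WU, _ = starworth(0, WE, WU)
print(round(WU[0][2], 2))    # implicit trust of user 0 in user 2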
User-resource evaluation mechanism
In an online community there are users, and they manage (upload, delete, modify) a wide range of resources. In the previous section we presented how trust relations can be computed between users. In this subsection we consider a mechanism that allows users to have a personal vision of resources depending on the evaluations of others. We consider the following constructions:
- WE_UR(user_i, resource_j) – represents the explicit evaluation given by user_i to resource_j;
- WI_UR(user_i, resource_j) – represents the implicit evaluation value, computed by the system, that user_i associates with resource_j.
Let us consider the function WR(user, resource) for every pair (user, resource):

WR(U_x, R_y) = \begin{cases} WE\_UR(U_x, R_y), & \text{if } U_x \text{ evaluates resource } R_y \text{ explicitly} \\ WI\_UR(U_x, R_y), & \text{otherwise} \end{cases}    (6)
Using reasoning similar to that for the computation of the WU values, we consider:

WI\_UR(U_i, R_j) = \frac{\sum_{l=1}^{k} WU(U_i, U_i^l) \cdot WE\_UR(U_i^l, R_j)}{k \cdot MaxWorth}    (7)
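Formula (7) can be computed directly from the WU matrix and the explicit user-resource ratings; the sketch below is our illustration, not the authors' code.

# Sketch of formula (7). WU is the user-user trust matrix; WE_UR[v][r] is the
# explicit rating user v gave to resource r (0 means no rating).
MAX_WORTH = 5.0

def implicit_resource_rating(i, r, WU, WE_UR):
    """WI_UR(U_i, R_j): trust-weighted combination of explicit ratings of resource r."""
    raters = [v for v in range(len(WU)) if WU[i][v] > 0 and WE_UR[v][r] > 0]
    if not raters:
        return 0.0
    k = len(raters)
    return sum(WU[i][v] * WE_UR[v][r] for v in raters) / (k * MAX_WORTH)

WU = [[0, 5, 4], [0, 0, 0], [0, 0, 0]]            # user 0 trusts users 1 and 2
WE_UR = [[0], [4], [2]]                            # their explicit ratings of resource 0
print(implicit_resource_rating(0, 0, WU, WE_UR))   # (5*4 + 4*2) / (2*5) = 2.8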
Using this mechanism we can ensure that users have a safe environment available, in which they interact with similar users and with resources appropriate to their preferences. For a better understanding of how we designed, implemented and tested our StarWorth system, we present an experiment in the next section. More experiments can be found in [8,9].
3 Example of Using the StarWorth System

3.1 The Simulator
To test the local trust algorithm we need communities on which to perform the testing. In the specialized literature, proposed local trust metrics have been tested on data from Epinions in [2], or, as in [10], the authors created their own test data. In this work we chose to analyze and generate data that is useful for drawing conclusions about the system, without introducing factors which would not be necessary or could even impede a correct analysis. The test data is generated using a generator (called DataTestGenerator) which can be customized to generate different categories of explicit evaluations that mirror the evaluations found within online communities. We call it a generator because it generates data representing possible evaluations from online communities. These data can be generated depending on a series of parameters which allow us to observe different phenomena. Therefore we have the following parameters:
• goodUsers: with this parameter we can approximate how many users are well regarded in the community. In other words, the parameter allows us to set the number of users who receive a majority of good evaluations.
• goodResources: this parameter allows us to approximate the number of resources appreciated in the community.
• minMarksCount: for a better observation of the dynamics of trust spreading, the users of the community need to evaluate each other. This parameter sets the minimum number of evaluations made by an entity.
• goodMarksThreshold: represents the threshold which separates the good evaluations from the bad ones.
• userVotesDensity: a percentage which represents the density of the community's network. In other words, this parameter represents how many explicit evaluations are made in the system.
• resourceVoteDensity: this parameter approximates, out of the number of possible evaluations, the percentage of evaluations to be given to the resources.
• goodObjectMaxDivergence: allows setting the percentage representing how many bad marks can be awarded to an object marked as good. So, if a resource has 100 associated evaluations, then the ratio goodObjectMaxDivergence × 100 / 100 represents the maximum number of bad marks. This factor makes clear what "majority" means in the explanation of goodUsers and goodResources.
This modeling of the generator captures a real phenomenon, namely the existence of non-active agents in the system. We consider non-active agents to be those who made no evaluations or too few, and our system indirectly eliminates them from the analysis. For greater relevance of the test data we introduced the notion of the interest of a user (the domain of resources). Therefore we introduced the following parameters:
• domainsCount – the number of domains
• maxDomainsPerUser – the maximum number of domains of a user
• maxDomainsPerResource – the maximum number of domains for a resource
• marksDomainDensity – represents the percentage of marks given to a user or a resource within a common domain.
The generator ensures that at least marksDomainDensity of the marks given by a user are given in the user's domain of interest. In this way, and depending on how the densities are chosen, a better connection of users with similar domains of interest is assured, a phenomenon that takes place in the real world. Besides these, we also have the compulsory parameters:
• agentsNumber – the number of entities in the community.
• resourcesNumber – the number of resources in the system.
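As an illustration only, the generator parameters listed above can be gathered into a configuration record; the default values below are invented for the sake of the example (only some of them mirror the configuration file shown in Section 3.2).

# Sketch: a configuration record for the DataTestGenerator parameters listed above.
# Default values are invented for illustration.
from dataclasses import dataclass

@dataclass
class GeneratorConfig:
    agents_number: int = 10                  # entities in the community (compulsory)
    resources_number: int = 15               # resources in the system (compulsory)
    good_users: int = 5                      # users receiving mostly good evaluations
    good_resources: int = 6                  # resources appreciated in the community
    min_marks_count: int = 1                 # minimum evaluations per entity
    good_marks_threshold: float = 3.5        # separates good from bad evaluations
    user_votes_density: float = 14.0         # % of possible user-user evaluations
    resource_vote_density: float = 15.0      # % of possible user-resource evaluations
    good_object_max_divergence: float = 5.0  # max % of bad marks for a "good" object
    domains_count: int = 1
    max_domains_per_user: int = 1
    max_domains_per_resource: int = 1
    marks_domain_density: float = 50.0       # % of marks given within a common domain

config = GeneratorConfig()
print(config.agents_number, config.resources_number)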
3.2 Tests
The StarWorth generator accepts a wide variety of input parameters. These parameters ensure a high degree of generality and allow our system to simulate various online communities with different profiles. We consider the following configuration file:

DataTest.10U15R 0
<maxValue> 5
3.5
<entities goodObjectMaxDivergence="5">
<minMarksCount> 1
<arrangements>
<domains number="1">
<maxDomainsPerUser> 1
<maxDomainsPerResource> 1
<minMarksCountInCommonDomain> 1
10 12
<userVotesDensity> 14.0
15.0

Using the configuration file, the generator produced these explicit evaluations:
User-User explicit evaluations 5.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 5.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 1.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00
•
User-Resource explicit evaluations 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 4.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 5.00 0.00 0.00 0.00 0.00 0.00 2.00
Using our trust propagation mechanism we obtain the following: • The trust network represented as a matrix that contain explicit Users-User generated evaluation and implicit User-User computed evaluations 5.00 1.08 4.00 0.00 0.00 0.00 0.00 0.00 1.66 0.00 0.00 0.00 0.99 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 2.25 0.00 0.00 0.00 5.00 0.00 0.00 0.00 1.90 5.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 1.36 0.00 0.00 0.00 5.00 0.00 5.00 0.00 0.00 0.00 0.00 1.46 0.00 0.00 0.00 2.75 0.00 0.00 0.00 0.81 0.00 0.00 5.00 2.25 0.00 0.00 0.90 0.00 5.00 0.84 0.86 0.00 0.00 0.00 0.91 0.00 0.00 0.00 5.00 0.00 0.00 1.18 0.00 0.00 2.00 0.91 0.00 0.00 0.00 1.16 0.00 0.00 0.00 0.00 5.00 0.00 1.90 0.00 0.00 0.00 4.00 0.00 0.00 2.25 0.84 1.29 0.00 0.00 0.77 0.00 5.00 0.99 2.75 0.76 0.61 0.88 5.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 2.30 0.00 0.00 0.00 0.79 0.00 0.00 0.00 0.85 0.00 0.00 0.85 5.00 1.00 0.63 0.86 0.00 0.00 0.00 0.84 0.00 0.00 0.00 4.00 0.00 0.00 0.96 0.00 5.00 1.10 0.87 0.00 0.00 0.00 1.90 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 5.00 1.36 0.00 0.00 0.00 1.90 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 5.00 0.00 0.00 4.00 0.88 1.90 0.00 0.00 0.85 0.00 0.00 1.10 5.00 1.00 0.63 0.90 5.00 0.00 0.83 0.81 0.79 0.00 0.00 0.70 0.00 1.00 0.89 0.93 0.56 0.60 0.86 0.95 5.00
• The matrix that contain the explicit generated ratings for resources and implicit computed ratings for resources 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.50 0.00 0.00 0.00 5.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 4.00 4.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.30 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 1.90 0.00 0.00 0.00 1.90 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 3.30 0.00 0.00 0.00 3.30 3.30 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 5.00 0.00 0.00 1.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.30 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.80 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 0.00 3.30 0.00 0.00 0.00 4.00 3.30 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.30 4.00 0.00 0.00 0.00 4.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 1.00 0.00 0.80 0.00 1.00 0.00 0.00 0.00 4.10 0.00 0.00 0.00 0.00 4.00 4.00 0.00 0.00 1.40 0.00 0.00 0.00 5.00 5.00 0.00 0.00 0.00 0.00 0.00 2.00
From this matrix we extract the hierarchy of relevant resources for each user.

Table 2. The vision on resources of every member of the community
User Identification    Resources sorted by relevance
User Id=1     20  16
User Id=2     4   8
User Id=3     3   16
User Id=4     11  16
User Id=5     14
User Id=6     18  2   6
User Id=7     14  4   8   9
User Id=8     8   9   12  16
User Id=9     3   11
User Id=10    14  12
User Id=11    18  14
User Id=12    2   6   3
User Id=13    4   8   9   3
User Id=14    8   20  14  12  16
User Id=15    13  14  5   6   9   9   11  20
Thus the system ensures that the user will not have to interact with resources that are irrelevant to him, and his actions in the community can be safer and more efficient. The user also has his own image of the users in the system, and it is his choice whether or not to interact with certain users. For example, we can consider a community where some user B uploads resources infected by viruses. Even if a user A did not explicitly evaluate B, with the trust propagation mechanism (which uses the experiences of other users that have already interacted with B and which were evaluated explicitly or implicitly by A) the system will suggest that user B is probably malicious, and A can act accordingly.
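A sketch (our illustration, not the authors' code) of how the per-user ranking of Table 2 could be derived from a user's row of WR values:

# Sketch: derive a user's ranked resource list from the WR matrix of (6)/(7).
def ranked_resources(user, WR, threshold=0.0):
    """Return resource ids sorted by decreasing WR value for the given user."""
    ratings = [(r, val) for r, val in enumerate(WR[user], start=1) if val > threshold]
    return [r for r, _ in sorted(ratings, key=lambda rv: rv[1], reverse=True)]

WR = [
    [0.0, 2.5, 0.0, 5.0, 4.0],   # user 1's ratings of resources 1..5 (toy values)
    [4.0, 0.0, 3.3, 0.0, 0.0],   # user 2
]
print(ranked_resources(0, WR))   # -> [4, 5, 2]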
Fig. 2. The graphs represent an overview of the community of 15 users presented above. A: the graph containing the explicit ratings; B: the graph containing both explicit and implicit ratings; the continuous arcs represent the explicit ratings and the dashed arcs the implicitly computed ratings
4 The Results of Using the StarWorth System
In this section we present a set of consequences resulting from how the system was modeled.
• The system ensures that a user will see resources prioritized in a manner similar to the users who resemble him.
• Users who add spam resources will see more spam, because the system groups users according to their preferences.
• Resources relevant to a user (even new ones) are visible at the top of the list of resources.
• Users are encouraged to make proper evaluations.
The situation on eBay, where users give mostly positive ratings out of fear of possible revenge, will not occur here. In systems based on the StarWorth metric there are no good or bad ratings; there are only ratings that are interesting or uninteresting from the point of view of the users. The rating given by a user pursues the user's own goal, namely to quickly access the resources important to him. The previous results are supported by experiments carried out with the StarWorth system. As we have seen, trust is a concept that plays an increasingly important role in online communities. With a trust metric, trust can be propagated in the community. This is an advantage that can be used by a recommendation system.
Intuitively, if we have a new user U evaluating many resources, those evaluations cannot be propagated through standard mechanisms [2, 11]. But if user U has evaluated a set of users, then, using a trust metric, trust propagation can be done. As a result, through an implicit mechanism, the system can associate a greater number of resources with user U and the recommendation mechanism becomes more efficient. In this section we analyze and exemplify, through one use case, the benefits brought by our trust metric within a recommendation system. Most recommendation systems use the PC (Pearson Correlation) mechanism [12], which allows a correlation between users to be established. After that, a likely user-resource rating is computed. Using these values the system makes recommendations. In our study we perform the following analysis:
Input: a number of explicit user-resource ratings for each user, and a number of explicit user-user ratings for each user;
Output:
• Using the standard PC algorithm we obtain, for every user, the number of accessible users and accessible resources;
• Using a modified implemented version of the PC algorithm, which also uses user-user evaluations, we obtain, for every user, the number of accessible users and accessible resources;
• Using the StarWorth local trust metric we obtain, for every user, the number of accessible users and accessible resources.
Use Case: We consider a community with 10 users and 200 resources. The explicit data evaluations were generated with the tool presented in Section 3, which allows the configuration of many parameters in order to obtain online communities with diverse profiles.

Table 3. The results obtained from the PC algorithm and the StarWorth local trust metric
User ID                          1    2    3    4    5    6    7    8    9    10
User-User ratings no.            2    3    3    2    2    2    3    2    2    3
User-Resource ratings no.        9    6    9    8    6    9    8    6    17   20
PC - accessible users no.        1    2    1    1    1    1    1    1    2    3
PC modified - acc. users no.     1    3    2    1    1    1    3    2    2    6
PC - accessible resources no.    9    24   9    8    6    9    8    6    34   37
PC modified - acc. res. no.      9    31   28   8    6    9    31   25   34   55
StarWorth - acc. users no.       9    9    9    9    9    9    9    9    9    9
StarWorth - acc. resources no.   25   31   39   18   23   36   34   31   32   42
Observation: In the PC algorithm we have considered a threshold representing the minimal number of resources that two users must both have evaluated; this threshold is necessary for computing the PC coefficient. If this threshold is not satisfied, we take 0 as the value of the PC coefficient; 0 means that no relation can be established between those users.
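A brief sketch of the thresholded Pearson correlation coefficient described in the observation above (our reading, not the authors' implementation):

# Sketch: Pearson correlation between two users over co-rated resources, with a
# minimum-overlap threshold. Ratings are dicts {resource_id: rating}.
from math import sqrt

def pc_coefficient(ratings_a, ratings_b, min_common=2):
    common = set(ratings_a) & set(ratings_b)
    if len(common) < min_common:          # threshold not satisfied -> no relation
        return 0.0
    xs = [ratings_a[r] for r in common]
    ys = [ratings_b[r] for r in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs)) * sqrt(sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0

a = {1: 5.0, 2: 3.0, 3: 4.0}
b = {1: 4.0, 2: 2.0, 4: 5.0}
print(round(pc_coefficient(a, b), 2))     # correlation over the co-rated resources 1 and 2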
Analyzing the results obtained for the community consisting of 10 users and 200 resources, we get the following graphical representations:
Fig. 3. Comparison between the number of accessible users using the following algorithms: PC standard, PC using user-user evaluations and StarWorth trust metric
Observation: In Fig. 3, the Ox axis represents the user IDs and the Oy axis represents the number of accessible users.
Fig. 4. Comparison between the numbers of accessible resources
Observation: In Fig. 4, the Ox axis represents the user IDs and the Oy axis represents the number of accessible resources. The following results show the relationship between the number of given ratings and the numbers of accessible resources and accessible users. The results below were obtained by running the algorithms on 100 instances of online communities. The Ox axes represent the number of explicitly given ratings.
Fig. 5. Correlation between the number of user-resource evaluations and the number of accessible users
Fig. 6. Correlation between the number of user-resource evaluations and the number of accessible resources
As a conclusion of this study, we note that by taking user-user evaluations into account, the system ensures a faster integration of new users. Moreover, if they entered the community on the invitation of older members of the community (e.g., Twine), then this invitation can be considered an explicit evaluation between users. Using the StarWorth trust metric, the system is capable of recommending resources from the first moment. All our experiments have shown that our system brings benefits to online communities by providing users assisted secure actions.
5 Conclusions
The paper presents a model of trust able to help users of online communities interact with appropriate users and resources. In this way, good decisions and fewer time-consuming actions concerning resource management can be achieved.
We have shown that, using our proposed trust metric, the system is capable of providing users with a secure context to manage significant resources. Through its design, our system can be integrated into different online communities such as education, e-health [13] and social networks. As a future research direction we will study the behaviour of the model in real communities such as the medical and educational domains.
Acknowledgements This work was supported by CNCSIS – UEFISCSU, project number PNII-IDEI 1083/2007-2010.
References [1] Gambetta, D.: Can We Trust Trust? In: Gambetta, D. (ed.) Trust: Making and Breaking Cooperative Relations, pp. 213–238. Basil Blackwell, Oxford (1990) [2] Josang, A., Ismail, R., Boyd, C.: A survey of trust and reputation systems for online service provision. Decision Support Systems 43(2), 618–644 (2007) [3] Massa, P., Avesani, P.: Controversial Users demand Local Trust Metrics: an Experimental Study on Epinions.com Community. In: 25th AAAI (American Association for Artificial Intelligence) Conference (2005) [4] Golbeck, J. (ed.): Computing with Social Trust, vol. 338, p. 50 illus. Springer Publisher, Heidelberg (2009) [5] Ziegler, C.: Towards Decentralized Recommender Systems. PhD thesis, Albert-LudwigsUniversity at Freiburg, Germany (June 2005) [6] Wu, Z., Yu, X., Sun, J.: An Improved Trust Metric for Trust-Aware Recommender Systems. In: 2009 First International Workshop on Education Technology and Computer Science, vol. 1, pp. 947–951 (2009) [7] Alboaie, L.: PReS – Personalized evaluation System în a WEB Community. In: ICE-B 2008: Proceedings of the Int. Conference on E-Business, pp. 64–69 (2008) [8] Breaban, M., Alboaie, L., Luchian, H.: Guiding Users within Trust Networks Using Swarm Algorithms. In: Proceedings of the Eleventh Conference on Congress on Evolutionary Computation, Trondheim, Norway, pp. 1770–1777 (2009) [9] Alboaie, L.: Studies on trust modeling and computation of reputation in online communities. Ph.D. thesis, Al. I. Cuza University, Iasi (March 2009) [10] Ziegler, C., Lausen, G.: Analyzing correlation between trust and user similarity in online communities. In: Proceedings of the Second International Conference on Trust Management (2004) [11] Belkin, N.J., Croft, W.B.: Information filtering and information retrieval: two sides of the same coin? Communications of the ACM, Special issue on information filtering 35(12), 29–38 (1992) [12] Rodgers, J.L., Nicewander, W.A.: Thirteen ways to look at the correlation coefficient. The American Statistician 42, 59–66 (1988) [13] Alboaie, L., Buraga, S.: Trust and Reputation in e-Health Systems. In: International Conference on Advancements of Medicine and Health Care through Technology, ClujNapoca, Romania. IFMBE Proceedings, vol. 26, pp. 43–46. Springer, Heidelberg (2009)
A Collaborative Social Decision Model for Digital Content Credibility Improvement

Yuan-Chu Hwang

Department of Information Management, National United University,
1, Lien-Da, Miao-Li, 36003, Taiwan, R.O.C.
[email protected]
Abstract. This paper proposes the concept of harvesting social network intelligence as a high quality information source for decision-making. A collaborative social decision model (CSDM) is proposed for improving digital content credibility in leisure e-services. This study investigates people's original experiences and the feedback from their social network relationships. These personal perceptions and the latest feedback information from the social network are utilized for exploring high quality digital content. The unique information from the social network can help users shape the digital content quality and bring alternative information sources to the leisure e-service system. In contrast to traditional leisure service applications, the collaborative social decision model can improve digital content credibility and facilitate leisure e-service innovation.
Keywords: Digital content credibility, social network, experience co-creation.
1 Introduction
Advanced information technology has connected the world more tightly than ever before. Many applications apply the Web 2.0 concept and provide people with a new channel to present their ideas and share information around the world. User-generated content (UGC) is one of the significant features that bring huge amounts of digital content. However, the convenience of sharing digital content also leaves people troubled by massive digital content, especially unreliable content. Malicious digital content and advertising messages make people uncomfortable. At the current stage, the leisure industry has utilized advanced information science and technology to create new e-services for leisure activity participants. However, current leisure e-services mainly focus on the service itself, emphasizing only potency and benefit increase instead of considering the possible connections of people and their essential needs. The digital content credibility in leisure e-service systems is controversial. In order to improve the content quality in leisure e-services, it is necessary to find a way to prevent low quality digital content in leisure e-services. In this paper, we explore the social network and create a new opportunity for improving digital content credibility in leisure e-service systems. From the very basic needs of humanity, the leisure industry should focus on the essential requirements of people. What are the actual needs of humans in leisure e-services? Credible and trustworthy digital content in a leisure e-service system can fulfill users' basic needs.
However, perceptions may vary for different digital content; they are mostly influenced by the personal experiences people have encountered. In the age of the experience economy, integrating information technology and users' perceptions can help us obtain useful information for decision-making. Users' social networks can become a new information channel that helps others to evaluate a great quantity of digital content. From the viewpoint of harvesting social network intelligence, this study investigates people's original experiences and the feedback information from their social network relationships. In contrast to traditional leisure service applications, the social network provides abundant personal experience from our social relationships, which brings leisure e-services into a brand-new era with comparatively credible digital content.
2 Social Network Intelligence
We have already integrated the network into our real lives, utilizing networks for daily activities including knowledge acquisition, information transmission, interpersonal relationship development, and participation in virtual activities and events. The fusion of Internet cyberspace and real life is becoming the future tendency of our lives. Information from different sources in cyberspace may provide critical and essential input for decision-making. Through the collaboration of cross-domain, cross-discipline, and cross-culture user groups, the joint outcome can form creative and unprecedented new ideas. Collaboration is a recursive process where two or more people or organizations work together toward an intersection of common goals [2]. Based on Surowiecki's study [4], the fundamental elements of collective intelligence include cognitive diversity, independence, and decentralization. Collaboration does not require leadership and can sometimes bring better results through decentralization and egalitarianism [1]. Collaboration with people from one's social network relationships is a typical kind of innovation process. According to Lerman's study [5], the core elements of social websites include: (1) users create or contribute content, (2) users annotate content with tags, (3) users evaluate content, and (4) users create social networks by designating other users with similar interests as friends or contacts. Social networks can provide information that cannot be obtained through traditional channels. Extensive information exchange can also be carried out thanks to social network relationships. However, if the collaboration is performed with unfamiliar user groups, the information service has to face barriers such as trust and information quality. The lack of quality information sources and the massive amount of unrelated information (information smoke) make it difficult to retrieve useful information for decision-making from the e-service environment. The difficulty arises not only because cyberspace is filled with malicious users but also because of the potential risk of the gathered information. This paper proposes the concept of harvesting social network intelligence as a high quality information source for decision-making. A collaborative social decision model (CSDM) is proposed for improving digital content credibility in leisure e-services. The remaining sections are organized as follows. In Section 3, the collaborative social decision model and its computational design for leisure e-services are presented. Finally, conclusions and directions for future work are presented in Section 4.
3 The Collaborative Social Decision Model for Leisure e-Services
It is difficult to cooperate with a complete stranger in daily life. The concept of proximity e-services proposed by Hwang and Yuan [3] builds on homophily, which describes the tendency of individuals to associate and bond with similar others; homophily has been found in numerous network studies. By highlighting the homophily of e-service participants, otherwise isolated individuals can be treated as a user group with proximity (that is, common goals, similar interests, etc.). Loosely coupled e-service participants can thus be empowered to form groups or clusters with weak ties. Proximity thus enables e-service participants to contribute their strengths to improving digital content credibility. Our study builds on the proximity concept and provides an innovative mechanism for leisure e-service participants to share their knowledge, based on their perception of digital content, in a CSDM-based leisure e-service. The perceptions and feelings associated with leisure activities, such as tourism information or cultural artifacts, become the knowledge to be shared. The homophily of similar interests can become one of the incentives for leisure e-service participants to break the barrier and join the e-service, and the homophily of geographic proximity (e.g., visiting the same place or taking a similar tour trip) makes users feel comfortable and close to each other. These feelings lead users to join and contribute to the leisure e-service instead of just becoming free riders who only retrieve useful information from the leisure e-service platform. Moreover, utilizing social network relationships can form a huge information source matrix that preserves a massive amount of useful personal experience about leisure information. The social network then becomes a powerful resource for enhancing the information quality of leisure e-services. We introduce the collaborative social decision model (CSDM) for leisure e-services in the next subsection.

3.1 Framework of CSDM for Leisure e-Services
The social network becomes a powerful resource for enhancing the information quality, and its credibility, of leisure e-services. The CSDM (collaborative social decision model) for leisure e-services is shown in Figure 1 and is constructed from several components. The major module, labeled the "Collaborative Social Network Intelligence" module, comprises five sub-components. The digital content of the leisure information for each tour spot is preserved in a database. We elaborate the components as follows.

Shared knowledge database. The shared knowledge database stores each separate evaluation from different individuals. Its purpose is to collect the personal experience and knowledge that every e-service participant contributes to the e-service platform.

Digital content evaluation module. In order to obtain high-quality leisure information, it is important to consider quality evaluation as well as information utilization. Our study provides an evaluation module for users to evaluate the digital content information of a tour spot. The evaluation includes four categories, which are further explained in the design of the "Leisure Information of Tour Spots" database.
Fig. 1. Collaborative Social Decision Model for Leisure e-Service
Users can provide their personal evaluation of a tour spot in each category. The provided information is stored in the shared knowledge database. Each evaluation result is attached to its original evaluator; that is, when social network relationships are considered as a parameter, users can retrieve specific users' evaluation results for a target tour spot for decision-making.

Social network relationships database. The social network relationships database stores every e-service participant's social network relationships. The record sets include each identity's connections as well as the associated trustworthiness values. The trustworthiness value lets users determine how much weight the evaluations from each relationship should carry; since we do not treat all friends equally, their evaluations should have different weights.

Social intelligence retriever module. In contrast to the digital content evaluation module, harvesting the intelligence within the social network has to consider two major aspects: the subject that users want to evaluate and the user's social network relationships. For every tour spot there is a record data set that stores all leisure e-service participants' evaluation results, including the four category evaluations of the tour spot. Once the social intelligence retriever module is triggered, the leisure e-service platform computes the summarized results for the target tour spot according to the social network relationship parameters.

Personal decision parameters. Every leisure e-service participant's decision parameters are managed in this sub-component. Users may assign different weights to their various information sources; a user can make decisions from multiple sources, including information from friends or the summarized global information. For every information source category, users can manage detailed weight adjustment settings. All the parameters and weight values are stored in this sub-component.

The above sub-components comprise the "Collaborative Social Network Intelligence" module, which is directly linked to the Digital Content & Leisure Information database.
Digital Content & Leisure Information database. The external leisure environment does not influence users' decisions directly; their preferences do. A user's preferences are the most direct influence on the entire decision. The factors that influence leisure activity participation may differ for every individual user, including the convenience of traveling, the time required, the distance to the target tour spot, the cost the user can afford, the interest in traditional culture, the user's social status and mental condition, as well as the nature of the tour spot itself. However, some of these factors cannot be appropriately evaluated. In this paper, four preference factors were identified for decision-making: the fee of the tour spot, its convenience, the history preference, and the characteristics of the tour spot. These four preference factors are considered in the leisure e-service platform.

3.2 Computational Designs for CSDM
In order to improve digital content credibility and facilitate decision-making for leisure e-service participants, several formulas are proposed in the CSDM for leisure e-services.

Digital content evaluation. An evaluation cannot be submitted without authorization; this design prevents malicious users from discrediting the overall evaluation results. A user must first log in to the leisure e-service platform before the evaluation function appears. The computation is based on the average value, which accumulates all available evaluation results for a specific target tour spot. The updated value of each tour spot in the four categories can be calculated from the following formula:
V_x = (V_x · T_x + NV_x) / (T_x + 1),   x ∈ {fee, conv, hist, char}    (1)
V_x denotes the average evaluation result of category x, T_x the total number of evaluations in that category, and NV_x the newly submitted evaluation, where x ranges over the four categories: fee, convenience, history, and characteristics. The more evaluation results there are, the more information sources the leisure e-service platform has collected from the social network intelligence, which makes the estimated credibility more accurate.

Social intelligence accumulation. In order to harvest the social network intelligence for decision-making, the social network parameters should be included in the formulation:

π_all_friend = Σ_{i=1}^{n} π_friend_i    (2)

where π_all_friend denotes the sum of the trustworthiness values of all of the user's friends (i.e., over their social network relationships).
Ω_x = ω_friend × Σ_{j=1}^{n} [ (π_friend_j / π_all_friend) · F_x,j ] + ω_system × V_x    (3)

where Ω_x denotes the updated value of each tour spot in category x, ω_friend denotes the weight of the trustworthiness derived from social network relationships, ω_system denotes the weight of the trustworthiness of the globally accumulated value from the system, and F_x,j denotes the evaluation result in category x from the user's friend j.

Score_preference = (λ_fee × Ω_fee + λ_conv × Ω_conv + λ_hist × Ω_hist + λ_char × Ω_char) / (λ_fee + λ_conv + λ_hist + λ_char)    (4)

where λ_x denotes the weight of category x. Score_preference represents the final score according to the user's personal decision model. To facilitate the decision-making process, a recommended digital content list is provided to leisure e-service participants according to their preference settings.
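To make the computational design concrete, the following minimal Python sketch implements formulas (1)-(4). It is our own illustration under stated assumptions, not the authors' implementation: the dictionary-based data structures, function names and example values are assumptions made for the example.

```python
# Minimal sketch of the CSDM computations (Eqs. 1-4); data structures are illustrative.
CATEGORIES = ("fee", "conv", "hist", "char")

def update_average(avg, count, new_value):
    """Eq. (1): incrementally update a category average with a new evaluation."""
    return (avg * count + new_value) / (count + 1), count + 1

def aggregate(friend_evals, trust, global_avg, w_friend, w_system):
    """Eqs. (2)-(3): combine friends' evaluations (weighted by trust) with the global average.

    friend_evals: {friend_id: {category: value}}
    trust:        {friend_id: trustworthiness pi_friend}
    global_avg:   {category: V_x}
    """
    pi_all = sum(trust.values())                      # Eq. (2)
    omega = {}
    for x in CATEGORIES:
        social = sum(trust[f] / pi_all * evals[x]     # trust-weighted friend opinions
                     for f, evals in friend_evals.items())
        omega[x] = w_friend * social + w_system * global_avg[x]   # Eq. (3)
    return omega

def preference_score(omega, weights):
    """Eq. (4): weighted average of the four category values."""
    total = sum(weights[x] for x in CATEGORIES)
    return sum(weights[x] * omega[x] for x in CATEGORIES) / total

# Example: one friend trusted 0.9, another 0.3, equal category weights.
friends = {"A": {"fee": 4, "conv": 5, "hist": 3, "char": 4},
           "B": {"fee": 2, "conv": 2, "hist": 2, "char": 2}}
trust = {"A": 0.9, "B": 0.3}
global_avg = {"fee": 3.5, "conv": 3.0, "hist": 3.2, "char": 3.8}
omega = aggregate(friends, trust, global_avg, w_friend=0.7, w_system=0.3)
print(preference_score(omega, {x: 1.0 for x in CATEGORIES}))
```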
4 Evaluation Design
In order to evaluate the overall performance of the collaborative social decision model, the credibility of leisure digital content is evaluated according to users' perceptions. The credibility improvement of digital content is a user perception that can be further unfolded into four categories: utility, convenience, satisfaction, and willingness to use. The user perception questionnaire contains 21 questions on a 7-point Likert scale. The ongoing evaluation compares a traditional leisure e-service with the CSDM-based leisure e-service with respect to their digital content. Preliminary results indicate that users perceive the CSDM-based digital content as more satisfactory than the traditional one.
5 Discussion
This study utilizes the collaborative social network as the source of decision information. The features of the CSDM-based service are discussed in the following subsections.

5.1 Customized Personal Service
Too much information is as useless as not enough. The amount of available information will continue to increase, and there will likely be an urgent need to organize information for personal use. The CSDM-based leisure e-service will deliver
services that customize information according to personal tastes, preferences and situations, delivering it on a timely basis. These services will not only enhance the quality and convenience of everyday life, but also create business opportunities for corporations.

5.2 Human Nature
The CSDM-based leisure e-service supports urgent needs in an ad-hoc service environment. Drawing on all the leisure e-service users who are willing to share their knowledge and contribute to the greater community truly enhances the credibility and reliability of information. The context of the user (e.g., time and place) can be measured and interpreted; credible services can be provided at the point of need; and applications can be highly interactive, portable and engaging. By complying with users' human nature, the CSDM-based e-service will encourage users to embrace various e-service applications.

5.3 Relationship Sensibility
The CSDM-based e-service enables the establishment of social network connections that generally provide valuable and reliable information exchange for decision-making. The collective wisdom embedded in CSDM-based digital content encourages users to accept each other as alternative information sources. Information heterogeneity provides rich resources for a user to estimate whether the user providing the digital content is trustworthy and reliable.

5.4 Limitations and Assumptions
For all collaborative e-services, trust is still the major barrier that keeps users from embracing social network intelligence. In this paper, we utilize the homophily of users in the CSDM-based service as the incentive that encourages users to participate in the collaborative e-service. The development of CSDM-based e-services rests on some assumptions: we assume the incentives are high enough to encourage all users to join the leisure e-service. In the collaborative scenario, ubiquitous access to a wireless communication environment is another assumption, since there are some transmission costs; however, this can be addressed by ICT technologies such as 3G and WiMAX. An advanced study of user perception of the CSDM-based leisure e-service is left to future work. In this paper, the focus is on the opportunity to harvest social network intelligence and enable leisure e-service innovation in digital content credibility improvement.
6 Conclusions
This paper proposed the concept of harvesting social network intelligence as a high-quality information source. We integrate information technology with people's psychological feelings and perceptions, as well as their social networks, to improve digital content credibility and explore opportunities for leisure e-service innovation.
A collaborative social decision model is proposed for improving digital content credibility in leisure e-services. This study investigates people's original experiences and the feedback obtained from their social network relationships. The proposed CSDM design aims to provide leisure e-service participants with an environment for knowledge sharing through their social network relationships. The homophily of similar interests can become one of the incentives for leisure e-service participants to break the barrier and join the e-service, and the homophily of proximity makes users feel comfortable and close to each other. Social networks can preserve huge and useful information sources that are worthy of further utilization. Users may improve their decision quality through the collaborative social decision model and obtain credible digital content from the leisure e-service system.
References 1. Spence, Muneera, U.: Graphic Design: Collaborative Processes - Understanding Self and Others. Oregon State University, Corvallis (April 13, 2006) 2. Wikipedia, http://en.wikipedia.org/wiki/Collaboration 3. Hwang, Y.C., Yuan, S.T.: Ubiquitous Proximity e-Service for Trust Collaboration. Internet Research 19(2), 174–193 (2009) 4. Surowiecki, J.: The Wisdom Of Crowds: Why The Many Are Smarter Than The Few And How Collective Wisdom Shapes Business, Economies, Societies And Nations, Little, Brown (2004) 5. Lerman, K.: Dynamics of Collaborative Document Rating Systems. In: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, pp. 46–55 (2007)
Improving Similarity-Based Methods for Information Propagation on Social Networks
Francesco Buccafurri and Gianluca Lax
DIMET, Università degli Studi Mediterranea di Reggio Calabria, Via Graziella, Località Feo di Vito, 89122 Reggio Calabria, Italy
[email protected], [email protected]
Abstract. Social networks are a means to propagate information deeply and quickly. However, in order not to nullify the advantage of information propagation, the amount of information received by users should not be excessive. A typical approach operates by selecting the set of contacts to which the information is propagated on the basis of some suitable similarity notion, thus trying to reach only nodes that are potentially interested. The main limit of this approach is that similarity is not completely suitable for propagation, since it is typically non-transitive, whereas propagation is an inherently transitive mechanism. In this paper we show how to improve similarity-based methods by recovering some form of transitive behaviour through a suitable notion called expectation. The non-trivial combination of similarity and expectation in a clean mathematical framework provides the user with a flexible tool able to maximize the effectiveness of information propagation on social networks.
1 Introduction
A social network is a community of users, called nodes, connected to each other by friendship, job, religion, common interests, information sharing, etc. As a member of a social network, an individual can receive useful information or suggestions from the other members of the community. The size of a social network is an index of its effectiveness: networks with few nodes produce poor information, whereas larger networks are more likely to introduce new ideas and opportunities to their members. One of the advantages of social networks is the depth and rapidity of information propagation among friends. However, the amount of disseminated information is not always proportional to its real utility for users. The growth of marketing techniques aimed at the massive spreading of products and information has made the user, who constantly receives too much data, incapable of selecting the really interesting information, thus nullifying many of the advantages potentially given by membership in a large social network. As a consequence, designing methods for supporting information propagation is a very interesting issue. The problem has been investigated in the recent literature. In [1] a mechanism which uses gossip algorithms for information dissemination on social networks is presented. In gossip algorithms each node communicates with no more than one neighbor in each time slot [4]. The problem
has been studied for both single-piece and multi-piece information spreading [25]. Interesting results regarding how information is propagated in real-life environments have been presented in [10,15,7]. The most promising research direction is designing methods for filtering the information which flows on the network, trying to direct such information only toward nodes that are potentially interested [17,16,12]. These solutions typically rely on some suitable notion of similarity [11,18]. Because of information propagation, a semantics-driven approach based on similarity between nodes should have some form of transitive behaviour. It is a matter of fact that similarity is not transitive, due to its multi-dimensional nature. A first issue to deal with, in order to improve the similarity-based approach, is therefore to recover some transitive feature in the method we use to select nodes. Another important issue is whether the semantics embedded in the information has to be used. We observe that, even though this would clearly be desirable, it is unrealistic in most cases. Indeed, in order for the information to carry sufficient semantic power, its structure, which has to be compared with the node content, should be sufficiently rich and complex. There is a rich literature about semantic extraction on social networks (see for example [2,20,3,19]). Unfortunately, this is not the case in general. In our work we consider the most general real-life case, where the propagated information is not structured in some way (for example, in XML) allowing semantic extraction. In this paper, we propose a method facing both the above issues, i.e., (1) recovering some transitive feature for the notion which the filtering mechanism relies on and (2) dealing with unstructured propagated information, thus improving the similarity-based approach to propagating information on social networks in the most general real-life case. Our proposal originates from a previous work presented in the context of information retrieval in P2P systems [5], where the possibility of receiving user feedback intrinsically returned by the query answer was profitably exploited in the framework. In contrast, in the case of information propagation on social networks, there is no chance to get feedback from users if we want a non-invasive system. The method proposed here is based on detecting, for each node involved in the information propagation, the most interested nodes to which the information has to be forwarded. In order to select such nodes, our method exploits, instead of just local similarities (i.e., similarities between nodes), a neighborhood-based semantic property, called expectation, embedding similarities and presenting some form of transitive behaviour. With the term neighborhood we mean that what we use is not simply the similarity of a node w.r.t. another node, but also the closeness of the latter w.r.t. the community of nodes sufficiently similar to the former. In order to reach the above goal, we introduce in this paper a suitable notion of similarity, assuming that, as usual in the context of social networks, no meta-data describing user profiles are available. This is in contrast with a characteristic on which the notion of similarity adopted in [5] relies.
The paper is organized as follows. In the next section we define the notion of similarity between two nodes. In Section 3 we introduce the notion of expectation and define the mathematical framework allowing us to represent semantic closeness among nodes; moreover, an example of how the method works is reported. In Section 4 we describe the results of the experiments conducted in order to validate the proposed method. Finally, in Section 5 we draw our conclusions.
2 Node Similarity
Our approach is based on the notion of expectation of a node w.r.t. another node, and expectation is computed starting from the similarity between nodes. For this reason we have to define how to measure node similarity. It is worth noting that, since expectation is parametric w.r.t. similarity, different similarity notions could be exploited without changes in the method for the computation of the expectation. In this section we consider the problem of defining a suitable notion of similarity between two nodes. When a similarity notion S has to be designed, the following question should be considered: does S have to be a fuzzy similarity relation [29]? That is, does it have to be reflexive (i.e., S_ii = 1), symmetric (i.e., S_ij = S_ji) and transitive (i.e., S_ij ≥ S_ik ⊕ S_kj, where ⊕ is any triangular norm (T-norm), such as the minimum function)? While renouncing the first two properties would be unfounded, the last property is often missing (under plausible T-norms such as the minimum) whenever similarity is computed on the basis of possibly independent dimensions used for representing the content of nodes. In words, it is acceptable that we want to capture the case: node A is very similar to node B on the basis of some dimensions belonging to their descriptions, B is very similar to node C, but A is not similar to C, since the dimensions on which the similarity between B and C is based are different from those supporting the closeness between A and B. Given two sets of elements, A and B, we can define several notions of similarity [26] between the two sets (for example, Pointwise Mutual Information [9] or Salton [24]). The most used notions exploit the L1 and L2 norms. We can evaluate the similarity by the L1-norm [13] as the overlap between the two sets divided by the product of their sizes, L1(A,B) = |A ∩ B| / (|A| · |B|). Similarly, we can measure the similarity by the L2-norm as L2(A,B) = |A ∩ B| / √(|A| · |B|).
As observed earlier, any (more complex) notion of similarity can be used in our framework with no change to its scheme. For example, our similarity notion could be extended by computing the similarity between terms which are not necessarily lexically similar, following the same approach used in [23], where semantic similarities are computed by mapping terms to an ontology and by examining their relationships in that ontology. Approaches exploiting Self-Organizing Maps [14,21] could also be used for a topology-preserving mapping from a high-dimensional input space (typically, text messages) to a two-dimensional output space, deriving the similarity measure from the distance in the output space.
The similarity measure we adopt in our system is based on the L1-norm, since it is proportional to the number of common interests of the two nodes. We observe that our choice of exploiting the simplest notion of similarity arises from two considerations. First, the main aspect of our proposal regards the use of the expectation property; therefore, we have chosen to introduce expectation in a context as simple as possible. The second advantage arising from our choice is making our approach applicable to real-life social networks, where nodes are basically described in the same way as in our model, with no complex structure. In a very realistic fashion, we model the information maintained in a node of a social network as a profile, which is a set of the IDs of its friend nodes (contacts), RSS (Rich Site Summary or Really Simple Syndication [28]) subscriptions, blogs, products, services, celebrities, etc. In our system the similarity between two nodes is defined as follows. Given two nodes A and B with profiles P_A and P_B respectively, the similarity S_AB between A and B is defined as S_AB = |P_A ∩ P_B| / min(|P_A|, |P_B|), where the symbols ∩ and |·| denote intersection and cardinality, respectively. The numerator computes the number of elements contained in the profiles of both nodes. The denominator normalizes the result so that the similarity value lies in the range [0, 1]. In the following example we show how the similarity is computed and also that our notion of similarity (but, we guess, every reasonable similarity) is not transitive under the T-norm minimum. Consider three nodes A, B, and C having profiles P_A = {a_1, ..., a_5, b}, P_B = {a_1, ..., a_5, c_1, ..., c_5}, and P_C = {c_1, ..., c_5, d}, respectively, where a_i, b, c_i, d with 1 ≤ i ≤ 5 are the IDs of the elements of the node profiles. We have that S_AB = 5/6 ≈ 0.83, S_BC ≈ 0.83, and S_AC = 0, since there is no intersection between the profiles of A and C. This is a clear example where the similarity notion is not transitive.
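As an illustration, the following minimal Python sketch (our own code, not the authors') computes the profile similarity defined above and reproduces the three-node example, showing the non-transitive behaviour.

```python
def similarity(profile_a, profile_b):
    """S_AB = |P_A ∩ P_B| / min(|P_A|, |P_B|), with values in [0, 1]."""
    if not profile_a or not profile_b:
        return 0.0
    return len(profile_a & profile_b) / min(len(profile_a), len(profile_b))

# The example from the text: A and B share five items, B and C share five items,
# but A and C share nothing.
P_A = {"a1", "a2", "a3", "a4", "a5", "b"}
P_B = {"a1", "a2", "a3", "a4", "a5", "c1", "c2", "c3", "c4", "c5"}
P_C = {"c1", "c2", "c3", "c4", "c5", "d"}

print(similarity(P_A, P_B))  # 0.83...
print(similarity(P_B, P_C))  # 0.83...
print(similarity(P_A, P_C))  # 0.0 -> similarity is not transitive
```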
3 Beyond the Similarity: The Expectation Notion
The aim of this section is to (recursively) define a new property, called the expectation of a given node A w.r.t. another node B, which takes into account not only the similarity between A and B, but also the expectations of all other nodes C that are sufficiently similar to B. Expectation, like similarity, is represented as a fuzzy coefficient lying in the range [0, 1]. In words, expectation, from the point of view of a node A, captures something like: if all nodes C, towards which my expectation is high, are highly similar to a node B, then my expectation w.r.t. B has to increase, possibly compensating for a low similarity of B w.r.t. me. By using expectation, we give a sort of transitive behaviour to this semantic property, overcoming the limit of pure similarity. The propagation mechanism implemented by each node of the social network is to forward information only to contacts having an expectation greater than or equal to a given threshold, in order to concentrate the flow of information towards nodes that will hopefully be interested in it.
The informal notion of expectation given above is defined through a linear system as follows. Given a node N_0, let {N_1, N_2, ..., N_n} be the set of its contacts. The tuple ⟨E_01, E_02, ..., E_0n⟩ of the expectations of N_0 w.r.t. its contacts is the unique solution of the following linear system:

E_0j = β_0 S_0j + (1 − β_0) (1/Ŝ_j) Σ_{k=1, k≠j}^{n} E_0k S_kj    (∀ j ∈ {1, ..., n})    (1)

where Ŝ_j = Σ_{k=1, k≠j}^{n} S_kj and 0 ≤ β_0 ≤ 1 is a suitable coefficient initialized to 1.
Let us now give an informal description of how the system works. E_0j is obtained by summing two contributions. The first one (i.e., S_0j) is the similarity between N_0 and N_j; it can be considered a local component. The second one, i.e., (1/Ŝ_j) Σ_{k=1, k≠j}^{n} E_0k S_kj, makes E_0j neighborhood-dependent. It increases E_0j by a contribution proportional to the expectation of N_0 w.r.t. any node N_k different from N_j. In order to make relevant only the contributions relating to nodes sufficiently similar to N_j, the higher the similarity between N_k and N_j, the higher the coefficient weighting such a contribution (i.e., S_kj/Ŝ_j) is. In this way the overall expectation E_0j of N_0 w.r.t. N_j takes into account not only the similarity between N_0 and N_j, but also the expectations of all other nodes N_k "enough" similar to N_j. According to this mechanism, the linear system can be viewed as a sort of set of equilibrium equations for the whole system. The purpose of the coefficient β_0 of the linear system is to implement the adaptivity of the system. Indeed, what we want the system to learn while it works is the influence of similarity w.r.t. the success degree. By modifying β_0 we increase or reduce the importance of similarity in the computation of the expectation. Concerning the value of the coefficient β_0, initially set to 1, it has to be increased if useful information is received from nodes similar to N_0; otherwise, it has to be decreased. In particular, whenever N_0 receives a useful piece of information from a node N_h, the adopted updating rule is β_0^new = (β_0 + S_0h) / 2. Observe that the new value of β_0 is obtained by averaging with the previous one in order to smooth its change and to avoid an unstable behaviour of the system. It is worth noting that the above definition is well founded, since the system always admits a unique admissible solution. This is in fact stated in the next theorem, showing that, for every value of the parameters occurring in (1), there exists a unique solution of the linear system (1), which is the value assignment to the expectation coefficients satisfying (1).

Theorem 1. Given a set of real coefficients β_0, S_0j, S_jk such that β_0, S_0j, S_jk ∈ (0, 1) with j ∈ {1, ..., n} and k ∈ {1, ..., n} \ {j}, there exists a unique n-tuple of [0, 1] real values E = ⟨E_01, ..., E_0n⟩ satisfying (1).

Proof. We first prove that there exists a unique solution of the system (1). Then, we show that such a solution is a tuple of fuzzy values ranging from 0 to 1.
Existence and Uniqueness. Denoting by m_hk the value (1 − β_0) S_kh/Ŝ_h, we can write the coefficient matrix M of the system (1) as

M = [ −1    m_12  ···  m_1n
      m_21  −1    ···  m_2n
      ···   ···   ···  ···
      m_n1  m_n2  ···  −1  ]

According to Cramer's rule, we will prove that the system (1) has a unique solution since the determinant of its coefficient matrix is nonzero. We proceed by contradiction, assuming that the determinant of M is zero and thus that there exists a linear combination with nonzero coefficients h_1, ..., h_n of the columns of M producing the 0-tuple. Let h_k and h_s be the maximum and the second maximum values among h_1, ..., h_n. By considering the k-th row, we can write:

−h_k + Σ_{r=1, r≠k}^{n} m_kr h_r = 0   ⇒   h_k = Σ_{r=1, r≠k}^{n} m_kr h_r

Since h_r ≤ h_s for every r ≠ k, it follows that h_k ≤ h_s Σ_{r=1, r≠k}^{n} m_kr. Since Σ_{r=1, r≠h}^{n} m_hr < 1 for every h, it results that h_k < h_s, which is a contradiction because h_s is the second maximum. The existence and uniqueness is thus proven.

Now we prove that the solution is a tuple of values ranging from 0 to 1. First we show that the solution is a tuple of values greater than or equal to 0, i.e., E_0h ≥ 0 ∀h ∈ {1, ..., n}. By contradiction, suppose that there exists in the solution of (1) a negative expectation. We denote by E_0m the minimum expectation; then E_0m < 0. We have that:

E_0m = β_0 S_0m + (1 − β_0) (1/Ŝ_m) Σ_{k=1, k≠m}^{n} E_0k S_km < 0

Since E_0m is the minimum expectation, we have that E_0m ≥ β_0 S_0m + (1 − β_0) E_0m and thus β_0 E_0m ≥ β_0 S_0m ⇒ E_0m ≥ 0. We have thus reached a contradiction, concluding this part of the proof.

The last step is showing that the solution is a tuple of values less than or equal to 1, i.e., E_0h ≤ 1 ∀h ∈ {1, ..., n}. By contradiction, suppose that there exists in the solution at least one expectation greater than 1, and let E_0M be the expectation having the maximum value. Necessarily, E_0M > 1. Since we have already proven that E_0h ≥ 0 ∀h ∈ {1, ..., n}, from (1) we obtain:

E_0M = β_0 S_0M + (1 − β_0) (1/Ŝ_M) Σ_{k=1, k≠M}^{n} E_0k S_kM ≤ β_0 + (1 − β_0) (1/Ŝ_M) Σ_{k=1, k≠M}^{n} E_0M S_kM

and, thus, E_0M ≤ β_0 + (1 − β_0) E_0M ⇒ E_0M ≤ 1, which is a contradiction. The theorem is then proven.
Concerning the computational complexity of solving the linear system (1), we observe that this is a classic, well-studied problem, and many algorithms have been proposed to make its solution feasible even for large system dimensions. These results are of high practical interest, since large systems of linear equations occur in many applications such as finite element analysis, power system analysis, circuit simulation for VLSI CAD, and so on. In the general case, the cost of finding an exact solution is O(n^ω), coinciding with the cost of executing an n × n matrix product. The currently best known ω is 2.376 [8], while a practical bound is 2.81 [27,6]. We conclude this section by reporting an example showing that expectation has a transitive behaviour. Referring to the example presented at the end of Section 2, assume that the elements a_i in the profiles of nodes A and B regard wine (for example, they could be subscriptions to RSS feeds about wine meetings), whereas the elements c_i that are common to the profiles of B and C regard food. We can guess that A is interested in wine, C in food, and B in both wine and food (for example, B could be a sommelier). Moreover, assume that only the pairs (A,B) and (A,C) are contacts (i.e., B and C cannot communicate with each other). In an information propagation driven by similarity between nodes, A forwards the information only to B (not to C, since S_AC = 0). In its turn, B should send this information to C (since S_BC is very high), but this does not occur because B and C are not contacts. What we want to highlight with this example is that, even though C has the potential to be reached by the information (through A), and even though the similarity-based approach would drive the information towards C following the path A-B-C, due to the non-transitive behaviour of similarity, C does not receive this information. Using an expectation-driven propagation, the expectation of A w.r.t. C is much greater than zero, because C is very similar to B, w.r.t. which the expectation of A is high, since S_AB = 0.83. As a consequence, A propagates this information to C, provided that a suitable value for the propagation threshold is fixed. Concerning this aspect, we remark that with a similarity-based approach C cannot receive the information for any non-zero value of the threshold t.
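As a concrete illustration, the following sketch (our own, assuming the similarity notion of Section 2 and NumPy for the linear algebra) builds and solves the linear system (1) for a node and its contacts, applies the threshold-based forwarding rule, and shows the adaptive update of β_0. The example values are arbitrary.

```python
import numpy as np

def expectations(s0, S, beta0=1.0):
    """Solve system (1): E_0j = beta0*S_0j + (1-beta0)*(1/S_hat_j)*sum_{k!=j} E_0k*S_kj.

    s0: length-n array of similarities S_0j between N_0 and its contacts.
    S:  n x n symmetric matrix of similarities S_kj among the contacts.
    """
    n = len(s0)
    S = np.asarray(S, dtype=float)
    s_hat = np.array([S[:, j].sum() - S[j, j] for j in range(n)])  # S_hat_j
    A = -np.eye(n)                                # move E_0j to the left-hand side
    for j in range(n):
        for k in range(n):
            if k != j and s_hat[j] > 0:
                A[j, k] += (1.0 - beta0) * S[k, j] / s_hat[j]
    return np.linalg.solve(A, -beta0 * np.asarray(s0, dtype=float))

def forward_targets(E, threshold):
    """Forward the information only to contacts whose expectation reaches the threshold."""
    return [j for j, e in enumerate(E) if e >= threshold]

def update_beta(beta0, s0h):
    """Adaptive rule: beta0_new = (beta0 + S_0h)/2 after useful feedback from N_h."""
    return (beta0 + s0h) / 2.0

# Example with three contacts: the first two are very similar to each other,
# so the expectation towards the second rises even though S_02 = 0.
s0 = [0.83, 0.0, 0.4]                  # S_01, S_02, S_03
S = [[0.0, 0.83, 0.1],
     [0.83, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
E = expectations(s0, S, beta0=0.7)
print(E, forward_targets(E, threshold=0.3))
```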
4 Experiments
In this section we describe the results of a number of experiments performed to validate our approach.

4.1 Data Set
We generated a social network with 10,000 nodes and randomly created a graph of connections between nodes in such a way that the network follows a power law [30], in which the probability that the degree of a node (i.e., the number of contacts) is k is proportional to k^(−2). Many existing social networks present this characteristic [22,15]. We created a set of interests belonging to a universe of 100 domains, representing the potential interests of nodes. The profile
of each node contains some information items concerning i domains, where i follows a Zipf distribution [30] with z = 1. Here the Zipf distribution models the fact that most users are interested in only a few domains.
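A minimal sketch of how such a synthetic data set could be generated is shown below. It is our own illustration, not the authors' code; NumPy and the concrete sampling routines are assumptions, while the power-law degrees and the Zipf-distributed number of interest domains follow the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
N_NODES, N_DOMAINS = 10_000, 100

# Node degrees: P(degree = k) proportional to k^(-2) (power-law network).
ks = np.arange(1, 1000)
degree_probs = ks.astype(float) ** -2
degree_probs /= degree_probs.sum()
degrees = rng.choice(ks, size=N_NODES, p=degree_probs)

# Number of interest domains per node: Zipf distribution with z = 1.
ds = np.arange(1, N_DOMAINS + 1)
zipf_probs = 1.0 / ds
zipf_probs /= zipf_probs.sum()
n_domains = rng.choice(ds, size=N_NODES, p=zipf_probs)

# Profiles: each node draws its interest domains uniformly from the universe.
profiles = [set(rng.choice(N_DOMAINS, size=d, replace=False)) for d in n_domains]
print(degrees[:5], [len(p) for p in profiles[:5]])
```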
4.2 Measurements
In each experiment, we first set a real value t lying in the range from 0 to 1, called the (propagation) threshold; then we randomly selected a node of the network and randomly chose one of its information items, say I. I was propagated over the network in such a way that each node N propagates I to a set C of nodes, chosen by three different methods:
1. method All, in which C coincides with the set of all contacts of N. This method is used to measure the best result obtainable in each experiment;
2. method Sim, in which C is the set of contacts of N having similarity with N greater than or equal to t;
3. method Exp, in which C is the set of contacts of N having expectation w.r.t. N greater than or equal to t.
At the end of the propagation, we measured the number, say N_x, of nodes that have in their profile information items of the same domain as I and are reached by this information, using method x (thus, for instance, N_All is the number of nodes measured when the method All is used). We also measured how many times the information is propagated to a node by the method Exp. We varied the threshold t from 0 to 0.9, repeated the experiment 50 times, and averaged the measured values.
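A compact sketch of the propagation procedure and of the resulting count is given below (our own illustration; the dictionary-based network representation and the `select_targets` callback, which stands in for the All/Sim/Exp selection rules, are assumptions made for the example).

```python
from collections import deque

def propagate(network, source, item_domain, profiles, select_targets):
    """Flood an information item from `source`, forwarding at each node only to the
    contacts chosen by `select_targets(node, contacts)` (All, Sim or Exp)."""
    reached, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for contact in select_targets(node, network[node]):
            if contact not in reached:
                reached.add(contact)
                queue.append(contact)
    # Count reached nodes interested in the item's domain (N_x in the text).
    return sum(1 for n in reached if item_domain in profiles[n])

def effectiveness(n_method, n_all):
    """Fraction of interested nodes reachable by All that the method also reaches, in %."""
    return 100.0 * n_method / n_all if n_all else 0.0

# Example with a toy 4-node network; method All forwards to every contact.
network = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
profiles = {0: {7}, 1: {7, 3}, 2: {5}, 3: {7}}
n_all = propagate(network, 0, 7, profiles, lambda node, contacts: contacts)
print(n_all, effectiveness(n_all, n_all))
```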
4.3 Results
The effectiveness of the methods Sim and Exp is computed as N_Sim/N_All % and N_Exp/N_All %, respectively. In words, this is the fraction of nodes interested in the information items, reachable by means of the method All, which are also reached by the methods Sim and Exp, respectively. The results of these experiments are depicted in Figure 1. We observe that the efficacy of the methods decreases dramatically as t increases, especially for t > 0.3. Moreover, we observe that the use of expectation always gives better results than that of similarity. In particular, we point out that, setting t ≤ 0.3, the method Exp reaches an efficacy very close to that of the method All, that is, more than 94%. Concerning the amount of information propagated to a node by the method Exp, the results of these experiments are shown in Figure 2. For t = 0 the information is propagated about 19,000 times, whereas setting t = 0.3 this amount is halved. From these experiments, we can conclude that the use of Exp with a threshold t = 0.3 allows us to obtain a quasi-optimal efficacy (about 94% w.r.t. the result obtained by propagating the information to all the contacts) while reducing the number of propagations by 50%, which is a very significant advantage. Clearly, this reduction is obtained by avoiding the propagation of information items to nodes that are not interested in them.
Fig. 1. Effectiveness vs. threshold (Sim and Exp curves; y-axis: Effectiveness %, x-axis: Threshold t)

Fig. 2. Information propagation vs. threshold (Exp curve; y-axis: Information Propagation ×10^4, x-axis: Threshold t)
5 Conclusion and Future Work
Information propagation on social networks is a very relevant topic in the research community. In this paper we have proposed a new method, based on the notion of expectation, allowing us to drive the dissemination of information only towards nodes that are considered more interested in the information. In particular, expectation shows some transitive characteristics that are not typically exhibited by the most used definitions of similarity. Thus, our method improves the similarity-based approach emerging in this field. An experimental validation carried out on synthetic data sets shows that our approach based on expectation is more effective than that exploiting only similarity. At the moment, we are testing the behaviour of our approach when used together with other similarity notions, such as Pointwise Mutual Information or Salton, described in Section 2. As future work, we are studying how to improve our method in order to deal with the issue of spam or virus dissemination, which represents a real problem related to the topic studied in this paper. In this case, prior to the selection of the nodes to which the information is propagated, it will be necessary to decide whether the information is trusted.
Acknowledgement This work was partially funded by the Italian Ministry of Research through the PRIN Project EASE (Entity Aware Search Engines).
References 1. Ahmadi, H., Mehrbakhsh, A., Asgarian, E.: Towards an efficient method for spreading information in social network. In: Asia International Conference on Modelling & Simulation, pp. 152–157 (2009)
2. Aleman-Meza, B., Nagarajan, M., Ramakrishnan, C., Ding, L., Kolari, P., Sheth, A.P., Arpinar, I.B., Joshi, A., Finin, T.: Semantic analytics on social networks: experiences in addressing the problem of conflict of interest detection. In: WWW ’06: Proceedings of the 15th international conference on World Wide Web, pp. 407–416. ACM, New York (2006) 3. Bekkerman, R., McCallum, A.: Disambiguating web appearances of people in a social network. In: Proceedings of the 14th international conference on World Wide Web, pp. 463–470. ACM, New York (2005) 4. Boyd, S., Ghosh, A., Prabhakar, B., Shah, D.: Gossip algorithms: Design, analysis and applications. In: IEEE INFOCOM, vol. 3, p. 1653 (2005) 5. Buccafurri, F., Lax, G.: Enabling Selective Flooding to Reduce P2P Traffic. In: Meersman, R., Tari, Z. (eds.) OTM 2007, Part I. LNCS, vol. 4803, pp. 188–205. Springer, Heidelberg (2007) 6. Bunch, J., Hopcroft, J.: Triangular factorization and inversion by fast matrix multiplication. Math. Comp. 28(125), 231–236 (1974) 7. Cha, M., Mislove, A., Gummadi, K.: A measurement-driven analysis of information propagation in the flickr social network. In: Proceedings of the 18th international conference on World wide web, pp. 721–730. ACM, New York (2009) 8. Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progression. Journal of Symbolic Computation 9(3), 251–280 (1990) 9. Cover, T., Thomas, J.: Elements of information theory, New York (1991) 10. Gruhl, D., Guha, R., Liben-Nowell, D., Tomkins, A.: Information diffusion through blogspace. In: Proceedings of the 13th international conference on World Wide Web, pp. 491–501. ACM, New York (2004) 11. Kang, S.: A note on measures of similarity based on centrality. Social Networks 29(1), 137–142 (2007) 12. Kim, J., Lee, K., Shaw, M., Chang, H., Nelson, M., Easley, R.: A preference scoring technique for personalized advertisements on Internet storefronts. Mathematical and Computer Modelling 44(1-2), 3–15 (2006) 13. Kitts, B., Freed, D., Vrieze, M.: Cross-sell: a fast promotion-tunable customeritem recommendation method based on conditionally independent probabilities. In: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, p. 446. ACM, New York (2000) 14. Kohonen, T.: The Self-Organizing Map. Proceedings of the IEEE 78(9) (1990) 15. Kumar, R., Novak, J., Raghavan, P., Tomkins, A.: On the bursty evolution of blogspace. World Wide Web 8(2), 159–178 (2005) 16. Kwon, I., Kim, C., Kim, K., Kwak, C.: Recommendation of e-commerce sites by matching category-based buyer query and product e-catalogs. Computers in Industry 59(4), 380–394 (2008) 17. Li, Y., Lien, N.: An endorser discovering mechanism for social advertising. In: Proceedings of the 11th International Conference on Electronic Commerce, pp. 125–132. ACM, New York (2009) 18. Lin, D.: An information-theoretic definition of similarity. In: Proceedings of the 15th International Conference on Machine Learning, Citeseer, pp. 296–304 (1998) 19. Matsuo, Y., Hamasaki, M., Nakamura, Y., Nishimura, T., Hasida, K., Takeda, H., Mori, J., Bollegala, D., Ishizuka, M.: Spinning multiple social networks for semantic web. In: Proceedings of the National Conference on Artificial Intelligence, vol. 21, p. 1381. AAAI Press/MIT Press, Menlo Park/Cambridge (1999)
20. Matsuo, Y., Mori, J., Hamasaki, M., Nishimura, T., Takeda, H., Hasida, K., Ishizuka, M.: POLYPHONET: an advanced social network extraction system from the web. Web Semantics: Science, Services and Agents on the World Wide Web 5(4), 262–278 (2007) 21. Mayer, R., Roiger, A., Rauber, A.: Map-based Interfaces for Information Management in Large Text Collections. Journal of Digital Information Management 6(4), 295 (2008) 22. Mitzenmacher, M.: A brief history of generative models for power law and lognormal distributions. Internet Mathematics 1(2), 226–251 (2004) 23. Petrakis, E., Varelas, G., Hliaoutakis, A., Raftopoulou, P.: X-similarity: Computing semantic similarity between concepts from different ontologies. Journal of Digital Information Management 4(4), 233 (2006) 24. Salton, G.: Automatic text processing: the transformation, analysis, and retrieval of information by computer (1989) 25. Shah, D.: Gossip Algorithms. Foundations and Trends in Networking 3(1), 1–125 (2008) 26. Spertus, E., Sahami, M., Buyukkokten, O.: Evaluating similarity measures: a large-scale study in the orkut social network. In: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, p. 684. ACM, New York (2005) 27. Strassen, V.: Gaussian elimination is not optimal. Numerische Mathematik 13, 354–356 (1969) 28. Wikipedia: RSS — Wikipedia, The Free Encyclopedia (2010) (Online; accessed March 8, 2010) 29. Zadeh, L.: Similarity relations and fuzzy orderings. Information Sciences 3, 177–200 (1971) 30. Zipf, G.K.: Human behaviour and the principle of least effort. Addison-Wesley, Reading (1949)
Approaches to Privacy Protection in Location-Based Services Anna Rohunen and Jouni Markkula University of Oulu, Department of Information Processing Science, P.O. Box 3000, 90014 University of Oulu, Finland {Anna.Rohunen,Jouni.Markkula}@oulu.fi
Abstract. Location-based services (LBS) introduce serious privacy threats, which need to be addressed before users and service providers can get the full benefit of these promising services. We addressed this challenge by reviewing and analysing the privacy protection solutions proposed in the literature. Based on the analysis, we identified three general approaches for implementing privacy protection in LBS: privacy parameters, data disclosure control algorithms and information architectures. The implementation of specific privacy protection methods based on these approaches still has many unsolved challenges, such as the data precision requirements of different service types and technical issues in information architectures. In addition, we see that a user-centric approach should be emphasised in future privacy protection method development, in order to foster users' trust in the services. Keywords: Privacy protection, data disclosure control, location-based services.
1 Introduction
Privacy can be defined in general form, for example, as "the personal control over private information; such control is usually assured to the customer through a set of policies that are enforced by some technical means" [1] or as "the ability of individuals to decide when, what, and how information about them is disclosed to others" [2]. Before collecting personal data, consent from the data subjects (users) should be obtained, for example by notifying them about the nature and purpose of the data collection and by offering policy choices [2]. Privacy protection is also required, either through technology, business practices, laws, or some combination thereof, in the use and further dissemination of the disclosed information [2]. Location-based services (LBS) present widely known, serious privacy threats for their users. To use an LBS, a user is required to give her or his location information to the service provider. For the user, this means trading privacy for the service [3]. Public concerns about privacy have to be addressed in order for the benefits of the services to be realized for both the customers and businesses [1]. In addition, there are participatory sensing systems that rely on the altruistic participation of users [4]; the users have to be assured that their privacy will not be violated [4]. In this paper, we present a literature review and analysis of the privacy protection approaches that have been proposed for LBS. The objective is to discover general
approaches to privacy protection in LBS, and the characteristics of these approaches, in order to identify appropriate starting points for constructing practical solutions for specific LBSs in particular application domains. The paper is organized in the following way. In chapter two, the three identified approaches to privacy protection are introduced. After that, the following three chapters present the review and analysis of each of these approaches. Conclusions, including discussion about promising future research directions, are presented in the last chapter.
2 Three Approaches to Privacy Protection in Location-Based Services
The privacy protection issue has been addressed in the literature in many ways. There exist a number of privacy protection methods for ordinary statistical personal data, which have been available for a long time and applied widely in practice. However, LBS and the associated geographic data present new challenges, which require different types of solutions. The traditional approach of pseudonymity (i.e. using a fake identity) is not applicable to LBS, because the location of a person can directly reveal his/her true identity [3]. If an LBS is not trusted, an adversary can collaborate with the LBS to acquire the location information of a user and his/her query [5]. Sensitive data can be revealed by combining the location with other publicly available information [6], or an adversary may have a priori knowledge of the user's movement patterns [4]. Coordinates can be related to a specific user, for example, via physical observation, by triangulating the user's mobile phone signal, or by consulting publicly available databases [5]. Multiple reports can also be linked as being from the same user, and they may reveal the identity of the user [7]. If a user is identified at any point, his/her complete movements can also be exposed [8]. The privacy problems are magnified even further if location information is recorded and distributed continuously in telematics applications such as "pay as you drive" insurance, traffic monitoring, or fleet management [9]. Instead of only learning about the network services that a user uses, an adversary can also track the user's movements and thus obtain real-world information, such as frequent visits to a medical doctor, a nightclub, or political organizations [9]. In this paper, we present a literature review and analysis of the privacy protection methods that have been proposed for LBS. Based on our analysis, we identified three general approaches to LBS privacy protection:
• Privacy parameters: Using privacy parameters (for example k-anonymity and minimum cloaked area Amin) that define users' privacy requirements.
• Data disclosure control algorithms: Blurring information via data disclosure control algorithms, based on parameters specified by users.
• Information architectures: Information architectures based on, for example, a trusted third party or data aggregation close to the data source.
The results of the review and analysis are presented in the following chapters organised according to these three approaches.
3 Privacy Parameters
The first identified general privacy protection approach is based on privacy parameters. In the context of LBS, it approaches privacy protection by utilizing parameters which set the level of privacy risk. From the LBS system point of view, this means that the user (the client) can set the privacy risk level as a parameter, which is used by the service to specify the allowed data privacy level. The privacy parameters specify the amount and precision of the information the user is willing to reveal about her or his location in order to use the service. The amount of disclosed personal information is usually in direct relation with the quality of service: tuning the privacy parameters can be seen as a trade-off between the amount of information that a user is willing to reveal about her or his locations and the quality of service [3]. The approaches utilising privacy parameters in privacy protection usually base their methods on the concept of k-anonymity [3], [10], [11]. The concept of k-anonymity originates from releasing private data with scientific guarantees that the individuals, i.e. the data subjects, cannot be re-identified while the data remain practically useful [12]. k-anonymity is achieved if the information for each person contained in the release cannot be distinguished from that of at least k-1 other individuals whose information also appears in the release [12]. The set of users defined by k is called the anonymizing set (AS) [5]. In the case of LBS involving geographic data, the corresponding AS is specified by an anonymizing spatial region (ASR), which can be defined as "the area that encloses the client that issued the query, and at least k-1 other users" [5]. Privacy parameters usually include the k-value itself, which determines the level of anonymity of a user [3], [10], [11]. The minimum spatial area, Amin, is used to determine the minimum spatial area around the requester where the participants of the anonymity set should be searched for [10]. In sparse areas it is useful to define also a maximum search area [11]. Temporal constraints can be used to let the user specify multiple instances of the above-mentioned parameters for different time intervals [11]. A preferred parameter value can also be defined for each requested LBS [10]. As already mentioned above, tuning the privacy parameters can be seen as achieving the user's trade-off between the amount of information she or he is willing to reveal about her or his locations and the quality of the service. However, it has to be borne in mind that there are differences in data accuracy requirements between services. For example, for driving conditions monitoring, highly accurate position information is not necessary: road segments of about 100 m should be a suitable resolution for most cases [9]. For applications collecting long-term statistical information about accident risk, delay is not important and time accuracy requirements are low [9]; instead, for this kind of application precise location information is crucial [9]. For drivers requesting information related to their current location from an LBS, the location can be transmitted with medium accuracy [9]. In addition, for different area types in terms of user density (city center, rural area), the privacy parameters should be defined differently. Tuning the privacy parameters has also been criticized: Kapadia et al. state that users' privacy is maximized when all users have the same value of k, because the user's preferred degree of anonymity can itself leak information about him [13].
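To make the parameter-based approach concrete, the sketch below (our own illustration, not taken from any of the cited systems) represents a user's privacy profile and checks whether a candidate cloaked region satisfies the k-anonymity and minimum-area requirements; the field names and example values are assumptions.

```python
from dataclasses import dataclass

@dataclass
class PrivacyProfile:
    k: int            # k-anonymity: at least k users must share the cloaked region
    a_min: float      # minimum cloaked area (e.g., in square metres)
    a_max: float      # maximum search area, useful in sparse regions

def region_is_acceptable(profile, region_area, users_in_region):
    """Check a candidate anonymizing spatial region (ASR) against the privacy parameters."""
    if region_area > profile.a_max:
        return False                       # expanding further would degrade service quality
    return users_in_region >= profile.k and region_area >= profile.a_min

# Example: a user demanding 10-anonymity within at least 0.5 km^2.
profile = PrivacyProfile(k=10, a_min=500_000.0, a_max=5_000_000.0)
print(region_is_acceptable(profile, region_area=750_000.0, users_in_region=12))  # True
print(region_is_acceptable(profile, region_area=750_000.0, users_in_region=4))   # False
```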
4 Data Disclosure Control Algorithms
The second identified general privacy protection approach is based on the utilisation of data disclosure control algorithms. This approach is the next level, after the setting of privacy parameters: when the privacy parameters defining the level of allowed privacy risk are set, data disclosure algorithms can be applied to achieve that level of protection without completely destroying the usability of the personal data. In the context of LBS systems, this means that particular data disclosure control algorithms are applied for processing the personal data on the service side. Data disclosure control algorithms applied to geographic data can be roughly classified into four categories: 1) decreasing the spatial accuracy of the data, 2) decreasing the temporal accuracy of the data, 3) reducing the sampling frequency, and 4) determining areas where location updates should be provided. In the literature, decreasing spatial accuracy seems to be the most frequently used category. Spatial accuracy can be decreased through tessellation algorithms that are based on the generalization of a point coordinate to a plane [4]. User coordinates are replaced with a spatial region also in Nearest Neighbor Cloak (NNC) and Hilbert Cloak (HC) [5]: NNC is not vulnerable to the center-of-ASR attack, while HC satisfies reciprocity and never reveals the query source. Also the HilbASR algorithm, presented in [6], guarantees anonymity under any distribution; the probability of identifying the query initiator is always bounded by 1/k, even if the attacker knows the locations of the users. In microaggregation the data values are replaced by the means of equivalence classes (ECs), and both spatial and temporal privacy can be ensured; Huang et al. [4] have presented several modified algorithms for generating ECs with maximum within-class homogeneity. The location can also be disturbed by Gaussian noise [14]. However, there are problems with random perturbation: noise with large variance does not preserve sufficient data accuracy, while noise with small variance may be filtered out by tracking algorithms [15]. A path perturbation algorithm can be used when the paths of two users meet; the location information is perturbed to increase the chances of confusion [16]. Temporal accuracy can be decreased by delaying a user's request until k vehicles have visited the area [9]. The sampling frequency can also be reduced [17]. Virtual trip lines (VTLs), stored in the client, can be seen as a sampling-in-space method: they indicate where location updates should be provided [18]. In sparse traffic situations VTLs can also be changed, based on traffic density heuristics [18], and there are algorithms for choosing VTLs so as to maximize travel time accuracy and preserve privacy [18]. A combination of sampling in time and space is reported in [19]. In this approach, location information can be reduced both in and around sensitive areas via three algorithms: 1) the Base algorithm releases location updates only in the areas that are classified as insensitive, 2) the Bounded-rate algorithm includes the Base algorithm and uses a predefined threshold frequency for sending updates, and 3) the k-area algorithm, where location updates are restricted only when an individual enters a sensitive area, and updates are released only when they do not give away which of at least k sensitive areas the user visited.
These three algorithms are based on the assumption that an adversary has no significant a priori knowledge about the user's location, such as the user's preferences for a certain type of location.
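As an illustration of the first category (decreasing spatial accuracy), the following sketch shows a simple grid-based cloaking step that enlarges the cell around the querying user until it covers at least k users. It is our own simplified example and is not the tessellation, NNC, HC or HilbASR algorithm discussed above; in particular, it does not guarantee reciprocity.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cloak(user: Point, others: List[Point], k: int,
          cell_size: float = 100.0, max_level: int = 10) -> Tuple[float, float, float, float]:
    """
    Return a rectangular anonymizing spatial region (x0, y0, x1, y1) that
    contains the user and at least k-1 other users, by doubling the grid
    cell around the user until enough users fall inside it.
    """
    for level in range(max_level + 1):
        size = cell_size * (2 ** level)            # cell edge doubles at each level
        col, row = int(user[0] // size), int(user[1] // size)
        x0, y0 = col * size, row * size
        x1, y1 = x0 + size, y0 + size
        inside = sum(1 for (x, y) in others if x0 <= x < x1 and y0 <= y < y1)
        if inside + 1 >= k:                        # +1 counts the querying user herself
            return (x0, y0, x1, y1)
    raise ValueError("not enough users nearby to reach k-anonymity")


# Example: cloak a user among four neighbours with k = 3
print(cloak((130.0, 240.0), [(90.0, 210.0), (180.0, 300.0), (900.0, 900.0)], k=3))
```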
If user density is high, data disclosure control algorithms based on k-anonymity provide sufficient accuracy for LBSs. However, with a low user density and a large anonymizing spatial region, they do not achieve the high accuracy required, for example, by traffic monitoring applications. Data protection algorithms that create confusion where the traces of several users converge achieve accuracy and provide a defined level of privacy in the areas of confusion [15]. They still cannot provide overall privacy guarantees, because these areas of confusion might not occur in lower-density areas [15]. Even when the reduced information of spatial and temporal cloaking is used, a user's movements can be inferred if the data are tracked for several minutes [20]. Kido et al. [20] therefore propose disturbing position data with noise that consists of a set of false position data, called dummies. There are also challenges related to the determination of the k-value. When k-anonymity is applied with a statistically determined k-value, smaller values of k could also provide the desired privacy levels [21]; hence, a better quality of service could sometimes be guaranteed through a decreased k-value [21]. On the other hand, in spatio-temporal cloaking some regions can leak information about users even if k-anonymity is obtained (for example, when having a meeting in an office) [13].
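The dummy approach of Kido et al. [20] mentioned above can be sketched as follows (our own minimal illustration; the uniform placement and the parameter names are simplifying assumptions, not the published algorithm): the client reports its true position hidden among k-1 false positions, so the server cannot tell which report is real.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def positions_with_dummies(true_pos: Point, k: int, spread: float = 500.0) -> List[Point]:
    """Return k positions (the true one plus k-1 dummies) in random order."""
    dummies = [(true_pos[0] + random.uniform(-spread, spread),
                true_pos[1] + random.uniform(-spread, spread))
               for _ in range(k - 1)]
    candidates = dummies + [true_pos]
    random.shuffle(candidates)   # the server cannot tell which position is real
    return candidates
```

In practice the dummies would have to follow realistic movement patterns over time, otherwise the tracking attacks discussed above could filter them out.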
5 Information Architectures

The third identified general privacy protection approach is based on information-system-level solutions. As noted in the previous chapter, data disclosure control algorithms are executed on the server side and solve privacy problems at the data level; the information architecture approach handles the privacy problem at a higher level. In the context of LBS, the information architecture approach provides privacy protection by managing location information in system-level information architectures.

The basic generic solution is based on trusted third-party servers. LBS systems can include trusted (third-party) servers which handle the privacy protection tasks. Trusted servers can be used to execute data disclosure control algorithms to blur the exact location information collected from a user. They can also maintain databases storing, for example, users' privacy profiles with privacy parameters, users' movement histories, their frequent, safe and unsafe routes, and some performance-related data. The trusted servers then send anonymized location data to the LBSs for processing. [3, 10]

A distributed information architecture, in contrast, is composed only of mobile clients and LBS servers. For example, in the peer-to-peer (P2P) spatial cloaking presented in [22], each mobile user can communicate with the location-based database server and with other peers. Before requesting any LBS, the mobile user forms a group from his peers, and the cloaked spatial area is computed as the region that covers the entire group. To issue the query to the location-based database server, a mobile client in the group is randomly selected as an agent. The agent forwards the query to the location-based database server, which processes the query with respect to the cloaked spatial region and sends a list of candidate answers back to the agent. The agent forwards the list to the request originator, who filters out the false answers.
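The P2P cloaking flow of [22] described above can be restated in a short sketch. This is our own pseudocode-style illustration: the server call and the answer filter are placeholder functions, not an actual API.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]
Region = Tuple[float, float, float, float]


def send_to_lbs_server(agent: Point, query: str, region: Region) -> List[str]:
    """Placeholder for the location-based database server: it would evaluate the
    query over the cloaked region and return a list of candidate answers."""
    return []


def relevant_to(answer: str, position: Point) -> bool:
    """Placeholder filter used by the query originator to drop false candidates."""
    return True


def p2p_cloaked_query(me: Point, peers: List[Point], k: int, query: str) -> List[str]:
    """P2P spatial cloaking as described in the text: form a group with k-1 peers,
    cloak a region over the whole group, let a random agent issue the query,
    and let the originator filter the candidate answers."""
    if len(peers) < k - 1:
        raise RuntimeError("not enough peers in communication range for k-anonymity")
    group = random.sample(peers, k - 1) + [me]
    xs, ys = zip(*group)
    cloaked_region = (min(xs), min(ys), max(xs), max(ys))   # covers the entire group
    agent = random.choice(group)                            # randomly selected agent
    candidates = send_to_lbs_server(agent, query, cloaked_region)
    return [a for a in candidates if relevant_to(a, me)]    # originator removes false answers
```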
In centralized information architectures, the trusted third-party servers are bottlenecks that handle huge amounts of information, and in fault situations their failure is fatal to the whole system. These problems can be avoided in distributed information architectures. On the other hand, improving privacy against the system in P2P-based information architectures poses a new challenge: k-1 other users must be online in the user's vicinity in order to aggregate data locally before presenting it to the server, and these users may be located beyond the range of P2P communication (e.g., if Bluetooth is used) [13].
6 Conclusions

In this paper, we presented a literature review and analysis of privacy protection methods in the context of LBS. We identified three general approaches to implementing privacy protection: privacy parameters, data disclosure control algorithms and information architectures. Each of the three approaches, with their characteristics and limitations, was described through examples of specific methods.

Several challenges still remain concerning the determination of privacy parameters, the use of data disclosure control algorithms, and information architectures. When determining privacy parameters, data accuracy requirements have to be considered with respect to service types. With data disclosure control algorithms, both accuracy and privacy concerns have to be addressed in order to ensure sufficient privacy and quality of service. Concerning the information architectures, there are many technical challenges which need to be taken into account when implementing the services.

The information architecture consideration also brought up concerns about the future development of LBS and related applications. Traditional service-provider-centric client–server architectures are not necessarily the most appropriate way to build services in the future. Distributed systems and ubiquitous computing environments seem to be the direction in which location-based services are developing. This development calls for user-centric views in service and application design, and consequently implies moving from a service-centric to a user-centric view also in privacy protection. This change of viewpoint brings new challenges to privacy protection methods: the traditional methods might no longer be efficient, or even adequate. However, possibilities for applying traditional approaches in novel ways also emerge, and even completely new types of privacy protection methods can be developed. The most interesting and promising future research possibilities seem to lie in this paradigm change, which follows from the evident technological development towards ubiquitous computing environments and a user-centric view in location-based service development.

Acknowledgments. The research presented in this paper was carried out in the Sensor Data Fusion and Applications project, as a part of the Cooperative Traffic research program of the Strategic Centre for Science, Technology and Innovation in the Field of ICT, and was funded by the National Technology Agency of Finland.
References 1. Youssef, M., Atluri, V., Adam, N.R.: Preserving Mobile Customer Privacy: An Access Control System for Moving Objects and Customer Profiles. In: Zaslavsky, A., Delis, A., Wolfson, O., Chrysanthis, P.K., Samaras, G. (eds.) Proceedings of the 6th international conference on Mobile data management, pp. 67–76. ACM, New York (2005) 2. Duri, S., Gruteser, M., Liu, X., Moskowitz, P., Perez, R., Singh, M., Tang, J.-M.: Framework for Security and Privacy in Automotive Telematics. In: Proceedings of the 2nd international workshop on Mobile commerce, pp. 25–32. ACM, New York (2002) 3. Mokbel, M.F., Chow, C.-Y., Aref, W.G.: The New Casper: Query Processing for Location Services without Compromising Privacy. In: Dayal, U., Whang, K.-Y., Lomet, D., Alonso, G., Lohman, G., Kersten, M., Cha, S.K., Kim, Y.-K. (eds.) Proceedings of the 32nd international conference on Very large data bases. VLDB Endowment, pp. 763–774 (2006) 4. Huang, K.L., Kanhere, S.S., Hu, W.: Preserving privacy in participatory sensing systems. Computer Communications (2009) 5. Kalnis, P., Ghinita, G., Mouratidis, K., Papadias, D.: Preventing Location-Based Identity Inference in Anonymous Spatial Queries. IEEE Transactions on Knowledge and Data Engineering 19, 1719–1733 (2007) 6. Ghinita, G., Kalnis, P., Skiadopoulos, S.: PRIVÉ: Anonymous Location-Based Queries in Distributed Mobile Systems. In: Proceedings of the 16th international conference on World Wide Web, pp. 371–380. ACM, New York (2007) 7. Cornelius, C., Kapadia, A., Kotz, D., Peebles, D., Shin, M., Triandopoulos, N.: AnonySense: Privacy-Aware People-Centric Sensing. In: Proceedings of the 6th international conference on Mobile systems, applications, and services, pp. 211–224. ACM, New York (2008) 8. Gedik, B., Liu, L.: Location Privacy in Mobile Systems: A Personalized Anonymization Model. In: Proceedings of the 25th IEEE International Conference on Distributed Computing Systems. IEEE Computer Society, Washington (2005) 9. Gruteser, M., Grunwald, D.: Anonymous Usage of Location-Based Services through Spatial and Temporal Cloaking. In: Proceedings of the 1st international conference on Mobile systems, applications and services, pp. 31–42. ACM, New York (2003) 10. Gkoulalas-Divanis, A., Verykios, V.S., Bozanis, P.: A network aware privacy model for online requests in trajectory data. Data & Knowledge Engineering 68, 431–452 (2009) 11. Mokbel, F.M.: Towards Privacy-Aware Location-Based Database Servers. In: Proceedings of the 22nd International Conference on Data Engineering Workshops, p. 93. IEEE Computer Society, Washington (2006) 12. Sweeney, L.: k-anonymity: A Model for Protecting Privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10, 557–570 (2002) 13. Kapadia, A., Triandopoulos, N., Cornelius, C., Peebles, D., Kotz, D.: AnonySense: Opportunistic and Privacy-Preserving Context Collection. In: Indulska, J., Patterson, D.J., Rodden, T., Ott, M. (eds.) PERVASIVE 2008. LNCS, vol. 5013, pp. 280–297. Springer, Heidelberg (2008) 14. Domingo-Ferrer, J.: Microaggregation for Database and Location Privacy. In: Etzion, O., Kuflik, T., Motro, A. (eds.) NGITS 2006. LNCS, vol. 4032, pp. 106–116. Springer, Heidelberg (2006) 15. Hoh, B., Gruteser, M., Xiong, H., Alrabady, A.: Preserving Privacy in GPS Traces via Uncertainty-Aware Path Cloaking. In: De Capitani di Vimercati, S., Syverson, P., Evans, D. (eds.) Proceedings of the 14th ACM conference on Computer and communications security, pp. 161–171. ACM, New York (2007)
16. Hoh, B., Gruteser, M.: Protecting Location Privacy through Path Confusion. In: Proceedings of the First International Conference on Security and Privacy for Emerging Areas in Communications Networks, pp. 194–205. IEEE Computer Society, Washington (2005) 17. Hoh, B., Gruteser, M., Xiong, H., Alrabady, A.: Enhancing Security and Privacy in Traffic-Monitoring Systems. IEEE Pervasive Computing 5, 38–46 (2006) 18. Hoh, B., Gruteser, M., Herring, R., Ban, J., Work, D., Herrera, J.-C., Bayen, A.M., Annavaram, M., Jacobson, Q.: Virtual Trip Lines for Distributed Privacy-Preserving Traffic Monitoring. In: Proceedings of the 6th international conference on Mobile systems, applications, and services, pp. 15–28. ACM, New York (2008) 19. Gruteser, M., Liu, X.: Protecting Privacy in Continuous Location Tracking Applications. IEEE Security and Privacy 2, 28–34 (2004) 20. Kido, H., Yanagisawa, Y., Satoh, T.: An Anonymous Communication Technique using Dummies for Location-based Services. In: Proceedings of International Conference on Pervasive Services, pp. 88–97 (2005) 21. Ravi, N., Gruteser, M., Iftode, L.: Non-Inference: An Information Flow Control Model for Location-based Services. In: Proceedings of the 3rd International Conference on Mobile and Ubiquitous Systems - Workshops, pp. 1–10 (2006) 22. Chow, C.-Y., Mokbel, M.F., Liu, X.: A Peer-to-Peer Spatial Cloaking Algorithm for Anonymous Location-based Services. In: Proceedings of the 14th annual ACM international symposium on Advances in geographic information systems, pp. 171–178. ACM, New York (2006)
Social Media as Means for Company Communication and Service Design
Elina Annanperä and Jouni Markkula
University of Oulu, Department of Information Processing Science, P.O. Box 3000, FI-90014 University of Oulu, Finland
{Elina.Annanpera,Jouni.Markkula}@oulu.fi
Abstract. Service development in companies can take a new form when social media is used as a communication interface. This communication can occur between a company and its customers, but a company's internal communication can also benefit from social media services. In this paper, we review and analyze the use of social media as a means for company communication in general, and specifically in the case of customer involvement in service development. As a result, we present guidelines for using social media tools in companies to improve the customer-centric service development process.

Keywords: Social media, service development, service design, company communication, user involvement.
1 Introduction

In recent years, social media has become the most popular type of application on the Internet. Its popularity has also caught the attention of companies, which have gradually started to use it for different purposes while still searching for its business potential. The use of social media in companies is still at an early stage, and its potential for business, or as a part of it, has not yet been fully utilized.

As social media is a place where people are present, express themselves and communicate, it is an increasingly important customer interface for companies. It is a means of collecting customer data on their behavior, needs and interests. In addition to delivering information from companies to customers, as in marketing, it is also an information channel from the customers to the companies. From the companies' point of view, using social media as a customer interface is thus one of its clear applications. However, it also has other uses. Companies are starting to use social media internally as well. Company personnel are probably already using social media services for their own personal purposes, so their experience can be transferred to the company's benefit. This, too, has further potential that can be more extensively utilized for business purposes.

When considering the business value of social media, we take service design as our point of view. Companies are increasingly moving their business towards services, which can also support their product business. User-centric development is the natural way to see the potential of social media. In user-centric development, the customer's
feedback can be utilized for developing existing services further, or customers' ideas can be used for user innovation. Social media facilitates this innovation input from the users. However, customer information alone is not enough for developing services; it is equally important to consider the management and processing of customer information within the company, in order to end up with real results in the form of new services.

The objective of our study in this paper is to analyze the characteristics and potential of social media in company communication and user-centric service development. We first review the characteristics of social media and analyze its usage in company communication. After that, we present a service development process model in which the focus is on customer involvement. This model is further used for associating the possibilities of social media with the different stages of service design. As a result, we suggest guidelines for the utilization of social media in service design.

The paper is organized in the following way. In the next chapter, we review and analyze social media and its usage in company communication. After that, in Chapter 3, we describe a service development process model, specifying its key stages. Based on these, in Chapter 4, we present guidelines for using social media in the customer-centric service development process. The conclusions are drawn in the final Chapter 5.
2 Social Media in Company Communication

Social media can be regarded as an umbrella concept covering Internet communities and a variety of associated communication tools [1]. The key aspects of social media are the networks of online communities and the possibilities of sharing user-created content. Another characteristic in the definition of social media is that there has to be an element of user creativity and production of content, which can be regarded as the social aspect of social media [2]. The different features of Facebook and other networking tools have been thoroughly discussed, for example, by Boyd and Ellison [2]. Other examples of social media tools are user-created content sharing services, such as YouTube and Flickr, content collaboration platforms, such as Wikipedia and other wiki tools, and different social bookmarking services, such as Delicious. Social media is seen as one indication of the transformation from information consumption to information creation by users on the Internet. Characteristic features of social media are the integration of technology, social interaction and user-generated content [3]. From the viewpoint of our study, these features can be seen as tools enabling communication, while the communities enable networking between people.

2.1 External Communication

Companies are increasingly using social media services for communication with customers, instead of only as a marketing channel, and some of them have found innovative ways to use the various available tools and services. For example, Facebook fan pages are used to increase customers' interest toward the companies, i.e., to raise awareness and use the interface for marketing and brand management [4]. Lego, the building block manufacturer, has embraced the innovative minds of its users and encourages people to post ideas for toys and alternative building instructions on its several social media sites. These sites are now popular among people of
all ages, and they are a widely cited example of good brand management on the company's side. According to a Lego community executive, the company was not thrilled when it first found out that people were using its products in new ways, fearing that this would damage the brand [5].

The most popular company use of social media tools at present is brand visibility: engaging customers to visit the company's social media sites and add themselves as followers, so that the customers receive updates from the company sites. For example, Dell has created a community that takes advantage of the many features provided by the social media sphere [6]. The community offers support and news concerning Dell products and encourages user participation by asking customers to discuss and suggest topics, and to vote on them. Dell is also an example of a company which offers the possibility to create content and to form customer-centered online communities. The company has created communities in which customers add multimedia content about their use of different products and give each other advice in problem situations. In other words, the social media community of Dell is an integral part of its global customer service.

When companies realize that they should offer their services and be available where their customers spend their time, they can start utilizing social media tools for customer service. Other examples of service-driven companies are HP [7], with a support page on Facebook, and Samsung, with help desk personnel on Twitter. Twitter, as a social media tool designed for quick and short messaging, allows customers to make their first contact with the company quickly and easily. Companies can entice people to spend more time in their social media communities by providing visitors with games, questionnaires and prizes that they can win by contributing. At the same time, social media adds another dimension to external communication: customers can be given the opportunity to really contribute to the community formed for them. In social media, user-generated content can be easily added, and customers should be given the opportunity to use their creativity by adding pictures, video or other media to the community. [8]

2.2 Internal Communication

Within companies, probably the most common means of internal communication today are still e-mail and the intranet. A common sentiment about e-mailing policies among the members of any organization is that their mailboxes tend to get flooded with messages that do not really concern them, or that they simply receive too much mail daily. Social media services, as an alternative information distribution channel, have the potential to decrease some of this mail flood. General announcements and conversations can be read from the company's community page. Social media tools can also be used for locating the right people, i.e., for finding resources, information and expertise within the organization. [9]

In addition, company resources, documents and guidelines can be transferred to a Wiki platform, where employees can easily find the needed information. The employees should also be encouraged to contribute to the development of the Wiki environment. The idea is that all members of the organization have access and the right to change and add content on the Wiki platform. Sometimes company guidelines and templates have been made by someone who does not actually use them
in everyday work. Over time the documents have been modified to fit their purpose better, but the new versions exist only on some individual's computer, instead of being available to all. Intranets could be completely replaced by Wikis, which would ideally form a good platform for passing on all knowledge inside the company. [10]

Companies such as Sun Microsystems have moved some of their internal corporate communication to social media services, claiming that this has boosted creativity in the organization. One of their findings is that company meetings can be handled well using online social media tools instead of face-to-face meetings. They have also discovered that, by using social media tools, the gap between company executives and employees has grown smaller. [9]

2.3 Potential of Social Media in Company Communication

One of the benefits of using social media services as a communication interface is that a lot of people are already there; they know how to use most of the features, and the threshold for participating in interaction with others is perhaps lower. From the company point of view, these services offer existing tools for adding the company's own content, instead of having to build a new service or web site.

Social media tools are, by their nature, open to all people. This may worry some companies at first: they have to accept and embrace the fact that social media is an open forum for all opinions and views. Users may not always give companies the feedback, or participate in the form, that the companies would prefer. Also, customers tend to trust the information about companies that they receive from other users in social media (see [11]). On the other hand, social media is an easy and inexpensive channel for communicating with customers, and it can reach a wide audience. This should be seen as an opportunity for the companies. Social media is an open communication channel between the company and the customers, but also between the customers themselves.

As an internal communication channel, some social media services offer possibilities to limit the visibility of and access to the site, if needed. They also have multiple features for building small applets, gathering data with the available analysis tools, creating discussion areas, and sharing multimedia content. Creating polls or questionnaires is mostly possible as well [1]. The advantages of using social media tools for company communication, whether external or internal, include that they provide a given format or template for communication and collaboration. The platforms also offer centralized information gathering: as the social media services are designed for content sharing and networking, a lot of the information is ready-formatted. Tags are in use in almost all of the services, and many information fields have a defined purpose. Utilizing these features means that part of the content is in a format that can be easily found and analyzed. In addition, the range of social media tools is relatively wide, which makes finding a suitable communication channel and reaching the customers relatively easy. At the moment, utilizing the tags that users attach to the content they add to social media is suggested as one of the most effective ways of finding relevant information in social media. Automated analysis tools for written text are also an emerging research topic [10, 12].
The analysis of the content coming from the customers can thus be handled using the same services as part of the company's internal procedures. In the same way, customers can be asked to rate service ideas or
give ideas for new development, and the company personnel can use the same tools for improving their work and processes as well. However, these different features for collecting information are not yet widely used. In order to really benefit from the information social media could offer, they also have to be developed further. Automated data collection needs to be designed within the company to meet its needs; the questions related to automated data collection, however, are topics for future research. The social media tools are widely available for companies' communication purposes. The question remains how to get the most out of them in the companies. One of the most promising possibilities is to use them for user involvement in service development.
3 Customer Involvement in Service Development

The importance of services to the traditional manufacturing industry, alongside the service industry, has led to an emerging research interest in the field of service development. According to Edvardsson et al. [13], customer involvement in service development became of interest to companies after it had been used successfully for some time in product development. The earliest research in this field was conducted at the end of the 1970s and the beginning of the 1980s. The aim of customer involvement in service development is to discover the customers' needs, wishes and preferences in order to find out what would create value for the customer; these are also the traditional aspects of market research [13]. The more modern approach to customer involvement in service development includes involving customers in creating and assessing ideas, and in the stages of developing and designing new solutions to expressed needs [14]. Some studies suggest that the same models can be applied to both manufacturing industry product development and service development [15]. Most of the customer involvement methods suggested in the service development literature are based on case study research or on literature-based theoretical research [see, for example, 16].

A number of service development process models have been presented in the literature. The main phases of these various process models can be generalized as design, production and consumption [17]. Our interest lies in the first of these general phases, design. The design phase can be further divided into a number of steps in several ways. For the present purpose, we use as a basis the development process model presented by Alam [18] and complement it with some aspects presented by Ulwick [19]. We use the key stages of the service development process for examining the involvement of customers in service design.

Based on the process model presented by Alam [18], the service development process can include ten sequential stages: strategic planning, idea generation, idea screening, business analysis, the formation of the cross-functional team, service and process/system design, personnel training, service testing and pilot run, test marketing, and commercialization. Although users can be involved in all of these stages, three of them have been found to be more important from the user involvement point of view than the others: idea generation, service and process/system design, and service testing and pilot run [18]. For the present purpose, we will focus on these three stages, supplementing them with the test marketing
stage and an idea refinement stage, which is a combination of the above-mentioned idea screening and business analysis stages. For the further analysis, we thus have five key stages of the service design process: idea generation; idea refinement; service and process/system design; service testing and pilot run; and test marketing.

In the idea generation stage, the service producers (meaning both the company and the personnel involved in the service design) search for ideas, internally and externally. At this stage, customer information related to needs, preferences and choice criteria is needed. Customer involvement can be used, for example, in gathering service requirements, customers' needs, problems and proposed solutions, evaluations of existing services, and new service adoption criteria. [18]

In the idea refinement stage, which includes idea screening and business analysis, the service producers carry out a feasibility analysis, gather user problems and their solutions, eliminate the weak concepts based on information about user needs, assess the markets and the market potential of the developed services, and carry out economic and competitive analyses. At this stage, customers can be involved, for example, in analyzing the ideas, suggesting desired features, providing reactions to the concepts, giving information about purchase intent as a basis for profitability analysis, and even giving marketing ideas. The service concepts can also be put to a vote among the customers, in addition to the service producers. At this point, the ideas that have the most market potential and are feasible to develop are picked from the idea pool. [18, 19]

In the service and process/system design stage, the service producers design the service based on the results of the earlier stages. The design includes specification of the delivery process, personnel, documentation, delivery mechanism, etc. Customer involvement can be used, for example, for suggesting improvements, identifying failing points and observing the delivery trials. [18]

In the service testing and pilot run stage, the service producers test the service concept, implement and refine the service design, test the service in real conditions and study user acceptance of the service. At this stage, customer involvement can be used in the evaluation of the service concept and the testing of the service, as well as in suggesting changes and improvements. [18]

In the test marketing stage, the service producers develop a marketing plan and examine the marketing options, also making a limited rollout of the service in selected markets. In this stage, customer involvement can be utilized in getting feedback on different aspects of the service and its marketing, customer satisfaction and suggestions for improvements. The users can also take part in further development cycles by giving their feedback and testing the services. [18]

Customer involvement can have a significant impact on the service design process. Magnusson et al. [20], for example, obtained the following results in their research on user involvement in the technology service development process. When groups of experts were compared with groups of common users, the users had more new service ideas than the experts. Although the experts' ideas were more easily implemented into usable services, the common users had more original ideas.
Indeed, if users are given too much information about the technical constraints, they tend not to present their most original ideas, and thus finding truly novel ideas can be difficult through user involvement. Also, simply collecting ideas using an open question form can prove ineffective; instead, users should be involved in a more in-depth study where
they actually get to exchange ideas, try out some new services themselves, and make suggestions based on this. This can generate more usable ideas from the company's point of view. Using user involvement in service development can, at best, lead to differentiated services, reduced development cycle time, fast diffusion among the users and longer customer relationships [18].

When users are involved in the service design process, their input and feedback are typically collected and received by various means. User input is collected, for example, from in-depth or focus group interviews, from user participation in service producers' meetings and brainstorming sessions, from the observation of users, and by collecting their feedback. In addition to face-to-face participation, the traditional communication channels used are phone, fax and e-mail. The most used user involvement methods in service design appear to have been in-depth interviews and user visits, while focus group interviews, phone, fax and e-mail have not been utilized much. [18]

It is quite clear that the involvement of users in the service development process has advantages. Therefore, companies nowadays use user-centric approaches, in a variety of forms, quite widely. However, the methods of user involvement and of obtaining user input are still mostly traditional, based on the solutions presented above. The potential of the new media and communication channels has not yet been widely utilized, or even fully noticed. Therefore, it is essential to analyze the possibilities that the emerging popularization of social media provides.
4 Utilizing Social Media in Service Development

Social media tools have evident potential as a means for customer involvement and information collection in the service development process. As the examples in our study highlight, there are already existing cases where some aspects of social media have been utilized in service development processes. However, the existing practices have not yet been systematically analyzed, and coherent processes and guidance for companies are still missing. Our objective here is to bring together the processes from service development and company communication practices, and to consider the possibilities of using social media as a contemporary novel tool in developing these processes. We have taken the service development process steps, presented in Chapter 3, and associated with them the different social media features that the tools and communities can potentially offer. As a result, we have produced a suggestion of guidelines for using social media tools in companies to improve customer-centered service development, which is presented in Table 1 below.

Our guidelines show that social media offers tools for all of the stages of service development, from idea generation to test marketing. We also argue that in-house service development processes can benefit significantly from the introduction of social media and customer involvement. This would be achieved by increasing the exchange of ideas between common users and company personnel, who are more experienced and have an insight into the industry, and may thus be able to take the customers' ideas "to the next level" [20]. When social media usage has been introduced into service development, it should also be utilized to bring the developers and the customers closer to each other, so that they interact with each other more directly.
Table 1. Usage of social media for customer involvement in service development

Development Step | Customer Input | Social Media Use
Idea generation | Customers involved in gathering service requirements, needs, problems, and potential solutions; evaluation of existing services | Company: find the suitable customers from the online communities; Customers: give input or feedback by discussing in the online community or answering a questionnaire online; an idea game
Idea refinement | Customers analyze the ideas, suggest desired features, give reactions to concepts; information on purchase intent | Voting and rating systems in the company's social media site; questionnaires
Service and process/system design | Customers suggest improvements, identify failing points, observe delivery trial | Discussion in the online community; multimedia content evaluation
Service testing and pilot run | Customers participate in evaluation of service concepts; testing of the service, suggesting improvements | Company social media site can be used to find pilot testing groups; gathering feedback on the service
Test marketing | Customer feedback on marketing plan, satisfaction with marketing and suggestions for improvement | Multimedia content evaluation; idea games for marketing; gathering feedback
In this way, valuable customer feedback and innovations are not lost in the barriers of communication channels, but are brought to the attention of the right people in the company: those who are willing and able to utilize this information for improving existing services and generating new ones.
5 Conclusions

Social media tools and communities are already popular among the extensive Internet user population. On the Internet, people are actively networking and collaborating with each other, as well as creating media content and producing information for the services, using openly available social media tools and communities. Because social media services have become popular so fast, companies have also had to react and join them. Many companies, however, may not yet have full insight into how to make their entrance to social media successfully. In this paper, we have presented some successful examples of using social media, ranging from customer service to
product development. Considering the service industry, companies starting to utilize social media services can learn from product development, as Lego has done, since more success stories are available on how companies have been able to engage their customers in the product development process. Customer involvement in product and service development through social media is clearly the future direction for improving service development in the Internet age. We chose this phenomenon as our research challenge in this paper.

The objective of our study was to analyze the potential of social media for communication and for supporting service development in companies. We approached the issue by first reviewing the characteristics of social media and analyzing its usage, especially from the viewpoint of company communication. In this analysis, we took into account both external and internal communication, as the service development process in companies uses both: external communication with the customers and internal communication between the service producers within the company. After the social media analysis, we presented a process model of service development in which our focus was on customer involvement in the process. We used this service design process model for associating the possibilities of social media with the different stages of service design. As a result, we presented a suggestion of guidelines for the utilization of social media in service design. The presented guidelines can be used by companies as a basis when they start to develop their service design processes towards the efficient use of the active customer involvement that the new possibilities opened by the current trends of social media on the Internet can offer.
Acknowledgement. This study was carried out in the Quicksteps research project, funded by the Finnish Funding Agency for Technology and Innovation (Tekes) and a consortium of companies.
References 1. Kaplan, A.M., Haenlein, M.: Users of the World, Unite! The Challenges and Opportunities of Social Media. Business Horizons 53, 59–68 (2010) 2. Boyd, D.M., Ellison, N.B.: Social network sites: Definition, History, and Scholarship. Journal of Computer-Mediated Communication 13(1), 210–230 (2007) 3. Chai, K., Potdar, V., Dillon, T.: Content Quality Assessment Related Frameworks for Social Media. In: Gervasi, O., Taniar, D., Murgante, B., Laganà, A., Mun, Y., Gavrilova, M.L. (eds.) ICCSA 2009. LNCS, vol. 5593, pp. 791–805. Springer, Heidelberg (2009) 4. Ferguson, R.: Word of mouth and viral marketing: Taking the temperature of the hottest trends in marketing. Journal of Consumer Marketing 25(3), 179–182 (2008) 5. Jansson, H.: Social Media Helps LEGO Connect With Users (2009), http://www.ericsson.com/ericsson/corpinfo/publications/ telecomreport/archive/2009/social-media/article1.shtml 6. Dell Community, http://en.community.dell.com/ 7. Grensing-Pophal, L.: Social Media Helps Out the Help Desk. EContent 32, 36–41 (2009)
8. Levy, M.: WEB 2.0 Implications on Knowledge Management. Journal of Knowledge Management 13(1), 120–134 (2009) 9. Barker, P.: How Social Media Is Transforming Employee Communications at Sun Microsystems. Global Business and Organizational Excellence 27(4), 6–14 (2008) 10. Figueiredo, F., Belém, F., Pinto, H., Almeida, J., Gonçalves, M., Fernandes, D., Moura, E., Cristo, M.: Evidence of quality of textual features on the web 2.0. In: Proceedings of International Conference on Information and Knowledge Management, pp. 909–918 (2009) 11. Mangold, W.G., Faulds, D.J.: Social media: The new hybrid element of the promotion mix. Business Horizons 52, 357–365 (2009) 12. Agichtein, E., Castillo, C., Donato, D., Gionis, A., Mishne, G.: Finding High-Quality Content in Social Media. In: Proceedings of First ACM Conference on Web Search and Data Mining, pp. 183–193 (2008) 13. Edvardsson, B., Gustafsson, A., Kristensson, P., Magnusson, P., Matthing, J.: Introduction. In: Involving Customers to New Service Development, pp. 1–13. Imperial College Press, London (2006) 14. Sanden, B., Gustafsson, A., Witell, L.: The Role of the Customer in the Development Process. In: Involving Customers to New Service Development, pp. 33–56. Imperial College Press, London (2006) 15. Von Hippel, E., Katz, R.: Shifting Innovations to Users via Toolkits. Management Science 48(7), 821–833 (2002) 16. Matthing, J., Sanden, B., Edvardsson, B.: New Service Development: Learning from and with Customers. International Journal of Service Industry Management 15(5), 479–498 (2004) 17. Wikström, S.: The Customer as Co-Producer. European Journal of Marketing 30(4), 6–19 (1995) 18. Alam, I.: An Exploratory Investigation of User Involvement in New Service Development. Journal of the Academy of Marketing Science 30(3), 250–261 (2002) 19. Ulwick, A.W.: Turn Customer Input into Innovation. Harvard Business Review 80(1), 91–97 (2002) 20. Magnusson, P.R., Matthing, J., Kristensson, P.: Managing User Involvement in Service Innovation: Experiments with Innovating End Users. Journal of Service Research 6, 111–124 (2003)
A Problem-Centered Collaborative Tutoring System for Teachers Lifelong Learning: Knowledge Sharing to Solve Practical Professional Problems
Thierry Condamines
MIS Laboratory, University of Picardie Jules Verne, Amiens, France
[email protected]
Abstract. Our work aims at developing a Web platform that allows teachers to share know-how and practices and to capitalize on them for lifelong learning. In this context, we propose a problem-centered tool based on the IBIS method, initially developed for capturing design rationale. This tool aims at helping teachers solve practical professional problems through a collaborative co-construction of solutions. A general model is given, using among other things the three elements of the IBIS method (problem, position/solution, argument) and allowing multiple solutions for one problem. The problem description and solution construction steps are explained, and the way teachers are connected based on their profiles is detailed. We also take into account a system of opinions and feedback about the usefulness of a solution. A use case is given to illustrate how the model is implemented.

Keywords: Knowledge Capitalization; Knowledge Sharing; Problem Solving; Teachers Lifelong Learning; IBIS method; Web 2.0.
1 Introduction

Lifelong learning, especially for new collaborators arriving in an organization, is often difficult to manage. While learning theoretical knowledge now seems to be facilitated by the volume of resources available on the Net (thanks to initiatives like Ariadne, Globe, Lornet, the Open Courseware Consortium, …), the situation is not so simple for know-how resulting from experience. This paper deals with the important topic of helping novice teachers to cope with the tasks and activities of teaching settings. We will refer to elementary school teachers (pupils from 3 to 11 years), for whom the needs seem to be the most important ([1] and [2]). This knowledge worker [3] uses very varied knowledge (the fields he teaches, didactics, pedagogy, child psychology, …) and must cope with a double training: that of his pupils and that of his job [4]. A good example of the difficulties encountered by novice teachers is, without any doubt, class management.
2 From the Needs to the TeTraKap Project

Here are two representative examples of the help needed by novice teachers:

─ "Next school year I will have a double-level class. I have never had this configuration before. How can I organize my teaching (schedule, group work, differentiated instruction, …)?"
─ "I have a child who seems to have no auditory memory. He can't remember the alphabet, recite the first numbers list, […] but he can […]. What can I do for him?"

The first teacher asks for global help on how to work in a given context (class level), while the second teacher needs help to solve a particular problem. This is what we mainly find on the huge number of forums created by teachers on the net: they reflect the need to be connected with other teachers having a similar context and to share practices.

In this context, the TeTraKap project aims at improving know-how sharing between teachers along the two main needs quoted above. To do that, two modules have been developed:

─ A knowledge capitalization module where each teacher describes his practices in a computer-supported personal memory. It refers to his general practice in his own class: how he manages group work, differentiates the work, manages discipline, … This memory can be annotated by other teachers, who give ratings to provide feedback about the usefulness of a given practice.
─ A problem-centered module to help teachers solve a problem too particular to be solved from the practices described in the above module.

These two modules rest on a teacher profile that helps users search for knowledge adapted to their concerns, but also connects teachers able to co-construct a solution to a given problem. This paper discusses the second module and the teacher profile. Our work is based on an iterative process driven by the teachers' needs through interviews and experiments; the prototype of the platform evolves according to new emerging needs.
3 Principles of the Problem-Centered Module

3.1 General Model

In cognitive psychology, a problem can be seen as a task to be carried out (we know the goal to reach) under well-defined conditions (the context is known) and for which we do not know a solution or a systematic method of resolution. For a teacher, what matters is the way in which he can manage his teaching so as to be as effective as possible. This is named Instructional Design [5] and gives rise to ill-defined (or ill-structured) problems [6], in the sense that various solutions can be acceptable depending on the point of view. This can be observed in the forums quoted above: a teacher A gives a solution to a given problem and a teacher B objects to this solution, giving arguments and another solution.
This is not far from the IBIS method [7] used for capturing design rationale, which uses three elements (issue, position and argument) and relations between these elements: a position responds to an issue, an argument supports or objects to a position, … We follow this approach in our global model (Fig. 1).
Fig. 1. Global model
In this model, a problem occurs during a task: for example, a problem of discipline during the implementation of a teaching sequence, more precisely during a collaborative working phase. It occurs in the working context of the teacher: class level(s), number of children, special characteristics (handicapped children, …), school location (urban, rural, …), etc. A problem can have several solutions given by several teachers. An argument is given by the author to support a solution, and teachers can give (positive or negative) opinions about a solution given by someone else. A teacher can use a solution or construct his own solution from several solutions proposed for the problem by different teachers. After applying a solution, he gives feedback which evaluates its usefulness. Following this feedback, he can propose an improvement or construct/choose a new solution (in the case of too negative a feedback).

3.2 Problem and Solution Description

When a teacher describes a problem, he writes a problem card giving several characteristics:

─ The type of the problem. For example: "problem of discipline".
─ The task during which the problem was encountered. For example: "Teaching sequence implementation – Collaborative working phase".
─ The context in which the problem was encountered. It is different from the working context of the teacher mentioned above. For example: "It was on Friday afternoon. After a one-hour sequence on painting, pupils were working in mathematics on problem solving."
─ The description of the problem itself. For example, in the above context: "The children were very excited, speaking loudly, moving from one group to another. After one hour during which I was often shouting, I stopped the work and asked for the solutions found. But there were no interesting results. I have to do it again later!"
─ Questions that specify the help needed. For example: "If I do it again, what can I do to improve my pupils' concentration?"
─ Keywords.

When other teachers want to propose a solution, they can ask the problem author for more details. This can be done through a discussion between all the teachers (as on electronic forums), but in this way it is long and difficult for a third teacher to read the whole exchange and get a good idea of the problem and its solution. Moreover, we have observed on electronic forums that collective discussions between teachers are influenced by the fear of being judged (important in this community): a teacher often gives a "generic" solution which does not correspond exactly to what he could give in a one-to-one discussion (and which is less useful). So we chose another way, using a problem card and a solution card (Fig. 2) and separate asynchronous discussions between the problem author and the solution authors.
Fig. 2. Request for more details in problem or solution description
Here, we’ve been inspired by the REX method [8] used for knowledge capitalization which uses knowledge cards (experience cards) containing elements such as context, description, opinion, commentary, recommendation. In the solution card, the recommendations are recommendations to success in the solution practical use, especially for novel teachers (with low experience). After a request for details, the problem or solution card is modified. The solution card is visible only by its author and the problem author until the description is clear for the problem author. When the discussion is closed, the solution card becomes visible to everyone and opinions can be given. If it happens that two solution cards corresponding to the same solution are given, the problem author can choose the best written one (the other one will be deleted).
3.3 Teacher Profile for Connecting People

The teacher profile is composed of:

─ Identification data for accessing the platform
─ Age and sex
─ Studies and diplomas
─ The working context: characteristics of the current class (level, number of pupils, normal or specialized class, presence of handicapped pupils), characteristics of the school (location in a city center, suburbs or rural area, special locations like "violence areas", number of classes), and characteristics of the position (full-time, part-time, …)
─ Past experiences: the different working contexts since the first teaching year
─ Centers of interest: tasks about which the teacher wants to be kept informed in priority, for example "constructing a teaching sequence" or "managing a collaborative working sequence"
─ Statistics about the teacher's activity on the platform: number of connections, last connection, number of problems/solutions published, number of opinions/feedbacks written, …

This profile is very important because it gives a better understanding of the problems. For example:

─ A noise problem in a class is not the same when the number of pupils is 12 as when it is 30.
─ An 8-year-old pupil who does not know how to read is not the same problem as a 7-year-old pupil.
─ Problems of heterogeneity management are different for a one-level class and for a two-level class.
─ Problems in a rural location are not the same as in a big-city location.
─ For authority problems, with a full-time position it is easier to assert yourself than with a part-time position.
─ When most of your studies were in literary subjects, you may have more problems teaching mathematics than teaching language.

The profile is also important for connecting teachers. Knowledge broadcasting is a well-known problem, and [9] gives different ways of collecting and broadcasting knowledge. When a problem is put into the system, its author generally needs help quite quickly. To avoid passivity, we chose an active broadcast (push mode): when a problem card is written, it is published on the front page of the teachers who are likely to give appropriate help, that is, people who have sufficient experience in the same or a similar working context. They can give a solution or annotate a solution given by someone else. Conversely, when solution cards are given, the problem and its solutions are published on the front page of the teachers who have listed the task where the problem occurs as a center of interest.

On his front page, the user can quickly take a look at the new items of interest since his last connection (Fig. 3). He can of course follow any help given to the problems he has published ("My problems" item), with a star giving the number of new solutions (or modifications of solutions under construction) and a speech balloon reflecting the number of opinions/feedbacks about the solutions to his problems. The "My solutions"
item is about the solutions he has proposed to other teachers' problems. It is divided into two sub-items: "Solutions under construction", with a star giving the number of new messages received from problem authors (asking for more details in the solution description), and "Solutions already given", for closed solution cards, with a bullet giving the number of opinions/feedbacks on the solutions he has proposed. The "Bookmarked problems" and "Problems of interest" items are about problems that could be useful for him. The second corresponds to problems in the centers of interest checked in his profile (the number of new problems is displayed), while the first is about problems he has selected and whose solving evolution he wants to follow: new solutions (their number shown by the star) or new opinions/feedbacks (shown by the speech balloon). Finally, an item gives access to a view of the problem base corresponding to problems in the same working context as the user's. The number of new unsolved problems is given to urge him to have a look at them and perhaps give a solution.
Fig. 3. Main navigation items on the user’s front page
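The active broadcast described above can be pictured with a short sketch. Everything below — the profile fields, the three-year experience threshold and the notion of a “similar” working context — is an illustrative assumption rather than the platform's actual implementation: a new problem card is pushed to experienced peers in a comparable context, while, once solutions exist, the problem also reaches teachers who listed the task as a center of interest.

```python
from dataclasses import dataclass, field

@dataclass
class TeacherProfile:
    # Hypothetical profile fields, loosely based on the profile described above
    teacher_id: str
    class_level: str               # e.g. a one-level or two-level class descriptor
    location: str                  # "city center", "suburb", "rural", ...
    experience_years: int
    centers_of_interest: set = field(default_factory=set)

@dataclass
class ProblemCard:
    author: TeacherProfile
    task: str                      # e.g. "constructing a teaching sequence"
    description: str

def similar_context(a: TeacherProfile, b: TeacherProfile) -> bool:
    # Crude stand-in for "in or near the same working context"
    return a.class_level == b.class_level and a.location == b.location

def problem_recipients(card: ProblemCard, teachers):
    """Push a new problem card to experienced peers in a similar context."""
    return [t for t in teachers
            if t.teacher_id != card.author.teacher_id
            and similar_context(t, card.author)
            and t.experience_years >= 3]

def solution_recipients(card: ProblemCard, teachers):
    """Once solutions are attached, also notify teachers whose centers of
    interest include the task where the problem occurred."""
    return [t for t in teachers if card.task in t.centers_of_interest]
```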
3.4 Solution Appropriation and Feedback When a teacher publishes a problem, he can receive one or more solutions. He then has two choices: either he picks one of the solutions and decides to use it, or he constructs his own solution by combining two or more of them. In the second case he has to write a new solution card pointing to the solutions that inspired it. In both cases, he has to write a feedback describing the positive and negative effects observed while using the solution. Usefulness and ease-of-use marks can also be given. He can also suggest an improvement based on what has been observed; here we call “improvement” a small modification of the initial solution. If the feedback is very negative, he will have to use or construct a new solution. When a list of solutions is returned for a problem, we give for each one a graphic indication of its author's experience in the same working context. This does not guarantee the quality of the solution but gives an indication of the degree of expertise of its author.
3.5 Knowledge Indexing In Web 2.0, indexing often rests on tag clouds, which give a collective categorization of concepts (folksonomies). This practice suits applications where users navigate without any precise intention, but the lack of structure of such a method [10] led us to a more structured approach. A teacher must be able to find a solution to his problem very quickly (if it exists in the knowledge base) or to be connected just as quickly with other teachers able to help him construct one (if no solution is found). Here, a problem is mainly characterized by the working context, the task where it occurs, the problem type and keywords. As seen in the teacher profile, the working context is highly structured (class, school and other categories are well defined). For the tasks, however, a precise and structured classification of all the teacher's tasks is difficult and costly because of the pedagogical freedom of each teacher. So we developed, through an iterative process based on interviews with teachers, a general classification (two to three levels) of high-level tasks and allow users to create new subcategories for more precision. For the problem type, categories and subcategories are, in this first version, created directly by the users. When a teacher searches for a problem, he can select a task in the task taxonomy and then choose a problem type from the returned list (problem types already entered by users). He can also proceed the other way around (starting from the problem type and ending with the task). By default, the problems returned correspond to his working context (which is part of his profile), but he can expand the request to another context. For example, looking for a discipline problem in a collaborative working phase may be less dependent on the class level than choosing the length of a teaching sequence (which can vary from a few minutes for 3-year-old pupils to an hour or more for 10-year-old pupils). The request can also be refined by using keywords.
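A small sketch of this search flow follows; the field names, the example values and the encoding of the working context are assumptions made for illustration, not the system's actual data model.

```python
def problem_types_for_task(problems, task):
    """Problem types already entered by users for the selected task."""
    return sorted({p["type"] for p in problems if p["task"] == task})

def search(problems, task, problem_type, context=None, keywords=None):
    """Return problems of the chosen task and type, by default restricted to
    the searcher's working context, optionally widened or refined by keywords."""
    hits = [p for p in problems
            if p["task"] == task and p["type"] == problem_type]
    if context is not None:      # e.g. {"class_level": "CP", "location": "rural"}
        hits = [p for p in hits
                if all(p.get(k) == v for k, v in context.items())]
    if keywords:
        hits = [p for p in hits if keywords & set(p.get("keywords", ()))]
    return hits

# A teacher first picks a task, sees the problem types already recorded for it,
# and then queries within (or beyond) his own working context.
problems = [
    {"task": "managing a collaborative working sequence", "type": "noise",
     "class_level": "CP", "location": "rural", "keywords": {"group work"}},
]
print(problem_types_for_task(problems, "managing a collaborative working sequence"))
print(search(problems, "managing a collaborative working sequence", "noise",
             context={"class_level": "CP", "location": "rural"}))
```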
4 Use Case To illustrate how the system works, we now give a use case from the problem description to the returned solution. Let us consider a teacher T1 who has a problem and wants to find solution ideas to solve it. The first step is to check in the problem base whether similar problems have already been solved. So he chooses in the task taxonomy the task during which the problem occurred. A list of problem types for this task is returned. If he chooses one problem type, the problems of this type (in his working context) are returned. 4.1 If No Similar Problem Exists… It is a new problem. To describe it precisely, T1 writes a problem card on the model seen in part I.B. It then becomes visible, especially on the front page of users having the same working context (Fig. 3). He can write his own solution card, which is published with the problem, and wait for opinions on it before using it. If a teacher T2 decides to give a solution, he writes a solution card and an asynchronous discussion can take place between T1 and T2 to make the problem or solution description more precise. During the discussion the two teachers have to modify their cards accordingly. When the discussion is closed, T1 can try to apply the solution and then
give a feedback and perhaps an improvement. He can also decide that the solution is not completely suitable, give an opinion on it, and wait for other solutions. A teacher T3, reading the problem, can give an opinion on T2's solution. He can also give his own solution through the same process as T2. When two or more solutions are given, T1 can choose to construct his own solution by combining some of them. He writes a solution card pointing towards the solutions that inspired it. If he tests it, he can again give a feedback and possible improvements. 4.2 If a Similar Problem Already Exists… If the similarity is strong, T1 can test one of its solutions and give a feedback and possible improvements. He can also propose his own solution and wait for opinions. If the similarity is not sufficient, he can write a problem card (new problem) and enter the same process as in Section 4.1. He can also attach a copy of the solutions of similar problems that may be useful with some modifications. He can then discuss with the solutions' authors to adapt them to his own problem. Of course, new solutions can also be proposed by other teachers.
5 Discussion and Future Work Lifelong learning requires technology to be used effectively to support learners in solving problems, allowing them to construct the right solution at the right time. But this cannot be done alone, and it is important to facilitate collaboration among lifelong learners. In this paper, we have presented a problem-centered tool allowing collaboration between teachers in the search for solutions. Our aim was to help teachers construct their own know-how from existing know-how (through the solutions proposed by other teachers) and to progressively build a knowledge base composed of problems and their solutions, but above all of opinions and feedbacks on them. This approach is not far from case-based reasoning (CBR). It was our first idea, as stated in [11], but perhaps too ambitious. In a CBR system, a case represents a problem-solving episode and is divided into two parts: one for the problem and another for the solution. According to Aamodt [12], a case follows a cycle (Fig. 4) during which, for a target case, a source case (close to the target case) is retrieved from the case base. The solution of the source case is then “adapted” to answer the target problem before being returned to the user. In our situation, the solution is frequently described in natural language, which complicates the adaptation phase. Moreover, choosing an adaptation of a case supposed to answer the problem amounts to saying that one solution suits any user confronted with the same problem, which is false in our case. When a teacher receives several solutions to his problem, he chooses the one that corresponds to his personality but also to his fine-grained knowledge of the context (the psychology of his pupils, …), which cannot be exactly described in the system. Hence we have left the choice and adaptation phases to the teacher, storing all the solutions proposed for his problem.
Fig. 4. The CBR Cycle
For problem indexing, we use a teacher task classification, but with only a two- to three-level tree. Users can add new task types, but the challenge is to manage these additions: duplicates, task types not located at the right level of the classification, and so on. So we use these additions to regularly update our classification after discussions with experienced teachers, and the process seems to be stabilizing. We have the same challenge for the problem types, but the problem space seems to be quite large. We are working on a classification of high-level types, crossing theoretical studies, forum analysis and data recorded on our platform. Another problem we faced concerns card editing (problem and solution cards). The cards are often very imprecise in their first version and a discussion between the problem and solution authors is needed. But these authors must be aware that while the cards are kept for helping other teachers, the messages exchanged are not saved. We have to stress this point better in the interface design and consider a learning stage. Once the cards are well detailed, however, they are found very useful and easier to understand than a discussion thread. The problem is similar for the profile, whose importance is not obvious to users. It is necessary to explain the link between the profile and problem indexing and to show the benefit of entering a precise profile for a better search. In addition, if the user does not complete his profile he can only access the problems, not their solutions. Our future work will also aim at analyzing the users' activity on the platform to refine this profile. At this stage, we connect people with a similar working context, but two teachers can have very different practices in the same working context. For some problems, the teacher may need help very quickly, so it is important to connect him in priority with teachers whose practices are “compatible”. While it is difficult to put this parameter in the profile a priori, it can be learned by analyzing the activity on the platform, for example by identifying communities with similar practices: we can learn from the chosen solutions and from the feedbacks and opinions.
References 1. Condamines, T.: How to favor know-how transfer from experienced teachers to novices? In: A hard challenge for the knowledge society, IFIP International Federation for Information Processing. Learning to Live in the Knowledge Society, vol. 281, pp. 179–182. Springer, Boston (2008) 2. Condamines, T.: How can knowledge capitalization techniques help young teachers’ professional insertion? In: A new approach of teachers’ long-life training, 6th International Conference on Human System Learning (ICHSL.6), Toulouse, France (2008)
3. Drucker, P.: The edge of social transformation. The Atlantic Monthly 274, 53–80 (1994) 4. Saujat, F.: Spécificité de l’activité d’enseignants débutants et genre de l’activité professorale. Polifonia 8, 67–93 (2004) 5. Schott, F.: Instructional Design. In: Baltes, P.B., Smelser, N.J. (eds.) Encyclopedia of the Social and Behavioral Science, pp. 7566–7569. Elsevier, Oxford (2001) 6. Jonassen, D.H.: Instructional Design models for well-structured and ill-structured problem solving learning outcomes. Educational Technology Research and Development 45(1), 65– 94 (1997) 7. Conklin, E.J., Yakemovic, K.B.: A Process-Oriented Approach to Design Rationale. Human-Computer Interaction 6 (1991) 8. Malvache, P., Prieur, P.: Mastering Corporate Experience with the REX Method. In: Proceedings of the International Symposium on the Management of Industrial and Corporate Knowledge (ISMICK’93), Compiegne, France, pp. 33–41 (1993) 9. Van Heijst, G., Van der Spek, R., Kruizinga, E.: Organizing Corporate Memories. In: Proceedings of the 10th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop (KAW’96), Banff, Canada, pp. 42-1–42-17 (1996) 10. Guy, M., Tonkin, E.: Folksonomies: Tidying up Tags? D-Lib Magazine 12(1), http://www.dlib.org/dlib/january06/guy/01guy.html 11. Quénu-Joiron, C., Condamines, T.: Facilitate on-line Teacher Know-How Transfer Using Knowledge Capitalization and Case Based Reasoning. In: Cress, U., Dimitrova, V., Specht, M. (eds.) EC-TEL 2009. LNCS, vol. 5794, pp. 273–282. Springer, Heidelberg (2009) 12. Aamodt, A., Plaza, E.: Case-Based Reasoning: Functional Issues, Methodological Variations and System Approach. AI Communications 7(1), 39–59 (1994)
Bridging the Gap between Web 2.0 Technologies and Social Computing Principles Giorgos Kormaris and Marco Spruit Department of Information & Computer Sciences, Utrecht University {gkormari,m.r.spruit}@cs.uu.nl
Abstract. This research briefly reviews the different definitions of Web 2.0 and presents the most important Web 2.0 Technologies that underlie the evolution of the Web. We map these Web 2.0 technologies to the Social Computing Principles and describe the different relations and patterns that occur. We argue that creating insight into the relations between Web 2.0 Technologies and Principles will help enable the creation of more successful services and accommodate a better understanding of Web 2.0 and its social aspects. Keywords: Web 2.0 technologies, Web 2.0 applications, Social Computing Principles.
1 Introduction: The Theory-Technology Gap in Web 2.0 Web 2.0 is a term that is nowadays broadly used, and people perceive it in many different ways. There are various definitions and opinions about the true meaning of Web 2.0, what it includes and whether it has been given a proper title. Despite all this discussion and research, there is still a gap between the theoretical view of the new type of services that have risen along with Web 2.0 and the actual technologies or applications which are used to build, implement and offer these services. This research attempts to bridge the gap between theory and practice in the field of Web 2.0, also referred to as Social Computing, and the internet services provided by it. In the present paper we approach this gap from a theoretical point of view and examine the relations between the different theories, principles and technologies involved. We shall not present a practical implementation that connects the two sides; instead, our research focuses on creating a better understanding of the underlying principles (theory) and technologies (practice) behind Web 2.0 and Social Computing. Firstly, in Section 2: Related Literature on Web 2.0 we briefly present the existing literature regarding Web 2.0 and the different definitions that have been conceived by scholars or technology experts, and present in more detail the technologies that recently emerged and enabled this change, describing the ways in which each technology contributes to the new social view of the web.
Secondly, in Section 3: Social Computing Principles we briefly present the Web 2.0 Principles as defined in [1]. Every principle is explained and we also show a way of categorizing these principles according to their orientation. Next, in Section 4: Mapping Web 2.0 Technologies and Principles we create a map of these principles and their corresponding technologies, explain the different patterns that can be identified and interpret the connections between principles and technologies. Having presented the map of technologies and principles, in Section 5: Conclusions we conclude our research and present our results. We also recall the motivation that prompted this research, namely the vague nature of Web 2.0; providing a more solid structure will make it easier to define the architecture of a service. Finally, in Section 6: Discussion we point out opportunities for further developments, indicate how the present research could evolve in the future and mention some interesting open areas for research.
2 Related Literature on Web 2.0 In this section we present the literature that has served as the basis of our research and that led us towards combining the theories and technologies of Web 2.0, in order to establish a new relationship between the technologies and the principles of Web 2.0, also called Social Computing. 2.1 Web 2.0: How Can It Be Defined? Ever since the term ‘Web 2.0’ was officially coined in 2004 by Dale Dougherty, a vice-president of O’Reilly Media Inc., during a team discussion on a potential future conference about the Web, everyone has been caught up in the Web 2.0 hype [2]. O’Reilly first described the new trend for the web with the phrase “The Web as a platform”, by which he means that there are no hard boundaries, but rather a set of principles which outline the area of Web 2.0. If a certain company claims to be “Web 2.0 oriented”, then it has to follow these principles, which are translated into core competencies for Web 2.0 companies. Since O’Reilly was the first to talk about a new kind of World Wide Web, a large debate broke out; some tried to define Web 2.0 and others denied there is actually such a major difference after the big web crash of 1999–2000 [3]. Another analyst who approached the subject is Andrew McAfee, who agreed with the overall definition of having a general boundary around what is called Web 2.0 and characteristically says: “…most current platforms, such as knowledge management systems, information portals, intranets and workflow applications, are highly structured from the start, and users have little opportunity to influence this structure.” [4]. As this quote suggests, it is preferable to create a set of ground rules that serve as a basis for new applications and tools, which will promote the exchange and emergence of new knowledge.
Yet another trend analyst described the same general concept as McAfee and Sir Tim Berners-Lee [5]: Hinchcliffe, who proposed that this platform should be open and that it should not only be about the Internet. It should include all connected devices, such as mobile telephones and smartphones, as well as “rich and interactive user interfaces” [6]. 2.2 Web 2.0 Technologies We saw that Web 2.0 has many different definitions, by many types of experts, but what it all comes down to is creating a unified platform that can be used to connect everyone and all the related devices, fast and safely. In addition to speed and safety, it is equally important to have a friendly, easy-to-use interface, combining usability and functionality in one. We have seen many different services emerge during the 2000s, such as Flickr, Facebook, Twitter, Gmail and many others, that have tried to combine all these different elements. All these services became a reality thanks to new technologies and standards “under the hood”, which enable web developers to create innovative applications that can be used to distribute, share and create information in new ways. In this section, we discuss the following technologies in detail and mention the reasons why they played such an integral role in the evolution of the World Wide Web. We categorize them according to the OSI 7-layer model: AJAX, SOAP (Simple Object Access Protocol) and REST (Representational State Transfer), Adobe Flash, Flex and AIR, Open APIs and Mashups, RSS Feeds, Microformats and Semantics.
AJAX. The AJAX development model is probably the one that revolutionized the way web applications and services are delivered; the term is accredited to Jesse James Garrett and stands for Asynchronous JavaScript + XML [7]. Ajax is not a new web-based programming language but a group of technologies combined together, having as a base the Ajax engine, creating a new experience for users and their interaction with web applications. The technologies it incorporates are already mature, stable and popular web-based programming and scripting languages [7], [8] and include (X)HTML, CSS, XML, XSLT, JSON, DOM, XMLHttpRequest, JavaScript, VBScript, Adobe’s Flash, Flex and AIR and Microsoft’s Silverlight.
• HTML - XHTML & CSS: HyperText Markup Language, its extended version XHTML and Cascading Style Sheets are used to change the format and display of the data of web pages, all thoroughly tested, popular and standardized by the World Wide Web Consortium (W3C).
• XML & XSLT – JSON: EXtensible Markup Language is a self-descriptive language that was designed to carry data and gives developers the ability to define their own tags. XSLT is a way to transform XML documents that essentially changes the way data are displayed.
• JavaScript Object Notation (JSON) is an alternative to XML and “it is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language”, Ecma International, [9].
• DOM: The Document Object Model is a cross-platform and language-independent way of dynamically controlling objects and describing the data of HTML, XHTML and XML documents.
• XMLHttpRequest: First implemented by Microsoft as an ActiveX object but now also available as a native object within both Mozilla and Apple's Safari browser, it enables JavaScript to make HTTP requests to a remote server without the need to reload the page. In essence, HTTP requests can be made and responses received completely in the background and without the user experiencing any visual interruptions [10].
• JavaScript – VBScript: JavaScript was created by Brendan Eich of Netscape and is an open, cross-platform object scripting language for the creation and customization of applications on enterprise networks and the Internet [11]. It is mainly used to bring all the aforementioned technologies together. VBScript is a scripting language quite similar to JavaScript; it is an Active Scripting language developed by Microsoft, which uses the Component Object Model to access elements of the environment within which it is running [12].
Where Ajax differs completely from the classic web application model is its asynchronous way of client-server communication, putting a stop to the “start-stop-start-stop nature of interaction” [7]. The Ajax engine is responsible for the communication and data exchange with the server, which is hidden from the user, but also renders the interface that users interact with. This kind of architecture is widespread and used in a variety of applications, from a simple small website to complex services such as Google Maps. Finally, it has also become quite popular in combination with the Ruby on Rails web development framework, in an effort to apply Agile Development methodology to web-based projects. SOAP & REST Architectures. Having discussed the technologies used to implement the applications and services that have become known as Web 2.0, we also have to mention the architectures that are used to plan the development of these services. The choice is always in the hands of the developer, but as quoted by Anderson in [8], Sean McGrath describes the Web as an enormous information space, littered with nouns (that can be located with URIs) and a small number of verbs (GET, POST, etc.). Whereas SOAP is more of a verb-noun system, he argues that SOAP allows the creation of too many (irregular) verbs. We will not go into much detail, since this topic is rather deep and the subject of different research, but the dispute between these two schools is worth mentioning. SOAP is the traditional, standards-based approach; it was developed by Microsoft in 1998 and has since become the most popular standard for exchanging XML-based messages between applications [8]. It became so popular because it was the first architecture that enabled the usage of new technologies like AJAX and, being introduced by a major corporation such as Microsoft, it was bound to become a success. From a technical point of view, SOAP simply is “A protocol ‘framework’, to deliver the necessary interoperability between message-based middleware tools across the entire industry” [13].
Fig. 1. The building blocks of the two different approaches on Web service architectures [13]
REST was developed by Roy Fielding and is the conceptually simpler “trendy new kid on the block”; it provides a simple communications interface using XML and HTTP, with simple commands such as POST, GET and PUT [14]. It relies on simplicity, ubiquity and scalability, since it can support small, simple services as well as complex services offered by large providers such as Amazon and Google [13]. Adobe Flash - Adobe Flex - AIR & Microsoft Silverlight. Adobe Flash is a multimedia platform which is used to create Rich Internet Applications (RIAs), giving developers the ability to add animation, interactive graphics and other options to webpages without considerably slowing down their loading. Adobe Flex is a package for developing such applications and offers a separate IDE for developers to create their RIAs based on the Flash platform. AIR is another Adobe development, a cross-operating-system runtime that enables developers to use their existing HTML/Ajax, Flex or Flash web development skills and tools to build and deploy rich Internet applications on the desktop. Silverlight was developed by Microsoft and officially released in 2007 as an alternative way to create multimedia applications for the web. It is “a cross-browser, cross-platform and cross-device browser plug-in that helps companies design, develop and deliver applications and experiences on the Web” [15]. Open APIs & Mashups. According to [6], an Application Programming Interface (API) provides a mechanism for programmers to make use of the functionality of a set of modules without having access to the source code. The addition of Open is used when these APIs are free and open for all programmers to use and to take advantage of certain features in their own projects. We have seen big social networks and major web service providers giving out APIs for developers, most notably Facebook with the Facebook Apps API, Flickr, and Google with various APIs, the Google Maps API being rather popular. This is a growing trend that has become quite popular during the Web 2.0 era, augmented by the growing number of mashups we have encountered during the past years. Simply put, a mashup “…is a customizable application that takes seemingly disparate data sets - both static and real-time - and integrates them to create a new data set” [16].
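As a rough illustration of how a mashup draws on such open, REST-style APIs, the following sketch fetches JSON from two invented endpoints and merges the results into a new data set; the URLs, query parameters and field names are placeholders, not real services.

```python
import json
from urllib.request import urlopen

# Hypothetical endpoints standing in for two open, REST-style JSON APIs;
# real services (maps, photo sharing, ...) would differ in URL and schema.
PLACES_API = "https://api.example.org/places?city={city}"
PHOTOS_API = "https://api.example.net/photos?lat={lat}&lon={lon}"

def get_json(url):
    """One REST interaction: a plain HTTP GET returning a JSON document."""
    with urlopen(url) as response:
        return json.load(response)

def mashup(city):
    """Integrate the two data sets into a new one: places annotated with photos."""
    places = get_json(PLACES_API.format(city=city))
    combined = []
    for place in places:
        photos = get_json(PHOTOS_API.format(lat=place["lat"], lon=place["lon"]))
        combined.append({"name": place["name"], "photos": photos})
    return combined
```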
To do this easily and successfully, the use of Open APIs has become a must amongst developers, and this has led to the creation of alliances such as the “Open MashUp” alliance (www.openmashup.org), supported by major companies such as HP, Intel and Cap Gemini, and the –under development– project led by Google, called “Open Social” (www.opensocial.org), in cooperation with social service providers such as MySpace, Hi5, XING and others. RSS Feeds, Microformats & Semantics. Since the social aspect of Web 2.0 is rather crucial, we have seen many technologies that give users the opportunity to have personalized information right on their desktop or mobile phone (RSS), to personalize their accounts at websites (Microformats) and even to get information selected specifically for them, based on their interests or previous browsing (Semantics). According to the RSS 2.0 specification, “RSS is a Web content syndication format”, its name is an acronym for Really Simple Syndication, and “RSS is a dialect of XML and all RSS files must conform to the XML 1.0 specification, as published on the World Wide Web Consortium (W3C) website”. As mentioned, what RSS does is give users the opportunity to receive information they have selected as interesting and to be instantly notified about new updates on their favorite subjects from their preferred blogs or websites. Microformats could be placed one step above RSS, since they provide developers with a way of adding simple markup to human-readable data items, such as events, contact details or locations, on web pages, so that the information in them can be extracted by software and indexed, searched for, saved, cross-referenced or combined. Semantics have been and still are a topic of broad discussion, since many analysts and scientists speak of another web, called the Semantic Web, where semantics play the key role. Since Web 2.0 has become the main point of reference, however, another trend has emerged; it is broadly discussed in scientific research [17], [18] that what seems more logical is having a unified web, where the social aspect of the web, commonly named Web 2.0, is one and the same with the Semantic Web, which is used to augment the personalized aspect of the web and provide valuable data to developers and/or businesses. Although some have second thoughts about whether such an attempt would be successful, saying that “…I've adopted a cautious outlook toward the prospects of a marriage between Semantic Web technology and Web 2.0” [17], others seem to be more optimistic and state that “…there is growing realization that the two ideas complement each other and that in fact both communities need elements from the other's technologies to overcome their own limitations” [18]. An example of how semantics can be used is that of Collaborative Tagging [19], [20], which has become an integral part of the social aspect of the web. Simply put, it enables users to tag photos, videos and every other web-related object they come across. Using this technology augments sharing between users, makes content categorization easier and reduces the amount of effort that users have to put into finding what they are really looking for, since others have already added explanatory tags to its description.
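A toy example of the collective side of tagging: individual, freely chosen tags are aggregated into a folksonomy-style tag cloud, and simple co-occurrence counts hint at related concepts. The users, resources and tags below are invented.

```python
from collections import Counter
from itertools import combinations

# Toy folksonomy: each user freely tags a resource (URLs here are placeholders)
taggings = [
    ("alice", "http://example.org/photo1", {"prague", "bridge", "travel"}),
    ("bob",   "http://example.org/photo1", {"prague", "charles-bridge"}),
    ("carol", "http://example.org/video7", {"travel", "tutorial"}),
]

def tag_frequencies(taggings):
    """Aggregate individual tags into the collective tag cloud."""
    counts = Counter()
    for _user, _resource, tags in taggings:
        counts.update(tags)
    return counts

def cooccurring(taggings):
    """Pairs of tags applied together, a simple signal of related concepts."""
    pairs = Counter()
    for _user, _resource, tags in taggings:
        pairs.update(frozenset(p) for p in combinations(sorted(tags), 2))
    return pairs

print(tag_frequencies(taggings).most_common(3))
```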
2.3 Social Computing and Its Principles In the previous section we presented the different technologies that have served as the base of the evolution of the World Wide Web into what is called Web 2.0. Some tend to disagree with the name that has been given to these new services, which have changed the landscape of the web. This new effort to re-define the state of the web has moved towards the term Social Computing, which has become the new center of discussion amongst internet experts, scientists and trend analysts. A thorough investigation conducted in [1] examined all the different definitions that have emerged during the past years and created a new definition which combines the different opinions of experts like O'Reilly and Hinchcliffe: “…Social Computing refers to a development where technologies enable empowerment of individuals, or groups of individuals, to express themselves in a more natural way, leading to easier creation, enriching, and finding of content…” [1].
3 Social Computing Principles In addition to the definition of Social Computing in [1], a set of nine principles that can be identified as the main characteristics of Social Computing has been defined by comparing the most popular definitions of Social Computing.
Fig. 2. The nine (9) principles of Social Computing as described by Knol in [1]
This review and comparison resulted in defining the following nine principles of Social Computing: Open Platform, Lightweight Models, Enabling Services and Intuitive Usability, which are more Technology oriented, and Long Tail, Unbounded
Collaboration, Collective Intelligence, Network Effects and User Generated Content, which are more Socially oriented. In the following subsections we briefly present these principles according to their orientation. 3.1 Technology Oriented
• Open Platform. This principle dictates that Web 2.0 services provide users with the possibility to access the information they desire with their browser, and puts an emphasis on synergy between the different devices and applications that are connected to the Internet. At the same time it does not imply the replacement of desktop computers and classic operating systems, but promotes compatibility and collaboration, towards a more social web.
• Lightweight Models. When talking about lightweight models, we think of flexible and agile ways of developing a product, making it possible to update, change and reuse much faster than with classic development methods. This is common for a web-based service, “…it requires an agile business model, which can handle such a fast update rate…”, and it also helps to reduce the costs of organizational change, a process that is expensive in terms of energy as well as investments.
• Enabling Services. What enabling services define is the differentiation of online services that follow the SaaS model - Software as a Service, a model used extensively by services such as Google's applications, e.g. Gmail and Google Docs. What is considered crucial for these services is to be characterized by flexibility, openness, scalability and re-usability. These characteristics enable the creation of mashups, faster updating, online management of data and lower transaction costs.
• Intuitive Usability. The meaning of this principle is quite clear, but this does not mean that it is simple to implement; usability is key when it comes to interfaces, which are a main element of all web services. Consequently, interfaces must be easy and simple for all types of users, according to the walk-up-and-use mentality, but at the same time offer expert users more options to personalize their side of the service to their liking. For example, Macromedia introduced the term Rich Internet Applications as described in the previous section, but things became even more usable when AJAX was introduced, exploiting the graphic environment offered by JavaScript.
3.2 Socially Oriented
• Long Tail. The basis of this principle is the 80-20 Pareto distribution, meaning that services should aim at both sides of the market: the 20% of customers that generate large profit, as well as the 80% of customers that generate small profit but are by far larger in number. To achieve this, services must attract users of all orientations; Amazon is an example that manages to gain a lot of profit from this group of customers by offering niche products exclusively from its online store.
• Unbounded Collaboration. Social computing is all about collaboration and communication, between users and between services. This can be achieved by
giving developers the ability to create mashups, through the Open Platform principle, for example by using Open APIs (as described in Section 2.2 Web 2.0 Technologies). Additionally, users meet, interact and communicate in online communities. User contributions within these communities add value to these services or to the organizations that have created them.
• Collective Intelligence. When talking about Social Computing and the Unbounded Collaboration of users, we also have to take into consideration the knowledge created by these communities, whether their members work in groups or individually. The main issue for this principle is trust amongst users, since this decentralization of the web suffers from a lack of control. For web services that want to harness the vast amounts of information generated by all these users, it is important to attract as many of them as possible to contribute, since the 1% rule applies in these communities.
• Network Effects. This principle describes the interaction between users and services, a relation that works both ways: users contribute to make services richer in content by sharing their knowledge, and services attract users by offering more options and by adjusting their content to every user's preferences and history. Due to the limited number of users that actually spend time and effort to contribute to an application, attracting users is an ongoing race, where the rule of first-mover-wins does not apply, since we constantly encounter new services that make competition harder.
• User Generated Content. Social Computing contains all of the previously mentioned principles and technologies, but at its core data is the main driver of social computing and the owner of the data is key. Managing all this data is a difficult task, since there are many issues regarding this matter: authorship, privacy and security. How content is used and protected by service providers is a major issue and often influences how popular a service is.
4 Mapping Web 2.0 Technologies and Principles In the previous sections we have presented the technologies that enabled the evolution of the Web into Web 2.0, which in turn focuses on the social aspects of computing and the interaction between users and services. We also presented the principles which define the main borders of Social Computing and how they are separated into two categories, Technology oriented and Socially oriented. In this section we present a map of the principles and their corresponding technologies and explain why we consider these relations to be valid. In the following table, where we have mapped the relationship between Principles and Technologies, the columns hold the Principles, divided into the two orientations that can be identified. The Technologies are placed in the rows of the table, in a sequence from the lowest-level technologies, according to the OSI 7-layer model of the internet layers, to the highest. The shade of gray corresponds to the closeness of the relation between Technologies and Principles: darker shades indicate higher relevance and lighter shades lower relevance.
The positions of the principles correspond to the initial description from [1], but using a gradient perspective: principles that are more Technology oriented are placed on the left and more Socially oriented principles are placed on the right. Table 1. The relation between Web 2.0 Technologies and Principles of Social Computing
Columns (Principles), from Technology oriented to Socially oriented: Open Platform, Lightweight Models, Enabling Services, Intuitive Usability | Long Tail, Unbounded Collaboration, Collective Intelligence, Network Effects, User Generated Content.
Rows (Technologies), from lowest to highest level: AJAX; SOAP & REST; Open APIs & Mashups; Adobe Flash, Flex and AIR, Silverlight; RSS, Microformats, Semantics.
At first glance, we can see the clear formation of two clusters, one at the top left of Table 1 and one at the bottom right. Let us take a deeper look into the relations between Technologies and Principles and try to explain why they occur. Technologies and Technology oriented Principles. As logic dictates, one would assume that the Technology oriented principles would be related to technologies that belong to a lower OSI layer, which are placed in the first rows of the table; indeed, AJAX and SOAP & REST have most of their relations with the Technology oriented Principles. These technologies have started a true revolution in web services, since they are open to all developers, especially AJAX and REST, enabling the creation of lightweight applications that hide unnecessary information from the user. At the same time, they present users with the look and feel of a normal desktop application, thus also contributing to the Intuitive Usability of these applications. Multimedia technologies like Flash and Silverlight do not relate to many Principles, but their importance is not diminished; these technologies give developers the ability to create impressive and interactive applications, thus adding new elements to the
Intuitive Usability of services. This type of technology is widespread and is constantly becoming easier to use, so that everyone can create their own personal content, thus attracting a larger percentage of the Long Tail. Another type of technology that attracts a large number of potential clients, thus augmenting the Long Tail principle, comprises technologies that add a more personalized view of services, such as RSS feeds and Microformats. Additionally, these technologies that enable personalized user interfaces also have a close relation with Intuitive Usability, since they help users access their favorite topics much faster. Technologies and Socially oriented Principles. On the other hand, higher-level technologies, like Flash and RSS, have most of their relations with the Socially oriented Principles. They are the ones that promote and augment the interaction between users and enable them to create new content that adds value to existing services. Multimedia technologies like Flash or Silverlight are used either in complete solutions such as Rich Internet Applications, or by embedding music, movies and other multimedia in web services. Another way of using this technology is quite common nowadays: users are asked to create their own content (User Generated Content) and to share it with others, and in return they receive various rewards. Such rewards might include their videos being included in their favorite music group's video clip (‘Placebo – Running Up That Hill’ music video clip, Prodigy Video Clip Contest, [21]), or even using their videos to add more depth to their university applications, something that Tufts University recently introduced [22], [23]. Examples like these point out the significance of user generated content, which is enhanced by multimedia technologies. In the previous subsection we explained how RSS and other personalization technologies are related to Technology oriented Principles, but there is more to them: personalization means that users have spent time on these applications and have tweaked the interface according to their preferences. What we can derive from this behavior is that these users have become dependent on these services, so the services are valuable to them and this value keeps increasing. This is also one of the main ideas of the Network Effects principle, making this type of technology and this principle closely related. Finally, these personalization options give users the ability to create their own personalized content and share it with their peers, who can comment, add or respond with their own content, thus explaining the relation with the User Generated Content principle. The special case of Openness. The only technology that has many relations with both Technology and Socially oriented Principles is that of Open APIs & Mashups, and the reason is simple: Open APIs promote the principle of the Open Platform, giving developers the opportunity to use many different parts of applications to create Enabling Services. This openness also attracts large numbers of users, who are potential customers, thus giving organizations the ability to exploit the Long Tail. Such a large concentration of users also results in the creation of communities around these new APIs or initiatives for new mashups. The benefits of these communities have already been described in Section 2.3 Social Computing and Its Principles, where all of the Socially oriented principles are mentioned.
5 Conclusions In our research we presented a series of technologies that have enabled the evolution of the World Wide Web and the creation of the innovative services and new applications that have been tagged with the term Web 2.0; since this term is still new and there is confusion surrounding it, we conducted a literature review of the different definitions of Web 2.0. Furthermore, we created a list of the most important technologies behind Web 2.0 and presented all of them, describing their key elements and how they contributed to changing the landscape of web services and applications. Having done this, the social identity of this new trend came to light, and in order to define it we used the typology defined in [1] as our main base. Thus we described the term Social Computing, which is another name for Web 2.0 and the wave of social services that came with it. In order to have a clear view of what Social Computing really means, we presented the principles that govern it and that create the general borders within which it functions. These principles were also divided, according to their orientation, into two categories: Technology oriented and Socially oriented. The main idea behind this paper was triggered by the complicated and abstract character of Web 2.0, and our main goal was to provide a more solid structure for Social Computing. To do so, we created a map of the relations that can be identified between its two main elements: the Principles that govern it and the Technologies that enabled the World Wide Web to evolve into Web 2.0 or Social Computing. Concluding our research, we explained the relations that were identified between the two different types of Principles and Technologies of different OSI layers. Thanks to this more solid structure, if we were to create a new Social Computing service or application, it would be best to look carefully at the map of Technologies – Principles: we should decide what we wish to achieve with our new service and use the map as our guide in creating a detailed plan for the development of our new product.
6 Discussion During our research we encountered several analyses of Web 2.0, Social Computing and the technologies behind them. We also came across new developments in the area of Web 2.0 Technologies, such as the work of Mozilla Labs and the new services recently introduced by Google, Google Wave and the controversial Google Buzz. With new services like these, the relation between principles and technologies becomes even more obvious, since they promote social interaction, sharing of knowledge and user generated content, all based on the technologies mentioned in the present paper [24], [25]. Although the main subject of our research is Web 2.0 Technologies, innovation does not only happen in services, but also in devices closely linked to the Internet, which creates new challenges and opportunities. For example, following the joint Microsoft - HP tablet presentation, Apple introduced the iPad into the market; such products were launched as a ‘middle product’ between smartphones and laptops. These new devices create opportunities for the development of innovative services, which in turn add new value to web-based services, such as the new bookstore application supported solely by Apple's iPad.
Finally, if we were to reflect on the future of Web 2.0, one thing is certain: there will always be new services, applications and technologies introduced, therefore we must always keep in touch with the latest innovations and keep an open mind. Open problems and research opportunities. Lastly, through this paper we can identify related problems which still remain open among the scientific community of this research area, thus offering many opportunities for future developments. In the previous sections of our paper we have mentioned subjects which constitute separate research areas. Most notably, Collaborative Tagging (Subsection 2.2) and User Generated Content (Subsection 3.2) can be identified as two subjects that remain open for further study, and we can also find connections to the other subjects mentioned in this paper. Collaborative Tagging is highly debated and many researchers have tried many different approaches [19], [26] in order to tackle the problems which arise with the extensive use of tags on the internet. The main problem is trying to identify and organize these tags, since large amounts of data exist on the web, accessed and commented on by millions of users; this chaotic environment results in a chaotic state of tags, with users creating new words, such as internet slang [27], and dealing with typical linguistic issues such as polysemy and homonymy [19]. User Generated Content was one of the principles we identified, one of the more Socially oriented principles of Social Computing, since having users create their own content and share it with others is a purely social activity. Beyond our principle, user generated content includes many issues that are the subject of research by many scientists. This interest exists due to the incredibly large amounts of data which are user generated, which can serve positively from a community-growth perspective and allow a better use of the long tail. However, all this data can also work against a service, for example floods of data from hyperactive, malicious, or even ignorant users and low-quality content added by untrustworthy users. All these issues call for new, improved ways of separating content and users of good quality from users and content of lower quality [28]. To conclude our discussion, there are many opportunities for new research in the spectrum of Web 2.0 and Social Computing, since both subjects are rather fresh and the constant developments in technologies and services add even more interest to this area.
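To make the last point slightly more concrete, here is a deliberately simplified sketch of the coupling between user quality and content quality; it only illustrates the mutual-reinforcement idea in general terms and is not the learning method proposed in [28]. All names and ratings are invented.

```python
def score_users_and_content(posts, iterations=10):
    """posts: list of (user, item, initial_rating) with ratings in [0, 1].
    A user's score is the mean score of the items they authored; an item's
    score mixes its initial rating with its author's score."""
    author_of = {item: user for user, item, _ in posts}
    rating_of = {item: r for _, item, r in posts}
    item_score = dict(rating_of)
    user_score = {u: 0.5 for u, _, _ in posts}
    for _ in range(iterations):
        for u in user_score:
            owned = [item_score[i] for i, a in author_of.items() if a == u]
            user_score[u] = sum(owned) / len(owned)
        for i in item_score:
            item_score[i] = 0.5 * rating_of[i] + 0.5 * user_score[author_of[i]]
    return user_score, item_score

users, items = score_users_and_content([
    ("alice", "howto-1", 0.9), ("alice", "howto-2", 0.8), ("spammer", "junk-1", 0.1),
])
print(users)   # alice ends up with a high score, the spammer with a low one
```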
References 1. Knol, M., Spruit, M., Scheper, W.: The Emerging Value of Social Computing in Business Model Innovation. In: Lytras, M.O. (ed.) Electronic Globalized Business And Sustainable Development Through IT Management: Strategies And Perspectives. IGI Global (2009) 2. O’Reilly, T.: http://www.oreillynet.com/pub/a/oreilly/tim/ news/2005/09/30/what-is-web-20.html 3. O’Reilly, T.: http://radar.oreilly.com/archives/2006/12/ web_20_compact.html 4. McAfee, A.: Enterprise 2.0: The Dawn of Emergent Collaboration. MIT Sloan Management Review 47(3), 21–28 (2006)
5. Berners-Lee, S.T.: Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web. HarperCollins, New York (1999) 6. Hinchcliffe, D.: http://web2.socialcomputingjournal.com/ the_state_of_web_20.htm 7. Garrett, J.: http://www.adaptivepath.com/publications/essays/ archives/000385.php 8. Anderson, P.: What is Web 2.0? Ideas, technologies and implications for education. JISC Technology and Standards Watch (2007) 9. Ecma International.: Standard ECMA-262 5th Edition: ECMAScript Language Specification. Ecma International, Geneva (2007) 10. McLellan, D.: O’Reilly: xml.com, http://www.xml.com/pub/a/2005/02/09/xml-http-request.html 11. Netscape Communications Corporation. Javascript Press Release, http://web.archive.org/web/20070916144913/ http://wp.netscape.com/newsref/pr/newsrelease67.html 12. Microsoft Corporation, http://msdn.microsoft.com/en-us/library/1kw29xwfVS.85.aspx 13. Pautasso, C.: SOAP vs REST: Bringing the Web back into Web Services, Business Integration Technologies, IBM Zurich Research Lab (2007) 14. Spies, B.: http://www.ajaxonomy.com/2008/xml/ web-services-part-1-soap-vs-rest 15. Microsoft Corporation, http://www.microsoft.com/silverlight/overview/default.aspx 16. Benson, J., Company, H.A.: EMML Changes Everything: Profitability, Predicatbility & Performance through Enterprise Mashups. Open Mashup Alliance, Alexandria (2009) 17. Greaves, M.: Semantic Web 2.0. IEEE Intelligent Systems (2007) 18. Ankolekar, A., Krötzsch, M., Tran, T., Vrandečić, D.: The two cultures: Mashing up Web 2.0 and the Semantic Web. Journal of Web Semantics, 70m–75m (2008) 19. Golder, S.A., Huberman, B.A.: The Structure of Collaborative Tagging Systems. CoRR: abs/cs/0508082 (2005) 20. Halpin, H., Robu, V., Shepherd, H.: The Complex Dynamics of Collaborative Tagging. In: WWW 2007, Banff, Alberta, Canada (2007) 21. Prodigy Video Clip Contest, http://www.dailymotion.com/group/runwiththewolves 22. My College Guide, http://mycollegeguide.org/blog/02/2010/ college-applications-video/ 23. Tufts University Admissions, http://admissions.tufts.edu/?pid=184 24. Hasija, K., Singh, D., Mehta, A.: Re-Orientation of Web 2.0. In: International Conference on Information and Multimedia Technology, pp. 117–120. IEEE Computer Society, Jeju Island (2009) 25. Vossen, G.: Unleashing Web 2.0, From Concepts to Creativity. Morgan Kaufman, Burlington (2007) 26. Li, R., Bao, S., Yu, Y., Fei, B., Su, Z.: Towards effective browsing of large scale social annotations. In: WWW 2007, pp. 943–952 (2007) 27. Wellman, B.: The Glocal Village: Internet and Community. idea&s, Faculty of Arts and Science. University of Toronto, Toronto (2004) 28. Bian, J., Liu, Y., Zhou, D., Agichtein, E., Zha, H.: Learning to recognize reliable users and content in social media with coupled mutual reinforcement. In: WWW 2009, pp. 51–60 (2009)
Using Similarity Values for Ontology Matching in the Grid Axel Tenschert HLRS - High Performance Computing Center Stuttgart, Nobelstraße 19, 70569 Stuttgart, Germany [email protected]
Abstract. This work targets the issue of ontology matching in a grid environment. For this, similarity values are considered in order to quantify the similarity between several ontologies, concepts and properties. Based on the similarity values, a matching is proposed in order to extend one priority ontology. The ontology matching is proposed in a grid environment with the aim of managing ontology matching for big data sets at the same time. Keywords: Ontology Matching, Ontology Mapping, Grid Computing, Virtual Organizations.
1 Introduction This work envisages the mapping of concepts, properties and relations between entities of several ontologies in a semi-automatic way, by considering well-known approaches. Hereby, ontologies are selected in order to extend one ontology of a given set into a priority ontology. In order to achieve an adequate mapping of similarities for the extended priority ontology, similarity values for entity pairs are calculated. However, the calculation of similarity values for entity pairs encounters the problem of deriving one measurement out of many values, such as the similarity of properties or relations and the relevance of neighboring entity pairs. Further, the matching of large data sets requires large computing resources as well. For this, algorithms for executing compute jobs at the same time on several nodes of a cluster are beneficial approaches. However, this raises the challenge of distributing the jobs in a highly effective way, as well as aligning algorithms to avoid latencies or conflicts. Therefore, two main issues are considered in this work:
1. Ontology matching based on similarity values
2. Distributed ontology matching on several resources
2 Related Work When thinking of ontology matching in a distributed environment, we have to consider established approaches in this field as well. Bloehdorn et al. [10] have identified an
ontology matching process divided into five steps and a number of iterations, depending on the number of newly proposed alignments. The described process requires as input two or more selected ontologies, in order to generate one output ontology that includes the proposed alignments. Additionally, a user is enabled to enter already established matches manually. Bloehdorn et al. identify the whole process as a matching process, but the steps for comparing the alignments are defined as mapping. The five mapping steps are as follows:
1. Feature Engineering. Relevant features of the ontology set are selected.
2. Search Step Selection. A defined search space of matches is derived.
3. Similarity Computation. Similarity values within the matches are defined.
4. Similarity Aggregation. Several similarity values of one match (e.g. similarity between properties, relations, etc.) are aggregated into one.
5. Interpretation. In the final interpretation step the produced similarity values are used to derive matches between entities.
However, the similarity of one entity pair influences the neighboring pairs as well. Hence, within an iteration the similarity values of the neighbors are considered. When the last newly proposed alignment has been calculated, the iteration terminates. Further, several algorithms processing the iterations of the ontology matching process are executed at the same time. When thinking of distribution techniques, MapReduce [6] has to be taken into account as well. MapReduce is implemented by Apache Hadoop [7], an open-source software project for reliable, scalable and distributed computing. Hence, the MapReduce framework is designed for processing vast amounts of data on several nodes of a cluster in parallel. Urbani et al. [12] have analyzed the problem of scalable and distributed reasoning using MapReduce. Within their work they have addressed the challenge of partitioning reasoning in a scalable way. The advantage of reasoning in a parallel way is the possibility to scale in two dimensions, the hardware performance of each node and the number of available nodes in the cluster. Urbani et al. have proposed an approach for reasoning over very large amounts of data. This approach has outperformed other commonly published approaches under their specific test conditions. However, other common approaches for reasoning, such as Falcon-AO [8] or DBpedia [9], have been published. Hu et al. [11] present Falcon-AO as a significant component of Falcon for automatic ontology alignment. Within Falcon-AO, Hu et al. describe the PBM component that is used for the partitioning of ontologies. This technique is required to manage large-scale ontologies. The Falcon-AO tool is used for the partitioning and alignment of ontologies. DBpedia is a community effort for extracting structured information from Wikipedia with the aim of making it available on the web. Within DBpedia several projects are available, such as the DBpedia Ontology, a cross-domain ontology that covers classes and properties for describing the Wikipedia content in a structured way. The described approaches address the challenge of ontology matching and reasoning at the same time as well as handling large data sets. Hence, they are considered for defining an effective approach for distributed ontology matching.
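Before turning to the matching process itself, the distribution idea can be sketched in a few lines. The snippet below uses Python's multiprocessing pool as a stand-in for cluster nodes managed by a grid middleware and a deliberately naive string-overlap measure as a placeholder for a real similarity function; it mimics a map phase (score every entity pair) and a reduce phase (keep pairs above a threshold) rather than using Hadoop itself.

```python
from itertools import product
from multiprocessing import Pool

def entity_similarity(pair):
    """Map step: compute a similarity value for one entity pair.
    The character-overlap measure is only a placeholder for a real matcher."""
    a, b = pair
    overlap = len(set(a.lower()) & set(b.lower()))
    return (a, b, overlap / max(len(set(a)), len(set(b)), 1))

def match(ontology_a, ontology_b, threshold=0.8, workers=4):
    """Reduce step: keep the entity pairs whose similarity exceeds a threshold."""
    pairs = list(product(ontology_a, ontology_b))
    with Pool(workers) as pool:
        scored = pool.map(entity_similarity, pairs)   # distributed map phase
    return [(a, b, s) for a, b, s in scored if s >= threshold]

if __name__ == "__main__":
    # Toy "ontologies" given as lists of entity labels
    print(match(["Author", "Title"], ["Writer", "Name", "Title"]))
```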
3 Ontology Matching

Ontology matching can be used to combine the content of several ontologies by matching properties, concepts and relations. Shvaiko and Euzenat [14] describe several steps for ontology matching, including the comparison of concepts and properties of different ontologies. When comparing similar ontologies it becomes obvious that there are differences between them even if they are in the same format and contain similar concepts. In order to avoid compatibility problems between ontologies, one solution is to select ontologies with a similar topic and similar structure, e.g., the same format and the same topic. Through this, it becomes possible to extend a priority ontology that is enhanced through a mapping based on a previous matching between the selected ontologies. Furthermore, Tun [16] presents an ontology matching approach called MetaOntoModel-based ontology matching. This method enables the comparison of concepts with the aim of identifying similar concepts by analyzing the meta-knowledge of the concepts; if the meta-knowledge is similar, a matching is beneficial. The meta-knowledge is the concept-level knowledge. The described approaches support distributed ontology matching by considering several resources, e.g., various nodes of a cluster. However, applying grid technologies to a distributed ontology matching process increases the efficiency as well.

3.1 Using Similarity Values for Ontology Matching

The usage of similarity values for ontology matching increases the accuracy of matching strategies. This approach is envisaged within this work. In the process of Bloehdorn et al., the calculation of the similarity values, namely the "Similarity Computation", is the third step of the ontology matching process. Therefore, we have to take a look at the two preceding steps, "Feature Engineering" and "Search Step Selection". Within the first step a user has to define relevant features for the matching. The relevance of features for a matching depends on the research question as well as on the topic of the ontologies. Further, a set of ontologies needs to be defined by the user as well; through this, the user specifies the topic and the amount of data considered within the matching. The "Search Step Selection" defines the search space of all entities that are considered for the matching. The search space affects the accuracy of the matching results as well: a wider search space might lead to a better analysis of the neighboring entities that influence the current entity pair. However, this may lead to an amount of data that becomes hard to compute in an efficient and scalable way.
Fig. 1. Similarity Measurement Calculation Prerequisites
Figure 1 describes the prerequisites for calculating the similarity value. First of all, a set of ontologies is identified to provide the matching process with the required input data. Afterwards, an entity pair is selected and the relevant properties for the matching are determined. The next step is to define the search space. The similarity value of an entity pair consists of several aspects, such as the similarity of:

• properties
• concepts
• taxonomy
• relations
Especially when thinking of the relations between entities in an ontology, it becomes obvious that related entities have to be considered as well. Here, the search space describes the range of neighboring entities. An issue for further research is an algorithm for calculating such a similarity value for entities out of a set of similarity values, namely those of the properties, concepts, taxonomy and relations. To summarize the main issues for measuring the similarity value (SimV) between two selected entities, we identify the following relevant conditions:

i. the similarity grade of the selected concepts themselves – SimC
ii. the properties of the concepts and the similarity grade between the properties – SimP
iii. the taxonomy and the similarity grade of the taxonomy – SimT
iv. the similarity grade of the relations of the concepts – SimR

Hence, SimV consists of SimC, SimP, SimT and SimR. The challenge for creating SimV is to measure the similarities in an adequate way, generating one similarity out of the identified similarities while also considering the search space of the matches. Regarding the described strategy of generating one similarity value that is applicable instead of several different similarity values such as SimC, SimP, SimT and SimR, the work of Euzenat and Valtchev should be considered. In their work, several similarity values are combined in order to obtain one value for matching purposes [13]. In particular, they present an approach for subsuming the similarities of the selected classes or other entities into one. The described function enables the usage of only one similarity value instead of several different values, and this summarization should be considered for handling large numbers of similarity values (a possible aggregation is sketched below). Beyond the described iteration, including the calculation prerequisites and the calculation of the similarity value, it has to be considered that several iterations will take place, many of them processed in parallel. Therefore, a solution for distributing the iterations over several resources is required. For this work, this means that an algorithm for the distribution of ontology matching process iterations over several resources, e.g., different nodes, is required.
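The paper leaves the concrete aggregation function open; a common choice, shown here as an assumption rather than the author's method, is a normalized weighted sum of the four component similarities.

```python
def aggregate_simv(sim_c, sim_p, sim_t, sim_r,
                   weights=(0.4, 0.3, 0.2, 0.1)):
    """Combine SimC, SimP, SimT and SimR into one value SimV in [0, 1].

    The weights are illustrative; in practice they would be tuned for
    the ontologies at hand or learned from reference alignments.
    """
    sims = (sim_c, sim_p, sim_t, sim_r)
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)

# Example: concepts very similar, properties moderately similar,
# taxonomy and relations only weakly similar.
print(aggregate_simv(0.9, 0.6, 0.3, 0.2))  # -> 0.62
```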
3.2 Ontology Matching in a Grid Environment

When thinking of the grid, three main middleware systems have to be considered: the open-source grid software Globus Toolkit [3], UNICORE [4] and gLite [5], which is part of the EGEE project. Such a middleware is required in order to establish ontology matching in a grid environment. Hence, an adequate grid middleware is the connection between the ontology matching process and the available resources in a cluster. Ontology matching in a distributed way, as in a grid environment, enables fast and resource-effective matching. For this, the challenge is to provide a matching strategy that can be distributed in a grid environment.
Fig. 2. Initial Architecture for Ontology Matching in the Grid
Figure 2 presents an initial architecture for ontology matching in a grid environment. The user domain includes a set of ontologies that was selected by the user considering his/her specific research question. The mapping domain includes the matching strategy and the distribution of the matching, and therefore the distribution of several jobs onto different nodes; the mapping itself takes place in this domain. A grid middleware is required as well, and resources have to be identified. Xing et al. [15] identify the grid as a collection of virtual organizations (VOs) including several different resources, which are combined and organized by a grid middleware. Accordingly, Xing et al. propose the usage of a grid as a beneficial method for supporting a user with the required computing power, storage capacity and services. The possibility of using a virtual organization that allocates the required resources in order to increase scalability and computing power is considered within this work. The use of a grid middleware ensures a connection between the computing resources and the ontology matching processes. Through this, the mapping processes are distributed over the available resources by considering virtual organizations. This allows the execution of various matching iterations at the same time. The selected grid middleware provides mechanisms for the distribution of several processes or compute jobs. A
grid architecture enables the coordination of several loosely coupled resources in order to make use of them. By this means, the set of available resources is used as a cluster and it becomes possible to support the ontology matching process for large data sets and urgent computing cases with the required resources. Figure 3 presents an example of a use case for distribution in a grid.
Fig. 3. Use Case - Ontology Matching in a Grid
As presented, the user can initialize the ontology matching process. The ontology matching process in Figure 3 includes the user and the mapping domain from the initial architecture. The ontology matching process requires computing resources that are accessible through a grid middleware. Hereby, several resources (e.g., Resource A, Resource B, Resource C) are used as a cluster. The proposed distributed process execution on several resources ensures the required compute power even if the resources are located at different sites.
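The following sketch illustrates one simple way such iterations could be partitioned over Resource A, B and C; local threads stand in for job submission through a grid middleware such as Globus, UNICORE or gLite, and the round-robin partitioning scheme is an illustrative assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in names for grid resources; a real deployment would submit jobs
# through a grid middleware instead of local threads.
RESOURCES = ["Resource A", "Resource B", "Resource C"]

def run_iteration(resource, entity_pairs):
    # Placeholder for one matching iteration executed on one resource.
    return resource, len(entity_pairs)

def distribute(entity_pairs, resources=RESOURCES):
    # Round-robin partitioning of candidate pairs over the resources.
    buckets = {r: entity_pairs[i::len(resources)]
               for i, r in enumerate(resources)}
    with ThreadPoolExecutor(max_workers=len(resources)) as pool:
        futures = [pool.submit(run_iteration, r, b)
                   for r, b in buckets.items()]
        return [f.result() for f in futures]

pairs = [("e%d" % i, "f%d" % i) for i in range(10)]
print(distribute(pairs))   # e.g. [('Resource A', 4), ('Resource B', 3), ...]
```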
4 Conclusions and Outlook

The presented approach for ontology matching in a distributed environment by using a grid middleware supports a highly efficient and scalable execution of processes. In this way, the handling of large amounts of data as well as the time-urgent execution of processes is supported. In addition, the usage of similarity values increases the quality of the distributed matching. The combination of an ontology matching approach that is based on the measurement of similarities between entities and a distributed environment is a promising strategy for facing the challenge of matching very large data sets and/or urgent computing cases in the field of ontology matching. This approach is a basis for further work in the field of ontology matching on distributed resources. Further work in the field of grid computing to facilitate ontology matching will take place.

Acknowledgments. This work has been supported by the plugIT project (http://www.plug-it-project.eu) and has been partly funded by the European Commission's ICT activity of the 7th Framework Programme under contract number 231430. This paper expresses the opinions of the authors and not necessarily those of the European Commission. The European Commission is not liable for any use that may be made of the information contained in this paper.
This work has been supported by the LarKC project (http://www.larkc.eu), partly funded by the European Commission's IST activity of the 7th Framework Program. This work expresses only the opinions of the authors.
References

1. The plugIT project, http://plug-it-project.eu/
2. The LarKC project, http://www.larkc.eu/
3. The Globus Toolkit, http://www.globus.org/
4. UNICORE, http://www.unicore.eu/
5. gLite, http://glite.web.cern.ch/glite/
6. The MapReduce Framework, http://hadoop.apache.org/mapreduce/
7. The Apache Hadoop project, http://hadoop.apache.org/
8. The Falcon infrastructure, http://iws.seu.edu.cn/projects/matching/
9. The DBpedia project, http://dbpedia.org/About
10. Bloehdorn, S., Haase, P., Huang, Z., Sure, Y., Volker, J., van Harmelen, F., Studer, R.: Ontology Management. In: Davies, J., Grobelnik, M., Mladenic, D. (eds.) Semantic Knowledge Management, pp. 3–20. Springer, Heidelberg (2009)
11. Hu, W., Cheng, G., Zheng, D., Zhong, X., Qu, Y.: The Results of Falcon-AO in the OAEI 2006 Campaign. In: Proceedings of OM (2006)
12. Urbani, J., Kotoulas, S., Oren, E., van Harmelen, F.: Scalable Distributed Reasoning using MapReduce. In: Bernstein, A., Karger, D.R., Heath, T., Feigenbaum, L., Maynard, D., Motta, E., Thirunarayan, K. (eds.) ISWC 2009. LNCS, vol. 5823, pp. 634–649. Springer, Heidelberg (2009)
13. Euzenat, J., Valtchev, P.: Similarity-based ontology alignment in OWL-Lite. In: Proceedings of the European Conference on Artificial Intelligence (2004)
14. Shvaiko, P., Euzenat, J.: Ten Challenges for Ontology Matching. In: Proceedings of ODBASE (2008)
15. Xing, W., Dikaiakos, M.D., Sakellariou, R., Orlando, S., Laforenza, D.: Design and Development of a Core Grid Ontology. In: Proceedings of the CoreGRID Workshop (2005)
16. Tun, N.N.: Semantic Enrichment in Ontologies for Matching. In: Proceedings of the Second Australasian Workshop on Advances in Ontologies (2006)
Rapid Creation and Deployment of Communities of Interest Using the CMap Ontology Editor and the KAoS Policy Services Framework

Andrzej Uszok1, Jeffrey M. Bradshaw1, Tom Eskridge1, and James Hanna2

1 Florida Institute for Human and Machine Cognition (IHMC), 40 S. Alcaniz, Pensacola, FL 32502, USA
{auszok,jbradshaw,teskridge}@ihmc.us
2 Air Force Research Laboratory, Rome, NY
[email protected]
Abstract. Sharing information across diverse teams is increasingly important in military operations, intelligence analysis, emergency response, and multi-institutional scientific studies. Communities of Interest (COIs) are an approach by which such information sharing can be realized. Widespread adoption of COIs has been hampered by a lack of adequate methodologies and software tools to support the COI lifecycle. After describing this lifecycle and associated dataflows, this article defines requirements for tools to support the COI lifecycle and presents a prototype implementation. An important result of our research was to show how consistent use of ontologies in the COI support tools can add significant flexibility, efficiency, and representational richness to the process. Our COI-Tool prototype supports the major elements of the COI lifecycle through the graphical capture and sharing of COI configurations represented in OWL via the IHMC CMap Ontology Editor (COE), facilitation of the COI implementation through integration with the AFRL Information Management System (IMS) and IHMC's KAoS Policy Services framework, and the reuse of COI models. In order to evaluate our tools and methodology, an example METOC (weather) community of interest was developed using them.

Keywords: Community of Interest, Ontology, COI, OWL, Concept Maps, Policy, KAoS.
1 Introduction

Communities of Interest (COIs) are a realistic approach to interoperability and collaboration on data sharing across organizations that provides an alternative to global data element standardization. They are a means by which the strategy of net-centric information sharing between individuals and organizations can be realized. COIs are defined as "collaborative groups of users who must exchange information in pursuit of their shared goals, interests, missions, or business processes and who therefore must have shared vocabulary for the information they exchange" [3]. Primarily, COIs
promote data understandability through shared vocabularies, comprised of semantic artifacts (dictionaries, data models, taxonomies, ontologies, etc.), which help members establish a shared understanding of the data they exchange. But they also share policies controlling access to the exchanged information and putting obligations on the community members. In any COI, there are two main roles of participants: a producer and a consumer of information. In any practical community its members usually share these two roles, producing certain information and consuming other information. The life of a COI extends beyond the phase of agreeing on vocabulary and policies, when, based on the abstract definitions, a concrete community is established. All these processes can be significantly augmented by software tools speeding up the process of COI creation. This is essential in the military, business and scientific domains. To better understand the needs and requirements of COIs, we began by surveying and assembling a collection of COI resources, which we catalogued on our COI-devoted Web site1. From these resources and from discussions with a former Commander at the Naval Oceanographic Office, who has extensive experience in the creation of METOC COIs, we derived the necessary requirements for our prototype tool to support the COI lifecycle. This article presents a description of the Community of Interest lifecycle and dataflow, defines requirements for a tool supporting the COI lifecycle and presents its prototype implementation. Specific objectives for the COI-Tool prototype include the following:

• Ability to model specific roles and physical resources for COI data producers and consumers so that each participant knows what is expected from them;
• Ability to define the semantics and structure of the COI information in a way that allows automatic encoding in a formal, computer-processable notation;
• Minimization of training and support requirements for COI managers through simple user interfaces and automation of tedious aspects of lifecycle support;
• Mapping and translation of simple incompatibilities between semantic information from different sources.
In order to address the semantic aspects of COI modeling, we based our approach on the most widely adopted standard today for semantically rich model representation, the W3C's Web Ontology Language (OWL) [11]. We also, for the first time, integrated two existing IHMC tools, the CMap Ontology Editor (COE) and the KAoS Policy Services framework, with the AFRL Information Management System in order to create a comprehensive environment supporting the COI lifecycle. Numerous new mechanisms and capabilities were added to these tools to make this possible. This paper is organized as follows. First we summarize the key requirements of any tool to support the creation and maintenance of COIs (Section 2). We then present the systems we used to create our prototype of the COI-Tool (Section 3). This is followed by a description of the COI-Tool that we have created (Section 4). Section 5 presents the example weather-related COI which we created using our new tool. Finally, we conclude the paper with a discussion and a summary.
1 http://ontology.ihmc.us/COI
2 Requirement Analysis of the Communities of Interest Lifecycle

The results of our analysis of the information flow in the process of creating, implementing and operating COIs are presented in Fig. 1. First, the COI Manager may use different sources of information to bootstrap the process of defining the initial model. Combining specific data schemas and vocabulary pulled from existing metadata or ontology repositories (such as the DoD Metadata Registry2), saved data from previous COIs, and results from Web and WordNet3 searches, the initial community model can be rapidly assembled from reusable components and extended to meet new requirements. After this initial step, all the community members have to be involved in the collaborative shaping of the community vocabulary. At this stage the model does not usually represent any concrete COI but rather a reusable template for the given type of community. For instance, there are many METOC (meteorological) communities that share the same or similar roles and weather data products. Defining a generic template for the METOC community and only refining it for a specific case is a time saver.
Fig. 1. Data products and dataflow of COI lifecycle
The resulting COI Configuration Template contains the following information:

• Links to ontologies that specify the vocabulary for the definition of roles and data product structures,
• Specific producer and consumer types,
2 https://metadata.dod.mil
3 http://wordnet.princeton.edu/
• Detailed structure of data products,
• Policies defining authorization and obligation,
• Additional specific resources such as maps, images, documents, etc.
From the COI Configuration Template, a concrete COI Configuration can be created by mapping producer and consumer roles from the given COI template to concrete resources such as organizational units, databases, devices, and so forth. Once the available resources have been mapped to the assigned roles, the Implementation Phase of the COI can be started. Software stubs for producers and consumers can be generated based on the defined product data structures. The developer responsible for a particular consumer or producer will use the stubs generated by a COI Manager either to implement new services or connectors to existing ones. In this phase, additional details must be decided and implemented. For instance, a COI Manager can refine (add/change/remove) generic policies originally defined as part of the generic COI Configuration Template model. If required, simple semantic translations can also be defined in order to harmonize data from incompatible participants. Finally, in the next phase, the members can be activated inside the community infospace. A COI Manager should be able to monitor the community and obtain information about participant status, the relations between them and compliance with the policies. In the next three subsections we present the detailed requirements, resulting from our examination, for any practical tool supporting the COI lifecycle (an illustrative sketch of a configuration template and its mapping to a concrete configuration follows the requirements below).

2.1 Exploration and Creation Phase Requirements for the COI-Tool

During the exploration and creation phase, the COI-Tool has to support the COI Manager and the participants of the community in the definition of a model for the COI. An important part of this process is an agreement on the terms and vocabularies to be used to exchange information products. In order to support this in an effective way, the following three main requirements have to be fulfilled:

• Easy-to-understand formal models of COI information representation – the COI-Tool must capture the COI requirements and structure in an easy-to-understand format that can be automatically transformed into a formal model of COI information requirements.
• Support for collaborative COI development – to facilitate the process of achieving consensus on the COI vocabulary, roles and information to be exchanged, the COI-Tool should provide support for both synchronous and asynchronous distributed collaboration on vocabulary creation for the COI participants.
• Ease of reuse – the COI-Tool should possess graphical templates for individual model elements or larger model structures so that they can be easily extended and reused, thus saving development time in similar COI applications that may later arise.
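The following sketch illustrates, in simplified form, the kind of information a COI Configuration Template carries and how its roles might be bound to concrete resources. The field, role and binding names are illustrative assumptions, not the COI-Tool's actual data format.

```python
from dataclasses import dataclass, field

@dataclass
class COITemplate:
    """Reusable description of a community type (e.g., a METOC COI)."""
    ontologies: list            # links to vocabularies for roles and products
    producer_roles: list
    consumer_roles: list
    product_types: dict         # product name -> schema description
    policies: list = field(default_factory=list)

@dataclass
class COIConfiguration:
    """A concrete community: template roles bound to physical resources."""
    template: COITemplate
    role_bindings: dict         # role name -> organizational unit / system

metoc_template = COITemplate(
    ontologies=["weather.owl"],
    producer_roles=["WeatherSensor"],
    consumer_roles=["ForecastCenter"],
    product_types={"Observation": "temperature, wind, timestamp"},
)

korea_metoc = COIConfiguration(
    template=metoc_template,
    role_bindings={"WeatherSensor": "Sensor network (hypothetical unit)",
                   "ForecastCenter": "METOC forecast cell (hypothetical unit)"},
)
print(korea_metoc.role_bindings["ForecastCenter"])
```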
2.2 Implementation Phase Requirements for the COI-Tool

This phase of the COI lifecycle involves the physical implementation of a concrete community. Support for this phase must allow mapping abstract elements of a COI model to physical assets (e.g., organizations, individuals) and data resources. Software components must be implemented or adapted to connect the required resources of producers and consumers to the community infospace. Additionally, meta-information for community data products has to be easily generated from the model. In particular, the following requirements have to be fulfilled by the tool:

• Link abstract COI model to implementation model – the COI-Tool should provide a graphical interface to support this mapping. Once the mapping is complete, the tool should automatically generate the configuration files and software stubs that will be used to implement producers and consumers.
• Data product policies – COIs need an expressive and flexible way to define data product access and filtering policies, as well as obligation policies for members.
• Harmonization of vocabulary – finally, any realistic COI will need a way to accommodate at least simple differences in vocabulary among partners within the COI model.
2.3 Operation, Monitoring and Maintenance Phase Requirements for the COI-Tool

During the operational phase the tool should provide the following capabilities to a COI Manager:

• Monitoring configuration, activity state, and policy compliance – authorized COI participants must be able to monitor activities inside the community, to discover discrepancies between the model and reality, and to be notified if community policies are violated.
• Monitoring producer/consumer/information object relationships – allowing assessment of the state of the community and its compliance with policy.
• Collecting overall history and statistics – essential for any forensic investigation.
3 COI-Tool Prototype as an Integration of Existing Systems

During the development phase, our objective was to build and test a prototype implementation of the requirements enumerated during the analysis phase of the project. We found that existing IHMC and AFRL systems, when combined and enhanced in specific areas, would fulfill the requirements for the COI-Tool. Fig. 2 maps the requirements for the COI-Tool described and justified in the previous section to existing or new features of the COE, KAoS and IMS components. Below we briefly describe these systems.
Fig. 2. Summary of COI Lifecycle Needs and Solutions
3.1 COE - CMap Ontology Editor

COE4 [4,8] is an extension of the IHMC CMapTools5 concept mapping program [2]. COE allows OWL ontologies to be conceptualized, developed, and managed through a powerful graphical interface based on concept maps. Unlike other tools [5,9], COE was specifically developed in order to exploit patterns of OWL structure to make it easier for both experts and non-specialists to use (e.g., hiding irrelevant information, use of templates for frequently-used patterns). To bridge the gap between the informal nature of concept maps and the formal, machine-readable Web ontology languages, COE uses a set of conventions and guidelines that enables users to construct syntactically valid Web ontologies using the concept-mapping interface. These conventions retain as far as possible an intuitive reading of the concept map while faithfully capturing the precision of the OWL syntax, and are based on a few basic ideas (which make them easy to learn). English words and phrases are used as far as possible, and we avoid the 'mathematical logic' terminology that pervades the OWL documentation. Figures 7 and 8 show examples of ontologies depicted using COE, where the blue lines denote class inclusion and the dotted lines denote property definitions. The text boxes containing unprefixed labels are defined as part of the ontology under development (e.g., "ForecastElement"), while the prefixed labels are concepts that are linked in from other existing ontologies (e.g., "time:ProperIntervalThing"). Linking
4 http://cmap.ihmc.us/coe
5 http://www.cmappers.net/
our specific ontology elements to existing ontologies is a central aspect of our approach, whereby we reuse existing concepts from other ontologies and thereby achieve interoperability simply by being written in the same language. The side panel shown in Fig. 7 shows the list of ontologies that are being imported, and provides several different ways to browse and connect their information.

3.2 KAoS Policy Services Framework

IHMC's KAoS6 [10] is a mature services framework that relies on ontology in the specification, analysis, and enforcement of policy constraints across a wide variety of distributed computing platforms. KAoS Policy Services allow for the specification, management, conflict resolution, and enforcement of policies. KAoS policies distinguish between authorizations (i.e., constraints that permit or forbid some action by an actor or group of actors in some context) and obligations (i.e., constraints that require some action to be performed when a state- or event-based trigger occurs, or else serve to waive such a requirement). The use of ontology to represent policies enables reasoning about the controlled environment, about policy relations and disclosure, about policy conflict resolution, as well as about domain structure and concepts. KAoS reasoning methods exploit description-logic-based subsumption and instance classification algorithms and, if necessary, controlled extensions to description logic (e.g., role-value maps). Two important requirements for the KAoS architecture have been modularity and extensibility. These requirements are supported through a framework with well-defined interfaces that can be extended, if necessary, with the components required to support application-specific policies. The basic elements of the KAoS architecture are shown in Fig. 3; its three layers of functionality correspond to three different policy representations:
Fig. 3. KAoS Policy Services Conceptual Architecture
6 http://ontology.ihmc.us/kaos
Human interface layer: This layer uses a hypertext-like graphical interface (the KAoS Policy Administration tool, or KPAT7) for policy specification in the form of natural English statements. The vocabulary for policies is automatically provided from the relevant ontologies. Management of KAoS components and of the ontologies can also be performed by authorized users through KPAT. In addition to the standard KPAT interface, customizable policy templates and wizards make policy creation easy even for non-specialists.

Policy Management layer: Within this layer, OWL is used to encode and manage policy-related information. The KAoS Distributed Directory Service (DDS) residing in this layer encapsulates a set of ontology reasoning mechanisms.

Policy Monitoring and Enforcement layer: KAoS automatically "compiles" OWL policies to an efficient format that can be used for monitoring and enforcement, and handles the selective distribution of policies and ontological concepts to the KAoS (software) Guards that need them. This compact policy representation provides the grounding for abstract ontology terms, connecting them to the instances in the runtime environment and to other policy-related information. KAoS Guards interface with the applications being governed by policy. The guards provide a well-defined API to support policy decisions and information queries. The "pre-computed" nature of the guards' policy representation and the simplified reasoning engine mean that policy decisions can be rendered locally with "table-look-up" efficiency. The network connection to the DDS is needed only if and when runtime policy updates are made. Although the description-logic reasoning integrated with KAoS handles most requirements, specialized reasoners (e.g., spatial and temporal reasoning, probabilistic reasoning) can be accommodated as "plug-ins" to KAoS to meet application-specific needs.

3.3 AFRL Information Management System

The Information Management System developed at AFRL [6] utilizes the concept of an infosphere that includes many diverse, highly distributed applications operating on data (called clients) and a set of core services that enable the dissemination, persistence and control of the information being shared among these applications. The key services provided to disseminate information are:

• publish and subscribe – allowing a consumer to register a subscription to a specific type of information and to obtain information of this type from any publisher in the system;
• query – allowing a consumer to query the archive of previously published information for a specific type of data fulfilling a specific predicate.
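As a toy illustration of these dissemination services (and not the actual IMS API), the following sketch implements type-based publish/subscribe and query over an in-memory archive; all class and method names are hypothetical.

```python
from collections import defaultdict

class InfoSpace:
    """Toy in-memory infosphere: typed publish/subscribe plus query."""

    def __init__(self):
        self.subscribers = defaultdict(list)   # type -> delivery callbacks
        self.archive = defaultdict(list)       # type -> published objects

    def subscribe(self, mio_type, callback):
        self.subscribers[mio_type].append(callback)

    def publish(self, mio_type, payload, metadata):
        obj = {"type": mio_type, "payload": payload, "metadata": metadata}
        self.archive[mio_type].append(obj)      # persistence
        for deliver in self.subscribers[mio_type]:
            deliver(obj)                        # dissemination

    def query(self, mio_type, predicate):
        return [o for o in self.archive[mio_type] if predicate(o)]

space = InfoSpace()
space.subscribe("satellite imagery", lambda o: print("got", o["metadata"]))
space.publish("satellite imagery", b"...", {"topic": "weather", "hour": 6})
print(space.query("satellite imagery", lambda o: o["metadata"]["hour"] < 12))
```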
The quantum of managed information is called a managed information object (MIO). A MIO comprises a payload and metadata that characterizes the object (e.g., topic, time, and location), in general expressed in XML Schema. It is desirable that all of the information needed for making information management decisions (such as content-based brokering and dissemination) be present in the metadata in a form that permits efficient
7 Pronounced "KAY-pat."
processing. An important message characterization element is the concept of "type" (e.g., "satellite imagery"). The type is used to determine which policies describe its appropriate use. The system's design as a set of independent services [7] allows for the integration of very diverse producers and consumers of information.
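To illustrate how an information type could drive policy selection with table-look-up efficiency, the sketch below consults a pre-computed authorization table keyed by actor, action and type. The table, keys and default behaviour are hypothetical and do not correspond to the actual KAoS or IMS interfaces.

```python
# Hypothetical pre-computed authorization table of the kind a local policy
# guard could consult; entries map (actor, action, MIO type) to a decision.
AUTHORIZATIONS = {
    ("ForecastCenter", "subscribe", "satellite imagery"): True,
    ("FieldUnit", "subscribe", "satellite imagery"): False,
    ("WeatherSensor", "publish", "Observation"): True,
}

def is_authorized(actor, action, mio_type, default=False):
    # Unknown combinations fall back to a default decision.
    return AUTHORIZATIONS.get((actor, action, mio_type), default)

print(is_authorized("ForecastCenter", "subscribe", "satellite imagery"))  # True
print(is_authorized("FieldUnit", "subscribe", "satellite imagery"))       # False
```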
4 COI-Tool Architecture and Functionality

The tool's functionality (Fig. 5) uses ontologies as a central component. It was assumed that any model of a community of interest would start from the generic COI ontology developed during the project, which is itself based on the KAoS ontology8. Through this extension, any community model can be a source of the vocabulary for the KAoS-defined policies of the community. Our generic COI ontology captures the basic elements of any COI. Using it as a starting point for domain-specific COI ontologies establishes common ground in terms of vocabulary and operations that will help standardize practice and enable collaboration across COIs. The tool is further grounded in the specific infosphere technology by using the IMS mapping between the COI ontology and the IMS ontology developed in the J-DASP project9. The COI-Tool produces further layers of specialization that build on this grounded framework in the form of concrete COI ontologies developed for specific communities (Fig. 4).
Fig. 4. Developed COI ontologies and their relations
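The layering of Fig. 4 can be pictured as each ontology importing and specializing the ones below it. The following sketch mimics that idea with plain Python dictionaries; the class names are simplified stand-ins for the actual ontology terms, not their real identifiers.

```python
# Each ontology layer adds subclass links on top of the layers it imports.
KAOS_ONTOLOGY = {"Actor": "Entity", "Action": "Entity"}
GENERIC_COI   = {"COIMember": "Actor", "Producer": "COIMember",
                 "Consumer": "COIMember", "DataProduct": "Entity"}
METOC_COI     = {"ForecastCenter": "Consumer", "WeatherSensor": "Producer",
                 "WeatherForecast": "DataProduct"}

def merged(*layers):
    # Merge the imported layers into one subclass map.
    result = {}
    for layer in layers:
        result.update(layer)
    return result

def superclasses(cls, hierarchy):
    # Walk up the subclass chain until a root concept is reached.
    while cls in hierarchy:
        cls = hierarchy[cls]
        yield cls

hierarchy = merged(KAOS_ONTOLOGY, GENERIC_COI, METOC_COI)
print(list(superclasses("ForecastCenter", hierarchy)))
# -> ['Consumer', 'COIMember', 'Actor', 'Entity']
```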
COE is used for almost all graphical interaction with the user, with the exception of policy definition, which is done through the KAoS KPAT interface. The original CMapTools program on which COE is based allows for easy plug-in extensions with new graphical menus activating specific new actions on the concept maps. We developed new extensions in order to provide COI-related menus of actions in COE. KAoS provides policy, domain and matching services, and also serves as an integration framework for the COI-Tool, conveying messages between COE and the IMS clients fulfilling roles in a concrete operating COI. The IMS provides an example infosphere implementation which can support community communications; however, it is possible to replace it with another infosphere technology.
8 http://ontology.ihmc.us/ontology.html
9 http://jbi.isx.com/jdasp/
Fig. 5. Overview of the COE-KAoS-IMS integration for COI-Tool
4.1 COI-Tool Functionality for the COI Exploration and Creation Phase

We aimed at providing a rich graphical environment to model the COI Configuration Template and its mapping to a specialized, concrete configuration. Considerable effort was spent on the enhancement of the COE Ontology Editor functionality. We added the ability for COE to edit, browse, and manipulate multiple ontologies, which is an essential requirement for real applications. We also implemented a mechanism to automatically generate OWL encodings of a map. COI managers are provided with the following functionality when they want to create a new COI, modify an existing one, or collaborate with other community members:

Creation of a new COI
This new COE menu option allows bootstrapping the COI creation process by generating four COI configuration concept maps for the definitions of community partners (roles/actors), data products, classes of actions, and COI properties such as information about the manager's identity, applications used, and so forth. Fig. 6 shows the new menu in COE and the resulting generated maps (on the right in the list of maps). These new placeholders are empty concept maps dedicated to particular parts of the community model. However, they are already linked to the appropriate parts of the generic COI ontology, so the menu of concepts available in a particular map is focused according to its context. A new COI template can be started from the generic ontology but also from existing COI templates.
Fig. 6. GUI for creation of new domain specific COI and the result
Definition of COI properties, roles and data products
A COI manager can open any of these maps in the editor and graphically define concepts and their relations. He can use the Semantic Space Panel, developed in the scope of this project, to access concepts defined in other concept maps (ontologies), such as, for instance, the weather ontology when defining weather products (Fig. 7). The Semantic Space (Namespace) Panel functionality includes the ability to:

• show a list of concepts from the selected namespaces (as a hierarchy with additional information) applicable to a given selected map node,
• allow the user to drop a selected concept on a map node and automatically create the appropriate map semantic constructs,
• show current semantic information, from the chosen namespaces, about the selected map node,
• use the ontology reasoner to compute the menu list of concept instances or superordinates based on the selected concept node.
Usage of Web search and WordNet
The COI-Tool provides access to existing Web resources in order to obtain concept definitions or related concepts. The concepts found can be added to the map as new map nodes using the button in the COE search window.

Definition of relationships between COI roles and COI products through the use of COE templates
We developed the automatic generation of concept map templates to be used in the specialized COI maps. When a given community map is saved, a set of templates is generated from it, making it easy to create subclasses or instances of concepts from this map. So, for instance, it is easy to specify that a given community role is a specialization of some other role because of the existence of the template. Annotating every map with the parent ontology allows the tool to select which templates should be shown for a given map. In addition, there is a set of static templates making the creation
of generic ontological relations easy and readily understandable. This set can be easily extended using the COE template editor. The COI-Tool has a special panel on the right dedicated to template management.
Fig. 7. GUI enabling the modeling of COI roles and products
Usage of collaboration mechanisms
The COI Manager and his partners are able to simultaneously edit concept maps with community vocabulary and models using the COI-Tool. As long as their tools are connected to the same CMap server, they can invite each other to edit specific maps. Once they accept the invitation, they will see, in their COE window, the edits and annotations performed by others, will be able to edit the maps themselves, with changes immediately visible to the other members, and will be able to send chat messages to others. This mechanism allows them to achieve a consensus on concept definitions, data product definitions, and so forth.

Addition of links and multimedia resources
In the COI-Tool a COI manager can embellish a COI concept by adding URLs to Web resources and links to documents and maps. This additional information provides context and related information that is important in facilitating the shared human understanding of the created community model.

Access to historical versions of concept maps
COE has been integrated with Subversion10 to make it possible to store and subsequently access previous versions of the developed maps from the integrated version repository. The COI-Tool provides a browser for viewing the different revisions of the community concept maps and a compare tool for identifying their differences.
10 http://subversion.tigris.org/
Creation of a new implemented COI configuration
A COI manager can create the implementation concept maps of a given COI Configuration Template using the COI-Tool. The dedicated map for the mapping between resources and community roles has specialized graphical templates making this process easier.

4.2 COI-Tool Functionality for the Implementation Phase

The COI-Tool mechanisms described here are targeted at the deployment of a given COI as a set of clients in an infosphere based on the AFRL Information Management System. Similar support can be added to the COI-Tool for other information systems realizing a COI.

Generation of bootstrap files and code skeletons
The COI manager can generate a set of configuration and implementation files specifically for each community participant and deliver them to the developers responsible for that participant. This file set contains only the information relevant for the given participant. Based on this information, the developer can implement a client for the Information Management System infosphere producing and consuming products with the types and structure appropriate for its roles.

Definition of COI policies
The OWL representation of the given COI can be directly used as an ontology vocabulary to define the COI policies in KAoS. Using KAoS, it is not only possible to define authorization policies controlling the distribution of information, but also obligation policies, for instance requiring the timely issuing of periodic updates.

Definition of semantic translation
In reality, when implementing a COI, some partners will use incompatible representations of data, for instance when coming from some other community of interest. The COI-Tool has been equipped with a prototype graphical interface allowing the mapping of structures between two ontological classes defining schemas for data products. The interface generates translation code in SPARQL11, which can then be used to translate concrete data during communication.

11 http://www.w3.org/TR/rdf-sparql-query/

4.3 COI-Tool Functionality for the Operation, Monitoring and Maintenance Phase

In order to provide monitoring and other functionality for the community of interest, we have developed and implemented a special interceptor for the IMS clients. This interceptor modifies the standard IMS client library linked with every client and transparently analyzes client communication. The new module provides the following functionality:

• Monitoring of the members' status and relations; it reports when a given member is activated, what data it produces and from whom it receives data,
• Collection of statistical information about the relations: first/last time of data consumption, frequency and average time between consumptions,
• Checking of authorization policy compliance,
• Monitoring of obligation policy fulfillment,
• Semantic translation.
Monitoring of partner activation and relations
The information intercepted in the IMS clients is forwarded through KAoS to COE if the map with the community model is opened by its manager. The map shows which participants are active and whether any unanticipated clients are present. It also shows, by way of linking lines, any producer–consumer relation. These links can be clicked to show statistical information.

Monitoring of obligation policies
In order to monitor the fulfillment of the obligations of an information producer, we developed a new KAoS mechanism – the policy monitor – which allows determining compliance with the obligation policies. If any obligation is violated, a feedback callback to the COI Monitor Mode is activated; a notification is issued and shown on the appropriate community map.

Recording of monitoring sessions and access to recorded sessions
The history of a community session can be stored in a special folder associated with the COI configuration. Any community activity session can also be replayed later.
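As a simple illustration of the kind of check such a policy monitor could perform, the following sketch verifies that a producer obliged to publish periodic updates has not missed a deadline. The function, the time unit and the period are illustrative assumptions, not part of the KAoS implementation.

```python
def check_obligation(publish_times, period, now):
    """Return the time of the first missed deadline, or None if compliant.

    publish_times: timestamps (seconds) at which the producer published.
    period: maximum allowed gap between consecutive publications.
    """
    last = None
    for t in sorted(publish_times):
        if last is not None and t - last > period:
            return last + period          # deadline that was missed
        last = t
    if last is not None and now - last > period:
        return last + period
    return None

times = [0, 3600, 7200, 14400]            # one publication missing at 10800
violation = check_obligation(times, period=3600, now=15000)
print("obligation violated at t =", violation)   # -> 10800
```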
5 Proof of Concept – METOC Community of Interest

In order to test the usability of the presented methodology and the implemented prototype tool, an example METOC community of interest was developed. Background research for this community and a collected set of references are available on our Web site dedicated to METOC (http://ontology.ihmc.us/coi/metoc.htm).
Fig. 8. Fragment of the concept map with vocabulary for weather related concepts
Using the COI-Tool, we developed a weather ontology (Fig. 8) and then used it to define the different weather data products found in the collected resources. Based on the concrete documents for the Korea METOC Operation Plan, as well as air and space weather operations, we defined the roles in this community and then mapped them to the actual units in the Korea military area (Fig. 9). The model was then used to generate bootstrap files and stubs for example clients, which were developed in the AFRL Information Management System. The clients simulated the collection of weather sensor data and their distribution to the weather forecast centers. The centers were obliged by policy to periodically publish weather forecasts and, if necessary, weather warnings. We defined policies controlling access to the weather sensor data as well as obligations for the providers of data and the issuers of forecasts and weather warnings.
Fig. 9. Monitoring the Korea METOC community
This use case established the feasibility of COI managers relying on the tool to develop realistic communities of interest and to support all aspects of their lifecycle.
6 Conclusions

The key results of our research include a deepened understanding of the community of interest lifecycle, the definition of requirements for tools supporting the COI lifecycle and a description of the COI lifecycle dataflow. Additionally, we demonstrated how the consistent use of ontologies in the COI supporting tools can add significant flexibility and representational richness to the process. Our current work with AFRL concentrates on a policy-controlled federation mechanism for new tactical service-oriented versions of the IMS, aimed at supporting even more diverse and dynamically created communities (such as coalitions).
Even though the system was evaluated in a military application, the lessons learned and the tools developed can be used in business, academic, or research domains in order to create a common vocabulary/ontology allowing for integration or the creation of alliances between people and enterprises. We believe that the system can also be useful for sharing information in scientific observations and experiments.
Acknowledgements

We are grateful for the contributions and help of Robert Hillman, Asher Sinclair, Niranjan Suri, James Lott and Maggie Breedy. This research has been funded by Air Force Research Laboratory projects (references FA8750-06-2-0065 and FA8750-07-2-0174).
References

[1] Baader, F., et al. (eds.): The Description Logic Handbook. Cambridge University Press, Cambridge (2003)
[2] Cañas, A.J., Hill, G., Carff, R., Suri, N., Lott, J., Eskridge, T., et al.: CmapTools: A Knowledge Modeling and Sharing Environment. In: Cañas, A.J., Novak, J.D., González, F.M. (eds.) Concept Maps: Theory, Methodology, Technology. Proceedings of the First International Conference on Concept Mapping, vol. I, pp. 125–133. Universidad Pública de Navarra, Pamplona (2004)
[3] DoD Chief Information Officer, DoD Net-Centric Data Strategy (March 2003), http://www.defenselink.mil/nii/org/cio/doc/Net-Centric-Data-Strategy-2003-05-092.pdf
[4] Eskridge, T., Hayes, P., et al.: Formalizing the Informal: A confluence of concept mapping and the semantic web. In: Cañas, A.J., Novak, J.D. (eds.) Concept Maps: Theory, Methodology, Technology. Proceedings of the Second International Conference on Concept Mapping, vol. 1. Universidad de Costa Rica, San Jose (2006)
[5] Fortuna, B., Grobelnik, M., Mladenic, D.: OntoGen: Semi-automatic Ontology Editor. In: Smith, M.J., Salvendy, G. (eds.) HCII 2007. LNCS, vol. 4558, pp. 309–318. Springer, Heidelberg (2007)
[6] Infospherics Web Site, http://www.infospherics.org
[7] Grant, R., Combs, C., Hanna, J., Lipa, B., Reilly, J.: Phoenix: SOA based information management services. In: Proceedings of the 2009 SPIE Defense Transformation and Net-Centric Systems Conference, Orlando, FL (2009)
[8] Hayes, P., Eskridge, T., et al.: Collaborative knowledge capture in ontologies. In: Proceedings of the 3rd International Conference on Knowledge Capture, Banff, Alberta, Canada. ACM Press, New York (2005)
[9] Sarker, B., Wallace, P., Gill, W.: Some Observations on Mind Map and Ontology Building Tools for Knowledge Management. Ubiquity 9(9). ACM Press, New York (2008), http://www.acm.org/ubiquity/volume_9/pf/v9i9_sarker.pdf
[10] Uszok, A., Bradshaw, J., Lott, J., Breedy, M., Bunch, L., Feltovich, P., Johnson, M., Jung, H.: New Developments in Ontology-Based Policy Management: Increasing the Practicality and Comprehensiveness of KAoS. In: Proceedings of the IEEE Workshop on Policy 2008. IEEE Press, Los Alamitos (2008)
[11] Web Ontology Language, http://www.w3.org/TR/owl-features
Incorporating Semantics into an Intelligent Clothes Search System Using Ontology

Ching-I Cheng, Damon Shing-Min Liu, and Li-Ting Chen

Computer Science and Information Engineering, National Chung Cheng University, No. 168 University Rd., Minhsiung, Chiayi, 62102, Taiwan
{chengcy,damon,clt96m}@cs.ccu.edu.tw
Abstract. This research aims to develop a novel ontology structure for incorporating semantics into an intelligent clothes search system. It helps users to choose appropriate attire that fits the desired impression for attending a specific occasion, using what is already in their closets. The ontology structure proposed in the paper takes a single piece of garment as the most specific instance. Hence, pictures of the user's garment items are first fed into the system. An internal image processing module is then invoked to extract attributes, such as the color distribution, from each piece of garment through a series of image processing techniques. Finally, the derived information is attached to the ontology. At runtime, the system follows the requirements, such as a mood keyword, occasion type, and weather status, to search for matched outfits with the help of semantic searching of the ontology.

Keywords: Semantic search, fashion styles, affective state, image processing, database.
1 Background

People in modern society usually use adjectives (e.g., casual, comfort, organic, and so on) to describe the feelings and impressions that a product evokes through their senses. Different sets of adjectives are used in different product domains. [1] proposes a novel approach for automatically transferring mood between color images. It is a new color-conversion method offering users an intuitive, one-click interface for style conversion and a refined histogram-matching method that preserves spatial coherence. Kansei Engineering (Sense Engineering in Japanese) methods have been proposed to build models in which people's emotional responses to design are linked to the product properties [2][3]. Kansei Engineering integrates affective elements into the development process and has been successfully applied to the design of various products. [4] is a study on the Kansei of fashion style based on human sensibility, which proposes the correlation between factors affecting fashion style and the sensory psychology of consumers based on the theory of Kansei Engineering. Moreover, Lin [5] analyzed the cognition of image categories for fashion apparel and proposed five pair-wise image groups that constitute the impression groups recognized by fashion experts. The primary goal of most of these studies is to provide assistance for emotional design in industry. Cheng [6] hence proposed an intelligent clothes search system which allows users to query for a
garment piece using a specific fashion style they predefined. However, when searching for an instance stored in a database, users tend to give a semantic description as input for the search. Yet a database by itself does not contain semantic information, nor is it easy to search in. An ontology is a better mechanism to produce search results that satisfy their needs, as it can act as a mediator between the user and the database, picking out appropriate results from the database using the semantic information supplied by the user. In this paper we propose an ontology that serves as an individual's virtual wardrobe, which can be used for categorizing clothing and for semantic search, bringing the old database to a higher semantic level. An ontology is a representation of a set of concepts within a domain and the relationships between those concepts. It is widely accepted that ontologies can facilitate text understanding and automatic processing of textual resources. [7] addresses the issue of why one would build an ontology and presents a methodology for creating ontologies based on declarative knowledge representation systems. It leverages the two authors' experiences building and maintaining ontologies in a number of ontology environments including Protege-2000, Ontolingua, and Chimaera. It also presents the methodology by example, utilizing a tutorial wines knowledge base. In addition, ontologies allow reasoning in hitherto intractable domains by codifying specific knowledge. [8] presents a new methodology for leveraging the semantic content of ontologies to guide knowledge discovery in databases. Their system scans new databases to obtain type and constraint information, which users verify. The system then uses this information in the context of a shared ontology to intelligently guide the potentially combinatorial process of feature construction. Further, it learns each time it is applied, easing the user's verification task on subsequent runs. Moreover, Arnulph Fuhrmann, Clemens Groß, and Andreas Weber proposed in [9] an explicit ontology for cloth patterns that can be sewn together to form a single piece of garment. Their ontology can be extended to be used on garments and for supervised learning of various properties. Furthermore, with their explicit definition of the geometric information of each cloth pattern and its corresponding discrete properties, their ontology can be used to retrieve garments, to model virtual garments on an abstract layer above the physics-based modeling layer, and also to provide semantic-based collision detection between several layers of garments. Our system takes a piece of clothing as input data, extracts semantic information from it through several image processing techniques, and automatically categorizes it into the ontology. Each piece of clothing is put into an appropriate class of the ontology (e.g., tops, skirts, pants…) with properties tagged to it. After having created a virtual wardrobe, the user can search for matched outfits that fit the user's mood or other requirements (e.g., 'I need a casual outfit, best made of cotton, for a slumber party at a friend's house', or 'I need a bright sporty top that is not too tight for a sunny day outdoors', or 'I need a warm, casual, modern garment, best made of wool, for a cold day indoors', and so on). We then do a semantic search and mix-and-match outfits by analyzing the information extracted through image processing for each piece of clothing and its relationships with other pieces.
Obviously, this is something the old databases lack: there is no semantic information, nor is the data categorized, which makes searching and mixing-and-matching much more inconvenient. On the basis of this ontology of a virtual wardrobe, we can manage the process of mixing-and-matching outfits according to the user's requirements on a higher semantic level.
Fig. 1. Directed acyclic graph (DAG) of the proposed ontology structure
Fig. 2. The overall process of performing a semantic search
2 Ontology Structure for Personal Digital Wardrobe

The Intelligent Clothes Search System aims at providing general users with a friendly interface to search for apposite garments for attending a specific occasion as well as matching the desired image. The ontology for the system is mainly used by women to find their personal clothing in their own virtual wardrobe using semantic information. Thus the ontology takes a single piece of garment (e.g., a T-shirt) as the most specific instance. As shown in Fig. 1, there are five major classes in the ontology, namely Fabric, Clothing, Image, Occasion and Color, indicating the five major concepts in the ontology of the personal digital wardrobe. The Clothing class consists of three discrete subclasses: Top, Skirt and Trouser. The class Top is divided into the subclasses sleeveless_Top and sleeved_Top. This gives a better categorization of the garments, as the instances of each subclass have different properties attached to them (e.g., type of neckline and length of sleeves) and participate in different relations with other instances of the ontology. There is only one numerical attribute, hasNeckline, attached to the class sleeveless_Top, but there are two numerical attributes, hasNeckline and hasSleeve, attached to the class sleeved_Top. hasNeckline is used to store the type of
neckline (e.g., squared, rounded, or V-shaped) and hasSleeve is used to store the length of sleeves. hasShape and hasLength are two numerical attributes in the class Skirt, indicating the shape and length of a skirt, whereas hasFit and hasLength are two numerical attributes in the class Trouser, indicating the fit and length of a pair of trousers. Generally, many adjectives are used to describe the mood a garment shows, and some adjectives describe a similar mood. Hence, according to an analysis and conclusions given by fashion experts, we categorized clothing into six basic images to describe the feelings of a garment: Modest, Sophisticated, Vivid, Feminine, Masculine, and Casual. Each of them consists of several content concepts collected by fashion experts. For example, clean, calm, fresh, and simple are content concepts of the Modest image. Each piece of clothing has a relation isShowing pointing to the class Image, whose value is one of the six defined image categories, giving each garment its own mood information. The process of a semantic search is as follows. First, users input their query. We derive semantic information from the input and expand its semantic meaning. Then we do semantic searches through the ontology, which consults the database, using rules that match the semantic meaning of the user's input. Since searches over the expanded semantic meaning may return the same pieces of garments several times, we remove duplicates. After the final results are generated, we sort the results by order of relevance and then output pictures of all the garments that meet the user's requirements. Fig. 2 illustrates the process flow. Here is an example to illustrate the process of a semantic search. The user wants to find warm clothes for a cold day. We take the semantic information 'warm' derived from the query and return clothes that are made of wool or leather, as well as clothes that are long-sleeved, by examining their property hasSleeve. We also return pants that are long in length, but perhaps not skirts, which tend to be short and thus not warm enough. In addition, the semantic search consists of a few different types of searches: generalized search, specialized search, sibling search and cousin search. Generalized search finds the superclass(es) of the given input and does a search in the superclass(es), as a class can inherit from multiple superclasses in the ontology. Specialized search finds all subclasses of the given input and does a search through all the subclasses. For example, if we were to find a cyan blue top, we would have to do a search in both subclasses of Top, namely sleeveless_Top and sleeved_Top, to return all the cyan blue tops that can be found in the virtual wardrobe. Sibling search finds all classes of the parent(s) of the given input (note that a class can have multiple parents) and does a search through all the subclasses of the parent(s) found. Cousin search searches all the classes that are not directly under the same parent but whose parents share a common parent. For example, if we were to find a feminine garment, we would have to do a cousin search to find tops (either sleeveless or sleeved) and skirts that have properties (e.g., pattern = floral, fabric != leather) that meet the feminine requirement. Depending on the input and the rules of semantic searching, we could use some or all of the searches mentioned above.
We return the set of all results generated from every search used in the query, using pictures of garments.
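To make the four expansion strategies concrete, the following is a minimal illustrative sketch and not the authors' implementation: the hierarchy fragment, the function names and the use of plain Python dictionaries in place of the ontology are all assumptions.

```python
# Illustrative sketch: the four expansion strategies over a simple
# multiple-inheritance class hierarchy (child -> set of parents).
from collections import defaultdict

PARENTS = {
    "sleeveless_Top": {"Top"},
    "sleeved_Top": {"Top"},
    "Top": {"Clothing"},
    "Skirt": {"Clothing"},
    "Trouser": {"Clothing"},
}

CHILDREN = defaultdict(set)
for child, parents in PARENTS.items():
    for p in parents:
        CHILDREN[p].add(child)

def generalized(cls):
    """Superclasses of the input class (a class may have several parents)."""
    return set(PARENTS.get(cls, set()))

def specialized(cls):
    """All subclasses of the input class, found by walking down the hierarchy."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for sub in CHILDREN[current]:
            if sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found

def sibling(cls):
    """Other subclasses of the same parent(s)."""
    return {s for p in generalized(cls) for s in CHILDREN[p]} - {cls}

def cousin(cls):
    """Classes under a different parent that shares a grandparent with cls."""
    grandparents = {g for p in generalized(cls) for g in generalized(p)}
    uncles = {u for g in grandparents for u in CHILDREN[g]} - generalized(cls)
    return {c for u in uncles for c in CHILDREN[u]} - {cls}

# A query on sleeveless tops also searches sleeved tops (sibling search)
# and any subclasses of Skirt/Trouser (cousin search), before duplicates
# are removed and the results are ranked.
print(sibling("sleeveless_Top"))   # {'sleeved_Top'}
print(cousin("sleeveless_Top"))    # subclasses of Skirt/Trouser, if any
```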
3 Attributes Extraction

To represent a garment clearly and reuse it in the system, we need to find some specific garment attributes. Each attribute in the ontology has at least a name and a value, and is used to store information specific to the individual it is attached to. Additionally, attributes in the ontology are also a way of describing a kind of relationship between individuals. In our ontology, there are five attributes attached to the individual Clothing: isFitfor, isShowing, hasColor, hasFabric, and isWarm. isFitfor points to the class Occasion, indicating that a piece of clothing fits one of the defined occasion categories; isShowing points to the class Image, indicating that a piece of clothing shows one of the defined image categories; hasColor points to the class Color, indicating that a piece of clothing has one of the defined colors as its major color; hasFabric points to the class Fabric, indicating that a piece of clothing has one of the defined fabric categories as its major fabric; isWarm is an integer indicating the degree of warmth, obtained using a transform function based on color psychology.

The values of isFitfor and hasFabric for each piece of clothing are given by the user when building the personal digital wardrobe. The value of hasColor is obtained using the color tone classification module constructed in [10]. After a user inputs a garment photo, the system applies a color histogram technique to analyze the major color. The major color is then separated into three elements, red (R), green (G), and blue (B), as inputs for the color tone classification module. In addition, the clothing image refers to the feeling a user senses while looking at a piece of clothing, which is usually affected by its shape, pattern, color, and fabric material. These four physical properties of textiles can be extended, in clothing psychology, to the sensations of sexiness, fanciness, coldness, and hardness, respectively. Hence, a garment image cognition model based on the four sensations is proposed in [6] and exploited in the system to automatically assign a piece of clothing to a proper image class. In the process, the garment is first analyzed to obtain its physical features (e.g., major color). The corresponding sensation features (e.g., coldness) are then derived through transformation functions and fed to the cognition model to classify the clothing item into one of the six image classes.

Moreover, in order to automatically classify garments into three types, i.e., tops, skirts and pants, using a single photograph of a garment, the novel template-based classification method proposed in [11] is exploited. It is a shape matching technique, using one template for each garment type, to recognize the type of an input garment. Since the garment types differ largely in shape, comparing the contour shapes of the input garment and the templates is sufficient for recognition. Besides, [11] also provides three types of neckline templates (rounded neckline, V-neckline and square neckline) and a neckline classification method to locally recognize the neckline shape of an input garment; the value of hasNeckline is thus obtained using this method. Further, a set of predefined feature points corresponding to human anatomy (e.g., points Armpit_left and Shoulder_left are two prescribed feature points of tops; points waist_point_left and hemline_left are two prescribed feature points of skirts; points leftPantLeg_left and leftPantLeg_right are two prescribed feature points of trousers) is obtained after performing garment type classification. The values of hasSleeve and hasLength in the classes Top, Skirt, and Trouser are then calculated using those feature points.
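As an illustration of the colour analysis step only (the actual colour tone classification module is the one of [10] and the garment type classifier the one of [11]), the sketch below computes a coarse major colour of a garment photo and splits it into R, G, B; the file name, the bin size and the use of Pillow are assumptions.

```python
# Illustrative sketch of the major-colour step; not the system's module.
from PIL import Image

def major_color(path, levels=8):
    """Return the dominant (R, G, B) of a garment photo via a coarse histogram."""
    img = Image.open(path).convert("RGB")
    step = 256 // levels
    histogram = {}
    for r, g, b in img.getdata():
        bin_ = (r // step, g // step, b // step)     # quantise to reduce noise
        histogram[bin_] = histogram.get(bin_, 0) + 1
    (qr, qg, qb), _ = max(histogram.items(), key=lambda kv: kv[1])
    # Centre of the winning bin, as separate R, G, B inputs for the
    # colour tone classification module.
    return (qr * step + step // 2, qg * step + step // 2, qb * step + step // 2)

r, g, b = major_color("garment.jpg")   # assumed input photo
print(r, g, b)
```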
4 Conclusion

We proposed an ontology structure for the Intelligent Clothes Search System. The ontology is beneficial in many ways: it helps categorize the garments in the database in a class hierarchy supporting multiple inheritance, draws relations between instances of the database, and allows properties to be assigned to instances and classes, which makes it easy to find garments that fit the user's specific requirements. Moreover, the ontology acts as a mediating mechanism between the user and the database, during both runtime and the preprocessing stage. During preprocessing, all a user has to do is input pictures of garments to be added to their digital wardrobe. An internal image processing module and a couple of cognition models are then invoked to extract numerical and semantic information. A user can query for garments with restrictions such as garments of a certain impression. The system then performs semantic searches through the database, picks out garments that fit the semantic meaning of the given input, and shows the pictures of the results found. In conclusion, the system gains knowledge from visual images and gives people the capability to digitalize their closets. With the help of the ontology, it is able to automatically find suitable garments from the semantic requirements submitted to the system.
References 1. Yang, C.K., Peng, L.K.: Automatic Mood-Transferring Between Color Images. IEEE Computer Graphics and Applications 28(2), 52–61 (2005) 2. Nagamachi, M.: Kansei Engineering: The Framework and Methods. Kansei Engineering 1 (1997) 3. Nagamachi, M.: Kansei Engineering: A New Ergonomic Consumer-oriented Technology for Product development. International Journal of Industrial Ergonomics 15(1), 3–11 (1995) 4. Liu, G., Jiang, Y.: Study on Kansei of Fashion Style Based on Human Sensibility. Journal of Textile Research 28(11), 101–105 (2007) 5. Lin, C.H.: Explore the Differences of Cognitive Between Fashion Style and Image from Consumer Standpoint. Journal of Tainan Technology University 22, 1–4 (2003) 6. Cheng, C.I., Liu, D.S.M.: An Intelligent Clothes Search System Based on Fashion Styles. In: Proc. International Conference on Machine Learning and Cybernetics, Kunming, China, pp. 1592–1597 (2008) 7. Noy, N.F., McGuinness, D.L.: Ontology Development 101: A Guide to Creating Your First Ontology. Stanford Knowledge Systems Laboratory Technical Report KSL-01-05 and Stanford Medical Informatics Technical Report SMI-2001-0880 (2001) 8. Phillips, J., Buchanan, B.G.: Ontology-guided Knowledge Discovery in Databases. In: Proceedings of the First International Conference on Knowledge Capture, pp. 123–130 (2001) 9. Fuhrmann, A., Groß, C., Weber, A.: Ontologies for Virtual Garments. In: Workshop Towards Semantic Virtual Environments (SVE 2005), pp. 101–109 (2005) 10. Cheng, C.I., Liu, D.S.M.: Discovering Dressing Knowledge for an Intelligent Dressing Advising System. In: Proc. 4th International Conference on Fuzzy Systems and Knowledge Discovery, Haikou, China, vol. 4, pp. 339–343 (2007) 11. Chen, L.T.: Automatic Garment Type recognition and Feature detection. Master Thesis, Dept. Computer Science and Information Engineering, National Chung Cheng Univ., Chiayi, Taiwan (2009)
SPPODL: Semantic Peer Profile Based on Ontology and Description Logic Younes Djaghloul1 and Zizette Boufaida2 1
Centre de Recherche Public Henri Tudor, Luxembourg [email protected] 2 LIRE laboratory, Computer Science Department, Mentouri University, Constantine, Algeria [email protected]
Abstract. The main purpose of this work is to propose a semantic and formal profile for the peer, called SPPODL. We present an approach that relies on a formal ontology based on description logic and the OWL language in order to create this rich profile. The creation of this ontology follows a clear and complete process that guarantees the quality of the final result. SPPODL can be helpful in many domains, such as social networks, interoperability between heterogeneous peer platforms and data integration in P2P environments. Keywords: Peer, Profile, Semantic web, Ontology.
1 Introduction

There has been a lot of interest in P2P technology. This is due to the nature of the P2P model, which has several advantages such as self-organization, load balancing, adaptation, and fault tolerance. A P2P system is characterized by the autonomy of each peer and a high degree of decentralization. Each peer acts as a client and as a server at the same time, so it demands and provides services without following the strict topology of the client/server model. Currently, P2P systems are used in several domains such as distributed computing, distributed storage, communication and file sharing. In this area, many systems and applications have been proposed. One can classify them into two main categories: structured and unstructured approaches. In structured P2P systems, Distributed Hash Tables (DHTs) play a key role in determining exactly the location of a peer and its resources. Many systems belong to this category, such as Chord [17] and CAN [16]. In the unstructured P2P approach, all peers have an equal role and there is no special peer to control the execution of queries or to maintain a global repository over the other peers. One says that the peers are loosely controlled. This characteristic allows a highly dynamic behavior of the system and high scalability. Gnutella [9][13], KaZaa [12] and Freenet [5] belong to this type.
If a peer joins or leaves the network, there are no additional tasks to perform. However, this approach needs more network bandwidth, because each peer must keep a permanent network connection with its neighbors. In order to solve this problem, some techniques have been proposed [18] and the notion of super-peer has been introduced [19]. One remarks that several P2P systems have been proposed, each based on a specific platform and appropriate techniques. In these systems, a peer is represented with a specific structure that is influenced by the technical aspects of the node. The proposed profiles depend on technical aspects, and the lack of semantics makes it more difficult to deal with these systems. In our work, we aim to provide a rich and complete peer profile based on a clear description and a standard formalism. In this case, the Semantic Web [3] is considered an interesting solution to describe the profile: it permits giving a clear and formal semantics to web resources, since it is based on a formal and logical foundation, namely description logics [2][11]. The Semantic Web and its language OWL [6] provide a good solution to semantic problems and ontology representation. The proposed profile is an ontology based on a formal description and the widely used language OWL. The use of the OWL language to describe the profile permits exporting the P2P system itself as a Semantic Web resource; the structure can then be used by other systems with strong semantics. The remainder of this paper is organized as follows: Section 2 gives some important related works in the domain of the Semantic Web and P2P profiles. The third section presents the proposed ontology and follows its development process. The paper ends with a conclusion.
2 Related Works

In this section we discuss existing research work in the context of peer profiles and the use of the Semantic Web in the P2P environment. Various research projects address the use of semantics in P2P systems to describe the peer profile. In [15] a peer profile is proposed that can help with some issues, such as security, resource aggregation and group management. Edutella [14] uses the Sun JXTA platform to exchange learning resources. P-Grid [1] provides a strong self-organization service in a highly decentralized system; it is based on a virtual distributed search tree. SWAP [8] combines P2P and the Semantic Web in order to improve the proposed services. In SWAP, two RDFS classes are proposed to describe the structure of the peer. RDFPeers [4] builds the Multi-Attribute Addressable Network (MAAN), which extends Chord. When analyzing these P2P systems, one remarks that several architectures and techniques are proposed to ensure the lookup service and the organization task. For the lookup service, techniques such as DHTs (in structured P2P) provide identifications for peers and resources. However, this identification differs according to the system used; one may have the same peer with the same resources but a different identification. For example, a peer p1 can be identified in P-Grid with ID = “pgrids://01001101”
and the same peer can be referenced in JXTA with ID="jxta:uuid-150325033CD144F82DED74E". Furthermore, a peer must belong to a unique P2P system in order to be referenced, so a peer cannot keep its identity between P2P systems because of the characteristics of each system. In addition, each system uses a specific platform and peers are tied to the specific characteristics of their platform. Furthermore, one can point out other problems and limitations of the current propositions: the absence of a clear and formal presentation of the peer structure, the impossibility of navigating between different peer platforms, the heterogeneity of the techniques used in the different platforms, and the loss of the historical activity of a peer between the different sessions of the user. In order to enhance the capabilities of a P2P system, and to resolve some of these problems, we propose a new P2P design with the following objectives:
• The proposition of a formal peer profile.
• The separation between the technical aspects of a peer and the logical ones.
• The proposition of a unique identification of a peer even when different platforms are used.
• The possibility for the peer to navigate between P2P systems.
• The use of the ontology paradigm (based on description logic and the OWL DL language). This permits exporting the peer structure as a Semantic Web resource, in order to unify the notation and enhance the interoperability between the different systems.
In our previous work [7], we provided a peer profile used in the case of peer platform integration; a minimal sketch of the logical/platform separation is given below. The present work enhances that peer profile with more concepts and relations in order to cover more aspects of the peer environment and resource sharing. In the next section we build SPPODL by following a specific and rigorous ontology development process.
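The sketch below illustrates the intended separation between the stable logical identity and the platform-specific identifiers; the Python representation and field names are illustrative assumptions, and only the two example IDs come from the text above.

```python
# Minimal sketch of a logical peer profile keeping one stable PeerID while
# remembering how each platform names the same peer.
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class PeerLogicalProfile:
    peer_id: str                     # stable PeerID kept across sessions/platforms
    is_super_peer: bool = False
    platform_ids: Dict[str, str] = field(default_factory=dict)

    def register_platform(self, platform: str, native_id: str) -> None:
        """Remember how the same peer is identified inside a given platform."""
        self.platform_ids[platform] = native_id

p1 = PeerLogicalProfile(peer_id="sppodl:peer/p1")   # assumed logical identifier
p1.register_platform("P-Grid", "pgrids://01001101")
p1.register_platform("JXTA", "jxta:uuid-150325033CD144F82DED74E")
print(p1.platform_ids["JXTA"])
```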
3 Build SPPODL

We consider the peer profile as an ontology. In order to build the profile, we use an appropriate process for ontology development. We think that using a clear development process is very important to guarantee the quality of the ontology at the end. In our case, we use an Ontology Development Process for the Semantic Web [10]. The main interest of this process is that it is complete and clear. We follow five phases: needs specification, conceptualization, formalization, codification and finally verification. In the following we apply the process step by step, and before applying each step we give a short explanation.

3.1 The Conceptualization Step

The goal of this phase is to provide a conceptual ontology. The latter is based on intermediate representations, which are: the glossary of terms, the concept taxonomies, the binary relation diagram, the dictionary of concepts and the table of binary relations.
3.1.1 The Glossary of Terms

This glossary contains the most important terms used in the context of both the peer profile area and resource descriptions in the web environment. In the following, we give some important terms with brief descriptions.
• PeerGroup is a group of peers.
• PeerCommunity allows the peers to be organized into groups of interest. The main goal of this organization is to facilitate the search and the creation of social networks.
• PeerComID represents an identification of the community.
• PeerComDesignation is the designation that helps the user to choose a community.
• PeerComDomain indicates the domain and the global interest of the peer community. It plays an important role in facilitating the search process.
• UpLoad and DownLoad indicate a connection debit.
• PeerUpBandwidth is the upload debit for a peer.
• PeerDownBandwidth is the download debit for a peer.
• PeerCluster is another way to organize the peer system, but not with the same goal as the PeerCommunity. Whereas the PeerCommunity aims to organize the peers according to their interests, the main idea of the PeerCluster is to split the peer community into small groups. This division reduces the number of queries between the peers. Therefore, as presented in Fig. 3, only the SupperPeer sends and receives queries between clusters. The queries of the other peers of the same cluster must pass through the SupperPeer.
• NumberOfRedundancy determines the number of super peers in the cluster, which ensures redundancy if a super peer is not available.
• PeerProfile represents a set of information about the profile of the peer.
• PeerLogicalProfile represents a set of information about the logical aspects of the peer.
• TheOnlineState indicates the online state of the peer.
• IsSupperPeer indicates whether the current peer is considered a super peer in its cluster or not. If it is, it will be used as a proxy and all the other peers in the same cluster will pass their queries through it.
• PeerID guarantees the identity of the peer between its different sessions in the same platform or in different ones.
• PeerTrust is used to store a trust value for this peer.
• PeerPlatformProfile represents a set of information about the profile of the peer in the platform in use. Unlike the PeerLogicalProfile, this profile changes according to the platform in use and the current session.
• PeerPlatformID represents the identification of the platform, for example P-Grid 1.0, JXTA 1.2, LimeWire 5.5.
• PeerTypeMachine indicates the type of the machine. It can be a PC, Mac, Pocket PC, a smart phone…
• PeerQuery represents the queries exchanged between peers.
• QueryID indicates the identification of the query.
• PeerQueryDescription is the description of the query.
• QueryType indicates the type of a query. The latter can be a request for a resource, an insert, a connect…
• Term represents terms that will be used in a query. Each term has its Term_ID, Term_Type and Term_Domain.
• PeerRessource describes the resources provided by the peer. This concept is a complex one; its taxonomy is shown in Fig. 2.
• Document is a document resource like a book, an article… It has a set of properties: Document_Title, Page_Number, Document_Editor, Publication_Date.
• Graphic is a resource that has graphic characteristics.
• Media_Resource is a multimedia resource. It can be an Audio_Resource, an Animation_Resource…
• Audio_Resource is a Media_Resource. It has a set of properties: Artiste, Compositor, Sound_Quality (Dolby, Stereo, Mono…).
• Video_Resource is a video composed of an Animation_Resource and an Audio_Resource.
• Film is a video resource with some properties: Film_Studio, Director and Film_Genre.
• Software_Application can be any software that the peer shares (Media_Soft, Development_Soft, Security_Soft…).
• File represents all resources that are files. A file resource is defined by CreatedBy, which indicates the creator, FileSize, Creation_Date and FileType.
• Torrent is a type of file widely used in Internet and P2P contexts.
• Compressed_File is a file compressed by a specific tool.
• PeerActivities saves all the activities done by the peer.
• ActivityDateTime indicates the time and date of an activity execution.
• ActivityPeerRessource indicates the resource used in this activity.
• SupperPeer is a peer that plays the role of a proxy for the others in the cluster.
• SimplePeer is a peer that does not have the privileges of a SupperPeer.
• Trust_Mechanism represents the mechanism used to calculate the trust of a peer. It can be Trust, Reputation or Satisfaction.
• Media_Codec represents the type of media codec used in a Media_Resource.
• Audio_Codec is the Media_Codec used in audio resources. It can be MP3, Wav, MIDI… etc.
• Video_Codec is the Media_Codec for video resources.
3.1.2 Taxonomies of Concepts

The taxonomies of concepts organize the concepts of the domain and propose suitable hierarchies of the knowledge. Figure 1 presents the classification relations between concepts. This classification is a very important semantic relation that will be used when reasoning on the ontology.
Fig. 1. Taxonomies of concepts
For example, one can see that SimplePeer is a subclass of Peer and Reputation is a subclass of Trust_Mechanism, and similarly for the other concepts. Because the PeerResource concept is a complex one, we present its taxonomy separately in Figure 2; a small sketch of how such subclass links can be encoded follows the figure.
Fig. 2. Taxonomy of the concept PeerResource
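For illustration, a few of the classification links of Fig. 1 can be encoded as follows; this is only a sketch using the rdflib library, and the namespace URI is an assumption.

```python
# Sketch: encoding some subclass relations of the taxonomy with rdflib.
from rdflib import Graph, Namespace, RDF, RDFS, OWL

SP = Namespace("http://example.org/sppodl#")   # assumed namespace
g = Graph()
g.bind("sp", SP)

for cls in ("Peer", "SimplePeer", "SupperPeer", "Trust_Mechanism", "Reputation"):
    g.add((SP[cls], RDF.type, OWL.Class))

# Classification relations: SimplePeer and SupperPeer are kinds of Peer,
# Reputation is a kind of Trust_Mechanism.
g.add((SP.SimplePeer, RDFS.subClassOf, SP.Peer))
g.add((SP.SupperPeer, RDFS.subClassOf, SP.Peer))
g.add((SP.Reputation, RDFS.subClassOf, SP.Trust_Mechanism))

print(g.serialize(format="turtle"))
```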
3.1.3 Binary Relations between Concepts

The binary relations link two concepts together. Two types of relation are used: a) the subclass relation, and b) specific relations. For example, between SimplePeer and SupperPeer there is the HasSupper relation, which indicates that a simple peer has a super peer in its cluster. Figure 3 presents the binary relations diagram of the proposed profile; a sketch of how one such relation can be declared is given after the figure.
Fig. 3. Binary relations diagram
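The sketch below shows how one such specific relation can be declared, again with rdflib and an assumed namespace URI; the domain and range follow the HasSupper example above.

```python
# Sketch only: declaring the HasSupper relation between SimplePeer and
# SupperPeer as an OWL object property.
from rdflib import Graph, Namespace, RDF, RDFS, OWL

SP = Namespace("http://example.org/sppodl#")   # assumed namespace
g = Graph()
g.bind("sp", SP)

g.add((SP.SimplePeer, RDF.type, OWL.Class))
g.add((SP.SupperPeer, RDF.type, OWL.Class))
g.add((SP.HasSupper, RDF.type, OWL.ObjectProperty))
g.add((SP.HasSupper, RDFS.domain, SP.SimplePeer))   # a simple peer...
g.add((SP.HasSupper, RDFS.range, SP.SupperPeer))    # ...has a super peer

print(g.serialize(format="turtle"))
```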
3.1.4 Dictionary of Concepts

The dictionary of concepts contains all the concepts of the profile with their attributes. In this step, concepts are considered as classes with their own attributes.

Table 1. Dictionary of concepts (concept: attributes)
PeerGroup: Number_Of_Peer
PeerCommunity: PeerComID, PeerComdesignation, PeerComDomain, Description
PeerCluster: ClusterID, NumberOfPeers, NumberOfRedundancy, ClusterUpBandwith_Value, ClusterDownBandwith_Value
PeerProfile: PeerLogicalProfile, PeerPlatformeProfile
PeerLogicalProfile: PeerID, OnlineState, PlatformInUse, PeerTrust, IsSupperPeer
PeerPlatformProfile: PeerPlatformID, PeerUpBandwith, PeerDownBandwith_Value, PeerTypeMachine_Value
Term: Term_ID, Term_Domain, Term_Type
PeerResource: PeerResourceID, PeerResourceType, PeerResourceURI
PeerQuery: QueryID, PeerQueryDescription, QueryType, Query_Termes, Query_Domain
Search_Query, Management_Query, PeerActivities: Qeury_Code, QueryUsed, ActivityDateTime, PeerActivityStatus
Media_Codec: Codec_Description, Codec_Designation, Codec_Type
Audio_Codec: Audio_Bitrate
Video_Codec: Video_Biterate
Document: Document_Tile, Document_Type, Document_Editor, Page_Number, Publication_Date
Book, Article, Magazine: Book_Author, Article_Author, Volume_Number
Graphic: Color_Number, Graphic_Resolution, Graphic_Type
Media_Resource: Media_Duration, Media_Type
Audio_Resource: Artiste, Compositor, Sound_Quality
Video_Resource: Video_Genre
Film: Film_Studio, Film_Director, Film_Genre
TV_Serie: Serie_Season, Serie_Episode, Setie_Genre
TV_Show: TV_Presenter
File: FileName, CreatedBy, Size, FileType, Creation_Date
3.1.5 Binary Relations Table

This table enhances the binary relations diagram (Figure 3) with the cardinality of the source (CS), the cardinality of the target (CT) and the name of the inverse relation.

Table 2. Binary relations table
Relation BelongsToCommunity
Source Concept PeerCluster
CS
CT
Inverse Relation
1,1
Target Concept PeerCommunity
0,N
ContainsCluster
SimplePeer Peer PeerCluster
1,1 0,1 1,N
SupperPeer PeerCluster SupperPeer
1.N 0,N 1,1
PeerCluster PeerLogical Profile Peer
0,N 1,1
PeerCluster Peer
0,N 1,1
HasChild ContainsPeer IsSupperPeerInCl uster HasNeighbor DescribedLogically
1,N
1,N
Peer Peer Peer
1,N 1,N 1,N
Peer Manage ment_Query File Peer PeerActivities Media_Codec Audio_Codec Video_Codec Peer PeerResource Peer
1,N 0,N
PeerPlatform Profile PeerActivity PeerQuery Transformed_ Query PeerQuery PeerResource
Initial_User_ Query
0,N 0,N 1,1 1,1 1,1 1.1 0,N 1,N 1,N 1,1
1,1 1,1 1,1 1,N 0,N
Peer 0,N PeerRessource 1,N PeerQuery 1,N Media_Resource 1,N Audio_Resource 1,N Video_Codec 1,N Peer 0,N File 0,1 Trust_Mechan 0,N ism Transformed_ 1,1 Query
DescribesPeerPlat form IsDoneBy QuerySendedFrom TM_QuerySended From QueryReceivedBy IsPerformed SendsFile Exported DoneInActivity Encoded_By_MC Encoded_By_AC Encoded_By_VC Saves_Resource Trust_Mechanism e_Used_By -
The latter adds more semantics to the relations between concepts and constitutes an important element in the reasoning. Table 2 presents the binary relations table for SPPODL. At the end of this step, one has a conceptual ontology. The next phase formalizes this ontology by using the syntax of description logic.

3.2 Formalization Phase

In this step, we use the syntax of description logic to formalize the conceptual ontology created in the previous phase. This phase consists of: a) the description of concepts, b) the inclusion of concepts, c) the definition of roles and inverse roles.

3.2.1 Description of Concepts

For each concept, one associates a logical expression based on the syntax and grammar of description logic. The following gives a part of this description:

Debit := Upload ∪ DownLoad
DownLoad := ¬ Upload
PeerCommunity := PeerGroup ∩ (∃ PeerComID.String) ∩ (∃ PeerComdesignation.String) ∩ (∃ PeerComDomain.String) ∩ (∃ Description.String) ∩ (≤ 1 ContainsCluster PeerCluster)
PeerCluster := (∃ ClusterID.String) ∩ (∃ NumberOfPeers.Integer) ∩ (∃ NumberOfRedundancy.Integer) ∩ (∃ ClusterUpBandwith.Double) ∩ (∃ ClusterDownBandwith.Double) ∩ (∃ BelongsToCommunity PeerCommunity) ∩ (≤ 1 ContainsPeer Peer) ∩ (≤ 0 HasNeighbor PeerCluster) ∩ (≥ 3 SupperPeerInCluster SupperPeer)
PeerProfile := PeerLogicalProfile ∪ PeerPlatformProfile
PeerLogicalProfile := ¬ PeerPlatformProfile
PeerLogicalProfile := PeerProfile ∩ (∃ PeerID.String) ∩ (∃ OnlineState.Boolean) ∩ (∃ PlatformInUse.String) ∩ (∃ PeerQuality.Integer) ∩ (∃ IsSuperPeer.Boolean) ∩ (∃ DescribesLogically Peer)
PeerPlatformProfile := PeerProfile ∩ (∃ PeerPlatformID.String) ∩ (∃ PeerUpBandwith.Double) ∩ (∃ PeerDawnBandwith.Double) ∩ (∃ PeerTypeMachine.String) ∩ (<1 DescribesPeerPlatform Peer)
PeerActivity := (∃ ActivityDateTime) ∩ (∃ PeerActivityStatues.String) ∩ (∃ QueryUsed PeerQuery)
File := Torrent ∪ Compresed_File ∪ Executable_File ∪ Specifi_File_Type ∩ (∃ CreatedBy.String) ∩ (∃ Size.Double) ∩ (∃ FileType.String) ∩ (<0 FileSendedBy Peer)
SimplePeer := Peer ∩ (∃ HasSupper SupperPeer)
SupperPeer := Peer ∩ (∃ HasChild SimplePeer)
PeerRessource := Document ∪ Media_Soft ∪ SoftwareApplication ∪ Graphic ∩ (∃ PeerRessourceID.String) ∩ (∃ PeerRessourceType.String) ∩ (∃ PeerRessourcesURI.String) ∩ (<1 IsPerformed Management_Query) ∩ (∃ SavedAs File)
Peer := SimplePeer ∪ SupperPeer ∩ (∃ DescribedLogically PeerLogicalProfile) ∩ (<1 DescribedInPlatform PeerPlatformProfile) ∩ (∃ BelongsToCluster PeerCluster) ∩ (<1 PeerSendsQuery PeerQuery) ∩ (<1 PeerSends_TM_Query Transformed_Query)
…
3.2.2 Inclusion of Concepts

In this step we provide the subsumption relations between concepts, using the following syntax: A ⊆ B. The next table gives a part of the inclusion relations between concepts.

Table 3. A part of inclusion relations
PeerCommunity ⊆ PeerGroup
PeerCluster ⊆ PeerGroup
PeerQuery ⊆ ⊤
Search_Query ⊆ PeerQuery
Management_Query ⊆ PeerQuery
User_Query ⊆ Search_Query
Initial_User_Query ⊆ User_Query
Transformed_Query ⊆ User_Query
…
PeerUpBandwith ⊆ Upload
PeerDownBandwith ⊆ Download
PeerActivities ⊆ ⊤
Torrent ⊆ File
Executable_File ⊆ File
Ex_Windows ⊆ Executable_File
…
PeerProfile ⊆ ⊤
PeerLogicalProfile ⊆ PeerProfile
PeerPlatformProfile ⊆ PeerProfile
SupperPeer ⊆ Peer
PeerResource ⊆ ⊤
Document ⊆ PeerResource
Media_Resource ⊆ PeerResource
Video_Ressource ⊆ Media_Resource
Film ⊆ Video_Ressource
…
Software_Application ⊆ PeerResource
Professional_Soft ⊆ Software_Application
Office_Soft ⊆ Professional_Soft
Spreadsheet ⊆ Office_Soft
…
3.2.3 Definition of Roles and Inverse Roles

To define a role, one uses the expression R(C1, C2), where R is the name of the role and C1 and C2 are the names of the concepts.
The inverse relation is introduced in Table 2 as a column of the table. In the following, we present the same information with the appropriate syntax. Table 4. Definition of roles
Definition of roles
BelongToCommunity (PeerCluster, PeerCommunity)
HasSupper (SimplePeer, SupperPeer)
BelongsToCluster (Peer, PeerCluster)
HasNieghbor (PeerCluster, PeerCluster)
DescribesLogically (PeerLogicalProfile, Peer)
QuerySendedFrom (PeerQuery, Peer)
TM_QuerySendedFrom (Transformed_Query, Peer)
PeerReceivesQuery (Peer, PeerQuery)
DescribedInPlatform (Peer, PeerPlatformProfile)
HasActivity (Peer, PeerActivity)
ActOn (Management_Query, PeerRessource)
FileSendedBy (File, Peer)
…
Table 5. Inverse roles
Inverse roles
BelongToCommunity¯ = ContainesCluster
HasSupper¯ = HasChild
BelongsToCluster¯ = ContainsPeer
HasNieghbor¯ = HasNeighbor
DescribesLogically¯ = DescribedLogically
QuerySendedFrom¯ = PeerSendsQuery
PeerReceivesQuery¯ = QueryReceivedBy
QueryUsed¯ = DoneInActivity
DescribedInPlatform¯ = DescribesPeerPlatform
HasActivity¯ = IsDoneBy
ActOn¯ = IsPerformed
FileSendedBy¯ = SendsFile
Encodes_Media¯ = Encoded_By_MC
…

3.3 Codification Phase

In this phase, the OWL language is generated. We used Protégé-OWL to generate the OWL code that corresponds to the ontology encoded with the syntax of description logic. Figure 4 presents a screenshot of Protégé with SPPODL.
Fig. 4. Use of Protégé to codify the ontology
In the following, we describe some examples of the OWL code generated by Protégé. PeerProfile is defined as the union of two concepts, PeerLogicalProfile and PeerPlatformProfile. OWL also manages the data types used in our case; for example, the property ActivityDateTime has the dateTime type. Finally, the generated OWL code defines the DescribedLogically role as the inverse of the role DescribesLogically.
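A sketch of how these three constructs can be expressed with rdflib (serialized to Turtle) is given below; the namespace URI is an assumption, the class and property names are those of SPPODL, and the sketch is not the Protégé-generated code itself.

```python
# Sketch of the three OWL constructs described above, built with rdflib.
from rdflib import Graph, Namespace, BNode, RDF, OWL, XSD
from rdflib.collection import Collection

SP = Namespace("http://example.org/sppodl#")   # assumed namespace
g = Graph()
g.bind("sp", SP)

# 1) PeerProfile is the union of PeerLogicalProfile and PeerPlatformProfile.
union_list = BNode()
Collection(g, union_list, [SP.PeerLogicalProfile, SP.PeerPlatformProfile])
g.add((SP.PeerProfile, RDF.type, OWL.Class))
g.add((SP.PeerProfile, OWL.unionOf, union_list))

# 2) ActivityDateTime is a datatype property with range xsd:dateTime.
g.add((SP.ActivityDateTime, RDF.type, OWL.DatatypeProperty))
g.add((SP.ActivityDateTime, RDF.type, OWL.FunctionalProperty))
g.add((SP.ActivityDateTime, OWL.equivalentProperty, SP.ActivityDateTime)) if False else None
g.remove((SP.ActivityDateTime, OWL.equivalentProperty, SP.ActivityDateTime))
g.add((SP.ActivityDateTime, RDF.type, OWL.DatatypeProperty))
g.add((SP.ActivityDateTime, RDF.value, XSD.dateTime)) if False else None
g.remove((SP.ActivityDateTime, RDF.value, XSD.dateTime))
from rdflib import RDFS
g.add((SP.ActivityDateTime, RDFS.range, XSD.dateTime))

# 3) DescribedLogically is declared as the inverse of DescribesLogically.
g.add((SP.DescribedLogically, RDF.type, OWL.ObjectProperty))
g.add((SP.DescribedLogically, OWL.inverseOf, SP.DescribesLogically))

print(g.serialize(format="turtle"))
```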
3.4 Verification Phase

In this phase, the final ontology is verified by using a logical reasoner; in our case, we used RacerPro 1.9. This step is very important to check for inconsistencies and to validate the proposed ontology. Protégé provides a direct link with this reasoner and permits the verification to be done directly.
4 Conclusion

In this paper we have presented our work on the semantic peer profile. We have proposed SPPODL, a Semantic Peer Profile based on Ontology and Description Logic, which gives a clear description of the peer and provides formal semantics for its structure and its resources. We have indicated that in P2P systems the peer profile plays an important role and that all P2P systems provide their own peer definitions. However, the lack of semantics in the profile causes serious difficulties when dealing with these systems. Our work aims to provide a peer profile that guarantees a rich semantic description of the peer, a clear separation between the technical and the logical aspects, and the use of description logic and the OWL language. In Section 3, we used a rigorous process to build our ontology. We followed the different steps, from the glossary of terms to the ontology encoded in OWL, passing through conceptual tables and diagrams. The use of such a process guarantees the clarity and the quality of the proposed profile. We think that SPPODL will be beneficial in several cases. It will be helpful:
• To facilitate interoperability between heterogeneous peer platforms.
• To participate in data and resource integration in highly distributed P2P environments.
• To guarantee a unique identity of the peer across several systems and networks.
• To provide more semantics on the resources shared by the peer.
Future work includes the integration of SPPODL into a data and resource integration approach based on trust management. The profile will be an important component in this collaborative approach to integrating resources in a P2P environment.
References 1. Aberer, K., Cudré-Mauroux, P., Datta, A., Despotovic, Z., Hauswirth, M., Punceva, M., Schmidt, R.: P-grid: A self-organizing structured p2p system. ACM SIGMOD Record 32(3) (2003)
2. Baader, F., Calvanese, D., McGuinness, D., Nardi, D., Patel-Schneider, P.: The Description Logic Handbook. Cambridge University Press, Cambridge (2002) 3. Berners-Lee, T.: Weaving the Web. Harpur, San Francisco (1999) 4. Cai, M., Frank, M.: RDFPeers: A Scalable Distributed RDF Repository based on A Structured Peer-to-Peer Network. In: International World Wide Web Conference, WWW (2004) 5. Clarke, I., Miller, S.G., Hong, T.W., Sandberg, O., Wiley, B.: Protecting Free Expression Online with Freenet. IEEE Internet Computing 6(1) (January/Feburary 2002) 6. Dean, M., Connolly, D., Van Harmelen, F., Hendler, J., Horrocks, I., L.McGuinness, D., Patel-Schneider, P.F, Stein, L.A.: Web ontology language (OWL) reference version 1.0. W3CWorking Draft (2003), http://www.w3.org/TR/2003/WD-owl-ref-20030331 7. Djaghloul, Y., Boufaida, Z.: Toward Peer to Peer Platform Integration based on OWL Ontology and Roaming Service. In: International Review on Computers and Software (IRECOS), vol. 1(1), pp. 31–42 (2006) ISSN 1828-6003 8. Ehrig, M., Haase, P., Siebes, R., Staab, S., Stuckenschmidt, H., Studer, R., Tempich, C.: The SWAP Data and Metadata Model for Semantics-Based Peer-to-Peer Systems. In: Schillo, M., Klusch, M., Müller, J., Tianfield, H. (eds.) MATES 2003. LNCS (LNAI), vol. 2831, pp. 144–155. Springer, Heidelberg (2003) 9. The gnutella protocol specification, http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf 10. Hemam, M., Boufaida, Z.: An Ontology Development Process for the Semantic Web. In: ACIT 2004 N5, International Arab Conference on Information Technology, pp. 595–601 (2004) 11. Horrocks, I., Sattler, U., Tobies, S.: Practical reasoning for very expressive description logics. J. of the Interest Group in Pure and Applied Logic 8(3), 239–264 (2000) 12. Kazaa home page, http://www.kazaa.com 13. limeWire home page, http://www.limewire.org 14. Nejdl, N., Wolpers, M., Siberski, W., Schmitz, C., Schlosser, M., Brunkhorst, I., Loser, A.: Super-peer-based routing and clustering strategies for rdf-based peer-to-peer networks. In: Proceedings of the Twelfth International World Wide Web Conference (WWW 2003), Budapest, Hungary (2003) 15. Parkhomenko, P., Yugyung, L., Park, E.K.: Ontology-driven peer profiling in peer-to-peer enabled semantic web. In: Conference on Information and Knowledge Management Proceedings of the twelfth international conference on Information and knowledge management, pp. 564–567 (2003) 16. Ratnasamy, S., Francis, P., Handley, M., Karp, R., Shenker, S.: A scalable contentaddressable network. In: Proc. of ACM SIGCOMM ’01, San Diego, CA, USA (August 2001) 17. Stoica, I., Morris, R., Kaashoek, M., Balakrishnan, M.: Chord:A scalable peer-to-peer lookup service for Internet applications. In: Proc. of ACM SIGCOMM ’01, San Diego, CA, USA (August 2001) 18. Yang, B., Garcia-Molina, H.: Improving efficiency of peer-to-peer search. In: Proc. of the 28th Intl. Conf. on Distributed Computing Systems (July 2002) 19. Yang, B., Garcia-Molina, H.: Designing a super peer network. In: 19th International Conference on Data Engineering, ICDE ’03 (2003)
Ontology Based Tracking and Propagation of Provenance Metadata Miroslav Vacura and Vojtěch Svátek Faculty of Informatics and Statistics, University of Economics W. Churchill Sq.4, 130 67 Prague 3, Czech Republic vacuram,[email protected]
Abstract. Tracking the provenance of application data is of key importance in the network environment due to the abundance of heterogeneous and uncontrollable resources. We focus on ontologies as a means of knowledge representation and present a novel approach to the representation of provenance metadata in knowledge bases, relying on an OWL 2 design pattern. We also outline an abstract method for the propagation of provenance metadata during the reasoning process.
1
Introduction
One of the important features of any complex network environment is the multiplicity of information sources. This situation completely changes the information processing paradigm: in a conventional information system, data come from a limited and usually relatively small number of sources. Sources of data are controllable and uncertainty regarding data reliability can be limited. Huge networks like the World Wide Web, on the other hand, consist of an enormous number of different information sources, which are usually completely uncontrollable and whose reliability is usually questionable. If we use WWW data for the purposes of entertainment, this character of WWW data is not a problem, but if we intend to use the WWW for serious business or scientific applications, keeping track of the origins of data becomes necessary. Namely, when working with typical network applications, which usually process data from multiple WWW sources, data provenance is of key importance. However, until now, most web applications lack any data provenance features. Buneman et al. [4,5] define data provenance as "the process of tracking and recording the origins of data and its movement between databases". In the following text we focus on Semantic Web [3] applications and context. However, information about the origin of a piece of data and the process by which it arrived in an information system is not only important in the case of Semantic Web applications. For many other types of applications this information is of critical importance too, as in the case of molecular biology or in cases where legal or ethical issues are associated with the data involved [4].
In the context of the Semantic Web, provenance information can be attached to RDF triples using ad hoc reification. Such solutions, however, make the reuse of this information hard and also do not fit well with the annotation of ontologies themselves. We therefore propose a solution based on a design pattern that uniformly captures provenance information for an ontology as well as for the data (RDF knowledge base) that are based on it. The rest of the paper is organized as follows. Section 2 briefly discusses the state of the art in provenance (and similar metadata) representation in the Semantic Web and presents the ontology pattern for representing provenance metadata, relying on the recent OWL 2 specification. Section 3 then proposes a mechanism for the propagation of provenance metadata in ontologies. Finally, Section 4 summarises the content.
2 Representation of Provenance Data in Ontologies

2.1 State of the Art
There are generally two ways of conceptualising provenance data in ontologies. One way is to include provenance information in the same ontology along with other information. The other way is to use a separate ontology for regular data and a separate ontology for provenance data. This distinction not only holds for representing provenance metadata but also for describing any kind of metadata, including metadata regarding the certainty or relevance of information. The first approach is used for example in the COMM multimedia ontology [1], a comprehensive framework based on the DOLCE foundational ontology and the MPEG-7 standard, which is focused on describing multimedia data. Provenance data can be included in this ontology along with other descriptive information regarding the given multimedia data, and its representation is based on the "Descriptions & Situations" design pattern. The second approach, using a separate ontology for describing metadata, is used for example for describing the relevance of information [6] or in our recent work on the representation of uncertainty [11]. Both approaches can be used, depending on the context and situation. The first approach is, from our point of view, more appropriate when the whole ontology contains information that has the character of metadata. This is the case of the COMM multimedia ontology, where multimedia data are stored in individual files in the file system, while the ontology contains metadata describing the multimedia with different characteristics; provenance data is then just one kind of metadata associated with the multimedia data. On the other hand, there are also situations when the ontology represents the actual data, and it can then be reasonable to use an additional metadata ontology to represent metadata such as the certainty of data, the relevance of data or, in our case, provenance data. We do not see this distinction as obligatory: it is always a design decision of the developers of the ontology how they intend to represent these kinds of information, and this decision depends on the concrete circumstances, domain and context of the ontology being developed.
In this paper we present an instance of the second approach, assuming it is adequate in many cases.

2.2 OWL DL Setting
The formal basis for ontologies is provided by Description Logics (DL) [2]. In DL we understand an ontology as a triple O = < K_R, K_T, K_A >, where K_R is the role box (RBox), K_T is the terminology box (TBox), and K_A is the assertional box (ABox) – see [2] for a detailed description of DL. In reality, an ontology O can be the result of merging several other ontologies with a different or the same level of generality. Such an ontology may, for example, be developed on the basis of some foundational ontology O^F, with two other domain ontologies O^D1 and O^D2 from different sources merged in. Formally,
In such a case it may be important not only to trace provenance data for assertional axioms (ABox), which form the extensional knowledge in ontologies (or knowledge bases and RDF data collections associated to them), but also provenance data for terminology or role axioms (TBox, RBox), which form the intensional knowledge of the ontology. Such provenance data then describe the origin of each individual terminology/role axiom. As we need to be able to assign provenance information to all elements of ontology, that means to RBox axioms, TBox axioms and ABox axioms, we need to be able to ’talk’ about these axioms. This is the well-known problem already investigated e.g. in [10]. Some of known solutions are presented in [12]: 1) it is possible to use an extensive metamodel of base OWL DL ontology that reifies all its axioms, 2) we can include meta information in annotation properties of the base ontology, 3) it is possible to annotate all axioms of the original ontology with an URI, and to refer to this unique identifier in the meta ontology. The first approach introduces an extensive meta ontology structure that can be used for our purpose because it exposes axioms of the base ontology as individuals of the meta ontology, but can be computationally difficult. The second approach is relatively simple but it presents provenance as non-logical information outside the (regular) logical semantics of OWL, and therefore is unusable for our needs. The third approach can be used to easily reify axioms in meta ontology without extensive ontological structures required by first approach. Authors of [12] discourage from using this approach because it requires extension of old OWL 1.0 standard in order to assign URIs to axioms. As such extension in XML based syntax they suggest the approach of SWRL, which allows URI references as an optional element [7]. In this paper we however suggest to overcome the drawback of the third approach by relying on the recent OWL 2 standard which allows every axiom of base ontology O to be annotated with a unique identifier – URI [9].
Fig. 1. Ontology Structure
2.3 Provenance Representation Pattern
A general overview of our approach to provenance metadata representation is depicted in Fig. 1. The first level of the provenance ontology contains individuals representing reified axioms of the base ontology. These individuals are then assigned actual provenance data. The ontology pattern itself is depicted in Fig. 2, and a detailed example of provenance representation is depicted in Fig. 3. We consider a base ontology with three axioms: α1 ∈ K_R, α2 ∈ K_T, and α3 ∈ K_A (named rbox-axiom-alpha1, tbox-axiom-alpha2, and abox-axiom-alpha3 in the diagrams). The description of provenance data is based on the provenance ontology (formally O^P = < K_R^P, K_T^P, K_A^P >), which contains reifications of axioms of the base ontology plus the provenance information.
Fig. 2. Provenance pattern
Fig. 3. Provenance pattern example
The reification level of the provenance ontology consists of the class OntologyAxiom with subclasses RBoxAxiom, TBoxAxiom, and ABoxAxiom. Individuals belonging to these classes are reifications of axioms of the base ontology. We define the reification relation Rpt for TBox axioms as follows. Let a be an individual of the provenance ontology and let α be an axiom of the base ontology. Then Rpt(a, α) iff α is a TBox axiom annotated by a unique identifier URI, and a is an individual belonging to the class TBoxAxiom whose data type property AxiomURI has the value URI. We presuppose that Rpt is functional and injective. Analogously, we define the reification relation Rpr for RBox axioms and the reification relation Rpa for ABox axioms. We then also define the general reification relation Rp = Rpr ∪ Rpt ∪ Rpa. Note that the reification relations are not DL relations defined in the ontology but meta-logical relations; Rp is a relation connecting individuals of the ontology O^P with axioms of the ontology O. Reified axioms are then assigned provenance information using the relation prov-for to individuals of the class ProvenanceAtom.
Note that the relation prov-for is N:N, so a reification of an axiom can be assigned multiple provenance information atoms (i.e., the same axiom was included in multiple original ontologies) and multiple axioms can be assigned a single provenance information atom (an ontology from one source usually has multiple axioms). Each individual of this class has some provenance information defined as its datatype properties; it can be, for example, the property "dc:creator" with value "John". Each provenance atom individual is in the relation prov-type with individuals of the class ProvenanceType. This class is used to define what kind of provenance definition or standard we are using. Our example in Fig. 3 uses the well-known Dublin Core standard. For provenance types we can define the list of attributes that each standard supports through the class ProvenanceAttribute, linked to the class ProvenanceType by the relation prov-attr. This approach enables us to use annotations by various provenance metadata standards in a single ontology, which is an important feature when working with provenance in the heterogeneous area of the World Wide Web.
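The pattern can be illustrated with a small rdflib sketch reproducing part of the example of Fig. 3; the namespace URIs are assumptions, DC is the Dublin Core vocabulary, and the instance names follow the figure.

```python
# Sketch of the provenance pattern: a reified axiom plus one provenance atom.
from rdflib import Graph, Namespace, Literal, RDF
from rdflib.namespace import DC

PROV = Namespace("http://example.org/provenance-pattern#")   # assumed
BASE = Namespace("http://example.org/base-ontology#")        # assumed

g = Graph()

# Reification level: an individual standing for TBox axiom alpha2, carrying
# the URI that annotates that axiom in the base ontology (OWL 2 annotation).
reif = PROV["axiom-reification-2"]
g.add((reif, RDF.type, PROV.TBoxAxiom))
g.add((reif, PROV.AxiomURI, Literal(str(BASE["axiom-alpha2"]))))

# Provenance level: a provenance atom typed as Dublin Core metadata and
# attached to the reified axiom through prov-for.
atom = PROV["prov-atom-1"]
g.add((atom, RDF.type, PROV.ProvenanceAtom))
g.add((atom, PROV["prov-type"], PROV["dublin-core"]))
g.add((atom, DC.creator, Literal("Mirek")))
g.add((atom, PROV["prov-for"], reif))

print(g.serialize(format="turtle"))
```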
3
Propagation of Provenance Metadata in Ontologies
Ontologies do not serve only as a static (meta)information representation tool but also enable the user to infer new knowledge. Inferred knowledge can then enrich the ontology or be used for another purpose. In any case, we consider tracking provenance information for inferred knowledge necessary. Typical DL knowledge inferred from ontologies may include the following [2]:
1. C1 ⊑ C2
2. C1 ≡ C2
3. C1 ∩ C2 = ∅
4. C ⊑ ⊥ (equivalent to the assertion that the concept C is unsatisfiable)
5. C(a) (for some arbitrary concept C and individual a).
These are the most common results of inference tasks for ontologies, which can be performed by most reasoning engines, together with their resulting assertions. It is now necessary to assign these inferred assertions appropriate provenance information. A natural way is to assign provenance metadata to this new knowledge on the basis of the provenance metadata of the knowledge from which it was inferred. We denote by α an axiom that is inferred from the ontology O, therefore O ⊨ α. Kalyanpur et al. [8] denote by JUST(α, O) ⊆ O a fragment of the ontology O such that JUST(α, O) ⊨ α and ∀O' ((O' ⊂ JUST(α, O)) → (O' ⊭ α)). Informally, this set is a justification for the inferred axiom α in the ontology O. Next, ALLJUST(α, O) denotes the set of all justifications for α in O, formally the set {O' ⊆ O ; O' is a justification of α}, and we define OAJ(α) = ⋃ ALLJUST(α, O). This is just a formal step: while ALLJUST(α, O) is a set of sets of axioms, our OAJ(α) is a set of axioms of O (formally OAJ(α) ⊆ O), which is more appropriate for our use. When the axiom α is inferred, it does not have any provenance information assigned. First, it is necessary to annotate the axiom (based on OWL 2) with a new unique URI, so that it can be reified on the first level of the provenance ontology.
Then a new individual a of the class OntologyAxiom (and its respective subclass) is introduced into the provenance ontology as this reification, with the data type property AxiomURI having as its value the URI of the axiom α. Formally, Rp(a, α) now holds, as defined earlier.
Fig. 4. Provenance propagation
Now we can use the set of axioms OAJ(α) and get its respective set of reifications at the first level of the provenance ontology using the relation Rp. Formally, we denote this set of reifications Rp^-1(OAJ(α)) (see Fig. 4). These reifications have some provenance information assigned through the relation prov-for and individual provenance information atoms of the class ProvenanceAtom. We can formally denote the appropriate set of provenance atoms as prov-for^-1(Rp^-1(OAJ(α))). This is the set of provenance atoms that are assigned to all axioms that are justifications for our inferred axiom α, and that is why we assign this provenance information to the inferred axiom. We know that a is the reification of the axiom α, so for every individual x of the set prov-for^-1(Rp^-1(OAJ(α))) we add to the provenance ontology an instance of the relation prov-for(x, a).
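The propagation step can be summarized by the following sketch, which uses plain Python sets and dictionaries in place of the ontology relations; the function and variable names are assumptions that follow the notation of the text (OAJ, Rp, prov-for).

```python
# Sketch of provenance propagation for an inferred axiom alpha.
def propagate_provenance(alpha, all_just, prov_for, reification):
    """
    alpha        : the newly inferred axiom (already annotated with a fresh URI)
    all_just     : ALLJUST(alpha, O) -- a list of justification sets of axioms
    prov_for     : dict  provenance-atom -> set of reified-axiom individuals
    reification  : dict  axiom -> its reification individual (the relation Rp^-1)
    Returns prov_for updated with the atoms propagated to alpha.
    """
    # OAJ(alpha) = union of all justifications
    oaj = set().union(*all_just) if all_just else set()

    # Rp^-1(OAJ(alpha)): the reifications of every justifying axiom
    reified_just = {reification[ax] for ax in oaj}

    # a: a new reification individual for alpha itself
    a = reification.setdefault(alpha, "reif:" + str(alpha))

    # prov-for^-1(Rp^-1(OAJ(alpha))): every atom attached to those reifications
    for atom, targets in prov_for.items():
        if targets & reified_just:
            targets.add(a)          # add prov-for(atom, a)
    return prov_for

# Toy usage with two source axioms, each with its own provenance atom:
prov = {"atom1": {"reif:ax1"}, "atom2": {"reif:ax2"}}
reif = {"ax1": "reif:ax1", "ax2": "reif:ax2"}
print(propagate_provenance("ax3", [{"ax1"}, {"ax2"}], prov, reif))
```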
4
Conclusions
Tracking the provenance of application data as well as of ontology elements is one of the critical aspects of the Semantic Web. We presented an approach to uniformly representing provenance information for data as well as axioms, which relies on a design pattern in OWL. The extended capabilities of the recent OWL 2 version of the language are taken into account. A general method for provenance propagation during ontology-based reasoning is also outlined.
Acknowledgments. This work has been partially supported by the IGA VSE grant 20/08, IGS 4/2010 and by the CSF grant P202/10/0761 (Web Semantization).
References 1. Arndt, R., Troncy, R., Staab, S., Hardman, L., Vacura, M.: COMM: Designing a Well-Founded Multimedia Ontology for the Web. In: Aberer, K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudr´e-Mauroux, P. (eds.) ASWC 2007 and ISWC 2007. LNCS, vol. 4825, pp. 30–43. Springer, Heidelberg (2007) 2. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, Cambridge (2003) 3. Berners-Lee, T., Hendler, J., Lassila, O.: The semantic web. Scientific American (May 2001) 4. Buneman, P., Khanna, S., Tan, W.-C.: Data Provenance: Some Basic Issues. In: Kapoor, S., Prasad, S. (eds.) FST TCS 2000. LNCS, vol. 1974, p. 87. Springer, Heidelberg (2000) 5. Buneman, P., Khanna, S., Tan, W.-C.: Why and Where: A Characterization of Data Provenance. In: Van den Bussche, J., Vianu, V. (eds.) ICDT 2001. LNCS, vol. 1973, p. 316. Springer, Heidelberg (2000) 6. G´ omez-Romero, J., Bobillo, F., Delgado, M.: An Ontology Design Pattern for Representing Relevance in OWL. In: Aberer, K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudr´e-Mauroux, P. (eds.) ASWC 2007 and ISWC 2007. LNCS, vol. 4825, pp. 71–84. Springer, Heidelberg (2007) 7. Horrocks, I., Patel-Schneider, P.F., Boley, H., Tabet, S., Grosof, B., Dean, M.: SWRL: A Semantic Web Rule Language Combining OWL and RuleML. In: W3C Member Submission, World Wide Web Consortium (2004) 8. Kalyanpur, A., Parsia, B., Horridge, M., Sirin, E.: Finding All Justifications of OWL DL Entailments. In: Aberer, K., Choi, K.-S., Noy, N. F., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudr´e-Mauroux, P. (eds.) ASWC 2007 and ISWC 2007. LNCS, vol. 4825, pp. 267–280. Springer, Heidelberg (2007) 9. Motik, B., Patel-Schneider, P.F., Parsia, B.: OWL 2 Web Ontology Language, Structural Specification and Functional-Style Syntax. In: W3C Recommendation, World Wide Web Consortium, October 27 (2009) 10. Staab, S., Maedche, A.: Axioms are objects too: Ontology engineering beyong the modeling of concepts and relations. Research report 399, Institute AIFB, Karlsruhe (2000) 11. Vacura, M., Sv´ atek, V., Smrˇz, P., Simou, N.: A Pattern-based Framework for Representation of Uncertainty in Ontologies. In: Proceedings of Uncertainty Reasoning for the Semantic Web, URSW (2007) 12. Vrandecic, D., V¨ olker, J., Haase, P., Tran, D.T., Cimiano, P.: A Metamodel for Annotations of Ontology Elements in OWL DL. In: Sure, Y., Brockmans, S., Jung, J. (eds.) Proceedings of the 2nd Workshop on Ontologies and Meta-Modeling, Karlsruhe, Germany, October 2006. GI Gesellschaft fur Informatik (2006)
A Real-Time In-Air Signature Biometric Technique Using a Mobile Device Embedding an Accelerometer J. Guerra Casanova, C. Sánchez Ávila, A. de Santos Sierra, G. Bailador del Pozo, and V. Jara Vera Centro de Domótica Integral (CeDInt-UPM) Universidad Politécnica de Madrid Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid {jguerra,csa,alberto,gbailador,vjara}@cedint.upm.es
Abstract. In this article an in-air signature biometric technique is proposed. Users would authenticate themselves by performing a 3-D gesture invented by them holding a mobile device embedding an accelerometer. All the operations involved in the process are carried out inside the mobile device, so no additional devices or connections are needed to accomplish this task. In the article, 34 different users have invented and repeated a 3-D gesture according to the biometric technique proposed. Moreover, three forgers have attempted to falsify each of the original gestures. From all these in-air signatures, an Equal Error Rate of 2.5% has been obtained by fusing the information of gesture accelerations of each axis X-Y-Z at decision level. The authentication process consumes less than two seconds, measured directly in a mobile device, so it can be considered as “real-time”. Keywords: Biometrics, gesture recognition, accelerometer, mobile devices, dynamic time warping, fuzzy logic.
1
Introduction
Nowadays most mobile devices provide access to the Internet, where some operations may require authentication. Looking up the balance of a bank account, buying a product in an online shop or gaining access to a secure site are some actions that may be performed on a mobile phone and may require authentication. In this mobile context, biometrics promises to rise again as a method to ensure identities. Some works trying to bring classical biometric techniques to a mobile scenario have already been developed, based on iris recognition [1], face [2], voice recognition [3] or multimodal approaches [4]. In this article, a new mobile biometric technique is proposed. This technique is based on performing a 3-D gesture while holding a mobile device embedding an accelerometer [5]. This gesture is considered as an in-air biometric signature, with information in the X-Y-Z axes. This biometric technique may be considered a combination of behavioral and physical techniques, since the repetition of a gesture in space depends not only on the shape and the manner of performing
the in-air signature but also on physical characteristics of the person (length of the arm, capability of turning the wrist or size of the hand holding the device). The proposed 3-D signature technique is similar to traditional handwritten-signature recognition [6], but adapted to a mobile environment. In this proposal, feature extraction is performed directly within a mobile device without any additional device requirement. Besides, through this biometric technique based on 3-D gestures, it is intended to perform the whole authentication process inside the device, executing all the algorithms involved without any additional device or server. Therefore, and due to the increasing processing power of mobile devices, this biometric technique would achieve an important requirement: "real-time". This article is divided into the following sections. Firstly, Section 2 describes the method of analysis of the gesture signals involved in this study. Next, Section 3 details the in-air gesture biometric database created to support the experiments of the article. Section 4 includes an explanation of the experimental work carried out, as well as the time and Equal Error Rate results obtained. Finally, in Section 5, the conclusions of this work and future lines are introduced.
2 Analysis Method Proposed
In this article, an algorithm based on Dynamic Time Warping [7] has been developed to analyze different signals, in order to elucidate whether a sample is truthful or not. For that purpose, the algorithm tries to find the best possible alignment between two signals in order to correct small variations in the performance of the gesture. A score matrix is calculated for each pair of points of both sequences [8], and later the path in this matrix that maximizes this score is obtained. Any vertical or horizontal movement in this path implies adding a zero value in a sequence to correct small deviations. The algorithm includes a fuzzy function in the score equation [9] representing to what extent a user is able to repeat a gesture. The score equation is shown in Equation 1:

s_{i,j} = \max \begin{cases} s_{i,j-1} + h \\ s_{i-1,j-1} + \Delta \\ s_{i-1,j} + h \end{cases} \quad (1)

where h is a constant, known as gap penalty in the literature [10], whose value is chosen to maximize the overall performance, and \Delta is a fuzzy decision function that represents a Gaussian distribution:

\Delta = e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad (2)

where \mu and x are the values of the previous points from which the score of the new point (i, j) is calculated. Finally, \sigma is a constant stating to what extent two values are considered similar. Even if a user performs the same gesture holding the mobile device in the same way, there will always be some small variations in the speed and manner in which
the user performs his/her 3-D signature. This algorithm aligns a pair of signals, correcting those small deviations without compensating for large differences, by including some zero values and interpolating them in order to maximize the overall score function. As a result of this algorithm, the length of the signals is doubled. When the optimal alignment of the signals is accomplished, the Euclidean distance is calculated in order to measure the differences between the aligned signals. Consequently, a numerical value is obtained at the end of the analysis process; the lower the value, the more similar the analyzed signals are, and vice versa.
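The following Python sketch illustrates the alignment and distance computation described above. It is a minimal reconstruction built only from Equations 1 and 2, not the authors' implementation; the gap penalty h and the width σ are placeholder values, and the interpolation of the inserted zeros mentioned in the text is omitted for brevity.

```python
import numpy as np

def gaussian_delta(x, mu, sigma):
    """Fuzzy similarity between two acceleration values (Equation 2)."""
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def align(a, b, h=-0.5, sigma=0.3):
    """Align two 1-D acceleration signals with the DTW-like score of Equation 1.

    Returns the two aligned (zero-padded) sequences and the Euclidean distance
    between them.
    """
    n, m = len(a), len(b)
    s = np.full((n + 1, m + 1), -np.inf)
    s[0, 0] = 0.0
    back = np.zeros((n + 1, m + 1), dtype=int)  # 0: diagonal, 1: gap in b, 2: gap in a
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            cands = []
            if i > 0 and j > 0:
                cands.append((s[i - 1, j - 1] + gaussian_delta(a[i - 1], b[j - 1], sigma), 0))
            if i > 0:
                cands.append((s[i - 1, j] + h, 1))
            if j > 0:
                cands.append((s[i, j - 1] + h, 2))
            s[i, j], back[i, j] = max(cands)
    # Trace back the maximizing path, inserting zeros for gaps.
    aa, bb = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i, j]
        if move == 0:
            aa.append(a[i - 1]); bb.append(b[j - 1]); i -= 1; j -= 1
        elif move == 1:
            aa.append(a[i - 1]); bb.append(0.0); i -= 1
        else:
            aa.append(0.0); bb.append(b[j - 1]); j -= 1
    aa, bb = np.array(aa[::-1]), np.array(bb[::-1])
    dist = np.sqrt(np.sum((aa - bb) ** 2))
    return aa, bb, dist
```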
3 Database Description
This article is developed with a database of gestures from 34 different users. This database has been obtained in two sessions. In the first session, each user had to invent an identifying gesture and perform it in the air holding a device embedding an accelerometer. This gesture is considered as the in-air biometric signature of the proposed biometric technique based on gesture recognition. In this first session, 34 volunteers (aged 19 to 60; 15 women and 19 men) participated, performing the gesture they would choose as their in-air signature while holding a device embedding an accelerometer. For this purpose, an application for the iPhone 3G (a mobile device embedding an accelerometer) has been developed to obtain the accelerations of the hand movement on each axis X-Y-Z while a gesture is carried out, at a sampling period of 10 ms; a frequency precise enough to obtain representative signals of a hand movement in the air [11]. Each user has repeated his/her gesture 7 times, with intervals of 10 seconds in between, to reduce dependence between samples. Some instructions have been provided to help the volunteers perform memorable and sufficiently complex gestures, so that no one except the truthful user could reproduce them easily. Furthermore, all of these sessions of performing new gestures have been recorded on video. Users have reacted differently to the task of inventing a gesture that is repeatable by them and not easy to forge by anyone who might see them. In fact, users have addressed this proposal by:
– Writing a word or a number in the air.
– Performing a habitual gesture: playing the guitar, a personal salute, using a tennis racket. . .
– Drawing a symbol in the air: a star, a treble clef. . .
– Drawing something real in the air: clouds, trees. . .
– Performing a complex gesture by concatenating simple gestures such as squares, triangles, circles, turns. . .
– Making their own signature in the air.
In this study, 18 of the 34 gestures (53%) are the truthful signatures of each person performed in the air, whereas the rest are gestures of varying levels of difficulty.
At the end of this first session, all the volunteers answered a survey to assess (1 very good - 5 very bad) different issues of the in-air gesture biometric technique proposed. Results are presented in Table 1:

Table 1. Volunteers' answers to different issues in order to validate the feasibility of the technique from a user-experience point of view

Question                              Average  Mode  Standard deviation
Ease to invent an in-air signature    2.1      2     0.65
Ease to repeat an in-air signature    1.9      2     0.45
Collectability of the technique       1.9      1     0.71
Acceptability of the technique        2.7      2     0.85
From those answers, it can be inferred that users had little difficulty in inventing and repeating a 3-D gesture with a mobile device. As the biometric data are acquired in a non-intrusive manner, users have given the collectability of the technique a very low (favorable) score [12]. Besides, users have felt secure and comfortable while the biometric characteristics were extracted, so acceptability also receives a low (favorable) score. Moreover, volunteers were asked to compare the confidence of the proposed in-air gesture biometric technique with respect to iris, face, handwritten signature, hand and fingerprint recognition techniques. On average, participants rated the confidence of the in-air gesture signature above handwritten signature, close to face and hand recognition, and far from iris and fingerprint. On the other hand, a second session has been performed by studying the videos recorded in the previous session. In this session, three different people have tried to forge each of the 34 original in-air biometric signatures. Each falsifier attempted to repeat each gesture seven times. As a result of both sessions, 238 samples of truthful gestures (34 users x 7 repetitions) and 714 falsifications (34 users x 3 forgers x 7 attempts) have been obtained. An evaluation of the error rates of the technique has been developed from all the samples of the database created. The experiments and results obtained are described in Section 4.
4 Experimental Results
Three original samples of each gesture, chosen randomly, have been considered as the 3-D biometric signature template; the other four original samples represent truthful verification attempts that should be accepted. All impostor samples represent false trials that should be rejected. Summarizing, the Equal Error Rate (EER) [13] has been calculated in this article from 136 (34 users, 4 accessing samples each) truthful and 714 (34 users, 3 forgers, 7 samples each) impostor access samples. This technique is assessed as powerful if not only good EER results are obtained, but also the signal analysis takes a short enough time to be considered "real-time". Regarding this, the reader should notice that the longer the signals, the longer the time to execute the algorithm. Furthermore, this growth in time is
not linear but exponential. On the other hand, if the number of executions of the algorithm grows while the length of the signals is kept constant, the total time to complete the whole process increases linearly. Each 3-D signature carries information about the accelerations on each axis when the gesture is performed. Three different biometric fusion strategies have been tested: fusion at decision level, fusion at matching-score level and fusion at feature-extraction level [14]. In this article, only the first strategy is explained, since the best results have been obtained from it. Fusing information at decision level implies executing the alignment algorithm of each axis signal in parallel but separately, and calculating a unique comparison metric value from all of them. The resulting comparison metric value for two gestures A and B is calculated by Equation 3:

d_{A,B} = \frac{d^x_{A,B} + d^y_{A,B} + d^z_{A,B}}{3} \quad (3)

where d^x_{A,B}, d^y_{A,B} and d^z_{A,B} are the values obtained by aligning the signals of each axis x, y and z separately and calculating their Euclidean distance by Equation 4:

d^e_{A,B} = \sqrt{\sum_{i=1}^{2L} \left(A^e_i - B^e_i\right)^2} \quad (4)

where A and B are the two gestures of length L being analyzed, and A^e and B^e are the results of aligning the signals A and B corresponding to axis e. Since the length of these aligned signals is 2L, the resulting value d^e_{A,B} for each axis e is obtained by accumulating the differences between each pair of points over the whole length of the aligned signals. According to the proposed fusion scenario, the algorithm is executed three times, once for each axis signal separately. The information is fused at decision level by calculating the average of the result of each process on each axis signal. With all these conditions, an Equal Error Rate of 2.5% has been obtained (Figure 1). This value has been obtained as the intersection of the False Acceptance
Fig. 1. Resulting EER (%) of fusing X, Y and Z signals at decision level
Rate (FAR) curve, obtained when falsifiers tried to forge the system, and the False Rejection Rate (FRR) curve, obtained from the rejection errors when truthful users tried to access the system performing their own signature. Let TE be the execution time of the alignment algorithm, which is the most time-consuming process in an authentication operation. Then, the time consumed in this experiment for each comparison of two gesture samples is equivalent to three executions of the algorithm with two signals of length L (3TE(L)). This time has been measured on a mobile device (iPhone 3G), resulting in 1.51 seconds on average. This time has been obtained by averaging 10 consecutive executions of the alignment algorithm on signals of 600 points (a six-second gesture with a sampling rate of 100 Hz).
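As a concrete illustration of the decision-level fusion and of how the EER operating point is located, the Python sketch below averages the three per-axis alignment distances (Equation 3) and sweeps a threshold over genuine and forgery scores to find the intersection of the FAR and FRR curves. It is a hedged reconstruction, not the authors' evaluation code; the score arrays and the threshold grid are placeholders.

```python
import numpy as np

def fuse(d_x, d_y, d_z):
    """Decision-level fusion of the per-axis alignment distances (Equation 3)."""
    return (d_x + d_y + d_z) / 3.0

def equal_error_rate(genuine, impostor):
    """Sweep a threshold over the fused comparison metric and return the EER,
    i.e. the point where the FAR and FRR curves intersect."""
    thresholds = np.linspace(min(genuine.min(), impostor.min()),
                             max(genuine.max(), impostor.max()), 1000)
    best = (1.0, None)
    for t in thresholds:
        frr = np.mean(genuine >= t)    # truthful attempts rejected
        far = np.mean(impostor < t)    # forgeries accepted
        gap = abs(far - frr)
        if gap < best[0]:
            best = (gap, (far + frr) / 2.0)
    return best[1]

# Placeholder usage: 136 genuine and 714 forgery comparisons, as in the paper.
# genuine = np.array([fuse(*d) for d in genuine_axis_distances])
# impostor = np.array([fuse(*d) for d in forgery_axis_distances])
# print(equal_error_rate(genuine, impostor))
```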
5 Conclusion and Future Work
In this article, a proposal of a biometric technique for mobile devices has been explained. By analyzing an in-air signature performed as a gesture while holding a mobile device embedding an accelerometer, a user is authenticated with low Equal Error Rates in "real time". All the operations involved are carried out inside the mobile device, taking advantage of the increasing processing capacity of mobile devices. In order to study the feasibility of this technique, an in-air gesture biometric database has been created. For that purpose, 34 different users have invented and repeated an in-air signature performed with a mobile device. Besides, three falsifiers have attempted to forge all the truthful gestures from video recordings. The volunteers involved in the construction of the database have assessed positively the ease, acceptability, collectability and confidence of the proposed biometric technique. From all the information stored in the database, different scenarios of fusing information have been studied, obtaining the best results when the fusion was carried out at decision level. The Equal Error Rate has been calculated with truthful gestures to obtain the False Rejection Rate, and falsifications of the original gestures have been utilized to determine the False Acceptance Rate. As a result, an Equal Error Rate of 2.5% has been obtained, validating the feasibility of the in-air signature biometric technique proposed in this article. Furthermore, an application has been developed on a mobile device embedding an accelerometer to measure the time consumed by the authentication process. In 1.51 seconds all the required operations are completed without any additional devices. In future works, other studies to reduce the consumed time may be proposed following different strategies: reducing the sampling rate of the feature extraction of the gesture, applying sliding windows in the algorithms, or operating only with part of all the information. Besides, the most important parts of the signals and the acceleration axes which carry more distinctive information may be evaluated in order to reduce the length or the parts of the signals required to obtain low Equal Error Rates, so that the consumed time would decrease.
References 1. Cho, D.H., Park, K.R., Rhee, D.W., Kim, Y., Yang, J.: Pupil and iris localization for iris recognition in mobile phones. In: Proc. of the International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing & the International Workshop on Self-Assembling Wireless Networks, pp. 197–201 (2006) 2. Tao, Q., Veldhuis, R.: Biometric authentication for a mobile personal device. In: Annual International Conference on Mobile and Ubiquitous Systems, pp. 1–3 (2006) 3. Shabeer, H.A., Suganthi, P.: Mobile phones security using biometrics. In: International Conference on Computational Intelligence and Multimedia Applications, vol. 4, pp. 270–274 (2007) 4. Manabe, H., Yamakawa, Y., Sasamoto, T., Sasaki, R.: Security evaluation of biometrics authentications for cellular phones. In: International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 34–39 (2009) 5. Matsuo, K., Okumura, F., Hashimoto, M., Sakazawa, S., Hatori, Y.: Arm swing identification method with template update for long term stability. In: Lee, S.-W., Li, S.Z. (eds.) ICB 2007. LNCS, vol. 4642, pp. 211–221. Springer, Heidelberg (2007) 6. Jain, A.K., Griess, F.D., Connell, S.D.: On-line signature verification. Pattern Recognition 35 (2002) 7. Sakoe, H., Chiba, S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing 26(1), 43–49 (1978) 8. Durbin, R., Eddy, S., Krogh, A., Mitchison, G.: Biological Sequence Analysis, 11th edn. Cambridge University Press, Cambridge (2006) 9. de Santos Sierra, A., Avila, C., Vera, V.: A fuzzy DNA-based algorithm for identification and authentication in an iris detection system. In: 42nd Annual IEEE International Carnahan Conference on Security Technology, ICCST 2008, October 2008, pp. 226–232 (2008) 10. Miller, W.: An introduction to bioinformatics algorithms, Neil C. Jones and Pavel A. Pevzner. Journal of the American Statistical Association 101, 855 (2006) 11. Verplaetse, C.: Inertial proprioceptive devices: self-motion-sensing toys and tools. IBM Syst. J. 35(3-4), 639–650 (1996) 12. Jain, A., Hong, L., Pankanti, S.: Biometric identification. Commun. ACM 43(2), 90–98 (2000) 13. Jain, A.K., Flynn, P., Ross, A.A.: Handbook of Biometrics. Springer, New York (2007) 14. Ross, A., Jain, A.: Information fusion in biometrics. Pattern Recognition Letters 24(13), 2115–2125 (2003)
On-Demand Biometric Authentication of Computer Users Using Brain Waves Isao Nakanishi and Chisei Miyamoto Graduate School of Engineering, Tottori University 4-101 Koyama-minami, Tottori, 680-8552 Japan [email protected]
Abstract. From the viewpoint of user management, on-demand biometric authentication is effective for achieving high security. In such a case, unconscious biometrics is needed, and we have studied the use of a brain wave (Electroencephalogram: EEG). In this paper, we examine the performance of verification based on the EEG during a mental task. In particular, assuming the verification of computer users, we adopt a mental task in which users are thinking of the contents of documents. From experimental results using 20 subjects, it is confirmed that verification using the EEG is applicable even when the users are performing the mental task.
1 Introduction
In the networked society, non-face-to-face communications are performed through computer networks; therefore, it is quite important to verify identity. For person authentication, magnetic cards, IC cards, or passwords have been used, but cards raise forgery or theft concerns, and passwords tend to be forgotten. Consequently, person authentication using biometrics has gained public attention. Among biometric modalities, the fingerprint and the iris achieve higher performance and are already used in consumer security systems. However, it has been reported that authentication systems using them were circumvented by fake fingers or printed iris images [1,2]. The reason is that the fingerprint and iris are exposed on the body surface. Veins are kept inside the body; therefore, they are expected to be tolerant to such circumvention. However, it has also been reported that even an authentication system using veins accepted artifacts in enrollment and verification [3]. This is due to the lack of a liveness detection function, which examines whether an object is part of a living body. A liveness detection scheme is necessary for protecting biometric authentication systems from spoofing using artifacts. On the other hand, conventional biometric systems mainly assume applications based on one-time-only authentication, such as access control, banking, passport control, and so on. However, from the viewpoint of user management,
Presently at the NEC Corporation, Tokyo, Japan.
Fig. 1. Styles of authentication
the one-time-only authentication is low-security. After authentication by a genuine user, even if he/she is switched with an imposter, a one-time-only authentication system cannot detect such spoofing. In order to cope with this problem, some other style of authentication is needed. Figure 1 shows conceivable styles of authentication, where (a), (b) and (c) are one-time-only, on-demand and continuous authentication, respectively. The term "continuous" was used in [4], and the continuous authentication was realized by using multimodal biometrics. However, continuous authentication cannot be realized by using a single biometric modality unless it adopts optical processing. Instead, we define "on-demand" authentication as a new style. In the on-demand authentication, users are verified on a regular or irregular schedule, on demand of authentication from the system. By the way, the fingerprint and the iris are not suitable for the on-demand authentication because they ask users to present their biometric data at every authentication. In other words, the on-demand authentication needs unconscious biometrics. As unconscious biometrics, the face, ear, voice, keystroke and gait are applicable, but the face and the ear are easily imitated using artifacts, and the voice, keystroke and gait limit the applications. It has been proposed to use a brain wave as the biometrics [5]-[11]. We have also studied authentication using the brain wave [12]. The brain wave is generated by the activities of neurons in the cerebral cortex; therefore, it is kept inside the body and so it is effective for anti-circumvention. Of course, the brain wave possesses the function of liveness detection because it is generated only by live human beings. Moreover, the brain wave is generated autonomously and unconsciously; therefore, it enables the on-demand authentication. Conversely,
Fig. 2. Authentication of operators using brain waves
since users are required to put sensors on their scalp at every authentication under present technologies, the brain wave is not suitable for the one-time-only authentication. This will be solved if contactless sensors for detecting brain waves are invented in the future. Considering these facts, we assume operator verification of systems such as computers and vehicles, as shown in Fig. 2. The operators wear brain wave sensors and are verified on demand while using the systems. For example, in a remote education system, students should be authenticated while learning. This is even more important for students who are trying to obtain an academic degree or public qualification. Also, operators of public transportation systems should be authenticated while operating, since thousands of human lives depend on them. There are other examples: pilots of aircraft, drivers of emergency vehicles, operators of military weapons, and so on. Additionally, if the detection of dozing and/or drunk operators using the brain wave is possible, it is expected to be integrated with the operator's on-demand authentication and will be valuable protection against accidents. However, it is not easy to implement the proposed on-demand authentication in a single step. We must consider step by step which band of the brain wave we use and how we reduce the noise in the brain wave caused by eye-blinks. We have already confirmed the verification performance using the α band in the case where users are relaxed in the eye-closed condition [12]. This is the first step. However, such a condition is not appropriate for practical applications. As the second step, we assume that users are not relaxed but concentrating on some mental task with closed eyes. At the final step, we will assume eye-opened users doing some mental task. If the effectiveness of the authentication using brain waves is confirmed, we will build up the on-demand system using the proposed authentication method. This paper is at the point of the second step. We assume the construction of an authentication system in which the user has the authority to use his/her own personal computer; therefore, not identification (one-to-many matching) but verification (one-to-one matching) is assumed. Also, we adopt a mental task in which the user is mentally making sentences. Through experiments, we confirm that authentication using the brain wave is possible even during the mental task.
2 Verification Using EEG in Mental Task
2.1 Brain Wave
Electrical changes from a large number of synapses (neurons) in the cerebral cortex accumulate and are detected as a brain wave (Electroencephalogram: EEG) on the scalp using an electrode. Because of the spatiotemporal dispersiveness of neurons, there are no distinct patterns in the EEG in general. However, when the activity of the cerebral cortex becomes low, brain waves partially become synchronous and thereby some distinctive waves are observed. As such waves, δ (0.5-3 Hz), θ (4-7 Hz), α (8-13 Hz), and β (14-30 Hz) are well known and are detectable when human beings are in deep sleep, getting sleepy, relaxed with closed eyes, and in some mental activity, respectively. In particular, the α and/or β waves are applicable for person authentication.
2.2 Mental Task
Several authentication methods using the EEG during mental activities have been proposed [13]-[15]. The mental tasks are, however, designed from the viewpoint of brain science: mental arithmetic, mental rotation of a three-dimensional block, and so on. In the case of on-demand authentication, if the actual tasks (works) are different from the mental ones, users are required to perform the mental task at every authentication, and thereby it makes the authentication conscious. The mental task should be related to the actual one in order to keep the authentication unconscious. In this paper, we assume the authentication of computer users who are making sentences mentally. For convenience, we call this task mental composition hereafter. The mental composition is a plausible task for computer users; therefore, it enables unconscious authentication.
2.3 Feature Extraction
We have confirmed that the spectral distribution in the α band is an important feature for distinguishing individuals [12]. It is, however, known that when some mental activity is being done, the α wave is suppressed while the β wave becomes detectable. In this paper, we add a spectral feature in the β band to the conventional ones.
Spectral Smoothing. Since the EEG spectrum has large intra-individual variation, all spectral data for feature extraction are pre-processed by smoothing. Concretely, by using spectral values at five adjacent frequency bins, we obtain an averaged spectral value

|\bar{X}_k| = \frac{1}{5} \sum_{n=-2}^{2} |X_{k-n}| \quad (1)

where k is a frequency index and X_k is a discrete Fourier transform (DFT) of an EEG signal.
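A minimal Python sketch of this smoothing step is given below; it simply averages each spectral magnitude with its four neighbours, as in Equation 1. It is an illustration rather than the authors' code, and the edge handling of the first and last two bins is an assumption.

```python
import numpy as np

def smooth_spectrum(eeg_segment):
    """Five-bin moving average of the EEG magnitude spectrum (Equation 1).

    `eeg_segment` is a 1-D time-domain EEG recording; its DFT magnitude |X_k|
    is computed and smoothed. Edges are handled by reflection (an assumption).
    """
    spectrum = np.abs(np.fft.rfft(eeg_segment))
    padded = np.pad(spectrum, 2, mode="reflect")
    kernel = np.ones(5) / 5.0
    return np.convolve(padded, kernel, mode="valid")  # same length as `spectrum`

# Example (hypothetical file name):
# eeg = np.loadtxt("eeg_sample.txt")
# smoothed = smooth_spectrum(eeg)
```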
Fig. 3. Definition of the concavity of spectral distribution
Feature Extraction in α Band. In the α band, we have utilized two spectral features [12]. One is the spectral variance and the other is the concavity of the spectral distribution. Assuming the spectrum in the α band is normally distributed, its spectral variance is calculated by

v = \frac{1}{T} \sum_{k=1}^{T} (s_k - \bar{s})^2 \quad (2)

where T is the number of frequency bins in the α band, s_k (k = 1, 2, ..., T) are the power spectral values and \bar{s} is their mean. The definition of the concavity of the spectral distribution is shown in Fig. 3. First, the maximum value of the power spectrum is detected and then its tenth part is calculated and adopted as a criterion. Next, the frequencies whose power spectral values are under the criterion are squared and then summed as

F_u = \sum_{j=1}^{N} (f^u_j)^2 \quad (3)

where f^u_j (j = 1, 2, ..., N) are the frequencies under the criterion. F_u is regarded as a feature representing the concavity of the spectral distribution.
Feature Extraction in β Band. The spectrum in the β band is relatively uniformly distributed and thus the features described above are undetectable in the β band. In this paper, we propose to use the difference between the spectrum in the relaxed condition and that during mental composition as the feature in the β band. First, the spectrum in the β band is measured L times in the relaxed condition and then its ensemble mean β^l_k is found at each frequency bin. Next, the
Fig. 4. Block diagram of the proposed verification system
spectrum in the β band during the mental composition is also measured as β^m_k, and then the Euclidean distance from the mean value of the spectrum in the relaxed condition is calculated and all the distances are accumulated as a feature:

S_\beta = \sum_{k=1}^{N} (\beta^m_k - \beta^l_k) \quad (4)

where N is the number of frequency bins in the β band.
2.4 Verification
The block diagram of the proposed verification system is described in Fig. 4. The details of each block are explained in the following sub-sections. Enrollment of Template. In advance of the verification stage, spectral features of all users are enrolled as templates. The EEG of each user is measured in relaxed condition and then its power spectrum in the β band is calculated by Fast Fourier transform (FFT). After that, the spectral smoothing described in Sect. 2.3 is performed. These processes are repeated L times and then the
ensemble mean of the L spectra in the β band is found as β^l_k and is stored as a template. In the same way as in the relaxed condition, the spectrum during the mental composition is measured L times and then the ensemble mean values of the L spectra in both the α and β bands are respectively found as α^m_k and β^m_k. In the averaged spectrum α^m_k (k = 1, 2, ..., M), where M is the number of frequency bins in the α band, the spectral variance and the concavity of the spectral distribution are extracted as features and registered as templates: v^t and F^t_u, respectively. Also, the mean value of the averaged power spectrum in the α band is stored simultaneously. This is used for the normalization described later. The template in the β band is given by

S^t_\beta = \sum_{k=1}^{N} (\beta^m_k - \beta^l_k) \quad (5)
Normalization and Matching. In the verification stage, each user declares who he/she is by giving his/her name or ID number to the system, which specifies his/her template. The spectrum of the user during the mental composition is found and then smoothed, and only the spectral elements in the α and β bands are used as verification data: α^m_k and β^m_k, respectively. By the way, the features in the α band are based on the absolute amount of the spectrum; therefore, they tend to be influenced by intra-individual variation. For suppressing such variation, normalization is performed in advance of the feature extraction. Concretely, the mean value of the α^m_k (k = 1, 2, ..., M) is calculated and then the α^m_k are adjusted (normalized) so that their mean value becomes equal to that of the template α^m_k stored in the system. After that, the features in the α and β bands, that is, the spectral variance, the concavity of the spectral distribution and the spectral distance, are extracted as v, F_u and S_β, respectively. The differences between the extracted features and their templates are calculated and then normalized because they have different dimensions. The total distance (TD) is given by

TD = x \cdot |v - v^t| + y \cdot |F_u - F^t_u| + z \cdot |S_\beta - S^t_\beta| \quad (6)

where x, y and z are coefficients for combining the features and x + y + z = 1. If TD is less than a threshold which is determined in advance, the user (declarer) is regarded as a genuine user. Otherwise, he/she is rejected as an imposter.
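The sketch below ties the pieces of Sections 2.3 and 2.4 together in Python: the α-band variance and concavity features (Equations 2 and 3), the β-band distance feature (Equation 4), and the weighted total distance TD of Equation 6. It is a hedged reconstruction from the formulas above, not the authors' implementation; the band limits, the use of bin indices as frequencies, the weights and the threshold are assumptions, and the normalization of the feature differences is omitted.

```python
import numpy as np

ALPHA = (8, 13)   # Hz, assumed band limits
BETA = (14, 30)   # Hz, assumed band limits

def band(spectrum, freqs, lo, hi):
    """Select the smoothed power spectrum values inside a frequency band."""
    mask = (freqs >= lo) & (freqs <= hi)
    return spectrum[mask]

def alpha_features(alpha_spec):
    """Spectral variance (Eq. 2) and concavity feature F_u (Eq. 3) in the alpha band."""
    v = np.mean((alpha_spec - alpha_spec.mean()) ** 2)
    criterion = alpha_spec.max() / 10.0
    # Frequencies are taken here as bin indices inside the alpha band (assumption).
    under = np.nonzero(alpha_spec < criterion)[0]
    f_u = np.sum(under.astype(float) ** 2)
    return v, f_u

def beta_feature(beta_spec, beta_relaxed_mean):
    """Accumulated difference from the relaxed-condition beta spectrum (Eq. 4)."""
    return np.sum(beta_spec - beta_relaxed_mean)

def total_distance(features, template, weights=(0.4, 0.3, 0.3)):
    """Weighted total distance TD of Equation 6; features = (v, F_u, S_beta)."""
    x, y, z = weights
    v, f_u, s_b = features
    v_t, f_t, s_t = template
    return x * abs(v - v_t) + y * abs(f_u - f_t) + z * abs(s_b - s_t)

# Verification (sketch): accept the claimed identity when TD is below a threshold.
# accepted = total_distance((v, f_u, s_b), stored_template) < THRESHOLD
```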
3 Experiments
In order to examine the verification performance of the proposed method, we carried out experiments.
3.1 Conditions
The number of subjects was 20. All were healthy males around twenty years old and were seated at rest with closed eyes in a silent room. In advance of launching the measurements, we presented the subjects with five themes: a letter to their parents, a self-introduction, hometown PR, memories of college life, and a brief description of their own research. While being measured, they were required to make sentences about these themes mentally. The EEG signals were recorded using a consumer single-channel electroencephalograph for one continuous minute. By using a headband, a single electrode (sensor) was set on the frontal region of the head, which corresponds to the frontal pole (Fp1) defined by the international standard 10/20 method. 10 EEG signals were obtained from each subject on the same day, and 200 EEG signals were obtained in total. The averaging number L was set to five; therefore, five data of each user were used for generating his/her templates. The remaining five data of each subject were used for verification, and all the other subjects' data were used as those of imposters.
3.2 Results
Verification performance was evaluated by the equal error rate (EER), at which the false rejection rate (FRR) is equal to the false acceptance rate (FAR). The EERs for several ratios of the coefficients x, y, z for combining the features are summarized in Table 1.

Table 1. EERs at various coefficients for combining features

Ratio (x : y : z)    EER (%)
0.2 : 0.3 : 0.5      16
0.3 : 0.2 : 0.5      16
0.3 : 0.4 : 0.3      16
0.4 : 0.3 : 0.3      13
0.3 : 0.5 : 0.2      17
0.5 : 0.3 : 0.2      13
From these results, it is confirmed that the EER in the mental composition case was about 15%. For reference, we also examined the verification performance using only the features of the α wave and thereby obtained an EER of 15%. This suggests that the proposed feature of the β wave was not effective in this experiment. Moreover, we examined the verification performance during mental arithmetic, which is generally used in brain science. The number of subjects was 10, and they mentally calculated 7 × 10, 7 × 11, 7 × 12, · · · with closed eyes until the end of the measurement. The EER was about 11%. Since the number of subjects is
not equal to that in the mental composition case, it is not strictly fair to compare these results, but the effect of using the feature of the β wave could be confirmed in the case of mental arithmetic. The difference between the two cases might be due to the degree of mental activity, that is, the content of the mental task. In the mental arithmetic, the contents were clearly defined and relatively hard to perform. On the other hand, in the case of the mental composition, some themes were given as rough guidelines but the actual contents depended on the subjects. Some subjects might have made simple sentences, and others might have made difficult ones. These differences had an influence on the degree of mental activity. Based on the above discussion, it is supposed that the harder the mental activity became, the better the verification performance became. On the other hand, there was a concern that verification performance might be degraded because it is generally known that mental activity suppresses the appearance of the α wave while it makes the β wave detectable. But such degradation was not confirmed in this experiment. As a result, it is concluded that verification using brain waves is possible even during a mental task.
4 Conclusions
From the viewpoint of user management, one-time-only authentication is low-security; therefore, on-demand authentication is necessary, but it requires unconscious biometrics. We have studied the brain wave (EEG) as a powerful unconscious biometric. However, in order to apply authentication using the EEG to practical applications, we had to examine the verification performance during mental tasks. In addition, the mental task should be related to an actual one in order to keep the authentication unconscious. In this paper, assuming the verification of computer users, we adopted mental composition, in which the users were mentally making sentences. It is a plausible task for computer users. Moreover, introducing the mental task required adding a new spectral feature in the β band, that is, the Euclidean distance between the spectrum in the relaxed condition and that during the mental task, to the conventional ones, that is, the spectral variance and the concavity of the spectral distribution in the α band. Verification was simply performed by combining the differences between the extracted features and their templates. In experiments using 20 subjects, an EER of about 15% was obtained, so we conclude that verification using the EEG is possible even during a mental task. In addition, from the comparison with the results in mental arithmetic, it was confirmed that the degree of mental activity has an influence on the verification performance. In this paper, since we were concerned about the influence of eye-blinks on the EEG, we assumed the eye-closed condition. However, in our exploratory experiments, we already confirmed that the noise in the EEG caused by eye-blinks did not have a great influence on verification performance as long as the individual features were extracted from the spectrum of the EEG. The frequency of eye-blinks was
different from that of the α or β band. We are now examining the proposed authentication method at the final stage where users drive vehicles in eye-opened condition. Results will be available in the near future.
Acknowledgement. A part of this work was supported by the Support Center for Advanced Telecommunications Technology Research, Foundation (SCAT) in Japan.
References 1. Matsumoto, T., Matsumoto, H., Yamada, K., Hoshino, S.: Impact of Artificial “Gummy” Fingers on Fingerprint Systems. In: Proc. of SPIE, January 2002, vol. 4677, pp. 275–289 (2002) 2. Matsumoto, T., Kusuda, T., Shikata, J.: On the Set of Biometric Test Objects for Security Evaluation of Iris Authentication Systems -Part 2- (in Japanese). In: Proc. of the 9th IEICE Technical Report of Biometrics Security Group, pp. 60–67 (2007) 3. Matsumoto, T.: Security Design and Security Measurement for Biometric Systems (in Japanese). In: Proc. of the 7th IEICE Technical Report of Biometrics Security Group, pp. 57–64 (2006) 4. Altinok, A., Turk, M.: Temporal Integration for Continuous Multimodal Biometrics. In: Proc. of 2003 Workshop on Multimodal User Authentication, December 2003, pp. 207–214 (2003) 5. Poulos, M., Rangoussi, M., Chrissikopoulos, V., Evangelou, A.: Person Identification Based on Parametric Processing of the EEG. In: Proc. of the 9th IEEE International Conference on Electronics, Circuits and Systems, vol. 1, pp. 283–286 (1999) 6. Poulos, M., Rangoussi, M., Alexandris, N.: Neural Networks Based Person Identification Using EEG Features. In: Proc. of 1999 International Conference on Acoustic Speech and Signal Processing, pp. 1117–1120 (1999) 7. Poulos, M., Rangoussi, M., Chissikopoulus, V., Evangelou, A.: Parametric Person Identification from the EEG Using Computational Geometry. In: Proc. of the 6th IEEE International Conference on Electronics, Circuits and Systems, pp. 1005– 1008 (1999) 8. Paranjape, R.B., Mahovsky, J., Benedicent, L., Koles, Z.: The Electroencephalogram as a Biometric. In: Proc. of 2001 Canadian Conference on Electrical and Computer Engineering, vol. 2, pp. 1363–1366 (2001) 9. Ravi, K.V.R., Palaniappan, R.: Recognition Individuals Using Their Brain Patterns. In: Proc. of the 3rd International Conference on Information Technology and Applications (2005) 10. Palaniappan, R.: Identifying Individuality Using Mental Task Based Brain Computer Interface. In: Proc. of the 3rd International Conference on Intelligent Sensing and Information Processing, pp. 239–242 (2005) 11. Mohammadi, G., Shoushtari, P., Ardekani, B.M., Shamsollahi, M.B.: Person Identification by Using AR Model for EEG Signals. In: Proc. of World Academy of Science, Engineering and Technology, February 2006, vol. 11(2), pp. 281–285 (2006) 12. Miyamoto, C., Baba, S., Nakanishi, I.: Biometric Person Authentication Using New Spectral Features of Electroencephalogram (EEG). In: Proc. of 2008 IEEE International Symposium on Intelligent Signal Processing and Communication Systems, December 2008, pp. 312–315 (2008)
13. Marcel, S., Millan, J.R.: Person Authentication Using Brainwaves (EEG) and Maximum A Posteriori Model Adaptation. IEEE Trans. on Pattern Analysis and Machine Intelligence 29(4), 743–748 (2007) 14. Palaniappan, R., Mandic, D.P.: Biometrics from Brain Electrical Activity: A Machine Learning Approach. IEEE Trans. on Pattern Analysis and Machine Intelligence 29(4), 738–742 (2007) 15. Palaniappan, R.: Multiple Mental Thought Parametric Classification: A New Approach for Individual Identification. International Journal of Signal Processing 2(1), 222–225 (2005)
Encrypting Fingerprint Minutiae Templates by Random Quantization Bian Yang, Davrondzhon Gafurov, Christoph Busch, and Patrick Bours Norwegian Information Security Laboratory (NISlab), Gjøvik University College, Teknologiveien 22, N2821 Gjøvik, Norway {Bian.Yang,Davrondzhon.Gafurov,Christoph.Busch, Patrick.Bours}@hig.no
Abstract. An encryption method is proposed to use random quantization to generate diversified and renewable templates from fingerprint minutiae. The method first achieves absolute pre-alignment over local minutiae quadruplets (called minutiae vicinities) in the original template, resulting in a fixed-length feature vector for each vicinity; and second quantizes the feature vector into binary bits by random quantization; and last post-processes the resultant binary vector in a length tunable way to obtain a protected minutia. Experiments on the fingerprint database FVC2002DB2_A demonstrate the desirable biometric performance achieved by the proposed method. Keywords: Biometric template encryption; renewable biometric template; fingerprint minutiae; random quantization; minutia vicinity.
1 Introduction Standard encryption (DES, AES, etc.) can be an option to encrypt biometric templates, but in many cases it is insufficient because the encrypted template needs to be decrypted back to its plain-text form for comparison. This is insecure in some applications, as full access to raw samples or unprotected biometric features is given to the potentially untrusted entity that conducts the comparison. To cope with this problem, renewability [1,2] was proposed as a requirement to enhance the security and privacy of biometric templates. Renewable biometric templates, diversified from the same biometric features and able to be compared directly for verification, are required to be irreversible with respect to their original biometric features and unlinkable among each other. So far many schemes [3-7] have been proposed to protect minutiae-based fingerprint templates conforming to ANSI or ISO standards, which are widely used nowadays. Robust minutiae hashing [8] is a diversification method for fingerprint minutiae template protection based on geometric relationships, in which the sign bits of the encrypted minutiae vicinities are extracted as the binary hashes. In this way only a limited number of equally-weighted binary hash bits can be generated from one original minutia. This limitation of the hash length results in a limited biometric performance. In this paper, we use multiple random quantizers to transform a geometrically-aligned minutia vicinity into a length-tunable binary minutia hash. Through the multiple random quantizations, the biometric performance of the minutiae hash can be improved.
The random quantizers are generated independently and thus one original minutia vicinity can be diversified into multiple minutiae hashes as pseudonymous identifiers defined in [1].
Fig. 1. An example of minutia vicinity geometric alignment (along the orientation O2)
The proposed method has the following properties:
1. tunable length and biometric performance of the protected binary templates (minutiae hashes);
2. reliable geometric alignment without global geometric reference (core or delta);
3. nonlinear quantization to achieve irreversibility, unlinkability, and security against the estimation attack towards the original biometric features.
2 Proposed Method The proposed method consists of three steps, namely geometric alignment, random quantization, and post-processing to tune the hash length, applied over each minutia vicinity, which is defined as introduced in [7,8]. 2.1 Geometrical Alignment of Minutiae Vicinity As in our previous work [8], the proposed method is based on local minutiae quadruplets (called minutia vicinities throughout this paper). For each minutia mi (i=1,2,…,L) in the original template consisting of L minutiae, the 3 closest (in terms of Euclidean distance) neighboring minutiae are found around mi, and the quadruplet (including mi) is defined as a minutia vicinity Vi. Denoting the 3 neighboring minutiae as companions cij (j = 1, 2, 3), indexed in ascending order of their distance from mi (d(mi,cij) in Fig. 1), 6 orientations Ok (k = 1, 2, …, 6) can be defined between the minutiae pairs (e.g., the pair mi and ci2 defining O2 in Fig. 1), and along each orientation the remaining minutiae pair (ci1 and ci3 in the example in Fig. 1) can be geometrically aligned (resulting in Ja3 and Ja4, respectively). The minutiae pair used to define the new coordinate system (mi and ci2) is not included in the final geometric alignment result. With the 6 ordered orientations, 6 ordered minutiae pairs {Ja(2k-1), Ja(2k)} can be obtained as the aligned and ordered vicinity Vai = {Ja1, Ja2, …, Ja12}, where inside each
ordered minutiae pair, the two aligned minutiae are ranked in ascending order by their original index j. Associated with each Jaj (j=1,2,…,12) are its geometrically-aligned coordinates (x, y); the minutia angle θ of Jaj is updated by adding the rotation angle caused by the alignment to its original minutia orientation angle.
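A simplified Python sketch of the vicinity construction and alignment is given below. It is an interpretation of the description above rather than the exact procedure of [7,8]: for each of the 6 minutia pairs in the quadruplet, a coordinate frame is placed at the first minutia of the pair with its x-axis along the orientation to the second, and the two remaining minutiae are transformed into that frame.

```python
import numpy as np
from itertools import combinations

def three_nearest(minutiae, i):
    """Indices of the 3 closest neighbours (Euclidean distance) of minutia i."""
    pts = np.array([(m[0], m[1]) for m in minutiae])
    d = np.linalg.norm(pts - pts[i], axis=1)
    return [int(j) for j in np.argsort(d) if j != i][:3]

def align_vicinity(minutiae, i):
    """Build an aligned, ordered vicinity V_ai = {J_a1, ..., J_a12}.

    Each minutia is (x, y, theta). For each of the 6 pairs in the quadruplet
    {m_i, c_i1, c_i2, c_i3}, the pair defines an orientation O_k; the remaining
    two minutiae are expressed in the frame whose origin is the first minutia of
    the pair and whose x-axis follows O_k, and their angles are rotated likewise.
    """
    quad = [i] + three_nearest(minutiae, i)          # [m_i, c_i1, c_i2, c_i3]
    aligned = []
    for a, b in combinations(range(4), 2):           # 6 orientations
        xa, ya, _ = minutiae[quad[a]]
        xb, yb, _ = minutiae[quad[b]]
        phi = np.arctan2(yb - ya, xb - xa)
        c, s = np.cos(-phi), np.sin(-phi)
        rest = [k for k in range(4) if k not in (a, b)]   # kept in original index order
        for k in rest:
            x, y, theta = minutiae[quad[k]]
            xr = c * (x - xa) - s * (y - ya)
            yr = s * (x - xa) + c * (y - ya)
            aligned.append((xr, yr, (theta - phi) % (2 * np.pi)))
    return aligned   # 6 orientations x 2 remaining minutiae = 12 aligned points
```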
Fig. 2. An example of random binary quantizer generation in the value range [0,201]. Left: 201 random values S’ interpolated from 10 random values S. Right: random quantizer Q over the value range [0,201] generated by thresholding S’, where the threshold is 0.5.
2.2 Random Quantization Associated with each aligned point Jaj (j=1,2,…,12) in the vicinity Vai = {Ja1, Ja2, …, Ja12} are the coordinates (x, y) and the minutiae angle θ. Assuming all the coordinates and angle information are equally-weighted, a 36-dimensional real-value feature vector v = (v1, v2, …, v36) = (x1, y1, θ1, …, x12, y12, θ12) can be obtained by concatenating all the point coordinates and transformed ridge angles. To obtain a 36M-dimensional binary hash vector b from v, random quantizers Q1, Q2, …, QM can be used to map the coordinates x, y and the angle θ into 36M binary bits:

b_{i,j} = Q_j(v_i), \quad i = 1, 2, \ldots, 36;\; j = 1, 2, \ldots, M \quad (1)
where the Qj (j=1,2,…,M) quantizes an input vi into M bits by the randomly-generated quantization bins inside the value range [Rmin, Rmax]. An example method to generate such random quantization bins is to cubically interpolate a p-length random value sequence S=(s1, s2, …, sp) into a (Rmax-Rmin+1)-length sequence S’=(s’1, s’2, …, s’RmaxRmin+1) and then threshold S’ by its mean value to obtain a set of random bins. Fig. 2 gives such an example to generate a Qj out of (Rmax-Rmin+1)=(200-0+1)=201 values S’ interpolated from p=10 random values S bounded in [0, 1]. The coordinate and angle values outside S’ can be processed into this range by modulus 201 and normalization. Increasing randomness by adjusting p will result in more secure hashes in terms of template irreversibility, and higher diversification ability as well; however, it can be expected the biometric performance will be degraded if the number of bins goes too high causing less robustness to biometric distortions.
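The Python sketch below illustrates one way to realize such a random quantizer, following the description above and Fig. 2: p random values in [0, 1] are cubically interpolated over the value range and thresholded to obtain random binary bins. It is only an illustration of the idea; the use of scipy's cubic interpolation and the 0.5 threshold of Fig. 2 (rather than the mean) are assumptions.

```python
import numpy as np
from scipy.interpolate import interp1d

def random_quantizer(p=10, r_min=0, r_max=201, threshold=0.5, rng=None):
    """Generate a random binary quantizer Q over the value range [r_min, r_max].

    Returns a lookup table of 0/1 bits, one per integer value in the range,
    obtained by cubic interpolation of p random values followed by thresholding.
    """
    if rng is None:
        rng = np.random.default_rng()
    s = rng.random(p)                                    # p random values in [0, 1]
    xs = np.linspace(r_min, r_max, p)
    fine = np.arange(r_min, r_max + 1)
    s_interp = interp1d(xs, s, kind="cubic")(fine)       # (r_max - r_min + 1) values
    return (s_interp > threshold).astype(np.uint8)

def quantize(value, quantizer, r_min=0, r_max=201):
    """Map a (wrapped) coordinate or angle value to one bit."""
    idx = int(round(value)) % (r_max - r_min + 1)
    return int(quantizer[idx])

# Example: M = 8 independent quantizers for the coordinate values.
# rng = np.random.default_rng(seed=42)   # the seed plays the role of the stored PRNG seed
# quantizers = [random_quantizer(rng=rng) for _ in range(8)]
# bits = [quantize(137.0, q) for q in quantizers]
```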
2.3 Hash Length Tuning By the random quantization, a 36M-bit hash H = (h1, h2, …, h36M) is generated for each minutia vicinity. M can be modified to obtain hashes of different lengths. To further tune the hash length to a number which is not a multiple of 36, so as to satisfy an application's specific requirement, exclusive-or (XOR) can be used to fine-tune the length. Assuming a T-length (18M ≤ T < 36M) hash is required by a specific application, a simple post-processing of H can be as follows to obtain the final vicinity hash Hv:

H_v = (h_1, h_2, \ldots, h_T)\ \mathrm{XOR}\ (h_{36M-T+1}, h_{36M-T+2}, \ldots, h_{36M}) \quad (2)
According to [9], the expectation value of the XOR of two binary sequences x and y is

E(x \oplus y) = \frac{1}{2} - 2\left(E(x) - \frac{1}{2}\right)\left(E(y) - \frac{1}{2}\right) - 2\,\mathrm{Cov}(x, y) \quad (3)

and it is not difficult to derive the variance as

\mathrm{Var}(x \oplus y) = \frac{1}{4}\left[1 - \left[(1 - 2E(x))(1 - 2E(y)) + 4\,\mathrm{Cov}(x, y)\right]^2\right] \quad (4)
This indicates that the XOR operation will not seriously affect H’s distribution and thus the biometric performance if the original biometric feature Vai and the quantizers are random enough making the expectation E(x)≈E(y)≈0.5 and Cov(x,y) ≈0.
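Putting Sections 2.2 and 2.3 together, the sketch below turns a 36-dimensional aligned vicinity vector into a T-bit hash: every component is quantized by M quantizers and the resulting 36M-bit string is folded to length T with the XOR of Equation 2. It is a hedged composition of the previous sketches (it reuses the hypothetical random_quantizer and quantize helpers and omits the +100 coordinate offset), not the authors' code.

```python
import numpy as np

def vicinity_hash(feature_vector, coord_quantizers, angle_quantizers, t):
    """Compute a T-bit minutia-vicinity hash.

    `feature_vector` is v = (x1, y1, theta1, ..., x12, y12, theta12); coordinates
    use quantizers over [0, 201] and angles over [0, 360] as described above.
    Requires 18M <= t < 36M, with M = len(coord_quantizers) = len(angle_quantizers).
    """
    bits = []
    for idx, value in enumerate(feature_vector):
        is_angle = (idx % 3 == 2)
        quantizers = angle_quantizers if is_angle else coord_quantizers
        r_max = 360 if is_angle else 201
        for q in quantizers:                       # M bits per component
            bits.append(quantize(value, q, 0, r_max))
    h = np.array(bits, dtype=np.uint8)             # length 36M
    assert 18 * len(coord_quantizers) <= t < len(h)
    # Equation 2: XOR the first T bits with the last T bits of H.
    return h[:t] ^ h[len(h) - t:]

# Two hashes are later compared by Hamming distance against the Table 1 thresholds:
# match = np.count_nonzero(probe_hash ^ reference_hash) <= hamming_threshold
```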
3 Experimental Results The minutiae templates used in our experiments were generated by the VeriFinger software [10] from the FVC2002DB2_A database [11]. The first 100 samples of all the 100 fingers, with the best quality, were used for enrolment and the 2nd~8th 100 samples were used for verification. For each geometrically-aligned vicinity Vai with 12 points, the 36 coordinate and angle real values were quantized by the M random quantizers to generate a 36M-bit vicinity hash. Piece-wise comparison of these vicinity hashes can be done during verification. For different subjects, or for different applications of the same subject, different and independent sets of random quantizers can be generated from the biometric system to achieve unlinkability among the encrypted minutiae hashes. Comparison scores were calculated as the normalized number of match cases, which equals the number of match cases of all the vicinity hashes calculated from the probe sample against all the vicinity hashes in the reference database, divided by the number of original minutiae detected in the probe sample. The random quantizers for the coordinate values were generated in the scale range [0, 201], interpolated from 10 random real values with amplitudes bounded in [0, 1], as mentioned in Section 2.2. When quantizing the coordinate values of the geometrically-aligned vicinity Vai, all the coordinate values were offset by adding 100 into the value range [0, 201], and those values lower than 0 or higher than 201 were processed with modulus 201 into the value range [0, 201]. The random quantizers for the angle values were generated in the scale range [0, 360], interpolated from 5 random real values with amplitudes bounded in [0, 1].
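The comparison score described above can be sketched as follows: each vicinity hash of the probe is matched against all vicinity hashes of the reference template, a match being a Hamming distance at or below the threshold, and the count is normalized by the number of minutiae detected in the probe. This is an illustration of the scoring rule as we read it, with a hypothetical threshold argument.

```python
import numpy as np

def comparison_score(probe_hashes, reference_hashes, hamming_threshold):
    """Normalized number of match cases between probe and reference templates.

    Both arguments are lists of equal-length binary vectors (one per minutia
    vicinity). The score is the number of (probe, reference) pairs whose Hamming
    distance is within the threshold, divided by the number of probe minutiae.
    """
    matches = 0
    for hp in probe_hashes:
        for hr in reference_hashes:
            if np.count_nonzero(np.asarray(hp) ^ np.asarray(hr)) <= hamming_threshold:
                matches += 1
    return matches / float(len(probe_hashes))

# Example with the Table 1 setting M = 8 (144-bit hashes, threshold 30):
# score = comparison_score(probe_hashes, reference_hashes, hamming_threshold=30)
```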
Table 1. Hash lengths settings in the experiments

Quantizers Number    Hash length T (bits)    Hamming distance threshold
M = 2                36                      4
M = 4                72                      10
M = 8                144                     30
M = 12               216                     40
Table 2. Equal Error Rate (EER) comparison

Test sets    M = 2     M = 4     M = 8     M = 12
2            0.0106    0.0044    0.0033    0.0043
3            0.0159    0.0055    0.0054    0.0028
4            0.0578    0.0406    0.0321    0.0319
5            0.0553    0.0428    0.0319    0.0300
6            0.0778    0.0549    0.0419    0.0457
7            0.0731    0.0539    0.0344    0.0440
8            0.0959    0.0781    0.0572    0.0594
Table 1 presents the different vicinity hash lengths and Hamming distance thresholds set for comparison in the cases M = 2, 4, 8, and 12. Both the quantizer number M and the vicinity hash length T can be flexibly tuned to accommodate the different storage and biometric performance requirements of applications. The L minutiae hashes generated from each original minutiae template, which consists of L original minutiae, form a protected template. The random quantizers used to generate each protected template can be stored in two ways: either in a token which is uniquely linked to the protected template for the identity claim, or re-generated in real time by the system during verification. In the latter case, the PRNG (pseudo-random number generator) seeds used to generate the random quantizers can be stored together with the protected templates in the database, and the secret key to the PRNG needs to be well preserved and transmitted inside the biometric system. Table 2 presents the biometric performance in terms of the Equal Error Rate (EER) achieved by the proposed method using the 2nd~8th 100 samples as probes (denoted as test sets 2~8 in Table 2, respectively), in the four cases with various numbers of quantizers, M = 2, 4, 8, and 12 (and thus various hash lengths according to Table 1). The performance shown in Table 2 evaluates the scenario in which only one factor, the fingerprint sample, is probed for verification and the corresponding random quantizers are re-generated by the system. In the case where the user's token is used, this "only fingerprint probe" scenario corresponds to the token-stolen situation in the imposter cases, where the genuine token is used by the imposter to access the genuine protected template. We found that extending the hash length can in general improve the biometric performance in our experiments, but the differences between M = 8 and M = 12 are not significant. All results in Table 2 were obtained as averages of 5 tests with different random quantizers for each test.
Fig. 3. An example to estimate the value ranges of quantizers' input from a quantization resultant binary vector: (a) M = 2 (Q1, Q2); (b) M = 3 (Q1, Q2, Q3); (c) M = 4 (Q1, Q2, Q3, Q4)
4 Security Analysis In the proposed method, each component (coordinate or angle value) vi in the real-value feature vector v can be quantized into M bits (bi,1, bi,2, …, bi,M) by M random quantizers. Suppose the set of possible M-bit vectors output by the M random quantizers is So. In general, increasing the number p of random values used for interpolation, as introduced in Section 2.2, will increase the size of So (with maximum value 2^M, the full realization of any M bits) and thus the security of the proposed method in terms of irreversibility. However, this is not feasible in practice because high randomness of the quantizers will cause a distinct degradation of the associated biometric performance. In our experiments we intended to keep the biometric performance. Thus, for example, the size of So for the coordinate value quantizers was usually kept below 40 in the case of M = 8, which is far less than the full permutation number 2^8 of an 8-bit binary vector. So for any M = 8 bit quantization result (bi,1, bi,2, …, bi,8), usually fewer than 40 possible solutions need to be searched to recover the two inputs of the XOR operation used to generate one 8-bit segment of a specific vicinity hash. On the other hand, increasing M will cause increased information leakage about the quantizers' input vi. Fig. 3 gives an example, with an all-one quantization resultant
vector, of how an input vi can be estimated within increasingly limited ranges as M increases from 2 to 4. Obviously, an increasing M will help limit the quantizers' input to a narrower range, which indicates increasing information leakage about the input. Based on these two facts, the random quantizers need to be well protected from public access. As discussed in Section 3, the random quantizers can be stored in a secure token, or the seeds used to generate the quantizers can be stored together with the protected templates in the system's database, with a secret key to the PRNG securely preserved by the biometric system. The secrecy of the random quantizers keeps the protected templates irreversible with respect to their original counterparts, and the randomness of the quantizers keeps the protected templates derived from the same minutiae templates hard to link to each other. Besides, the binary outputs of the random quantizers are nonlinear with respect to their inputs, and this makes the estimation attack [12] on linear-operation-based template protection methods less feasible, in terms of information leakage, in the case that the secure token might be stolen and used by the attacker for the estimation attack.
5 Conclusions A renewable fingerprint minutiae template encryption method is proposed in this paper, which uses randomly generated quantizers to directly calculate a length-tunable binary hash from each original minutia vicinity. Assuming the random quantizers can be securely stored or generated in real time by the system, irreversible and unlinkable protected templates can be generated thanks to the nonlinear quantization process. The experiments for the one-factor (only the fingerprint sample) verification exhibit desirable biometric performance. Acknowledgments. The work is funded under the 7th Framework Programme of the European Union, Project TURBINE (ICT-2007-216339). It was created in the vicinity of the TURBINE project. All information is provided as is and no guarantee or warranty is given that the information is fit for any particular purpose. The user uses the information at its sole risk and liability. The European Commission has no liability in respect of this document, which is merely representing the authors' view.
References 1. Information Technology – Security techniques – Biometric Template Protection, ISO/IEC 2ndCD 24745 (2010) 2. Breebaart, J., Yang, B., Buhan-Dulman, I., Busch, C.: Biometric Template Protection The Needs for Open Standards. Datenschutz und Datensicherheit – DuD 33(5), 1614–1702 (2009) 3. Juels, A., Sudan, M.: A Fuzzy Vault Scheme. In: Proc. of IEEE International Symposium on Information Theory, Lausanne, Switzerland. IEEE Press, New York (2002) 4. Ratha, N.K., Chikkerur, S., Connell, J.H., Bolle, R.M.: Generating Cancelable Fingerprint Templates. IEEE Trans. on Pattern Analysis and Machine Intelligence 29(4), 561–572 (2007)
5. Arakala, A., Jeffers, J., Horadam, K.J.: Fuzzy Extractors for Minutiae-Based Fingerprint Authentication. In: Proc. of 2nd International Conference on Biometrics, Seoul, South Korea (2007) 6. Boult, T.E., Scheirer, W., Woodworth, J.R.: Revocable Fingerprint Biotokens: Accuracy and Security Analysis. In: Proc. IEEE Inter. Conf. on Comput. Vis. & Patt. Recog., USA (2007) 7. Yang, B., Busch, C.: Parameterized Geometric Alignment for Minutiae-Based Fingerprint Template Protection. In: Proc. of the IEEE 3rd International Conference on Biometrics: Theory, Applications and Systems, Washington D.C., U.S.A. (2009) 8. Yang, B., Busch, C., Bours, P., Gafurov, D.: Robust Minutiae Hash for Fingerprint Template Protection. In: Proc. of SPIE Media Forensics and Security, Electronic Imaging, San Jose, USA, January 17-21 (2010) 9. Davies, R.B.: Exclusive OR (XOR) and Hardware Random Number Generators (2002), http://www.robertnz.net/pdf/xor2.pdf 10. VeriFinger Software, http://www.neurotechnology.com 11. Fingerprint Verification Competition (FVC 2002) Database DB2_A (2002), http://bias.csr.unibo.it/fvc2002/databases.asp 12. Nagar, A., Nandakumar, K., Jain, A.K.: Biometric Template Transformation: A Security Analysis. In: Proc. of SPIE Media Forensics and Security, Electronic Imaging, January 17-21, San Jose, USA (2010)
Method for Countering Social Bookmarking Pollution Using User Similarities Takahiro Hatanaka1 and Hiroyuki Hisamatsu2 1
Graduate School of Computer Science, Osaka Electro-Communication University, 1130-70 Kiyotaki, Shijonawate, Osaka 575-0063, Japan [email protected] 2 Department of Computer Science, Osaka Electro-Communication University, 1130-70 Kiyotaki, Shijonawate, Osaka 575-0063, Japan [email protected]
Abstract. In this paper, we propose a method for countering social bookmark pollution. First, we investigate the characteristics of social bookmark pollution and show that high similarities in the user bookmarks result in social bookmark pollution. Then, we discuss a bookmark number reduction method based on user similarities between the user bookmarks. We evaluate the proposed method by applying it to Hatena Bookmark. It is found that the proposed method only slightly reduces the bookmark number of the Web pages that are not affected by social bookmark pollution but greatly reduces the bookmark number of those Web pages that are affected by social bookmark pollution. Keywords: Social Bookmark (SBM), Collective Intelligence, Spam.
1 Introduction
In recent years, several Web 2.0 services have been launched. Consumer-generated media (CGM) is one of the features of Web 2.0 services. In a conventional Web service, many users could not independently publish Web pages. However, in recent years, there has been a surge in the popularity of services such as blogs and social network services (SNS), which allow users to easily contribute their opinions on the Web even if they have no special knowledge of creating Web pages. Users have actively used these services to publish information. Collective intelligence [1], which is an approach for aggregating information from several individuals and generating worthwhile information, has attracted considerable attention. The social bookmark (SBM) service is one of the most representative Web services that use collective intelligence. The SBM service is a Web service for displaying and sharing each user's bookmark information. Currently, various corporations offer SBM services [2, 3, 4, 5, 6]. In an SBM service, the total number of users that bookmark a particular Web page indicates the worth of that Web page and is called its "bookmark number". The SBM service calculates the bookmark number of a Web page and displays the ranking of that Web page on the basis of its bookmark number. Many users expect the SBM service to be a new method for obtaining worthwhile information that may have been buried on the Internet.
However, an increase in the popularity of the SBM service and in the number of its users results in an increase in the amount of SBM Spam, which abuses the service for commercial purposes such as the advertisement of products. As a result, Web pages that are not desirable for many users are displayed in the ranking; this is a major problem with the SBM service. There have been a few other problems in SBM services besides SBM Spam. For instance, in a lecture at a university, students performed an exercise of creating Web pages and bookmarking one another's pages. Consequently, these Web pages were ranked high, and the ranking did not accurately indicate their actual popularity. Such a case is an example of the drawbacks of SBM services. Moreover, in "Attention videos" [7], which is a Web page that uses Hatena Bookmark (the most popular SBM service in Japan) to rank the currently most popular videos on the Internet, several Web pages of animation videos were bookmarked; hence, these Web pages were ranked higher. This ranking was not useful for the many users who were not interested in animation. As mentioned above, the SBM service has a serious drawback in that it provides a high ranking for Web pages that may not be useful for many users. This drawback is hereafter referred to as SBM pollution. It should be noted that SBM Spam is considered to be one case of SBM pollution. A few studies have already been conducted on the SBM service [8, 9, 10, 11, 12]. For instance, in [10, 11], a method was proposed for improving the accuracy of a search engine on the basis of information obtained using the SBM service. However, because SBM pollution is a relatively new problem, sufficient research has not been conducted to address it. Therefore, in this paper, we propose a method for countering SBM pollution. First, we investigate the characteristics of SBM pollution and show that a high similarity between users' bookmarking choices results in SBM pollution. Then, we propose a bookmark number reduction method based on the users' bookmark similarities. We evaluate the performance of the proposed method by applying it to Hatena Bookmark. As a result of the performance evaluation, we show that our proposed method only slightly reduces the bookmark number of Web pages not affected by SBM pollution, whereas it greatly reduces the bookmark number of Web pages strongly affected by SBM pollution. The rest of this paper is organized as follows. In Section 2, we analyze the characteristics of SBM pollution. In Section 3, we explain our method for countering SBM pollution. In Section 4, we evaluate our method and show its effectiveness in countering SBM pollution. Finally, in Section 5, we conclude this paper and discuss possible future studies.
2 Analysis of Characteristics of SBM Pollution
In this section, we analyze the characteristics of SBM pollution. We investigate the following two cases of SBM pollution in Hatena Bookmark: (a) In a lecture at a university, students performed an exercise of creating Web pages and bookmarking one another's pages. Consequently, "Attention web pages" [13], a service that lists Web pages with a high bookmark number, listed the Web pages that had only been created temporarily by the students. (b) In "Attention videos," a service that lists videos with a high bookmark number, only those videos with the tag "idolmaster" were ranked.
Fig. 1. User bookmark ratio: (a) pages bookmarked in the exercise; (b) pages with "idolmaster" tag. (Both panels plot the bookmark ratio against the user ID.)
We investigated the user bookmarks that caused SBM pollution in each case. Figure 1 shows the user bookmark ratio in descending order. Due to SBM pollution, the bookmark number of these pages was higher than their actual bookmark number. Figure 1(a) shows that the ratio is over 0.7 for almost all the students; that is, almost all the students had bookmarked the Web pages created by other students in the same lecture. Moreover, Figure 1(b) shows that several users had bookmarked all of the videos with the "idolmaster" tag. Thus, a user that causes SBM pollution bookmarks a series of Web pages. Consequently, when the bookmarks of such users are compared, an extremely high user-bookmark similarity is observed. Therefore, SBM pollution is expected to be countered by reducing the bookmark number contributed by users whose bookmark similarities are quite high, so that Web pages affected by SBM pollution are not ranked.
3 Method for Countering SBM Pollution
In this section, we explain the method for countering SBM pollution using the similarity between the bookmarks of users. First, we create a list of users, called a "blacklist," that is, a list of users that have high similarities in their bookmarks. Next, on the basis of this blacklist, the Web pages whose bookmark number increased because of SBM pollution are identified and their bookmark numbers are reduced. The blacklist consists of users that have a high degree of similarity in their bookmarks. First, we acquire the bookmarks during a period T. Then, we determine the similarity between the contents of the bookmarks of different users. The bookmark similarity sij of users ui and uj is given by

  sij = min( ci→j / mi , cj→i / mj )                                (1)

where mi is the number of Web pages bookmarked by user ui during period T and ci→j is the number of Web pages bookmarked by user uj that were already bookmarked by user ui.
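A minimal Python sketch of the similarity in Eq. (1), assuming that each user's bookmarks collected during period T are available as a set of page identifiers and ignoring the bookmarking order (so that both directed counts reduce to the size of the intersection); the variable names are ours:

```python
def bookmark_similarity(pages_i, pages_j):
    """Similarity s_ij between users u_i and u_j, per Eq. (1).

    pages_i, pages_j: sets of Web pages bookmarked during period T.
    Assumption: bookmarking order is ignored, so c_{i->j} = c_{j->i} = |intersection|.
    """
    if not pages_i or not pages_j:
        return 0.0
    common = len(pages_i & pages_j)
    return min(common / len(pages_i), common / len(pages_j))

# Example: two students who bookmarked almost the same exercise pages
print(bookmark_similarity({"p1", "p2", "p3", "p4"}, {"p1", "p2", "p3", "p5"}))  # 0.75
```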
Next, on the basis of the user similarity, we use the following algorithm (Algorithm 1) to create the blacklists and register users to them. In this algorithm, if a user is registered to an existing blacklist or to a newly created blacklist, we end the processing for that user. Therefore, a user belongs to at most one blacklist.

Algorithm 1. Creating blacklists
  Let U be the set of users
  Let γ be the threshold for the bookmark similarity
  for each u ∈ U do
    if u has already been registered to a blacklist then
      next
    end if
    for each v ∈ U \ {u} do
      if bookmark similarity between u and v > γ then
        if v has already been registered to a blacklist L then
          if ∀ x ∈ L, bookmark similarity between u and x > γ then
            register u to the blacklist L
          end if
        else
          create a new blacklist and register u and v to it
        end if
      end if
    end for
  end for

The bookmark numbers contributed by the users who caused the SBM pollution are removed from the bookmark number of the Web page affected by the SBM pollution. Let K be the bookmark number of the Web page and m be the number of blacklists. The blacklist bli (i = 1, 2, ..., m) has ni users, of whom ki bookmarked the Web page. The bookmark number K' after reduction is given by

  K' = K − Σi ki logni ki                                            (2)

where the sum runs over the blacklists i = 1, 2, ..., m, and logni denotes the logarithm to base ni.
A large reduction in the bookmark number indicates that a large number of the users who bookmarked the page appear in the blacklists.
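The following Python sketch combines Algorithm 1 and Eq. (2). It assumes that the pairwise similarities of Eq. (1) are already available through a function, and the data structures (sets of user identifiers as blacklists) are our own choice for illustration rather than the authors' implementation.

```python
import math

def build_blacklists(users, similarity, gamma):
    """Algorithm 1: group users whose pairwise bookmark similarity exceeds gamma.

    users: iterable of user ids; similarity(u, v) -> float; gamma: threshold.
    Each user ends up in at most one blacklist.
    """
    blacklists = []
    listed = {}                              # user -> the blacklist it belongs to
    for u in users:
        if u in listed:
            continue
        for v in users:
            if v == u or similarity(u, v) <= gamma:
                continue
            if v in listed:
                bl = listed[v]
                if all(similarity(u, x) > gamma for x in bl):
                    bl.add(u)
                    listed[u] = bl
            else:
                bl = {u, v}
                blacklists.append(bl)
                listed[u] = listed[v] = bl
            if u in listed:
                break                        # processing of u ends once it is registered
    return blacklists

def reduced_bookmark_number(K, blacklists, bookmarkers):
    """Eq. (2): K' = K - sum_i k_i * log_{n_i}(k_i), with k_i the blacklisted bookmarkers."""
    K_prime = K
    for bl in blacklists:
        n_i = len(bl)
        k_i = len(bl & bookmarkers)
        if k_i > 1 and n_i > 1:              # contribution is zero when k_i <= 1
            K_prime -= k_i * math.log(k_i, n_i)
    return K_prime
```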
4 Performance Evaluation
In this section, we apply the proposed method for countering SBM pollution to Hatena Bookmark and show its effectiveness. First, we acquired the Web pages that the users of Hatena Bookmark had bookmarked during the 30 days from January 23, 2009. Then, we calculated the bookmark similarity between users and created the blacklists. Next, we acquired the Web pages that had appeared in "Attention Web pages" in Hatena Bookmark on February 23, 2009.
Fig. 2. Number of bookmarks before (original) and after (filtered) applying the proposed method: (a) Web pages worthwhile for only a small number of users; (b) Web pages worthwhile for many users. (Both panels plot the bookmark number against the entry ID.)
The Web pages that are worthwhile for only a small number of users were extracted from the acquired Web pages. In order to extract these Web pages, we developed a questionnaire site to determine whether or not a Web page was worthwhile only for a small number of users. In the questionnaire, we asked, for each page, whether it would be considered worthwhile only for a small number of users. Figure 2 shows the bookmark numbers before and after the application of the proposed method to the Web pages that were worthwhile for only a small number of users and to those that were worthwhile for many users. It should be noted that the Web pages are displayed in ascending order of bookmark number. Figure 2(a) shows a reduction in the bookmark number for the Web pages that had a bookmark number smaller than 20. However, there is no reduction in the bookmark number for the Web pages that had a bookmark number larger than 20. This is because the bookmark similarity between users decreases when many users bookmark a Web page; as a result, the blacklists cannot be created appropriately. In contrast, Fig. 2(b) shows almost no reduction in the bookmark number of the Web pages that are worthwhile for many users. Hence, the proposed method is observed to be effective in the case of Web pages with a small bookmark number.
5 Conclusion and Future Work
In this paper, we have proposed a method for countering SBM pollution and have shown its effectiveness. First, we have investigated the characteristics of SBM pollution and have shown that a high similarity between users' bookmarking choices results in SBM pollution. Then, we have proposed a method for countering SBM pollution based on the users' bookmark similarities. In order to evaluate the performance of the proposed method, we have applied our SBM pollution countermeasure to an SBM service and have shown that it only slightly reduces the bookmark number of Web pages not affected by SBM pollution, whereas it greatly reduces the bookmark number of Web pages strongly affected by SBM pollution.
The appropriate parameter of the proposed method should be determined in future studies. Moreover, a method should be developed for increasing user satisfaction by using the bookmark similarity between users.
References 1. Surowiecki, J.: The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations, May 2004. Anchor Books, New York (2004) 2. del.icio.us, http://delicious.com/ 3. Diigo, http://www.diigo.com/ 4. Stumbleupon, http://www.stumbleupon.com/ 5. Faves, http://faves.com/home 6. Hatena bookmark, http://b.hatena.ne.jp/ 7. Attention videos, http://b.hatena.ne.jp/video 8. Xu, Y., Zhang, L., Li, W.: Cubic analysis of social bookmarking for personalized recommendation. In: Zhou, X., Li, J., Shen, H.T., Kitsuregawa, M., Zhang, Y. (eds.) APWeb 2006. LNCS, vol. 3841, pp. 733–738. Springer, Heidelberg (2006) 9. Wu, H., Zubair, M., Maly, K.: Harvesting social knowledge from folksonomy. In: Proceedings of the 17th ACM Conference on Hypertext and Hypermedia (HYPERTEXT 2006), August 2006, pp. 111–114 (2006) 10. Hotho, A., Jaschke, R., Schmitz, C., Stumme, G.: Information retrieval in folksonomies: Search and ranking. In: Sure, Y., Domingue, J. (eds.) ESWC 2006. LNCS, vol. 4011, pp. 411–426. Springer, Heidelberg (2006) 11. Yanbe, Y., Jatowt, A., Nakamura, S., Tanaka, K.: Can social bookmarking enhance search in the Web? In: Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2007), June 2007, pp. 107–116 (2007) 12. Noll, M., Meinel, C.: Web search personalization via social bookmarking and tagging. In: Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC 2007), November 2007, pp. 365–378 (2007) 13. Attention web pages, http://b.hatena.ne.jp/entrylist
A Human Readable Platform Independent Domain Specific Language for WSDL Balazs Simon and Balazs Goldschmidt Department of Control Engineering and Information Technology, Budapest University of Technology and Economics Magyar tudosok krt. 2., H-1117, Budapest, Hungary [email protected], [email protected]
Abstract. The basic building blocks of SOA systems are web services. WSDL, the standard language for defining web services, is far too complex and redundant to be efficiently handled by humans. Existing solutions use either graphical representations (UML, etc.), which are again inefficient in large scale projects, or define web services in the implementation’s native language, which is a bottom-up approach risking interface stability. Both lack support for concepts like conditions, access-rights, etc. The domain specific language introduced in this paper uses a Java and C#-like language for describing web service interfaces. It has the same descriptive power as WSDL while maintaining simplicity and readability. Examples show how to use the language, and how it can be compiled into WSDL. Keywords: SOA, Web Services, Domain Specific Language, WSDL.
1 Introduction
Complex distributed systems are best built from components with well-defined interfaces and a framework that helps connect them. Web services implementing WSDL interfaces represent nowadays the components that both enterprises and governments use to construct their complex distributed systems, thus implementing a Service Oriented Architecture (SOA) [1]. The advantage of building on web services technology is that one can get a vast set of standards, from simple connections to middleware functionalities such as security or transaction handling [2]. Usually a single SOA product is sufficient inside an enterprise; however, in an e-Government environment the government agencies may prefer different vendors. Most major software vendors offer web service capable environments, which are able to cooperate more-or-less with each other, as we proved in [3]. In the SOA world XML is used for interface and process description, and message formatting. The aim is interoperability, but the drawback is that handling and transforming XML documents above a certain complexity is almost impossible.
This is why most development environments have graphical tools that help developers create interface descriptions, connections, process flows and message formats. The problem with the graphical approach is that it is neither efficient, nor reliably repeatable, nor sufficiently controllable, nor easily automatable. On the other hand, the standards usually have a lot of redundant parts that result in poor readability and manageability. For example, the message element of WSDL 1.x was omitted from WSDL 2.0 because of its redundancy. The development tools do not support the inclusion of special extensions in the interface specification (like pre- and post-conditions, authorization, etc.). Some tools even have special naming conventions that must be followed, otherwise the generated code is even less readable than necessary. We have examined a lot of products [3] and have found a lot of peculiarities which have to be taken into account beyond the recommendations of the WS-I Basic Profile. Here are some examples:
– Microsoft Windows Communication Foundation requires that the WSDL message/part element is named "parameters", otherwise the generated C# interfaces use the more complicated MessageContract representation instead of the more readable DataContract representation.
– Oracle 10g cannot handle WSDL descriptions split into multiple files.
– In the types part of the WSDL an XSD complex type should not have the same name as the corresponding element, otherwise Apache Axis2 will number the generated classes and they will be very hard to handle. Therefore, it is advisable to use a "Type" suffix on complex type names.
To solve the above problems, the authors propose an abstract interface model that can manage WSDL concepts, and also introduce a new, extendable language called Service Oriented Architecture Language (SOAL) that can be used for describing such interfaces and is more easily readable and manageable by humans. This model and language also enable automatization, vendor-specific WSDL generation, and compile-time model checking. The paper is structured as follows. At first, the related works will be outlined; then, the proposed domain specific language and the architecture of the framework built around it will be introduced. Finally, our results will be summarized and our future plans will be listed.
2 Related Work
Using domain-specific languages in Service-Oriented Architecture is not a novel idea. There are two widely adopted branches of these efforts: applying an existing programming language or using graphical notations. In the first case, when the DSL is an already existing programming language, the usual approach is to use the development platform language. SOA interfaces are defined in Java or C#, and the corresponding WSDL description is generated from these definitions [4,5]. The advantage of this is that the developers can use the language they are familiar with for interface definitions. The drawback is that the generated WSDL does not always conform to the naming conventions of other tools and, due to the bottom-up approach, any change in the implementation may cause a modification in the WSDL description.
In the second case, the tools of this approach usually use their own extended graphical notation [6,7]. UML is also a widely accepted language for WSDL description and generation [8,9]. UML has all the necessary features the programming language approach lacked. Its disadvantage lies in its visual form. Efficient, repeatable, and automatable development processes need a textual representation. Such representations of UML are not more suitable for SOA interface definition than WSDL itself. Another drawback is that the existing notation languages are not capable of defining pre- and post-conditions, access rights, authorization, etc. There are only a few approaches similar to ours. There are solutions for compiling CORBA IDL to WSDL, e.g., [10]. There is a language proposal similar to ours called relax-ws [11]; however, it is incomplete and has been unmaintained since 2008. Another interesting approach is the Service Component Architecture [12] specification from IBM; however, it is an XML-based language. The authors are not aware of any DSLs for WSDL that can take into account all the peculiarities mentioned in the introduction. Nor do they know any extensible human readable DSL that provides compile-time type checking, pre- and post-condition support, handling of access rights, etc. Therefore, we decided to create a DSL for SOA to be able to handle all aspects of a SOA system. The SOAL for WSDL part introduced in this paper is only the base, although it is already a powerful aid for rapid web service development. Later, the language will be extended with other constructs.
3 SOAL for WSDL
3.1 Specification
This subsection introduces the specification of the SOAL domain specific language for XSD [13] and WSDL [14]. The main objective was to support the top-down development of web services and to enforce conformance with the WS-I Basic Profile. The language includes only those constructs of XSD which can be represented as normal Java and .NET types. SOAP header parts in WSDL messages are not yet supported, since these are only used in exceptional cases. The specification is introduced using the concepts of the more popular WSDL 1.1 standard; however, the language is compatible with WSDL 2.0, too. Both XSD and WSDL define a namespace and both of them can import other files from other namespaces. In SOAL the following construct illustrates this:

namespace NamespaceName = "NamespaceUri" {
  using NamespaceA;
  ...
}
The NamespaceUri is mapped to the XML namespace URI; the NamespaceName is only used within SOAL. It is essential that everything is described in SOAL, i.e., no external linking of XSD or WSDL files is possible; they must be converted to SOAL. The rationale behind this is that SOAL supports compile-time static type checking, which is only possible if every construct is described in the language.
The language has the conventional built-in types (bool, byte, int, long, float, double, string, Date, Time, DateTime, TimeSpan), which can be directly mapped to common programming language types and XSD types. The language also supports array types (e.g. int[]) and nullable types (e.g. double?). Structured types can be constructed using the struct or enum keyword, similarly to C#. An example is shown in Table 1. The message part of WSDL is redundant; therefore, it is omitted from the language. The portType part of WSDL is mapped to interfaces (see Table 1). The binding part of the WSDL is mapped into a binding construct (see Table 1), which is decoupled from the interface, as opposed to WSDL. This is important, because the binding construct is reusable across multiple interfaces. In the binding construct the transport and encoding protocols are mandatory, and it is possible to include other aspects such as WS-Policy assertions (currently only WS-Addressing and MTOM are supported). At the language level the WS-Policy assertions will be much simpler than the original ones defined by the WS-Policy standard family. The exact representation of these assertions is the subject of further research, but the main objective is to define a common configuration specification to support as many web service implementation frameworks (including .NET, Java and WS-Policy itself) and as many WS-* standards (e.g. WS-ReliableMessaging, WS-Security, WS-SecureConversation, WS-AtomicTransaction, etc.) as possible. Another branch of further development includes Design-by-Contract. The last part of the WSDL lists the endpoints. In the proposed language an endpoint can be specified as seen in Table 1.
3.2 Architecture
The grammar for SOAL was implemented in the M language designed by Microsoft. The M language is a declarative language for working with data and building domain models. It is part of the SQL Server Modeling Services [15] (formerly Oslo) framework, which also includes an editor called IntelliPad for domain specific languages and a repository for storing data models. Based on the grammar a parser was generated. This parser can read in interface descriptions in the SOAL form and can transform the input into an object model (described by an abstract interface description model similar to UML interfaces), which is then easily processable from any .NET language. The object model conforming to this abstract interface description model can be directly mapped to the XML representation of WSDL, however, it is necessary to take the peculiarities of the different web service designers into account. The framework also generates directly importable projects for each web service designer. The transformations are defined using the Text Template Transformation Toolkit for Visual Studio. It is also possible to refactor existing WSDL XML files into the object model and then translate this back into the proposed domain specific language, or transform the representation into any web service designer’s project. This
Table 1. Mapping between SOAL and XSD, WSDL

SOAL syntax:

struct Error {
  string Message;
  byte[] Data;
}

interface Calculator {
  int Add(int left, int right);
}

binding HttpSoap11Binding {
  transport HTTP;
  encoding SOAP {
    Version = SoapVersion.Soap12,
    EncodingStyle = SoapEncodingStyle.DocumentWrapped
  }
}

endpoint CalculatorService : Calculator {
  binding HttpSoap11Binding;
  location "http://tempuri.org";
}

Corresponding XSD or WSDL syntax:

<xs:complexType name="Error">
  <xs:sequence>
    <xs:element name="Message" type="xs:string"/>
    <xs:element name="Data" type="xs:base64Binary"/>
  </xs:sequence>
</xs:complexType>

<wsdl:message name="Calculator_Add_InputMessage">
  <wsdl:part name="parameters" element="tns:Add" />
</wsdl:message>
<wsdl:message name="Calculator_Add_OutputMessage">
  <wsdl:part name="parameters" element="tns:AddResponse" />
</wsdl:message>
<wsdl:portType name="Calculator">
  <wsdl:operation name="Add">
    <wsdl:input message="tns:Calculator_Add_InputMessage" />
    <wsdl:output message="tns:Calculator_Add_OutputMessage" />
  </wsdl:operation>
</wsdl:portType>

<wsdl:binding name="HttpSoap11Binding_Calculator" type="tns:Calculator">
  <soap12:binding transport="http://schemas.xmlsoap.org/soap/http"/>
  <wsdl:operation name="Add">
    <soap12:operation soapAction="http://myns/Calculator/Add" style="document" />
    <wsdl:input>
      <soap12:body use="literal" />
    </wsdl:input>
    <wsdl:output>
      <soap12:body use="literal" />
    </wsdl:output>
  </wsdl:operation>
</wsdl:binding>

<wsdl:service name="CalculatorService">
  <wsdl:port name="HttpSoap11Binding_CalculatorService" binding="tns:HttpSoap11Binding_Calculator">
    <soap12:address location="http://tempuri.org" />
  </wsdl:port>
</wsdl:service>
capability can also be used to translate interface descriptions between different WSDL versions. The architecture of the platform independent SOAL for WSDL framework can be seen in Figure 1.
Fig. 1. The architecture of the platform independent SOAL for WSDL framework (WSDL interface in SOAL ⇄ parser/generator ⇄ interface object model ⇄ transformator/refactor tool ⇄ WSDL XML)
3.3 Evaluation
When the authors tried to create a SOA system based on products from multiple vendors [16] it took about a month to make them interoperable. The reason for this was that the bottom-up approach of creating web services only works in simple cases. If there are interdependencies between the participants of a SOA system, regenerating all the client proxies in the appropriate order is an unmaintainable process. In a second case study we tried to do the top-down approach, i.e. at first, we designed the WSDL interfaces and the client proxies and server implementations were generated from these. The advantage of this approach is that it is maintainable. However, we have found no WSDL designers that can enforce all the peculiarities of SOA products, we had to obey these rules ourselves. It took us about two weeks to realize our new case study. After completing the SOAL for WSDL framework we could rebuild both of our case studies from the ground up in a week. The framework takes care of all the peculiarities of the SOA products, we only have to deal with human readable interface descriptions and the service implementation itself. The framework can even generate complete projects for the SOA products where only the implementation of the service is to be filled. Currently Microsoft Windows Communication Foundation and GlassFish ESB are supported, but others are planned to be built in, e.g. Oracle SOA Suite, RedHat JBoss, IBM WebSphere and Apache Axis2. Table 2 shows how many files have to be generated for each product and how many of them have to be edited manually after code generation. It can be seen easily that an automatic generation support can strongly speed up development, especially in an e-Government environment, where multiple SOA products may be used.
A Human Readable Platform Independent Domain Specific Language
535
Table 2. Statistics on the number of files in the projects of SOA products for a single web service

Product name        | Number of files in a project          | Number of files to be edited manually
Microsoft WCF       | 18                                    | 1
Sun GlassFish ESB   | 37                                    | 1
Oracle SOA Suite    | 26                                    | 1
IBM WebSphere       | 48                                    | 1
RedHat JBoss        | 40                                    | 1
Apache Axis2        | 59 (+137 Axis2 jars, JSPs, etc.)      | 1
4 Summary
The proposed framework makes designing and handling WSDL interfaces easier. From the design point of view it is easier to write the interface description code in a C# and Java-like language without the burden of bothering with the redundant message elements. The resulting code is much more intuitive, more compact and more readable than the original WSDL XML. The tedious clicking in the GUI interface of the WSDL designers can also be eliminated. The proposed domain specific language can also provide static type checking for future extensions e.g. BPEL process descriptions. Another advantage of the framework is that it provides refactoring and migration capabilities between WSDL versions and web service designers, which can greatly increase interoperability in an e-Government environment. The authors have successfully tested the framework with Microsoft Windows Communication Foundation and GlassFish ESB, and it is planned to include support for other web service designers. The SOAL language and the generated XSD and WSDL files conform to all the peculiarities of these tools in a platform independent way, therefore, to support a new framework only minimal work has to be done: creating the appropriate directory structure of the target project and the generation of some configuration files. Future plans to extend the language and the framework include WS-Policy assertions, Design-by-Contract, security aspects (access rights, SAML attributes), business processes and model checking support on these.
References 1. OASIS. SOA Reference Model (2006), http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=soa-rm 2. OASIS. WS-* Standards (2002-2009), http://www.oasis-open.org/specs/ 3. Simon, B., Laszlo, Z., Goldschmidt, B., Kondorosi, K., Risztics, P.: Evaluation of WS-* Standards Based Interoperability of SOA Products for the Hungarian eGovernment Infrastructure. In: International Conference on the Digital Society, pp. 118–123. IEEE Computer Society, Los Alamitos (2010) 4. Sun Microsystems. JSR 224: Java API for XML-Based Web Services (JAX-WS) 2.0 (2006), http://jcp.org/en/jsr/detail?id=224
536
B. Simon and B. Goldschmidt
5. Microsoft. Windows Communication Foundation (2006), http://msdn.microsoft.com/en-us/netframework/aa663324.aspx 6. Oracle. Oracle JDeveloper 11g (2009), http://www.oracle.com/technology/products/jdev 7. Altova. XML Spy WSDL Editor (2010), http://www.altova.com/xmlspy/wsdl-editor.html 8. Vara, J.M., de Castro, V., Marcos, E.: WSDL Automatic Generation from UML Models in a MDA Framework. In: NWESP ’05: Proceedings of the International Conference on Next Generation Web Services Practices, Washington, DC, USA, p. 319. IEEE Computer Society, Los Alamitos (2005) 9. Aho, P., Maki, M., Pakkala, D., Ovaska, E.: MDA-based Tool Chain for Web Services Development. In: WEWST ’09: Proceedings of the 4th Workshop on Emerging Web Services Technology, pp. 11–18. ACM, New York (2009) 10. Sun Microsystems. IDL to WSDL Code Generation (2009), http://wiki.open-esb.java.net/Wiki.jsp?page=CORBABC.Idl2Wsdl 11. Papasando. relax-ws – A Relaxing Way to Create Web Service Definitions (2008), http://code.google.com/p/relax-ws/ 12. IBM: Service Component Architecture (2006), http://www.ibm.com/developerworks/library/specification/ws-sca/ 13. W3C. XML Schema (2004), http://www.w3.org/XML/Schema 14. W3C. Web Services Description Language (WSDL) 1.1 (2001), http://www.w3.org/TR/wsdl 15. Microsoft. SQL Server Modeling Services (2009), http://msdn.microsoft.com/en-us/data/ff394760.aspx 16. Simon, B., Laszlo, Z., Goldschmidt, B.: SOA Interoperability, a Case Study. In: Proceedings of the IADIS International Conference, Informatics 2008, pp. 131–138 (2008)
A Human Readable Platform Independent Domain Specific Language for BPEL Balazs Simon, Balazs Goldschmidt, and Karoly Kondorosi Department of Control Engineering and Information Technology, Budapest University of Technology and Economics Magyar tudosok krt. 2., H-1117, Budapest, Hungary [email protected], [email protected], [email protected]
Abstract. The basic building blocks of SOA systems are web services. High-level service orchestration is usually achieved by defining processes in BPEL. The available development environments, however, usually have visual tools for BPEL handling. The problem with this is that they are not satisfactory when efficiency, repeatability, and manageability are necessary. The domain specific language introduced in this paper uses a Java and C#-like language for describing web service interfaces and BPEL processes. It has the same descriptive power as WSDL and BPEL while maintaining simplicity and readability. Examples show how to use the language, and how it can be compiled into BPEL process descriptions. Keywords: SOA, Web Services, Domain Specific Language, BPEL.
1 Introduction
Complex distributed systems are best built from components with well-defined interfaces and a framework that helps connect them. Web services and BPEL processes implementing WSDL interfaces represent nowadays the components that both enterprises and governments use to construct their complex distributed systems, thus implementing a Service Oriented Architecture (SOA) [1]. The advantage of building on web services technology is that one can get a vast set of standards, from simple connections to middleware functionalities such as security or transaction handling [2]. Usually a single SOA product is sufficient inside an enterprise; however, in an e-Government environment the agencies may prefer different vendors. Most major software vendors offer web service and BPEL capable and more-or-less interoperable [3] environments. In the SOA world XML is used for interface and process description, and message formatting. The aim is interoperability, but the drawback is that handling and transforming XML documents above a certain complexity is almost impossible. This is why most development environments have graphical tools to help developers. The problem with the graphical approach is that it is neither efficient, nor reliably repeatable, nor sufficiently controllable, nor automatable.
On the other hand, the standards usually have a lot of redundant parts that result in poor readability and manageability. For example, the partnerLinkType-partnerLink constructs for partners, or the property-propertyAlias-correlationSet constructs for correlations, mean unnecessary redundancy in BPEL. Unfortunately, the process designer tools usually map these constructs directly to the graphical interface instead of hiding them from the users. The development tools do not support the inclusion of special extensions in the process description (like pre- and post-conditions, authorization, etc.). Although BPEL is a standard, different tools may implement different semantics for the same process description (e.g. unlike in GlassFish ESB [4], the order of the subelements of a complex variable in ActiveVOS [5] depends on the order of the assignments to the subelements; therefore, variables must always be initialized as XML literals before assignment to preserve the order defined in XSD). To solve the above problems, the authors propose an abstract model that can manage BPEL concepts, and also introduce a new, extendable language called Service Oriented Architecture Language (SOAL) that can be used for describing BPEL processes, and is more easily readable and manageable by humans. This model and language also enable automatization, vendor-specific BPEL generation, and compile-time model checking. The paper is structured as follows. At first, the related works will be outlined; then, the proposed domain specific language and the architecture of the framework built around it will be introduced. Finally, our results will be summarized and our future plans will be listed.
2 Related Work
The BPEL standard defines an XML language for business processes. This kind of representation is inappropriate for human usage, since it is very verbose and very redundant. Therefore, BPEL process designer products map the BPEL standard into a graphical notation, although, this way they preserve an unnecessary redundancy. [6,4,5,7] Most of them are even capable of synchronizing the XML and the graphical view. The disadvantage of the graphical notation is that when a process becomes complicated it is very hard to get an overview of it and most designers become very slow even if the process contains only a few dozen activities. There are other approaches for generating BPEL code. Usually other languages (e.g. BPMN [8,9], UML activity diagrams [9], etc.) are used to define the business process, then this representation is converted into BPEL. The advantage of this is that the other representations can hide the redundancies of the BPEL standard, however, they are not BPEL compatible and therefore, the resulting BPEL code is bloated and cannot be processed by humans even in the graphical form. There are several approaches to extend the BPEL notation with other aspects such as constraints [10], integrated business rules [11] and aspect-oriented programming [12]. There are a lot of efforts [13,14] to transform BPEL processes in
order to be able to execute model checking algorithms on them. It is very hard to support these tasks on every BPEL engine. The common problem with visual tools is that they are not satisfactory when efficiency, repeatability, manageability and extensibility are necessary. Our opinion is that in the areas mentioned above a high-level, abstract, and above all textual description language is of paramount importance. The authors are not aware of any DSLs that can take into account all these aspects. Nor do they know any extensible human readable DSL that provides compile-time type checking, pre- and post-condition support, handling of access rights, etc. Therefore, we decided to create a DSL for SOA to be able to handle all aspects of a SOA system. The SOAL for WSDL and BPEL part introduced in this paper is only the beginning, although it is already a powerful aid for rapid web service development. Later, the language will be extended with other constructs.
3 SOAL for BPEL
3.1 Specification
This subsection introduces the specification of the SOAL language for BPEL. Due to space limitations only the most important aspects of BPEL will be covered. Interfaces (WSDL portTypes) and complex types (XSD sequence types) can be described similarly to C# and Java. BPEL is built on these interfaces and complex types (see Table 1). Most of the BPEL constructs are mapped to keywords in SOAL. if, while and for have the exact same syntax as in Java or C#. faultHandler-s are mapped to the familiar try-catch construct. This try-catch construct is extended with compensation, termination and events parts to be able to describe every aspect of a BPEL scope. We have mapped the other activities of BPEL into this kind of C# and Java-like syntax, but they are omitted from this paper due to space limitations. It is important that there is no need to define partnerLinkType-s and partnerLink-s, since these can be automatically inferred from the process description. To call a service, the invoke activity can be used; the receive activity is responsible for accepting web service calls from the outside world. The operation names are always qualified with the implemented interface name, therefore it is always unambiguous which operation is meant. It is possible to define correlations, but there is no need to create property-s, propertyAlias-es and correlationSet-s. Simply the correlation expression has to be defined (see Table 1). This approach is much simpler and much more readable than the one described in the BPEL standard, while it is still possible to map this kind of solution into BPEL. We have even transformed XPath expressions into statically type-checked C# and Java-like expressions. The 1-based indexing of XPath is mapped to the conventional 0-based indexing. The code generator and refactor tools of the SOAL framework ensure that the conversion between these representations is safe.
3.2 Correlations
Take the following example:

invoke Service.Operation(out p1, out p2)
  join correlation (v1 == p1.Field1 && v2 == p1.Field2),
                   (v3 == p2.Field3 && v4 == p2.Field4)
Let's define the following concepts:
– local variables participating in correlation expressions: v1, v4, etc.
– correlation expressions: v1 == p1.Field1, v4 == p2.Field4, etc.
– query expressions on parameters: p1.Field1, p3.Field3, etc.
– bracketed list of correlation expressions: (v1 == p1.Field1 && v2 == p1.Field2), etc.
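Each bracketed list of correlation expressions corresponds to one BPEL correlation set, as formalized below. The following Python sketch illustrates how such a candidate correlation set (one property per local variable, one property alias per query expression) could be derived from the bracketing; the dictionary representation and the generated names are our own assumptions, not the framework's actual output.

```python
def infer_correlation_sets(bracketed_lists):
    """Map each bracketed list of correlation expressions to a BPEL-style correlation set.

    bracketed_lists: list of lists of (local_variable, query_expression) pairs.
    """
    sets = []
    for i, exprs in enumerate(bracketed_lists, start=1):
        sets.append({
            "correlationSet": f"cs{i}",                                # generated name (assumed)
            "properties": [var for var, _ in exprs],                   # one property per variable
            "propertyAliases": {var: query for var, query in exprs},   # alias = query on the message
        })
    return sets

example = [[("v1", "p1.Field1"), ("v2", "p1.Field2")],
           [("v3", "p2.Field3"), ("v4", "p2.Field4")]]
for cs in infer_correlation_sets(example):
    print(cs)
```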
A bracketed list of correlation expressions in SOAL can be mapped to a correlation set in BPEL. The bracketing of correlation expressions containing the same variables in two different activities is consistent if they can be mapped to the same correlation set in BPEL. SOAL makes defining correlation expressions much more intuitive than defining correlation sets in BPEL. However, the mapping of correlations is not a direct mapping between BPEL and SOAL. In order to have maintainable process descriptions, transforming back and forth between the two representations should always result in the same descriptions. This means that the transformation should be isomorphic. This sub-section proves that the correlation expressions in SOAL have the same descriptive power as correlation sets in BPEL, and in order to have efficient transformation between them, explicit bracketing of correlation expressions in SOAL is necessary. Assume, that a BPEL process has the following – properties: P = {p1 , p2 , . . . , pn }, n ∈ Z+ – property aliases: R = {r1 , r2 , . . . , rm }, m ∈ Z+ – correlation sets: S = {s1 , s2 , . . . , sk }, k ∈ Z+ , si ∈ 2P , si = {pi1 , pi2 , . . . , piqi }, qi ∈ Z+ , i = {1, 2, . . . , k} – activities (invoke, receive, reply) A = {a1 , a2 , . . . , al }, l ∈ Z+ with the correlation sets: c : A → 2S Assume, that we have a SOAL representation of this process with: – local variables in correlation expressions: V = {v1 , v2 , . . . , vn }, n ∈ Z+ – query expressions on parameters in activites (invoke, receive, reply) with correlations: T = {t1 , t2 , . . . , tm }, m ∈ Z+ – activities (invoke, receive, reply) A = {a1 , a2 , . . . , al }, l ∈ Z+ with the correlation expressions: d : A → 2V ∗T In order to have an isomorphic transformation between the two representations, the following conditions must be met: – l = l and there is a bijection between A and A
– there is a bijection between R and T
– there is a bijection between P and V
– activity ai has the correlation sets c(ai) = {sg1, sg2, . . . , sgj} and the corresponding activity ai has the correlation expressions d(ai) = {(vh1, th1), (vh2, th2), . . ., (vhj, thj)}, bracketed so that the correlation expressions in the brackets correspond to the correlation sets of ai based on the bijection between V and P.
This mapping also shows that correlation expressions in SOAL and correlation sets in BPEL have the same descriptive power. If there is no explicit bracketing, the transformation framework must find the appropriate bracketing of correlation expressions in SOAL so that the transformation remains isomorphic. Let's call this problem correlation bracketing (CB); it can be formalized as follows:
– Input: a set of local variables V for correlations, a set of query expressions T on parameters, a set of activities A with correlation expressions, and a function d : A → 2V ∗T which connects the variables and queries to the activities
– Question: Can the connections of d : A → 2V ∗T be partitioned into o ≤ O disjoint sets W1, W2, ..., Wo such that, for 1 ≤ i ≤ o, Wi induces consistent correlation sets between the activities and the variables?
Theorem 1. The automatic bracketing of correlation expressions to achieve the minimum number of correlation sets is an NP-complete problem.
Proof. First, it is clear that the correctness of a given bracketing can be checked in polynomial time, therefore CB ∈ NP. The NP-complete biclique decomposition [15] problem (BD) is the following:
– Input: Bipartite graph B = (X, Y, E), positive integer K
– Question: Can the edges of B be partitioned into k ≤ K disjoint sets E1, E2, ..., Ek such that, for 1 ≤ i ≤ k, Ei induces a complete bipartite subgraph of B?
We will prove that BD ≺ CB. Let V = Y, A = X and T = Y. Let d(ai ≡ xi) = {(vj ≡ yj, tj ≡ yj) | (xi, yj) ∈ E}. And finally, let O = K. It is easy to see that in this case BD can be solved iff CB can be solved. Therefore CB ∈ NP-hard and, based on the first part of the proof, also CB ∈ NP-complete.
3.3 Architecture
The grammar for the proposed DSL was implemented in the M language designed by Microsoft. The M language is a declarative language for working with data and building domain models. It is part of the SQL Server Modeling [16] (formerly Oslo) framework, which also includes an editor called IntelliPad for domain specific languages and a repository for storing data models.
Table 1. Mapping between SOAL and XSD, WSDL, BPEL

SOAL syntax:

struct Error {
  string Message;
  byte[] Data;
}

interface ICalculator {
  asynchronous Left(int id, int left);
  asynchronous Right(int id, int right);
}

interface ICalculatorCallback {
  asynchronous Result(int id, int result);
}

bpel CalculatorProcess : ICalculator {
  ICalculatorCallback callback;
  int id, left, right, result, requestID;
  parallel {
    instantiate receive ICalculator.Left(out id, out left)
      join correlation requestID == id;
    instantiate receive ICalculator.Right(out id, out right)
      join correlation requestID == id;
  }
  assign { result = left+right; }
  invoke callback.Result(requestID, result);
}

Corresponding XSD, WSDL or BPEL syntax:

<xs:complexType name="Error">
  <xs:sequence>
    <xs:element name="Message" type="xs:string"/>
    <xs:element name="Data" type="xs:base64Binary"/>
  </xs:sequence>
</xs:complexType>

WSDL:
<wsdl:portType name="ICalculator">
  <wsdl:operation name="Left">
    <wsdl:input message="tns:ICalculator_Left_InputMessage" />
  </wsdl:operation>
  ...
</wsdl:portType>
...

BPEL:
<process name="CalculatorProcess" ...>
  <partnerLinks>
    <partnerLink name="ICalculatorPL" partnerLinkType="tns:ICalculatorPLT" myRole="ICalculatorRole" />
    <partnerLink name="callback" partnerLinkType="tns:ICalculatorCallbackPLT" partnerRole="ICalculatorCallbackPLT" />
  </partnerLinks>
  ...
  <sequence>
  ...
Based on the grammar a parser can read in BPEL process descriptions in the SOAL form and can transform the input into an object model (described by an abstract process model of BPEL), which is then easily processable from any .NET language. The abstract process model is the exact mapping of the standard BPEL elements into .NET classes. The object model conforming to this abstract process model can be directly mapped to the XML representation of BPEL, however, due to the differences between the execution semantics of different BPEL engines, it is necessary to take the peculiarities of each engine into account. The framework also generates directly importable projects for each BPEL designer. The transformations are defined using the Text Template Transformation Toolkit for Visual Studio. It is also possible to refactor existing BPEL XML files into the object model and then translate this back into the proposed domain specific language, or transform the representation into another engine’s XML form. The architecture of the framework can be seen in Figure 1. The authors have successfully tested the framework with GlassFish ESB and ActiveVOS, and it is planned to include support for other BPEL engines such as Oracle BPEL, IBM WebSphere or Apache ODE. It is hard to find the peculiarities of each engine, therefore, supporting a new tool requires a lot of testing. However, this can be automated, and it is subject of further research.
Fig. 1. The architecture of the platform independent SOAL for BPEL framework (BPEL process in SOAL ⇄ parser/generator ⇄ BPEL process object model ⇄ transformator/refactor tool ⇄ BPEL XML for GlassFish and for ActiveVOS)
4 Summary
The proposed framework makes designing and handling BPEL processes easier. From the design point of view it is easier to write the process code in a C# and Java-like language without the burden of bothering with the redundant partner links and correlation sets. The resulting code is much more intuitive, more compact and more readable than the original BPEL XML. The tedious clicking in the GUI interface of the BPEL designers can also be eliminated. The proposed domain specific language provides static type checking even if the BPEL designer tools fail to do so e.g. in XPath expressions. Another advantage of the framework is that it provides refactoring capabilities, which can also be used for migration. This can increase productivity especially in e-Government environments. It is even desirable to provide instance
migration of half-executed BPEL processes. The framework can make this task easier by enforcing the required design patterns like the ones proposed in [17]. Future plans to extend the language and the framework include WS-Policy assertions, Design-by-Contract, security aspects (access rights, SAML attributes) and model checking support on these. The main objective is to create a domain specific language to describe as many aspects of SOA as possible.
References 1. OASIS. SOA Reference Model (2006), http://www.oasis-open.org/committees/tchome.php?wgabbrev=soa-rm 2. OASIS, WS-* Standards (2002-2009), http://www.oasis-open.org/specs/ 3. Simon, B., Laszlo, Z., Goldschmidt, B., Kondorosi, K., Risztics, P.: Evaluation of ws-* standards based interoperability of soa products for the hungarian egovernment infrastructure. In: International Conference on the Digital Society, pp. 118–123. IEEE Computer Society, Los Alamitos (2010) 4. Sun Microsystems. GlassFish ESB, https://glassfish.dev.java.net/ 5. Active Endpoints. ActiveVOS, http://www.activevos.com/ 6. Oracle. JDeveloper 11g, http://www.oracle.com/technology/products/jdev 7. IBM. WebSphere Business Modeler, http://www-01.ibm.com/software/integration/wbimodeler/ advanced/features/ 8. Ouyang, C., et al.: From bpmn process models to bpel web services. In: IEEE International Conference on Web Services, pp. 285–292 (2006) 9. Ouyang, C., et al.: Translating standard process models to bpel. In: Dubois, E., Pohl, K. (eds.) CAiSE 2006. LNCS, vol. 4001, pp. 417–432. Springer, Heidelberg (2006) 10. Baresi, L., et al.: Policies and aspects for the supervision of bpel processes. In: Krogstie, J., Opdahl, A.L., Sindre, G. (eds.) CAiSE 2007 and WES 2007. LNCS, vol. 4495, pp. 340–354. Springer, Heidelberg (2007) 11. Rosenberg, F., Dustdar, S.: Business rules integration in bpel – a service-oriented apporach. In: Proceedings of the 7th International IEEE Conference on ECommerce Technology, CEC ’05 (2005) 12. Charfi, A., Mezini, M.: Middleware support for bpel workflows in the ao4bpel engine. In: Dustdar, S., Fiadeiro, J.L., Sheth, A.P. (eds.) BPM 2006. LNCS, vol. 4102. Springer, Heidelberg (2006) 13. Fu, X., Bultan, T., Su, J.: Analysis of interacting bpel web services, pp. 621–630. ACM Press, New York (2004) 14. Hinz, S., et al.: Transforming bpel to petri nets. In: van der Aalst, W.M.P., Benatallah, B., Casati, F., Curbera, F. (eds.) BPM 2005. LNCS, vol. 3649, pp. 220–235. Springer, Heidelberg (2005) 15. Janssen, P., Amilhastre, J., Vilarem, M.C.: Complexity of minimum biclique cover and decomposition for a class of bipartite graphs. Technical Report 96035, LIRMM (1995) 16. Microsoft. SQL Server Modeling, http://msdn.microsoft.com/en-us/data/ff394760.aspx 17. Simon, B., Goldschmidt, B., Laszlo, Z.: Bpel movie framework: Replaying bpel processes from logs. In: Proceedings of the 13th IASTED International Conference, Software Engineering and Applications 2009, pp. 242–249. ACTA Press (2009)
Impact of the Multimedia Traffic Sources in a Network Node Using FIFO scheduler Tatiana Annoni Pazeto1, Renato Moraes Silva1, and Shusaburo Motoyama2 1
Federal University of Mato Grosso (UFMT) Rondonópolis - MT - Brazil [email protected], [email protected] 2 Telematics Department - School of Electrical and Computer Engineering State University of Campinas (UNICAMP) Campinas - SP - Brazil [email protected]
Abstract. The recent substantial growth of video and voice traffic on the Internet may have a great impact on the conception and dimensioning of a network node designed to serve only data traffic. The objective of this paper is to investigate how these traffics can affect the present network, which works with a FIFO scheduler. For this purpose, a simulation platform consisting of several different kinds of sources, a buffer and a FIFO scheduler was developed in the C++ language. Many different scenarios of traffic compositions were simulated for the study of the network node behavior in relation to queue and system times, which are important parameters for QoS definition. The results showed that video traffic can very strongly impact the whole network infrastructure, including congesting the entire system. The results also showed that voice traffic has less impact on the network, but that in a mixed operation of traffics the system cannot guarantee the QoS of voice traffic, which needs almost real-time treatment. The main conclusion of this paper is that admission control mechanisms and QoS provisioning need to be implemented urgently to cope with the fast growth of video and voice traffics. Keywords: Multimedia traffic, FIFO Scheduler, Queue and system times.
1 Introduction

Until recently, the majority of the traffic transported in computer networks was data traffic. Nowadays, however, all types of traffic, such as voice, video, images and data, are transported in the same network. The transport of these traffic types, called multimedia here, may have a great impact on the design and dimensioning of a network node. One of the main concerns is the impact on the scheduler used. The FIFO scheduler is the most commonly used in present networks, but it was designed for only one type of traffic and is not prepared to serve a variety of traffic types. Thus, many types of schedulers have been proposed to deal with mixed traffic, such as DRR (Deficit Round Robin) [2] and WFQ (Weighted Fair Queuing) [1], among others.
These schedulers are designed to ensure fairness or QoS (Quality of Service) for each type of traffic. The search for the most appropriate scheduler for a network carrying a variety of traffic types is still a very active field of research. The scheduling process and its performance are influenced by the traffic source model adopted. The Poisson traffic model is no longer appropriate after W. E. Leland et al. showed in their study the self-similar nature of data traffic [3]. Since then, many types of self-similar traffic and queue models have been proposed [4], [5], [6]. However, such mathematical traffic models are complex and not yet fully fitted to actual traffic. The objective of this work is to study the impact of multimedia traffic on the FIFO scheduler. The traffic model used is based on the On/Off type of source. In this type of source there are two periods: in the On period the source is active and generates data, while in the Off period no data is generated. The distributions of the period lengths and of the intervals between periods may be chosen to fit each type of traffic. Although this type of source is not totally faithful to actual traffic, with appropriately chosen parameters it can be a good model to represent different types of traffic. The main impacts we are concerned with are the waiting time, service time and system time in a network node. In Section 2 the parameters of the On/Off sources used in this work are defined and the different source proposals are described. The description of the simulated scenarios is given in Section 3. The results of the impacts using the FIFO scheduler are shown in Section 4. Finally, in Section 5, the main conclusions are presented.
2 Traffic Model and Simulation Considerations

The choice of a traffic model for a network with multimedia traffic such as voice, video and data is a difficult task because each type has different characteristics. In general, self-similar traffic is modeled as the aggregation of the incoming traffic at a node. To model a single source, different approaches have been adopted. One approach, used in [7], is to divide one Internet connection into sections, each section with different distributions, which makes the process relatively complex to implement. The On/Off traffic model for each source is very simple to implement, and the aggregate of these sources is also a good approximation of self-similar traffic; it has thus been used to study network performance [8], [9]. In the present work, we used the On/Off traffic model for each individual source. Table 1 shows the parameters used in the On/Off sources. These are the parameter values commonly used in the literature.

Table 1. Parameters for generation of voice, video and data sources

Type of Traffic | Packet Size (bits) | Peak Rate (bps) | Average Toff (s)
Voice           | 1360               | 65536           | 0.0016
Video           | 4096               | 1048576         | 0.001
Data            | 4096               | 307200          | 0.003
Three types of On/Off sources are used in this work. In the first type, called Fixed On/Off, only one packet is generated in each active interval (Ton). The packet is generated with a fixed size, and the duration of Ton is calculated by dividing the packet size by the peak rate of the source. After Ton there is an Off period (Toff) whose duration follows a negative exponential distribution. The second type of source, called On/Off Variable 1.0, may or may not generate a fixed number of packets per interval. In this source, a number of packets is generated first and then the Ton necessary to transmit these packets is calculated. Moreover, the packet sizes and the Off period are generated according to negative exponential distributions. The last type of source, called On/Off Variable 2.0, is similar to the previous model in that Ton and the packet sizes are not fixed, following exponential distributions, and the number of packets per interval is not fixed. In this source, Ton is generated first and then as many packets as can be fitted into Ton are generated. The fitting is verified on a packet-by-packet basis; if the last packet cannot be fitted into Ton, it is stored to be sent in the next interval. A simulation platform was developed in C++ that includes all types of sources, the FIFO scheduler and the buffer that represents the node.
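As an illustration, the following C++ sketch (our own, with hypothetical class and variable names; one of several possible conventions for when the packet is handed to the buffer) generates packets for a Fixed On/Off source using the Table 1 parameters: Ton is the packet size divided by the peak rate, and Toff is drawn from a negative exponential distribution.

```cpp
// Minimal sketch of a Fixed On/Off source (one fixed-size packet per On period).
// Parameter values follow Table 1; all names are illustrative only.
#include <cstdio>
#include <random>

struct Packet { double arrivalTime; int sizeBits; };

class FixedOnOffSource {
public:
    FixedOnOffSource(int packetBits, double peakBps, double meanToff, unsigned seed)
        : packetBits_(packetBits),
          ton_(packetBits / peakBps),          // On duration = packet size / peak rate
          offDist_(1.0 / meanToff),            // negative exponential Off period
          rng_(seed), clock_(0.0) {}

    Packet next() {                            // generate the next packet of this source
        clock_ += ton_;                        // assume the packet is delivered at the end of Ton
        Packet p{clock_, packetBits_};
        clock_ += offDist_(rng_);              // silent (Off) period before the next burst
        return p;
    }

private:
    int packetBits_;
    double ton_;
    std::exponential_distribution<double> offDist_;
    std::mt19937 rng_;
    double clock_;
};

int main() {
    FixedOnOffSource voice(1360, 65536.0, 0.0016, 1);   // voice parameters from Table 1
    for (int i = 0; i < 5; ++i) {
        Packet p = voice.next();
        std::printf("t=%.6f s, size=%d bits\n", p.arrivalTime, p.sizeBits);
    }
    return 0;
}
```

The two variable sources would differ only in also drawing the packet sizes (and, for Variable 2.0, the Ton duration itself) from exponential distributions, as described above.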
3 Description of the Analyzed Scenarios

The configuration of the analyzed network node is shown in Fig. 1. Voice, video and data users are connected at the input of the node, and a link of 2 Mbit/s is used at the output. A very large buffer is provided to simulate an infinite buffer.
Fig. 1. Network configuration used in simulations
The video, voice and data users were implemented in C++ according to the On/Off sources discussed in the previous section. After the implementation of the sources, a simulator for the FIFO scheduling was developed, also using C++ Builder. The simulator was implemented according to the flow diagram of Fig. 2.
Fig. 2. Flow of the simulation process for the FIFO scheduling (reference: [10])
For the voice and video traffic only the Fixed On/Off source was used, while for data the other sources described in the previous section were used. In all generated sources the parameters presented in Table 1 were used, according to the type of traffic being analyzed. To study the behavior of the traffic types, several scenarios were analyzed, gradually increasing the number of users in the system. This makes it possible to show the stable and unstable conditions of the system. The average service time (E{X}), the average queue time in the buffer (E{W}), the average system time (E{T}), equal to E{W} + E{X}, and the packet loss (PP) are the parameters used to study the behavior of the system. The scenarios considered are presented in Table 2. It shows that six scenarios were analyzed, comparing the performance of the network as it is operated today, primarily with data traffic, with the performance that will be achieved in the near future, assuming that the same resources are used. As mentioned, the simulations were performed for a link of 2 Mbit/s, discarding the transient period. For this, the 1000 initial packets were discarded before starting the collection of data for statistical purposes. A minimal sketch of how the FIFO discipline and these averages can be computed is given after Table 2.
Table 2. Scenarios considered

Scenario | Traffic | Types of sources                          | Initial user quantity | Final user quantity
1        | Data    | On/Off Variable 1.0 / On/Off Variable 2.0 | 5                     | 15
2        | Data    | On/Off Variable 1.0 / On/Off Variable 2.0 | 4                     | 4
2        | Voice   | Fixed On/Off source                       | 1                     | 16
3        | Data    | On/Off Variable 1.0 / On/Off Variable 2.0 | 4                     | 4
3        | Video   | Fixed On/Off source                       | 1                     | 6
4        | Data    | On/Off Variable 1.0 / On/Off Variable 2.0 | 4                     | 4
4        | Voice   | Fixed On/Off source                       | 5                     | 5
4        | Video   | Fixed On/Off source                       | 1                     | 3
5        | Data    | On/Off Variable 1.0 / On/Off Variable 2.0 | 4                     | 4
5        | Voice   | Fixed On/Off source                       | 1                     | 13
5        | Video   | Fixed On/Off source                       | 1                     | 1
6        | Data    | On/Off Variable 1.0 / On/Off Variable 2.0 | 4                     | 4
6        | Voice   | Fixed On/Off source                       | 1                     | 5
6        | Video   | Fixed On/Off source                       | 2                     | 2
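The FIFO discipline itself and the statistics defined above can be captured in a few lines. The following C++ sketch is our own illustration, with hypothetical names and a small hard-coded arrival list standing in for the merged output of the On/Off sources; it shows how E{W}, E{X} and E{T} would be accumulated for the 2 Mbit/s output link.

```cpp
// Minimal sketch of the FIFO discipline and of the statistics E{W}, E{X} and E{T}.
// Packet arrivals would come from the On/Off sources; names are illustrative only.
#include <algorithm>
#include <cstdio>
#include <vector>

struct Arrival { double time; int sizeBits; };

int main() {
    const double linkBps = 2e6;                       // 2 Mbit/s output link
    std::vector<Arrival> arrivals = {                 // stand-in for the merged source traffic
        {0.000, 4096}, {0.001, 1360}, {0.0012, 4096}, {0.004, 1360}};
    std::sort(arrivals.begin(), arrivals.end(),
              [](const Arrival& a, const Arrival& b) { return a.time < b.time; });

    double serverFreeAt = 0.0, sumW = 0.0, sumX = 0.0, sumT = 0.0;
    for (const Arrival& a : arrivals) {
        double wait    = std::max(0.0, serverFreeAt - a.time);  // queue time W
        double service = a.sizeBits / linkBps;                  // service time X = L / C
        serverFreeAt   = a.time + wait + service;               // departure instant
        sumW += wait; sumX += service; sumT += wait + service;  // system time T = W + X
    }
    double n = arrivals.size();
    std::printf("E{W}=%.6f s  E{X}=%.6f s  E{T}=%.6f s\n", sumW / n, sumX / n, sumT / n);
    return 0;
}
```

In the actual platform the arrival list would be produced by merging the packets generated by all configured sources, and the first 1000 packets would be skipped before accumulating the sums, as described above.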
4 Analyses of Results

To analyze the behavior of present network traffic, simulations were carried out to examine the growth of video and voice traffic over the data traffic, which is currently the majority. In the first scenario, simulations were performed only with data users in order to determine the impact of the growth of data traffic on the network system. Fig. 3 shows the results of these simulations. For each number of users, one simulation was made using Source On/Off Variable 1.0 and another using Source On/Off Variable 2.0. In the figure, the values at the left of the number of data users correspond to Source On/Off Variable 1.0 and those at the right to Source On/Off Variable 2.0. As can be seen in Fig. 3, the queue and system times for Source On/Off Variable 1.0 are smaller than for Source On/Off Variable 2.0. The same holds for the service time. However, in this scenario the service times are much lower than the queue and system times. Another important point in this scenario is that the queue and system times show a strong tendency to increase with eight users for Source On/Off Variable 1.0 and with nine users for Source On/Off Variable 2.0. These numbers of users are the points where system instability starts, meaning that the waiting time tends to infinity. The behavior of the service time is different: with nine users this parameter does not undergo any further significant increase, as expected, because the service time depends only on the average packet length and the link capacity.
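As a quick check of this last point, a back-of-the-envelope calculation of our own (not given in the paper), using the nominal packet lengths of Table 1 and the 2 Mbit/s link capacity, gives

\[
E\{X\}_{\mathrm{data}} = \frac{4096~\text{bits}}{2\times10^{6}~\text{bit/s}} \approx 2.05~\text{ms},
\qquad
E\{X\}_{\mathrm{voice}} = \frac{1360~\text{bits}}{2\times10^{6}~\text{bit/s}} \approx 0.68~\text{ms},
\]

so the mean service time stays in the millisecond range regardless of how many users are connected, while the queue time grows with the offered load.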
Fig. 3. Simulation with increase of data users
Furthermore, it is important to mention that, regardless of the source model used, no packet loss occurred. In the second scenario the simulations were made with Source On/Off Variable 1.0, increasing the number of voice users and keeping the number of data users constant at four. The number of data packets generated was 200,000 in all simulations. The number of voice packets generated started at 50,000 in the first simulation, with just one voice user, and reached 800,000 in the last simulation. The results of these simulations are shown in Fig. 4.
Fig. 4. Simulation increasing the voice users
From Fig. 4 it can be seen that the increase of voice users causes a smooth growth in the queue and system times, meaning that the system can accept a good amount of voice traffic. Another point to note is that the increase of voice users lowers the service time: since the proportion of voice traffic in relation to data traffic is increasing, the average service time approaches the value weighted by the voice and data service times. In the third scenario, simulations were made with Source On/Off Variable 1.0, keeping the number of data users at four and increasing the number of video users from one to six. The results of this scenario are shown in Fig. 5.
Fig. 5. Simulation with increase of video users
The results presented in Fig. 5 show that the increase of video users causes a gradual increase in the queue and system times, with a significant rise at four users. Comparing Fig. 5 to Fig. 3, however, the increase of queue and system times already becomes significant with three video users. This is due to the more intense nature of video traffic. The increase is nevertheless not as drastic as with data traffic only, which becomes unstable suddenly at eight data users. This may be due to the constant video packet length, in contrast with the random nature of the data packet length. As can be observed in Fig. 5, the increase of queue and system times is smoother, but it becomes devastating with five video users. Regarding the service time, Fig. 5 shows no significant variation, because the video and data service times have on average the same value, so the service time remains at this average value. In the fourth scenario, simulations were made with mixed voice, video and data traffic. The first two types were generated by the Fixed On/Off source, while the latter was generated by Source On/Off Variable 1.0 and Source On/Off Variable 2.0. Moreover, all simulations had a fixed amount of four data users and five voice users, while the number of video users was varied.
A total of 250,000 voice packets and 200,000 data packets were generated per simulation, while the number of video users was increased gradually. Thus, in the simulation with one video user 50,000 video packets were generated, while with two and three video users 100,000 and 150,000 video packets, respectively, were generated. Fig. 6 shows the results using Source On/Off Variable 1.0 and Source On/Off Variable 2.0 for the data users.
Fig. 6. Simulation with increase of video users
Analyzing the results presented in Fig. 6, it can be observed that, in the simulations with both Source On/Off Variable 1.0 and Source On/Off Variable 2.0, the increase of video users causes an increase in queue and system times. However, this increase is more intense for the first source. The same phenomenon observed in Fig. 5 can be noticed, namely the rapid approach to the unstable point with three video users, which shows the great influence of this kind of user on the system. As can be seen in Fig. 6, the service time starts at a value weighted by the data, voice and video traffic, and as video users increase the service time also increases slightly, implying that in this scenario data and video packets are the majority and the service time is determined by these two traffic types. In the fifth scenario the video and voice packets were generated by the Fixed On/Off source, while the data packets were generated by Source On/Off Variable 1.0. In this scenario the number of video users was set to one, the number of data users was set to four, and the voice users were increased gradually. The results obtained for this scenario are shown in Fig. 7. As can be seen in Fig. 7, in this scenario the queue and system times increase very smoothly, as was observed in the second scenario, whose results are shown in Fig. 4. Since voice packets are shorter than video and data packets, the impact on the queue and system times is smaller, as it corresponds to a smaller traffic intensity. It can again be concluded that the system admits a good amount of voice traffic without reaching the unstable condition.
Fig. 7. Simulation with increase of voice users
The service time is almost constant, starting at a value weighted by the video and data traffic and decreasing slightly as the voice traffic, with its smaller packet lengths, increases. The sixth scenario is similar to the fifth; the difference is that the number of video users is fixed at two. The results of this scenario are exhibited in Fig. 8.
Fig. 8. Simulation with increase of voice users
From Fig. 8 it can be noticed that, despite the similarities between scenarios five and six, the results show large differences in some aspects. The main contrast between the two scenarios is the difference in queue and system times, which in this scenario reach very high values. This behavior is due to the increase of video users and, consequently, the higher video traffic intensity, which drives the system close to the unstable condition.
Furthermore, it can be noticed that the influence of the voice traffic is, in this case, much more significant as the voice users increase. Since the system is operating near the unstable condition, any increment of voice traffic has a great impact on the queue and system times. This scenario emphasizes the need for some kind of user admission control in the present network so that the system does not suffer such a bottleneck. It also shows that no quality of service (QoS) can be guaranteed for voice users, because the system time is too large and voice traffic needs to be treated almost in real time. User admission control techniques and the introduction of QoS in the present network have been objects of study for a long time, and many solutions have been proposed. The scenarios studied in this paper show the need for urgent implementation of both user admission control and QoS in the present network, which faces the prospect of strong growth of video and voice traffic. The service time has the same behavior as in the previous scenario, as expected, and due to the huge buffer size, defined as one million positions, no packet loss occurred.
5 Conclusions

In this paper the impact of multimedia traffic sources on a network node using a FIFO scheduler was studied. The intent of the research was to examine how voice and video traffic can impact the present network infrastructure, which is based on the FIFO scheduler and has no admission control mechanism or quality of service provisioning. Recently, these two kinds of traffic have grown with great intensity. To achieve the above purposes, a simulation platform consisting of different types of sources, a buffer and a FIFO scheduler using a link of 2 Mbit/s was developed, and six different scenarios with different combinations of voice, video and data traffic were analyzed. By analyzing the results of the proposed scenarios it was concluded that video traffic causes a fast growth in the queue and system times. This conclusion became more evident in scenarios 3 and 4, where just one video user caused an increase of more than 5 seconds in the system time. Besides, it was verified that only three video users (scenario 4) can lead the system to the unstable condition. Comparing scenario 6 with scenarios 2 and 5, which analyzed the behavior of a growing number of voice users, it was concluded that if the system is in a stable condition there is no significant impact on the system time, and the inclusion of a large amount of this kind of traffic is possible. However, if the system is near the unstable condition, which is the case of scenario 6, a single user can cause an expressive rise in the queue and system times. The main conclusion of this paper is that some admission control mechanism and QoS provisioning need to be implemented urgently in a network with strong growth of video and voice traffic.
References 1. Demers, A., Keshav, S., Shenker, S.: Analysis and simulation of a fair queueing algorithm. Journal of Internetworking Research and Experience, pp. 3–26 (October 1990); Also in Proceedings of ACM SIGCOMM’89, pp. 3–12 (1990) 2. Shreedhar, M., Varghese, G.: Efficient fair queuing using deficit round-robin. IEEE/ACM Transactions on Networking 4, 375–385 (1996)
3. Leland, W.E., Taqqu, M.S., Willinger, W., Wilson, D.V.: On the self-similar nature of Ethernet traffic (extended version). IEEE/ACM Trans. Networking 2, 1–15 (1994) 4. Norros, I.: A storage model with self-similar input. Queueing Systems (1994) 5. Fischer, M.J., Harris, C.: A Method for Analyzing Congestion in Pareto and Related Queues. The Telecommunications Review (1999) 6. Liu, N.X., Baras, J.S.: Statistical Modeling and Performance Analysis of Multi-Scale Traffic. IEEE INFOCOM (2003) 7. 3GPP2 WG5 Evaluation Ad Hoc: 1xEV-DV Evaluation Methodology, Addendum (V6) (July 2001) 8. Papapanagiotou, I., Vardakas, J.S., Paschos, G.S., Logothetis, M.D., Kotsopoulos, S.A.: Performance Evaluation of IEEE 802.11e based on ON-OFF Traffic Model. In: Proceedings of the 3rd International Conference on Mobile Multimedia Communications. ACM International Conference Proceeding Series, vol. 329 (2007) 9. Seo, H.-H., Ryu, B.-H., Cho, C.-H., Lee, H.-W.: Traffic characteristics based performance analysis model for efficient random access in OFDMA-PHY system. In: Braun, T., Carle, G., Koucheryavy, Y., Tsaoussidis, V. (eds.) WWIC 2005. LNCS, vol. 3510, pp. 213–222. Springer, Heidelberg (2005) 10. Marques, L.B.S.: Study of System Performance of 3G 1xEV-DO Using Real Traffic Models. Master's Thesis, School of Electrical and Computing Engineering, State University of Campinas (Unicamp), Campinas (2005)
Assessing the LCC Websites Quality

Saleh Alwahaishi1 and Václav Snášel2

1 Department of Accounting and MIS, King Fahd University of Petroleum and Minerals
2 Department of Computer Science, FEECS, VŠB - Technical University of Ostrava
[email protected], [email protected]
Abstract. The airline industry has witnessed additional turbulence as a result of the entry of airlines adopting new business models referred to as Low Cost Carriers (LCC), no-frills airlines or budget carriers. Whereas some budget carriers, such as Southwest Airlines, have been competing in the US market for over 30 years, they have only been flourishing in Europe for the last 10 years and are at an early development stage in the Middle East, Asia Pacific and the rest of the world. This has accelerated customer acceptance of the Internet as a suitable medium for booking airline travel. Therefore, web site quality is now considered a critical factor in attracting customers' attention and building loyalty. This paper presents a modified assessment model to assist Middle Eastern LCC companies in evaluating their websites by examining them against four main categories comprising 36 evaluation factors. Keywords: Web site; internet; quality; low cost carriers; Middle East.
1 Introduction

As global competition increases, nations are forced to adopt new trading practices in order to gain market share and increase export revenues. Liberalizing and opening their economies has become an imperative for survival in the global community. The need to use more sophisticated methods, to improve the telecommunication environment, and to adopt common and harmonized procedures and standards makes electronic commerce and other information technology solutions valuable options to pursue in improving regional trade [1]. Consequently, the airline environment has changed drastically. In addition, all aspects of the business model continue to change at an ever-increasing rate. The low-cost phenomenon amounts to an innovative use of economics, marketing, geography and management, which gave birth to the so-called Low-Cost Carriers (LCC). These carriers are able to offer low-priced tickets, sometimes virtually at zero prices, in part because online bookings enable them to enjoy substantial cost savings. Further savings come from offering lower qualities of service on flights, while enjoying
additional revenues from the complementary services offered on their websites, such as transportation and hotel reservations. LCCs seem to spread like wildfire. Last year they transported more than 550 million passengers. There are now more than 100 low-cost carriers, 65 of which have launched over the last four years. A majority of the start-ups have cropped up in emerging markets, in particular Asia, where low-cost carriers grew 35% last year and now account for 19% of the world's total low-cost market [2]. Since 2004, 28 carriers have launched low-cost operations in the Asia-Pacific region, 18 in Europe, 11 in the Americas, five in Africa and three in the Middle East. Low-cost or low-fare airlines have long followed the outline written down in someone else's book. But as the low-fares approach spreads around the world and as the first generation of low-cost airlines grows up, the page they read is not always taken from a predecessor such as Southwest or Ryanair. Instead, old models are yielding to new ways of doing business as a new paradigm is written, region by region and carrier by carrier. The benefits presented by the new paradigms are many and varied; customers and companies alike may benefit from the new ways of exchanging information, communicating and conducting trade. LCC companies have established their own web sites in order to bypass travel agent intermediaries, becoming increasingly focused on online communication, information and transactions. The Internet's interactivity allows LCCs to respond more quickly to customer requests. Moreover, the ever-increasing speed of the Internet allows companies to communicate more quickly with current and potential customers, which is essential to retaining current customers and attracting new ones. A recent study found that traditional travel agencies continued to lose ground in favor of online intermediaries, a trend that is also encouraging direct bookings with hotels, airlines and other carriers. About 161 million trips were booked online in 2008, and the proportion of trips booked through travel agents (including those selling via the internet) has fallen to 24% [3]. LCCs have adopted the Internet as a platform to generate new streams of profits. First, via online marketing and sale of their core services (e.g. flights, check-in and transportation), they have broadened their customer base. Maintaining long-term customer loyalty has been made easier by the integration of Customer Relationship Management (CRM) systems into their websites and the distribution of newsletters and offers by email [4], [5]. Complementary services, such as car rental and hotel bookings, are also now offered on LCC websites, contributing substantial streams of revenue. In this paper, we adapt the AWAI assessment index [6] to assess LCC website quality. To this end, we present a measure, i.e. an instrument, of airline web site quality in order to evaluate and test the results of web site quality. Specifically, we examine seven low-cost airlines operating in the Middle East: Bahrain Air, Jazeera Airways, Wataniya Airways, Sama Airlines, Nas Air, Air Arabia, and Fly Dubai. Air Arabia was established on February 3, 2003 and started operations on October 29, 2003 in the Sharjah Emirate of the United Arab Emirates. In Kuwait the low-cost airline Jazeera Airways was launched on October 30, 2005.
It was the first and, so far, the only airline in the Middle East to make a high profit from its first year of operations, and it has become the first choice of the Kuwaiti traveler, giving the already struggling
Kuwait Airways strong competition. Saudi Arabia also launched two no-frills carriers, Nas Air and Sama Airlines, in 2007. The Kingdom of Bahrain launched a low-cost carrier named Bahrain Air in January 2008. The Dubai Government announced its low-cost carrier FlyDubai, which started operations in the second quarter of 2009, in collaboration with Emirates Airlines [7]. This paper is organized as follows. In the next section we provide a literature review, including an overview of web site quality, tools and instruments. Next, we describe the research methodology and framework. In Section 4, the instrument development, scoring mechanism and analysis are presented. In Section 5, the findings of this research are presented and discussed. Finally, in the last section, the paper draws conclusions from this research, including implications for future research and practice.
2 Web Site Assessment Tools: A Review of the Literature

The Internet's interactivity allows companies to respond more quickly to customer requests, and its ever-increasing speed allows them to communicate more quickly with current and potential customers, which is essential to retaining current customers and attracting new ones. A company with a web site that is difficult to use and interact with will project a poor image on the Internet and weaken the company's position. It is therefore important that a company be able to assess the quality of its e-commerce offering, as perceived by its customers and in the context of the industry. In doing so, companies can improve their offerings over time and benchmark against competitors and best practice in any industry. Boyd Collins developed the first formal approach to the evaluation of web sites in late 1995. He founded the infofilter project, a model intended for librarians who needed to evaluate the quality of information on the Internet. The model was based on six criteria, developed by combining evaluation criteria for printed media and considering what was relevant for web sites: content, authority, organization, searchability, graphic design and innovative use. Various instruments have been developed for evaluating the different aspects of web site quality [8]. The web site quality evaluation method (QEM) proposed by Olsina et al. [9] can be considered one of the main approaches. Among the main factors analyzed in this work are functionality, usability (site map, address directory), efficiency and site reliability. Liu and Arnett [10] surveyed webmasters of Fortune 1000 companies to ascertain the factors critical to web site success with consumers. The result was five factors: quality of information, service, system use, playfulness, and design of the web site. Loiacono et al. [11] designed an instrument to evaluate retail web site quality called WebQual. The instrument assessed 12 components of retail web quality: informational fit-to-task, interactivity, trust, response time, design appeal, intuitiveness, visual appeal, innovativeness, flow-emotional appeal, integrated communications, business process, and viable substitute.
SiteQual was developed by asking students in two marketing classes to generate appropriate questions. Fifty-four unique items were generated, forming the basis for an instrument completed by 69 students for three self-selected sites [12]. eQual 4.0 (previously called WebQual 4.0) has been developed iteratively over time. The authors have used the instrument on student and customer samples to assess the quality of a number of different types of web sites. In Barnes and Vidgen [13], a total of 380 student respondents evaluated online bookstores using an instrument with 22 questions. Based on exploratory factor analysis, five dimensions emerged: usability, design, information quality, trust and empathy. Kim and Stoel [14], in their more simplified instrument, include three of the factors of Loiacono et al. [11] plus three slightly different factors: entertainment, web appearance and transaction capability. The E-S-Qual and E-RecS-Qual scales were developed for assessing the full cycle of service quality for online B2C e-commerce web sites. The E-S-Qual scale is a 22-item scale with four dimensions: efficiency, fulfillment, system availability, and privacy. The second scale, E-RecS-Qual, contains three dimensions (responsiveness, compensation, contact) with 11 items [15]. The Internet Standards Assessment Report (ISAR) provides industry benchmarks for web site development and is based on data collected from more than 18,000 web site evaluations since 1997. The report evaluates average scores in more than 80 industries to create defined benchmarks in seven categories, including design, innovation, content, technology, interactivity, copywriting and ease of use. Currently, the results of web site evaluations are very subjective; thus, site evaluators should be given precise guidelines to rate every factor. In order to avoid this subjectivity problem, a web assessment index can be used. According to Evans and King [16], such an index represents a web assessment tool and must have five main components: categories, factors, weights, ratings and total score.
3 Research Framework and Methodology

There are a number of core features that a good quality web site should possess [17], [18], [19]. First, an effective web site should serve as a major source of information, provide complete information on the products and services, and allow quick access to information through tools like search engines. Customers also demand appropriate levels of service interaction, including customer service, personalization and ease of contact in the event of a problem. Furthermore, a web site should incorporate appropriate security measures and adopt privacy practices in order to develop customers' trust. In addition, a web site should be easy to navigate and typically have aesthetic appeal and an appearance that is appropriate for the organization. Such features also contribute to enjoyment or emotional appeal, which help to create flow [9] and retain the customer. In this paper, we have utilized a modified AWAI index [6], which considers four broad categories as the basis for an airline company's web site quality: transactional content, informational content, passenger enjoyment/support, and website design (see Figure 1).
Fig. 1. Framework for evaluating Low-Cost Airlines website quality [6]
4 LCC Website Index Development

Based on the literature review and the AWAI index, this paper focuses on four categories for developing the assessment index. To measure these four categories, 36 factors out of the original 40 factors of the AWAI index [6] were chosen. The four omitted factors were found to be unrelated to the LCC industry, such as showing alliance partners or code-share flights and frequent flyer programs. The key factors within each category were chosen based on the researchers' experience and must reflect what users generally consider to be important components and features of web sites. Transactional content (40 points) is emphasized by Barnes and Vidgen [13] and Liu and Arnett [10] in their research instruments to measure web site quality. In this research, TC is measured by 14 factors (online booking, search for ticket by date/price, seat selection, onboard services, change/cancel ticket, view booking, multi-city booking, payment method, delivery of ticket, online boarding pass, ability to choose city name instead of airport code, and search for flights within a range of time). Informational content (20 points) is also emphasized by Barnes and Vidgen [13] and Liu and Arnett [10] in their research instruments. In this research, IC is measured by 9 factors (destination services, flight details, flight schedule, info for business travelers, info about destination, in-flight services/entertainment information, fleet info, privacy policy, and multimedia info clips). Passenger enjoyment/support (20 points) comes from Liu and Arnett's [10] framework, Ethier's [20] emotion system in using a web site, and Kim and Stoel's [14] research, which involves the context of consumer behavior. In this research, we use 7 factors to measure this category (domestic/overseas holidays, onboard merchandise, queries/feedback, currency calculator, special offers/deals, and offices/sales representatives' contacts). Web site design (20 points) is based on the user-perceived web quality instrument of Aladwani and Palvia [21] and Kim and Stoel's [14] analysis, which contain items
for web site content and web site appearance. There are 6 factors used to measure this category (search engine, secure site, customized website, company info, career opportunities, and popularity). The Alexa indicator (www.alexa.com) is used to measure website popularity; it measures website traffic based on three months of aggregated historical traffic data from millions of Alexa toolbar users and data obtained from other, diverse traffic data sources.
Fig. 2. Assessment Index categories, factors and weights
Based on the Evans and King approach, weights are assigned to the factors in order to numerically measure and assess website quality (see Figure 2). The AWAI point scale was adopted here with a slight change to accommodate the omitted factors of the original index. A minimal sketch of this scoring computation is given below.
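To make the scoring mechanism concrete, the following C++ sketch is our own illustration: the category totals of 40/20/20/20 points follow the text above, but the uniform split of points across a category's factors, the example ratings and all names are assumptions introduced here for demonstration only.

```cpp
// Minimal sketch of computing a site's total index score from per-factor ratings.
// Category totals (40/20/20/20) follow the paper; the uniform per-factor split
// within each category is our assumption, and all names are illustrative.
#include <cstdio>
#include <vector>

struct Category {
    const char* name;
    double totalPoints;              // category weight in the 100-point index
    std::vector<double> ratings;     // per-factor ratings in [0, 1] (1 = fully satisfied)
};

double categoryScore(const Category& c) {
    double perFactor = c.totalPoints / c.ratings.size();   // uniform per-factor points
    double score = 0.0;
    for (double r : c.ratings) score += r * perFactor;
    return score;
}

int main() {
    std::vector<Category> cats = {
        {"Transactional content",       40.0, std::vector<double>(14, 0.8)},
        {"Informational content",       20.0, std::vector<double>(9,  0.7)},
        {"Passenger enjoyment/support", 20.0, std::vector<double>(7,  0.6)},
        {"Website design",              20.0, std::vector<double>(6,  0.9)}};
    double total = 0.0;
    for (const Category& c : cats) {
        double s = categoryScore(c);
        std::printf("%-28s %5.1f / %.0f\n", c.name, s, c.totalPoints);
        total += s;
    }
    std::printf("Total index score: %.1f / 100\n", total);
    return 0;
}
```

A site that fully satisfies every factor would thus reach the maximum of 100 points, and the category subtotals make it easy to see where a given carrier loses points.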
5 Result Analysis

The top seven airline companies operating in the Middle East were examined, and the assessment index tool was applied to their websites between November 2009 and January 2010.
The highest scoring site according to the assessment index was that of Air Arabia, followed by Jazeera Airways, Fly Dubai, Wataniya Airways and Bahrain Air, with Fly Sama and Nas Air sharing the lowest score (see Figure 3).
Fig. 3. Overall website quality index values
The transactional content category is a vital component of website quality, as it contains many factors that reflect the e-readiness of an airline company through e-ticketing, online boarding passes, e-payment methods, seat preselection, and different ticket search and booking options. These quality factors hold the highest weight (40%) compared to the other categories due to their crucial importance for airline websites. It is not a coincidence that the highest and the lowest scoring companies in this category
Fig. 4. Transactional content category
are the same as for the overall assessment score, Air Arabia and Nas Air respectively (see Figures 4 and 5), which reflects a strong positive correlation between the transactional content of an airline website and the overall satisfaction with and quality of that site.
Fig. 5. Transactional content vs. overall quality
The informational content category, which accounts for 20% of the overall index, represents the company's commitment towards its passengers in terms of on-board and destination services. This category has many factors, such as hotel booking, car rental, on-board entertainment, flight schedule and privacy policy. All the companies are scattered fairly closely around the average score (14.5), with a standard deviation of 1.9, which reflects the awareness of airline companies of the importance of such information to their customers (see Figure 6).
Fig. 6. Informational content category
Developing and maintaining passengers' loyalty and satisfaction is a crucial criterion in the airline industry. The passenger enjoyment/support category contains several factors that measure this criterion, such as domestic or international holiday packages, special offers and onboard merchandise, as well as listening to passengers' complaints and queries. Offering low-cost holiday packages is one of the attractions that LCCs use to maintain the loyalty of their customers, a factor that is utilized skilfully by Air Arabia, making it lead the others in this category with 85% while the rest average around 55%. Website design is the fundamental measure of website popularity as well as navigability, since it includes many indicative factors such as website popularity, search engine, customizable website, and website security. LCCs have become a global phenomenon, which means that prices, offers and holiday packages should be presented to customers with a multi-currency option. In addition, enabling a multilingual interface makes the airline website more accessible and at the same time reflects the success of the company in introducing itself as a global carrier. It is noticeable that the top-scoring airline in this category is the best overall company, Air Arabia, since passengers' prime concerns when performing online transactions are security and the website's ease of use.
6 Conclusion and Future Work

This paper proposes and tests a model for assessing low-cost carriers' website quality. The LCC assessment index provides an integrated approach for researchers, managers, and people who are interested in the LCC industry to compare attributes and components of airlines' websites in order to identify limitations and opportunities. The main challenge in the elaboration of the index was to avoid subjective factors, which have been predominant in previous assessment tools. The index is based on four broad categories: transactional content, informational content, passenger enjoyment/support, and website design, which are quantified in an objective and logical way. The statistical correlations among web page quality factors have been identified in order to help airline companies concentrate on the catalyst factors which will bring quality to their websites. This research provides avenues for future work by applying the LCC index to top-ranked US and European low-cost carriers and comparing them with their Middle Eastern counterparts, to investigate the difference in airline website quality between developed and developing countries. Finally, since websites are dynamic and changeable mediums, it would be essential to re-evaluate the sites periodically. This evaluation over time would also shed some light on whether there is a greater divergence or convergence of web activities.
References 1. Alwahaishi, S., Nehari, A., Snasel, V.: Electronic commerce growth in developing countries: Barriers and challenges. In: NDT '09: Proceedings of the IEEE International Conference on Networked Digital Technologies, pp. 225–232 (2009) 2. Flightglobal: Low-cost carriers from emerging markets struggle to reach profitability (2008), http://www.flightglobal.com/articles/2008/04/21/223158/low-cost-carriers-from-emerging-markets-struggle-to-reach.html (last accessed October 2009)
3. ITB: ITB World Travel Trends Report 2009 (2009), http://www1.messe-berlin.de/vip8_1/website/Internet/Internet/www.itb-berlin/pdf/Publikationen/worldttr2009.pdf (last accessed October 2009) 4. Otim, S., Grover, V.: An empirical study on web-based services and customer loyalty. European Journal of Information Systems 15, 527–541 (2006) 5. Li, D., Browne, G.J., Chau, P.Y.K.: An empirical investigation of web site use using a commitment-based model. Decision Sciences 37, 427–444 (2006) 6. Alwahaishi, S., Snasel, V., Nehari, A.: Web site assessment in the airline industry: An empirical study of GCC airline companies. In: ICADIWT '09: Proceedings of the IEEE International Conference on the Applications of Digital Information and Web Technologies, pp. 193–198 (2009) 7. Wikipedia: Low-cost Carrier, http://en.wikipedia.org/wiki/Low-cost_carrier (last accessed August 2009) 8. Xie, Z.C., Barnes, S.J.: Web Site Quality in the UK Airline Industry: A Longitudinal Examination. The Journal of Computer Information Systems (2008), http://www.allbusiness.com/company-activities-management/operations-customer/12300931-1.html (retrieved April 1, 2009) 9. Olsina, L., Godoy, D., Lafuente, G.J., Rossi, G.: Specifying quality characteristics and attributes for websites. In: First ICSE Workshop on Web Engineering, Los Angeles, USA (1999) 10. Liu, C., Arnett, K.: Exploring the factors associated with Web site success in the context of electronic commerce. Information and Management 38(1), 23–33 (2000) 11. Loiacono, E.T., Watson, R.T., Goodhue, D.L.: WebQual: A Web Site Quality Instrument. Working Paper 2000-126-0, University of Georgia (2000) 12. Yoo, B., Donthu, N.: Developing a scale to measure the perceived service quality. Quarterly Journal of Electronic Commerce 2(1), 31–47 (2001) 13. Barnes, S.J., Vidgen, R.T.: An integrative approach to the assessment of e-commerce quality. Journal of Electronic Commerce Research 3(3), 114–127 (2002) 14. Kim, S., Stoel, L.: Dimensional hierarchy of retail website quality. Information and Management 41(5), 619–633 (2004) 15. Parasuraman, A., Zeithaml, V.A., Malhotra, A.: E-S-Qual: a multiple-item scale for assessing electronic service quality. Journal of Service Research 7(3), 213–233 (2005) 16. Evans, J.R., King, V.E.: Business-to-business marketing and the World Wide Web: Planning, managing and assessing web sites. Industrial Marketing Management 28, 41–50 (1999) 17. Green, D., Pearson, J.M.: Development of a Web site usability instrument based on ISO 9241-11. Journal of Computer Information Systems 47(1), 66–72 (2006) 18. Huang, E.: The acceptance of women-centric websites. Journal of Computer Information Systems 45(4), 75–83 (2005) 19. Tarafdar, M., Zhang, J.: Analysis of critical website characteristics. Journal of Computer Information Systems 46(2), 14–24 (2005) 20. Ethier, J., Hadaya, P., Talbot, J., Cadieux, J.: B2C Web site quality and emotions during online shopping. Information and Management 43(4), 627–639 (2006) 21. Aladwani, A.M., Palvia, P.C.: Developing and validating an instrument for measuring user-perceived Web quality. Information and Management 39(6), 467–476 (2002)
Expediency Heuristic in University Conference Webpage

Roslina Mohd Sidek1, Noraziah Ahmad1, Mohamad Fadel Jamil Klaib1, and Mohd Helmy Abd Wahab2

1 Faculty of Computer Systems & Software Engineering, University Malaysia Pahang, Lebuhraya Tun Razak, 26300 Kuantan, Pahang, Malaysia
2 Faculty of Electric & Electronic, University Tun Hussin Onn Malaysia, Beg Berkunci 101, Parit Raja, Batu Pahat, 86400 Johor, Malaysia
Abstract. In this paper, we present a webpage that has been developed based on heuristic elements, taking the International Conference on Software Engineering and Computer Systems as the requirement. Heuristic and web-based methods have been applied in this system, using a combination of the PHP language and MySQL as the database system. The objective of developing a webpage based on heuristic elements for a university conference is to simplify its use and obtain a quality conference webpage for the Faculty of Computer Systems and Software Engineering, University Malaysia Pahang, Malaysia. It also minimizes human contact and thus provides a fast, efficient, transparent and effective service to authors, administrators and reviewers. The system facilitates users by computerizing all the forms related to paper submission, paper review, paper download and reviewer assignment through a heuristic online university conference webpage. Keywords: Conference, webpage, reviewer, software, heuristic.
1 Introduction

A conference is one of the most important events for all academicians. Nowadays, the whole conference process is computerized and can be viewed by the whole world. Information and communication technology (ICT) helps the research community, for example by facilitating the integration of the various processes in a conference, the standardization of information, and a faster flow of information. Setting up a knowledge database for a conference would require a huge amount of resources, especially in the application of information and communication technology. To assist in the process, the utilization of ICT and automated software can provide efficient and effective solutions to the problems of mass data and information handling [1], [2]. In the object-oriented software engineering methodology, the object model for the business relates to the use case model of the supporting information system [2]. Among the activities that have to be handled in a conference webpage are the submission of papers, the viewing of papers by reviewers, and the payment of registration fees. Before the expediency heuristic conference webpage existed, organizations developed webpages without considering their usability.
In this paper, we present a web-based system for a university conference webpage that facilitates users by computerizing all the forms related to paper submission, paper formatting and paper download for reviewers, through an online interface. Heuristics are applied to confirm the quality of the webpage under consideration, in terms of its usability, and software engineering methods are deployed while developing this webpage. The webpage has been developed for the conference of the Faculty of Computer Systems and Software Engineering, University Malaysia Pahang. It helps users by providing an easy upload and download system for sending or reviewing papers. The website is also designed to make the user feel comfortable with the interface and design used. Built with the PHP programming language and a MySQL database, this system is an online system in which users share their data over the Internet. Other software used includes XAMPP as the Apache web server to support the database, Acrobat Professional to design the interface for the forms, and Microsoft Word and Microsoft Project to produce documents.
2 Literature Review

This section presents the related concepts, conventional webpages and existing webpages used in Malaysia.

2.1 Existing System

The Malaysian technical universities' conference website, Malaysian Technical Universities Conference on Engineering and Technology, MUCET 2010 [3], is built with a very clear background. The menus are located at the left side of the page. The user can find the latest information about the conference, as the announcements are at the top of the page. Figure 1 shows the main page of MUCET 2010. Meanwhile, the International Conference on Software Engineering and Computer Systems, ICSECS 2009, website [4] has a very informative header. The menus are at the left side of the page. This website uses only two colors for the background, grey and white. Figure 2 shows the main page of ICSECS 09. Based on these websites, the system that is built has a clear menu at the left side of the screen, which gives visibility to the user of the system. The font and the language used are consistent and easy to understand even for first-time users. No computer jargon is used, as the system does not only target users who are experts in the computer and technical fields, but also aims to ease the work of users who are still searching for information or ideas for their project or study. The webpage can track and save a person's id so that they do not need to key it in again every time they log into this website. As mentioned before, the menu is at the left side. A menu item changes color once the user has clicked it, so the user does not need to remember which links they have visited, reducing the memory load. This webpage is designed to prevent errors and to be easy to use. The author does not need to convert the paper into a certain format before uploading, nor submit it via email. The system prompts the user to fill in the online form and upload the paper straight from their computer. The system then saves the paper in the database.
Fig. 1. Main Page MUCET
The background colors used in the system are black and white. It is built in this way to ease the user and prevent eye strain. The old ICSECS webpage in Figure 2 has a striking orange-yellow header to attract the eyes of the user and make the interface look more attractive. However, the registration page opens in a different window and its font is small, so the user finds it difficult to read the text. On that page no tooltips or descriptions appear either; tooltips are used to guide the user about a control. The page is also sometimes too plain.
3 Methodology Development

Rapid Application Development (RAD) is a development lifecycle methodology designed to give much faster development and higher quality than the traditional lifecycle [6]. RAD consists of four phases, as shown in Figure 3. RAD is suitable for developing the ICSECS website for the following reasons. Time constraint: the users of the system requested that the development of the system take between three and four months. Since paper submission and paper review are the main activities required to get approval for publishing a paper, the authors need
Fig. 2. Main Page of old ICSECS
Fig. 3. Rapid Application Development (RAD)
the webpage to be developed as fast as possible to ease the paper submission and approval process. The RAD methodology is designed for faster development and higher quality because the RAD model provides a "high-speed" development process. Medium size of the system: the ICSECS website is developed for the conference organizer in the Faculty of Computer Systems and Software Engineering and for the people involved with the conference organizer, such as reviewers, authors and participants. Ability to identify and repair problems at an early stage: in developing this webpage, the developer and the user work together to get the best result for the system; therefore, if the user needs changes to the requirements, the developer will make the changes at any stage
to fulfill the user requirements. This method improves the working relationship and trust between developers and clients in order to obtain the best system result and ensure that the system meets the user requirements.
4 Analysis

Several analyses were carried out before developing this webpage. The analysis includes the physical context, technical context, organizational context and social and cultural context; refer to Table 1.

Table 1. Requirement analyses (context element and corresponding information)

1. Physical context: The University's Conferences Website can be used anywhere that has a wireless connection to the Internet. This system is designed to be systematic and easy to understand for first-time users, and to provide the best conference website in terms of its interface and the flow of the system.

2. Technical context: The University's Conferences Website is a web-based application that must connect to the Internet and is browsed via a web browser such as Internet Explorer or Mozilla. This system is fully computerized, including the management of the participants' paperwork, because the website is embedded with a conference management system (CMS). The website may also be designed to allow small-screen devices such as PDAs or mobile phones to access it.

3. Organizational context: This web-based application should have an administrator to organize the system. Moreover, the application will be used by anyone who is interested in joining and taking part in the conference. The University's Conferences Website is developed for conferences held in Malaysia only. The website is an organizational information system that is also used by people outside the organization, who should register on the website.

4. Social and cultural context: The conference website can be accessed from anywhere in the world. Besides that, anyone can use this web application because the language used is very readily understandable.
5 System Design

The Expediency Heuristic Conference Webpage offers better usability than the previously developed webpage. Data analysis is processed quickly and safely, and the interface follows the Jakob Nielsen heuristics [5]. However, most software is delicate: even the slightest error, such as changing a single bit, can make it crash. Thus, development techniques with an emphasis on design should be managed correctly to overcome this fragility [6]. The system flow describes the most basic flow of the system. A context diagram shows the system boundary: all external agents and all data flows into and out of the system are shown in one diagram, with the entire system represented as one process. Figure 4 shows the system flow for the conference website.
Fig. 4. System Flow for ICSECS
The sketch work for the interface of the university conference webpage is shown in Figure 5 and Figure 6. This is the main page of the system. The layout design follows the standard layout of a website to make sure that all types of user can use it. The navigation menu is located at the right side of the bar. Consistency of the menu buttons is applied on every page of the navigation menu.
Fig. 5. Sketch work for Main Page Design
Fig. 6. Sketch work for Admin
Users need only a few minutes to learn how to use and navigate the web page, since the orderings are easy to understand. The buttons are consistent on each page. The layout design is simple and the color combination is also well designed, keeping to the concept of simplicity. We reduce the use of scroll bars to make it much easier for the user to navigate the page and find information.
6 Interface Design

In this phase, the interface design method is explained in detail. Macromedia Dreamweaver 8 has been used as a tool for designing the interface of the conference website. The interface plays an important role in the interaction between human and computer, and good interface design makes the user feel comfortable while accessing the system. In the conference website, several of the 10 usability heuristics by Jakob Nielsen [5] that are suitable for the system have been implemented. Figure 5 shows the webpage with the heuristics implemented in ICSECS.
a) Match between the system and the real world: The new ICSECS is a web-based application developed for conferences in the Faculty of Computer Systems and Software Engineering, Kuantan, Malaysia. The management in the new ICSECS website is more efficient in handling the submission of forms by clients.
b) Consistency and standards: In developing the interface for the new ICSECS, the consistency and standards of the interface were carefully designed. Consistency and standards in interface design help users feel in control of the system. Consistent colors, layout, fonts, and icons are applied throughout the system. Figure 7 and Figure 8 show the consistency of the menu on the left side.
c) Help users recognize, diagnose, and recover from errors: When users do not fill in a required field containing important information, a system error message appears to help them recover from the mistake, as shown in Figure 9; a minimal code sketch follows the figure.
Fig. 7. Interfaces for New ICSECS homepage
Fig. 8. Admin view
Fig. 9. Error message as a guidance to user
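To make the “help users recognize, diagnose, and recover from errors” heuristic concrete, the following minimal sketch checks the required fields of a registration form before submission and writes a specific message into an error box. The field ids, labels, and the error-message element are illustrative assumptions written in TypeScript, not taken from the actual ICSECS implementation.

// Hypothetical sketch: required fields are validated before submission,
// and a plain-language message tells the user exactly which field to fix.
const requiredFields: { id: string; label: string }[] = [
  { id: "full-name", label: "Full name" },
  { id: "email", label: "E-mail address" },
  { id: "paper-title", label: "Paper title" },
];

function validateRegistrationForm(): boolean {
  const problems: string[] = [];
  for (const field of requiredFields) {
    const input = document.getElementById(field.id) as HTMLInputElement | null;
    if (!input || input.value.trim() === "") {
      problems.push(`${field.label} is required.`);
    }
  }
  const messageBox = document.getElementById("error-message");
  if (messageBox) {
    // The message is shown next to the form so the user can recover on the spot.
    messageBox.textContent = problems.join(" ");
  }
  return problems.length === 0; // submit only when every required field is filled
}

document
  .getElementById("registration-form")
  ?.addEventListener("submit", (event) => {
    if (!validateRegistrationForm()) {
      event.preventDefault(); // keep the user on the form so the mistake can be corrected
    }
  });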
d) Error prevention: Error prevention is used to alert the user when a mistake is about to be made, for example when the user has not logged in before accessing the system. An error message appears to inform the user of the mistake, as shown in Figure 10; a minimal code sketch follows the figure.
Fig. 10. Error message alerting the user to log in before accessing the system
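The error-prevention behaviour in Figure 10 can be approximated by a guard that runs before a protected page is shown, as in the minimal TypeScript sketch below. The sessionStorage key, the page names, and the use of a browser alert are assumptions made only to keep the sketch self-contained; the real system may rely on server-side sessions instead.

// Hypothetical sketch: a protected page checks for a login marker before
// showing its content and sends the user to the login page otherwise.
function isLoggedIn(): boolean {
  // The "icsecsUser" key is an assumed client-side marker for this sketch.
  return sessionStorage.getItem("icsecsUser") !== null;
}

function guardProtectedPage(): void {
  if (!isLoggedIn()) {
    // Prevent the error (using the system without logging in) instead of
    // letting a later request fail.
    alert("Please log in to access the system.");
    window.location.href = "login.html";
  }
}

guardProtectedPage();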
e) Help and documentation: After the user fills in all the required information, the user can print the form as a reference when dealing with the OSC staff.
7 Results
The results for the conference web pages are presented in Table 2. The results refer to the analytical study of the conference webpage, which covered paper submission, reviewing, and administration tasks.
Table 2. Evaluation Result

Physical / Ergonomic Concern
- Legibility: 98% of the potential users can read the text and view the images with ease.
- Audibility: 100% of the potential users think that there should be some sound to comfort or to navigate the user through the page, since the web page does not contain any sound.
- Safety in use: 95% of the potential users think that there is no harm in using this web page and that it does not impose any health concerns.

Cognitive / Usability Concern (fewer errors and easy recovery; easy to use; easy to remember how to use; easy to learn)
- Users need only a few minutes to learn how to use and navigate the web page, since the ordering is easy to understand.
- Each task can be done separately and easily.
- Registration is made easy by sending information over the net instead of manually.
- The error rate should be less than 1 in every 50 users for each task.
- The steps to use the page are very easy to remember.
- Easy error recovery for every single error that occurred.

Effective, Emotional, Intrinsic Motivation Concern (aesthetically pleasing; engaging; trustworthy; satisfying; enjoyable; entertaining; fun)
- 98% of the tested users should rate the page at least 4 out of 5 (5 being the highest) for aesthetic, enjoyable, engaging, and satisfactory.
- 90% of the tested users trust this conference web page for credit card use, information, and password security.
- No unnecessary anxieties are imposed by the interface, such as requiring the user to complete registration or file uploading within 10 seconds.
- The interface is quite interesting and creates a calm situation for the user while using the web page.

Extrinsic Motivation / Usefulness Concern (support individual’s tasks; can do some tasks that would not be possible without the system; extend one’s capability)
- Users can only get the topics when logged into the system.
- Event updates and changes of date can be viewed by the user without needing to contact the person in charge.
8 Conclusion
The ICSECS webpage was developed for conferences held in the Faculty of Computer Systems and Software Engineering in Kuantan. The system was built using the Rapid Application Development (RAD) methodology, which consists of four phases: requirements planning, user design, construction, and cutover. This method is suitable for the system because its size and scope are medium and it was developed for OSC staff and OSC clients. Using the RAD methodology, the system was developed to solve the problems that occur in the current manual operation of submitting building-plan applications. The system helps users manage their time when submitting applications, shortens approval times, and reduces the use of paper forms by providing electronic forms.
Acknowledgement
We are very thankful to the reviewers for reviewing this paper and for their comments. A big thank you also to Abang Fairul Syarmil Abang Mohammad, Nurul Ain Ramli, Siti Nur Azwani Ismail, Siti Hanisah Majid, and Farhah Mohd Hajari for their support during the development of this website.
References
[1] Daniel, L.M.: Metrics for Evaluating the Quality of Entity Relationship Models. LNCS, pp. 211–225 (1998), doi:10.1007/b68220
[2] Connolly, T.M., Begg, C.E.: Database Systems: A Practical Approach to Design, Implementation and Management, 4th edn. International Computer Science Series. Pearson, London (2004), ISBN-13: 978-0321210258
[3] Malaysian Technical Universities Conference on Engineering and Technology (MUCET) (2010), http://www.utem.edu.my/web/index.php?option=com_content&task=view&id=280&Itemid=160 (accessed March 2, 2010)
[4] International Conference on Software Engineering and Computer Systems, ICSECS 2009 (2009), http://icsecs.ump.edu.my/ (accessed March 2, 2010)
[5] Nielsen, J.: Heuristic Evaluation (2005), ISSN 1548-5552, http://www.useit.com/papers/heuristic/
[6] Shelly, G.B., Cashman, T.J., Rosenblatt, H.J. (eds.): Systems Analysis and Design, 4th edn. Course Technology, United States of America (2001)
[7] Van Roy, P.: Self Management and the Future of Software Design. Electronic Notes in Theoretical Computer Science, pp. 201–217 (2007)
Author Index
Abd Alla, Ahmed N. I-314 Abdullah, Shahidan M. I-333 Abd Wahab, Mohd. Helmy II-488 Abd Wahab, Mohd Helmy I-566, II-619 Abo-Hammour, Zaer. S. II-193, II-564 Abu-Kamel, Abedulhaq II-123 Adamatti, Diana F. II-376 Adib, M. II-28 Agarwal, Ajay II-321 Ahmad, Noraziah I-566, II-466, II-488, II-509, II-619 Akhavan, Peyman II-172 Akyoku¸s, Selim II-715 Al-Abadi, Anwar I-321 Alarabi, Ali II-699 Alavi, Meysam II-594 Albakaa, Ammar II-523 Alboaie, Lenuta I-369 Al-Haj, Ali I-143 Al-Hamami, Alaa II-184 AL-Harby, Fahad I-254 Alhomoud, Adeeb M. I-169 Almarimi, Abdelsalam I-306 Al-Qawasmi, Khaled E. II-184 Al-Salman, Abdul Malik S. II-676 Al-Smadi, Adnan M. II-184, II-193, II-564 Alsmadi, Othman M.K. II-193, II-564 Al-Towiq, Mohammad I-85 Alwahaishi, Saleh I-556 Al-Zabin, Nawal I-321, II-553 Amirat, Yacine II-604 Annanper¨ a, Elina I-410 Anwar-ul-Haq, M. I-297 Aramudhan, M. II-639 Ariwa, Ezendu II-28 Aslan, Bora I-213 ´ Avila, C. S´ anchez I-497 Avramouli, Dimitra II-381 Ayyoub, Belal II-553 Babamir, Faezeh Sadat II-545 Babamir, Seyed Mehrdad II-545 Babamir, Seyed Morteza I-241
B¨ achle, Sebastian II-683 B˘ adic˘ a, Amelia II-402 B˘ adic˘ a, Costin II-402 Bargouthi, Heba I-321 Bartoˇs, Tom´ aˇs II-706 Bashar, Abul II-99 Ba¸s¸cift¸ci, Fatih I-1 Beg, Abul Hashem II-466 Belkacem, Benadda II-366 Ben Brahim, Monia I-183 Bendimerad, Fethi Tarik II-366 Ben Jemaa, Maher I-183 Berkenbrock, Gian Ricardo II-295 Beydoun, Mohammad II-443 Blanchfield, Peter II-523 Borumand Saeid, A. I-163 Borumand Saeid, Arsham II-648 Boufaida, Zizette I-473 Boukerram, Abdellah I-45 Bourret, Christian II-7 Bours, Patrick I-515 Bradshaw, Jeffrey M. I-451 Buccafurri, Francesco I-391 Bulu¸s, Ercan I-213 Burete, Radu II-402 Busch, Christoph I-515 B¨ uy¨ uksara¸co˘ glu, Fatma I-213 Carpentieri, Bruno I-91 Casanova, J. Guerra I-497 Cassens, J¨ org II-533 Ceken, ¸ Cınar ¸ II-36 Ceken, ¸ Ka˘ gan II-36 Chaoui, Allaoua I-343, II-604 Chao, Yu-Chang II-60 Cheng, Ching-I I-467 Chen, Li-Ting I-467 Chen, Yu-Lin I-70 Chen, Z. I-358 Cherifi, Chantal II-80 Condamines, Thierry I-420 Constantinescu, Zoran II-533 Dabbaghian, Mehdi II-545 Dadarlat, Vasile Teodor II-91
del Pozo, G. Bailador I-497 de Santos Sierra, A. I-497 Didry, Yoann II-430 Djaghloul, Younes I-473 Djouani, Karim I-343, II-604 Dutta, Prabal II-274 Dvorsk´ y, Jiˇr´ı II-656 Elliman, Dave II-523 Elmadani, Ahmed B. I-288 El-Qawasmeh, Eyas I-85, II-676 Embong, Abdullah I-55, II-15 Ert¨ urk, Mehmet Ali II-715 Eskridge, Tom I-451 Fasuga, Radoslav II-203, II-333 Fathian, Mohammad I-135, II-172 Fathi, Leila II-456 Fauzi, Ainul Azila Che II-466 Favetta, Franck II-136 Feltz, Fernand II-430 Fenu, Gianni II-215 Figueroa, Patricia E. II-306 Gafurov, Davrondzhon I-515 Gajdoˇs, Petr I-21, II-333 Ghazanfari, Mehdi I-135 Goldschmidt, Balazs I-529, I-537 Gordjusins, Andris II-417 Granitzer, Michael I-98 Hai, Tao I-128 Hamed, Osama II-123 Hamed, Samer I-321, II-553 Hanna, James I-451 Haraty, Ramzi A. II-443 H¨ arder, Theo II-683 Harrag, Fouzi II-676 Hasan, Yaser II-162 Hassan, Mohammad II-162 Hatamizadeh, Alireza II-545 Hatanaka, Takahiro I-523 Heck, Markus II-1 Hirata, Celso Massaki II-295 Hisamatsu, Hiroyuki I-523 Holub, Libor II-203 Hongwu, Qin I-314 Hori, Yukio II-152 Hsu, Wen-Chiao I-70 Hwang, Yuan-Chu I-383
Iancu, Bogdan II-91 Ibrahim, Hamidah I-151, II-456 Igoshi, Kazuho II-342 Imai, Yoshiro II-152 Jafari, Mostafa I-135, II-172 Jamil Klaib, Mohamad Fadel I-566 Jan´ aˇcek, Jaroslav I-259 Jmaiel, Mohamed I-183 Johari, Ayob II-619 J´ o´zwiak, Ireneusz II-396 Jukic, Nenad I-120 Kadir, Herdawatie Abdul II-619 Kahloul, Laid I-343, II-604 Kaipio, Pekka II-577 Kamala, Mumtaz I-254 Karaahmeto˘ glu, Osman I-213 Karaca, Celal I-1 Karageorgos, Anthony II-381 Karam, Roula II-136 Kasarda, J´ an II-706 Kawada, Nobue II-152 Kawauchi, Kazuyoshi II-152 Khan, Shoab A. I-297 Kilany, Rima II-136 Kim, Hyunsook II-631 Klaib, Mohammad Fadel Jamil II-488, II-509 Klassen, Myungsook II-256 Kondo, Mayumi II-152 Kondorosi, Karoly I-537 Kormaris, Giorgos I-430 Koutsoukos, Xenofon II-281 Kr¨ omer, Pavel I-21 Kruliˇs, Martin II-474 Kuchaki Rafsanjani, M. I-163 Kuchaki Rafsanjani, Marjan II-648 Kucharczyk, Marcin I-228 Labatut, Vincent II-80 Ladan, Mohamad I. II-70 Laurini, Robert II-136 Lax, Gianluca I-391 ´ L´edeczi, Akos II-274, II-281 Lee, Chiw Yi I-151 Lee, Po-Chin I-273 Lee, Sang-Hong I-7 Liao, I-En I-70 Liepins, Martins II-417
Lim, Joon S. I-7 Liu, Damon Shing-Min I-467 Lokman, Abbas Saliimi I-31
Maharmeh, Iman I-321 Malik, Asad Waqar I-297 Manaf, Azizah A. I-333 Mansour, Ahmed I-85 Mantar, Hacı A. II-262 Mardukhi, Farhad II-112 Markkula, Jouni I-402, I-410, II-577 Marten, Holger II-503 Masadeh, Shadi I-143 Masamila, Bossi I-175 Ma, Xiuqin I-128 McClean, Sally II-99 Mednis, Artis II-417 Mesleh, Abdelwadood I-321 Mesut, Andac S ¸ ahin I-213 Miki, Rieko II-152 Mirabi, Meghdad II-456 Mishra, Kamta Nath II-699 Miura, Takao II-342 Miˇs´ ak, Stanislav II-656 Miyamoto, Chisei I-504 Mohan, K. II-639 Mohd. Sidek, Roslina II-466, II-488 Mohd Sidek, Roslina II-509, II-619 Mosharrof Hossain Sarker, Md. II-28 Motoyama, Shusaburo I-545 Mtenzi, Fredrick I-175 Muthuraman, Sangeetha II-509 Nagarjun, Bollam II-669 Nakanishi, Isao I-504 Naseri, Hadi I-241 Nauck, Detlef II-99 Nebti, Salima I-45 Negrat, K. I-306 NematBaksh, Naser II-112 Neˇcask´ y, Martin II-706 Nikmehr, Hooman II-594 Norouzi, Ali II-545 Nowrouzi, Reyhane I-241 Ntalos, Georgios II-381 Odeh, Ashraf I-143 Ok, Min-hwan II-145 Othman, Mohamed I-151
Parisot, Olivier II-430 Park, Dae-Hyeon II-46 Park, Duck-shin II-145 Parr, Gerard II-99 Paturi, Nikhila II-256 Pazeto, Tatiana Annoni I-545 Peculea, Adrian II-91 P´erez, Jes´ us A. II-306 Picconi, Massimiliano II-215 Platoˇs, Jan I-21 Prokop, Luk´ aˇs II-656 Prusiewicz, Agnieszka II-226 Purnami, Santi Wulan II-15 Qahwaji, Rami I-254 Qin, Hongwu I-128 Radeck´ y, Michal II-203, II-333 Rezaie, Hamed II-112 Riman, Ahmad Al’ Hafiz II-619 Rohunen, Anna I-402 Rouhani, Saeed I-135, II-172 Sabol, Vedran I-98 Safins, Renads II-587 Said, Jafari I-175 Sakallı, M. Tolga I-213 Sallai, J´ anos II-274, II-281 Samek, Jan II-356 Santhosh Chaitanya, S. II-669 Santucci, Jean-Fran¸cois II-80 Sanudin, Rahmat II-619 Sathish, L. II-669 Schmidt, Guenter II-1 Scotney, Bryan II-99 Seifert, Christin I-98 Selavo, Leo II-417 Seno, Yoshitaka II-152 Shahnasser, Hamid I-112 Shieh, Jiann-Cherng II-239 Shin, Dong-Kun I-7 Shin, Jeong-Hoon II-46 Shioya, Isamu II-342 Siddiqui, Fuzail II-28 Sidek, Roslina Mohd I-566 Sikora, Tadeusz II-656 Silva, Renato Moraes I-545 Simon, Balazs I-529, I-537 ˇ unek, Milan I-15 Sim˚ Smko, Raoof I-306 Sn´ aˇsel, V´ aclav I-21, I-556
Sone, Toshikazu II-152 Spruit, Marco I-430 Stoklosa, Janusz II-496 Strazdins, Girts II-417 Svátek, Vojtěch I-489 Syed Ahmed, S. II-28 Szczepanik, Michal II-396 Szilvási, Sándor II-274, II-289 Szymański, Julian II-248 Tada, Shinobu II-152 Tamisier, Thomas II-430 Tammisto, Teppo I-15 Tanvir Ansari, Md. II-669 Tao, Hai I-314 Tao, Jie II-503 Tapaswi, Shashikala II-669 Tenschert, Axel I-444 Tfaili, Walid II-604 Tinabo, Rose I-175 Tiwari, Lokesh I-112 Tjortjis, Christos II-381 Tolun, Mehmet R. II-36 Tripathi, Arun Kumar II-321 Tsuru, Masato I-197 Tüysüz, M. Fatih II-262 Uszok, Andrzej Vacura, Miroslav Vaida, Mircea-F. I-369
Velacso, Miguel I-120 Vera, V. Jara I-497 Vladoiu, Monica II-533 Völgyesi, Péter II-274, II-289 Volgyesi, Péter II-281 Wang, Chih-Hung I-273 Wan Mohd, Wan Maseri Binti I-55 Wax, Jérôme II-430 Wu, Jin-Neng II-60
Yaakob, Razali I-151 Yaghob, Jakub II-474 Yamashita, Yoshiyuki I-197 Yang, Bian I-515 Yıldırım, Pınar II-36 Yokoyama, Miho II-152 Yoon, Jong P. I-358 Zaghal, Raid II-123 Zahran, Bilal I-321 Zaim, A. Halim II-715 Zain, Jasni Mohamad I-31, I-128, I-314, II-15 Zain, Jasni Mohd I-55 Zawadzki, Piotr I-234 Zboril, Frantisek II-356 Zięba, Maciej II-226 Zin, Noriyani Mat II-466, II-488 Ziółkowski, Bartlomiej II-496